When running Linux on various open-source RISC-V cores on
resource-limited FPGA platforms, for example FPGAs with less than
16MB of SDRAM, I want to save as much memory as possible. One of the
main techniques is kernel size optimization. I found that riscv does
not currently support HAVE_LD_DEAD_CODE_DATA_ELIMINATION, which passes
-fdata-sections and -ffunction-sections in CFLAGS and passes the
--gc-sections flag to the linker.
This not only benefits my case on FPGA but also benefits defconfigs.
Here are some notable improvements from enabling this with defconfigs:
nommu_k210_defconfig:
   text    data    bss      dec    hex
1112009  410288  59837  1582134 182436 before
 962838  376656  51285  1390779 1538bb after

rv32_defconfig:
   text    data    bss      dec    hex
8804455 2816544 290577 11911576 b5c198 before
8692295 2779872 288977 11761144 b375f8 after

defconfig:
   text    data    bss      dec    hex
9438267 3391332 485333 13314932 cb2b74 before
9285914 3350052 483349 13119315 c82f53 after
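
The relative savings implied by the tables above can be worked out
from the "dec" columns. The following is a small Python sketch for
that arithmetic; the helper name shrink_percent is ours and is not part
of the series, and the numbers are copied from the tables.

```python
# Totals ("dec" column of the size output) copied from the tables above,
# stored as (before, after) pairs.
SIZES = {
    "nommu_k210_defconfig": (1582134, 1390779),
    "rv32_defconfig": (11911576, 11761144),
    "defconfig": (13314932, 13119315),
}


def shrink_percent(before: int, after: int) -> float:
    """Return the size reduction as a percentage of the original size."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    for name, (before, after) in SIZES.items():
        saved = before - after
        pct = shrink_percent(before, after)
        print(f"{name}: {saved} bytes saved ({pct:.1f}%)")
```

By this calculation the K210 nommu config shrinks by roughly 12%, while
rv32_defconfig and defconfig shrink by roughly 1.3% and 1.5%.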
patch1 and patch2 are cleanups.
patch3 fixes a typo.
patch4 finally enables HAVE_LD_DEAD_CODE_DATA_ELIMINATION for riscv.
NOTE: Zhangjin Wu first sent a patch to enable dead code elimination
for riscv several months ago; I didn't notice it until yesterday.
Although it lacked some preparatory changes and did not keep some
required sections, he was the first person to enable this feature for
riscv. To ease merging, this series incorporates his patch and, with
his ack, makes him the author of patch4 to reflect that.
Since v1:
- Collect Reviewed-by and Tested-by tags
- Make patch4 authored by Zhangjin Wu, add my Co-developed-by tag
Jisheng Zhang (3):
riscv: move options to keep entries sorted
riscv: vmlinux-xip.lds.S: remove .alternative section
vmlinux.lds.h: use correct .init.data.* section name
Zhangjin Wu (1):
riscv: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
arch/riscv/Kconfig | 13 +-
arch/riscv/kernel/vmlinux-xip.lds.S | 6 -
arch/riscv/kernel/vmlinux.lds.S | 6 +-
include/asm-generic/vmlinux.lds.h | 2 +-
4 files changed, 11 insertions(+), 16 deletions(-)
--
2.40.1
When building with -fdata-sections on riscv, LD_ORPHAN_WARN emits
warnings like the following:
riscv64-linux-gnu-ld: warning: orphan section `.init.data.efi_loglevel'
from `./drivers/firmware/efi/libstub/printk.stub.o' being placed in
section `.init.data.efi_loglevel'
I believe this is caused by a typo:
init.data.* should be .init.data.*
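
To illustrate why the missing dot matters: input-section patterns in a
linker script are glob patterns matched against the whole section
name. The sketch below uses Python's fnmatch as a stand-in for the
linker's matching (an approximation for illustration only, not the
linker's actual code); the helper matches_any is hypothetical.

```python
from fnmatch import fnmatchcase

# Section name taken from the warning quoted above.
SECTION = ".init.data.efi_loglevel"

OLD_PATTERNS = [".init.data", "init.data.*"]   # before the fix
NEW_PATTERNS = [".init.data", ".init.data.*"]  # after the fix


def matches_any(section, patterns):
    """Return True if any glob pattern matches the full section name."""
    return any(fnmatchcase(section, pattern) for pattern in patterns)


if __name__ == "__main__":
    print("old patterns match:", matches_any(SECTION, OLD_PATTERNS))
    print("new patterns match:", matches_any(SECTION, NEW_PATTERNS))
```

With the old pattern list nothing matches the per-variable section, so
it is left as an orphan; with the leading dot restored it is matched.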
Signed-off-by: Jisheng Zhang <[email protected]>
---
include/asm-generic/vmlinux.lds.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index d1f57e4868ed..371026ca7221 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -688,7 +688,7 @@
/* init and exit section handling */
#define INIT_DATA \
KEEP(*(SORT(___kentry+*))) \
- *(.init.data init.data.*) \
+ *(.init.data .init.data.*) \
MEM_DISCARD(init.data*) \
KERNEL_CTORS() \
MCOUNT_REC() \
--
2.40.1
Some recent commits broke the ordering of the entries. Move them back
to the proper locations to keep the entries sorted.
Signed-off-by: Jisheng Zhang <[email protected]>
---
arch/riscv/Kconfig | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 348c0fa1fc8c..8f55aa4aae34 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -101,6 +101,11 @@ config RISCV
select HAVE_CONTEXT_TRACKING_USER
select HAVE_DEBUG_KMEMLEAK
select HAVE_DMA_CONTIGUOUS if MMU
+ select HAVE_DYNAMIC_FTRACE if !XIP_KERNEL && MMU && (CLANG_SUPPORTS_DYNAMIC_FTRACE || GCC_SUPPORTS_DYNAMIC_FTRACE)
+ select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
+ select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
+ select HAVE_FUNCTION_GRAPH_TRACER
+ select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !PREEMPTION
select HAVE_EBPF_JIT if MMU
select HAVE_FUNCTION_ARG_ACCESS_API
select HAVE_FUNCTION_ERROR_INJECTION
@@ -110,7 +115,6 @@ config RISCV
select HAVE_KPROBES if !XIP_KERNEL
select HAVE_KPROBES_ON_FTRACE if !XIP_KERNEL
select HAVE_KRETPROBES if !XIP_KERNEL
- select HAVE_RETHOOK if !XIP_KERNEL
select HAVE_MOVE_PMD
select HAVE_MOVE_PUD
select HAVE_PCI
@@ -119,6 +123,7 @@ config RISCV
select HAVE_PERF_USER_STACK_DUMP
select HAVE_POSIX_CPU_TIMERS_TASK_WORK
select HAVE_REGS_AND_STACK_ACCESS_API
+ select HAVE_RETHOOK if !XIP_KERNEL
select HAVE_RSEQ
select HAVE_STACKPROTECTOR
select HAVE_SYSCALL_TRACEPOINTS
@@ -142,11 +147,6 @@ config RISCV
select TRACE_IRQFLAGS_SUPPORT
select UACCESS_MEMCPY if !MMU
select ZONE_DMA32 if 64BIT
- select HAVE_DYNAMIC_FTRACE if !XIP_KERNEL && MMU && (CLANG_SUPPORTS_DYNAMIC_FTRACE || GCC_SUPPORTS_DYNAMIC_FTRACE)
- select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
- select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
- select HAVE_FUNCTION_GRAPH_TRACER
- select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !PREEMPTION
config CLANG_SUPPORTS_DYNAMIC_FTRACE
def_bool CC_IS_CLANG
--
2.40.1
The ALTERNATIVE mechanism can't work with XIP, which is also reflected
by the Kconfig dependency below:
RISCV_ALTERNATIVE
...
depends on !XIP_KERNEL
...
So there is no .alternative section at all in the XIP case; remove it.
Signed-off-by: Jisheng Zhang <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
---
arch/riscv/kernel/vmlinux-xip.lds.S | 6 ------
1 files changed, 6 deletions(-)
diff --git a/arch/riscv/kernel/vmlinux-xip.lds.S b/arch/riscv/kernel/vmlinux-xip.lds.S
index eab9edc3b631..50767647fbc6 100644
--- a/arch/riscv/kernel/vmlinux-xip.lds.S
+++ b/arch/riscv/kernel/vmlinux-xip.lds.S
@@ -98,12 +98,6 @@ SECTIONS
__soc_builtin_dtb_table_end = .;
}
- . = ALIGN(8);
- .alternative : {
- __alt_start = .;
- *(.alternative)
- __alt_end = .;
- }
__init_end = .;
. = ALIGN(16);
--
2.40.1
From: Zhangjin Wu <[email protected]>
Select CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION for RISC-V, allowing
the user to enable dead code elimination. In order for this to work,
ensure that we keep the alternative table by annotating it with KEEP.
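
Why KEEP is needed can be shown with a toy model of section garbage
collection: the linker starts from root sections (the entry point and
anything marked KEEP), follows references between sections, and
discards whatever is unreachable. A table such as .alternative is
found at run time through boundary symbols like __alt_start and
__alt_end defined in the linker script, so nothing references its
input sections directly. The Python sketch below is a simplified
illustration, not how any real linker is implemented; the section
names and the reference graph are made up.

```python
def gc_sections(refs, roots, keep=()):
    """Return the set of sections that survive a mark-and-sweep pass.

    refs  -- dict mapping a section name to the sections it references
    roots -- sections reachable from the entry point
    keep  -- sections marked KEEP in the (hypothetical) linker script
    """
    live = set()
    worklist = list(roots) + list(keep)
    while worklist:
        section = worklist.pop()
        if section in live:
            continue
        live.add(section)
        worklist.extend(refs.get(section, ()))
    return live


# Hypothetical reference graph: no section references .alternative.
REFS = {
    ".text.start_kernel": [".text.apply_alternatives", ".data.init_task"],
    ".text.apply_alternatives": [],
    ".text.unused_helper": [".data.unused_table"],
    ".alternative": [],
}
ROOTS = [".text.start_kernel"]

if __name__ == "__main__":
    print("without KEEP:", sorted(gc_sections(REFS, ROOTS)))
    print("with KEEP:   ", sorted(gc_sections(REFS, ROOTS, keep=[".alternative"])))
```

In this model .alternative is dropped unless it is listed in keep,
while the unused helper and its table are dropped in both cases.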
This boots well on qemu with both rv32_defconfig and the rv64 defconfig,
but it only shrinks those builds by ~1%, so a smaller config was
customized to test this feature:
| rv32 | rv64
--------|------------------------|---------------------
No DCE | 4460684 | 4893488
DCE | 3986716 | 4376400
Shrink | 473968 (~10.6%) | 517088 (~10.5%)
The config used above keeps only the options necessary to boot on qemu
with a serial console, which is closer to size-critical embedded use cases:
- rv64 config: https://pastebin.com/crz82T0s
- rv32 config: rv64 config + 32-bit.config
Here is Jisheng's original commit-msg:
When running Linux on various open-source RISC-V cores on
resource-limited FPGA platforms, for example FPGAs with less than
16MB of SDRAM, I want to save as much memory as possible. One of the
main techniques is kernel size optimization. I found that riscv does
not currently support HAVE_LD_DEAD_CODE_DATA_ELIMINATION, which passes
-fdata-sections and -ffunction-sections in CFLAGS and passes the
--gc-sections flag to the linker.
This not only benefits my case on FPGA but also benefits defconfigs.
Here are some notable improvements from enabling this with defconfigs:
nommu_k210_defconfig:
   text    data    bss      dec    hex
1112009  410288  59837  1582134 182436 before
 962838  376656  51285  1390779 1538bb after

rv32_defconfig:
   text    data    bss      dec    hex
8804455 2816544 290577 11911576 b5c198 before
8692295 2779872 288977 11761144 b375f8 after

defconfig:
   text    data    bss      dec    hex
9438267 3391332 485333 13314932 cb2b74 before
9285914 3350052 483349 13119315 c82f53 after
Signed-off-by: Zhangjin Wu <[email protected]>
Co-developed-by: Jisheng Zhang <[email protected]>
Signed-off-by: Jisheng Zhang <[email protected]>
Reviewed-by: Guo Ren <[email protected]>
Tested-by: Bin Meng <[email protected]>
---
arch/riscv/Kconfig | 1 +
arch/riscv/kernel/vmlinux.lds.S | 6 +++---
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 8f55aa4aae34..62e84fee2cfd 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -115,6 +115,7 @@ config RISCV
select HAVE_KPROBES if !XIP_KERNEL
select HAVE_KPROBES_ON_FTRACE if !XIP_KERNEL
select HAVE_KRETPROBES if !XIP_KERNEL
+ select HAVE_LD_DEAD_CODE_DATA_ELIMINATION
select HAVE_MOVE_PMD
select HAVE_MOVE_PUD
select HAVE_PCI
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
index e5f9f4677bbf..492dd4b8f3d6 100644
--- a/arch/riscv/kernel/vmlinux.lds.S
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -85,11 +85,11 @@ SECTIONS
INIT_DATA_SECTION(16)
.init.pi : {
- *(.init.pi*)
+ KEEP(*(.init.pi*))
}
.init.bss : {
- *(.init.bss) /* from the EFI stub */
+ KEEP(*(.init.bss*)) /* from the EFI stub */
}
.exit.data :
{
@@ -112,7 +112,7 @@ SECTIONS
. = ALIGN(8);
.alternative : {
__alt_start = .;
- *(.alternative)
+ KEEP(*(.alternative))
__alt_end = .;
}
__init_end = .;
--
2.40.1
On 2023/5/24 0:55, Jisheng Zhang wrote:
> From: Zhangjin Wu <[email protected]>
>
> Select CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION for RISC-V, allowing
> the user to enable dead code elimination. In order for this to work,
> ensure that we keep the alternative table by annotating them with KEEP.
>
> This boots well on qemu with both rv32_defconfig & rv64 defconfig, but
> it only shrinks their builds by ~1%, a smaller config is thereforce
> customized to test this feature:
>
> | rv32 | rv64
> --------|------------------------|---------------------
> No DCE | 4460684 | 4893488
> DCE | 3986716 | 4376400
> Shrink | 473968 (~10.6%) | 517088 (~10.5%)
>
> The config used above only reserves necessary options to boot on qemu
> with serial console, more like the size-critical embedded scenes:
>
> - rv64 config: https://pastebin.com/crz82T0s
> - rv32 config: rv64 config + 32-bit.config
>
> Here is Jisheng's original commit-msg:
> When trying to run linux with various opensource riscv core on
> resource limited FPGA platforms, for example, those FPGAs with less
> than 16MB SDRAM, I want to save mem as much as possible. One of the
> major technologies is kernel size optimizations, I found that riscv
> does not currently support HAVE_LD_DEAD_CODE_DATA_ELIMINATION, which
> passes -fdata-sections, -ffunction-sections to CFLAGS and passes the
> --gc-sections flag to the linker.
>
> This not only benefits my case on FPGA but also benefits defconfigs.
> Here are some notable improvements from enabling this with defconfigs:
>
> nommu_k210_defconfig:
> text data bss dec hex
> 1112009 410288 59837 1582134 182436 before
> 962838 376656 51285 1390779 1538bb after
>
> rv32_defconfig:
> text data bss dec hex
> 8804455 2816544 290577 11911576 b5c198 before
> 8692295 2779872 288977 11761144 b375f8 after
>
> defconfig:
> text data bss dec hex
> 9438267 3391332 485333 13314932 cb2b74 before
> 9285914 3350052 483349 13119315 c82f53 after
>
> Signed-off-by: Zhangjin Wu <[email protected]>
> Co-developed-by: Jisheng Zhang <[email protected]>
> Signed-off-by: Jisheng Zhang <[email protected]>
> Reviewed-by: Guo Ren <[email protected]>
> Tested-by: Bin Meng <[email protected]>
Reviewed-by: Kefeng Wang <[email protected]>
> ---
> arch/riscv/Kconfig | 1 +
> arch/riscv/kernel/vmlinux.lds.S | 6 +++---
> 2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 8f55aa4aae34..62e84fee2cfd 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -115,6 +115,7 @@ config RISCV
> select HAVE_KPROBES if !XIP_KERNEL
> select HAVE_KPROBES_ON_FTRACE if !XIP_KERNEL
> select HAVE_KRETPROBES if !XIP_KERNEL
> + select HAVE_LD_DEAD_CODE_DATA_ELIMINATION
> select HAVE_MOVE_PMD
> select HAVE_MOVE_PUD
> select HAVE_PCI
> diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
> index e5f9f4677bbf..492dd4b8f3d6 100644
> --- a/arch/riscv/kernel/vmlinux.lds.S
> +++ b/arch/riscv/kernel/vmlinux.lds.S
> @@ -85,11 +85,11 @@ SECTIONS
> INIT_DATA_SECTION(16)
>
> .init.pi : {
> - *(.init.pi*)
> + KEEP(*(.init.pi*))
> }
>
> .init.bss : {
> - *(.init.bss) /* from the EFI stub */
> + KEEP(*(.init.bss*)) /* from the EFI stub */
> }
> .exit.data :
> {
> @@ -112,7 +112,7 @@ SECTIONS
> . = ALIGN(8);
> .alternative : {
> __alt_start = .;
> - *(.alternative)
> + KEEP(*(.alternative))
> __alt_end = .;
> }
> __init_end = .;
On Wed, May 24, 2023 at 12:54:59AM +0800, Jisheng Zhang wrote:
> Recently, some commits break the entries order. Properly move their
> locations to keep entries sorted.
>
> Signed-off-by: Jisheng Zhang <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Thanks,
Conor.
Reviewed-by: Guo Ren <[email protected]>
On Wed, May 24, 2023 at 1:10 AM Jisheng Zhang <[email protected]> wrote:
>
> ALTERNATIVE mechanism can't work on XIP, and this is also reflected by
> below Kconfig dependency:
>
> RISCV_ALTERNATIVE
> ...
> depends on !XIP_KERNEL
> ...
>
> So there's no .alternative section at all for XIP case, remove it.
>
> Signed-off-by: Jisheng Zhang <[email protected]>
> Reviewed-by: Conor Dooley <[email protected]>
> ---
> arch/riscv/kernel/vmlinux-xip.lds.S | 6 ------
> 1 files changed, 6 deletions(-)
>
> diff --git a/arch/riscv/kernel/vmlinux-xip.lds.S b/arch/riscv/kernel/vmlinux-xip.lds.S
> index eab9edc3b631..50767647fbc6 100644
> --- a/arch/riscv/kernel/vmlinux-xip.lds.S
> +++ b/arch/riscv/kernel/vmlinux-xip.lds.S
> @@ -98,12 +98,6 @@ SECTIONS
> __soc_builtin_dtb_table_end = .;
> }
>
> - . = ALIGN(8);
> - .alternative : {
> - __alt_start = .;
> - *(.alternative)
> - __alt_end = .;
> - }
> __init_end = .;
>
> . = ALIGN(16);
> --
> 2.40.1
>
--
Best Regards
Guo Ren
On Wed, May 24, 2023 at 1:10 AM Jisheng Zhang <[email protected]> wrote:
>
> Recently, some commits break the entries order. Properly move their
> locations to keep entries sorted.
Acked-by: Guo Ren <[email protected]>
>
> Signed-off-by: Jisheng Zhang <[email protected]>
> ---
> arch/riscv/Kconfig | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 348c0fa1fc8c..8f55aa4aae34 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -101,6 +101,11 @@ config RISCV
> select HAVE_CONTEXT_TRACKING_USER
> select HAVE_DEBUG_KMEMLEAK
> select HAVE_DMA_CONTIGUOUS if MMU
> + select HAVE_DYNAMIC_FTRACE if !XIP_KERNEL && MMU && (CLANG_SUPPORTS_DYNAMIC_FTRACE || GCC_SUPPORTS_DYNAMIC_FTRACE)
> + select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
> + select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
> + select HAVE_FUNCTION_GRAPH_TRACER
> + select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !PREEMPTION
> select HAVE_EBPF_JIT if MMU
> select HAVE_FUNCTION_ARG_ACCESS_API
> select HAVE_FUNCTION_ERROR_INJECTION
> @@ -110,7 +115,6 @@ config RISCV
> select HAVE_KPROBES if !XIP_KERNEL
> select HAVE_KPROBES_ON_FTRACE if !XIP_KERNEL
> select HAVE_KRETPROBES if !XIP_KERNEL
> - select HAVE_RETHOOK if !XIP_KERNEL
> select HAVE_MOVE_PMD
> select HAVE_MOVE_PUD
> select HAVE_PCI
> @@ -119,6 +123,7 @@ config RISCV
> select HAVE_PERF_USER_STACK_DUMP
> select HAVE_POSIX_CPU_TIMERS_TASK_WORK
> select HAVE_REGS_AND_STACK_ACCESS_API
> + select HAVE_RETHOOK if !XIP_KERNEL
> select HAVE_RSEQ
> select HAVE_STACKPROTECTOR
> select HAVE_SYSCALL_TRACEPOINTS
> @@ -142,11 +147,6 @@ config RISCV
> select TRACE_IRQFLAGS_SUPPORT
> select UACCESS_MEMCPY if !MMU
> select ZONE_DMA32 if 64BIT
> - select HAVE_DYNAMIC_FTRACE if !XIP_KERNEL && MMU && (CLANG_SUPPORTS_DYNAMIC_FTRACE || GCC_SUPPORTS_DYNAMIC_FTRACE)
> - select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
> - select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
> - select HAVE_FUNCTION_GRAPH_TRACER
> - select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !PREEMPTION
>
> config CLANG_SUPPORTS_DYNAMIC_FTRACE
> def_bool CC_IS_CLANG
> --
> 2.40.1
>
--
Best Regards
Guo Ren
On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
> When trying to run linux with various opensource riscv core on
> resource limited FPGA platforms, for example, those FPGAs with less
> than 16MB SDRAM, I want to save mem as much as possible. One of the
> major technologies is kernel size optimizations, I found that riscv
> does not currently support HAVE_LD_DEAD_CODE_DATA_ELIMINATION, which
> passes -fdata-sections, -ffunction-sections to CFLAGS and passes the
> --gc-sections flag to the linker.
>
> This not only benefits my case on FPGA but also benefits defconfigs.
> Here are some notable improvements from enabling this with defconfigs:
>
> nommu_k210_defconfig:
> text data bss dec hex
> 1112009 410288 59837 1582134 182436 before
> 962838 376656 51285 1390779 1538bb after
>
> rv32_defconfig:
> text data bss dec hex
> 8804455 2816544 290577 11911576 b5c198 before
> 8692295 2779872 288977 11761144 b375f8 after
>
> defconfig:
> text data bss dec hex
> 9438267 3391332 485333 13314932 cb2b74 before
> 9285914 3350052 483349 13119315 c82f53 after
>
> patch1 and patch2 are clean ups.
> patch3 fixes a typo.
> patch4 finally enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION for riscv.
>
> NOTE: Zhangjin Wu firstly sent out a patch to enable dead code
> elimination for riscv several months ago, I didn't notice it until
> yesterday. Although it missed some preparations and some sections's
> keeping, he is the first person to enable this feature for riscv. To
> ease merging, this series take his patch into my entire series and
> makes patch4 authored by him after getting his ack to reflect
> the above fact.
>
> Since v1:
> - collect Reviewed-by, Tested-by tag
> - Make patch4 authored by Zhangjin Wu, add my co-developed-by tag
>
> Jisheng Zhang (3):
> riscv: move options to keep entries sorted
> riscv: vmlinux-xip.lds.S: remove .alternative section
> vmlinux.lds.h: use correct .init.data.* section name
>
> Zhangjin Wu (1):
> riscv: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
>
> arch/riscv/Kconfig | 13 +-
> arch/riscv/kernel/vmlinux-xip.lds.S | 6 -
> arch/riscv/kernel/vmlinux.lds.S | 6 +-
> include/asm-generic/vmlinux.lds.h | 2 +-
> 4 files changed, 11 insertions(+), 16 deletions(-)
Do you have a base commit for this? It's not applying to 6.4-rc1 and
the patchwork bot couldn't find one either.
On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
> > When trying to run linux with various opensource riscv core on
> > resource limited FPGA platforms, for example, those FPGAs with less
> > than 16MB SDRAM, I want to save mem as much as possible. One of the
> > major technologies is kernel size optimizations, I found that riscv
> > does not currently support HAVE_LD_DEAD_CODE_DATA_ELIMINATION, which
> > passes -fdata-sections, -ffunction-sections to CFLAGS and passes the
> > --gc-sections flag to the linker.
> >
> > This not only benefits my case on FPGA but also benefits defconfigs.
> > Here are some notable improvements from enabling this with defconfigs:
> >
> > nommu_k210_defconfig:
> > text data bss dec hex
> > 1112009 410288 59837 1582134 182436 before
> > 962838 376656 51285 1390779 1538bb after
> >
> > rv32_defconfig:
> > text data bss dec hex
> > 8804455 2816544 290577 11911576 b5c198 before
> > 8692295 2779872 288977 11761144 b375f8 after
> >
> > defconfig:
> > text data bss dec hex
> > 9438267 3391332 485333 13314932 cb2b74 before
> > 9285914 3350052 483349 13119315 c82f53 after
> >
> > patch1 and patch2 are clean ups.
> > patch3 fixes a typo.
> > patch4 finally enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION for riscv.
> >
> > NOTE: Zhangjin Wu firstly sent out a patch to enable dead code
> > elimination for riscv several months ago, I didn't notice it until
> > yesterday. Although it missed some preparations and some sections's
> > keeping, he is the first person to enable this feature for riscv. To
> > ease merging, this series take his patch into my entire series and
> > makes patch4 authored by him after getting his ack to reflect
> > the above fact.
> >
> > Since v1:
> > - collect Reviewed-by, Tested-by tag
> > - Make patch4 authored by Zhangjin Wu, add my co-developed-by tag
> >
> > Jisheng Zhang (3):
> > riscv: move options to keep entries sorted
> > riscv: vmlinux-xip.lds.S: remove .alternative section
> > vmlinux.lds.h: use correct .init.data.* section name
> >
> > Zhangjin Wu (1):
> > riscv: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
> >
> > arch/riscv/Kconfig | 13 +-
> > arch/riscv/kernel/vmlinux-xip.lds.S | 6 -
> > arch/riscv/kernel/vmlinux.lds.S | 6 +-
> > include/asm-generic/vmlinux.lds.h | 2 +-
> > 4 files changed, 11 insertions(+), 16 deletions(-)
>
> Do you have a base commit for this? It's not applying to 6.4-rc1 and the
> patchwork bot couldn't find one either.
Hi Palmer,
Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
series is based on 6.4-rc2.
Thanks
On Wed, 14 Jun 2023 09:25:49 PDT (-0700), [email protected] wrote:
>
> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
>> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
>> > When trying to run linux with various opensource riscv core on
>> > resource limited FPGA platforms, for example, those FPGAs with less
>> > than 16MB SDRAM, I want to save mem as much as possible. One of the
>> > major technologies is kernel size optimizations, I found that riscv
>> > does not currently support HAVE_LD_DEAD_CODE_DATA_ELIMINATION, which
>> > passes -fdata-sections, -ffunction-sections to CFLAGS and passes the
>> > --gc-sections flag to the linker.
>> >
>> > This not only benefits my case on FPGA but also benefits defconfigs.
>> > Here are some notable improvements from enabling this with defconfigs:
>> >
>> > nommu_k210_defconfig:
>> > text data bss dec hex
>> > 1112009 410288 59837 1582134 182436 before
>> > 962838 376656 51285 1390779 1538bb after
>> >
>> > rv32_defconfig:
>> > text data bss dec hex
>> > 8804455 2816544 290577 11911576 b5c198 before
>> > 8692295 2779872 288977 11761144 b375f8 after
>> >
>> > defconfig:
>> > text data bss dec hex
>> > 9438267 3391332 485333 13314932 cb2b74 before
>> > 9285914 3350052 483349 13119315 c82f53 after
>> >
>> > patch1 and patch2 are clean ups.
>> > patch3 fixes a typo.
>> > patch4 finally enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION for riscv.
>> >
>> > NOTE: Zhangjin Wu firstly sent out a patch to enable dead code
>> > elimination for riscv several months ago, I didn't notice it until
>> > yesterday. Although it missed some preparations and some sections's
>> > keeping, he is the first person to enable this feature for riscv. To
>> > ease merging, this series take his patch into my entire series and
>> > makes patch4 authored by him after getting his ack to reflect
>> > the above fact.
>> >
>> > Since v1:
>> > - collect Reviewed-by, Tested-by tag
>> > - Make patch4 authored by Zhangjin Wu, add my co-developed-by tag
>> >
>> > Jisheng Zhang (3):
>> > riscv: move options to keep entries sorted
>> > riscv: vmlinux-xip.lds.S: remove .alternative section
>> > vmlinux.lds.h: use correct .init.data.* section name
>> >
>> > Zhangjin Wu (1):
>> > riscv: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
>> >
>> > arch/riscv/Kconfig | 13 +-
>> > arch/riscv/kernel/vmlinux-xip.lds.S | 6 -
>> > arch/riscv/kernel/vmlinux.lds.S | 6 +-
>> > include/asm-generic/vmlinux.lds.h | 2 +-
>> > 4 files changed, 11 insertions(+), 16 deletions(-)
>>
>> Do you have a base commit for this? It's not applying to 6.4-rc1 and the
>> patchwork bot couldn't find one either.
>
> Hi Palmer,
>
> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
> series is based on 6.4-rc2.
Thanks.
>
> Thanks
On Thu, 15 Jun 2023 06:54:33 PDT (-0700), Palmer Dabbelt wrote:
> On Wed, 14 Jun 2023 09:25:49 PDT (-0700), [email protected] wrote:
>>
>> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
>>> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
>>> > When trying to run linux with various opensource riscv core on
>>> > resource limited FPGA platforms, for example, those FPGAs with less
>>> > than 16MB SDRAM, I want to save mem as much as possible. One of the
>>> > major technologies is kernel size optimizations, I found that riscv
>>> > does not currently support HAVE_LD_DEAD_CODE_DATA_ELIMINATION, which
>>> > passes -fdata-sections, -ffunction-sections to CFLAGS and passes the
>>> > --gc-sections flag to the linker.
>>> >
>>> > This not only benefits my case on FPGA but also benefits defconfigs.
>>> > Here are some notable improvements from enabling this with defconfigs:
>>> >
>>> > nommu_k210_defconfig:
>>> > text data bss dec hex
>>> > 1112009 410288 59837 1582134 182436 before
>>> > 962838 376656 51285 1390779 1538bb after
>>> >
>>> > rv32_defconfig:
>>> > text data bss dec hex
>>> > 8804455 2816544 290577 11911576 b5c198 before
>>> > 8692295 2779872 288977 11761144 b375f8 after
>>> >
>>> > defconfig:
>>> > text data bss dec hex
>>> > 9438267 3391332 485333 13314932 cb2b74 before
>>> > 9285914 3350052 483349 13119315 c82f53 after
>>> >
>>> > patch1 and patch2 are clean ups.
>>> > patch3 fixes a typo.
>>> > patch4 finally enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION for riscv.
>>> >
>>> > NOTE: Zhangjin Wu firstly sent out a patch to enable dead code
>>> > elimination for riscv several months ago, I didn't notice it until
>>> > yesterday. Although it missed some preparations and some sections's
>>> > keeping, he is the first person to enable this feature for riscv. To
>>> > ease merging, this series take his patch into my entire series and
>>> > makes patch4 authored by him after getting his ack to reflect
>>> > the above fact.
>>> >
>>> > Since v1:
>>> > - collect Reviewed-by, Tested-by tag
>>> > - Make patch4 authored by Zhangjin Wu, add my co-developed-by tag
>>> >
>>> > Jisheng Zhang (3):
>>> > riscv: move options to keep entries sorted
>>> > riscv: vmlinux-xip.lds.S: remove .alternative section
>>> > vmlinux.lds.h: use correct .init.data.* section name
>>> >
>>> > Zhangjin Wu (1):
>>> > riscv: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
>>> >
>>> > arch/riscv/Kconfig | 13 +-
>>> > arch/riscv/kernel/vmlinux-xip.lds.S | 6 -
>>> > arch/riscv/kernel/vmlinux.lds.S | 6 +-
>>> > include/asm-generic/vmlinux.lds.h | 2 +-
>>> > 4 files changed, 11 insertions(+), 16 deletions(-)
>>>
>>> Do you have a base commit for this? It's not applying to 6.4-rc1 and the
>>> patchwork bot couldn't find one either.
>>
>> Hi Palmer,
>>
>> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
>> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
>> series is based on 6.4-rc2.
>
> Thanks.
Sorry to be so slow here, but I think this is causing LLD to hang on
allmodconfig. I'm still getting to the bottom of it; there are a few
other things I have in flight as well.
>
>>
>> Thanks
On Mon, Jun 19, 2023 at 6:06 PM Palmer Dabbelt <[email protected]> wrote:
>
> On Thu, 15 Jun 2023 06:54:33 PDT (-0700), Palmer Dabbelt wrote:
> > On Wed, 14 Jun 2023 09:25:49 PDT (-0700), [email protected] wrote:
> >>
> >> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
> >>> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
> >>> > When trying to run linux with various opensource riscv core on
> >>> > resource limited FPGA platforms, for example, those FPGAs with less
> >>> > than 16MB SDRAM, I want to save mem as much as possible. One of the
> >>> > major technologies is kernel size optimizations, I found that riscv
> >>> > does not currently support HAVE_LD_DEAD_CODE_DATA_ELIMINATION, which
> >>> > passes -fdata-sections, -ffunction-sections to CFLAGS and passes the
> >>> > --gc-sections flag to the linker.
> >>> >
> >>> > This not only benefits my case on FPGA but also benefits defconfigs.
> >>> > Here are some notable improvements from enabling this with defconfigs:
> >>> >
> >>> > nommu_k210_defconfig:
> >>> > text data bss dec hex
> >>> > 1112009 410288 59837 1582134 182436 before
> >>> > 962838 376656 51285 1390779 1538bb after
> >>> >
> >>> > rv32_defconfig:
> >>> > text data bss dec hex
> >>> > 8804455 2816544 290577 11911576 b5c198 before
> >>> > 8692295 2779872 288977 11761144 b375f8 after
> >>> >
> >>> > defconfig:
> >>> > text data bss dec hex
> >>> > 9438267 3391332 485333 13314932 cb2b74 before
> >>> > 9285914 3350052 483349 13119315 c82f53 after
> >>> >
> >>> > patch1 and patch2 are clean ups.
> >>> > patch3 fixes a typo.
> >>> > patch4 finally enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION for riscv.
> >>> >
> >>> > NOTE: Zhangjin Wu firstly sent out a patch to enable dead code
> >>> > elimination for riscv several months ago, I didn't notice it until
> >>> > yesterday. Although it missed some preparations and some sections's
> >>> > keeping, he is the first person to enable this feature for riscv. To
> >>> > ease merging, this series take his patch into my entire series and
> >>> > makes patch4 authored by him after getting his ack to reflect
> >>> > the above fact.
> >>> >
> >>> > Since v1:
> >>> > - collect Reviewed-by, Tested-by tag
> >>> > - Make patch4 authored by Zhangjin Wu, add my co-developed-by tag
> >>> >
> >>> > Jisheng Zhang (3):
> >>> > riscv: move options to keep entries sorted
> >>> > riscv: vmlinux-xip.lds.S: remove .alternative section
> >>> > vmlinux.lds.h: use correct .init.data.* section name
> >>> >
> >>> > Zhangjin Wu (1):
> >>> > riscv: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
> >>> >
> >>> > arch/riscv/Kconfig | 13 +-
> >>> > arch/riscv/kernel/vmlinux-xip.lds.S | 6 -
> >>> > arch/riscv/kernel/vmlinux.lds.S | 6 +-
> >>> > include/asm-generic/vmlinux.lds.h | 2 +-
> >>> > 4 files changed, 11 insertions(+), 16 deletions(-)
> >>>
> >>> Do you have a base commit for this? It's not applying to 6.4-rc1 and the
> >>> patchwork bot couldn't find one either.
> >>
> >> Hi Palmer,
> >>
> >> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
> >> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
> >> series is based on 6.4-rc2.
> >
> > Thanks.
>
> Sorry to be so slow here, but I think this is causing LLD to hang on
> allmodconfig. I'm still getting to the bottom of it, there's a few
> other things I have in flight still.
Confirmed with v3 on mainline (linux-next is pretty red at the moment).
https://lore.kernel.org/linux-riscv/[email protected]/
I was able to dump a backtrace of all of LLD's threads and all threads
seemed parked in a futex wait except for one thread with a more
interesting trace.
0x0000555557ea01ce in
lld::elf::LinkerScript::addOrphanSections()::$_0::operator()(lld::elf::InputSectionBase*)
const ()
(gdb) bt
#0 0x0000555557ea01ce in
lld::elf::LinkerScript::addOrphanSections()::$_0::operator()(lld::elf::InputSectionBase*)
const ()
#1 0x0000555557e9fc3f in lld::elf::LinkerScript::addOrphanSections() ()
#2 0x0000555557dd0ca1 in
lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) ()
#3 0x0000555557dc19a8 in
lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) ()
#4 0x0000555557dbfff9 in lld::elf::link(llvm::ArrayRef<char const*>,
llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) ()
#5 0x0000555557c3ffcf in lldMain(int, char const**,
llvm::raw_ostream&, llvm::raw_ostream&, bool) ()
#6 0x0000555557c3f7aa in lld_main(int, char**, llvm::ToolContext const&) ()
#7 0x0000555557c41ee1 in main ()
Makes me wonder if there's some kind of loop adding orphan sections
that aren't referenced, so they're cleaned up.
Though I don't think it's a hang; IIRC dead code elimination adds a
measurable amount of time to the build. As code is unreferenced and
removed, I think the linker is reshuffling layout and thus recomputing
relocations.
Though triple checking mainline without this patch vs mainline with
this patch, twice now I just got an error from LLD (in 2 minutes on my
system):
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(efi-stub-entry.stub.o):(.init.bss.screen_info_offset)
is being placed in '.init.bss.screen_info_offset'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(efi-stub-helper.stub.o):(.init.data.efi_nokaslr)
is being placed in '.init.data.efi_nokaslr'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(efi-stub-helper.stub.o):(.init.bss.efi_noinitrd)
is being placed in '.init.bss.efi_noinitrd'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(efi-stub-helper.stub.o):(.init.bss.efi_nochunk)
is being placed in '.init.bss.efi_nochunk'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(efi-stub-helper.stub.o):(.init.bss.efi_novamap)
is being placed in '.init.bss.efi_novamap'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(efi-stub-helper.stub.o):(.init.bss.efi_disable_pci_dma)
is being placed in '.init.bss.efi_disable_pci_dma'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(file.stub.o):(.init.bss.efi_open_device_path.text_to_dp)
is being placed in '.init.bss.efi_open_device_path.text_to_dp'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(gop.stub.o):(.init.bss.cmdline.0)
is being placed in '.init.bss.cmdline.0'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(gop.stub.o):(.init.bss.cmdline.1)
is being placed in '.init.bss.cmdline.1'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(gop.stub.o):(.init.bss.cmdline.2)
is being placed in '.init.bss.cmdline.2'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(gop.stub.o):(.init.bss.cmdline.3)
is being placed in '.init.bss.cmdline.3'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(gop.stub.o):(.init.bss.cmdline.4)
is being placed in '.init.bss.cmdline.4'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(printk.stub.o):(.init.data.efi_loglevel)
is being placed in '.init.data.efi_loglevel'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(riscv.stub.o):(.init.bss.hartid)
is being placed in '.init.bss.hartid'
ld.lld: error: ./drivers/firmware/efi/libstub/lib.a(systable.stub.o):(.init.bss.efi_system_table)
is being placed in '.init.bss.efi_system_table'
Is it perhaps that these sections need placement in the linker script?
These errors come from the linker's orphan-section handling flag
(--orphan-handling).
Does the EFI stub have one linker script, or one per arch? (Or am I
mistaken, and the EFI stub is part of vmlinux?)
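To see at a glance which input sections the linker is complaining
about, a small sketch like the one below can group them by object file
and reduce them to candidate glob prefixes. The function names are
hypothetical, and the regular expression has only been matched against
the error format shown above. Whether a linker-script entry is actually
the right fix is still the open question here.

```python
import re
from collections import defaultdict

# Matches "...(obj.o):(.section.name)" followed by
# "is being placed in '.section.name'", even when the message is
# wrapped across two lines as in the log above.
ORPHAN_RE = re.compile(
    r"\(([^()\s]+)\):\((\.[^()\s]+)\)\s+is being placed in '([^']+)'"
)


def orphan_sections(log: str) -> dict[str, list[str]]:
    """Group orphan input sections by the object file that provides them."""
    by_object = defaultdict(list)
    for obj, section, _placed_in in ORPHAN_RE.findall(log):
        by_object[obj].append(section)
    return dict(by_object)


def candidate_globs(sections) -> list[str]:
    """Reduce names like '.init.bss.hartid' to patterns like '.init.bss.*'."""
    prefixes = {".".join(name.split(".")[:3]) for name in sections}
    return sorted(prefix + ".*" for prefix in prefixes)
```

Fed the errors above, this should boil down to just two patterns,
.init.bss.* and .init.data.*, which is a much shorter list to check
against the linker script than the individual symbols.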
>
> >
> >>
> >> Thanks
>
--
Thanks,
~Nick Desaulniers
On Tue, Jun 20, 2023 at 04:05:55PM -0400, Nick Desaulniers wrote:
> On Mon, Jun 19, 2023 at 6:06 PM Palmer Dabbelt <[email protected]> wrote:
> > On Thu, 15 Jun 2023 06:54:33 PDT (-0700), Palmer Dabbelt wrote:
> > > On Wed, 14 Jun 2023 09:25:49 PDT (-0700), [email protected] wrote:
> > >> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
> > >>> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
> > >> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
> > >> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
> > >> series is based on 6.4-rc2.
> > >
> > > Thanks.
> >
> > Sorry to be so slow here, but I think this is causing LLD to hang on
> > allmodconfig. I'm still getting to the bottom of it, there's a few
> > other things I have in flight still.
>
> Confirmed with v3 on mainline (linux-next is pretty red at the moment).
> https://lore.kernel.org/linux-riscv/[email protected]/
Just FYI Nick, there's been some concurrent work here from different
people working on the same thing & the v3 you linked (from Zhangjin) was
superseded by this v2 (from Jisheng).
Cheers,
Conor.
On Tue, Jun 20, 2023 at 4:13 PM Conor Dooley <[email protected]> wrote:
>
> On Tue, Jun 20, 2023 at 04:05:55PM -0400, Nick Desaulniers wrote:
> > On Mon, Jun 19, 2023 at 6:06 PM Palmer Dabbelt <[email protected]> wrote:
> > > On Thu, 15 Jun 2023 06:54:33 PDT (-0700), Palmer Dabbelt wrote:
> > > > On Wed, 14 Jun 2023 09:25:49 PDT (-0700), [email protected] wrote:
> > > >> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
> > > >>> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
>
> > > >> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
> > > >> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
> > > >> series is based on 6.4-rc2.
> > > >
> > > > Thanks.
> > >
> > > Sorry to be so slow here, but I think this is causing LLD to hang on
> > > allmodconfig. I'm still getting to the bottom of it, there's a few
> > > other things I have in flight still.
> >
> > Confirmed with v3 on mainline (linux-next is pretty red at the moment).
> > https://lore.kernel.org/linux-riscv/[email protected]/
>
> Just FYI Nick, there's been some concurrent work here from different
> people working on the same thing & the v3 you linked (from Zhangjin) was
> superseded by this v2 (from Jisheng).
Ah! I've been testing the superseded patch set. Sorry, I just searched
lore for "dead code" on linux-riscv and grabbed the first thread,
without noticing the different authors or that the version numbers
belong to distinct series. OK, never mind my noise. I'll follow up
with the correct patch set, sorry!
>
> Cheers,
> Conor.
--
Thanks,
~Nick Desaulniers
On Tue, Jun 20, 2023 at 4:41 PM Palmer Dabbelt <[email protected]> wrote:
>
> On Tue, 20 Jun 2023 13:32:32 PDT (-0700), [email protected] wrote:
> > On Tue, Jun 20, 2023 at 4:13 PM Conor Dooley <[email protected]> wrote:
> >>
> >> On Tue, Jun 20, 2023 at 04:05:55PM -0400, Nick Desaulniers wrote:
> >> > On Mon, Jun 19, 2023 at 6:06 PM Palmer Dabbelt <[email protected]> wrote:
> >> > > On Thu, 15 Jun 2023 06:54:33 PDT (-0700), Palmer Dabbelt wrote:
> >> > > > On Wed, 14 Jun 2023 09:25:49 PDT (-0700), [email protected] wrote:
> >> > > >> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
> >> > > >>> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
> >>
> >> > > >> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
> >> > > >> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
> >> > > >> series is based on 6.4-rc2.
> >> > > >
> >> > > > Thanks.
> >> > >
> >> > > Sorry to be so slow here, but I think this is causing LLD to hang on
> >> > > allmodconfig. I'm still getting to the bottom of it, there's a few
> >> > > other things I have in flight still.
> >> >
> >> > Confirmed with v3 on mainline (linux-next is pretty red at the moment).
> >> > https://lore.kernel.org/linux-riscv/[email protected]/
> >>
> >> Just FYI Nick, there's been some concurrent work here from different
> >> people working on the same thing & the v3 you linked (from Zhangjin) was
> >> superseded by this v2 (from Jisheng).
> >
> > Ah! I've been testing the deprecated patch set, sorry I just looked on
> > lore for "dead code" on riscv-linux and grabbed the first thread,
> > without noticing the difference in authors or new version numbers for
> > distinct series. ok, nevermind my noise. I'll follow up with the
> > correct patch set, sorry!
>
> Ya, I hadn't even noticed the v3 because I pretty much only look at
> patchwork these days. Like we talked about in IRC, I'm going to go test
> the merge of this one and see what's up -- I've got it staged at
> <https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=for-next&id=1bd2963b21758a773206a1cb67c93e7a8ae8a195>,
> though that won't be a stable hash if it's actually broken...
Ok, https://lore.kernel.org/linux-riscv/[email protected]/
built for me. If you're seeing a hang, please let me know what
version of LLD you're using and I'll build that tag from source to see
if I can reproduce, then bisect if so.
$ ARCH=riscv LLVM=1 /usr/bin/time -v make -j128 allmodconfig vmlinux
...
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:35.68
...
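For comparing runs across toolchains, it can help to turn that elapsed
line into plain seconds. A minimal sketch, assuming the GNU time -v
output format shown above (h:mm:ss or m:ss); the function name
elapsed_seconds is made up for illustration.

```python
import re

# GNU time -v prints the wall-clock time as either h:mm:ss or m:ss.ss.
ELAPSED_RE = re.compile(
    r"Elapsed \(wall clock\) time \(h:mm:ss or m:ss\): (\S+)"
)


def elapsed_seconds(time_output: str) -> float:
    """Return the wall-clock time from `/usr/bin/time -v` output, in seconds."""
    match = ELAPSED_RE.search(time_output)
    if match is None:
        raise ValueError("no elapsed-time line found")
    seconds = 0.0
    # Fold the colon-separated fields left to right: each step shifts
    # the running total up by a factor of 60.
    for field in match.group(1).split(":"):
        seconds = seconds * 60 + float(field)
    return seconds
```

The 2:35.68 above works out to roughly 155.7 seconds for the whole
allmodconfig build, which is the number to hold the hour-long links
against.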
Tested-by: Nick Desaulniers <[email protected]> # build
>
> >
> >>
> >> Cheers,
> >> Conor.
> >
> >
> >
> > --
> > Thanks,
> > ~Nick Desaulniers
--
Thanks,
~Nick Desaulniers
On Tue, 20 Jun 2023 13:47:07 PDT (-0700), [email protected] wrote:
> On Tue, Jun 20, 2023 at 4:41 PM Palmer Dabbelt <[email protected]> wrote:
>>
>> On Tue, 20 Jun 2023 13:32:32 PDT (-0700), [email protected] wrote:
>> > On Tue, Jun 20, 2023 at 4:13 PM Conor Dooley <[email protected]> wrote:
>> >>
>> >> On Tue, Jun 20, 2023 at 04:05:55PM -0400, Nick Desaulniers wrote:
>> >> > On Mon, Jun 19, 2023 at 6:06 PM Palmer Dabbelt <[email protected]> wrote:
>> >> > > On Thu, 15 Jun 2023 06:54:33 PDT (-0700), Palmer Dabbelt wrote:
>> >> > > > On Wed, 14 Jun 2023 09:25:49 PDT (-0700), [email protected] wrote:
>> >> > > >> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
>> >> > > >>> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
>> >>
>> >> > > >> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
>> >> > > >> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
>> >> > > >> series is based on 6.4-rc2.
>> >> > > >
>> >> > > > Thanks.
>> >> > >
>> >> > > Sorry to be so slow here, but I think this is causing LLD to hang on
>> >> > > allmodconfig. I'm still getting to the bottom of it, there's a few
>> >> > > other things I have in flight still.
>> >> >
>> >> > Confirmed with v3 on mainline (linux-next is pretty red at the moment).
>> >> > https://lore.kernel.org/linux-riscv/[email protected]/
>> >>
>> >> Just FYI Nick, there's been some concurrent work here from different
>> >> people working on the same thing & the v3 you linked (from Zhangjin) was
>> >> superseded by this v2 (from Jisheng).
>> >
>> > Ah! I've been testing the deprecated patch set, sorry I just looked on
>> > lore for "dead code" on riscv-linux and grabbed the first thread,
>> > without noticing the difference in authors or new version numbers for
>> > distinct series. ok, nevermind my noise. I'll follow up with the
>> > correct patch set, sorry!
>>
>> Ya, I hadn't even noticed the v3 because I pretty much only look at
>> patchwork these days. Like we talked about in IRC, I'm going to go test
>> the merge of this one and see what's up -- I've got it staged at
>> <https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=for-next&id=1bd2963b21758a773206a1cb67c93e7a8ae8a195>,
>> though that won't be a stable hash if it's actually broken...
>
> Ok, https://lore.kernel.org/linux-riscv/[email protected]/
> built for me. If you're seeing a hang, please let me know what
> version of LLD you're using and I'll build that tag from source to see
> if I can reproduce, then bisect if so.
>
> $ ARCH=riscv LLVM=1 /usr/bin/time -v make -j128 allmodconfig vmlinux
> ...
> Elapsed (wall clock) time (h:mm:ss or m:ss): 2:35.68
> ...
>
> Tested-by: Nick Desaulniers <[email protected]> # build
OK, it triggered enough of a rebuild that it might take a bit for
anything to filter out.
Thanks!
>
>>
>> >
>> >>
>> >> Cheers,
>> >> Conor.
>> >
>> >
>> >
>> > --
>> > Thanks,
>> > ~Nick Desaulniers
>
>
>
> --
> Thanks,
> ~Nick Desaulniers
On Tue, 20 Jun 2023 13:32:32 PDT (-0700), [email protected] wrote:
> On Tue, Jun 20, 2023 at 4:13 PM Conor Dooley <[email protected]> wrote:
>>
>> On Tue, Jun 20, 2023 at 04:05:55PM -0400, Nick Desaulniers wrote:
>> > On Mon, Jun 19, 2023 at 6:06 PM Palmer Dabbelt <[email protected]> wrote:
>> > > On Thu, 15 Jun 2023 06:54:33 PDT (-0700), Palmer Dabbelt wrote:
>> > > > On Wed, 14 Jun 2023 09:25:49 PDT (-0700), [email protected] wrote:
>> > > >> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
>> > > >>> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
>>
>> > > >> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
>> > > >> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
>> > > >> series is based on 6.4-rc2.
>> > > >
>> > > > Thanks.
>> > >
>> > > Sorry to be so slow here, but I think this is causing LLD to hang on
>> > > allmodconfig. I'm still getting to the bottom of it, there's a few
>> > > other things I have in flight still.
>> >
>> > Confirmed with v3 on mainline (linux-next is pretty red at the moment).
>> > https://lore.kernel.org/linux-riscv/[email protected]/
>>
>> Just FYI Nick, there's been some concurrent work here from different
>> people working on the same thing & the v3 you linked (from Zhangjin) was
>> superseded by this v2 (from Jisheng).
>
> Ah! I've been testing the deprecated patch set, sorry I just looked on
> lore for "dead code" on riscv-linux and grabbed the first thread,
> without noticing the difference in authors or new version numbers for
> distinct series. ok, nevermind my noise. I'll follow up with the
> correct patch set, sorry!
Ya, I hadn't even noticed the v3 because I pretty much only look at
patchwork these days. Like we talked about in IRC, I'm going to go test
the merge of this one and see what's up -- I've got it staged at
<https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=for-next&id=1bd2963b21758a773206a1cb67c93e7a8ae8a195>,
though that won't be a stable hash if it's actually broken...
>
>>
>> Cheers,
>> Conor.
>
>
>
> --
> Thanks,
> ~Nick Desaulniers
On Tue, 20 Jun 2023 14:08:33 PDT (-0700), Palmer Dabbelt wrote:
> On Tue, 20 Jun 2023 13:47:07 PDT (-0700), [email protected] wrote:
>> On Tue, Jun 20, 2023 at 4:41 PM Palmer Dabbelt <[email protected]> wrote:
>>>
>>> On Tue, 20 Jun 2023 13:32:32 PDT (-0700), [email protected] wrote:
>>> > On Tue, Jun 20, 2023 at 4:13 PM Conor Dooley <[email protected]> wrote:
>>> >>
>>> >> On Tue, Jun 20, 2023 at 04:05:55PM -0400, Nick Desaulniers wrote:
>>> >> > On Mon, Jun 19, 2023 at 6:06 PM Palmer Dabbelt <[email protected]> wrote:
>>> >> > > On Thu, 15 Jun 2023 06:54:33 PDT (-0700), Palmer Dabbelt wrote:
>>> >> > > > On Wed, 14 Jun 2023 09:25:49 PDT (-0700), [email protected] wrote:
>>> >> > > >> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
>>> >> > > >>> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
>>> >>
>>> >> > > >> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
>>> >> > > >> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
>>> >> > > >> series is based on 6.4-rc2.
>>> >> > > >
>>> >> > > > Thanks.
>>> >> > >
>>> >> > > Sorry to be so slow here, but I think this is causing LLD to hang on
>>> >> > > allmodconfig. I'm still getting to the bottom of it, there's a few
>>> >> > > other things I have in flight still.
>>> >> >
>>> >> > Confirmed with v3 on mainline (linux-next is pretty red at the moment).
>>> >> > https://lore.kernel.org/linux-riscv/[email protected]/
>>> >>
>>> >> Just FYI Nick, there's been some concurrent work here from different
>>> >> people working on the same thing & the v3 you linked (from Zhangjin) was
>>> >> superseded by this v2 (from Jisheng).
>>> >
>>> > Ah! I've been testing the deprecated patch set, sorry I just looked on
>>> > lore for "dead code" on riscv-linux and grabbed the first thread,
>>> > without noticing the difference in authors or new version numbers for
>>> > distinct series. ok, nevermind my noise. I'll follow up with the
>>> > correct patch set, sorry!
>>>
>>> Ya, I hadn't even noticed the v3 because I pretty much only look at
>>> patchwork these days. Like we talked about in IRC, I'm going to go test
>>> the merge of this one and see what's up -- I've got it staged at
>>> <https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=for-next&id=1bd2963b21758a773206a1cb67c93e7a8ae8a195>,
>>> though that won't be a stable hash if it's actually broken...
>>
>> Ok, https://lore.kernel.org/linux-riscv/[email protected]/
>> built for me. If you're seeing a hang, please let me know what
>> version of LLD you're using and I'll build that tag from source to see
>> if I can reproduce, then bisect if so.
>>
>> $ ARCH=riscv LLVM=1 /usr/bin/time -v make -j128 allmodconfig vmlinux
>> ...
>> Elapsed (wall clock) time (h:mm:ss or m:ss): 2:35.68
>> ...
>>
>> Tested-by: Nick Desaulniers <[email protected]> # build
>
> OK, it triggered enough of a rebuild that it might take a bit for
> anything to filter out.
I'm on LLVM 16.0.2
$ git describe
llvmorg-16.0.2
$ git log | head -n1
commit 18ddebe1a1a9bde349441631365f0472e9693520
that seems to hang for me -- or at least run for an hour without
completing, so I assume it's hung. I'm not wed to 16.0.2, it just
happens to be the last time I bumped the toolchain. I'm moving to
16.0.5 to see if that changes anything.
>
> Thanks!
>
>>
>>>
>>> >
>>> >>
>>> >> Cheers,
>>> >> Conor.
>>> >
>>> >
>>> >
>>> > --
>>> > Thanks,
>>> > ~Nick Desaulniers
>>
>>
>>
>> --
>> Thanks,
>> ~Nick Desaulniers
On Tue, 20 Jun 2023 17:13:17 PDT (-0700), Palmer Dabbelt wrote:
> On Tue, 20 Jun 2023 14:08:33 PDT (-0700), Palmer Dabbelt wrote:
>> On Tue, 20 Jun 2023 13:47:07 PDT (-0700), [email protected] wrote:
>>> On Tue, Jun 20, 2023 at 4:41 PM Palmer Dabbelt <[email protected]> wrote:
>>>>
>>>> On Tue, 20 Jun 2023 13:32:32 PDT (-0700), [email protected] wrote:
>>>> > On Tue, Jun 20, 2023 at 4:13 PM Conor Dooley <[email protected]> wrote:
>>>> >>
>>>> >> On Tue, Jun 20, 2023 at 04:05:55PM -0400, Nick Desaulniers wrote:
>>>> >> > On Mon, Jun 19, 2023 at 6:06 PM Palmer Dabbelt <[email protected]> wrote:
>>>> >> > > On Thu, 15 Jun 2023 06:54:33 PDT (-0700), Palmer Dabbelt wrote:
>>>> >> > > > On Wed, 14 Jun 2023 09:25:49 PDT (-0700), [email protected] wrote:
>>>> >> > > >> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
>>>> >> > > >>> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
>>>> >>
>>>> >> > > >> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
>>>> >> > > >> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
>>>> >> > > >> series is based on 6.4-rc2.
>>>> >> > > >
>>>> >> > > > Thanks.
>>>> >> > >
>>>> >> > > Sorry to be so slow here, but I think this is causing LLD to hang on
>>>> >> > > allmodconfig. I'm still getting to the bottom of it, there's a few
>>>> >> > > other things I have in flight still.
>>>> >> >
>>>> >> > Confirmed with v3 on mainline (linux-next is pretty red at the moment).
>>>> >> > https://lore.kernel.org/linux-riscv/[email protected]/
>>>> >>
>>>> >> Just FYI Nick, there's been some concurrent work here from different
>>>> >> people working on the same thing & the v3 you linked (from Zhangjin) was
>>>> >> superseded by this v2 (from Jisheng).
>>>> >
>>>> > Ah! I've been testing the deprecated patch set, sorry I just looked on
>>>> > lore for "dead code" on riscv-linux and grabbed the first thread,
>>>> > without noticing the difference in authors or new version numbers for
>>>> > distinct series. ok, nevermind my noise. I'll follow up with the
>>>> > correct patch set, sorry!
>>>>
>>>> Ya, I hadn't even noticed the v3 because I pretty much only look at
>>>> patchwork these days. Like we talked about in IRC, I'm going to go test
>>>> the merge of this one and see what's up -- I've got it staged at
>>>> <https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=for-next&id=1bd2963b21758a773206a1cb67c93e7a8ae8a195>,
>>>> though that won't be a stable hash if it's actually broken...
>>>
>>> Ok, https://lore.kernel.org/linux-riscv/[email protected]/
>>> built for me. If you're seeing a hang, please let me know what
>>> version of LLD you're using and I'll build that tag from source to see
>>> if I can reproduce, then bisect if so.
>>>
>>> $ ARCH=riscv LLVM=1 /usr/bin/time -v make -j128 allmodconfig vmlinux
>>> ...
>>> Elapsed (wall clock) time (h:mm:ss or m:ss): 2:35.68
>>> ...
>>>
>>> Tested-by: Nick Desaulniers <[email protected]> # build
>>
>> OK, it triggered enough of a rebuild that it might take a bit for
>> anything to filter out.
>
> I'm on LLVM 16.0.2
>
> $ git describe
> llvmorg-16.0.2
> $ git log | head -n1
> commit 18ddebe1a1a9bde349441631365f0472e9693520
>
> that seems to hang for me -- or at least run for an hour without
> completing, so I assume it's hung. I'm not wed to 16.0.2, it just
> happens to be the last time I bumped the toolchain. I'm moving to
> 16.0.5 to see if that changes anything.
That also takes at least an hour to link. I tried running on LLVM trunk
from last night
$ git log | head -n1
commit 5e9173c43a9b97c8614e36d6f754317f731e71e9
and that completed. Just as a curiosity I tried to re-spin it to see
how long it takes, and it's been running for 23 minutes so far.
So I'm no longer actually sure there's a hang, just something slow.
That's even more of a grey area, but I think it's sane to call a 1-hour
link time a regression -- unless it's expected that this is just very
slow to link?
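One way to make the hang-versus-slow question less of a guess is to run
the link step under an explicit wall-clock budget and record which
bucket it lands in. A minimal sketch, with a hypothetical helper name
and a budget you would have to pick yourself; note that blowing the
budget only says "slower than the budget", it does not prove a hang.

```python
import subprocess
import time


def run_with_budget(cmd: list[str], budget_s: float) -> tuple[str, float]:
    """Run `cmd`, giving up after `budget_s` seconds of wall-clock time.

    Returns (status, elapsed_seconds) where status is "ok", "failed",
    or "over-budget". subprocess.run() kills the child on timeout.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, timeout=budget_s)
    except subprocess.TimeoutExpired:
        return ("over-budget", time.monotonic() - start)
    status = "ok" if proc.returncode == 0 else "failed"
    return (status, time.monotonic() - start)
```

Running the same command with a one-hour budget across several LLD
versions would give a small table of ok/over-budget results instead of
"it's been running for a while".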
>
>>
>> Thanks!
>>
>>>
>>>>
>>>> >
>>>> >>
>>>> >> Cheers,
>>>> >> Conor.
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Thanks,
>>>> > ~Nick Desaulniers
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> ~Nick Desaulniers
On Wed, Jun 21, 2023 at 05:42:08PM +0100, Conor Dooley wrote:
> On Wed, Jun 21, 2023 at 07:53:59AM -0700, Palmer Dabbelt wrote:
> > On Tue, 20 Jun 2023 17:13:17 PDT (-0700), Palmer Dabbelt wrote:
> > > On Tue, 20 Jun 2023 14:08:33 PDT (-0700), Palmer Dabbelt wrote:
> > >> On Tue, 20 Jun 2023 13:47:07 PDT (-0700), [email protected] wrote:
> > >>> On Tue, Jun 20, 2023 at 4:41 PM Palmer Dabbelt <[email protected]> wrote:
> > >>>>
> > >>>> On Tue, 20 Jun 2023 13:32:32 PDT (-0700), [email protected] wrote:
> > >>>> > On Tue, Jun 20, 2023 at 4:13 PM Conor Dooley <[email protected]> wrote:
> > >>>> >>
> > >>>> >> On Tue, Jun 20, 2023 at 04:05:55PM -0400, Nick Desaulniers wrote:
> > >>>> >> > On Mon, Jun 19, 2023 at 6:06 PM Palmer Dabbelt <[email protected]> wrote:
> > >>>> >> > > On Thu, 15 Jun 2023 06:54:33 PDT (-0700), Palmer Dabbelt wrote:
> > >>>> >> > > > On Wed, 14 Jun 2023 09:25:49 PDT (-0700), [email protected] wrote:
> > >>>> >> > > >> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
> > >>>> >> > > >>> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
> > >>>> >>
> > >>>> >> > > >> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
> > >>>> >> > > >> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
> > >>>> >> > > >> series is based on 6.4-rc2.
> > >>>> >> > > >
> > >>>> >> > > > Thanks.
> > >>>> >> > >
> > >>>> >> > > Sorry to be so slow here, but I think this is causing LLD to hang on
> > >>>> >> > > allmodconfig. I'm still getting to the bottom of it, there's a few
> > >>>> >> > > other things I have in flight still.
> > >>>> >> >
> > >>>> >> > Confirmed with v3 on mainline (linux-next is pretty red at the moment).
> > >>>> >> > https://lore.kernel.org/linux-riscv/[email protected]/
> > >>>> >>
> > >>>> >> Just FYI Nick, there's been some concurrent work here from different
> > >>>> >> people working on the same thing & the v3 you linked (from Zhangjin) was
> > >>>> >> superseded by this v2 (from Jisheng).
> > >>>> >
> > >>>> > Ah! I've been testing the deprecated patch set, sorry I just looked on
> > >>>> > lore for "dead code" on riscv-linux and grabbed the first thread,
> > >>>> > without noticing the difference in authors or new version numbers for
> > >>>> > distinct series. ok, nevermind my noise. I'll follow up with the
> > >>>> > correct patch set, sorry!
> > >>>>
> > >>>> Ya, I hadn't even noticed the v3 because I pretty much only look at
> > >>>> patchwork these days. Like we talked about in IRC, I'm going to go test
> > >>>> the merge of this one and see what's up -- I've got it staged at
> > >>>> <https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=for-next&id=1bd2963b21758a773206a1cb67c93e7a8ae8a195>,
> > >>>> though that won't be a stable hash if it's actually broken...
> > >>>
> > >>> Ok, https://lore.kernel.org/linux-riscv/[email protected]/
> > >>> built for me. If you're seeing a hang, please let me know what
> > >>> version of LLD you're using and I'll build that tag from source to see
> > >>> if I can reproduce, then bisect if so.
> > >>>
> > >>> $ ARCH=riscv LLVM=1 /usr/bin/time -v make -j128 allmodconfig vmlinux
> > >>> ...
> > >>> Elapsed (wall clock) time (h:mm:ss or m:ss): 2:35.68
> > >>> ...
> > >>>
> > >>> Tested-by: Nick Desaulniers <[email protected]> # build
> > >>
> > >> OK, it triggered enough of a rebuild that it might take a bit for
> > >> anything to filter out.
> > >
> > > I'm on LLVM 16.0.2
> > >
> > > $ git describe
> > > llvmorg-16.0.2
> > > $ git log | head -n1
> > > commit 18ddebe1a1a9bde349441631365f0472e9693520
> > >
> > > that seems to hang for me -- or at least run for an hour without
> > > completing, so I assume it's hung. I'm not wed to 16.0.2, it just
> > > happens to be the last time I bumped the toolchain. I'm moving to
> > > 16.0.5 to see if that changes anything.
> >
> > That also takes at least an hour to link. I tried running on LLVM trunk
> > from last night
> >
> > $ git log | head -n1
> > commit 5e9173c43a9b97c8614e36d6f754317f731e71e9
> >
> > and that completed. Just as a curiosity I tried to re-spin it to see
> > how long it takes, and it's been running for 23 minutes so far.
>
> After some misdirection through stupid user error, I have also
> reproduced this for an LLVM=1 build w/ llvmorg-16.0.0
>
> > So I'm no longer actually sure there's a hang, just something slow.
> > That's even more of a grey area, but I think it's sane to call a 1-hour
> > link time a regression -- unless it's expected that this is just very
> > slow to link?
>
> I dunno, if it was only a thing for allyesconfig, then whatever - but
> it's gonna significantly increase build times for any large kernels if LLD
> is this much slower than LD. Regression in my book.
>
> I'm gonna go and experiment with mixed toolchain builds, I'll report
> back..
Probably as expected: swapping out LLD for LD linked normally, and
using gcc-13.1 + LLD hit the same problems with linking.
Cheers,
Conor.
On Wed, Jun 21, 2023 at 07:53:59AM -0700, Palmer Dabbelt wrote:
> On Tue, 20 Jun 2023 17:13:17 PDT (-0700), Palmer Dabbelt wrote:
> > On Tue, 20 Jun 2023 14:08:33 PDT (-0700), Palmer Dabbelt wrote:
> >> On Tue, 20 Jun 2023 13:47:07 PDT (-0700), [email protected] wrote:
> >>> On Tue, Jun 20, 2023 at 4:41 PM Palmer Dabbelt <[email protected]> wrote:
> >>>>
> >>>> On Tue, 20 Jun 2023 13:32:32 PDT (-0700), [email protected] wrote:
> >>>> > On Tue, Jun 20, 2023 at 4:13 PM Conor Dooley <[email protected]> wrote:
> >>>> >>
> >>>> >> On Tue, Jun 20, 2023 at 04:05:55PM -0400, Nick Desaulniers wrote:
> >>>> >> > On Mon, Jun 19, 2023 at 6:06 PM Palmer Dabbelt <[email protected]> wrote:
> >>>> >> > > On Thu, 15 Jun 2023 06:54:33 PDT (-0700), Palmer Dabbelt wrote:
> >>>> >> > > > On Wed, 14 Jun 2023 09:25:49 PDT (-0700), [email protected] wrote:
> >>>> >> > > >> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
> >>>> >> > > >>> On Tue, 23 May 2023 09:54:58 PDT (-0700), [email protected] wrote:
> >>>> >>
> >>>> >> > > >> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
> >>>> >> > > >> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
> >>>> >> > > >> series is based on 6.4-rc2.
> >>>> >> > > >
> >>>> >> > > > Thanks.
> >>>> >> > >
> >>>> >> > > Sorry to be so slow here, but I think this is causing LLD to hang on
> >>>> >> > > allmodconfig. I'm still getting to the bottom of it, there's a few
> >>>> >> > > other things I have in flight still.
> >>>> >> >
> >>>> >> > Confirmed with v3 on mainline (linux-next is pretty red at the moment).
> >>>> >> > https://lore.kernel.org/linux-riscv/[email protected]/
> >>>> >>
> >>>> >> Just FYI Nick, there's been some concurrent work here from different
> >>>> >> people working on the same thing & the v3 you linked (from Zhangjin) was
> >>>> >> superseded by this v2 (from Jisheng).
> >>>> >
> >>>> > Ah! I've been testing the deprecated patch set, sorry I just looked on
> >>>> > lore for "dead code" on riscv-linux and grabbed the first thread,
> >>>> > without noticing the difference in authors or new version numbers for
> >>>> > distinct series. ok, nevermind my noise. I'll follow up with the
> >>>> > correct patch set, sorry!
> >>>>
> >>>> Ya, I hadn't even noticed the v3 because I pretty much only look at
> >>>> patchwork these days. Like we talked about in IRC, I'm going to go test
> >>>> the merge of this one and see what's up -- I've got it staged at
> >>>> <https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=for-next&id=1bd2963b21758a773206a1cb67c93e7a8ae8a195>,
> >>>> though that won't be a stable hash if it's actually broken...
> >>>
> >>> Ok, https://lore.kernel.org/linux-riscv/[email protected]/
> >>> built for me. If you're seeing a hang, please let me know what
> >>> version of LLD you're using and I'll build that tag from source to see
> >>> if I can reproduce, then bisect if so.
> >>>
> >>> $ ARCH=riscv LLVM=1 /usr/bin/time -v make -j128 allmodconfig vmlinux
> >>> ...
> >>> Elapsed (wall clock) time (h:mm:ss or m:ss): 2:35.68
> >>> ...
> >>>
> >>> Tested-by: Nick Desaulniers <[email protected]> # build
> >>
> >> OK, it triggered enough of a rebuild that it might take a bit for
> >> anything to filter out.
> >
> > I'm on LLVM 16.0.2
> >
> > $ git describe
> > llvmorg-16.0.2
> > $ git log | head -n1
> > commit 18ddebe1a1a9bde349441631365f0472e9693520
> >
> > that seems to hang for me -- or at least run for an hour without
> > completing, so I assume it's hung. I'm not wed to 16.0.2, it just
> > happens to be the last time I bumped the toolchain. I'm moving to
> > 16.0.5 to see if that changes anything.
>
> That also takes at least an hour to link. I tried running on LLVM trunk
> from last night
>
> $ git log | head -n1
> commit 5e9173c43a9b97c8614e36d6f754317f731e71e9
>
> and that completed. Just as a curiosity I tried to re-spin it to see
> how long it takes, and it's been running for 23 minutes so far.
After some misdirection through stupid user error, I have also
reproduced this for an LLVM=1 build w/ llvmorg-16.0.0.
> So I'm no longer actually sure there's a hang, just something slow.
> That's even more of a grey area, but I think it's sane to call a 1-hour
> link time a regression -- unless it's expected that this is just very
> slow to link?
I dunno, if it was only a thing for allyesconfig, then whatever - but
it's gonna significantly increase build times for any large kernels if LLD
is this much slower than LD. Regression in my book.
I'm gonna go and experiment with mixed toolchain builds, I'll report
back..
Cheers,
Conor.
Conor Dooley <[email protected]> writes:
[...]
>> So I'm no longer actually sure there's a hang, just something slow.
>> That's even more of a grey area, but I think it's sane to call a 1-hour
>> link time a regression -- unless it's expected that this is just very
>> slow to link?
>
> I dunno, if it was only a thing for allyesconfig, then whatever - but
> it's gonna significantly increase build times for any large kernels if LLD
> is this much slower than LD. Regression in my book.
>
> I'm gonna go and experiment with mixed toolchain builds, I'll report
> back..
I took palmer/for-next (1bd2963b2175 ("Merge patch series "riscv: enable
HAVE_LD_DEAD_CODE_DATA_ELIMINATION"")) for a tuxmake build with llvm-16:
| ~/src/tuxmake/run -v --wrapper ccache --target-arch riscv \
| --toolchain=llvm-16 --runtime docker --directory . -k \
| allyesconfig
Took forever, but passed after 2.5h.
CONFIG_CC_VERSION_TEXT="Debian clang version 16.0.6 (++20230610113307+7cbf1a259152-1~exp1~20230610233402.106)"
Björn
On Wed, 21 Jun 2023 10:51:15 PDT (-0700), [email protected] wrote:
> Conor Dooley <[email protected]> writes:
>
> [...]
>
>>> So I'm no longer actually sure there's a hang, just something slow.
>>> That's even more of a grey area, but I think it's sane to call a 1-hour
>>> link time a regression -- unless it's expected that this is just very
>>> slow to link?
>>
>> I dunno, if it was only a thing for allyesconfig, then whatever - but
>> it's gonna significantly increase build times for any large kernels if LLD
>> is this much slower than LD. Regression in my book.
>>
>> I'm gonna go and experiment with mixed toolchain builds, I'll report
>> back..
>
> I took palmer/for-next (1bd2963b2175 ("Merge patch series "riscv: enable
> HAVE_LD_DEAD_CODE_DATA_ELIMINATION"")) for a tuxmake build with llvm-16:
>
> | ~/src/tuxmake/run -v --wrapper ccache --target-arch riscv \
> | --toolchain=llvm-16 --runtime docker --directory . -k \
> | allyesconfig
>
> Took forever, but passed after 2.5h.
Thanks. I just re-ran mine with 17/trunk LLD under time (rather than just
checking top sometimes); it's at 1.5h, but even that seems quite long.
I guess this is sort of up to the LLVM folks: if it's expected that DCE
takes a very long time to link then I'm not opposed to allowing it, but
if this is probably a bug in LLD then it seems best to turn it off until
we sort things out over there.
I think maybe Nick or Nathan is the best bet to know?
> CONFIG_CC_VERSION_TEXT="Debian clang version 16.0.6 (++20230610113307+7cbf1a259152-1~exp1~20230610233402.106)"
>
>
> Björn
On Wed, 21 Jun 2023 11:19:31 PDT (-0700), Palmer Dabbelt wrote:
> On Wed, 21 Jun 2023 10:51:15 PDT (-0700), [email protected] wrote:
>> Conor Dooley <[email protected]> writes:
>>
>> [...]
>>
>>>> So I'm no longer actually sure there's a hang, just something slow.
>>>> That's even more of a grey area, but I think it's sane to call a 1-hour
>>>> link time a regression -- unless it's expected that this is just very
>>>> slow to link?
>>>
>>> I dunno, if it was only a thing for allyesconfig, then whatever - but
>>> it's gonna significantly increase build times for any large kernels if LLD
>>> is this much slower than LD. Regression in my book.
>>>
>>> I'm gonna go and experiment with mixed toolchain builds, I'll report
>>> back..
>>
>> I took palmer/for-next (1bd2963b2175 ("Merge patch series "riscv: enable
>> HAVE_LD_DEAD_CODE_DATA_ELIMINATION"")) for a tuxmake build with llvm-16:
>>
>> | ~/src/tuxmake/run -v --wrapper ccache --target-arch riscv \
>> | --toolchain=llvm-16 --runtime docker --directory . -k \
>> | allyesconfig
>>
>> Took forever, but passed after 2.5h.
>
> Thanks. I just re-ran mine 17/trunk LLD under time (rather that just
> checking top sometimes), it's at 1.5h but even that seems quite long.
>
> I guess this is sort of up to the LLVM folks: if it's expected that DCE
> takes a very long time to link then I'm not opposed to allowing it, but
> if this is probably a bug in LLD then it seems best to turn it off until
> we sort things out over there.
>
> I think maybe Nick or Nathan is the best bet to know?
Looks like it's about 2h for me. I'm going to drop these from my
staging tree in the interest of making progress on other stuff, but if
this is just expected behavior then I'm OK taking them (though that's
too much compute for me to test regularly):
$ time ../../../../llvm/install/bin/ld.lld -melf64lriscv -z noexecstack -r -o vmlinux.o --whole-archive vmlinux.a --no-whole-archive --start-group ./drivers/firmware/efi/libstub/lib.a --end-group
real 111m50.678s
user 111m18.739s
sys 1m13.147s
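
A rough reading of the timing above: user time is almost equal to
wall-clock time, which suggests (but does not prove) that this link step
is CPU-bound on what is effectively a single thread. A minimal Python
sketch of that arithmetic follows; `parse_time` is a hypothetical helper
for the `XmY.YYYs` fields printed by the shell's `time`, not part of any
tool mentioned in this thread.

```python
import re


def parse_time(field: str) -> float:
    """Convert a shell `time` field such as '111m50.678s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", field)
    if match is None:
        raise ValueError(f"unrecognised time field: {field!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Values copied from the ld.lld run above.
real = parse_time("111m50.678s")
user = parse_time("111m18.739s")
sys_ = parse_time("1m13.147s")

# CPU seconds consumed per second of wall-clock time; a value close to 1
# is what a single busy thread would produce.
cpu_per_wall = (user + sys_) / real

print(f"wall clock: {real / 3600:.2f} h")
print(f"CPU seconds per wall-clock second: {cpu_per_wall:.2f}")
```

A ratio well above 1 would instead point at a parallel workload, so the
figure is only a hint about where to look, not a diagnosis.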
>> CONFIG_CC_VERSION_TEXT="Debian clang version 16.0.6 (++20230610113307+7cbf1a259152-1~exp1~20230610233402.106)"
>>
>>
>> Björn
On Wed, Jun 21, 2023 at 12:46 PM Palmer Dabbelt <[email protected]> wrote:
>
> On Wed, 21 Jun 2023 11:19:31 PDT (-0700), Palmer Dabbelt wrote:
> > On Wed, 21 Jun 2023 10:51:15 PDT (-0700), [email protected] wrote:
> >> Conor Dooley <[email protected]> writes:
> >>
> >> [...]
> >>
> >>>> So I'm no longer actually sure there's a hang, just something slow.
> >>>> That's even more of a grey area, but I think it's sane to call a 1-hour
> >>>> link time a regression -- unless it's expected that this is just very
> >>>> slow to link?
> >>>
> >>> I dunno, if it was only a thing for allyesconfig, then whatever - but
> >>> it's gonna significantly increase build times for any large kernels if LLD
> >>> is this much slower than LD. Regression in my book.
> >>>
> >>> I'm gonna go and experiment with mixed toolchain builds, I'll report
> >>> back..
> >>
> >> I took palmer/for-next (1bd2963b2175 ("Merge patch series "riscv: enable
> >> HAVE_LD_DEAD_CODE_DATA_ELIMINATION"")) for a tuxmake build with llvm-16:
> >>
> >> | ~/src/tuxmake/run -v --wrapper ccache --target-arch riscv \
> >> | --toolchain=llvm-16 --runtime docker --directory . -k \
> >> | allyesconfig
> >>
> >> Took forever, but passed after 2.5h.
> >
> > Thanks. I just re-ran mine 17/trunk LLD under time (rather that just
> > checking top sometimes), it's at 1.5h but even that seems quite long.
> >
> > I guess this is sort of up to the LLVM folks: if it's expected that DCE
> > takes a very long time to link then I'm not opposed to allowing it, but
> > if this is probably a bug in LLD then it seems best to turn it off until
> > we sort things out over there.
> >
> > I think maybe Nick or Nathan is the best bet to know?
>
> Looks like it's about 2h for me. I'm going to drop these from my
> staging tree in the interest of making progress on other stuff, but if
> this is just expected behavior them I'm OK taking them (though that's
> too much compute for me to test regularly):
>
> $ time ../../../../llvm/install/bin/ld.lld -melf64lriscv -z noexecstack -r -o vmlinux.o --whole-archive vmlinux.a --no-whole-archive --start-group ./drivers/firmware/efi/libstub/lib.a --end-group
>
> real 111m50.678s
> user 111m18.739s
> sys 1m13.147s
Ah, I think you meant s/allmodconfig/allyesconfig/ in your initial
report. That makes more sense, and I can reproduce. Let me work on a
report.
>
> >> CONFIG_CC_VERSION_TEXT="Debian clang version 16.0.6 (++20230610113307+7cbf1a259152-1~exp1~20230610233402.106)"
> >>
> >>
> >> Björn
--
Thanks,
~Nick Desaulniers
On Thu, 22 Jun 2023 14:40:59 PDT (-0700), [email protected] wrote:
> On Wed, Jun 21, 2023 at 12:46 PM Palmer Dabbelt <[email protected]> wrote:
>>
>> On Wed, 21 Jun 2023 11:19:31 PDT (-0700), Palmer Dabbelt wrote:
>> > On Wed, 21 Jun 2023 10:51:15 PDT (-0700), [email protected] wrote:
>> >> Conor Dooley <[email protected]> writes:
>> >>
>> >> [...]
>> >>
>> >>>> So I'm no longer actually sure there's a hang, just something slow.
>> >>>> That's even more of a grey area, but I think it's sane to call a 1-hour
>> >>>> link time a regression -- unless it's expected that this is just very
>> >>>> slow to link?
>> >>>
>> >>> I dunno, if it was only a thing for allyesconfig, then whatever - but
>> >>> it's gonna significantly increase build times for any large kernels if LLD
>> >>> is this much slower than LD. Regression in my book.
>> >>>
>> >>> I'm gonna go and experiment with mixed toolchain builds, I'll report
>> >>> back..
>> >>
>> >> I took palmer/for-next (1bd2963b2175 ("Merge patch series "riscv: enable
>> >> HAVE_LD_DEAD_CODE_DATA_ELIMINATION"")) for a tuxmake build with llvm-16:
>> >>
>> >> | ~/src/tuxmake/run -v --wrapper ccache --target-arch riscv \
>> >> | --toolchain=llvm-16 --runtime docker --directory . -k \
>> >> | allyesconfig
>> >>
>> >> Took forever, but passed after 2.5h.
>> >
>> > Thanks. I just re-ran mine 17/trunk LLD under time (rather that just
>> > checking top sometimes), it's at 1.5h but even that seems quite long.
>> >
>> > I guess this is sort of up to the LLVM folks: if it's expected that DCE
>> > takes a very long time to link then I'm not opposed to allowing it, but
>> > if this is probably a bug in LLD then it seems best to turn it off until
>> > we sort things out over there.
>> >
>> > I think maybe Nick or Nathan is the best bet to know?
>>
>> Looks like it's about 2h for me. I'm going to drop these from my
>> staging tree in the interest of making progress on other stuff, but if
>> this is just expected behavior them I'm OK taking them (though that's
>> too much compute for me to test regularly):
>>
>> $ time ../../../../llvm/install/bin/ld.lld -melf64lriscv -z noexecstack -r -o vmlinux.o --whole-archive vmlinux.a --no-whole-archive --start-group ./drivers/firmware/efi/libstub/lib.a --end-group
>>
>> real 111m50.678s
>> user 111m18.739s
>> sys 1m13.147s
>
> Ah, I think you meant s/allmodconfig/allyesconfig/ in your initial
> report. That makes more sense, and I can reproduce. Let me work on a
> report.
Awesome, thanks!
>
>>
>> >> CONFIG_CC_VERSION_TEXT="Debian clang version 16.0.6 (++20230610113307+7cbf1a259152-1~exp1~20230610233402.106)"
>> >>
>> >>
>> >> Björn
On Wed, Jun 21, 2023 at 11:19:31AM -0700, Palmer Dabbelt wrote:
> On Wed, 21 Jun 2023 10:51:15 PDT (-0700), [email protected] wrote:
> > Conor Dooley <[email protected]> writes:
> >
> > [...]
> >
> > > > So I'm no longer actually sure there's a hang, just something
> > > > slow. That's even more of a grey area, but I think it's sane to
> > > > call a 1-hour link time a regression -- unless it's expected
> > > > that this is just very slow to link?
> > >
> > > I dunno, if it was only a thing for allyesconfig, then whatever - but
> > > it's gonna significantly increase build times for any large kernels if LLD
> > > is this much slower than LD. Regression in my book.
> > >
> > > I'm gonna go and experiment with mixed toolchain builds, I'll report
> > > back..
> >
> > I took palmer/for-next (1bd2963b2175 ("Merge patch series "riscv: enable
> > HAVE_LD_DEAD_CODE_DATA_ELIMINATION"")) for a tuxmake build with llvm-16:
> >
> > | ~/src/tuxmake/run -v --wrapper ccache --target-arch riscv \
> > | --toolchain=llvm-16 --runtime docker --directory . -k \
> > | allyesconfig
> >
> > Took forever, but passed after 2.5h.
>
> Thanks. I just re-ran mine 17/trunk LLD under time (rather that just
> checking top sometimes), it's at 1.5h but even that seems quite long.
>
> I guess this is sort of up to the LLVM folks: if it's expected that DCE
> takes a very long time to link then I'm not opposed to allowing it, but if
> this is probably a bug in LLD then it seems best to turn it off until we
> sort things out over there.
>
> I think maybe Nick or Nathan is the best bet to know?
I can confirm a regression with allyesconfig but not allmodconfig using
LLVM 16.0.6 on my 80-core Ampere Altra system.
allmodconfig: 8m 4s
allmodconfig + CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=n: 7m 4s
allyesconfig: 1h 58m 30s
allyesconfig + CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=n: 12m 41s
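
To put the four timings side by side, here is a small Python sketch that
turns them into slowdown factors. It assumes the rows without an explicit
`=n` were built with CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y; the helper
name `to_seconds` is made up for illustration.

```python
def to_seconds(hours: int = 0, minutes: int = 0, seconds: int = 0) -> int:
    """Convert an h/m/s build time to whole seconds."""
    return hours * 3600 + minutes * 60 + seconds


# (time with DCE enabled, time with DCE disabled), taken from the list above.
builds = {
    "allmodconfig": (to_seconds(minutes=8, seconds=4),
                     to_seconds(minutes=7, seconds=4)),
    "allyesconfig": (to_seconds(hours=1, minutes=58, seconds=30),
                     to_seconds(minutes=12, seconds=41)),
}

slowdown = {
    name: with_dce / without_dce
    for name, (with_dce, without_dce) in builds.items()
}

for name, factor in slowdown.items():
    print(f"{name}: {factor:.2f}x the build time with DCE enabled")
```

These are whole-build times from a single machine, so the factors are
indicative only; the link step alone would show a larger ratio.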
I am sure there is something that ld.lld can do better, given GNU ld
does not have any problems as earlier established, so that should
definitely be explored further. I see Nick already had a response about
writing up a report (I wrote most of this before that email so I am
still sending this one).
However, allyesconfig is pretty special and not really indicative of a
"real world" kernel build in my opinion (which will either be a fully
modular kernel to allow use on a wide range of hardware or a monolithic
kernel with just the drivers needed for a specific platform, which will
be much smaller than allyesconfig); it has given us problems with large
kernels before on other architectures.
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is already marked with 'depends on
EXPERT' and its help text mentions its perils, so it does not seem
unreasonable to me to add an additional dependency on !COMPILE_TEST so
that allmodconfig and allyesconfig cannot flip this on, something like
the following perhaps?
diff --git a/init/Kconfig b/init/Kconfig
index 32c24950c4ce..25434cbd2a6e 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1388,7 +1388,7 @@ config HAVE_LD_DEAD_CODE_DATA_ELIMINATION
 config LD_DEAD_CODE_DATA_ELIMINATION
 	bool "Dead code and data elimination (EXPERIMENTAL)"
 	depends on HAVE_LD_DEAD_CODE_DATA_ELIMINATION
-	depends on EXPERT
+	depends on EXPERT && !COMPILE_TEST
 	depends on $(cc-option,-ffunction-sections -fdata-sections)
 	depends on $(ld-option,--gc-sections)
 	help
If applying that dependency to all architectures is too much, the
selection in arch/riscv/Kconfig could be gated on the same condition.
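
The effect of the proposed dependency can be sketched as a plain boolean
model in Python. This is only an illustration of how the `depends on`
lines combine, not how Kconfig itself is implemented: the function
`dce_selectable` and the two example configurations are hypothetical, and
the `cc-option`/`ld-option` checks are assumed to pass.

```python
def dce_selectable(have_dce: bool, expert: bool, compile_test: bool) -> bool:
    """Model of the proposed dependencies:

        depends on HAVE_LD_DEAD_CODE_DATA_ELIMINATION
        depends on EXPERT && !COMPILE_TEST

    The toolchain checks (cc-option, ld-option) are assumed to succeed.
    """
    return have_dce and expert and not compile_test


# Assumption: allyesconfig turns on both EXPERT and COMPILE_TEST, while a
# hand-tuned embedded config enables EXPERT but leaves COMPILE_TEST off.
configs = {
    "allyesconfig": dict(have_dce=True, expert=True, compile_test=True),
    "hand-tuned embedded config": dict(have_dce=True, expert=True,
                                       compile_test=False),
}

for name, symbols in configs.items():
    state = "can be enabled" if dce_selectable(**symbols) else "is forced off"
    print(f"{name}: LD_DEAD_CODE_DATA_ELIMINATION {state}")
```

Under these assumptions the option stays available to someone
deliberately shrinking a kernel, while the all*config builds can no
longer pick it up.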
Cheers,
Nathan
On Thu, 22 Jun 2023 14:53:27 PDT (-0700), [email protected] wrote:
> On Wed, Jun 21, 2023 at 11:19:31AM -0700, Palmer Dabbelt wrote:
>> On Wed, 21 Jun 2023 10:51:15 PDT (-0700), [email protected] wrote:
>> > Conor Dooley <[email protected]> writes:
>> >
>> > [...]
>> >
>> > > > So I'm no longer actually sure there's a hang, just something
>> > > > slow. That's even more of a grey area, but I think it's sane to
>> > > > call a 1-hour link time a regression -- unless it's expected
>> > > > that this is just very slow to link?
>> > >
>> > > I dunno, if it was only a thing for allyesconfig, then whatever - but
>> > > it's gonna significantly increase build times for any large kernels if LLD
>> > > is this much slower than LD. Regression in my book.
>> > >
>> > > I'm gonna go and experiment with mixed toolchain builds, I'll report
>> > > back..
>> >
>> > I took palmer/for-next (1bd2963b2175 ("Merge patch series "riscv: enable
>> > HAVE_LD_DEAD_CODE_DATA_ELIMINATION"")) for a tuxmake build with llvm-16:
>> >
>> > | ~/src/tuxmake/run -v --wrapper ccache --target-arch riscv \
>> > | --toolchain=llvm-16 --runtime docker --directory . -k \
>> > | allyesconfig
>> >
>> > Took forever, but passed after 2.5h.
>>
>> Thanks. I just re-ran mine 17/trunk LLD under time (rather that just
>> checking top sometimes), it's at 1.5h but even that seems quite long.
>>
>> I guess this is sort of up to the LLVM folks: if it's expected that DCE
>> takes a very long time to link then I'm not opposed to allowing it, but if
>> this is probably a bug in LLD then it seems best to turn it off until we
>> sort things out over there.
>>
>> I think maybe Nick or Nathan is the best bet to know?
>
> I can confirm a regression with allyesconfig but not allmodconfig using
> LLVM 16.0.6 on my 80-core Ampere Altra system.
>
> allmodconfig: 8m 4s
> allmodconfig + CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=n: 7m 4s
> allyesconfig: 1h 58m 30s
> allyesconfig + CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=n: 12m 41s
Are those backwards? I'm getting super slow builds after merging the
patch set, not before -- though I apologize in advance if I'm reading it
wrong, I'm well on my way to falling asleep already ;)
> I am sure there is something that ld.lld can do better, given GNU ld
> does not have any problems as earlier established, so that should
> definitely be explored further. I see Nick already had a response about
> writing up a report (I wrote most of this before that email so I am
> still sending this one).
>
> However, allyesconfig is pretty special and not really indicative of a
> "real world" kernel build in my opinion (which will either be a fully
> modular kernel to allow use on a wide range of hardware or a monolithic
> kernel with just the drivers needed for a specific platform, which will
> be much smaller than allyesconfig); it has given us problems with large
> kernels before on other architectures.
I totally agree that allyesconfig is an oddity, but it's something that
does get regularly build tested so a big build time hit there is going
to cause trouble -- maybe not for users, but it'll be a problem for
maintainers and that's way more likely to get me yelled at ;)
> CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is already marked with 'depends on
> EXPERT' and its help text mentions its perils, so it does not seem
> unreasonable to me to add an additional dependency on !COMPILE_TEST so
> that allmodconfig and allyesconfig cannot flip this on, something like
> the following perhaps?
>
> diff --git a/init/Kconfig b/init/Kconfig
> index 32c24950c4ce..25434cbd2a6e 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1388,7 +1388,7 @@ config HAVE_LD_DEAD_CODE_DATA_ELIMINATION
> config LD_DEAD_CODE_DATA_ELIMINATION
> bool "Dead code and data elimination (EXPERIMENTAL)"
> depends on HAVE_LD_DEAD_CODE_DATA_ELIMINATION
> - depends on EXPERT
> + depends on EXPERT && !COMPILE_TEST
> depends on $(cc-option,-ffunction-sections -fdata-sections)
> depends on $(ld-option,--gc-sections)
> help
>
> If applying that dependency to all architectures is too much, the
> selection in arch/riscv/Kconfig could be gated on the same condition.
Is the regression for all ports, or just RISC-V? I'm fine gating this
with some sort of Kconfig flag, if it's just impacting RISC-V then it
seems sane to keep it over here.
> Cheers,
> Nathan
On Thu, Jun 22, 2023 at 03:16:51PM -0700, Palmer Dabbelt wrote:
> On Thu, 22 Jun 2023 14:53:27 PDT (-0700), [email protected] wrote:
> > On Wed, Jun 21, 2023 at 11:19:31AM -0700, Palmer Dabbelt wrote:
> > > On Wed, 21 Jun 2023 10:51:15 PDT (-0700), [email protected] wrote:
> > > > Conor Dooley <[email protected]> writes:
> > > >
> > > > [...]
> > > >
> > > > > > So I'm no longer actually sure there's a hang, just something
> > > > > > slow. That's even more of a grey area, but I think it's sane to
> > > > > > call a 1-hour link time a regression -- unless it's expected
> > > > > > that this is just very slow to link?
> > > > >
> > > > > I dunno, if it was only a thing for allyesconfig, then whatever - but
> > > > > it's gonna significantly increase build times for any large kernels if LLD
> > > > > is this much slower than LD. Regression in my book.
> > > > >
> > > > > I'm gonna go and experiment with mixed toolchain builds, I'll report
> > > > > back..
> > > >
> > > > I took palmer/for-next (1bd2963b2175 ("Merge patch series "riscv: enable
> > > > HAVE_LD_DEAD_CODE_DATA_ELIMINATION"")) for a tuxmake build with llvm-16:
> > > >
> > > > | ~/src/tuxmake/run -v --wrapper ccache --target-arch riscv \
> > > > | --toolchain=llvm-16 --runtime docker --directory . -k \
> > > > | allyesconfig
> > > >
> > > > Took forever, but passed after 2.5h.
> > >
> > > Thanks. I just re-ran mine 17/trunk LLD under time (rather that just
> > > checking top sometimes), it's at 1.5h but even that seems quite long.
> > >
> > > I guess this is sort of up to the LLVM folks: if it's expected that DCE
> > > takes a very long time to link then I'm not opposed to allowing it, but if
> > > this is probably a bug in LLD then it seems best to turn it off until we
> > > sort things out over there.
> > >
> > > I think maybe Nick or Nathan is the best bet to know?
> >
> > I can confirm a regression with allyesconfig but not allmodconfig using
> > LLVM 16.0.6 on my 80-core Ampere Altra system.
> >
> > allmodconfig: 8m 4s
> > allmodconfig + CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=n: 7m 4s
> > allyesconfig: 1h 58m 30s
> > allyesconfig + CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=n: 12m 41s
>
> Are those backwards? I'm getting super slow builds after merging the patch
> set, not before -- though apologize in advance if I'm reading it wrong, I'm
> well on my way to falling asleep already ;)
I know I already responded to you around this on IRC but I will do it
here too for the benefit of others following this thread.
These numbers are from the patchset applied on top of dad9774deaf1
("Merge tag 'timers-urgent-2023-06-21' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip"); in other words,
allmodconfig and allyesconfig have CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y
so turning it off is basically like building allmodconfig and
allyesconfig before the patchset was applied.
> > I am sure there is something that ld.lld can do better, given GNU ld
> > does not have any problems as earlier established, so that should
> > definitely be explored further. I see Nick already had a response about
> > writing up a report (I wrote most of this before that email so I am
> > still sending this one).
> >
> > However, allyesconfig is pretty special and not really indicative of a
> > "real world" kernel build in my opinion (which will either be a fully
> > modular kernel to allow use on a wide range of hardware or a monolithic
> > kernel with just the drivers needed for a specific platform, which will
> > be much smaller than allyesconfig); it has given us problems with large
> > kernels before on other architectures.
>
> I totally agree that allyesconfig is an oddity, but it's something that does
> get regularly build tested so a big build time hit there is going to cause
> trouble -- maybe not for users, but it'll be a problem for maintainers and
> that's way more likely to get me yelled at ;)
Agreed. That comment was more around justification for opting out of
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION with these configurations, since
CONFIG_COMPILE_TEST has effectively become "am I allmodconfig or
allyesconfig?" nowadays.
> > CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is already marked with 'depends on
> > EXPERT' and its help text mentions its perils, so it does not seem
> > unreasonable to me to add an additional dependency on !COMPILE_TEST so
> > that allmodconfig and allyesconfig cannot flip this on, something like
> > the following perhaps?
> >
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 32c24950c4ce..25434cbd2a6e 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1388,7 +1388,7 @@ config HAVE_LD_DEAD_CODE_DATA_ELIMINATION
> > config LD_DEAD_CODE_DATA_ELIMINATION
> > bool "Dead code and data elimination (EXPERIMENTAL)"
> > depends on HAVE_LD_DEAD_CODE_DATA_ELIMINATION
> > - depends on EXPERT
> > + depends on EXPERT && !COMPILE_TEST
> > depends on $(cc-option,-ffunction-sections -fdata-sections)
> > depends on $(ld-option,--gc-sections)
> > help
> >
> > If applying that dependency to all architectures is too much, the
> > selection in arch/riscv/Kconfig could be gated on the same condition.
>
> Is the regression for all ports, or just RISC-V? I'm fine gating this with
> some sort of Kconfig flag, if it's just impacting RISC-V then it seems sane
> to keep it over here.
I am not sure. Only mips selects HAVE_LD_DEAD_CODE_DATA_ELIMINATION
unconditionally and we don't test ARCH=mips all{mod,yes}config (not sure
why off the top of my head). powerpc selects it when using objtool for
mcount generation, which only happens for ppc32 (which we don't test
heavily or with large kernels) or using '-mprofile-kernel', which clang
does not support.
If you wanted to restrict it to just LD_IS_BFD in arch/riscv/Kconfig,
that would be fine with me too.
select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if LD_IS_BFD
Nick said he would work on a report for the LLVM side, so as long as
this issue is handled in some way to avoid regressing LLD builds until
it is resolved, I don't think there is anything else for the kernel to
do. We like to have breadcrumbs via issue links, not sure if the report
will be internal to Google or on LLVM's issue tracker though;
regardless, we will have to touch this block to add a version check
later, at which point we can add a link to the fix in LLD.
Cheers,
Nathan
On Thu, Jun 22, 2023 at 11:18:03PM +0000, Nathan Chancellor wrote:
> If you wanted to restrict it to just LD_IS_BFD in arch/riscv/Kconfig,
> that would be fine with me too.
>
> select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if LD_IS_BFD
Hi Jisheng, would you mind sending a v3 with the attached patch applied
on top / at the end of your series?
>
> Nick said he would work on a report for the LLVM side, so as long as
> this issue is handled in some way to avoid regressing LLD builds until
> it is resolved, I don't think there is anything else for the kernel to
> do. We like to have breadcrumbs via issue links, not sure if the report
> will be internal to Google or on LLVM's issue tracker though;
> regardless, we will have to touch this block to add a version check
> later, at which point we can add a link to the fix in LLD.
https://github.com/ClangBuiltLinux/linux/issues/1881
On Sun, Jun 25, 2023 at 08:24:56PM +0800, Jisheng Zhang wrote:
> On Fri, Jun 23, 2023 at 10:17:54AM -0700, Nick Desaulniers wrote:
> > On Thu, Jun 22, 2023 at 11:18:03PM +0000, Nathan Chancellor wrote:
> > > If you wanted to restrict it to just LD_IS_BFD in arch/riscv/Kconfig,
> > > that would be fine with me too.
> > >
> > > select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if LD_IS_BFD
> >
> > Hi Jisheng, would you mind sending a v3 with the attached patch applied
> > on top / at the end of your series?
>
> Hi Nick, Nathan, Palmer,
>
> I saw the series has been applied to riscv-next, so I'm not sure which
> solution would it be, Palmer to apply Nick's patch to riscv-next or
> I to send out v3, any suggestion is appreciated.
I don't see what you are seeing w/ riscv/for-next. HEAD is currently at
4681dacadeef ("riscv: replace deprecated scall with ecall") and there
are no patches from your series in the branch:
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/log/?h=for-next
Cheers,
Conor.
> > > Nick said he would work on a report for the LLVM side, so as long as
> > > this issue is handled in some way to avoid regressing LLD builds until
> > > it is resolved, I don't think there is anything else for the kernel to
> > > do. We like to have breadcrumbs via issue links, not sure if the report
> > > will be internal to Google or on LLVM's issue tracker though;
> > > regardless, we will have to touch this block to add a version check
> > > later, at which point we can add a link to the fix in LLD.
> >
> > https://github.com/ClangBuiltLinux/linux/issues/1881
>
> > From 3e5e010958ee41b9fb408cfade8fb017c2fe7169 Mon Sep 17 00:00:00 2001
> > From: Nick Desaulniers <[email protected]>
> > Date: Fri, 23 Jun 2023 10:06:17 -0700
> > Subject: [PATCH] riscv: disable HAVE_LD_DEAD_CODE_DATA_ELIMINATION for LLD
> >
> > Linking allyesconfig with ld.lld-17 with CONFIG_DEAD_CODE_ELIMINATION=y
> > takes hours. Assuming this is a performance regression that can be
> > fixed, tentatively disable this for now so that allyesconfig builds
> > don't start timing out. If and when there's a fix to ld.lld, this can
> > be converted to a version check instead so that users of older but still
> > supported versions of ld.lld don't hurt themselves by enabling
> > CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y.
> >
> > Link: https://github.com/ClangBuiltLinux/linux/issues/1881
> > Reported-by: Palmer Dabbelt <[email protected]>
> > Suggested-by: Nathan Chancellor <[email protected]>
> > Signed-off-by: Nick Desaulniers <[email protected]>
> > ---
> > Hi Jisheng, would you mind sending a v3 with this patch on top/at the
> > end of your patch series?
> >
> > arch/riscv/Kconfig | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index 8effe5bb7788..0573991e9b78 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -116,7 +116,8 @@ config RISCV
> > select HAVE_KPROBES if !XIP_KERNEL
> > select HAVE_KPROBES_ON_FTRACE if !XIP_KERNEL
> > select HAVE_KRETPROBES if !XIP_KERNEL
> > - select HAVE_LD_DEAD_CODE_DATA_ELIMINATION
> > + # https://github.com/ClangBuiltLinux/linux/issues/1881
> > + select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if !LD_IS_LLD
> > select HAVE_MOVE_PMD
> > select HAVE_MOVE_PUD
> > select HAVE_PCI
> > --
> > 2.41.0.162.gfafddb0af9-goog
> >
>
On Fri, Jun 23, 2023 at 10:17:54AM -0700, Nick Desaulniers wrote:
> On Thu, Jun 22, 2023 at 11:18:03PM +0000, Nathan Chancellor wrote:
> > If you wanted to restrict it to just LD_IS_BFD in arch/riscv/Kconfig,
> > that would be fine with me too.
> >
> > select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if LD_IS_BFD
>
> Hi Jisheng, would you mind sending a v3 with the attached patch applied
> on top / at the end of your series?
Hi Nick, Nathan, Palmer,
I saw the series has been applied to riscv-next, so I'm not sure which
solution is preferred: Palmer applying Nick's patch to riscv-next, or
me sending out a v3. Any suggestion is appreciated.
Thanks
>
> >
> > Nick said he would work on a report for the LLVM side, so as long as
> > this issue is handled in some way to avoid regressing LLD builds until
> > it is resolved, I don't think there is anything else for the kernel to
> > do. We like to have breadcrumbs via issue links, not sure if the report
> > will be internal to Google or on LLVM's issue tracker though;
> > regardless, we will have to touch this block to add a version check
> > later, at which point we can add a link to the fix in LLD.
>
> https://github.com/ClangBuiltLinux/linux/issues/1881
> From 3e5e010958ee41b9fb408cfade8fb017c2fe7169 Mon Sep 17 00:00:00 2001
> From: Nick Desaulniers <[email protected]>
> Date: Fri, 23 Jun 2023 10:06:17 -0700
> Subject: [PATCH] riscv: disable HAVE_LD_DEAD_CODE_DATA_ELIMINATION for LLD
>
> Linking allyesconfig with ld.lld-17 with CONFIG_DEAD_CODE_ELIMINATION=y
> takes hours. Assuming this is a performance regression that can be
> fixed, tentatively disable this for now so that allyesconfig builds
> don't start timing out. If and when there's a fix to ld.lld, this can
> be converted to a version check instead so that users of older but still
> supported versions of ld.lld don't hurt themselves by enabling
> CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/1881
> Reported-by: Palmer Dabbelt <[email protected]>
> Suggested-by: Nathan Chancellor <[email protected]>
> Signed-off-by: Nick Desaulniers <[email protected]>
> ---
> Hi Jisheng, would you mind sending a v3 with this patch on top/at the
> end of your patch series?
>
> arch/riscv/Kconfig | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 8effe5bb7788..0573991e9b78 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -116,7 +116,8 @@ config RISCV
> select HAVE_KPROBES if !XIP_KERNEL
> select HAVE_KPROBES_ON_FTRACE if !XIP_KERNEL
> select HAVE_KRETPROBES if !XIP_KERNEL
> - select HAVE_LD_DEAD_CODE_DATA_ELIMINATION
> + # https://github.com/ClangBuiltLinux/linux/issues/1881
> + select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if !LD_IS_LLD
> select HAVE_MOVE_PMD
> select HAVE_MOVE_PUD
> select HAVE_PCI
> --
> 2.41.0.162.gfafddb0af9-goog
>
On Sun, 25 Jun 2023 05:43:13 PDT (-0700), Conor Dooley wrote:
> On Sun, Jun 25, 2023 at 08:24:56PM +0800, Jisheng Zhang wrote:
>> On Fri, Jun 23, 2023 at 10:17:54AM -0700, Nick Desaulniers wrote:
>> > On Thu, Jun 22, 2023 at 11:18:03PM +0000, Nathan Chancellor wrote:
>> > > If you wanted to restrict it to just LD_IS_BFD in arch/riscv/Kconfig,
>> > > that would be fine with me too.
>> > >
>> > > select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if LD_IS_BFD
>> >
>> > Hi Jisheng, would you mind sending a v3 with the attached patch applied
>> > on top / at the end of your series?
>>
>> Hi Nick, Nathan, Palmer,
>>
>> I saw the series has been applied to riscv-next, so I'm not sure which
>> solution would it be, Palmer to apply Nick's patch to riscv-next or
>> I to send out v3, any suggestion is appreciated.
>
> I don't see what you are seeing w/ riscv/for-next. HEAD is currently at
> 4681dacadeef ("riscv: replace deprecated scall with ecall") and there
> are no patches from your series in the branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/log/?h=for-next
It's been in and out of staging a few times as we tracked down the
performance regression, but it shouldn't have ever made it to linux-next
for real.
I'm fine just picking up the patch to disable DCE, I've got a few other
(hopefully small) things to work through first though.
> Cheers,
> Conor.
>
>> > > Nick said he would work on a report for the LLVM side, so as long as
>> > > this issue is handled in some way to avoid regressing LLD builds until
>> > > it is resolved, I don't think there is anything else for the kernel to
>> > > do. We like to have breadcrumbs via issue links, not sure if the report
>> > > will be internal to Google or on LLVM's issue tracker though;
>> > > regardless, we will have to touch this block to add a version check
>> > > later, at which point we can add a link to the fix in LLD.
>> >
>> > https://github.com/ClangBuiltLinux/linux/issues/1881
>>
>> > From 3e5e010958ee41b9fb408cfade8fb017c2fe7169 Mon Sep 17 00:00:00 2001
>> > From: Nick Desaulniers <[email protected]>
>> > Date: Fri, 23 Jun 2023 10:06:17 -0700
>> > Subject: [PATCH] riscv: disable HAVE_LD_DEAD_CODE_DATA_ELIMINATION for LLD
>> >
>> > Linking allyesconfig with ld.lld-17 with CONFIG_DEAD_CODE_ELIMINATION=y
>> > takes hours. Assuming this is a performance regression that can be
>> > fixed, tentatively disable this for now so that allyesconfig builds
>> > don't start timing out. If and when there's a fix to ld.lld, this can
>> > be converted to a version check instead so that users of older but still
>> > supported versions of ld.lld don't hurt themselves by enabling
>> > CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y.
>> >
>> > Link: https://github.com/ClangBuiltLinux/linux/issues/1881
>> > Reported-by: Palmer Dabbelt <[email protected]>
>> > Suggested-by: Nathan Chancellor <[email protected]>
>> > Signed-off-by: Nick Desaulniers <[email protected]>
>> > ---
>> > Hi Jisheng, would you mind sending a v3 with this patch on top/at the
>> > end of your patch series?
>> >
>> > arch/riscv/Kconfig | 3 ++-
>> > 1 file changed, 2 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>> > index 8effe5bb7788..0573991e9b78 100644
>> > --- a/arch/riscv/Kconfig
>> > +++ b/arch/riscv/Kconfig
>> > @@ -116,7 +116,8 @@ config RISCV
>> > select HAVE_KPROBES if !XIP_KERNEL
>> > select HAVE_KPROBES_ON_FTRACE if !XIP_KERNEL
>> > select HAVE_KRETPROBES if !XIP_KERNEL
>> > - select HAVE_LD_DEAD_CODE_DATA_ELIMINATION
>> > + # https://github.com/ClangBuiltLinux/linux/issues/1881
>> > + select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if !LD_IS_LLD
>> > select HAVE_MOVE_PMD
>> > select HAVE_MOVE_PUD
>> > select HAVE_PCI
>> > --
>> > 2.41.0.162.gfafddb0af9-goog
>> >
>>
On Sun, Jun 25, 2023 at 1:06 PM Palmer Dabbelt <[email protected]> wrote:
>
> On Sun, 25 Jun 2023 05:43:13 PDT (-0700), Conor Dooley wrote:
> > On Sun, Jun 25, 2023 at 08:24:56PM +0800, Jisheng Zhang wrote:
> >> On Fri, Jun 23, 2023 at 10:17:54AM -0700, Nick Desaulniers wrote:
> >> > On Thu, Jun 22, 2023 at 11:18:03PM +0000, Nathan Chancellor wrote:
> >> > > If you wanted to restrict it to just LD_IS_BFD in arch/riscv/Kconfig,
> >> > > that would be fine with me too.
> >> > >
> >> > > select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if LD_IS_BFD
> >> >
> >> > Hi Jisheng, would you mind sending a v3 with the attached patch applied
> >> > on top / at the end of your series?
> >>
> >> Hi Nick, Nathan, Palmer,
> >>
> >> I saw the series has been applied to riscv-next, so I'm not sure which
> >> solution it should be: Palmer applying Nick's patch to riscv-next, or
> >> me sending out v3. Any suggestion is appreciated.
> >
> > I don't see what you are seeing w/ riscv/for-next. HEAD is currently at
> > 4681dacadeef ("riscv: replace deprecated scall with ecall") and there
> > are no patches from your series in the branch:
> > https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/log/?h=for-next
>
> It's been in and out of staging a few times as we tracked down the
> performance regression, but it shouldn't have ever made it to linux-next
> for real.
>
> I'm fine just picking up the patch to disable DCE, though I've got a few
> other (hopefully small) things to work through first.
Note: for GCC, -fpatchable-function-entry= (used by
arch/riscv/Kconfig) requires GCC 13 for correct garbage collection
semantics.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110729
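
As a rough sketch only (an assumption for illustration, not a tested or
posted patch), the select in arch/riscv/Kconfig could additionally be gated
on the compiler version through the existing CC_IS_GCC and GCC_VERSION
symbols, on top of the !LD_IS_LLD check from Nick's patch:

	# Hypothetical: only offer DCE with a non-LLD linker and, when
	# building with GCC, with GCC 13 or newer.
	select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if !LD_IS_LLD && (!CC_IS_GCC || GCC_VERSION >= 130000)

This is deliberately conservative: it would also hide the option for older
GCC in configs that never pass -fpatchable-function-entry=.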
--
宋方睿