2023-05-19 14:54:38

by Ricardo Ribalda

[permalink] [raw]
Subject: [PATCH v7 0/4] kexec: Fix kexec_file_load for llvm16 with PGO

When upreving llvm I realised that kexec stopped working on my test
platform.

The reason seems to be that due to PGO there are multiple .text sections
on the purgatory, and kexec does not supports that.

Signed-off-by: Ricardo Ribalda <[email protected]>
---
Changes in v7:
- Fix $SUBJECT of riscv patch
- Rename PGO as Profile-guided optimization
- Link to v6: https://lore.kernel.org/r/[email protected]

Changes in v6:
- Replace linker script with Makefile rule. Thanks Nick
- Link to v5: https://lore.kernel.org/r/[email protected]

Changes in v5:
- Add warning when multiple text sections are found. Thanks Simon!
- Add Fixes tag.
- Link to v4: https://lore.kernel.org/r/[email protected]

Changes in v4:
- Add Cc: stable
- Add linker script for x86
- Add a warning when the kernel image has overlapping sections.
- Link to v3: https://lore.kernel.org/r/[email protected]

Changes in v3:
- Fix initial value. Thanks Ross!
- Link to v2: https://lore.kernel.org/r/[email protected]

Changes in v2:
- Fix if condition. Thanks Steven!.
- Update Philipp email. Thanks Baoquan.
- Link to v1: https://lore.kernel.org/r/[email protected]

---
Ricardo Ribalda (4):
kexec: Support purgatories with .text.hot sections
x86/purgatory: Remove PGO flags
powerpc/purgatory: Remove PGO flags
riscv/purgatory: Remove PGO flags

arch/powerpc/purgatory/Makefile | 5 +++++
arch/riscv/purgatory/Makefile | 5 +++++
arch/x86/purgatory/Makefile | 5 +++++
kernel/kexec_file.c | 14 +++++++++++++-
4 files changed, 28 insertions(+), 1 deletion(-)
---
base-commit: 58390c8ce1bddb6c623f62e7ed36383e7fa5c02f
change-id: 20230321-kexec_clang16-4510c23d129c

Best regards,
--
Ricardo Ribalda Delgado <[email protected]>



2023-05-19 14:54:46

by Ricardo Ribalda

[permalink] [raw]
Subject: [PATCH v7 1/4] kexec: Support purgatories with .text.hot sections

Clang16 links the purgatory text in two sections when PGO is in use:

[ 1] .text PROGBITS 0000000000000000 00000040
00000000000011a1 0000000000000000 AX 0 0 16
[ 2] .rela.text RELA 0000000000000000 00003498
0000000000000648 0000000000000018 I 24 1 8
...
[17] .text.hot. PROGBITS 0000000000000000 00003220
000000000000020b 0000000000000000 AX 0 0 1
[18] .rela.text.hot. RELA 0000000000000000 00004428
0000000000000078 0000000000000018 I 24 17 8

And both of them have their range [sh_addr ... sh_addr+sh_size] on the
area pointed by `e_entry`.

This causes that image->start is calculated twice, once for .text and
another time for .text.hot. The second calculation leaves image->start
in a random location.

Because of this, the system crashes immediately after:

kexec_core: Starting new kernel

Cc: [email protected]
Fixes: 930457057abe ("kernel/kexec_file.c: split up __kexec_load_puragory")
Reviewed-by: Ross Zwisler <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Reviewed-by: Philipp Rudo <[email protected]>
Signed-off-by: Ricardo Ribalda <[email protected]>
---
kernel/kexec_file.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index f989f5f1933b..69ee4a29136f 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -901,10 +901,22 @@ static int kexec_purgatory_setup_sechdrs(struct purgatory_info *pi,
}

offset = ALIGN(offset, align);
+
+ /*
+ * Check if the segment contains the entry point, if so,
+ * calculate the value of image->start based on it.
+ * If the compiler has produced more than one .text section
+ * (Eg: .text.hot), they are generally after the main .text
+ * section, and they shall not be used to calculate
+ * image->start. So do not re-calculate image->start if it
+ * is not set to the initial value, and warn the user so they
+ * have a chance to fix their purgatory's linker script.
+ */
if (sechdrs[i].sh_flags & SHF_EXECINSTR &&
pi->ehdr->e_entry >= sechdrs[i].sh_addr &&
pi->ehdr->e_entry < (sechdrs[i].sh_addr
- + sechdrs[i].sh_size)) {
+ + sechdrs[i].sh_size) &&
+ !WARN_ON(kbuf->image->start != pi->ehdr->e_entry)) {
kbuf->image->start -= sechdrs[i].sh_addr;
kbuf->image->start += kbuf->mem + offset;
}

--
2.40.1.698.g37aff9b760-goog


2023-05-19 14:58:20

by Ricardo Ribalda

[permalink] [raw]
Subject: [PATCH v7 3/4] powerpc/purgatory: Remove PGO flags

If profile-guided optimization is enabled, the purgatory ends up with
multiple .text sections.
This is not supported by kexec and crashes the system.

Cc: [email protected]
Fixes: 930457057abe ("kernel/kexec_file.c: split up __kexec_load_puragory")
Signed-off-by: Ricardo Ribalda <[email protected]>
---
arch/powerpc/purgatory/Makefile | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/purgatory/Makefile b/arch/powerpc/purgatory/Makefile
index 6f5e2727963c..78473d69cd2b 100644
--- a/arch/powerpc/purgatory/Makefile
+++ b/arch/powerpc/purgatory/Makefile
@@ -5,6 +5,11 @@ KCSAN_SANITIZE := n

targets += trampoline_$(BITS).o purgatory.ro

+# When profile-guided optimization is enabled, llvm emits two different
+# overlapping text sections, which is not supported by kexec. Remove profile
+# optimization flags.
+KBUILD_CFLAGS := $(filter-out -fprofile-sample-use=% -fprofile-use=%,$(KBUILD_CFLAGS))
+
LDFLAGS_purgatory.ro := -e purgatory_start -r --no-undefined

$(obj)/purgatory.ro: $(obj)/trampoline_$(BITS).o FORCE

--
2.40.1.698.g37aff9b760-goog


2023-05-19 15:03:54

by Ricardo Ribalda

[permalink] [raw]
Subject: [PATCH v7 2/4] x86/purgatory: Remove PGO flags

If profile-guided optimization is enabled, the purgatory ends up with
multiple .text sections.
This is not supported by kexec and crashes the system.

Cc: [email protected]
Fixes: 930457057abe ("kernel/kexec_file.c: split up __kexec_load_puragory")
Signed-off-by: Ricardo Ribalda <[email protected]>
---
arch/x86/purgatory/Makefile | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 82fec66d46d2..42abd6af1198 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -14,6 +14,11 @@ $(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE

CFLAGS_sha256.o := -D__DISABLE_EXPORTS

+# When profile-guided optimization is enabled, llvm emits two different
+# overlapping text sections, which is not supported by kexec. Remove profile
+# optimization flags.
+KBUILD_CFLAGS := $(filter-out -fprofile-sample-use=% -fprofile-use=%,$(KBUILD_CFLAGS))
+
# When linking purgatory.ro with -r unresolved symbols are not checked,
# also link a purgatory.chk binary without -r to check for unresolved symbols.
PURGATORY_LDFLAGS := -e purgatory_start -z nodefaultlib

--
2.40.1.698.g37aff9b760-goog


2023-05-19 15:22:37

by Ricardo Ribalda

[permalink] [raw]
Subject: [PATCH v7 4/4] riscv/purgatory: Remove PGO flags

If profile-guided optimization is enabled, the purgatory ends up with
multiple .text sections.
This is not supported by kexec and crashes the system.

Cc: [email protected]
Fixes: 930457057abe ("kernel/kexec_file.c: split up __kexec_load_puragory")
Acked-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Ricardo Ribalda <[email protected]>
---
arch/riscv/purgatory/Makefile | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/riscv/purgatory/Makefile b/arch/riscv/purgatory/Makefile
index 5730797a6b40..bd2e27f82532 100644
--- a/arch/riscv/purgatory/Makefile
+++ b/arch/riscv/purgatory/Makefile
@@ -35,6 +35,11 @@ CFLAGS_sha256.o := -D__DISABLE_EXPORTS
CFLAGS_string.o := -D__DISABLE_EXPORTS
CFLAGS_ctype.o := -D__DISABLE_EXPORTS

+# When profile-guided optimization is enabled, llvm emits two different
+# overlapping text sections, which is not supported by kexec. Remove profile
+# optimization flags.
+KBUILD_CFLAGS := $(filter-out -fprofile-sample-use=% -fprofile-use=%,$(KBUILD_CFLAGS))
+
# When linking purgatory.ro with -r unresolved symbols are not checked,
# also link a purgatory.chk binary without -r to check for unresolved symbols.
PURGATORY_LDFLAGS := -e purgatory_start -z nodefaultlib

--
2.40.1.698.g37aff9b760-goog


2023-05-22 04:46:54

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v7 3/4] powerpc/purgatory: Remove PGO flags

Ricardo Ribalda <[email protected]> writes:
> If profile-guided optimization is enabled, the purgatory ends up with
> multiple .text sections.
> This is not supported by kexec and crashes the system.
>
> Cc: [email protected]
> Fixes: 930457057abe ("kernel/kexec_file.c: split up __kexec_load_puragory")
> Signed-off-by: Ricardo Ribalda <[email protected]>
> ---
> arch/powerpc/purgatory/Makefile | 5 +++++
> 1 file changed, 5 insertions(+)

Acked-by: Michael Ellerman <[email protected]> (powerpc)

cheers

2023-09-08 23:12:41

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v7 0/4] kexec: Fix kexec_file_load for llvm16 with PGO

Hi Ricardo,

Thanks for your kind reply.

On Fri, Sep 8, 2023 at 2:18 PM Ricardo Ribalda <[email protected]> wrote:
>
> Hi Song
>
> On Fri, 8 Sept 2023 at 01:08, Song Liu <[email protected]> wrote:
> >
> > Hi Ricardo and folks,
> >
> > On Fri, May 19, 2023 at 7:48 AM Ricardo Ribalda <[email protected]> wrote:
> > >
> > > When upreving llvm I realised that kexec stopped working on my test
> > > platform.
> > >
> > > The reason seems to be that due to PGO there are multiple .text sections
> > > on the purgatory, and kexec does not supports that.
> > >
> > > Signed-off-by: Ricardo Ribalda <[email protected]>
> >
> > We are seeing WARNINGs like the following while kexec'ing a PGO and
> > LTO enabled kernel:
> >
> > WARNING: CPU: 26 PID: 110894 at kernel/kexec_file.c:919
> > kexec_load_purgatory+0x37f/0x390
> >
> > AFAICT, the warning was added by this set, and it was triggered when
> > we have many .text sections
> > in purgatory.ro. The kexec was actually successful. So I wonder
> > whether we really need the
> > WARNING here. If we disable LTO (PGO is still enabled), we don't see
> > the WARNING any more.
> >
> > I also tested an older kernel (5.19 based), where we also see many
> > .text sections with LTO. It
> > kexec()'ed fine. (It doesn't have the WARN_ON() in
> > kexec_purgatory_setup_sechdrs).
>
> You have been "lucky" that the code has chosen the correct start
> address, you need to modify the linker script of your kernel to
> disable PGO.
> You need to backport a patch like this:
> https://lore.kernel.org/lkml/CAPhsuW5_qAvV0N3o+hOiAnb1=buJ1pLzqYW9D+Bwft6hxJvAeQ@mail.gmail.com/T/#md68b7f832216b0c56bbec0c9b07332e180b9ba2b

We already have this commit in our branch. AFAICT, the issue was
triggered by LTO. So something like the following seems fixes it
(I haven't finished the end-to-end test yet). Does this change make
sense to you?

Thanks again,
Song

diff --git i/arch/x86/purgatory/Makefile w/arch/x86/purgatory/Makefile
index 8f71aaa04cc2..dc306fa7197d 100644
--- i/arch/x86/purgatory/Makefile
+++ w/arch/x86/purgatory/Makefile
@@ -19,6 +19,10 @@ CFLAGS_sha256.o := -D__DISABLE_EXPORTS
# optimization flags.
KBUILD_CFLAGS := $(filter-out -fprofile-sample-use=%
-fprofile-use=%,$(KBUILD_CFLAGS))

+# When LTO is enabled, llvm emits many text sections, which is not supported
+# by kexec. Remove -flto=* flags.
+KBUILD_CFLAGS := $(filter-out -flto=%,$(KBUILD_CFLAGS))
+
# When linking purgatory.ro with -r unresolved symbols are not checked,
# also link a purgatory.chk binary without -r to check for unresolved symbols.
PURGATORY_LDFLAGS := -e purgatory_start -z nodefaultlib

2023-09-08 23:24:09

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v7 0/4] kexec: Fix kexec_file_load for llvm16 with PGO

On Fri, Sep 8, 2023 at 2:52 PM Ricardo Ribalda <[email protected]> wrote:
>
> Hi Song
>
> On Fri, 8 Sept 2023 at 23:48, Song Liu <[email protected]> wrote:
> >
> > Hi Ricardo,
> >
> > Thanks for your kind reply.
> >
> > On Fri, Sep 8, 2023 at 2:18 PM Ricardo Ribalda <[email protected]> wrote:
> > >
> > > Hi Song
> > >
> > > On Fri, 8 Sept 2023 at 01:08, Song Liu <[email protected]> wrote:
> > > >
> > > > Hi Ricardo and folks,
> > > >
> > > > On Fri, May 19, 2023 at 7:48 AM Ricardo Ribalda <[email protected]> wrote:
> > > > >
> > > > > When upreving llvm I realised that kexec stopped working on my test
> > > > > platform.
> > > > >
> > > > > The reason seems to be that due to PGO there are multiple .text sections
> > > > > on the purgatory, and kexec does not supports that.
> > > > >
> > > > > Signed-off-by: Ricardo Ribalda <[email protected]>
> > > >
> > > > We are seeing WARNINGs like the following while kexec'ing a PGO and
> > > > LTO enabled kernel:
> > > >
> > > > WARNING: CPU: 26 PID: 110894 at kernel/kexec_file.c:919
> > > > kexec_load_purgatory+0x37f/0x390
> > > >
> > > > AFAICT, the warning was added by this set, and it was triggered when
> > > > we have many .text sections
> > > > in purgatory.ro. The kexec was actually successful. So I wonder
> > > > whether we really need the
> > > > WARNING here. If we disable LTO (PGO is still enabled), we don't see
> > > > the WARNING any more.
> > > >
> > > > I also tested an older kernel (5.19 based), where we also see many
> > > > .text sections with LTO. It
> > > > kexec()'ed fine. (It doesn't have the WARN_ON() in
> > > > kexec_purgatory_setup_sechdrs).
> > >
> > > You have been "lucky" that the code has chosen the correct start
> > > address, you need to modify the linker script of your kernel to
> > > disable PGO.
> > > You need to backport a patch like this:
> > > https://lore.kernel.org/lkml/CAPhsuW5_qAvV0N3o+hOiAnb1=buJ1pLzqYW9D+Bwft6hxJvAeQ@mail.gmail.com/T/#md68b7f832216b0c56bbec0c9b07332e180b9ba2b
> >
> > We already have this commit in our branch. AFAICT, the issue was
> > triggered by LTO. So something like the following seems fixes it
> > (I haven't finished the end-to-end test yet). Does this change make
> > sense to you?
>
> if the end-to-end works, please send it as a patch to the mailing list.
>
> Thanks! :)

OK, it works (AFAICT). Sending the patch.

Thanks,
Song