2020-03-11 21:46:44

by Hans de Goede

[permalink] [raw]
Subject: [PATCH v4 0/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

Hi All,

Here is a new version of my patch (now patch-series) to fix $subject.

The actual patch making the build fail on missing symbols is unchanged,
new in v4 is a preparation patch which fixes a missing symbol (and thus
after the second patch a build failure) when CONFIG_TRACE_BRANCH_PROFILING
is enabled.

Regards,

Hans


2020-03-11 21:46:53

by Hans de Goede

[permalink] [raw]
Subject: [PATCH v4 1/2] x86/purgatory: Fix missing ftrace_likely_update symbol

Fix the purgatory code having a (silent, not reported) missing
symbol reference to ftrace_likely_update when CONFIG_TRACE_BRANCH_PROFILING
is enabled.

Signed-off-by: Hans de Goede <[email protected]>
---
Changes in v4:
-This is a new patch in v4 of this series
---
arch/x86/purgatory/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index fb4ee5444379..c1bd85d6392d 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -25,7 +25,7 @@ KCOV_INSTRUMENT := n

PURGATORY_CFLAGS_REMOVE := -mcmodel=kernel
PURGATORY_CFLAGS := -mcmodel=large -ffreestanding -fno-zero-initialized-in-bss
-PURGATORY_CFLAGS += $(DISABLE_STACKLEAK_PLUGIN)
+PURGATORY_CFLAGS += $(DISABLE_STACKLEAK_PLUGIN) -DDISABLE_BRANCH_PROFILING

# Default KBUILD_CFLAGS can have -pg option set when FTRACE is enabled. That
# in turn leaves some undefined symbols like __fentry__ in purgatory and not
--
2.25.1

2020-03-11 21:48:16

by Hans de Goede

[permalink] [raw]
Subject: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

Since we link purgatory.ro with -r aka we enable "incremental linking"
no checks for unresolved symbols is done while linking purgatory.ro.

Changes to the sha256 code has caused the purgatory in 5.4-rc1 to have
a missing symbol on memzero_explicit, yet things still happily build.

This commit adds an extra check for unresolved symbols by calling ld
without -r before running bin2c to generate kexec-purgatory.c.

This causes a build of 5.4-rc1 with this patch added to fail as it should:

CHK arch/x86/purgatory/purgatory.ro
ld: arch/x86/purgatory/purgatory.ro: in function `sha256_transform':
sha256.c:(.text+0x1c0c): undefined reference to `memzero_explicit'
make[2]: *** [arch/x86/purgatory/Makefile:72:
arch/x86/purgatory/kexec-purgatory.c] Error 1
make[1]: *** [scripts/Makefile.build:509: arch/x86/purgatory] Error 2
make: *** [Makefile:1650: arch/x86] Error 2

This will help us catch missing symbols in the purgatory sooner.

Note this commit also removes --no-undefined from LDFLAGS_purgatory.ro
as that has no effect.

Signed-off-by: Hans de Goede <[email protected]>
---
Changes in v3:
- Add a .gitignore file with purgatory.chk listed in it

Changes in v2:
- Using 2 if_changed lines under a single rule does not work, then
1 of the 2 will always execute each build.
Instead add a new (unused) purgatory.chk intermediate which gets
linked from purgatory.ro without -r to do the missing symbols check
- This also fixes the check generating an a.out file (oops)
---
arch/x86/purgatory/.gitignore | 1 +
arch/x86/purgatory/Makefile | 13 ++++++++++---
2 files changed, 11 insertions(+), 3 deletions(-)
create mode 100644 arch/x86/purgatory/.gitignore

diff --git a/arch/x86/purgatory/.gitignore b/arch/x86/purgatory/.gitignore
new file mode 100644
index 000000000000..d2be1500671d
--- /dev/null
+++ b/arch/x86/purgatory/.gitignore
@@ -0,0 +1 @@
+purgatory.chk
diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index c1bd85d6392d..eca8366b8d2e 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -14,8 +14,12 @@ $(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE

CFLAGS_sha256.o := -D__DISABLE_EXPORTS

-LDFLAGS_purgatory.ro := -e purgatory_start -r --no-undefined -nostdlib -z nodefaultlib
-targets += purgatory.ro
+# Since we link purgatory.ro with -r unresolved symbols are not checked, so we
+# also link a purgatory.chk binary without -r to check for unresolved symbols.
+PURGATORY_LDFLAGS := -e purgatory_start -nostdlib -z nodefaultlib
+LDFLAGS_purgatory.ro := -r $(PURGATORY_LDFLAGS)
+LDFLAGS_purgatory.chk := $(PURGATORY_LDFLAGS)
+targets += purgatory.ro purgatory.chk

KASAN_SANITIZE := n
KCOV_INSTRUMENT := n
@@ -58,12 +62,15 @@ CFLAGS_string.o += $(PURGATORY_CFLAGS)
$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
$(call if_changed,ld)

+$(obj)/purgatory.chk: $(obj)/purgatory.ro FORCE
+ $(call if_changed,ld)
+
targets += kexec-purgatory.c

quiet_cmd_bin2c = BIN2C $@
cmd_bin2c = $(objtree)/scripts/bin2c kexec_purgatory < $< > $@

-$(obj)/kexec-purgatory.c: $(obj)/purgatory.ro FORCE
+$(obj)/kexec-purgatory.c: $(obj)/purgatory.ro $(obj)/purgatory.chk FORCE
$(call if_changed,bin2c)

obj-$(CONFIG_KEXEC_FILE) += kexec-purgatory.o
--
2.25.1

2020-03-12 00:14:01

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

On Wed, Mar 11, 2020 at 10:46:01PM +0100, Hans de Goede wrote:
> Since we link purgatory.ro with -r aka we enable "incremental linking"
> no checks for unresolved symbols is done while linking purgatory.ro.
>

Do we actually need to link purgatory with -r? We could use
--emit-relocs to get the relocation sections generated the way the main
x86 kernel does, no?

Eg like the below? This would avoid the double-link creating
purgatory.chk.

diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index fb4ee5444379..5332f95ca1d3 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -14,8 +14,8 @@ $(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE

CFLAGS_sha256.o := -D__DISABLE_EXPORTS

-LDFLAGS_purgatory.ro := -e purgatory_start -r --no-undefined -nostdlib -z nodefaultlib
-targets += purgatory.ro
+LDFLAGS_purgatory := -e purgatory_start --emit-relocs -nostdlib -z nodefaultlib
+targets += purgatory

KASAN_SANITIZE := n
KCOV_INSTRUMENT := n
@@ -55,7 +55,7 @@ CFLAGS_sha256.o += $(PURGATORY_CFLAGS)
CFLAGS_REMOVE_string.o += $(PURGATORY_CFLAGS_REMOVE)
CFLAGS_string.o += $(PURGATORY_CFLAGS)

-$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
+$(obj)/purgatory: $(PURGATORY_OBJS) FORCE
$(call if_changed,ld)

targets += kexec-purgatory.c
@@ -63,7 +63,7 @@ targets += kexec-purgatory.c
quiet_cmd_bin2c = BIN2C $@
cmd_bin2c = $(objtree)/scripts/bin2c kexec_purgatory < $< > $@

-$(obj)/kexec-purgatory.c: $(obj)/purgatory.ro FORCE
+$(obj)/kexec-purgatory.c: $(obj)/purgatory FORCE
$(call if_changed,bin2c)

obj-$(CONFIG_KEXEC_FILE) += kexec-purgatory.o

2020-03-12 11:32:35

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

Hi,

On 3/12/20 1:10 AM, Arvind Sankar wrote:
> On Wed, Mar 11, 2020 at 10:46:01PM +0100, Hans de Goede wrote:
>> Since we link purgatory.ro with -r aka we enable "incremental linking"
>> no checks for unresolved symbols is done while linking purgatory.ro.
>>
>
> Do we actually need to link purgatory with -r? We could use
> --emit-relocs to get the relocation sections generated the way the main
> x86 kernel does, no?
>
> Eg like the below? This would avoid the double-link creating
> purgatory.chk.

Ok, I've just given that a try and it does indeed catch missing symbols
I'm not sure if it still generates a working purgatory though (I did
not try kexec from a kernel with it).

Since this all is somewhat black magic to me, my goal was to not change
how we build the purgatory while still adding a step which checks for
missing symbols, as my changes from a while back to unify all the SHA256
implementations in the kernel had accidentally caused a missing symbol
which went unnoticed for a while.

Also the same patch, using the 2 step approach has already been merged
for the s390x purgatory code, so doing it your way would lead to 2
different approaches in the kernel.

I do agree that your way does seem to be more elegant though...

I also see that the kbuild test robot has managed to come up with
yet another set of Kconfig options triggering missing symbols.

IMHO this shows that we really need some patch to detect these,
because clearly there are a lot of config-s with missing symbols
in the purgatory out there with no-one noticing.

I will send out a v5 of my patch-set changing the first patch to
also fix the new issue the kbuild test robot has found. I'm going
to leave this patch as is. If you prefer replacing the second patch
in the set (this patch) with your solution then that is fine with me.

Regards,

Hans


> diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
> index fb4ee5444379..5332f95ca1d3 100644
> --- a/arch/x86/purgatory/Makefile
> +++ b/arch/x86/purgatory/Makefile
> @@ -14,8 +14,8 @@ $(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE
>
> CFLAGS_sha256.o := -D__DISABLE_EXPORTS
>
> -LDFLAGS_purgatory.ro := -e purgatory_start -r --no-undefined -nostdlib -z nodefaultlib
> -targets += purgatory.ro
> +LDFLAGS_purgatory := -e purgatory_start --emit-relocs -nostdlib -z nodefaultlib
> +targets += purgatory
>
> KASAN_SANITIZE := n
> KCOV_INSTRUMENT := n
> @@ -55,7 +55,7 @@ CFLAGS_sha256.o += $(PURGATORY_CFLAGS)
> CFLAGS_REMOVE_string.o += $(PURGATORY_CFLAGS_REMOVE)
> CFLAGS_string.o += $(PURGATORY_CFLAGS)
>
> -$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
> +$(obj)/purgatory: $(PURGATORY_OBJS) FORCE
> $(call if_changed,ld)
>
> targets += kexec-purgatory.c
> @@ -63,7 +63,7 @@ targets += kexec-purgatory.c
> quiet_cmd_bin2c = BIN2C $@
> cmd_bin2c = $(objtree)/scripts/bin2c kexec_purgatory < $< > $@
>
> -$(obj)/kexec-purgatory.c: $(obj)/purgatory.ro FORCE
> +$(obj)/kexec-purgatory.c: $(obj)/purgatory FORCE
> $(call if_changed,bin2c)
>
> obj-$(CONFIG_KEXEC_FILE) += kexec-purgatory.o
>

2020-03-12 11:43:04

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

On Thu, Mar 12, 2020 at 12:31:39PM +0100, Hans de Goede wrote:
> I will send out a v5 of my patch-set changing the first patch to
> also fix the new issue the kbuild test robot has found. I'm going
> to leave this patch as is. If you prefer replacing the second patch
> in the set (this patch) with your solution then that is fine with me.

Can we please slow down here, select the best solution, test it properly
- yes, kexec file-based syscall whatever which uses the purgatory - and be
done with it once and for all instead of quickly shooting out patchsets
which keep breaking some randconfigs?

In order to check for the latter, you can script around "make
randconfig" and let it run for a while on a big machine. This is how I do it.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-03-12 12:00:01

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

Hi,

On 3/12/20 12:42 PM, Borislav Petkov wrote:
> On Thu, Mar 12, 2020 at 12:31:39PM +0100, Hans de Goede wrote:
>> I will send out a v5 of my patch-set changing the first patch to
>> also fix the new issue the kbuild test robot has found. I'm going
>> to leave this patch as is. If you prefer replacing the second patch
>> in the set (this patch) with your solution then that is fine with me.
>
> Can we please slow down here, select the best solution, test it properly
> - yes, kexec file-based syscall whatever which uses the purgatory -

My version of this patch has already been tested this way. It is
Arvind's alternative version which has not been tested yet.

> and be
> done with it once and for all instead of quickly shooting out patchsets
> which keep breaking some randconfigs?

For v5 I have synced the list of disabled trace/sanitize options with
the one from drivers/firmware/efi/libstub/Makefile so v5 should catch
all known cases of things breaking with some special randconfigs.

Also I would like to point out that:

1. Things are already broken, my patch just exposes the brokenness
of some configs, it is not actually breaking things (well it breaks
the build, changing a silent brokenness into an obvious one).

2. I send out the first version of this patch on 7 October 2019, it
has not seen any reaction until now. So I'm sending out new versions
quickly now that this issue is finally getting some attention...

Regards,

Hans

2020-03-12 12:52:18

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

On Thu, Mar 12, 2020 at 12:58:24PM +0100, Hans de Goede wrote:
> My version of this patch has already been tested this way. It is

Tested with kexec maybe but if the 0day bot keeps finding breakage, that
ain't good enough.

> 1. Things are already broken, my patch just exposes the brokenness
> of some configs, it is not actually breaking things (well it breaks
> the build, changing a silent brokenness into an obvious one).

As I already explained, that is not good enough.

> 2. I send out the first version of this patch on 7 October 2019, it
> has not seen any reaction until now. So I'm sending out new versions
> quickly now that this issue is finally getting some attention...

And that is never the right approach.

Maintainers are busy as hell so !urgent stuff gets to wait. Spamming
them with more patchsets does not help - fixing stuff properly does.

So, to sum up: if Arvind's approach is the better one, then we should do
that and s390 should be fixed this way too. And all tested. And we will
remove the hurry element from it all since it has not been noticed so
far so it is not urgent and we can take our time and fix it properly.

Ok?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-03-12 13:35:31

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

Hi,

On 3/12/20 1:50 PM, Borislav Petkov wrote:
> On Thu, Mar 12, 2020 at 12:58:24PM +0100, Hans de Goede wrote:
>> My version of this patch has already been tested this way. It is
>
> Tested with kexec maybe but if the 0day bot keeps finding breakage, that
> ain't good enough.

Which is why I have been fixing the issues which the 0day bot finds,
but then I get complaints about reving the patch set to quickly...

>> 1. Things are already broken, my patch just exposes the brokenness
>> of some configs, it is not actually breaking things (well it breaks
>> the build, changing a silent brokenness into an obvious one).
>
> As I already explained, that is not good enough.

Right, which as I already said is why I'm fixing those issues.

>> 2. I send out the first version of this patch on 7 October 2019, it
>> has not seen any reaction until now. So I'm sending out new versions
>> quickly now that this issue is finally getting some attention...
>
> And that is never the right approach.

In my experience once a patch-set has a maintainers attention,
quickly fixing any issues found usually is the right approach.
Because then usually it can get merged quickly and both the maintainer
and I can move on to other stuff. I'm sorry if you are finding this
annoying.

> Maintainers are busy as hell so !urgent stuff gets to wait. Spamming
> them with more patchsets does not help - fixing stuff properly does.

I am trying to fix this properly, the reason the 0daybot is
complaining is because of pre-existing bugs, not because of issues
with my original patch.

If I was not trying to fix this properly I would have long dropped
this patch to the floor and moved on.

TBH I'm quite unhappy that I'm being "yelled" at now (or so it
feels) while all I'm doing is trying to fix a long standing issue :(

> So, to sum up: if Arvind's approach is the better one, then we should do
> that and s390 should be fixed this way too. And all tested. And we will
> remove the hurry element from it all since it has not been noticed so
> far so it is not urgent and we can take our time and fix it properly.
>
> Ok?

No not ok, I'm doing my best to help make things better here and
in return I'm getting what feels as a bunch of negativity and that
is NOT ok!

Now as how to move forward with this, I suggest that:

1) We wait a bit to see if the 0daybot finds any more existing issues
which are exposed by my patch

2) Change my patch to check for missing symbols to use the approach
which Arvind has suggested

3) Check that "kexec -l <kernel>" + "kexec -e" still work

4) Post v6.

Does that work for you ?

Regards,

Hans

2020-03-12 14:26:45

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

On Thu, Mar 12, 2020 at 02:34:30PM +0100, Hans de Goede wrote:
> Which is why I have been fixing the issues which the 0day bot finds,
> but then I get complaints about reving the patch set to quickly...

So here's what I'm seeing: patchsets get sent and 0day bot finds an
issue almost each time. Which tells me, this patchset doesn't look ready.

I suggested you to do "make randconfig" builds to find those breakages
yourself but if you prefer 0day bot to do that, that's fine too.

> In my experience once a patch-set has a maintainers attention,
> quickly fixing any issues found usually is the right approach.
> Because then usually it can get merged quickly and both the maintainer
> and I can move on to other stuff. I'm sorry if you are finding this
> annoying.

My experience shows that almost always there's an aspect where both the
submitter and the maintainer haven't thought about and hurrying stuff
makes it worse. That's why I prefer stuff to be hammered out and tested
properly and *then* applied.

> TBH I'm quite unhappy that I'm being "yelled" at now (or so it
> feels) while all I'm doing is trying to fix a long standing issue :(

What in my reply made you feel you're being "yelled" at?

> No not ok, I'm doing my best to help make things better here and
> in return I'm getting what feels as a bunch of negativity and that
> is NOT ok!

I have no clue what in my replies made you feel that. Please explain.
How should I have replied so that it doesn't come across negative?

> Now as how to move forward with this, I suggest that:
>
> 1) We wait a bit to see if the 0daybot finds any more existing issues
> which are exposed by my patch
>
> 2) Change my patch to check for missing symbols to use the approach
> which Arvind has suggested
>
> 3) Check that "kexec -l <kernel>" + "kexec -e" still work
>
> 4) Post v6.

5) Wait for 0day bot to chew on it too.

> Does that work for you ?

Yes, sounds ok.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-03-12 14:40:23

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

Hi,

On 3/12/20 3:25 PM, Borislav Petkov wrote:
> On Thu, Mar 12, 2020 at 02:34:30PM +0100, Hans de Goede wrote:

<snip>

>> No not ok, I'm doing my best to help make things better here and
>> in return I'm getting what feels as a bunch of negativity and that
>> is NOT ok!
>
> I have no clue what in my replies made you feel that. Please explain.
> How should I have replied so that it doesn't come across negative?

I posted v3 of this patch 5 months ago, then after 0day bot found
an issue after the resend I send a v4 honestly believing that that
was the only issue with it.

0day bot found another issue, so I send out v5, checking what special
options similar code (EFI libstub) uses to make sure I cover all special
cases this time.

So I've send out 2 versions, not 5 not 10, but only 2 versions in
the past 2 days and you start complaining about me rushing this and
not fixing it properly, to me that does not come across positive.

More specifically my intentions / motives on this were well intended
and I too believe in fixing things the proper way. Your reply suggested
that I just want to rush this through, which calls my motives into
question, for which in my mind there was no reason.

If you complain about 2 versions in 2 days, or 5 versions over 5 months
then that feels exaggerated and it certainly does not give me a feeling
that the effort which I'm putting into this is being appreciated.

Anyways we have a plan how to move forward with this now, so lets
focus on that.

>> Now as how to move forward with this, I suggest that:
>>
>> 1) We wait a bit to see if the 0daybot finds any more existing issues
>> which are exposed by my patch
>>
>> 2) Change my patch to check for missing symbols to use the approach
>> which Arvind has suggested
>>
>> 3) Check that "kexec -l <kernel>" + "kexec -e" still work
>>
>> 4) Post v6.
>
> 5) Wait for 0day bot to chew on it too.
>
>> Does that work for you ?
>
> Yes, sounds ok.

Ok, then lets move forward with this.

Regards,

Hans

2020-03-12 14:50:20

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

On Thu, Mar 12, 2020 at 03:38:22PM +0100, Hans de Goede wrote:
> So I've send out 2 versions, not 5 not 10, but only 2 versions in
> the past 2 days and you start complaining about me rushing this and
> not fixing it properly, to me that does not come across positive.

Maybe there's a misunderstanding: when you send a patchset which is not
marked RFC, I read this, as, this patchset is ready for application. But
then the 0day bot catches build errors which means, not ready yet.

And I believe you expected for the 0day bot to test the patches first
and they should then to be considered for application. Yes, no?

That's why I suggested you to do randconfig builds yourself and not
depend on the 0day bot as it is known to be unreliable.

So I didn't do anything to make you feel negative - definitely not
intentionally.

> More specifically my intentions / motives on this were well intended
> and I too believe in fixing things the proper way. Your reply suggested
> that I just want to rush this through, which calls my motives into
> question, for which in my mind there was no reason.
>
> If you complain about 2 versions in 2 days, or 5 versions over 5 months
> then that feels exaggerated and it certainly does not give me a feeling
> that the effort which I'm putting into this is being appreciated.

I believe I already explained what my problem with that is. If you don't
see it, then let's agree to disagree.

> Anyways we have a plan how to move forward with this now, so lets
> focus on that.

Yes, let's do that.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-03-12 14:58:40

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

Hi,

On 3/12/20 3:49 PM, Borislav Petkov wrote:
> On Thu, Mar 12, 2020 at 03:38:22PM +0100, Hans de Goede wrote:
>> So I've send out 2 versions, not 5 not 10, but only 2 versions in
>> the past 2 days and you start complaining about me rushing this and
>> not fixing it properly, to me that does not come across positive.
>
> Maybe there's a misunderstanding: when you send a patchset which is not
> marked RFC, I read this, as, this patchset is ready for application. But
> then the 0day bot catches build errors which means, not ready yet.
>
> And I believe you expected for the 0day bot to test the patches first
> and they should then to be considered for application. Yes, no?

I guess this is the root cause of our misunderstanding. I certainly
did not expect the 0day bot to catch any issues, because I did not
expect there to be any pre-existing issues.

As said I wrote the patch because my sha256 changes from a while ago
broke the purgatory because of introducing a missing symbol. My intend
was to avoid a repeat of that regression by catching issues like this
during build time. I did not expect there to already be (more)
such issues in the existing code; and I certainly did not expect
there to be more then 1 such issue.

So having to do v4 to fix one pre-existing issue was a surprise.
Having to then do a v5 because there was more then one pre-existing
issue was an even bigger surprise.

I understand that you are pushing-back against people using 0day bot
to find bugs for them and that was never my goal.

OTOH I don't appreciate getting push-back because if my change
exposing *pre*-existing bugs. I am not responsible for those
pre-existing bugs and as such I also do not feel responsible for
0day bot triggering on them. Are the 0day bot reports and the need
to rev the patch-set and post a new version annoying? Yes they are;
however they are not my fault.

Regards,

Hans

2020-03-12 15:14:12

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

On Thu, Mar 12, 2020 at 03:57:41PM +0100, Hans de Goede wrote:
> I understand that you are pushing-back against people using 0day bot
> to find bugs for them and that was never my goal.

No, this is not it: the 0day bot is very helpful, when it works, to
catch issues like that. But it is not always reliable, thus I suggested
for you to do the testing yourself before sending your next patchset.
That's all.

But that's fine - I have a test setup on which to run this so we don't
have to wait for the 0day bot.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-03-12 17:47:55

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

Hi,

On 3/12/20 1:10 AM, Arvind Sankar wrote:
> On Wed, Mar 11, 2020 at 10:46:01PM +0100, Hans de Goede wrote:
>> Since we link purgatory.ro with -r aka we enable "incremental linking"
>> no checks for unresolved symbols is done while linking purgatory.ro.
>>
>
> Do we actually need to link purgatory with -r? We could use
> --emit-relocs to get the relocation sections generated the way the main
> x86 kernel does, no?
>
> Eg like the below? This would avoid the double-link creating
> purgatory.chk.

So I've changed the patch for this in my local tree over to the version
suggested below and tested kexec with this (and I can confirm that it
still works)

I'm wondering though if it would not be better to keep the purgatory.ro name ? :

1. The generated ELF binary should still be relocatable
2. .ro files are part of the global .gitignore settings, for the
new purgatory name we need to add an arch/x86/purgatory/.gitignore file
3. Keeping the purgatory.ro name will make the diff easier to read

Regards,

Hans


>
> diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
> index fb4ee5444379..5332f95ca1d3 100644
> --- a/arch/x86/purgatory/Makefile
> +++ b/arch/x86/purgatory/Makefile
> @@ -14,8 +14,8 @@ $(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE
>
> CFLAGS_sha256.o := -D__DISABLE_EXPORTS
>
> -LDFLAGS_purgatory.ro := -e purgatory_start -r --no-undefined -nostdlib -z nodefaultlib
> -targets += purgatory.ro
> +LDFLAGS_purgatory := -e purgatory_start --emit-relocs -nostdlib -z nodefaultlib
> +targets += purgatory
>
> KASAN_SANITIZE := n
> KCOV_INSTRUMENT := n
> @@ -55,7 +55,7 @@ CFLAGS_sha256.o += $(PURGATORY_CFLAGS)
> CFLAGS_REMOVE_string.o += $(PURGATORY_CFLAGS_REMOVE)
> CFLAGS_string.o += $(PURGATORY_CFLAGS)
>
> -$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
> +$(obj)/purgatory: $(PURGATORY_OBJS) FORCE
> $(call if_changed,ld)
>
> targets += kexec-purgatory.c
> @@ -63,7 +63,7 @@ targets += kexec-purgatory.c
> quiet_cmd_bin2c = BIN2C $@
> cmd_bin2c = $(objtree)/scripts/bin2c kexec_purgatory < $< > $@
>
> -$(obj)/kexec-purgatory.c: $(obj)/purgatory.ro FORCE
> +$(obj)/kexec-purgatory.c: $(obj)/purgatory FORCE
> $(call if_changed,bin2c)
>
> obj-$(CONFIG_KEXEC_FILE) += kexec-purgatory.o
>

2020-03-12 18:24:16

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

On Thu, Mar 12, 2020 at 06:46:44PM +0100, Hans de Goede wrote:
> Hi,
>
> On 3/12/20 1:10 AM, Arvind Sankar wrote:
> > On Wed, Mar 11, 2020 at 10:46:01PM +0100, Hans de Goede wrote:
> >> Since we link purgatory.ro with -r aka we enable "incremental linking"
> >> no checks for unresolved symbols is done while linking purgatory.ro.
> >>
> >
> > Do we actually need to link purgatory with -r? We could use
> > --emit-relocs to get the relocation sections generated the way the main
> > x86 kernel does, no?
> >
> > Eg like the below? This would avoid the double-link creating
> > purgatory.chk.
>
> So I've changed the patch for this in my local tree over to the version
> suggested below and tested kexec with this (and I can confirm that it
> still works)
>
> I'm wondering though if it would not be better to keep the purgatory.ro name ? :
>
> 1. The generated ELF binary should still be relocatable
> 2. .ro files are part of the global .gitignore settings, for the
> new purgatory name we need to add an arch/x86/purgatory/.gitignore file
> 3. Keeping the purgatory.ro name will make the diff easier to read
>
> Regards,
>
> Hans
>

No objections to the name, we can change it later to purgatory.elf or
something if we want in a separate patch.

There is one issue I noticed -- x86_64 kernel default LDFLAGS have
max-page-size=0x200000. If purgatory is linked into a real executable
this makes kexec-purgatory.o become quite large, from about 25k to just
over 4Mb. Adding -z max-page-size=4096 to the LDFLAGS lowers this back
down to around 29k. arch/x86/entry/vdso/Makefile adds this flag too.

2020-03-13 04:45:19

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

On Thu, Mar 12, 2020 at 01:50:39PM +0100, Borislav Petkov wrote:
> On Thu, Mar 12, 2020 at 12:58:24PM +0100, Hans de Goede wrote:
> > My version of this patch has already been tested this way. It is
>
> Tested with kexec maybe but if the 0day bot keeps finding breakage, that
> ain't good enough.
>
> > 1. Things are already broken, my patch just exposes the brokenness
> > of some configs, it is not actually breaking things (well it breaks
> > the build, changing a silent brokenness into an obvious one).
>
> As I already explained, that is not good enough.
>
> > 2. I send out the first version of this patch on 7 October 2019, it
> > has not seen any reaction until now. So I'm sending out new versions
> > quickly now that this issue is finally getting some attention...
>
> And that is never the right approach.
>
> Maintainers are busy as hell so !urgent stuff gets to wait. Spamming
> them with more patchsets does not help - fixing stuff properly does.
>
> So, to sum up: if Arvind's approach is the better one, then we should do
> that and s390 should be fixed this way too. And all tested. And we will
> remove the hurry element from it all since it has not been noticed so
> far so it is not urgent and we can take our time and fix it properly.
>
> Ok?
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

If I could try to summarize the situation here:
- the purgatory requires filtering out certain CFLAGS/other settings set
for the generic kernel in order to work correctly
- the patch proposed by Hans de Goede will detect missing filters at
build time rather than when kexec is executed
- the filtering is currently not perfect as demonstrated by issues that
0day bot is finding -- but the patchset will find these problems at
build time rather than runtime
- there might be a slight optimization as proposed by me [1] but it
might have problems as in [2] even if it seems to work

I think the patch as of v5 [0] is useful right now, to catch CFLAGS
additions that aren't currently being filtered correctly. The real
problem is that there exist CFLAGS that should be used for all source
files in the kernel, and there are CFLAGS (eg tracing, stack check etc)
that should only be used for the kernel proper. For special
compilations, such as boot stubs, vdso's, purgatory we should have the
generic CFLAGS but not the kernel-proper CFLAGS. The issue currently is
that these special compilations need to filter out all the flags added
for kernel-proper, and this is a moving target as more tracing/sanity
flags get added. Neither the solution of simply re-initializing CFLAGS
(which will miss generic CFLAGS) nor trying to filter out CFLAGS (which
will miss new kernel-proper CFLAGS) works very well. I think ideally
splitting these into independent variables, i.e. BASE_FLAGS that can be
used for everything, and KERNEL_FLAGS only to be used for the kernel
proper is likely eventually the better solution, rather than conflating
both into KBUILD_CFLAGS.

But to move forward incrementally, patch v5 is probably the cleanest. My
suggestion in [1] I'm thinking is changing things significantly for
kexec, by changing the purgatory from a relocatable object file into an
actual executable, and might have knock-on implications that need to be
reviewed and tested carefully before it can be merged, as shown by [2].

[0] https://lore.kernel.org/lkml/[email protected]/
[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://lore.kernel.org/lkml/[email protected]/

2020-03-13 05:00:27

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

On Fri, Mar 13, 2020 at 12:42:36AM -0400, Arvind Sankar wrote:
> On Thu, Mar 12, 2020 at 01:50:39PM +0100, Borislav Petkov wrote:
> > On Thu, Mar 12, 2020 at 12:58:24PM +0100, Hans de Goede wrote:
> > > My version of this patch has already been tested this way. It is
> >
> > Tested with kexec maybe but if the 0day bot keeps finding breakage, that
> > ain't good enough.
> >
> > > 1. Things are already broken, my patch just exposes the brokenness
> > > of some configs, it is not actually breaking things (well it breaks
> > > the build, changing a silent brokenness into an obvious one).
> >
> > As I already explained, that is not good enough.
> >
> > > 2. I send out the first version of this patch on 7 October 2019, it
> > > has not seen any reaction until now. So I'm sending out new versions
> > > quickly now that this issue is finally getting some attention...
> >
> > And that is never the right approach.
> >
> > Maintainers are busy as hell so !urgent stuff gets to wait. Spamming
> > them with more patchsets does not help - fixing stuff properly does.
> >
> > So, to sum up: if Arvind's approach is the better one, then we should do
> > that and s390 should be fixed this way too. And all tested. And we will
> > remove the hurry element from it all since it has not been noticed so
> > far so it is not urgent and we can take our time and fix it properly.
> >
> > Ok?
> >
> > Thx.
> >
> > --
> > Regards/Gruss,
> > Boris.
> >
> > https://people.kernel.org/tglx/notes-about-netiquette
>
> If I could try to summarize the situation here:
> - the purgatory requires filtering out certain CFLAGS/other settings set
> for the generic kernel in order to work correctly
> - the patch proposed by Hans de Goede will detect missing filters at
> build time rather than when kexec is executed
> - the filtering is currently not perfect as demonstrated by issues that
> 0day bot is finding -- but the patchset will find these problems at
> build time rather than runtime
> - there might be a slight optimization as proposed by me [1] but it
> might have problems as in [2] even if it seems to work
>
> I think the patch as of v5 [0] is useful right now, to catch CFLAGS
> additions that aren't currently being filtered correctly. The real
> problem is that there exist CFLAGS that should be used for all source
> files in the kernel, and there are CFLAGS (eg tracing, stack check etc)
> that should only be used for the kernel proper. For special
> compilations, such as boot stubs, vdso's, purgatory we should have the
> generic CFLAGS but not the kernel-proper CFLAGS. The issue currently is
> that these special compilations need to filter out all the flags added
> for kernel-proper, and this is a moving target as more tracing/sanity
> flags get added. Neither the solution of simply re-initializing CFLAGS
> (which will miss generic CFLAGS) nor trying to filter out CFLAGS (which
> will miss new kernel-proper CFLAGS) works very well. I think ideally
> splitting these into independent variables, i.e. BASE_FLAGS that can be
> used for everything, and KERNEL_FLAGS only to be used for the kernel
> proper is likely eventually the better solution, rather than conflating
> both into KBUILD_CFLAGS.
>
> But to move forward incrementally, patch v5 is probably the cleanest. My
> suggestion in [1] I'm thinking is changing things significantly for
> kexec, by changing the purgatory from a relocatable object file into an
> actual executable, and might have knock-on implications that need to be
> reviewed and tested carefully before it can be merged, as shown by [2].
>
> [0] https://lore.kernel.org/lkml/[email protected]/
> [1] https://lore.kernel.org/lkml/[email protected]/
> [2] https://lore.kernel.org/lkml/[email protected]/

Cc Nick Desaulniers, Nathan Chancellor, Ard Biesheuvel, who've all been
involved in these issue of trying to decide whether to filter out CFLAGS
or recreate them from scratch in various places.

2020-03-13 05:15:56

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

On Fri, Mar 13, 2020 at 12:58:54AM -0400, Arvind Sankar wrote:
> On Fri, Mar 13, 2020 at 12:42:36AM -0400, Arvind Sankar wrote:
> > On Thu, Mar 12, 2020 at 01:50:39PM +0100, Borislav Petkov wrote:
> > > On Thu, Mar 12, 2020 at 12:58:24PM +0100, Hans de Goede wrote:
> > > > My version of this patch has already been tested this way. It is
> > >
> > > Tested with kexec maybe but if the 0day bot keeps finding breakage, that
> > > ain't good enough.
> > >
> > > > 1. Things are already broken, my patch just exposes the brokenness
> > > > of some configs, it is not actually breaking things (well it breaks
> > > > the build, changing a silent brokenness into an obvious one).
> > >
> > > As I already explained, that is not good enough.
> > >
> > > > 2. I send out the first version of this patch on 7 October 2019, it
> > > > has not seen any reaction until now. So I'm sending out new versions
> > > > quickly now that this issue is finally getting some attention...
> > >
> > > And that is never the right approach.
> > >
> > > Maintainers are busy as hell so !urgent stuff gets to wait. Spamming
> > > them with more patchsets does not help - fixing stuff properly does.
> > >
> > > So, to sum up: if Arvind's approach is the better one, then we should do
> > > that and s390 should be fixed this way too. And all tested. And we will
> > > remove the hurry element from it all since it has not been noticed so
> > > far so it is not urgent and we can take our time and fix it properly.
> > >
> > > Ok?
> > >
> > > Thx.
> > >
> > > --
> > > Regards/Gruss,
> > > Boris.
> > >
> > > https://people.kernel.org/tglx/notes-about-netiquette
> >
> > If I could try to summarize the situation here:
> > - the purgatory requires filtering out certain CFLAGS/other settings set
> > for the generic kernel in order to work correctly
> > - the patch proposed by Hans de Goede will detect missing filters at
> > build time rather than when kexec is executed
> > - the filtering is currently not perfect as demonstrated by issues that
> > 0day bot is finding -- but the patchset will find these problems at
> > build time rather than runtime
> > - there might be a slight optimization as proposed by me [1] but it
> > might have problems as in [2] even if it seems to work
> >
> > I think the patch as of v5 [0] is useful right now, to catch CFLAGS
> > additions that aren't currently being filtered correctly. The real
> > problem is that there exist CFLAGS that should be used for all source
> > files in the kernel, and there are CFLAGS (eg tracing, stack check etc)
> > that should only be used for the kernel proper. For special
> > compilations, such as boot stubs, vdso's, purgatory we should have the
> > generic CFLAGS but not the kernel-proper CFLAGS. The issue currently is
> > that these special compilations need to filter out all the flags added
> > for kernel-proper, and this is a moving target as more tracing/sanity
> > flags get added. Neither the solution of simply re-initializing CFLAGS
> > (which will miss generic CFLAGS) nor trying to filter out CFLAGS (which
> > will miss new kernel-proper CFLAGS) works very well. I think ideally
> > splitting these into independent variables, i.e. BASE_FLAGS that can be
> > used for everything, and KERNEL_FLAGS only to be used for the kernel
> > proper is likely eventually the better solution, rather than conflating
> > both into KBUILD_CFLAGS.
> >
> > But to move forward incrementally, patch v5 is probably the cleanest. My
> > suggestion in [1] I'm thinking is changing things significantly for
> > kexec, by changing the purgatory from a relocatable object file into an
> > actual executable, and might have knock-on implications that need to be
> > reviewed and tested carefully before it can be merged, as shown by [2].
> >
> > [0] https://lore.kernel.org/lkml/[email protected]/
> > [1] https://lore.kernel.org/lkml/[email protected]/
> > [2] https://lore.kernel.org/lkml/[email protected]/
>
> Cc Nick Desaulniers, Nathan Chancellor, Ard Biesheuvel, who've all been
> involved in these issue of trying to decide whether to filter out CFLAGS
> or recreate them from scratch in various places.

And just to add, I've personally been involved in two patches to unbreak
the purgatory because of changes that broke it only at runtime, not
build time, which would both have been caught by Hans's patchset.

[1] ca14c996afe7 ("x86/purgatory: Disable the stackleak GCC plugin for the purgatory")
[2] bec500777089 ("lib/string: Make memzero_explicit() inline instead of external")

2020-03-13 10:49:28

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

Hi,

On 3/13/20 5:42 AM, Arvind Sankar wrote:
> On Thu, Mar 12, 2020 at 01:50:39PM +0100, Borislav Petkov wrote:
>> On Thu, Mar 12, 2020 at 12:58:24PM +0100, Hans de Goede wrote:
>>> My version of this patch has already been tested this way. It is
>>
>> Tested with kexec maybe but if the 0day bot keeps finding breakage, that
>> ain't good enough.
>>
>>> 1. Things are already broken, my patch just exposes the brokenness
>>> of some configs, it is not actually breaking things (well it breaks
>>> the build, changing a silent brokenness into an obvious one).
>>
>> As I already explained, that is not good enough.
>>
>>> 2. I send out the first version of this patch on 7 October 2019, it
>>> has not seen any reaction until now. So I'm sending out new versions
>>> quickly now that this issue is finally getting some attention...
>>
>> And that is never the right approach.
>>
>> Maintainers are busy as hell so !urgent stuff gets to wait. Spamming
>> them with more patchsets does not help - fixing stuff properly does.
>>
>> So, to sum up: if Arvind's approach is the better one, then we should do
>> that and s390 should be fixed this way too. And all tested. And we will
>> remove the hurry element from it all since it has not been noticed so
>> far so it is not urgent and we can take our time and fix it properly.
>>
>> Ok?
>>
>> Thx.
>>
>> --
>> Regards/Gruss,
>> Boris.
>>
>> https://people.kernel.org/tglx/notes-about-netiquette
>
> If I could try to summarize the situation here:
> - the purgatory requires filtering out certain CFLAGS/other settings set
> for the generic kernel in order to work correctly
> - the patch proposed by Hans de Goede will detect missing filters at
> build time rather than when kexec is executed
> - the filtering is currently not perfect as demonstrated by issues that
> 0day bot is finding -- but the patchset will find these problems at
> build time rather than runtime
> - there might be a slight optimization as proposed by me [1] but it
> might have problems as in [2] even if it seems to work
>
> I think the patch as of v5 [0] is useful right now, to catch CFLAGS
> additions that aren't currently being filtered correctly. The real
> problem is that there exist CFLAGS that should be used for all source
> files in the kernel, and there are CFLAGS (eg tracing, stack check etc)
> that should only be used for the kernel proper. For special
> compilations, such as boot stubs, vdso's, purgatory we should have the
> generic CFLAGS but not the kernel-proper CFLAGS. The issue currently is
> that these special compilations need to filter out all the flags added
> for kernel-proper, and this is a moving target as more tracing/sanity
> flags get added. Neither the solution of simply re-initializing CFLAGS
> (which will miss generic CFLAGS) nor trying to filter out CFLAGS (which
> will miss new kernel-proper CFLAGS) works very well. I think ideally
> splitting these into independent variables, i.e. BASE_FLAGS that can be
> used for everything, and KERNEL_FLAGS only to be used for the kernel
> proper is likely eventually the better solution, rather than conflating
> both into KBUILD_CFLAGS.
>
> But to move forward incrementally, patch v5 is probably the cleanest. My
> suggestion in [1] I'm thinking is changing things significantly for
> kexec, by changing the purgatory from a relocatable object file into an
> actual executable, and might have knock-on implications that need to be
> reviewed and tested carefully before it can be merged, as shown by [2].
>
> [0] https://lore.kernel.org/lkml/[email protected]/
> [1] https://lore.kernel.org/lkml/[email protected]/
> [2] https://lore.kernel.org/lkml/[email protected]/

Thank you for taking such a detailed look at this.

Given your concerns about the alternative fix you proposed I agree that
it is best to just move forward with my v5 patch-set for now; and then we
can try to improve the missing-symbols check in the future.

Regards,

Hans



2020-03-13 18:07:37

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

On Fri, Mar 13, 2020 at 12:42:36AM -0400, Arvind Sankar wrote:
> The real problem is that there exist CFLAGS that should be used for
> all source files in the kernel, and there are CFLAGS (eg tracing,
> stack check etc) that should only be used for the kernel proper. For
> special compilations, such as boot stubs, vdso's, purgatory we should
> have the generic CFLAGS but not the kernel-proper CFLAGS. The issue
> currently is that these special compilations need to filter out all
> the flags added for kernel-proper, and this is a moving target as
> more tracing/sanity flags get added. Neither the solution of simply
> re-initializing CFLAGS (which will miss generic CFLAGS) nor trying to
> filter out CFLAGS (which will miss new kernel-proper CFLAGS) works
> very well. I think ideally splitting these into independent variables,
> i.e. BASE_FLAGS that can be used for everything, and KERNEL_FLAGS
> only to be used for the kernel proper is likely eventually the better
> solution, rather than conflating both into KBUILD_CFLAGS.

Hohumm, this has come up a bunch of times in the past in conjunction
with boot/{,compressed/} Makefiles too. I'd be open towards reworking
this properly but I'm afraid it would cause a lot of churn and breakage
and it is hard to say how ugly it would become before someone actually
tries it. ;-\

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-03-16 18:55:02

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

On Thu, Mar 12, 2020 at 9:58 PM Arvind Sankar <[email protected]> wrote:
>
> On Fri, Mar 13, 2020 at 12:42:36AM -0400, Arvind Sankar wrote:
> > On Thu, Mar 12, 2020 at 01:50:39PM +0100, Borislav Petkov wrote:
> > > On Thu, Mar 12, 2020 at 12:58:24PM +0100, Hans de Goede wrote:
> > > > My version of this patch has already been tested this way. It is
> > >
> > > Tested with kexec maybe but if the 0day bot keeps finding breakage, that
> > > ain't good enough.
> > >
> > > > 1. Things are already broken, my patch just exposes the brokenness
> > > > of some configs, it is not actually breaking things (well it breaks
> > > > the build, changing a silent brokenness into an obvious one).
> > >
> > > As I already explained, that is not good enough.
> > >
> > > > 2. I send out the first version of this patch on 7 October 2019, it
> > > > has not seen any reaction until now. So I'm sending out new versions
> > > > quickly now that this issue is finally getting some attention...
> > >
> > > And that is never the right approach.
> > >
> > > Maintainers are busy as hell so !urgent stuff gets to wait. Spamming
> > > them with more patchsets does not help - fixing stuff properly does.
> > >
> > > So, to sum up: if Arvind's approach is the better one, then we should do
> > > that and s390 should be fixed this way too. And all tested. And we will
> > > remove the hurry element from it all since it has not been noticed so
> > > far so it is not urgent and we can take our time and fix it properly.
> > >
> > > Ok?
> > >
> > > Thx.
> > >
> > > --
> > > Regards/Gruss,
> > > Boris.
> > >
> > > https://people.kernel.org/tglx/notes-about-netiquette
> >
> > If I could try to summarize the situation here:
> > - the purgatory requires filtering out certain CFLAGS/other settings set
> > for the generic kernel in order to work correctly
> > - the patch proposed by Hans de Goede will detect missing filters at
> > build time rather than when kexec is executed
> > - the filtering is currently not perfect as demonstrated by issues that
> > 0day bot is finding -- but the patchset will find these problems at
> > build time rather than runtime
> > - there might be a slight optimization as proposed by me [1] but it
> > might have problems as in [2] even if it seems to work
> >
> > I think the patch as of v5 [0] is useful right now, to catch CFLAGS
> > additions that aren't currently being filtered correctly. The real
> > problem is that there exist CFLAGS that should be used for all source
> > files in the kernel, and there are CFLAGS (eg tracing, stack check etc)
> > that should only be used for the kernel proper. For special
> > compilations, such as boot stubs, vdso's, purgatory we should have the
> > generic CFLAGS but not the kernel-proper CFLAGS. The issue currently is
> > that these special compilations need to filter out all the flags added
> > for kernel-proper, and this is a moving target as more tracing/sanity
> > flags get added. Neither the solution of simply re-initializing CFLAGS
> > (which will miss generic CFLAGS) nor trying to filter out CFLAGS (which
> > will miss new kernel-proper CFLAGS) works very well. I think ideally
> > splitting these into independent variables, i.e. BASE_FLAGS that can be
> > used for everything, and KERNEL_FLAGS only to be used for the kernel
> > proper is likely eventually the better solution, rather than conflating
> > both into KBUILD_CFLAGS.

Yep, I agree with the above summary. The conflation of flags that
shouldn't be applied to purgatory, boot, etc with flags that are
always required is currently messier than I'd like. Will definitely
be a yak shave to detangle them.

> >
> > But to move forward incrementally, patch v5 is probably the cleanest. My
> > suggestion in [1] I'm thinking is changing things significantly for
> > kexec, by changing the purgatory from a relocatable object file into an
> > actual executable, and might have knock-on implications that need to be
> > reviewed and tested carefully before it can be merged, as shown by [2].
> >
> > [0] https://lore.kernel.org/lkml/[email protected]/
> > [1] https://lore.kernel.org/lkml/[email protected]/
> > [2] https://lore.kernel.org/lkml/[email protected]/
>
> Cc Nick Desaulniers, Nathan Chancellor, Ard Biesheuvel, who've all been
> involved in these issue of trying to decide whether to filter out CFLAGS
> or recreate them from scratch in various places.



--
Thanks,
~Nick Desaulniers