2019-11-10 02:50:14

by Sasha Levin

Subject: [PATCH AUTOSEL 4.19 133/191] efi: honour memory reservations passed via a linux specific config table

From: Ard Biesheuvel <[email protected]>

[ Upstream commit 71e0940d52e107748b270213a01d3b1546657d74 ]

In order to allow the OS to reserve memory persistently across a
kexec, introduce a Linux-specific UEFI configuration table that
points to the head of a linked list in memory, allowing each kernel
to add list items describing memory regions that the next kernel
should treat as reserved.

This is useful, e.g., for GICv3 based ARM systems that cannot disable
DMA access to the LPI tables, forcing them to reuse the same memory
region again after a kexec reboot.

Tested-by: Jeremy Linton <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
 drivers/firmware/efi/efi.c | 27 ++++++++++++++++++++++++++-
 include/linux/efi.h        |  8 ++++++++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index d54fca902e64f..f265309859781 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -52,7 +52,8 @@ struct efi __read_mostly efi = {
 	.properties_table	= EFI_INVALID_TABLE_ADDR,
 	.mem_attr_table		= EFI_INVALID_TABLE_ADDR,
 	.rng_seed		= EFI_INVALID_TABLE_ADDR,
-	.tpm_log		= EFI_INVALID_TABLE_ADDR
+	.tpm_log		= EFI_INVALID_TABLE_ADDR,
+	.mem_reserve		= EFI_INVALID_TABLE_ADDR,
 };
 EXPORT_SYMBOL(efi);

@@ -487,6 +488,7 @@ static __initdata efi_config_table_type_t common_tables[] = {
 	{EFI_MEMORY_ATTRIBUTES_TABLE_GUID, "MEMATTR", &efi.mem_attr_table},
 	{LINUX_EFI_RANDOM_SEED_TABLE_GUID, "RNG", &efi.rng_seed},
 	{LINUX_EFI_TPM_EVENT_LOG_GUID, "TPMEventLog", &efi.tpm_log},
+	{LINUX_EFI_MEMRESERVE_TABLE_GUID, "MEMRESERVE", &efi.mem_reserve},
 	{NULL_GUID, NULL, NULL},
 };

@@ -594,6 +596,29 @@ int __init efi_config_parse_tables(void *config_tables, int count, int sz,
 		early_memunmap(tbl, sizeof(*tbl));
 	}

+	if (efi.mem_reserve != EFI_INVALID_TABLE_ADDR) {
+		unsigned long prsv = efi.mem_reserve;
+
+		while (prsv) {
+			struct linux_efi_memreserve *rsv;
+
+			/* reserve the entry itself */
+			memblock_reserve(prsv, sizeof(*rsv));
+
+			rsv = early_memremap(prsv, sizeof(*rsv));
+			if (rsv == NULL) {
+				pr_err("Could not map UEFI memreserve entry!\n");
+				return -ENOMEM;
+			}
+
+			if (rsv->size)
+				memblock_reserve(rsv->base, rsv->size);
+
+			prsv = rsv->next;
+			early_memunmap(rsv, sizeof(*rsv));
+		}
+	}
+
 	return 0;
 }

diff --git a/include/linux/efi.h b/include/linux/efi.h
index cc3391796c0b8..f43fc61fbe2c9 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -672,6 +672,7 @@ void efi_native_runtime_setup(void);
 #define LINUX_EFI_LOADER_ENTRY_GUID EFI_GUID(0x4a67b082, 0x0a4c, 0x41cf, 0xb6, 0xc7, 0x44, 0x0b, 0x29, 0xbb, 0x8c, 0x4f)
 #define LINUX_EFI_RANDOM_SEED_TABLE_GUID EFI_GUID(0x1ce1e5bc, 0x7ceb, 0x42f2, 0x81, 0xe5, 0x8a, 0xad, 0xf1, 0x80, 0xf5, 0x7b)
 #define LINUX_EFI_TPM_EVENT_LOG_GUID EFI_GUID(0xb7799cb0, 0xeca2, 0x4943, 0x96, 0x67, 0x1f, 0xae, 0x07, 0xb7, 0x47, 0xfa)
+#define LINUX_EFI_MEMRESERVE_TABLE_GUID EFI_GUID(0x888eb0c6, 0x8ede, 0x4ff5, 0xa8, 0xf0, 0x9a, 0xee, 0x5c, 0xb9, 0x77, 0xc2)

 typedef struct {
 	efi_guid_t guid;
@@ -957,6 +958,7 @@ extern struct efi {
 	unsigned long mem_attr_table;	/* memory attributes table */
 	unsigned long rng_seed;		/* UEFI firmware random seed */
 	unsigned long tpm_log;		/* TPM2 Event Log table */
+	unsigned long mem_reserve;	/* Linux EFI memreserve table */
 	efi_get_time_t *get_time;
 	efi_set_time_t *set_time;
 	efi_get_wakeup_time_t *get_wakeup_time;
@@ -1667,4 +1669,10 @@ extern int efi_tpm_eventlog_init(void);
 /* Workqueue to queue EFI Runtime Services */
 extern struct workqueue_struct *efi_rts_wq;

+struct linux_efi_memreserve {
+	phys_addr_t	next;
+	phys_addr_t	base;
+	phys_addr_t	size;
+};
+
 #endif /* _LINUX_EFI_H */
--
2.20.1
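
For readers following the thread, the list walk that the patch adds to
efi_config_parse_tables() can be sketched in user space roughly as follows.
This is only an illustration under stated assumptions: sim_mem stands in for
physical memory, sim_reserve() stands in for memblock_reserve(), a memcpy()
stands in for the early_memremap()/early_memunmap() pair, and uint64_t
replaces phys_addr_t. None of these helper names exist in the kernel. The
patch itself does not show how a kernel adds entries, so the hypothetical
add_entry() simply prepends to the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same three fields as struct linux_efi_memreserve in the patch;
 * uint64_t is an assumed stand-in for phys_addr_t. */
struct memreserve_entry {
	uint64_t next;	/* "physical" address of the next entry, 0 ends the list */
	uint64_t base;	/* start of the region to keep reserved */
	uint64_t size;	/* length of the region, 0 means nothing to reserve */
};

/* Hypothetical stand-in for physical memory holding the list entries. */
#define SIM_MEM_SIZE 4096
static unsigned char sim_mem[SIM_MEM_SIZE];

/* Hypothetical record of reservations, standing in for memblock. */
struct region { uint64_t base, size; };
#define MAX_REGIONS 16
static struct region reserved[MAX_REGIONS];
static size_t nr_reserved;

/* Stand-in for memblock_reserve(): remember the region, fail when full. */
int sim_reserve(uint64_t base, uint64_t size)
{
	if (nr_reserved == MAX_REGIONS)
		return -1;
	reserved[nr_reserved].base = base;
	reserved[nr_reserved].size = size;
	nr_reserved++;
	return 0;
}

/* Walk the list from head, reserving each entry and the region it describes. */
int walk_memreserve(uint64_t head)
{
	uint64_t prsv = head;

	while (prsv) {
		struct memreserve_entry rsv;

		/* refuse addresses outside the simulated memory */
		if (prsv > SIM_MEM_SIZE - sizeof(rsv))
			return -1;

		/* reserve the entry itself, as the patch does */
		if (sim_reserve(prsv, sizeof(rsv)))
			return -1;

		/* stands in for early_memremap() followed by early_memunmap() */
		memcpy(&rsv, &sim_mem[prsv], sizeof(rsv));

		if (rsv.size && sim_reserve(rsv.base, rsv.size))
			return -1;

		prsv = rsv.next;
	}
	return 0;
}

/* Hypothetical helper: write an entry at addr and make it the new head. */
uint64_t add_entry(uint64_t head, uint64_t addr, uint64_t base, uint64_t size)
{
	struct memreserve_entry e = { .next = head, .base = base, .size = size };

	assert(addr != 0 && addr <= SIM_MEM_SIZE - sizeof(e));
	memcpy(&sim_mem[addr], &e, sizeof(e));
	return addr;
}
```

Calling add_entry() a few times and then walk_memreserve() on the returned
head records one reservation per list entry plus one per entry with a
non-zero size. As in the patch, a zero next field terminates the walk.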


2019-11-10 07:39:49

by Ard Biesheuvel

Subject: Re: [PATCH AUTOSEL 4.19 133/191] efi: honour memory reservations passed via a linux specific config table

On Sun, 10 Nov 2019 at 03:44, Sasha Levin <[email protected]> wrote:
>
> From: Ard Biesheuvel <[email protected]>
>
> [ Upstream commit 71e0940d52e107748b270213a01d3b1546657d74 ]
>
> In order to allow the OS to reserve memory persistently across a
> kexec, introduce a Linux-specific UEFI configuration table that
> points to the head of a linked list in memory, allowing each kernel
> to add list items describing memory regions that the next kernel
> should treat as reserved.
>
> This is useful, e.g., for GICv3 based ARM systems that cannot disable
> DMA access to the LPI tables, forcing them to reuse the same memory
> region again after a kexec reboot.
>
> Tested-by: Jeremy Linton <[email protected]>
> Signed-off-by: Ard Biesheuvel <[email protected]>
> Signed-off-by: Sasha Levin <[email protected]>

NAK

This doesn't belong in -stable, and I'd be interested in understanding
how this got autoselected, and how I can prevent this from happening
again in the future.


> ---
> drivers/firmware/efi/efi.c | 27 ++++++++++++++++++++++++++-
> include/linux/efi.h | 8 ++++++++
> 2 files changed, 34 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index d54fca902e64f..f265309859781 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -52,7 +52,8 @@ struct efi __read_mostly efi = {
> .properties_table = EFI_INVALID_TABLE_ADDR,
> .mem_attr_table = EFI_INVALID_TABLE_ADDR,
> .rng_seed = EFI_INVALID_TABLE_ADDR,
> - .tpm_log = EFI_INVALID_TABLE_ADDR
> + .tpm_log = EFI_INVALID_TABLE_ADDR,
> + .mem_reserve = EFI_INVALID_TABLE_ADDR,
> };
> EXPORT_SYMBOL(efi);
>
> @@ -487,6 +488,7 @@ static __initdata efi_config_table_type_t common_tables[] = {
> {EFI_MEMORY_ATTRIBUTES_TABLE_GUID, "MEMATTR", &efi.mem_attr_table},
> {LINUX_EFI_RANDOM_SEED_TABLE_GUID, "RNG", &efi.rng_seed},
> {LINUX_EFI_TPM_EVENT_LOG_GUID, "TPMEventLog", &efi.tpm_log},
> + {LINUX_EFI_MEMRESERVE_TABLE_GUID, "MEMRESERVE", &efi.mem_reserve},
> {NULL_GUID, NULL, NULL},
> };
>
> @@ -594,6 +596,29 @@ int __init efi_config_parse_tables(void *config_tables, int count, int sz,
> early_memunmap(tbl, sizeof(*tbl));
> }
>
> + if (efi.mem_reserve != EFI_INVALID_TABLE_ADDR) {
> + unsigned long prsv = efi.mem_reserve;
> +
> + while (prsv) {
> + struct linux_efi_memreserve *rsv;
> +
> + /* reserve the entry itself */
> + memblock_reserve(prsv, sizeof(*rsv));
> +
> + rsv = early_memremap(prsv, sizeof(*rsv));
> + if (rsv == NULL) {
> + pr_err("Could not map UEFI memreserve entry!\n");
> + return -ENOMEM;
> + }
> +
> + if (rsv->size)
> + memblock_reserve(rsv->base, rsv->size);
> +
> + prsv = rsv->next;
> + early_memunmap(rsv, sizeof(*rsv));
> + }
> + }
> +
> return 0;
> }
>
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index cc3391796c0b8..f43fc61fbe2c9 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -672,6 +672,7 @@ void efi_native_runtime_setup(void);
> #define LINUX_EFI_LOADER_ENTRY_GUID EFI_GUID(0x4a67b082, 0x0a4c, 0x41cf, 0xb6, 0xc7, 0x44, 0x0b, 0x29, 0xbb, 0x8c, 0x4f)
> #define LINUX_EFI_RANDOM_SEED_TABLE_GUID EFI_GUID(0x1ce1e5bc, 0x7ceb, 0x42f2, 0x81, 0xe5, 0x8a, 0xad, 0xf1, 0x80, 0xf5, 0x7b)
> #define LINUX_EFI_TPM_EVENT_LOG_GUID EFI_GUID(0xb7799cb0, 0xeca2, 0x4943, 0x96, 0x67, 0x1f, 0xae, 0x07, 0xb7, 0x47, 0xfa)
> +#define LINUX_EFI_MEMRESERVE_TABLE_GUID EFI_GUID(0x888eb0c6, 0x8ede, 0x4ff5, 0xa8, 0xf0, 0x9a, 0xee, 0x5c, 0xb9, 0x77, 0xc2)
>
> typedef struct {
> efi_guid_t guid;
> @@ -957,6 +958,7 @@ extern struct efi {
> unsigned long mem_attr_table; /* memory attributes table */
> unsigned long rng_seed; /* UEFI firmware random seed */
> unsigned long tpm_log; /* TPM2 Event Log table */
> + unsigned long mem_reserve; /* Linux EFI memreserve table */
> efi_get_time_t *get_time;
> efi_set_time_t *set_time;
> efi_get_wakeup_time_t *get_wakeup_time;
> @@ -1667,4 +1669,10 @@ extern int efi_tpm_eventlog_init(void);
> /* Workqueue to queue EFI Runtime Services */
> extern struct workqueue_struct *efi_rts_wq;
>
> +struct linux_efi_memreserve {
> + phys_addr_t next;
> + phys_addr_t base;
> + phys_addr_t size;
> +};
> +
> #endif /* _LINUX_EFI_H */
> --
> 2.20.1
>

2019-11-10 13:31:30

by Sasha Levin

Subject: Re: [PATCH AUTOSEL 4.19 133/191] efi: honour memory reservations passed via a linux specific config table

On Sun, Nov 10, 2019 at 08:33:47AM +0100, Ard Biesheuvel wrote:
>On Sun, 10 Nov 2019 at 03:44, Sasha Levin <[email protected]> wrote:
>>
>> From: Ard Biesheuvel <[email protected]>
>>
>> [ Upstream commit 71e0940d52e107748b270213a01d3b1546657d74 ]
>>
>> In order to allow the OS to reserve memory persistently across a
>> kexec, introduce a Linux-specific UEFI configuration table that
>> points to the head of a linked list in memory, allowing each kernel
>> to add list items describing memory regions that the next kernel
>> should treat as reserved.
>>
>> This is useful, e.g., for GICv3 based ARM systems that cannot disable
>> DMA access to the LPI tables, forcing them to reuse the same memory
>> region again after a kexec reboot.
>>
>> Tested-by: Jeremy Linton <[email protected]>
>> Signed-off-by: Ard Biesheuvel <[email protected]>
>> Signed-off-by: Sasha Levin <[email protected]>
>
>NAK
>
>This doesn't belong in -stable, and I'd be interested in understanding
>how this got autoselected, and how I can prevent this from happening
>again in the future.

It was selected because it's part of a fix for a real issue reported by
users:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1806766

Besides ubuntu, it is also carried by:

SUSE: https://www.suse.com/support/update/announcement/2019/suse-su-20191530-1/
CentOS: https://koji.mbox.centos.org/koji/buildinfo?buildID=4558

As a way to resolve the reported bug.

Any reason this *shouldn't* be in stable? I'm aware that there might be
dependencies that are not obvious to me, but the solution here is to
take those dependencies as well rather than ignore the process
completely.

--
Thanks,
Sasha

2019-11-10 14:21:58

by Ard Biesheuvel

Subject: Re: [PATCH AUTOSEL 4.19 133/191] efi: honour memory reservations passed via a linux specific config table

On Sun, 10 Nov 2019 at 13:27, Sasha Levin <[email protected]> wrote:
>
> On Sun, Nov 10, 2019 at 08:33:47AM +0100, Ard Biesheuvel wrote:
> >On Sun, 10 Nov 2019 at 03:44, Sasha Levin <[email protected]> wrote:
> >>
> >> From: Ard Biesheuvel <[email protected]>
> >>
> >> [ Upstream commit 71e0940d52e107748b270213a01d3b1546657d74 ]
> >>
> >> In order to allow the OS to reserve memory persistently across a
> >> kexec, introduce a Linux-specific UEFI configuration table that
> >> points to the head of a linked list in memory, allowing each kernel
> >> to add list items describing memory regions that the next kernel
> >> should treat as reserved.
> >>
> >> This is useful, e.g., for GICv3 based ARM systems that cannot disable
> >> DMA access to the LPI tables, forcing them to reuse the same memory
> >> region again after a kexec reboot.
> >>
> >> Tested-by: Jeremy Linton <[email protected]>
> >> Signed-off-by: Ard Biesheuvel <[email protected]>
> >> Signed-off-by: Sasha Levin <[email protected]>
> >
> >NAK
> >
> >This doesn't belong in -stable, and I'd be interested in understanding
> >how this got autoselected, and how I can prevent this from happening
> >again in the future.
>
> It was selected because it's part of a fix for a real issue reported by
> users:
>

For my understanding, are you saying your AI is reading launchpad bug
reports etc? Because it is marked AUTOSEL.

> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1806766
>

That page mentions

"""
2 upstream patch series are required to fix this:
https://<email address hidden>/msg10328.html
Which provides an EFI facility consumed by:
https://lkml.org/lkml/2018/9/21/1066
There were also some follow-on fixes to deal with ARM-specific
problems associated with this usage:
https://www.spinics.net/lists/arm-kernel/msg685751.html
"""

and without the other patches, we only add bugs and don't fix any.

> Besides ubuntu, it is also carried by:
>
> SUSE: https://www.suse.com/support/update/announcement/2019/suse-su-20191530-1/
> CentOS: https://koji.mbox.centos.org/koji/buildinfo?buildID=4558
>
> As a way to resolve the reported bug.
>

Backporting a feature/fix like this requires careful consideration of
the patches involved, and doing actual testing on hardware.

> Any reason this *shouldn't* be in stable?

Yes. By itself, it causes crashes at early boot and does not actually
solve the problem.

> I'm aware that there might be
> dependencies that are not obvious to me, but the solution here is to
> take those dependencies as well rather than ignore the process
> completely.
>

This is not a bugfix. kexec/kdump never worked correctly on the
hardware involved, and backporting a feature like that goes way beyond
what I am willing to accept for stable backports affecting the EFI
subsystem.

2019-11-10 16:08:38

by Sasha Levin

Subject: Re: [PATCH AUTOSEL 4.19 133/191] efi: honour memory reservations passed via a linux specific config table

On Sun, Nov 10, 2019 at 02:16:57PM +0000, Ard Biesheuvel wrote:
>On Sun, 10 Nov 2019 at 13:27, Sasha Levin <[email protected]> wrote:
>>
>> On Sun, Nov 10, 2019 at 08:33:47AM +0100, Ard Biesheuvel wrote:
>> >On Sun, 10 Nov 2019 at 03:44, Sasha Levin <[email protected]> wrote:
>> >>
>> >> From: Ard Biesheuvel <[email protected]>
>> >>
>> >> [ Upstream commit 71e0940d52e107748b270213a01d3b1546657d74 ]
>> >>
>> >> In order to allow the OS to reserve memory persistently across a
>> >> kexec, introduce a Linux-specific UEFI configuration table that
>> >> points to the head of a linked list in memory, allowing each kernel
>> >> to add list items describing memory regions that the next kernel
>> >> should treat as reserved.
>> >>
>> >> This is useful, e.g., for GICv3 based ARM systems that cannot disable
>> >> DMA access to the LPI tables, forcing them to reuse the same memory
>> >> region again after a kexec reboot.
>> >>
>> >> Tested-by: Jeremy Linton <[email protected]>
>> >> Signed-off-by: Ard Biesheuvel <[email protected]>
>> >> Signed-off-by: Sasha Levin <[email protected]>
>> >
>> >NAK
>> >
>> >This doesn't belong in -stable, and I'd be interested in understanding
>> >how this got autoselected, and how I can prevent this from happening
>> >again in the future.
>>
>> It was selected because it's part of a fix for a real issue reported by
>> users:
>>
>
>For my understanding, are you saying your AI is reading launchpad bug
>reports etc? Because it is marked AUTOSEL.

Not quite. This review set was me feeding all the patches Ubuntu has on
top of stable trees into AUTOSEL, and sending out the output for review.
It doesn't look into launchpad bug reports on its own, but in my
experience one can find a bug report for mostly everything AUTOSEL
considers to be a bug.

>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1806766
>>
>
>That page mentions
>
>"""
>2 upstream patch series are required to fix this:
> https://<email address hidden>/msg10328.html
>Which provides an EFI facility consumed by:
> https://lkml.org/lkml/2018/9/21/1066
>There were also some follow-on fixes to deal with ARM-specific
>problems associated with this usage:
> https://www.spinics.net/lists/arm-kernel/msg685751.html
>"""
>
>and without the other patches, we only add bugs and don't fix any.
>
>> Besides ubuntu, it is also carried by:
>>
>> SUSE: https://www.suse.com/support/update/announcement/2019/suse-su-20191530-1/
>> CentOS: https://koji.mbox.centos.org/koji/buildinfo?buildID=4558
>>
>> As a way to resolve the reported bug.
>>
>
>Backporting a feature/fix like this requires careful consideration of
>the patches involved, and doing actual testing on hardware.
>
>> Any reason this *shouldn't* be in stable?
>
>Yes. By itself, it causes crashes at early boot and does not actually
>solve the problem.

Sure, let's work on gathering all the needed patches then and testing it
out.

>> I'm aware that there might be
>> dependencies that are not obvious to me, but the solution here is to
>> take those dependencies as well rather than ignore the process
>> completely.
>>
>
>This is not a bugfix. kexec/kdump never worked correctly on the
>hardware involved, and backporting a feature like that goes way beyond
>what I am willing to accept for stable backports affecting the EFI
>subsystem.

I'm a bit confused. The bug report starts with:

[Impact]
kdump support isn't usable on HiSilicon D05 systems. This
previously worked in bionic.

So it seems like it did use to work, but not anymore?

Either way, I understand that you want to keep the stable tree
conservative, but keep in mind that the flip side of not taking fixes
that users ask for means that distros end up having to carry them
anyway, which means that they don't get the review and testing they
need.

We can argue all we want around whether it's a fix or not, but if most
distros carry it then I think our argument is moot.

--
Thanks,
Sasha

2019-11-10 16:30:31

by Ard Biesheuvel

Subject: Re: [PATCH AUTOSEL 4.19 133/191] efi: honour memory reservations passed via a linux specific config table

(adding Marc, the GIC maintainer)

On Sun, 10 Nov 2019 at 15:57, Sasha Levin <[email protected]> wrote:
>
> On Sun, Nov 10, 2019 at 02:16:57PM +0000, Ard Biesheuvel wrote:
> >On Sun, 10 Nov 2019 at 13:27, Sasha Levin <[email protected]> wrote:
> >>
> >> On Sun, Nov 10, 2019 at 08:33:47AM +0100, Ard Biesheuvel wrote:
> >> >On Sun, 10 Nov 2019 at 03:44, Sasha Levin <[email protected]> wrote:
> >> >>
> >> >> From: Ard Biesheuvel <[email protected]>
> >> >>
> >> >> [ Upstream commit 71e0940d52e107748b270213a01d3b1546657d74 ]
> >> >>
> >> >> In order to allow the OS to reserve memory persistently across a
> >> >> kexec, introduce a Linux-specific UEFI configuration table that
> >> >> points to the head of a linked list in memory, allowing each kernel
> >> >> to add list items describing memory regions that the next kernel
> >> >> should treat as reserved.
> >> >>
> >> >> This is useful, e.g., for GICv3 based ARM systems that cannot disable
> >> >> DMA access to the LPI tables, forcing them to reuse the same memory
> >> >> region again after a kexec reboot.
> >> >>
> >> >> Tested-by: Jeremy Linton <[email protected]>
> >> >> Signed-off-by: Ard Biesheuvel <[email protected]>
> >> >> Signed-off-by: Sasha Levin <[email protected]>
> >> >
> >> >NAK
> >> >
> >> >This doesn't belong in -stable, and I'd be interested in understanding
> >> >how this got autoselected, and how I can prevent this from happening
> >> >again in the future.
> >>
> >> It was selected because it's part of a fix for a real issue reported by
> >> users:
> >>
> >
> >For my understanding, are you saying your AI is reading launchpad bug
> >reports etc? Because it is marked AUTOSEL.
>
> Not quite. This review set was me feeding all the patches Ubuntu has on
> top of stable trees into AUTOSEL, and sending out the output for review.
> It doesn't look into launchpad bug reports on its own, but in my
> experience one can find a bug report for mostly everything AUTOSEL
> considers to be a bug.
>

So the assumption is that taking an arbitrary subset of what Ubuntu
backported (and tested extensively), and letting that subset be chosen
by a bot is a process that improves the quality of stable trees? I'm
rather skeptical of that tbh.

> >> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1806766
> >>
> >
> >That page mentions
> >
> >"""
> >2 upstream patch series are required to fix this:
> > https://<email address hidden>/msg10328.html
> >Which provides an EFI facility consumed by:
> > https://lkml.org/lkml/2018/9/21/1066
> >There were also some follow-on fixes to deal with ARM-specific
> >problems associated with this usage:
> > https://www.spinics.net/lists/arm-kernel/msg685751.html
> >"""
> >
> >and without the other patches, we only add bugs and don't fix any.
> >
> >> Besides ubuntu, it is also carried by:
> >>
> >> SUSE: https://www.suse.com/support/update/announcement/2019/suse-su-20191530-1/
> >> CentOS: https://koji.mbox.centos.org/koji/buildinfo?buildID=4558
> >>
> >> As a way to resolve the reported bug.
> >>
> >
> >Backporting a feature/fix like this requires careful consideration of
> >the patches involved, and doing actual testing on hardware.
> >
> >> Any reason this *shouldn't* be in stable?
> >
> >Yes. By itself, it causes crashes at early boot and does not actually
> >solve the problem.
>
> Sure, let's work on gathering all the needed patches then and testing it
> out.
>

No, let's not. This is a feature that was introduced to address a
shortcoming in some hardware that makes kexec/kdump problematic on
them. If you want kexec/kdump on that hardware, use a newer kernel.

> >> I'm aware that there might be
> >> dependencies that are not obvious to me, but the solution here is to
> >> take those dependencies as well rather than ignore the process
> >> completely.
> >>
> >
> >This is not a bugfix. kexec/kdump never worked correctly on the
> >hardware involved, and backporting a feature like that goes way beyond
> >what I am willing to accept for stable backports affecting the EFI
> >subsystem.
>
> I'm a bit confused. The bug report starts with:
>
> [Impact]
> kdump support isn't usable on HiSilicon D05 systems. This
> previously worked in bionic.
>
> So it seems like it did use to work, but not anymore?
>

I have no idea what Ubuntu shipped in the previous kernel, but
labelling this as a software regression is dubious at least, and
wholly inaccurate for upstream.

> Either way, I understand that you want to keep the stable tree
> conservative, but keep in mind that the flip side of not taking fixes
> that users ask for means that distros end up having to carry them
> anyway, which means that they don't get the review and testing they
> need.
>

I'd say it is the opposite. At least the distros test their backports
on actual hardware. Taking any part of this set without testing it by
doing kexec/kdump on an affected ARM system, and regression testing it
on the hardware that got broken by it (with hundreds of cores IIRC) is
totally irresponsible, and I don't have the time or the hardware to do
the testing.

> We can argue all we want around whether it's a fix or not, but if most
> distros carry it then I think our argument is moot.
>

If someone cares enough to backport these as a coherent set, with boot
tests on the affected hardware etc, then I am not going to object.

2019-11-10 18:09:02

by Marc Zyngier

Subject: Re: [PATCH AUTOSEL 4.19 133/191] efi: honour memory reservations passed via a linux specific config table

On Sun, 10 Nov 2019 16:26:15 +0000,
Ard Biesheuvel <[email protected]> wrote:
>
> (adding Marc, the GIC maintainer)
>
> On Sun, 10 Nov 2019 at 15:57, Sasha Levin <[email protected]> wrote:
> >
> > On Sun, Nov 10, 2019 at 02:16:57PM +0000, Ard Biesheuvel wrote:
> > >On Sun, 10 Nov 2019 at 13:27, Sasha Levin <[email protected]> wrote:
> > >>
> > >> On Sun, Nov 10, 2019 at 08:33:47AM +0100, Ard Biesheuvel wrote:
> > >> >On Sun, 10 Nov 2019 at 03:44, Sasha Levin <[email protected]> wrote:
> > >> >>
> > >> >> From: Ard Biesheuvel <[email protected]>
> > >> >>
> > >> >> [ Upstream commit 71e0940d52e107748b270213a01d3b1546657d74 ]
> > >> >>
> > >> >> In order to allow the OS to reserve memory persistently across a
> > >> >> kexec, introduce a Linux-specific UEFI configuration table that
> > >> >> points to the head of a linked list in memory, allowing each kernel
> > >> >> to add list items describing memory regions that the next kernel
> > >> >> should treat as reserved.
> > >> >>
> > >> >> This is useful, e.g., for GICv3 based ARM systems that cannot disable
> > >> >> DMA access to the LPI tables, forcing them to reuse the same memory
> > >> >> region again after a kexec reboot.
> > >> >>
> > >> >> Tested-by: Jeremy Linton <[email protected]>
> > >> >> Signed-off-by: Ard Biesheuvel <[email protected]>
> > >> >> Signed-off-by: Sasha Levin <[email protected]>
> > >> >
> > >> >NAK
> > >> >
> > >> >This doesn't belong in -stable, and I'd be interested in understanding
> > >> >how this got autoselected, and how I can prevent this from happening
> > >> >again in the future.
> > >>
> > >> It was selected because it's part of a fix for a real issue reported by
> > >> users:
> > >>
> > >
> > >For my understanding, are you saying your AI is reading launchpad bug
> > >reports etc? Because it is marked AUTOSEL.
> >
> > Not quite. This review set was me feeding all the patches Ubuntu has on
> > top of stable trees into AUTOSEL, and sending out the output for review.
> > It doesn't look into launchpad bug reports on its own, but in my
> > experience one can find a bug report for mostly everything AUTOSEL
> > considers to be a bug.
> >
>
> So the assumption is that taking an arbitrary subset of what Ubuntu
> backported (and tested extensively), and letting that subset be chosen
> by a bot is a process that improves the quality of stable trees? I'm
> rather skeptical of that tbh.
>
> > >> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1806766
> > >>
> > >
> > >That page mentions
> > >
> > >"""
> > >2 upstream patch series are required to fix this:
> > > https://<email address hidden>/msg10328.html
> > >Which provides an EFI facility consumed by:
> > > https://lkml.org/lkml/2018/9/21/1066
> > >There were also some follow-on fixes to deal with ARM-specific
> > >problems associated with this usage:
> > > https://www.spinics.net/lists/arm-kernel/msg685751.html
> > >"""
> > >
> > >and without the other patches, we only add bugs and don't fix any.
> > >
> > >> Besides ubuntu, it is also carried by:
> > >>
> > >> SUSE: https://www.suse.com/support/update/announcement/2019/suse-su-20191530-1/
> > >> CentOS: https://koji.mbox.centos.org/koji/buildinfo?buildID=4558
> > >>
> > >> As a way to resolve the reported bug.
> > >>
> > >
> > >Backporting a feature/fix like this requires careful consideration of
> > >the patches involved, and doing actual testing on hardware.
> > >
> > >> Any reason this *shouldn't* be in stable?
> > >
> > >Yes. By itself, it causes crashes at early boot and does not actually
> > >solve the problem.
> >
> > Sure, let's work on gathering all the needed patches then and testing it
> > out.
> >
>
> No, let's not. This is a feature that was introduced to address a
> shortcoming in some hardware that makes kexec/kdump problematic on
> them. If you want kexec/kdump on that hardware, use a newer kernel.

That's my position as well. This isn't a bug fix at all, but a
workaround for a HW defect with complex dependencies. It wasn't cc'd
stable for good reasons, and I wish stable maintainers would respect
this decision.

> > >> I'm aware that there might be
> > >> dependencies that are not obvious to me, but the solution here is to
> > >> take those dependencies as well rather than ignore the process
> > >> completely.
> > >>
> > >
> > >This is not a bugfix. kexec/kdump never worked correctly on the
> > >hardware involved, and backporting a feature like that goes way beyond
> > >what I am willing to accept for stable backports affecting the EFI
> > >subsystem.
> >
> > I'm a bit confused. The bug report starts with:
> >
> > [Impact]
> > kdump support isn't usable on HiSilicon D05 systems. This
> > previously worked in bionic.
> >
> > So it seems like it did use to work, but not anymore?
> >
>
> I have no idea what Ubuntu shipped in the previous kernel, but
> labelling this as a software regression is dubious at least, and
> wholly inaccurate for upstream.

I have this exact machine keeping my feet warm, and kexec never worked
on it (nor on any GICv3 machine that is LPI-capable) before we added
this workaround. Whatever distros carried as hacks to make it work at
the time, I don't want to know. What is more likely is that they always
kexec'd a kernel with the exact same allocation layout and that it
worked by luck (or maybe resulted in silent memory corruption).

If anything, I'd merge a patch *disabling* kexec altogether on these
systems, because pretending that it ever worked is a damn lie.

> > Either way, I understand that you want to keep the stable tree
> > conservative, but keep in mind that the flip side of not taking fixes
> > that users ask for means that distros end up having to carry them
> > anyway, which means that they don't get the review and testing they
> > need.
> >
>
> I'd say it is the opposite. At least the distros test their backports
> on actual hardware. Taking any part of this set without testing it by
> doing kexec/kdump on an affected ARM system, and regression testing it
> on the hardware that got broken by it (with hundreds of cores IIRC) is
> totally irresponsible, and I don't have the time or the hardware to do
> the testing.
>
> > We can argue all we want around whether it's a fix or not, but if most
> > distros carry it then I think our argument is moot.
> >
>
> If someone cares enough to backport these as a coherent set, with boot
> tests on the affected hardware etc, then I am not going to object.

Ideally, I'd like the distros to test these backports, because only they
have access to the variety of HW that's required. But they'd have to
majorly up their game, if the above is anything to go by...

Thanks,

M.

--
Jazz is not dead, it just smells funny.