Although there was a previous fix to avoid early kernel access to the
EFI config table on Intel systems, the problem can still exist on AMD
systems that support SEV (Secure Encrypted Virtualization). The
command line option "nogbpages" brings this bug to the surface. And
this is what caused the regression with my earlier patch that
attempted to reduce the use of gbpages. This patch series fixes that
problem and restores my earlier patch.
The following 2 commits caused the EFI config table, and the CC_BLOB
entry in that table, to be accessed when enabling SEV at kernel
startup.
commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
earlier during boot")
commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
detection/setup")
These accesses happen before the new kernel establishes its own
identity map, and before establishing a routine to handle page faults.
But the areas referenced are not explicitly added to the kexec
identity map.
This goes unnoticed when these areas happen to be placed close enough
to others areas that are explicitly added to the identity map, but
that is not always the case.
Under certain conditions, for example Intel Atom processors that don't
support 1GB pages, it was found that these areas don't end up mapped,
and the SEV initialization code causes an unrecoverable page fault,
and the kexec fails.
Tau Liu had offered a patch to put the config table into the kexec
identity map to avoid this problem:
https://lore.kernel.org/all/[email protected]/
But the community chose instead to avoid referencing this memory on
non-AMD systems where the problem was reported.
commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
on non-AMD hardware")
I later wanted to make a different change to kexec identity map
creation, and had this patch accepted:
commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
but it quickly needed to be reverted because of problems on AMD systems.
The reported regression problems on AMD systems were due to the above
mentioned references to the EFI config table. In fact, on the same
systems, the "nogbpages" command line option breaks kexec as well.
So I resubmit Tau Liu's original patch that maps the EFI config
table, add an additional patch by me that ensures that the CC blob is
also mapped (if present), and also resubmit my earlier patch to use
gpbages only when a full GB of space is requested to be mapped.
I do not advocate for removing the earlier, non-AMD fix. With kexec,
two different kernel versions can be in play, and the earlier fix
still covers non-AMD systems when the kexec'd-from kernel doesn't have
these patches applied.
All three of the people who reported regression with my earlier patch
have retested with this patch series and found it to work where my
single patch previously did not. With current kernels, all fail to
kexec when "nogbpages" is on the command line, but all succeed with
"nogbpages" after the series is applied.
Tao Liu (1):
x86/kexec: Add EFI config table identity mapping for kexec kernel
Steve Wahl (2):
x86/kexec: Add EFI Confidential Computing blob to kexec identity
mapping.
x86/mm/ident_map: Use gbpages only where full GB page should be
mapped.
arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
arch/x86/mm/ident_map.c | 23 +++++++--
2 files changed, 95 insertions(+), 10 deletions(-)
--
2.26.2
From: Tao Liu <[email protected]>
A kexec kernel boot failure is sometimes observed on AMD CPUs due to
unmapped EFI config table. This is seen when "nogbpages" is on the
kernel command line, and has been observed as a full BIOS reboot
rather than a successful kexec.
Currently EFI system table is identity-mapped for the kexec kernel, but EFI
config table is not mapped explicitly:
commit 6bbeb276b71f ("x86/kexec: Add the EFI system tables and ACPI
tables to the ident map")
The following 2 commits caused the EFI config table to be accessed
when enabling SEV at kernel startup.
commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
earlier during boot")
commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
detection/setup")
This may result in a page fault due to EFI config table's unmapped
address. Since the page fault occurs before the new kernel establishes
its own identity map and page fault routines, it is unrecoverable and
kexec fails.
The issue doesn't appear on all systems, because the pages used by
kexec to create the identity map are usually large 1GB pages that, by
luck, end up including the needed address space when other nearby
areas are explicitly mapped.
However if nogbpages is set, the reduced page size (2 MB) used to
create the identity map means it's less likely that the EFI config
table's address space ends up mapped by mapping requests for nearby
areas.
Therefore, explicitly include the EFI config table in the kexec
identity map.
Signed-off-by: Tao Liu <[email protected]>
Tested-by: Pavin Joseph <[email protected]>
Tested-by: Sarah Brofeldt <[email protected]>
Tested-by: Eric Hagberg <[email protected]>
---
arch/x86/kernel/machine_kexec_64.c | 35 ++++++++++++++++++++++++++----
1 file changed, 31 insertions(+), 4 deletions(-)
I (Steve Wahl) modified the above commit message, but did not modify
the code. I am not clear if that requires additional Co-developed-by:
and Signed-off-by: lines. If so, copy them from here:
Co-developed-by: Steve Wahl <[email protected]>
Signed-off-by: Steve Wahl <[email protected]>
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index b180d8e497c3..d89942307659 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -28,6 +28,7 @@
#include <asm/setup.h>
#include <asm/set_memory.h>
#include <asm/cpu.h>
+#include <asm/efi.h>
#ifdef CONFIG_ACPI
/*
@@ -83,10 +84,12 @@ const struct kexec_file_ops * const kexec_file_loaders[] = {
#endif
static int
-map_efi_systab(struct x86_mapping_info *info, pgd_t *level4p)
+map_efi_tables(struct x86_mapping_info *info, pgd_t *level4p)
{
#ifdef CONFIG_EFI
unsigned long mstart, mend;
+ void *kaddr;
+ int ret;
if (!efi_enabled(EFI_BOOT))
return 0;
@@ -102,6 +105,30 @@ map_efi_systab(struct x86_mapping_info *info, pgd_t *level4p)
if (!mstart)
return 0;
+ ret = kernel_ident_mapping_init(info, level4p, mstart, mend);
+ if (ret)
+ return ret;
+
+ kaddr = memremap(mstart, mend - mstart, MEMREMAP_WB);
+ if (!kaddr) {
+ pr_err("Could not map UEFI system table\n");
+ return -ENOMEM;
+ }
+
+ mstart = efi_config_table;
+
+ if (efi_enabled(EFI_64BIT)) {
+ efi_system_table_64_t *stbl = (efi_system_table_64_t *)kaddr;
+
+ mend = mstart + sizeof(efi_config_table_64_t) * stbl->nr_tables;
+ } else {
+ efi_system_table_32_t *stbl = (efi_system_table_32_t *)kaddr;
+
+ mend = mstart + sizeof(efi_config_table_32_t) * stbl->nr_tables;
+ }
+
+ memunmap(kaddr);
+
return kernel_ident_mapping_init(info, level4p, mstart, mend);
#endif
return 0;
@@ -241,10 +268,10 @@ static int init_pgtable(struct kimage *image, unsigned long start_pgtable)
}
/*
- * Prepare EFI systab and ACPI tables for kexec kernel since they are
- * not covered by pfn_mapped.
+ * Prepare EFI systab, config table and ACPI tables for kexec kernel
+ * since they are not covered by pfn_mapped.
*/
- result = map_efi_systab(&info, level4p);
+ result = map_efi_tables(&info, level4p);
if (result)
return result;
--
2.26.2
Like the EFI config table itself, the Confidential Computing blob entry
in that table, if it exists, is referenced by the kexec kernel before
it establishes its own identity map.
This could potentially cause a kexec failure if the CC blob is not
located close to other identity map areas. Such a failure is more
likely with the nogbpages command line option.
So, explicitly add the CC blob to the kexec identity map.
Signed-off-by: Steve Wahl <[email protected]>
Tested-by: Pavin Joseph <[email protected]>
Tested-by: Sarah Brofeldt <[email protected]>
Tested-by: Eric Hagberg <[email protected]>
---
arch/x86/kernel/machine_kexec_64.c | 47 +++++++++++++++++++++++++++++-
1 file changed, 46 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index d89942307659..bb68d86ecafe 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -29,6 +29,7 @@
#include <asm/set_memory.h>
#include <asm/cpu.h>
#include <asm/efi.h>
+#include <asm/sev.h>
#ifdef CONFIG_ACPI
/*
@@ -90,6 +91,7 @@ map_efi_tables(struct x86_mapping_info *info, pgd_t *level4p)
unsigned long mstart, mend;
void *kaddr;
int ret;
+ unsigned int cfg_tbl_len;
if (!efi_enabled(EFI_BOOT))
return 0;
@@ -120,16 +122,59 @@ map_efi_tables(struct x86_mapping_info *info, pgd_t *level4p)
if (efi_enabled(EFI_64BIT)) {
efi_system_table_64_t *stbl = (efi_system_table_64_t *)kaddr;
+ cfg_tbl_len = stbl->nr_tables;
mend = mstart + sizeof(efi_config_table_64_t) * stbl->nr_tables;
} else {
efi_system_table_32_t *stbl = (efi_system_table_32_t *)kaddr;
+ cfg_tbl_len = stbl->nr_tables;
mend = mstart + sizeof(efi_config_table_32_t) * stbl->nr_tables;
}
memunmap(kaddr);
- return kernel_ident_mapping_init(info, level4p, mstart, mend);
+ ret = kernel_ident_mapping_init(info, level4p, mstart, mend);
+ if (ret)
+ return ret;
+
+ /*
+ * CC blob is referenced in kernel startup before the new
+ * kernel creates it's own identity map, so make sure it's
+ * included in the kexec identity map.
+ */
+ kaddr = memremap(mstart, mend - mstart, MEMREMAP_WB);
+ if (!kaddr) {
+ pr_err("Could not map UEFI config table\n");
+ return -ENOMEM;
+ }
+
+ mstart = 0;
+ if (efi_enabled(EFI_64BIT)) {
+ efi_config_table_64_t *ctbl = (void *) kaddr;
+ int i;
+
+ for (i = 0; i < cfg_tbl_len; i++) {
+ if (!efi_guidcmp(EFI_CC_BLOB_GUID, ctbl[i].guid)) {
+ mstart = ctbl[i].table;
+ mend = mstart + sizeof(struct cc_blob_sev_info);
+ break;
+ }
+ }
+ } else {
+ efi_config_table_32_t *ctbl = (void *) kaddr;
+ int i;
+
+ for (i = 0; i < cfg_tbl_len; i++) {
+ if (!efi_guidcmp(EFI_CC_BLOB_GUID, ctbl[i].guid)) {
+ mstart = ctbl[i].table;
+ mend = mstart + sizeof(struct cc_blob_sev_info);
+ break;
+ }
+ }
+ }
+ memunmap(kaddr);
+ if (mstart)
+ return kernel_ident_mapping_init(info, level4p, mstart, mend);
#endif
return 0;
}
--
2.26.2
When ident_pud_init() uses only gbpages to create identity maps, large
ranges of addresses not actually requested can be included in the
resulting table; a 4K request will map a full GB. On UV systems, this
ends up including regions that will cause hardware to halt the system
if accessed (these are marked "reserved" by BIOS). Even processor
speculation into these regions is enough to trigger the system halt.
Only use gbpages when map creation requests include the full GB page
of space. Fall back to using smaller 2M pages when only portions of a
GB page are included in the request.
No attempt is made to coalesce mapping requests. If a request requires
a map entry at the 2M (pmd) level, subsequent mapping requests within
the same 1G region will also be at the pmd level, even if adjacent or
overlapping such requests could have been combined to map a full
gbpage. Existing usage starts with larger regions and then adds
smaller regions, so this should not have any great consequence.
Signed-off-by: Steve Wahl <[email protected]>
Tested-by: Pavin Joseph <[email protected]>
Tested-by: Sarah Brofeldt <[email protected]>
Tested-by: Eric Hagberg <[email protected]>
---
arch/x86/mm/ident_map.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index 968d7005f4a7..a204a332c71f 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -26,18 +26,31 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page,
for (; addr < end; addr = next) {
pud_t *pud = pud_page + pud_index(addr);
pmd_t *pmd;
+ bool use_gbpage;
next = (addr & PUD_MASK) + PUD_SIZE;
if (next > end)
next = end;
- if (info->direct_gbpages) {
- pud_t pudval;
+ /* if this is already a gbpage, this portion is already mapped */
+ if (pud_leaf(*pud))
+ continue;
+
+ /* Is using a gbpage allowed? */
+ use_gbpage = info->direct_gbpages;
- if (pud_present(*pud))
- continue;
+ /* Don't use gbpage if it maps more than the requested region. */
+ /* at the beginning: */
+ use_gbpage &= ((addr & ~PUD_MASK) == 0);
+ /* ... or at the end: */
+ use_gbpage &= ((next & ~PUD_MASK) == 0);
+
+ /* Never overwrite existing mappings */
+ use_gbpage &= !pud_present(*pud);
+
+ if (use_gbpage) {
+ pud_t pudval;
- addr &= PUD_MASK;
pudval = __pud((addr - info->offset) | info->page_flag);
set_pud(pud, pudval);
continue;
--
2.26.2
Add Tao in the cc list.
On Tue, 21 May 2024 at 02:37, Steve Wahl <[email protected]> wrote:
>
> Although there was a previous fix to avoid early kernel access to the
> EFI config table on Intel systems, the problem can still exist on AMD
> systems that support SEV (Secure Encrypted Virtualization). The
> command line option "nogbpages" brings this bug to the surface. And
> this is what caused the regression with my earlier patch that
> attempted to reduce the use of gbpages. This patch series fixes that
> problem and restores my earlier patch.
>
> The following 2 commits caused the EFI config table, and the CC_BLOB
> entry in that table, to be accessed when enabling SEV at kernel
> startup.
>
> commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
> earlier during boot")
> commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
> detection/setup")
>
> These accesses happen before the new kernel establishes its own
> identity map, and before establishing a routine to handle page faults.
> But the areas referenced are not explicitly added to the kexec
> identity map.
>
> This goes unnoticed when these areas happen to be placed close enough
> to others areas that are explicitly added to the identity map, but
> that is not always the case.
>
> Under certain conditions, for example Intel Atom processors that don't
> support 1GB pages, it was found that these areas don't end up mapped,
> and the SEV initialization code causes an unrecoverable page fault,
> and the kexec fails.
>
> Tau Liu had offered a patch to put the config table into the kexec
> identity map to avoid this problem:
>
> https://lore.kernel.org/all/[email protected]/
>
> But the community chose instead to avoid referencing this memory on
> non-AMD systems where the problem was reported.
>
> commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> on non-AMD hardware")
>
> I later wanted to make a different change to kexec identity map
> creation, and had this patch accepted:
>
> commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
>
> but it quickly needed to be reverted because of problems on AMD systems.
>
> The reported regression problems on AMD systems were due to the above
> mentioned references to the EFI config table. In fact, on the same
> systems, the "nogbpages" command line option breaks kexec as well.
>
> So I resubmit Tau Liu's original patch that maps the EFI config
> table, add an additional patch by me that ensures that the CC blob is
> also mapped (if present), and also resubmit my earlier patch to use
> gpbages only when a full GB of space is requested to be mapped.
>
> I do not advocate for removing the earlier, non-AMD fix. With kexec,
> two different kernel versions can be in play, and the earlier fix
> still covers non-AMD systems when the kexec'd-from kernel doesn't have
> these patches applied.
>
> All three of the people who reported regression with my earlier patch
> have retested with this patch series and found it to work where my
> single patch previously did not. With current kernels, all fail to
> kexec when "nogbpages" is on the command line, but all succeed with
> "nogbpages" after the series is applied.
>
> Tao Liu (1):
> x86/kexec: Add EFI config table identity mapping for kexec kernel
>
> Steve Wahl (2):
> x86/kexec: Add EFI Confidential Computing blob to kexec identity
> mapping.
> x86/mm/ident_map: Use gbpages only where full GB page should be
> mapped.
>
> arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
> arch/x86/mm/ident_map.c | 23 +++++++--
> 2 files changed, 95 insertions(+), 10 deletions(-)
>
> --
> 2.26.2
>
Cc kexec list as well.
On Thu, 23 May 2024 at 10:52, Dave Young <[email protected]> wrote:
>
> Add Tao in the cc list.
>
> On Tue, 21 May 2024 at 02:37, Steve Wahl <[email protected]> wrote:
> >
> > Although there was a previous fix to avoid early kernel access to the
> > EFI config table on Intel systems, the problem can still exist on AMD
> > systems that support SEV (Secure Encrypted Virtualization). The
> > command line option "nogbpages" brings this bug to the surface. And
> > this is what caused the regression with my earlier patch that
> > attempted to reduce the use of gbpages. This patch series fixes that
> > problem and restores my earlier patch.
> >
> > The following 2 commits caused the EFI config table, and the CC_BLOB
> > entry in that table, to be accessed when enabling SEV at kernel
> > startup.
> >
> > commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
> > earlier during boot")
> > commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
> > detection/setup")
> >
> > These accesses happen before the new kernel establishes its own
> > identity map, and before establishing a routine to handle page faults.
> > But the areas referenced are not explicitly added to the kexec
> > identity map.
> >
> > This goes unnoticed when these areas happen to be placed close enough
> > to others areas that are explicitly added to the identity map, but
> > that is not always the case.
> >
> > Under certain conditions, for example Intel Atom processors that don't
> > support 1GB pages, it was found that these areas don't end up mapped,
> > and the SEV initialization code causes an unrecoverable page fault,
> > and the kexec fails.
> >
> > Tau Liu had offered a patch to put the config table into the kexec
> > identity map to avoid this problem:
> >
> > https://lore.kernel.org/all/[email protected]/
> >
> > But the community chose instead to avoid referencing this memory on
> > non-AMD systems where the problem was reported.
> >
> > commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> > on non-AMD hardware")
> >
> > I later wanted to make a different change to kexec identity map
> > creation, and had this patch accepted:
> >
> > commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
> >
> > but it quickly needed to be reverted because of problems on AMD systems.
> >
> > The reported regression problems on AMD systems were due to the above
> > mentioned references to the EFI config table. In fact, on the same
> > systems, the "nogbpages" command line option breaks kexec as well.
> >
> > So I resubmit Tau Liu's original patch that maps the EFI config
> > table, add an additional patch by me that ensures that the CC blob is
> > also mapped (if present), and also resubmit my earlier patch to use
> > gpbages only when a full GB of space is requested to be mapped.
> >
> > I do not advocate for removing the earlier, non-AMD fix. With kexec,
> > two different kernel versions can be in play, and the earlier fix
> > still covers non-AMD systems when the kexec'd-from kernel doesn't have
> > these patches applied.
> >
> > All three of the people who reported regression with my earlier patch
> > have retested with this patch series and found it to work where my
> > single patch previously did not. With current kernels, all fail to
> > kexec when "nogbpages" is on the command line, but all succeed with
> > "nogbpages" after the series is applied.
> >
> > Tao Liu (1):
> > x86/kexec: Add EFI config table identity mapping for kexec kernel
> >
> > Steve Wahl (2):
> > x86/kexec: Add EFI Confidential Computing blob to kexec identity
> > mapping.
> > x86/mm/ident_map: Use gbpages only where full GB page should be
> > mapped.
> >
> > arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
> > arch/x86/mm/ident_map.c | 23 +++++++--
> > 2 files changed, 95 insertions(+), 10 deletions(-)
> >
> > --
> > 2.26.2
> >
Gentle ping. Can someone give me some feedback, please?
Thanks,
Steve Wahl, HPE.
On Thu, May 23, 2024 at 10:54:33AM +0800, Dave Young wrote:
> Cc kexec list as well.
>
> On Thu, 23 May 2024 at 10:52, Dave Young <[email protected]> wrote:
> >
> > Add Tao in the cc list.
> >
> > On Tue, 21 May 2024 at 02:37, Steve Wahl <[email protected]> wrote:
> > >
> > > Although there was a previous fix to avoid early kernel access to the
> > > EFI config table on Intel systems, the problem can still exist on AMD
> > > systems that support SEV (Secure Encrypted Virtualization). The
> > > command line option "nogbpages" brings this bug to the surface. And
> > > this is what caused the regression with my earlier patch that
> > > attempted to reduce the use of gbpages. This patch series fixes that
> > > problem and restores my earlier patch.
> > >
> > > The following 2 commits caused the EFI config table, and the CC_BLOB
> > > entry in that table, to be accessed when enabling SEV at kernel
> > > startup.
> > >
> > > commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
> > > earlier during boot")
> > > commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
> > > detection/setup")
> > >
> > > These accesses happen before the new kernel establishes its own
> > > identity map, and before establishing a routine to handle page faults.
> > > But the areas referenced are not explicitly added to the kexec
> > > identity map.
> > >
> > > This goes unnoticed when these areas happen to be placed close enough
> > > to others areas that are explicitly added to the identity map, but
> > > that is not always the case.
> > >
> > > Under certain conditions, for example Intel Atom processors that don't
> > > support 1GB pages, it was found that these areas don't end up mapped,
> > > and the SEV initialization code causes an unrecoverable page fault,
> > > and the kexec fails.
> > >
> > > Tau Liu had offered a patch to put the config table into the kexec
> > > identity map to avoid this problem:
> > >
> > > https://lore.kernel.org/all/[email protected]/
> > >
> > > But the community chose instead to avoid referencing this memory on
> > > non-AMD systems where the problem was reported.
> > >
> > > commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> > > on non-AMD hardware")
> > >
> > > I later wanted to make a different change to kexec identity map
> > > creation, and had this patch accepted:
> > >
> > > commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
> > >
> > > but it quickly needed to be reverted because of problems on AMD systems.
> > >
> > > The reported regression problems on AMD systems were due to the above
> > > mentioned references to the EFI config table. In fact, on the same
> > > systems, the "nogbpages" command line option breaks kexec as well.
> > >
> > > So I resubmit Tau Liu's original patch that maps the EFI config
> > > table, add an additional patch by me that ensures that the CC blob is
> > > also mapped (if present), and also resubmit my earlier patch to use
> > > gpbages only when a full GB of space is requested to be mapped.
> > >
> > > I do not advocate for removing the earlier, non-AMD fix. With kexec,
> > > two different kernel versions can be in play, and the earlier fix
> > > still covers non-AMD systems when the kexec'd-from kernel doesn't have
> > > these patches applied.
> > >
> > > All three of the people who reported regression with my earlier patch
> > > have retested with this patch series and found it to work where my
> > > single patch previously did not. With current kernels, all fail to
> > > kexec when "nogbpages" is on the command line, but all succeed with
> > > "nogbpages" after the series is applied.
> > >
> > > Tao Liu (1):
> > > x86/kexec: Add EFI config table identity mapping for kexec kernel
> > >
> > > Steve Wahl (2):
> > > x86/kexec: Add EFI Confidential Computing blob to kexec identity
> > > mapping.
> > > x86/mm/ident_map: Use gbpages only where full GB page should be
> > > mapped.
> > >
> > > arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
> > > arch/x86/mm/ident_map.c | 23 +++++++--
> > > 2 files changed, 95 insertions(+), 10 deletions(-)
> > >
> > > --
> > > 2.26.2
> > >
>
On Mon, May 20, 2024 at 01:36:30PM -0500, Steve Wahl wrote:
> Although there was a previous fix to avoid early kernel access to the
> EFI config table on Intel systems, the problem can still exist on AMD
> systems that support SEV (Secure Encrypted Virtualization). The
> command line option "nogbpages" brings this bug to the surface. And
> this is what caused the regression with my earlier patch that
> attempted to reduce the use of gbpages. This patch series fixes that
> problem and restores my earlier patch.
>
> The following 2 commits caused the EFI config table, and the CC_BLOB
> entry in that table, to be accessed when enabling SEV at kernel
> startup.
>
> commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
> earlier during boot")
> commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
> detection/setup")
>
> These accesses happen before the new kernel establishes its own
> identity map, and before establishing a routine to handle page faults.
> But the areas referenced are not explicitly added to the kexec
> identity map.
>
> This goes unnoticed when these areas happen to be placed close enough
> to others areas that are explicitly added to the identity map, but
> that is not always the case.
>
> Under certain conditions, for example Intel Atom processors that don't
> support 1GB pages, it was found that these areas don't end up mapped,
> and the SEV initialization code causes an unrecoverable page fault,
> and the kexec fails.
What does Intel Atom have to do with SEV?!
> Tau Liu had offered a patch to put the config table into the kexec
> identity map to avoid this problem:
>
> https://lore.kernel.org/all/[email protected]/
>
> But the community chose instead to avoid referencing this memory on
> non-AMD systems where the problem was reported.
>
> commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> on non-AMD hardware")
>
> I later wanted to make a different change to kexec identity map
> creation, and had this patch accepted:
>
> commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
>
> but it quickly needed to be reverted because of problems on AMD systems.
>
> The reported regression problems on AMD systems were due to the above
> mentioned references to the EFI config table. In fact, on the same
> systems, the "nogbpages" command line option breaks kexec as well.
>
> So I resubmit Tau Liu's original patch that maps the EFI config
> table, add an additional patch by me that ensures that the CC blob is
> also mapped (if present), and also resubmit my earlier patch to use
> gpbages only when a full GB of space is requested to be mapped.
>
> I do not advocate for removing the earlier, non-AMD fix. With kexec,
> two different kernel versions can be in play, and the earlier fix
> still covers non-AMD systems when the kexec'd-from kernel doesn't have
> these patches applied.
>
> All three of the people who reported regression with my earlier patch
> have retested with this patch series and found it to work where my
> single patch previously did not. With current kernels, all fail to
> kexec when "nogbpages" is on the command line, but all succeed with
> "nogbpages" after the series is applied.
>
> Tao Liu (1):
> x86/kexec: Add EFI config table identity mapping for kexec kernel
>
> Steve Wahl (2):
> x86/kexec: Add EFI Confidential Computing blob to kexec identity
> mapping.
> x86/mm/ident_map: Use gbpages only where full GB page should be
> mapped.
>
> arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
> arch/x86/mm/ident_map.c | 23 +++++++--
> 2 files changed, 95 insertions(+), 10 deletions(-)
Anyway, + Ashish who's been dealing with SNP kexec. We have identified one EFI
issue so far:
https://lore.kernel.org/r/20240612135638.298882-2-ardb%[email protected]
You could give it a try and report back.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Jun 13, 2024 at 05:28:56PM +0200, Borislav Petkov wrote:
Thank you for at least saying something on this!
> On Mon, May 20, 2024 at 01:36:30PM -0500, Steve Wahl wrote:
> > Although there was a previous fix to avoid early kernel access to the
> > EFI config table on Intel systems, the problem can still exist on AMD
> > systems that support SEV (Secure Encrypted Virtualization). The
> > command line option "nogbpages" brings this bug to the surface. And
> > this is what caused the regression with my earlier patch that
> > attempted to reduce the use of gbpages. This patch series fixes that
> > problem and restores my earlier patch.
> >
> > The following 2 commits caused the EFI config table, and the CC_BLOB
> > entry in that table, to be accessed when enabling SEV at kernel
> > startup.
> >
> > commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
> > earlier during boot")
> > commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
> > detection/setup")
> >
> > These accesses happen before the new kernel establishes its own
> > identity map, and before establishing a routine to handle page faults.
> > But the areas referenced are not explicitly added to the kexec
> > identity map.
> >
> > This goes unnoticed when these areas happen to be placed close enough
> > to others areas that are explicitly added to the identity map, but
> > that is not always the case.
> >
> > Under certain conditions, for example Intel Atom processors that don't
> > support 1GB pages, it was found that these areas don't end up mapped,
> > and the SEV initialization code causes an unrecoverable page fault,
> > and the kexec fails.
>
> What does Intel Atom have to do with SEV?!
The Atom was the prominent example of a platform that the code
introduced for SEV broke. Unfortunately, the fix currently
implemented leaves things still broken for actual AMD SEV capable
processors when nogbpages is used, and this problem is the reason for
the apparent regression when my reduce-use-of-gbpages patch was
accepted (later removed).
Tau Liu's original patch fixed this problem, but was not accepted.
The patch that was accepted does not fix this.
> > Tau Liu had offered a patch to put the config table into the kexec
> > identity map to avoid this problem:
> >
> > https://lore.kernel.org/all/[email protected]/
> >
> > But the community chose instead to avoid referencing this memory on
> > non-AMD systems where the problem was reported.
> >
> > commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> > on non-AMD hardware")
> >
> > I later wanted to make a different change to kexec identity map
> > creation, and had this patch accepted:
> >
> > commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
> >
> > but it quickly needed to be reverted because of problems on AMD systems.
> >
> > The reported regression problems on AMD systems were due to the above
> > mentioned references to the EFI config table. In fact, on the same
> > systems, the "nogbpages" command line option breaks kexec as well.
> >
> > So I resubmit Tau Liu's original patch that maps the EFI config
> > table, add an additional patch by me that ensures that the CC blob is
> > also mapped (if present), and also resubmit my earlier patch to use
> > gpbages only when a full GB of space is requested to be mapped.
> >
> > I do not advocate for removing the earlier, non-AMD fix. With kexec,
> > two different kernel versions can be in play, and the earlier fix
> > still covers non-AMD systems when the kexec'd-from kernel doesn't have
> > these patches applied.
> >
> > All three of the people who reported regression with my earlier patch
> > have retested with this patch series and found it to work where my
> > single patch previously did not. With current kernels, all fail to
> > kexec when "nogbpages" is on the command line, but all succeed with
> > "nogbpages" after the series is applied.
> >
> > Tao Liu (1):
> > x86/kexec: Add EFI config table identity mapping for kexec kernel
> >
> > Steve Wahl (2):
> > x86/kexec: Add EFI Confidential Computing blob to kexec identity
> > mapping.
> > x86/mm/ident_map: Use gbpages only where full GB page should be
> > mapped.
> >
> > arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
> > arch/x86/mm/ident_map.c | 23 +++++++--
> > 2 files changed, 95 insertions(+), 10 deletions(-)
>
> Anyway, + Ashish who's been dealing with SNP kexec. We have identified one EFI
> issue so far:
>
> https://lore.kernel.org/r/20240612135638.298882-2-ardb%[email protected]
>
> You could give it a try and report back.
I will look at it, but a cursory inspection doesn't show anything
that affects what I'm talking about here.
Thanks!
--> Steve
--
Steve Wahl, Hewlett Packard Enterprise