Subject:

Date: Thu, 17 Dec 2015 17:40:55 -0600
Subject: [PATCH 0/4] x86/efi: support persistent memory in efi_print_memmap

This series adds support for persistent memory type and NV attribute
in the efi_print_memmap function, which is only used if EFI_DEBUG is true
(which is the case for x86).

Robert Elliott (4):
x86/efi: show actual ending addresses in efi_print_memmap
efi: add NV memory attribute
efi: add Persistent Memory type name
x86/efi: print size and base in binary units in efi_print_memmap

arch/x86/platform/efi/efi.c | 29 +++++++++++++++++++++++++----
drivers/firmware/efi/efi.c | 8 ++++++--
include/linux/efi.h | 1 +
3 files changed, 32 insertions(+), 6 deletions(-)

--
2.4.3


Subject: [PATCH 1/4] x86/efi: show actual ending addresses in efi_print_memmap

Adjust efi_print_memmap to print the real end address of each
range, not 1 byte beyond. This matches other prints like those for
SRAT and nosave memory.

Change the closing ) to ] to match the opening [.

old:
efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c80000000) (16384MB)

new:
efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff] (16384MB)

Example other address range prints:
SRAT: Node 1 PXM 1 [mem 0x480000000-0x87fffffff]
PM: Registered nosave memory: [mem 0x880000000-0xc7fffffff]

Signed-off-by: Robert Elliott <[email protected]>
---
arch/x86/platform/efi/efi.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index ad28540..635a955 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -235,10 +235,10 @@ void __init efi_print_memmap(void)
char buf[64];

md = p;
- pr_info("mem%02u: %s range=[0x%016llx-0x%016llx) (%lluMB)\n",
+ pr_info("mem%02u: %s range=[0x%016llx-0x%016llx] (%lluMB)\n",
i, efi_md_typeattr_format(buf, sizeof(buf), md),
md->phys_addr,
- md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT),
+ md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT) - 1,
(md->num_pages >> (20 - EFI_PAGE_SHIFT)));
}
#endif /* EFI_DEBUG */
--
2.4.3

Subject: [PATCH 2/4] efi: add NV memory attribute

Add the NV memory attribute introduced in UEFI 2.5 and add a column
for it in the types and attributes string used when printing the UEFI
memory map.

old:
efi: mem61: [type=14 | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff) (16384MB)

new:
efi: mem61: [type=14 | | |NV| | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff) (16384MB)

Signed-off-by: Robert Elliott <[email protected]>
---
drivers/firmware/efi/efi.c | 5 ++++-
include/linux/efi.h | 1 +
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 027ca21..4dd5464 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -611,13 +611,16 @@ char * __init efi_md_typeattr_format(char *buf, size_t size,
if (attr & ~(EFI_MEMORY_UC | EFI_MEMORY_WC | EFI_MEMORY_WT |
EFI_MEMORY_WB | EFI_MEMORY_UCE | EFI_MEMORY_RO |
EFI_MEMORY_WP | EFI_MEMORY_RP | EFI_MEMORY_XP |
+ EFI_MEMORY_NV |
EFI_MEMORY_RUNTIME | EFI_MEMORY_MORE_RELIABLE))
snprintf(pos, size, "|attr=0x%016llx]",
(unsigned long long)attr);
else
- snprintf(pos, size, "|%3s|%2s|%2s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",
+ snprintf(pos, size,
+ "|%3s|%2s|%2s|%2s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",
attr & EFI_MEMORY_RUNTIME ? "RUN" : "",
attr & EFI_MEMORY_MORE_RELIABLE ? "MR" : "",
+ attr & EFI_MEMORY_NV ? "NV" : "",
attr & EFI_MEMORY_XP ? "XP" : "",
attr & EFI_MEMORY_RP ? "RP" : "",
attr & EFI_MEMORY_WP ? "WP" : "",
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 569b5a8..9ce9e9e 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -97,6 +97,7 @@ typedef struct {
#define EFI_MEMORY_WP ((u64)0x0000000000001000ULL) /* write-protect */
#define EFI_MEMORY_RP ((u64)0x0000000000002000ULL) /* read-protect */
#define EFI_MEMORY_XP ((u64)0x0000000000004000ULL) /* execute-protect */
+#define EFI_MEMORY_NV ((u64)0x0000000000008000ULL) /* non-volatile */
#define EFI_MEMORY_MORE_RELIABLE \
((u64)0x0000000000010000ULL) /* higher reliability */
#define EFI_MEMORY_RO ((u64)0x0000000000020000ULL) /* read-only */
--
2.4.3

Subject: [PATCH 3/4] efi: add Persistent Memory type name

Add the "Persistent Memory" string for type 14 introduced in
UEFI 2.5. This is used when printing the UEFI memory map.

old:
efi: mem61: [type=14 | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff) (16384MB)

new:
efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff) (16384MB)

Signed-off-by: Robert Elliott <[email protected]>
---
drivers/firmware/efi/efi.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 4dd5464..0b16e88 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -584,7 +584,8 @@ static __initdata char memory_type_name[][20] = {
"ACPI Memory NVS",
"Memory Mapped I/O",
"MMIO Port Space",
- "PAL Code"
+ "PAL Code",
+ "Persistent Memory"
};

char * __init efi_md_typeattr_format(char *buf, size_t size,
--
2.4.3

Subject: [PATCH 4/4] x86/efi: print size and base in binary units in efi_print_memmap

Print the base address for each range in decimal alongside the size.
Use a "(size @ base)" format similar to the fake_memmap kernel parameter.

Print the range and base in the best-fit B, KiB, MiB, etc. units rather
than always MiB. This avoids rounding, which can be misleading.

Use proper IEC binary units (KiB, MiB, etc.) rather than misuse SI
decimal units (KB, MB, etc.).

old:
efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff) (16384MB)

new:
efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff] (16 GiB @ 34 GiB)

Signed-off-by: Robert Elliott <[email protected]>
---
arch/x86/platform/efi/efi.c | 27 ++++++++++++++++++++++++---
1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 635a955..030ba91 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -222,6 +222,25 @@ int __init efi_memblock_x86_reserve_range(void)
return 0;
}

+char * __init efi_size_format(char *buf, size_t size, u64 bytes)
+{
+ if (!bytes || (bytes & 0x3ff))
+ snprintf(buf, size, "%llu B", bytes);
+ else if (bytes & 0xfffff)
+ snprintf(buf, size, "%llu KiB", bytes >> 10);
+ else if (bytes & 0x3fffffff)
+ snprintf(buf, size, "%llu MiB", bytes >> 20);
+ else if (bytes & 0xffffffffff)
+ snprintf(buf, size, "%llu GiB", bytes >> 30);
+ else if (bytes & 0x3ffffffffffff)
+ snprintf(buf, size, "%llu TiB", bytes >> 40);
+ else if (bytes & 0xfffffffffffffff)
+ snprintf(buf, size, "%llu PiB", bytes >> 50);
+ else
+ snprintf(buf, size, "%llu EiB", bytes >> 60);
+ return buf;
+}
+
void __init efi_print_memmap(void)
{
#ifdef EFI_DEBUG
@@ -232,14 +251,16 @@ void __init efi_print_memmap(void)
for (p = memmap.map, i = 0;
p < memmap.map_end;
p += memmap.desc_size, i++) {
- char buf[64];
+ char buf[64], buf2[32], buf3[32];

md = p;
- pr_info("mem%02u: %s range=[0x%016llx-0x%016llx] (%lluMB)\n",
+ pr_info("mem%02u: %s range=[0x%016llx-0x%016llx] (%s @ %s)\n",
i, efi_md_typeattr_format(buf, sizeof(buf), md),
md->phys_addr,
md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT) - 1,
- (md->num_pages >> (20 - EFI_PAGE_SHIFT)));
+ efi_size_format(buf3, sizeof(buf3),
+ md->num_pages << EFI_PAGE_SHIFT),
+ efi_size_format(buf2, sizeof(buf2), md->phys_addr));
}
#endif /* EFI_DEBUG */
}
--
2.4.3

2015-12-21 15:50:43

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 1/4] x86/efi: show actual ending addresses in efi_print_memmap

On Thu, 17 Dec, at 07:28:31PM, Robert Elliott wrote:
> Adjust efi_print_memmap to print the real end address of each
> range, not 1 byte beyond. This matches other prints like those for
> SRAT and nosave memory.
>
> Change the closing ) to ] to match the opening [.
>
> old:
> efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c80000000) (16384MB)
>
> new:
> efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff] (16384MB)
>
> Example other address range prints:
> SRAT: Node 1 PXM 1 [mem 0x480000000-0x87fffffff]
> PM: Registered nosave memory: [mem 0x880000000-0xc7fffffff]
>
> Signed-off-by: Robert Elliott <[email protected]>
> ---
> arch/x86/platform/efi/efi.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)

Is this change purely for aesthetic reasons? We're usually not in the
habit of changing the output of print messages without a good reason
because people downstream do rely on them being consistent.

2015-12-21 15:54:41

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 2/4] efi: add NV memory attribute

(Cc'ing people that have worked in this area recently)

On Thu, 17 Dec, at 07:28:32PM, Robert Elliott wrote:
> Add the NV memory attribute introduced in UEFI 2.5 and add a column
> for it in the types and attributes string used when printing the UEFI
> memory map.
>
> old:
> efi: mem61: [type=14 | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff) (16384MB)
>
> new:
> efi: mem61: [type=14 | | |NV| | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff) (16384MB)
>
> Signed-off-by: Robert Elliott <[email protected]>
> ---
> drivers/firmware/efi/efi.c | 5 ++++-
> include/linux/efi.h | 1 +
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 027ca21..4dd5464 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -611,13 +611,16 @@ char * __init efi_md_typeattr_format(char *buf, size_t size,
> if (attr & ~(EFI_MEMORY_UC | EFI_MEMORY_WC | EFI_MEMORY_WT |
> EFI_MEMORY_WB | EFI_MEMORY_UCE | EFI_MEMORY_RO |
> EFI_MEMORY_WP | EFI_MEMORY_RP | EFI_MEMORY_XP |
> + EFI_MEMORY_NV |
> EFI_MEMORY_RUNTIME | EFI_MEMORY_MORE_RELIABLE))
> snprintf(pos, size, "|attr=0x%016llx]",
> (unsigned long long)attr);
> else
> - snprintf(pos, size, "|%3s|%2s|%2s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",
> + snprintf(pos, size,
> + "|%3s|%2s|%2s|%2s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",
> attr & EFI_MEMORY_RUNTIME ? "RUN" : "",
> attr & EFI_MEMORY_MORE_RELIABLE ? "MR" : "",
> + attr & EFI_MEMORY_NV ? "NV" : "",
> attr & EFI_MEMORY_XP ? "XP" : "",
> attr & EFI_MEMORY_RP ? "RP" : "",
> attr & EFI_MEMORY_WP ? "WP" : "",
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index 569b5a8..9ce9e9e 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -97,6 +97,7 @@ typedef struct {
> #define EFI_MEMORY_WP ((u64)0x0000000000001000ULL) /* write-protect */
> #define EFI_MEMORY_RP ((u64)0x0000000000002000ULL) /* read-protect */
> #define EFI_MEMORY_XP ((u64)0x0000000000004000ULL) /* execute-protect */
> +#define EFI_MEMORY_NV ((u64)0x0000000000008000ULL) /* non-volatile */
> #define EFI_MEMORY_MORE_RELIABLE \
> ((u64)0x0000000000010000ULL) /* higher reliability */
> #define EFI_MEMORY_RO ((u64)0x0000000000020000ULL) /* read-only */

Seems straight forward to me.

Applied, thanks.

2015-12-21 16:06:35

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 1/4] x86/efi: show actual ending addresses in efi_print_memmap

On Mon, 21 Dec, at 03:50:38PM, Matt Fleming wrote:
> On Thu, 17 Dec, at 07:28:31PM, Robert Elliott wrote:
> > Adjust efi_print_memmap to print the real end address of each
> > range, not 1 byte beyond. This matches other prints like those for
> > SRAT and nosave memory.
> >
> > Change the closing ) to ] to match the opening [.
> >
> > old:
> > efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c80000000) (16384MB)
> >
> > new:
> > efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff] (16384MB)
> >
> > Example other address range prints:
> > SRAT: Node 1 PXM 1 [mem 0x480000000-0x87fffffff]
> > PM: Registered nosave memory: [mem 0x880000000-0xc7fffffff]
> >
> > Signed-off-by: Robert Elliott <[email protected]>
> > ---
> > arch/x86/platform/efi/efi.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Is this change purely for aesthetic reasons? We're usually not in the
> habit of changing the output of print messages without a good reason
> because people downstream do rely on them being consistent.

I just noticed the bracket change.

My comment above only refers to printing the range addresses. Swapping
')' for ']' is a perfectly valid change.

2015-12-21 16:16:34

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 4/4] x86/efi: print size and base in binary units in efi_print_memmap

On Thu, 17 Dec, at 07:28:34PM, Robert Elliott wrote:
> Print the base address for each range in decimal alongside the size.
> Use a "(size @ base)" format similar to the fake_memmap kernel parameter.
>
> Print the range and base in the best-fit B, KiB, MiB, etc. units rather
> than always MiB. This avoids rounding, which can be misleading.
>
> Use proper IEC binary units (KiB, MiB, etc.) rather than misuse SI
> decimal units (KB, MB, etc.).
>
> old:
> efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff) (16384MB)
>
> new:
> efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff] (16 GiB @ 34 GiB)
>
> Signed-off-by: Robert Elliott <[email protected]>
> ---
> arch/x86/platform/efi/efi.c | 27 ++++++++++++++++++++++++---
> 1 file changed, 24 insertions(+), 3 deletions(-)

I'm not at all sure of the value of printing the physical address as a
size. I would have thought that you'd have to convert it back to an
address whenever you wanted to use it anyway.

> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> index 635a955..030ba91 100644
> --- a/arch/x86/platform/efi/efi.c
> +++ b/arch/x86/platform/efi/efi.c
> @@ -222,6 +222,25 @@ int __init efi_memblock_x86_reserve_range(void)
> return 0;
> }
>
> +char * __init efi_size_format(char *buf, size_t size, u64 bytes)
> +{
> + if (!bytes || (bytes & 0x3ff))
> + snprintf(buf, size, "%llu B", bytes);
> + else if (bytes & 0xfffff)
> + snprintf(buf, size, "%llu KiB", bytes >> 10);
> + else if (bytes & 0x3fffffff)
> + snprintf(buf, size, "%llu MiB", bytes >> 20);
> + else if (bytes & 0xffffffffff)
> + snprintf(buf, size, "%llu GiB", bytes >> 30);
> + else if (bytes & 0x3ffffffffffff)
> + snprintf(buf, size, "%llu TiB", bytes >> 40);
> + else if (bytes & 0xfffffffffffffff)
> + snprintf(buf, size, "%llu PiB", bytes >> 50);
> + else
> + snprintf(buf, size, "%llu EiB", bytes >> 60);
> + return buf;
> +}
> +

Can we use string_get_size() instead of rolling our own function?

Subject: RE: [PATCH 1/4] x86/efi: show actual ending addresses in efi_print_memmap

> -----Original Message-----
> From: Matt Fleming [mailto:[email protected]]
> Sent: Monday, December 21, 2015 9:51 AM
> To: Elliott, Robert (Persistent Memory) <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH 1/4] x86/efi: show actual ending addresses in
> efi_print_memmap
>
> On Thu, 17 Dec, at 07:28:31PM, Robert Elliott wrote:
> > Adjust efi_print_memmap to print the real end address of each
> > range, not 1 byte beyond. This matches other prints like those for
> > SRAT and nosave memory.
> >
> > Change the closing ) to ] to match the opening [.
> >
> > old:
> > efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC]
> range=[0x0000000880000000-0x0000000c80000000) (16384MB)
> >
> > new:
> > efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC]
> range=[0x0000000880000000-0x0000000c7fffffff] (16384MB)
> >
> > Example other address range prints:
> > SRAT: Node 1 PXM 1 [mem 0x480000000-0x87fffffff]
> > PM: Registered nosave memory: [mem 0x880000000-0xc7fffffff]
> >
> > Signed-off-by: Robert Elliott <[email protected]>
> > ---
> > arch/x86/platform/efi/efi.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Is this change purely for aesthetic reasons? We're usually not in the
> habit of changing the output of print messages without a good reason
> because people downstream do rely on them being consistent.

The format of that line is architecture-specific and only appears
under the efi=debug kernel parameter, so I don't know how much
anyone relies on the specific format. Good question for the lists.

arch/ia64/kernel/efi.c shares the range=[...) format, but prints
the range after the bitmask rather than before:
printk("mem%02d: %s "
"range=[0x%016lx-0x%016lx) (%4lu%s)\n",
i, efi_md_typeattr_format(buf, sizeof(buf), md),
md->phys_addr,
md->phys_addr + efi_md_size(md), size, unit);

arch/arm64/kernel/efi.c has no mem prefix, or range=[...) text
surrounding the range:
pr_info(" 0x%012llx-0x%012llx %s",
paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1,
efi_md_typeattr_format(buf, sizeof(buf), md));
pr_cont("*");
pr_cont("\n");

The x86 code is inside ifdef EFI_DEBUG, which is always
defined in that file. I wonder if it was supposed to change
to efi_enabled(EFI_DBG) to be based off the efi=debug kernel
parameter? efi_init() qualified the call to this function
based on that:
if (efi_enabled(EFI_DBG))
efi_print_memmap();

In arch/ia64/kernel/efi.c efi_init(), EFI_DEBUG is set to 0
so the print doesn't happen at all without editing the
source code. It doesn't use efi_enabled(EFI_DBG).

arch/arm64/kernel/efi.c uses efi_enabled(EFI_DBG) exclusively.

---
Robert Elliott, HPE Persistent Memory

Subject: RE: [PATCH 1/4] x86/efi: show actual ending addresses in efi_print_memmap



---
Robert Elliott, HPE Persistent Memory


> -----Original Message-----
> From: [email protected] [mailto:linux-kernel-
> [email protected]] On Behalf Of Matt Fleming
> Sent: Monday, December 21, 2015 10:06 AM
> To: Elliott, Robert (Persistent Memory) <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH 1/4] x86/efi: show actual ending addresses in
> efi_print_memmap
>
> On Mon, 21 Dec, at 03:50:38PM, Matt Fleming wrote:
> > On Thu, 17 Dec, at 07:28:31PM, Robert Elliott wrote:
> > > Adjust efi_print_memmap to print the real end address of each
> > > range, not 1 byte beyond. This matches other prints like those for
> > > SRAT and nosave memory.
> > >
> > > Change the closing ) to ] to match the opening [.
> > >
> > > old:
> > > efi: mem61: [Persistent Memory | | | | | | |
> |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c80000000) (16384MB)
> > >
> > > new:
> > > efi: mem61: [Persistent Memory | | | | | | |
> |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff] (16384MB)
> > >
> > > Example other address range prints:
> > > SRAT: Node 1 PXM 1 [mem 0x480000000-0x87fffffff]
> > > PM: Registered nosave memory: [mem 0x880000000-0xc7fffffff]
> > >
> > > Signed-off-by: Robert Elliott <[email protected]>
> > > ---
> > > arch/x86/platform/efi/efi.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > Is this change purely for aesthetic reasons? We're usually not in the
> > habit of changing the output of print messages without a good reason
> > because people downstream do rely on them being consistent.
>
> I just noticed the bracket change.
>
> My comment above only refers to printing the range addresses. Swapping
> ')' for ']' is a perfectly valid change.

While investigating the grub persistent memory corruption
issues, it was helpful to make this table match the ending
address convention used by:
* the kernel's e820 table prints
* the kernel's nosave memory prints
* grub's lsmmap and lsefimmap commands
* the UEFI shell's memmap command

Using the excerpts from the patch, if you grep all the various
logs for c7fffffff, you won't find the line with c80000000.

Subject: RE: [PATCH 4/4] x86/efi: print size and base in binary units in efi_print_memmap


> -----Original Message-----
> From: Matt Fleming [mailto:[email protected]]
> Sent: Monday, December 21, 2015 10:16 AM
> To: Elliott, Robert (Persistent Memory) <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH 4/4] x86/efi: print size and base in binary units in
> efi_print_memmap
>
> On Thu, 17 Dec, at 07:28:34PM, Robert Elliott wrote:
> > Print the base address for each range in decimal alongside the size.
> > Use a "(size @ base)" format similar to the fake_memmap kernel
> parameter.
> >
> > Print the range and base in the best-fit B, KiB, MiB, etc. units rather
> > than always MiB. This avoids rounding, which can be misleading.
> >
> > Use proper IEC binary units (KiB, MiB, etc.) rather than misuse SI
> > decimal units (KB, MB, etc.).
> >
> > old:
> > efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC]
> range=[0x0000000880000000-0x0000000c7fffffff) (16384MB)
> >
> > new:
> > efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC]
> range=[0x0000000880000000-0x0000000c7fffffff] (16 GiB @ 34 GiB)
> >
> > Signed-off-by: Robert Elliott <[email protected]>
> > ---
> > arch/x86/platform/efi/efi.c | 27 ++++++++++++++++++++++++---
> > 1 file changed, 24 insertions(+), 3 deletions(-)
>
> I'm not at all sure of the value of printing the physical address as a
> size. I would have thought that you'd have to convert it back to an
> address whenever you wanted to use it anyway.

I was trying to make it resemble the memmap=size@address
kernel parameter format for creating e820 entries, which
does accept abbreviations in addition to hex values:
memmap=nn[KMG]@ss[KMG] for usable DRAM
memmap=nn[KMG]#ss[KMG] for ACPI data
memmap=nn[KMG]$ss[KMG] for reserved
memmap=nn[KMG]!ss[KMG] for persistent memory

Mapping the UEFI type to the corresponding @, #, $, or ! was
more than I wanted to tackle, so it's not a drop-in
replacement string.

memparse() also accepts T, P, and E units; I guess those
need to be added to Documentation/kernel-parameters.txt.

> > diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> > index 635a955..030ba91 100644
> > --- a/arch/x86/platform/efi/efi.c
> > +++ b/arch/x86/platform/efi/efi.c
> > @@ -222,6 +222,25 @@ int __init efi_memblock_x86_reserve_range(void)
> > return 0;
> > }
> >
> > +char * __init efi_size_format(char *buf, size_t size, u64 bytes)
> > +{
> > + if (!bytes || (bytes & 0x3ff))
> > + snprintf(buf, size, "%llu B", bytes);
> > + else if (bytes & 0xfffff)
> > + snprintf(buf, size, "%llu KiB", bytes >> 10);
> > + else if (bytes & 0x3fffffff)
> > + snprintf(buf, size, "%llu MiB", bytes >> 20);
> > + else if (bytes & 0xffffffffff)
> > + snprintf(buf, size, "%llu GiB", bytes >> 30);
> > + else if (bytes & 0x3ffffffffffff)
> > + snprintf(buf, size, "%llu TiB", bytes >> 40);
> > + else if (bytes & 0xfffffffffffffff)
> > + snprintf(buf, size, "%llu PiB", bytes >> 50);
> > + else
> > + snprintf(buf, size, "%llu EiB", bytes >> 60);
> > + return buf;
> > +}
> > +
>
> Can we use string_get_size() instead of rolling our own function?

Thanks for the pointer; I wondered if there was a similar
function somewhere. However, that function throws away
precision in favor of printing just 3 significant digits;
I think that's dangerous. Its non-integer output is not
supported by memmap=, and the function appears to use
assembly code to get CPU divide instructions, losing the
ability to use shifts for these power of two divisions.

Example results...

efi: mem01:... range=[0x0000000000093000-0x0000000000093fff] (4 KiB @ 588 KiB)
efi: mem01:... range=[0x0000000000093000-0x0000000000093fff] (4.00 KiB @ 588 KiB) SGS

efi: mem03:... range=[0x0000000000100000-0x00000000013e8fff] (19364 KiB @ 1 MiB)
efi: mem03:... range=[0x0000000000100000-0x00000000013e8fff] (18.9 MiB @ 1.00 MiB) SGS
(example of lost precision: 19364 KiB is really 18.91015625 MiB)

efi: mem04:... range=[0x00000000013e9000-0x0000000001ffffff] (12380 KiB @ 20388 KiB)
efi: mem04:... range=[0x00000000013e9000-0x0000000001ffffff] (12.0 MiB @ 19.9 MiB) SGS

efi: mem28:... range=[0x00000000717c2000-0x0000000072acafff] (19492 KiB @ 1859336 KiB)
efi: mem28:... range=[0x00000000717c2000-0x0000000072acafff] (19.0 MiB @ 1.77 GiB) SGS

efi: mem57:... range=[0x0000000880000000-0x0000000e7fffffff] (24 GiB @ 34 GiB)
efi: mem57:... range=[0x0000000880000000-0x0000000e7fffffff] (24.0 GiB @ 34.0 GiB) SGS

---
Robert Elliott, HPE Persistent Memory

2015-12-23 12:44:44

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 1/4] x86/efi: show actual ending addresses in efi_print_memmap

On Tue, 22 Dec, at 08:08:19PM, Elliott, Robert (Persistent Memory) wrote:
>
> While investigating the grub persistent memory corruption
> issues, it was helpful to make this table match the ending
> address convention used by:
> * the kernel's e820 table prints
> * the kernel's nosave memory prints
> * grub's lsmmap and lsefimmap commands
> * the UEFI shell's memmap command
>
> Using the excerpts from the patch, if you grep all the various
> logs for c7fffffff, you won't find the line with c80000000.

I like this. This is reasonable justification. Please inclue it in the
commit log.

2015-12-23 12:47:31

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 1/4] x86/efi: show actual ending addresses in efi_print_memmap

On Mon, 21 Dec, at 04:44:19PM, Elliott, Robert (Persistent Memory) wrote:
>
> The format of that line is architecture-specific and only appears
> under the efi=debug kernel parameter, so I don't know how much
> anyone relies on the specific format. Good question for the lists.

Both kernel and non-kernel developers are consumers of these debug
statements, and people do come to rely on the format of the text.

I have certainly become acustomed to the current format we have,
particularly when debugging issues via other users, i.e. when I'm not
the person running the kernel being debugged.

Just because it's a debug option doesn't mean it's completely open to
modification. Reasonable Justification must be provided. Having said
that, I think you provide a good argument in your other email,

https://lkml.kernel.org/r/94D0CD8314A33A4D9D801C0FE68B40295BEC74CF@G9W0745.americas.hpqcorp.net

> arch/ia64/kernel/efi.c shares the range=[...) format, but prints
> the range after the bitmask rather than before:
> printk("mem%02d: %s "
> "range=[0x%016lx-0x%016lx) (%4lu%s)\n",
> i, efi_md_typeattr_format(buf, sizeof(buf), md),
> md->phys_addr,
> md->phys_addr + efi_md_size(md), size, unit);
>
> arch/arm64/kernel/efi.c has no mem prefix, or range=[...) text
> surrounding the range:
> pr_info(" 0x%012llx-0x%012llx %s",
> paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1,
> efi_md_typeattr_format(buf, sizeof(buf), md));
> pr_cont("*");
> pr_cont("\n");
>
> The x86 code is inside ifdef EFI_DEBUG, which is always
> defined in that file. I wonder if it was supposed to change
> to efi_enabled(EFI_DBG) to be based off the efi=debug kernel
> parameter? efi_init() qualified the call to this function
> based on that:
> if (efi_enabled(EFI_DBG))
> efi_print_memmap();

The EFI_DEBUG symbol should just be deleted. It's been enabled since
forever and really provides no value, because as you point out,
printing of the memory map is guarded by EFI_DBG anyway.

I'll send out a patch for ripping that out.

> In arch/ia64/kernel/efi.c efi_init(), EFI_DEBUG is set to 0
> so the print doesn't happen at all without editing the
> source code. It doesn't use efi_enabled(EFI_DBG).
>
> arch/arm64/kernel/efi.c uses efi_enabled(EFI_DBG) exclusively.

2015-12-23 15:53:02

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH 4/4] x86/efi: print size and base in binary units in efi_print_memmap

On Wed, 23 Dec, at 12:11:56AM, Elliott, Robert (Persistent Memory) wrote:
>
> I was trying to make it resemble the memmap=size@address
> kernel parameter format for creating e820 entries, which
> does accept abbreviations in addition to hex values:
> memmap=nn[KMG]@ss[KMG] for usable DRAM
> memmap=nn[KMG]#ss[KMG] for ACPI data
> memmap=nn[KMG]$ss[KMG] for reserved
> memmap=nn[KMG]!ss[KMG] for persistent memory
>
> Mapping the UEFI type to the corresponding @, #, $, or ! was
> more than I wanted to tackle, so it's not a drop-in
> replacement string.
>
> memparse() also accepts T, P, and E units; I guess those
> need to be added to Documentation/kernel-parameters.txt.

I think the value of the "@ address" portion of the string is
questionable.

> Thanks for the pointer; I wondered if there was a similar
> function somewhere. However, that function throws away
> precision in favor of printing just 3 significant digits;
> I think that's dangerous. Its non-integer output is not
> supported by memmap=, and the function appears to use
> assembly code to get CPU divide instructions, losing the
> ability to use shifts for these power of two divisions.
>
> Example results...
>
> efi: mem01:... range=[0x0000000000093000-0x0000000000093fff] (4 KiB @ 588 KiB)
> efi: mem01:... range=[0x0000000000093000-0x0000000000093fff] (4.00 KiB @ 588 KiB) SGS
>
> efi: mem03:... range=[0x0000000000100000-0x00000000013e8fff] (19364 KiB @ 1 MiB)
> efi: mem03:... range=[0x0000000000100000-0x00000000013e8fff] (18.9 MiB @ 1.00 MiB) SGS
> (example of lost precision: 19364 KiB is really 18.91015625 MiB)
>
> efi: mem04:... range=[0x00000000013e9000-0x0000000001ffffff] (12380 KiB @ 20388 KiB)
> efi: mem04:... range=[0x00000000013e9000-0x0000000001ffffff] (12.0 MiB @ 19.9 MiB) SGS
>
> efi: mem28:... range=[0x00000000717c2000-0x0000000072acafff] (19492 KiB @ 1859336 KiB)
> efi: mem28:... range=[0x00000000717c2000-0x0000000072acafff] (19.0 MiB @ 1.77 GiB) SGS
>
> efi: mem57:... range=[0x0000000880000000-0x0000000e7fffffff] (24 GiB @ 34 GiB)
> efi: mem57:... range=[0x0000000880000000-0x0000000e7fffffff] (24.0 GiB @ 34.0 GiB) SGS

Good points! I agree that string_get_size() (unfortunately) doesn't
look useful in this scenario. The code in efi_size_format() looks
fine.

Subject: [PATCH v2 1/4] x86/efi: show actual ending addresses in efi_print_memmap

Adjust efi_print_memmap to print the real end address of each
range, not 1 byte beyond. This matches other prints like those for
SRAT and nosave memory.

While investigating grub persistent memory corruption issues, it
was helpful to make this table match the ending address convention
used by:
* the kernel's e820 table prints
BIOS-e820: [mem 0x0000001680000000-0x0000001c7fffffff] reserved
* the kernel's nosave memory prints
PM: Registered nosave memory: [mem 0x880000000-0xc7fffffff]
* the kernel's ACPI System Resource Affinity Table prints
SRAT: Node 1 PXM 1 [mem 0x480000000-0x87fffffff]
* grub's lsmmap and lsefimmap commands
reserved 0000001680000000-0000001c7fffffff 00600000 24GiB UC WC WT WB NV
* the UEFI shell's memmap command
Reserved 000000007FC00000-000000007FFFFFFF 0000000000000400 0000000000000001

For example, if you grep all the various logs for c7fffffff, you
won't find the kernel's line if it uses c80000000.

Also, change the closing ) to ] to match the opening [.

old:
efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c80000000) (16384MB)

new:
efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff] (16384MB)

Signed-off-by: Robert Elliott <[email protected]>
---
Changes in v2:
- Expanded rationale in the commit message
---
arch/x86/platform/efi/efi.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index ad28540..635a955 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -235,10 +235,10 @@ void __init efi_print_memmap(void)
char buf[64];

md = p;
- pr_info("mem%02u: %s range=[0x%016llx-0x%016llx) (%lluMB)\n",
+ pr_info("mem%02u: %s range=[0x%016llx-0x%016llx] (%lluMB)\n",
i, efi_md_typeattr_format(buf, sizeof(buf), md),
md->phys_addr,
- md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT),
+ md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT) - 1,
(md->num_pages >> (20 - EFI_PAGE_SHIFT)));
}
#endif /* EFI_DEBUG */
--
2.4.3

2015-12-27 14:35:15

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 4/4] x86/efi: print size and base in binary units in efi_print_memmap

On Mon, Dec 21, 2015 at 6:16 PM, Matt Fleming <[email protected]> wrote:
> On Thu, 17 Dec, at 07:28:34PM, Robert Elliott wrote:
>> Print the base address for each range in decimal alongside the size.
>> Use a "(size @ base)" format similar to the fake_memmap kernel parameter.
>>
>> Print the range and base in the best-fit B, KiB, MiB, etc. units rather
>> than always MiB. This avoids rounding, which can be misleading.
>>
>> Use proper IEC binary units (KiB, MiB, etc.) rather than misuse SI
>> decimal units (KB, MB, etc.).
>>
>> old:
>> efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff) (16384MB)
>>
>> new:
>> efi: mem61: [Persistent Memory | | | | | | | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff] (16 GiB @ 34 GiB)
>>
>> Signed-off-by: Robert Elliott <[email protected]>
>> ---
>> arch/x86/platform/efi/efi.c | 27 ++++++++++++++++++++++++---
>> 1 file changed, 24 insertions(+), 3 deletions(-)
>
> I'm not at all sure of the value of printing the physical address as a
> size. I would have thought that you'd have to convert it back to an
> address whenever you wanted to use it anyway.
>
>> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
>> index 635a955..030ba91 100644
>> --- a/arch/x86/platform/efi/efi.c
>> +++ b/arch/x86/platform/efi/efi.c
>> @@ -222,6 +222,25 @@ int __init efi_memblock_x86_reserve_range(void)
>> return 0;
>> }
>>
>> +char * __init efi_size_format(char *buf, size_t size, u64 bytes)
>> +{
>> + if (!bytes || (bytes & 0x3ff))
>> + snprintf(buf, size, "%llu B", bytes);
>> + else if (bytes & 0xfffff)
>> + snprintf(buf, size, "%llu KiB", bytes >> 10);
>> + else if (bytes & 0x3fffffff)
>> + snprintf(buf, size, "%llu MiB", bytes >> 20);
>> + else if (bytes & 0xffffffffff)
>> + snprintf(buf, size, "%llu GiB", bytes >> 30);
>> + else if (bytes & 0x3ffffffffffff)
>> + snprintf(buf, size, "%llu TiB", bytes >> 40);
>> + else if (bytes & 0xfffffffffffffff)
>> + snprintf(buf, size, "%llu PiB", bytes >> 50);
>> + else
>> + snprintf(buf, size, "%llu EiB", bytes >> 60);
>> + return buf;

For me it looks like ffs with name in the table can be used.

>> +}
>> +
>
> Can we use string_get_size() instead of rolling our own function?
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/



--
With Best Regards,
Andy Shevchenko