2022-07-21 18:08:08

by Kani, Toshimitsu

[permalink] [raw]
Subject: [PATCH v2] EDAC/ghes: Fix buffer overflow in ghes_edac_register()

The following buffer overflow BUG was observed on an HPE system.
ghes_edac_register() called strlen() on an uninitialized label,
which had non-zero values from krealloc_array().

In dimm_setup_label(), *device was set but *bank was null, which
left the label uninitialized. *bank is set from SMBIOS type 17
Bank Locator, offset 11h. This system had this value set to 0x0
(null string).

Change dimm_setup_label() to always initialize the label and use
"NA" in case bank or device is null.

detected buffer overflow in __fortify_strlen
------------[ cut here ]------------
kernel BUG at lib/string_helpers.c:983!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G I 5.18.6-200.fc36.x86_64 #1
Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019
RIP: 0010:fortify_panic+0xf/0x11
...
Call Trace:
<TASK>
ghes_edac_register.cold+0x128/0x128
ghes_probe+0x142/0x3a0
platform_probe+0x41/0x90
really_probe+0x19e/0x370
__driver_probe_device+0xfc/0x170
driver_probe_device+0x1f/0x90
__driver_attach+0xbb/0x190
? __device_attach_driver+0xe0/0xe0
bus_for_each_dev+0x5f/0x90
bus_add_driver+0x159/0x200
driver_register+0x89/0xd0
acpi_ghes_init+0x72/0xc3
acpi_init+0x441/0x493
? acpi_sleep_proc_init+0x24/0x24
do_one_initcall+0x41/0x200

Fixes: b9cae27728d1f ("EDAC/ghes: Scan the system once on driver init")
Tested-by: Robert Elliott <[email protected]>
Signed-off-by: Toshi Kani <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
---
drivers/edac/ghes_edac.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 59b0bedc9c24..8256065b1801 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -100,12 +100,13 @@ static struct dimm_info *find_dimm_by_handle(struct mem_ctl_info *mci, u16 handl
static void dimm_setup_label(struct dimm_info *dimm, u16 handle)
{
const char *bank = NULL, *device = NULL;
+ const char *na = "NA";

dmi_memdev_name(handle, &bank, &device);

- /* both strings must be non-zero */
- if (bank && *bank && device && *device)
- snprintf(dimm->label, sizeof(dimm->label), "%s %s", bank, device);
+ snprintf(dimm->label, sizeof(dimm->label), "%s %s",
+ (bank && *bank) ? bank : na,
+ (device && *device) ? device : na);
}

static void assign_dmi_dimm_info(struct dimm_info *dimm, struct memdev_dmi_entry *entry)


2022-07-22 13:27:40

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v2] EDAC/ghes: Fix buffer overflow in ghes_edac_register()

+ fix Robert's address.

On Thu, Jul 21, 2022 at 12:05:03PM -0600, Toshi Kani wrote:
> The following buffer overflow BUG was observed on an HPE system.
> ghes_edac_register() called strlen() on an uninitialized label,
> which had non-zero values from krealloc_array().

I ended up massaging it into this:

---
From: Toshi Kani <[email protected]>
Date: Thu, 21 Jul 2022 12:05:03 -0600
Subject: [PATCH] EDAC/ghes: Set the DIMM label unconditionally

The commit in Fixes enforced that both the bank and device strings
passed to dimm_setup_label() are not NULL.

However, there are BIOSes, for example on a

HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019

which don't populate both strings:

Handle 0x0020, DMI type 17, 84 bytes
Memory Device
Array Handle: 0x0013
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 32 GB
Form Factor: DIMM
Set: None
Locator: PROC 1 DIMM 1 <===== device
Bank Locator: Not Specified <===== bank

This results in a buffer overflow because ghes_edac_register() calls
strlen() on an uninitialized label, which had non-zero values left over
from krealloc_array():

detected buffer overflow in __fortify_strlen
------------[ cut here ]------------
kernel BUG at lib/string_helpers.c:983!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G I 5.18.6-200.fc36.x86_64 #1
Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019
RIP: 0010:fortify_panic
...
Call Trace:
<TASK>
ghes_edac_register.cold
ghes_probe
platform_probe
really_probe
__driver_probe_device
driver_probe_device
__driver_attach
? __device_attach_driver
bus_for_each_dev
bus_add_driver
driver_register
acpi_ghes_init
acpi_init
? acpi_sleep_proc_init
do_one_initcall

Change dimm_setup_label() to always initialize the label and use
"N/A" in case bank or device is null.

[ bp: Rewrite commit message. ]

Fixes: cb51a371d08e ("EDAC/ghes: Setup DIMM label from DMI and use it in error reports")
Signed-off-by: Toshi Kani <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Tested-by: Robert Elliott <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
drivers/edac/ghes_edac.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 59b0bedc9c24..fcab10e26b43 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -103,9 +103,9 @@ static void dimm_setup_label(struct dimm_info *dimm, u16 handle)

dmi_memdev_name(handle, &bank, &device);

- /* both strings must be non-zero */
- if (bank && *bank && device && *device)
- snprintf(dimm->label, sizeof(dimm->label), "%s %s", bank, device);
+ snprintf(dimm->label, sizeof(dimm->label), "%s %s",
+ (bank && *bank) ? bank : "N/A",
+ (device && *device) ? device : "N/A");
}

static void assign_dmi_dimm_info(struct dimm_info *dimm, struct memdev_dmi_entry *entry)
--
2.35.1

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-07-22 15:57:49

by Kani, Toshimitsu

[permalink] [raw]
Subject: RE: [PATCH v2] EDAC/ghes: Fix buffer overflow in ghes_edac_register()

Borislav Petkov wrote:
> On Thu, Jul 21, 2022 at 12:05:03PM -0600, Toshi Kani wrote:
> > The following buffer overflow BUG was observed on an HPE system.
> > ghes_edac_register() called strlen() on an uninitialized label, which
> > had non-zero values from krealloc_array().
>
> I ended up massaging it into this:

Thanks for the update! It looks good to me.

Toshi

2022-07-25 09:59:27

by Robert Richter

[permalink] [raw]
Subject: Re: [PATCH v2] EDAC/ghes: Fix buffer overflow in ghes_edac_register()

On 22.07.22 15:20:50, Borislav Petkov wrote:
> + fix Robert's address.
>
> On Thu, Jul 21, 2022 at 12:05:03PM -0600, Toshi Kani wrote:
> > The following buffer overflow BUG was observed on an HPE system.
> > ghes_edac_register() called strlen() on an uninitialized label,
> > which had non-zero values from krealloc_array().
>
> I ended up massaging it into this:
>
> ---
> From: Toshi Kani <[email protected]>
> Date: Thu, 21 Jul 2022 12:05:03 -0600
> Subject: [PATCH] EDAC/ghes: Set the DIMM label unconditionally
>
> The commit in Fixes enforced that both the bank and device strings
> passed to dimm_setup_label() are not NULL.
>
> However, there are BIOSes, for example on a
>
> HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019
>
> which don't populate both strings:
>
> Handle 0x0020, DMI type 17, 84 bytes
> Memory Device
> Array Handle: 0x0013
> Error Information Handle: Not Provided
> Total Width: 72 bits
> Data Width: 64 bits
> Size: 32 GB
> Form Factor: DIMM
> Set: None
> Locator: PROC 1 DIMM 1 <===== device
> Bank Locator: Not Specified <===== bank
>
> This results in a buffer overflow because ghes_edac_register() calls
> strlen() on an uninitialized label, which had non-zero values left over
> from krealloc_array():
>
> detected buffer overflow in __fortify_strlen
> ------------[ cut here ]------------
> kernel BUG at lib/string_helpers.c:983!
> invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 1 PID: 1 Comm: swapper/0 Tainted: G I 5.18.6-200.fc36.x86_64 #1
> Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019
> RIP: 0010:fortify_panic
> ...
> Call Trace:
> <TASK>
> ghes_edac_register.cold
> ghes_probe
> platform_probe
> really_probe
> __driver_probe_device
> driver_probe_device
> __driver_attach
> ? __device_attach_driver
> bus_for_each_dev
> bus_add_driver
> driver_register
> acpi_ghes_init
> acpi_init
> ? acpi_sleep_proc_init
> do_one_initcall
>
> Change dimm_setup_label() to always initialize the label and use
> "N/A" in case bank or device is null.
>
> [ bp: Rewrite commit message. ]
>
> Fixes: cb51a371d08e ("EDAC/ghes: Setup DIMM label from DMI and use it in error reports")
> Signed-off-by: Toshi Kani <[email protected]>
> Signed-off-by: Borislav Petkov <[email protected]>
> Tested-by: Robert Elliott <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> drivers/edac/ghes_edac.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index 59b0bedc9c24..fcab10e26b43 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -103,9 +103,9 @@ static void dimm_setup_label(struct dimm_info *dimm, u16 handle)
>
> dmi_memdev_name(handle, &bank, &device);
>
> - /* both strings must be non-zero */
> - if (bank && *bank && device && *device)
> - snprintf(dimm->label, sizeof(dimm->label), "%s %s", bank, device);
> + snprintf(dimm->label, sizeof(dimm->label), "%s %s",
> + (bank && *bank) ? bank : "N/A",
> + (device && *device) ? device : "N/A");

The dimm->label buffer must be pre-initialized zero. This broke with:

b9cae27728d1 EDAC/ghes: Scan the system once on driver init

since krealloc/krealloc_array() does not zero out the new allocated
buffer for struct dimm_info in enumerate_dimms(). This uninitialized
broken struct is then copied in ghes_edac_register() to the final
dimm_info destination. Originally, before b9cae27728d1, the struct was
zero initialized with kzalloc'ed by edac_mc_alloc() and directly used.
Now, that a copy is created first, this copy must be also zero
initialized.

IMO this is the proper fix:

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 59b0bedc9c24..3b5bf6502141 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -220,6 +220,7 @@ static void enumerate_dimms(const struct dmi_header *dh, void *arg)
}

d = &hw->dimms[hw->num_dimms];
+ memset(d, 0, sizeof(*d));
d->idx = hw->num_dimms;

assign_dmi_dimm_info(d, entry);


-Robert

> }

2022-07-25 10:20:57

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v2] EDAC/ghes: Fix buffer overflow in ghes_edac_register()

On Mon, Jul 25, 2022 at 11:53:06AM +0200, Robert Richter wrote:
> The dimm->label buffer must be pre-initialized zero. This broke with:
>
> b9cae27728d1 EDAC/ghes: Scan the system once on driver init
>
> since krealloc/krealloc_array() does not zero out the new allocated
> buffer for struct dimm_info in enumerate_dimms(). This uninitialized
> broken struct is then copied in ghes_edac_register() to the final
> dimm_info destination. Originally, before b9cae27728d1, the struct was
> zero initialized with kzalloc'ed by edac_mc_alloc() and directly used.
> Now, that a copy is created first, this copy must be also zero
> initialized.
>
> IMO this is the proper fix:

Maybe, but this needs fixing too:

/* both strings must be non-zero */
if (bank && *bank && device && *device)

Obviously one of those strings are zero coming from that BIOS...

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-07-25 10:57:51

by Robert Richter

[permalink] [raw]
Subject: Re: [PATCH v2] EDAC/ghes: Fix buffer overflow in ghes_edac_register()

On 25.07.22 12:14:21, Borislav Petkov wrote:
> On Mon, Jul 25, 2022 at 11:53:06AM +0200, Robert Richter wrote:
> > The dimm->label buffer must be pre-initialized zero. This broke with:
> >
> > b9cae27728d1 EDAC/ghes: Scan the system once on driver init
> >
> > since krealloc/krealloc_array() does not zero out the new allocated
> > buffer for struct dimm_info in enumerate_dimms(). This uninitialized
> > broken struct is then copied in ghes_edac_register() to the final
> > dimm_info destination. Originally, before b9cae27728d1, the struct was
> > zero initialized with kzalloc'ed by edac_mc_alloc() and directly used.
> > Now, that a copy is created first, this copy must be also zero
> > initialized.
> >
> > IMO this is the proper fix:
>
> Maybe, but this needs fixing too:
>
> /* both strings must be non-zero */
> if (bank && *bank && device && *device)
>
> Obviously one of those strings are zero coming from that BIOS...

But the label is pre-initialized in edac_mc_alloc_dimms():

p = dimm->label;
n = snprintf(p, len, "mc#%u", mci->mc_idx);

You check if the label is emtpy when copying it in
ghes_edac_register():

if (strlen(src->label))
memcpy(dst->label, src->label, sizeof(src->label));

So if there is nothing that comes from the bios, this default label
string from edac_mc_alloc_dimms() will be used.

If you write "N/A" to the label instead, the sysfs dimm_label values
wont be unique any longer between dimms, which might break existing
applications.

-Robert

>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

2022-07-25 11:25:14

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v2] EDAC/ghes: Fix buffer overflow in ghes_edac_register()

On Mon, Jul 25, 2022 at 01:09:11PM +0200, Robert Richter wrote:
> I see now, what you mean here, may be change this:
>
> snprintf(dimm->label, sizeof(dimm->label), "%s %s",
> (bank && *bank) ? bank : "N/A",
> (device && *device) ? device : "N/A");
>
> to:
>
> snprintf(dimm->label, sizeof(dimm->label), "%s%s%s",
> (bank && *bank) ? bank : "",
> (bank && device) ? " " : "",
> (device && *device) ? device : "");
>
> It keeps the default assignment from edac_mc_alloc_dimms() but changes
> we label if one of bank or device is given.

Yap, that should take care of all possible "configurations" BIOS throws
at us.

Toshi, could you pls add this to the fix and test it on your machine to
make sure it still works as expected?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-07-25 12:04:41

by Robert Richter

[permalink] [raw]
Subject: Re: [PATCH v2] EDAC/ghes: Fix buffer overflow in ghes_edac_register()

On 25.07.22 12:45:24, Robert Richter wrote:
> On 25.07.22 12:14:21, Borislav Petkov wrote:
> > On Mon, Jul 25, 2022 at 11:53:06AM +0200, Robert Richter wrote:
> > > The dimm->label buffer must be pre-initialized zero. This broke with:
> > >
> > > b9cae27728d1 EDAC/ghes: Scan the system once on driver init
> > >
> > > since krealloc/krealloc_array() does not zero out the new allocated
> > > buffer for struct dimm_info in enumerate_dimms(). This uninitialized
> > > broken struct is then copied in ghes_edac_register() to the final
> > > dimm_info destination. Originally, before b9cae27728d1, the struct was
> > > zero initialized with kzalloc'ed by edac_mc_alloc() and directly used.
> > > Now, that a copy is created first, this copy must be also zero
> > > initialized.
> > >
> > > IMO this is the proper fix:
> >
> > Maybe, but this needs fixing too:
> >
> > /* both strings must be non-zero */
> > if (bank && *bank && device && *device)
> >
> > Obviously one of those strings are zero coming from that BIOS...

I see now, what you mean here, may be change this:

snprintf(dimm->label, sizeof(dimm->label), "%s %s",
(bank && *bank) ? bank : "N/A",
(device && *device) ? device : "N/A");

to:

snprintf(dimm->label, sizeof(dimm->label), "%s%s%s",
(bank && *bank) ? bank : "",
(bank && device) ? " " : "",
(device && *device) ? device : "");

It keeps the default assignment from edac_mc_alloc_dimms() but changes
we label if one of bank or device is given.

-Robert

>
> But the label is pre-initialized in edac_mc_alloc_dimms():
>
> p = dimm->label;
> n = snprintf(p, len, "mc#%u", mci->mc_idx);
>
> You check if the label is emtpy when copying it in
> ghes_edac_register():
>
> if (strlen(src->label))
> memcpy(dst->label, src->label, sizeof(src->label));
>
> So if there is nothing that comes from the bios, this default label
> string from edac_mc_alloc_dimms() will be used.
>
> If you write "N/A" to the label instead, the sysfs dimm_label values
> wont be unique any longer between dimms, which might break existing
> applications.
>
> -Robert
>
> >
> > --
> > Regards/Gruss,
> > Boris.
> >
> > https://people.kernel.org/tglx/notes-about-netiquette

2022-07-25 12:07:16

by Robert Richter

[permalink] [raw]
Subject: Re: [PATCH v2] EDAC/ghes: Fix buffer overflow in ghes_edac_register()

On 25.07.22 13:23:26, Borislav Petkov wrote:
> On Mon, Jul 25, 2022 at 01:09:11PM +0200, Robert Richter wrote:
> > I see now, what you mean here, may be change this:
> >
> > snprintf(dimm->label, sizeof(dimm->label), "%s %s",
> > (bank && *bank) ? bank : "N/A",
> > (device && *device) ? device : "N/A");
> >
> > to:
> >
> > snprintf(dimm->label, sizeof(dimm->label), "%s%s%s",
> > (bank && *bank) ? bank : "",
> > (bank && device) ? " " : "",
> > (device && *device) ? device : "");
> >
> > It keeps the default assignment from edac_mc_alloc_dimms() but changes
> > we label if one of bank or device is given.
>
> Yap, that should take care of all possible "configurations" BIOS throws
> at us.
>
> Toshi, could you pls add this to the fix and test it on your machine to
> make sure it still works as expected?

If you like, you can add my:

Signed-off-by: Robert Richter <[email protected]>

-Robert

2022-07-25 13:13:04

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v2] EDAC/ghes: Fix buffer overflow in ghes_edac_register()

On Mon, Jul 25, 2022 at 01:30:41PM +0200, Robert Richter wrote:
> If you like, you can add my:
>
> Signed-off-by: Robert Richter <[email protected]>

Do you mean by this Co-developed-by: perhaps?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-07-25 16:52:09

by Kani, Toshimitsu

[permalink] [raw]
Subject: RE: [PATCH v2] EDAC/ghes: Fix buffer overflow in ghes_edac_register()

Borislav Petkov wrote:
> > On Mon, Jul 25, 2022 at 01:09:11PM +0200, Robert Richter wrote:
> > It keeps the default assignment from edac_mc_alloc_dimms() but changes
> > we label if one of bank or device is given.

Good idea.

> Yap, that should take care of all possible "configurations" BIOS throws
> at us.
>
> Toshi, could you pls add this to the fix and test it on your machine to
> make sure it still works as expected?

Will do.

Thanks,
Toshi