2019-01-28 10:04:57

by Ross Lagerwall

Subject: [PATCH v2 0/2] Fix crash in cper_estatus_check()

v2 changes:
- Address Boris's comments.

---

I recently encountered a crash in cper_estatus_check() when called by
bert_init(). Patches follow to fix the problem. Note that I cannot fully
test the patches since the hardware error record on that machine has
been cleared.

The crash log:

[ 125.666350] BUG: unable to handle kernel paging request at ffffc9004046d02c
[ 125.666503] PGD 1f6dce067 P4D 1f6dce067 PUD 1e6532067 PMD 1e3d11067 PTE 0
[ 125.666696] Oops: 0000 [#1] SMP KASAN NOPTI
[ 125.666837] CPU: 7 PID: 1 Comm: swapper/0 Not tainted 4.19.0+0 #1
[ 125.666983] Hardware name: Dell Inc. PowerEdge M520/0DW6GX, BIOS 1.8.6 08/30/2013
[ 125.667171] RIP: e030:cper_estatus_check+0x7e/0xf0
[ 125.667315] Code: 41 29 c5 48 98 48 01 c3 48 89 d8 4c 29 e0 48 39 e8 7d 4a 48 8d 7b 18 be 04 00 00 00 e8 bb 6f 9f ff 48 8d 7b 14 be 02 00 00 00 <44> 8b 73 18 e8 a9 6f 9f ff 0f b6 4b 15 44 89 ee 66 83 f9 03 19 d2
[ 125.667554] RSP: e02b:ffff8881e65efce0 EFLAGS: 00010246
[ 125.667699] RAX: fffff5200808da06 RBX: ffffc9004046d014 RCX: ffffffff8192bf25
[ 125.667849] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffffc9004046d028
[ 125.668009] RBP: 0000000000000700 R08: fffff5200808da06 R09: fffff5200808da06
[ 125.668207] R10: 0000000000000001 R11: fffff5200808da05 R12: ffffc9004046cc14
[ 125.668358] R13: 0000000000000300 R14: 00000000000000c0 R15: ffffc9004046cc00
[ 125.668519] FS: 0000000000000000(0000) GS:ffff8881e77c0000(0000) knlGS:0000000000000000
[ 125.668698] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 125.668844] CR2: ffffc9004046d02c CR3: 000000000260c000 CR4: 0000000000042660
[ 125.668999] Call Trace:
[ 125.669139] bert_init+0x21c/0x362
[ 125.669279] ? setup_bert_disable+0x12/0x12
[ 125.669420] ? pci_get_dev_by_id+0x57/0x70
[ 125.669560] ? pci_get_device+0x86/0xc0
[ 125.669738] ? pci_create_sysfs_dev_files+0x1a6/0x330
[ 125.669883] ? setup_bert_disable+0x12/0x12
[ 125.670026] ? set_debug_rodata+0x11/0x11
[ 125.670166] ? do_one_initcall+0x8b/0x253
[ 125.670306] do_one_initcall+0x8b/0x253
[ 125.670447] ? perf_trace_initcall_level+0x250/0x250
[ 125.670592] ? __wake_up_common+0x140/0x1d0
[ 125.670736] ? kasan_unpoison_shadow+0x30/0x40
[ 125.670879] ? kasan_unpoison_shadow+0x30/0x40
[ 125.671023] ? set_debug_rodata+0x11/0x11
[ 125.671164] kernel_init_freeable+0x269/0x304
[ 125.671346] ? rest_init+0xc0/0xc0
[ 125.671485] kernel_init+0xf/0x130
[ 125.671623] ? rest_init+0xc0/0xc0
[ 125.671761] ? rest_init+0xc0/0xc0
[ 125.671901] ret_from_fork+0x35/0x40
[ 125.672063] Modules linked in:
[ 125.672201] CR2: ffffc9004046d02c
[ 125.672349] ---[ end trace a17cd87742b2c49e ]---
[ 125.683693] RIP: e030:cper_estatus_check+0x7e/0xf0
[ 125.683840] Code: 41 29 c5 48 98 48 01 c3 48 89 d8 4c 29 e0 48 39 e8 7d 4a 48 8d 7b 18 be 04 00 00 00 e8 bb 6f 9f ff 48 8d 7b 14 be 02 00 00 00 <44> 8b 73 18 e8 a9 6f 9f ff 0f b6 4b 15 44 89 ee 66 83 f9 03 19 d2
[ 125.684103] RSP: e02b:ffff8881e65efce0 EFLAGS: 00010246
[ 125.684247] RAX: fffff5200808da06 RBX: ffffc9004046d014 RCX: ffffffff8192bf25
[ 125.684397] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffffc9004046d028
[ 125.684548] RBP: 0000000000000700 R08: fffff5200808da06 R09: fffff5200808da06
[ 125.684699] R10: 0000000000000001 R11: fffff5200808da05 R12: ffffc9004046cc14
[ 125.684850] R13: 0000000000000300 R14: 00000000000000c0 R15: ffffc9004046cc00
[ 125.685009] FS: 0000000000000000(0000) GS:ffff8881e77c0000(0000) knlGS:0000000000000000
[ 125.685224] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 125.685371] CR2: ffffc9004046d02c CR3: 000000000260c000 CR4: 0000000000042660
[ 125.685566] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009

Thanks,

Ross Lagerwall (2):
acpi/apei: Fix possible out-of-bounds access to BERT region
efi/cper: Fix possible out-of-bounds access

drivers/acpi/apei/bert.c | 23 ++++++++++-------------
drivers/firmware/efi/cper.c | 13 +++++++++----
2 files changed, 19 insertions(+), 17 deletions(-)

--
2.17.2



2019-01-28 10:04:52

by Ross Lagerwall

Subject: [PATCH v2 2/2] efi/cper: Fix possible out-of-bounds access

When checking a generic status block, we iterate over all the generic
data blocks. The loop condition only checks that the start of the
generic data block is valid (within estatus->data_length) but not the
whole block. Because the size of data blocks (excluding error data) may
vary depending on the revision and the revision is contained within the
data block, ensure that enough of the current data block is valid before
dereferencing any members. Otherwise, an out-of-bounds access may occur
if estatus->data_length is invalid. This relies on the fact that struct
acpi_hest_generic_data_v300 is a superset of the earlier version. Also
rework the other checks to avoid potential underflow.

Signed-off-by: Ross Lagerwall <[email protected]>
---
drivers/firmware/efi/cper.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index a7902fccdcfa..6090d25dce85 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -546,19 +546,24 @@ EXPORT_SYMBOL_GPL(cper_estatus_check_header);
 int cper_estatus_check(const struct acpi_hest_generic_status *estatus)
 {
 	struct acpi_hest_generic_data *gdata;
-	unsigned int data_len, gedata_len;
+	unsigned int data_len, record_size;
 	int rc;

 	rc = cper_estatus_check_header(estatus);
 	if (rc)
 		return rc;
+
 	data_len = estatus->data_length;

 	apei_estatus_for_each_section(estatus, gdata) {
-		gedata_len = acpi_hest_get_error_length(gdata);
-		if (gedata_len > data_len - acpi_hest_get_size(gdata))
+		if (sizeof(struct acpi_hest_generic_data) > data_len)
+			return -EINVAL;
+
+		record_size = acpi_hest_get_record_size(gdata);
+		if (record_size > data_len)
 			return -EINVAL;
-		data_len -= acpi_hest_get_record_size(gdata);
+
+		data_len -= record_size;
 	}
 	if (data_len)
 		return -EINVAL;
--
2.17.2
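
For readers following the commit message above, the two-step check it
describes (first confirm the base header fits, then confirm the whole
revision-dependent record fits) can be sketched in userspace C. This is
only an illustration under stated assumptions, not the kernel code: the
struct layout, the names demo_hdr, demo_record_size and
demo_check_records, and the 8-byte v3 extension are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical base header shared by every revision (8 bytes, no padding). */
struct demo_hdr {
	uint16_t revision;	/* e.g. 0x0200 or 0x0300 */
	uint16_t reserved;
	uint32_t payload_len;	/* bytes of error data after the header */
};

/* Assumed for illustration: revision 3.0 and later append 8 header bytes. */
#define DEMO_V3_EXTRA 8u

/* Full record size: revision-dependent header plus payload. */
static uint64_t demo_record_size(const struct demo_hdr *hdr)
{
	uint64_t hdr_size = sizeof(*hdr);

	if (hdr->revision >= 0x0300)
		hdr_size += DEMO_V3_EXTRA;
	return hdr_size + hdr->payload_len;
}

/* Return 0 if the records tile buf[0, data_len) exactly, -1 otherwise. */
static int demo_check_records(const uint8_t *buf, size_t data_len)
{
	assert(buf != NULL || data_len == 0);

	while (data_len > 0) {
		struct demo_hdr hdr;
		uint64_t record_size;

		/* The base header must fit before any member is read. */
		if (sizeof(hdr) > data_len)
			return -1;
		memcpy(&hdr, buf, sizeof(hdr));

		/* Compare sizes directly; subtracting first could wrap. */
		record_size = demo_record_size(&hdr);
		if (record_size > data_len)
			return -1;

		buf += record_size;
		data_len -= (size_t)record_size;
	}
	return 0;
}
```

The size comparison is written as "record_size > data_len" rather than
as a subtraction from data_len, which mirrors the commit message's point
about avoiding unsigned underflow.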


2019-01-28 10:05:04

by Ross Lagerwall

Subject: [PATCH v2 1/2] acpi/apei: Fix possible out-of-bounds access to BERT region

Check that the length recorded in the generic error status block is
within the region before checking the contents of the region itself.
Otherwise, an out-of-bounds access may result if the system
firmware has generated a status block with an invalid length (larger
than the mapped region). Also move the block_status check so that it
only happens after the block has been verified to be within the mapped
region.

Signed-off-by: Ross Lagerwall <[email protected]>
---
drivers/acpi/apei/bert.c | 23 ++++++++++-------------
1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/acpi/apei/bert.c b/drivers/acpi/apei/bert.c
index 12771fcf0417..0d948d0a41af 100644
--- a/drivers/acpi/apei/bert.c
+++ b/drivers/acpi/apei/bert.c
@@ -42,15 +42,7 @@ static void __init bert_print_all(struct acpi_bert_region *region,
 	int remain = region_len;
 	u32 estatus_len;

-	if (!estatus->block_status)
-		return;
-
-	while (remain > sizeof(struct acpi_bert_region)) {
-		if (cper_estatus_check(estatus)) {
-			pr_err(FW_BUG "Invalid error record.\n");
-			return;
-		}
-
+	while (remain >= sizeof(struct acpi_bert_region)) {
 		estatus_len = cper_estatus_len(estatus);
 		if (remain < estatus_len) {
 			pr_err(FW_BUG "Truncated status block (length: %u).\n",
@@ -58,6 +50,15 @@ static void __init bert_print_all(struct acpi_bert_region *region,
 			return;
 		}

+		/* No more error records. */
+		if (!estatus->block_status)
+			return;
+
+		if (cper_estatus_check(estatus)) {
+			pr_err(FW_BUG "Invalid error record.\n");
+			return;
+		}
+
 		pr_info_once("Error records from previous boot:\n");

 		cper_estatus_print(KERN_INFO HW_ERR, estatus);
@@ -70,10 +71,6 @@ static void __init bert_print_all(struct acpi_bert_region *region,
 		estatus->block_status = 0;

 		estatus = (void *)estatus + estatus_len;
-		/* No more error records. */
-		if (!estatus->block_status)
-			return;
-
 		remain -= estatus_len;
 	}
 }
--
2.17.2
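
The ordering this patch establishes (bound the block length against the
remaining region first, only then look at block_status and the contents)
can be illustrated with a small userspace model. It is a sketch under
assumptions rather than the driver itself: struct demo_block, its
two-field layout and demo_count_blocks are hypothetical names invented
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical status block header (8 bytes, no padding). */
struct demo_block {
	uint32_t block_status;	/* zero means "no more records" */
	uint32_t data_length;	/* bytes of data following the header */
};

/*
 * Count the valid blocks in region[0, region_len).
 * Returns the count, or -1 if a block claims to extend past the region.
 */
static int demo_count_blocks(const uint8_t *region, size_t region_len)
{
	size_t remain = region_len;
	int count = 0;

	assert(region != NULL || region_len == 0);

	/* ">=" so that a block exactly filling the tail is still examined. */
	while (remain >= sizeof(struct demo_block)) {
		struct demo_block blk;
		uint64_t block_len;

		memcpy(&blk, region, sizeof(blk));
		block_len = (uint64_t)sizeof(blk) + blk.data_length;

		/* Length first: reject a block that overruns the region. */
		if (remain < block_len)
			return -1;

		/* Only now is it safe to act on the block's contents. */
		if (!blk.block_status)
			return count;

		count++;
		region += block_len;
		remain -= (size_t)block_len;
	}
	return count;
}
```

Because the status test sits inside the loop, it is applied to every
block only after that block has been shown to lie within the region,
including the block that follows the last valid record.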


2019-01-30 16:05:54

by Tyler Baicar

Subject: Re: [PATCH v2 0/2] Fix crash in cper_estatus_check()

On Mon, Jan 28, 2019 at 5:06 AM Ross Lagerwall
<[email protected]> wrote:
>
> v2 changes:
> - Address Boris's comments.
>
> ---
>
> I recently encountered a crash in cper_estatus_check() when called by
> bert_init(). Patches follow to fix the problem. Note that I cannot fully
> test the patches since the hardware error record on that machine has
> been cleared.
>
> Ross Lagerwall (2):
> acpi/apei: Fix possible out-of-bounds access to BERT region
> efi/cper: Fix possible out-of-bounds access

For both patches:

Tested-by: Tyler Baicar <[email protected]>

2019-02-19 10:02:56

by Rafael J. Wysocki

Subject: Re: [PATCH v2 0/2] Fix crash in cper_estatus_check()

On Monday, January 28, 2019 11:04:22 AM CET Ross Lagerwall wrote:
> v2 changes:
> - Address Boris's comments.
>
> ---
>
> I recently encountered a crash in cper_estatus_check() when called by
> bert_init(). Patches follow to fix the problem. Note that I cannot fully
> test the patches since the hardware error record on that machine has
> been cleared.
>
> Thanks,
>
> Ross Lagerwall (2):
> acpi/apei: Fix possible out-of-bounds access to BERT region
> efi/cper: Fix possible out-of-bounds access
>
> drivers/acpi/apei/bert.c | 23 ++++++++++-------------
> drivers/firmware/efi/cper.c | 13 +++++++++----
> 2 files changed, 19 insertions(+), 17 deletions(-)
>
>

Boris, any comments here?


2019-02-19 13:51:28

by Borislav Petkov

Subject: Re: [PATCH v2 0/2] Fix crash in cper_estatus_check()

On Tue, Feb 19, 2019 at 11:00:49AM +0100, Rafael J. Wysocki wrote:
> Boris, any comments here?

For both:

Acked-by: Borislav Petkov <[email protected]>

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2019-02-20 09:36:56

by Rafael J. Wysocki

Subject: Re: [PATCH v2 0/2] Fix crash in cper_estatus_check()

On Tuesday, February 19, 2019 2:50:04 PM CET Borislav Petkov wrote:
> On Tue, Feb 19, 2019 at 11:00:49AM +0100, Rafael J. Wysocki wrote:
> > Boris, any comments here?
>
> For both:
>
> Acked-by: Borislav Petkov <[email protected]>

Both applied, thanks!