2020-01-07 19:17:36

by Barret Rhoden

Subject: [PATCH v2 0/2] iommu/vt-d bad RMRR workarounds

Commit f036c7fa0ab6 ("iommu/vt-d: Check VT-d RMRR region in BIOS is
reported as reserved") caused a machine to fail to boot for me, but only
after a kexec.

Buggy firmware provided an RMRR entry with base and end both == 0. That
is an invalid RMRR format, and it passes the RMRR sanity check on a
normal boot only by accident. After a kexec, that entry fails the RMRR
sanity check, due to a slight change in the first e820 mapping. See the
v1 link for details.

v1->v2:
v1: https://lore.kernel.org/lkml/[email protected]/
- Added the TAINT_FIRMWARE_WORKAROUND
- Dropped the commit that treated missing e820 regions as "RMRR OK"


Barret Rhoden (2):
iommu/vt-d: skip RMRR entries that fail the sanity check
iommu/vt-d: skip invalid RMRR entries

drivers/iommu/intel-iommu.c | 26 ++++++++++++++++++++++----
1 file changed, 22 insertions(+), 4 deletions(-)

--
2.24.1.735.g03f4e72817-goog


2020-01-07 19:18:20

by Barret Rhoden

Subject: [PATCH v2 2/2] iommu/vt-d: skip invalid RMRR entries

The VT-d docs specify requirements for an RMRR entry's base and end
(called 'Limit' in the docs) addresses.

This commit will cause the DMAR processing to skip any RMRR entries that
do not meet these requirements and mark the firmware as tainted, since
the firmware is giving us junk.

Signed-off-by: Barret Rhoden <[email protected]>
---
drivers/iommu/intel-iommu.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a8bb458845bc..32c3c6338a3d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4315,13 +4315,25 @@ static void __init init_iommu_pm_ops(void)
static inline void init_iommu_pm_ops(void) {}
#endif /* CONFIG_PM */

+static int rmrr_validity_check(struct acpi_dmar_reserved_memory *rmrr)
+{
+ if ((rmrr->base_address & PAGE_MASK) ||
+ (rmrr->end_address <= rmrr->base_address) ||
+ ((rmrr->end_address - rmrr->base_address + 1) & PAGE_MASK)) {
+ pr_err(FW_BUG "Broken RMRR base: %#018Lx end: %#018Lx\n",
+ rmrr->base_address, rmrr->end_address);
+ return -EINVAL;
+ }
+ return 0;
+}
+
int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
{
struct acpi_dmar_reserved_memory *rmrr;
struct dmar_rmrr_unit *rmrru;

rmrr = (struct acpi_dmar_reserved_memory *)header;
- if (arch_rmrr_sanity_check(rmrr)) {
+ if (rmrr_validity_check(rmrr) || arch_rmrr_sanity_check(rmrr)) {
WARN_TAINT(1, TAINT_FIRMWARE_WORKAROUND,
"Your BIOS is broken; bad RMRR [%#018Lx-%#018Lx]\n"
"BIOS vendor: %s; Ver: %s; Product Version: %s\n",
--
2.24.1.735.g03f4e72817-goog
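
The three conditions the patch above tests for (base page aligned, end
strictly above base, size a whole number of pages) can be tried out in
userspace. The following is only a sketch under stated assumptions, not
the kernel code: struct rmrr_range, rmrr_range_check() and
SKETCH_PAGE_SIZE are hypothetical names, and 4 KiB pages are assumed.
Note that on x86 the kernel defines PAGE_MASK as ~(PAGE_SIZE - 1), i.e.
the high bits of an address, so a misalignment test needs the low bits
(PAGE_SIZE - 1); the sketch uses that low-bits mask explicitly.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical stand-in for the two address fields of an RMRR entry. */
struct rmrr_range {
	uint64_t base_address;
	uint64_t end_address;	/* inclusive; 'Limit' in the VT-d docs */
};

/* Returns 0 if the range meets all three requirements, -1 otherwise. */
static int rmrr_range_check(const struct rmrr_range *r)
{
	/* Bits that must be clear in any page-aligned value. */
	const uint64_t low_bits = SKETCH_PAGE_SIZE - 1;

	if (r->base_address & low_bits)
		return -1;	/* base is not page aligned */
	if (r->end_address <= r->base_address)
		return -1;	/* empty or inverted range */
	if ((r->end_address - r->base_address + 1) & low_bits)
		return -1;	/* size is not a whole number of pages */
	return 0;
}
```

With this check, the all-zero entry from the cover letter (base == end
== 0) is rejected by the second condition, while a range such as
[0x7a000000-0x7a0fffff] is accepted.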

2020-01-07 19:18:39

by Barret Rhoden

Subject: [PATCH v2 1/2] iommu/vt-d: skip RMRR entries that fail the sanity check

RMRR entries describe memory regions that are DMA targets for devices
outside the kernel's control.

RMRR entries that fail the sanity check are pointing to regions of
memory that the firmware did not tell the kernel are reserved or
otherwise should not be used.

Instead of aborting DMAR processing, this commit skips these RMRR
entries and marks the firmware as tainted. They will not be mapped into
the IOMMU, but the IOMMU can still be utilized. If anything, when the
IOMMU is on, those devices will not be able to clobber RAM that the
kernel has allocated from those regions.

Signed-off-by: Barret Rhoden <[email protected]>
---
drivers/iommu/intel-iommu.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..a8bb458845bc 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4319,12 +4319,18 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
{
struct acpi_dmar_reserved_memory *rmrr;
struct dmar_rmrr_unit *rmrru;
- int ret;

rmrr = (struct acpi_dmar_reserved_memory *)header;
- ret = arch_rmrr_sanity_check(rmrr);
- if (ret)
- return ret;
+ if (arch_rmrr_sanity_check(rmrr)) {
+ WARN_TAINT(1, TAINT_FIRMWARE_WORKAROUND,
+ "Your BIOS is broken; bad RMRR [%#018Lx-%#018Lx]\n"
+ "BIOS vendor: %s; Ver: %s; Product Version: %s\n",
+ rmrr->base_address, rmrr->end_address,
+ dmi_get_system_info(DMI_BIOS_VENDOR),
+ dmi_get_system_info(DMI_BIOS_VERSION),
+ dmi_get_system_info(DMI_PRODUCT_VERSION));
+ return 0;
+ }

rmrru = kzalloc(sizeof(*rmrru), GFP_KERNEL);
if (!rmrru)
--
2.24.1.735.g03f4e72817-goog
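
The change in behavior described in the patch above (skip a bad entry
instead of aborting) comes down to what the per-entry handler returns.
A minimal userspace sketch, assuming a table walker that stops at the
first nonzero return; parse_one(), parse_all(), entry_is_sane() and
struct rmrr_range are hypothetical names, and entry_is_sane() is only a
placeholder for the real checks:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the two address fields of an RMRR entry. */
struct rmrr_range {
	uint64_t base_address;
	uint64_t end_address;
};

/* Placeholder for "this entry passed every check". */
static int entry_is_sane(const struct rmrr_range *r)
{
	return r->end_address > r->base_address;
}

/*
 * Per-entry handler. A nonzero return would stop the walk, so a bad
 * entry is reported and skipped by returning 0 without counting it.
 */
static int parse_one(const struct rmrr_range *r, size_t *accepted)
{
	if (!entry_is_sane(r)) {
		fprintf(stderr, "skipping bad RMRR [%#018llx-%#018llx]\n",
			(unsigned long long)r->base_address,
			(unsigned long long)r->end_address);
		return 0;
	}
	(*accepted)++;
	return 0;
}

/* Walks all entries; stops early only if a handler returns nonzero. */
static int parse_all(const struct rmrr_range *tbl, size_t n, size_t *accepted)
{
	*accepted = 0;
	for (size_t i = 0; i < n; i++) {
		int ret = parse_one(&tbl[i], accepted);

		if (ret)
			return ret;
	}
	return 0;
}
```

Given a table whose middle entry is bad, parse_all() still returns 0
and counts the entries before and after it. Had parse_one() returned an
error for the bad entry, the walk would have ended there and the later
entries would never have been seen.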

2020-01-08 01:30:56

by Baolu Lu

Subject: Re: [PATCH v2 2/2] iommu/vt-d: skip invalid RMRR entries

Hi,

On 1/8/20 3:16 AM, Barret Rhoden via iommu wrote:
> The VT-d docs specify requirements for the RMRR entries base and end
> (called 'Limit' in the docs) addresses.
>
> This commit will cause the DMAR processing to skip any RMRR entries that
> do not meet these requirements and mark the firmware as tainted, since
> the firmware is giving us junk.
>
> Signed-off-by: Barret Rhoden <[email protected]>
> ---
> drivers/iommu/intel-iommu.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index a8bb458845bc..32c3c6338a3d 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -4315,13 +4315,25 @@ static void __init init_iommu_pm_ops(void)
> static inline void init_iommu_pm_ops(void) {}
> #endif /* CONFIG_PM */
>
> +static int rmrr_validity_check(struct acpi_dmar_reserved_memory *rmrr)
> +{
> + if ((rmrr->base_address & PAGE_MASK) ||
> + (rmrr->end_address <= rmrr->base_address) ||
> + ((rmrr->end_address - rmrr->base_address + 1) & PAGE_MASK)) {
> + pr_err(FW_BUG "Broken RMRR base: %#018Lx end: %#018Lx\n",
> + rmrr->base_address, rmrr->end_address);

Since you will WARN_TAINT below, do you still want an error message
here?

> + return -EINVAL;
> + }
> + return 0;
> +}
> +
> int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
> {
> struct acpi_dmar_reserved_memory *rmrr;
> struct dmar_rmrr_unit *rmrru;
>
> rmrr = (struct acpi_dmar_reserved_memory *)header;
> - if (arch_rmrr_sanity_check(rmrr)) {
> + if (rmrr_validity_check(rmrr) || arch_rmrr_sanity_check(rmrr)) {
> WARN_TAINT(1, TAINT_FIRMWARE_WORKAROUND,
> "Your BIOS is broken; bad RMRR [%#018Lx-%#018Lx]\n"
> "BIOS vendor: %s; Ver: %s; Product Version: %s\n",
>

Best regards,
baolu

2020-01-08 19:55:50

by Barret Rhoden

Subject: Re: [PATCH v2 2/2] iommu/vt-d: skip invalid RMRR entries

On 1/7/20 8:27 PM, Lu Baolu wrote:
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index a8bb458845bc..32c3c6338a3d 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -4315,13 +4315,25 @@ static void __init init_iommu_pm_ops(void)
>>   static inline void init_iommu_pm_ops(void) {}
>>   #endif    /* CONFIG_PM */
>> +static int rmrr_validity_check(struct acpi_dmar_reserved_memory *rmrr)
>> +{
>> +    if ((rmrr->base_address & PAGE_MASK) ||
>> +        (rmrr->end_address <= rmrr->base_address) ||
>> +        ((rmrr->end_address - rmrr->base_address + 1) & PAGE_MASK)) {
>> +        pr_err(FW_BUG "Broken RMRR base: %#018Lx end: %#018Lx\n",
>> +               rmrr->base_address, rmrr->end_address);
>
> Since you will WARN_TAINT below, do you still want an error message
> here?

I'm fine either way.

I put it in since arch_rmrr_sanity_check() also has a pr_err():

pr_err(FW_BUG "No firmware reserved region can cover this RMRR
[%#018Lx-%#018Lx], contact BIOS vendor for fixes\n",
start, end - 1);

Thanks,

Barret

2020-01-10 19:29:10

by Chen, Yian

Subject: Re: [PATCH v2 2/2] iommu/vt-d: skip invalid RMRR entries

Hi Barret,

this looks good.

thanks
Yian

On 1/7/2020 11:16 AM, Barret Rhoden wrote:
> The VT-d docs specify requirements for the RMRR entries base and end
> (called 'Limit' in the docs) addresses.
>
> This commit will cause the DMAR processing to skip any RMRR entries that
> do not meet these requirements and mark the firmware as tainted, since
> the firmware is giving us junk.
>
> Signed-off-by: Barret Rhoden <[email protected]>
> ---
> drivers/iommu/intel-iommu.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index a8bb458845bc..32c3c6338a3d 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -4315,13 +4315,25 @@ static void __init init_iommu_pm_ops(void)
> static inline void init_iommu_pm_ops(void) {}
> #endif /* CONFIG_PM */
>
> +static int rmrr_validity_check(struct acpi_dmar_reserved_memory *rmrr)
> +{
> + if ((rmrr->base_address & PAGE_MASK) ||
> + (rmrr->end_address <= rmrr->base_address) ||
> + ((rmrr->end_address - rmrr->base_address + 1) & PAGE_MASK)) {
> + pr_err(FW_BUG "Broken RMRR base: %#018Lx end: %#018Lx\n",
> + rmrr->base_address, rmrr->end_address);
> + return -EINVAL;
> + }
> + return 0;
> +}
> +
> int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
> {
> struct acpi_dmar_reserved_memory *rmrr;
> struct dmar_rmrr_unit *rmrru;
>
> rmrr = (struct acpi_dmar_reserved_memory *)header;
> - if (arch_rmrr_sanity_check(rmrr)) {
> + if (rmrr_validity_check(rmrr) || arch_rmrr_sanity_check(rmrr)) {
> WARN_TAINT(1, TAINT_FIRMWARE_WORKAROUND,
> "Your BIOS is broken; bad RMRR [%#018Lx-%#018Lx]\n"
> "BIOS vendor: %s; Ver: %s; Product Version: %s\n",