2019-06-13 17:15:43

by Kevin Mitchell

[permalink] [raw]
Subject: [PATCH 0/3] handle init errors more gracefully in amd_iommu

This series makes error handling more robust in the amd_iommu init
code. It was initially motivated by problematic firmware that does not
set up the physical address of the iommu. This led to a NULL dereference
panic when iommu_disable was called during cleanup.

While the first patch is sufficient to avoid the panic, the subsequent
two move the cleanup closer to the actual error and avoid calling the
cleanup code twice when amd_iommu=off is specified on the command line.

I have tested this series on a variety of AMD CPUs with firmware
exhibiting the issue. I have additionally tested on platforms where the
firmware has been fixed. I tried both with and without amd_iommu=off. I
have also tested on older CPUs where no IOMMU is detected and even one
where the GART driver ends up running.

Thanks,

Kevin

Kevin Mitchell (3):
iommu/amd: make iommu_disable safer
iommu/amd: move gart fallback to amd_iommu_init
iommu/amd: only free resources once on init error

drivers/iommu/amd_iommu_init.c | 45 ++++++++++++++++++----------------
1 file changed, 24 insertions(+), 21 deletions(-)

--
2.20.1


2019-06-13 17:16:28

by Kevin Mitchell

[permalink] [raw]
Subject: [PATCH 3/3] iommu/amd: only free resources once on init error

When amd_iommu=off was specified on the command line, free_X_resources
functions were called immediately after early_amd_iommu_init. They were
then called again when amd_iommu_init also failed (as expected).

Instead, call them only once: at the end of state_next() whenever
there's an error. These functions should be safe to call any time and
any number of times. However, since state_next is never called again in
an error state, the cleanup will only ever be run once.

This also ensures that cleanup code is run as soon as possible after an
error is detected rather than waiting for amd_iommu_init() to be called.

Signed-off-by: Kevin Mitchell <[email protected]>
---
drivers/iommu/amd_iommu_init.c | 27 +++++++++++++--------------
1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 5f3df5ae6ba8..24fc060fe596 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2638,8 +2638,6 @@ static int __init state_next(void)
init_state = ret ? IOMMU_INIT_ERROR : IOMMU_ACPI_FINISHED;
if (init_state == IOMMU_ACPI_FINISHED && amd_iommu_disabled) {
pr_info("AMD IOMMU disabled on kernel command-line\n");
- free_dma_resources();
- free_iommu_resources();
init_state = IOMMU_CMDLINE_DISABLED;
ret = -EINVAL;
}
@@ -2680,6 +2678,19 @@ static int __init state_next(void)
BUG();
}

+ if (ret) {
+ free_dma_resources();
+ if (!irq_remapping_enabled) {
+ disable_iommus();
+ free_iommu_resources();
+ } else {
+ struct amd_iommu *iommu;
+
+ uninit_device_table_dma();
+ for_each_iommu(iommu)
+ iommu_flush_all_caches(iommu);
+ }
+ }
return ret;
}

@@ -2753,18 +2764,6 @@ static int __init amd_iommu_init(void)
int ret;

ret = iommu_go_to_state(IOMMU_INITIALIZED);
- if (ret) {
- free_dma_resources();
- if (!irq_remapping_enabled) {
- disable_iommus();
- free_iommu_resources();
- } else {
- uninit_device_table_dma();
- for_each_iommu(iommu)
- iommu_flush_all_caches(iommu);
- }
- }
-
#ifdef CONFIG_GART_IOMMU
if (ret && list_empty(&amd_iommu_list)) {
/*
--
2.20.1

2019-07-01 12:41:31

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH 0/3] handle init errors more gracefully in amd_iommu

Hi Kevin,

On Wed, Jun 12, 2019 at 02:52:01PM -0700, Kevin Mitchell wrote:
> I have tested this series on a variety of AMD CPUs with firmware
> exhibiting the issue. I have additionally tested on platforms where the
> firmware has been fixed. I tried both with and without amd_iommu=off. I
> have also tested on older CPUs where no IOMMU is detected and even one
> where the GART driver ends up running.

Thanks for the fixes, applied them all.