2021-07-15 14:20:06

by Laurentiu Tudor

Subject: [PATCH 1/8] bus: fsl-mc: fix arg in call to dprc_scan_objects()

From: Laurentiu Tudor <[email protected]>

Second parameter of dprc_scan_objects() is a bool not a pointer
so change from NULL to false.

Signed-off-by: Laurentiu Tudor <[email protected]>
---
drivers/bus/fsl-mc/fsl-mc-bus.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 09c8ab5e0959..ffec838450f3 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -220,7 +220,7 @@ static int scan_fsl_mc_bus(struct device *dev, void *data)
root_mc_dev = to_fsl_mc_device(dev);
root_mc_bus = to_fsl_mc_bus(root_mc_dev);
mutex_lock(&root_mc_bus->scan_mutex);
- dprc_scan_objects(root_mc_dev, NULL);
+ dprc_scan_objects(root_mc_dev, false);
mutex_unlock(&root_mc_bus->scan_mutex);

exit:
--
2.17.1


2021-07-15 14:20:12

by Laurentiu Tudor

Subject: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case

From: Laurentiu Tudor <[email protected]>

ACPI DMA configure API may return a defer status code, so handle it.
On top of this, move the MC firmware resume after the DMA setup
is completed to avoid crashing due to DMA setup not being done yet or
being deferred.

Signed-off-by: Laurentiu Tudor <[email protected]>
---
drivers/bus/fsl-mc/fsl-mc-bus.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index ffec838450f3..ffd7a1ff957a 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -1089,17 +1089,6 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
}

if (mc->fsl_mc_regs) {
- /*
- * Some bootloaders pause the MC firmware before booting the
- * kernel so that MC will not cause faults as soon as the
- * SMMU probes due to the fact that there's no configuration
- * in place for MC.
- * At this point MC should have all its SMMU setup done so make
- * sure it is resumed.
- */
- writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) & (~GCR1_P1_STOP),
- mc->fsl_mc_regs + FSL_MC_GCR1);
-
if (IS_ENABLED(CONFIG_ACPI) && !dev_of_node(&pdev->dev)) {
mc_stream_id = readl(mc->fsl_mc_regs + FSL_MC_FAPR);
/*
@@ -1113,11 +1102,24 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
error = acpi_dma_configure_id(&pdev->dev,
DEV_DMA_COHERENT,
&mc_stream_id);
+ if (error == -EPROBE_DEFER)
+ return error;
if (error)
dev_warn(&pdev->dev,
"failed to configure dma: %d.\n",
error);
}
+
+ /*
+ * Some bootloaders pause the MC firmware before booting the
+ * kernel so that MC will not cause faults as soon as the
+ * SMMU probes due to the fact that there's no configuration
+ * in place for MC.
+ * At this point MC should have all its SMMU setup done so make
+ * sure it is resumed.
+ */
+ writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) & (~GCR1_P1_STOP),
+ mc->fsl_mc_regs + FSL_MC_GCR1);
}

/*
--
2.17.1
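
The ordering argument in the patch above can be illustrated with a small userspace sketch. It is not the driver code: `struct fake_mc`, `fake_dma_configure()` and `fake_probe()` are hypothetical stand-ins, and `EPROBE_DEFER` is defined locally with the kernel's value because userspace `<errno.h>` does not provide it. The point is only that a deferred DMA setup returns early, so the firmware is resumed only once the setup has gone through.

```c
#include <assert.h>
#include <stdbool.h>

/* Kernel-internal errno value; defined here because userspace <errno.h> lacks it. */
#define EPROBE_DEFER 517

/* Hypothetical stand-in for the state touched by probe. */
struct fake_mc {
	int  defers_left;  /* how many calls defer before DMA setup succeeds */
	bool fw_resumed;   /* models the GCR1 stop bit being cleared */
};

/* Hypothetical stand-in for acpi_dma_configure_id(): defers until "ready". */
static int fake_dma_configure(struct fake_mc *mc)
{
	assert(mc->defers_left >= 0);
	if (mc->defers_left > 0) {
		mc->defers_left--;
		return -EPROBE_DEFER;
	}
	return 0;
}

/* Mirrors the patched ordering: bail out on deferral, resume afterwards. */
static int fake_probe(struct fake_mc *mc)
{
	int error = fake_dma_configure(mc);

	if (error == -EPROBE_DEFER)
		return error;   /* firmware stays paused; probe is retried later */

	/* Other errors are only warned about in the patch, so fall through. */
	mc->fw_resumed = true;
	return 0;
}
```

With `defers_left = 1`, the first `fake_probe()` call returns `-EPROBE_DEFER` and leaves `fw_resumed` false; the second call succeeds and sets it.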

2021-07-15 14:20:15

by Laurentiu Tudor

Subject: [PATCH 4/8] bus: fsl-mc: add .shutdown() op for the bus driver

From: Laurentiu Tudor <[email protected]>

The fsl-mc bus driver is missing the .shutdown() callback, so add it.
The implementation simply calls the .remove() callback.

Signed-off-by: Laurentiu Tudor <[email protected]>
---
drivers/bus/fsl-mc/fsl-mc-bus.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 2341de6bce67..efff48b3efa5 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -1206,6 +1206,11 @@ static int fsl_mc_bus_remove(struct platform_device *pdev)
return 0;
}

+static void fsl_mc_bus_shutdown(struct platform_device *pdev)
+{
+ fsl_mc_bus_remove(pdev);
+}
+
static const struct of_device_id fsl_mc_bus_match_table[] = {
{.compatible = "fsl,qoriq-mc",},
{},
@@ -1228,6 +1233,7 @@ static struct platform_driver fsl_mc_bus_driver = {
},
.probe = fsl_mc_bus_probe,
.remove = fsl_mc_bus_remove,
+ .shutdown = fsl_mc_bus_shutdown,
};

static int __init fsl_mc_bus_driver_init(void)
--
2.17.1

2021-07-15 14:20:17

by Laurentiu Tudor

Subject: [PATCH 5/8] bus: fsl-mc: pause the MC firmware before IOMMU setup

From: Laurentiu Tudor <[email protected]>

Add a bus notifier to pause the MC firmware as soon as its device
gets discovered. This is needed because the firmware is live: as soon
as the SMMU gets probed and enabled, the resulting SMMU context faults
would crash the firmware. The firmware will be resumed at probe time,
after the required IOMMU setup has been completed.

Signed-off-by: Laurentiu Tudor <[email protected]>
---
drivers/bus/fsl-mc/fsl-mc-bus.c | 44 ++++++++++++++++++++++++++++++++-
1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index efff48b3efa5..41861ca5c8f1 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -896,6 +896,8 @@ int fsl_mc_device_add(struct fsl_mc_obj_desc *obj_desc,
}
EXPORT_SYMBOL_GPL(fsl_mc_device_add);

+static struct notifier_block fsl_mc_nb;
+
/**
* fsl_mc_device_remove - Remove an fsl-mc device from being visible to
* Linux
@@ -1203,6 +1205,8 @@ static int fsl_mc_bus_remove(struct platform_device *pdev)
fsl_destroy_mc_io(mc->root_mc_bus_dev->mc_io);
mc->root_mc_bus_dev->mc_io = NULL;

+ bus_unregister_notifier(&fsl_mc_bus_type, &fsl_mc_nb);
+
return 0;
}

@@ -1236,6 +1240,44 @@ static struct platform_driver fsl_mc_bus_driver = {
.shutdown = fsl_mc_bus_shutdown,
};

+static int fsl_mc_bus_notifier(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct device *dev = data;
+ struct resource *res;
+ void __iomem *fsl_mc_regs;
+
+ if (action != BUS_NOTIFY_ADD_DEVICE)
+ return 0;
+
+ if (!of_match_device(fsl_mc_bus_match_table, dev) &&
+ !acpi_match_device(fsl_mc_bus_acpi_match_table, dev))
+ return 0;
+
+ res = platform_get_resource(to_platform_device(dev), IORESOURCE_MEM, 1);
+ if (!res)
+ return 0;
+
+ fsl_mc_regs = ioremap(res->start, resource_size(res));
+ if (!fsl_mc_regs)
+ return 0;
+
+ /*
+ * Make sure that the MC firmware is paused before the IOMMU setup for
+ * it is done or otherwise the firmware will crash right after the SMMU
+ * gets probed and enabled.
+ */
+ writel(readl(fsl_mc_regs + FSL_MC_GCR1) | (GCR1_P1_STOP | GCR1_P2_STOP),
+ fsl_mc_regs + FSL_MC_GCR1);
+ iounmap(fsl_mc_regs);
+
+ return 0;
+}
+
+static struct notifier_block fsl_mc_nb = {
+ .notifier_call = fsl_mc_bus_notifier,
+};
+
static int __init fsl_mc_bus_driver_init(void)
{
int error;
@@ -1260,7 +1302,7 @@ static int __init fsl_mc_bus_driver_init(void)
if (error < 0)
goto error_cleanup_dprc_driver;

- return 0;
+ return bus_register_notifier(&platform_bus_type, &fsl_mc_nb);

error_cleanup_dprc_driver:
dprc_driver_exit();
--
2.17.1

2021-07-15 14:20:23

by Laurentiu Tudor

Subject: [PATCH 6/8] bus: fsl-mc: pause the MC firmware when unloading

From: Laurentiu Tudor <[email protected]>

Pause the MC firmware when unloading the driver so that it doesn't
crash in certain scenarios, such as kexec.

Signed-off-by: Laurentiu Tudor <[email protected]>
---
drivers/bus/fsl-mc/fsl-mc-bus.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 41861ca5c8f1..e5b4830cf3c5 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -1207,6 +1207,16 @@ static int fsl_mc_bus_remove(struct platform_device *pdev)

bus_unregister_notifier(&fsl_mc_bus_type, &fsl_mc_nb);

+ if (mc->fsl_mc_regs) {
+ /*
+ * Pause the MC firmware so that it doesn't crash in certain
+ * scenarios, such as kexec.
+ */
+ writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) |
+ (GCR1_P1_STOP | GCR1_P2_STOP),
+ mc->fsl_mc_regs + FSL_MC_GCR1);
+ }
+
return 0;
}

--
2.17.1

2021-07-15 14:21:17

by Laurentiu Tudor

Subject: [PATCH 8/8] bus: fsl-mc: fix mmio base address for child DPRCs

From: Laurentiu Tudor <[email protected]>

Some versions of the MC firmware wrongly report 0 for register base
address of the DPMCP associated with child DPRC objects thus rendering
them unusable. This is particularly troublesome in ACPI boot scenarios
where the legacy way of extracting this base address from the device
tree does not apply.
Given that DPMCPs share the same base address, work around this by using
the base address extracted from the root DPRC container.

Signed-off-by: Laurentiu Tudor <[email protected]>
---
drivers/bus/fsl-mc/fsl-mc-bus.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 31595017d207..6273f782d0f2 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -69,6 +69,8 @@ struct fsl_mc_addr_translation_range {
#define MC_FAPR_PL BIT(18)
#define MC_FAPR_BMT BIT(17)

+static phys_addr_t mc_portal_base_phys_addr;
+
/**
* fsl_mc_bus_match - device to driver matching callback
* @dev: the fsl-mc device to match against
@@ -704,14 +706,30 @@ static int fsl_mc_device_get_mmio_regions(struct fsl_mc_device *mc_dev,
* If base address is in the region_desc use it otherwise
* revert to old mechanism
*/
- if (region_desc.base_address)
+ if (region_desc.base_address) {
regions[i].start = region_desc.base_address +
region_desc.base_offset;
- else
+ } else {
error = translate_mc_addr(mc_dev, mc_region_type,
region_desc.base_offset,
&regions[i].start);

+ /*
+ * Some versions of the MC firmware wrongly report
+ * 0 for register base address of the DPMCP associated
+ * with child DPRC objects thus rendering them unusable.
+ * This is particularly troublesome in ACPI boot
+ * scenarios where the legacy way of extracting this
+ * base address from the device tree does not apply.
+ * Given that DPMCPs share the same base address,
+ * workaround this by using the base address extracted
+ * from the root DPRC container.
+ */
+ if (is_fsl_mc_bus_dprc(mc_dev) &&
+ regions[i].start == region_desc.base_offset)
+ regions[i].start += mc_portal_base_phys_addr;
+ }
+
if (error < 0) {
dev_err(parent_dev,
"Invalid MC offset: %#x (for %s.%d\'s region %d)\n",
@@ -1150,6 +1168,8 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
plat_res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
mc_portal_phys_addr = plat_res->start;
mc_portal_size = resource_size(plat_res);
+ mc_portal_base_phys_addr = mc_portal_phys_addr & ~0x3ffffff;
+
error = fsl_create_mc_io(&pdev->dev, mc_portal_phys_addr,
mc_portal_size, NULL,
FSL_MC_IO_ATOMIC_CONTEXT_PORTAL, &mc_io);
--
2.17.1
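
The address arithmetic in this patch can be sketched in plain C as follows. This is a simplified model under stated assumptions: `portal_base()` and `fixup_region_start()` are hypothetical helper names, the sketch omits the `is_fsl_mc_bus_dprc()` check that the patch applies, and any addresses fed to it are made up for illustration. The mask `0x3ffffff` covers the low 26 bits, so clearing it rounds the portal address down to a 64 MiB boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Low 26 bits, i.e. the 0x3ffffff mask used in the patch (a 64 MiB window). */
#define MC_PORTAL_OFFSET_MASK UINT64_C(0x3ffffff)

/* Strip the in-window offset to recover the window's base address. */
static uint64_t portal_base(uint64_t portal_phys_addr)
{
	return portal_phys_addr & ~MC_PORTAL_OFFSET_MASK;
}

/*
 * Fallback modelled on the patch: if the translated start still equals the
 * bare offset, no base address was applied, so add the root portal base.
 */
static uint64_t fixup_region_start(uint64_t start, uint64_t base_offset,
				   uint64_t base)
{
	assert(base_offset <= MC_PORTAL_OFFSET_MASK); /* sketch-only sanity check */
	if (start == base_offset)
		start += base;
	return start;
}
```

A start address that already carries a base is left alone; only a start that equals the raw offset is adjusted.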

2021-07-15 14:22:26

by Laurentiu Tudor

Subject: [PATCH 3/8] bus: fsl-mc: fully resume the firmware

From: Laurentiu Tudor <[email protected]>

The MC firmware has two execution units. Resume them both, as on some
Layerscape SoCs not doing so breaks the firmware.

Signed-off-by: Laurentiu Tudor <[email protected]>
---
drivers/bus/fsl-mc/fsl-mc-bus.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index ffd7a1ff957a..2341de6bce67 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -63,6 +63,7 @@ struct fsl_mc_addr_translation_range {

#define FSL_MC_GCR1 0x0
#define GCR1_P1_STOP BIT(31)
+#define GCR1_P2_STOP BIT(30)

#define FSL_MC_FAPR 0x28
#define MC_FAPR_PL BIT(18)
@@ -1118,7 +1119,8 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
* At this point MC should have all its SMMU setup done so make
* sure it is resumed.
*/
- writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) & (~GCR1_P1_STOP),
+ writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) &
+ (~(GCR1_P1_STOP | GCR1_P2_STOP)),
mc->fsl_mc_regs + FSL_MC_GCR1);
}

--
2.17.1
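
To make the bit manipulation in this patch concrete, here is a minimal standalone sketch that models GCR1 as a plain 32-bit value instead of an MMIO register. The bit positions come from the patch; the helper names `gcr1_pause()` and `gcr1_resume()` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as defined in the patch: BIT(31) and BIT(30). */
#define GCR1_P1_STOP (UINT32_C(1) << 31)
#define GCR1_P2_STOP (UINT32_C(1) << 30)

/* Both stop bits together occupy the top two bits of the register. */
static_assert((GCR1_P1_STOP | GCR1_P2_STOP) == UINT32_C(0xC0000000),
	      "stop bits should be bits 31 and 30");

/* Set both stop bits, leaving every other GCR1 bit untouched. */
static uint32_t gcr1_pause(uint32_t gcr1)
{
	return gcr1 | (GCR1_P1_STOP | GCR1_P2_STOP);
}

/* Clear both stop bits, leaving every other GCR1 bit untouched. */
static uint32_t gcr1_resume(uint32_t gcr1)
{
	return gcr1 & ~(GCR1_P1_STOP | GCR1_P2_STOP);
}
```

Clearing only `GCR1_P1_STOP`, as the code did before this patch, would turn `0xC0000000` into `0x40000000`, leaving the second execution unit stopped; `gcr1_resume()` clears both bits.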

2021-07-15 16:46:36

by Laurentiu Tudor

Subject: [PATCH 7/8] bus: fsl-mc: rescan devices if endpoint not found

From: Laurentiu Tudor <[email protected]>

If the endpoint of a device is not yet probed on the bus, force
a rescan of the devices and retry to get a reference to the
endpoint device. If the device is still not found then we assume
it's in a different isolation context (container/DPRC) thus
unavailable and return a permission error.

Signed-off-by: Laurentiu Tudor <[email protected]>
Signed-off-by: Robert-Ionut Alexa <[email protected]>
---
drivers/bus/fsl-mc/fsl-mc-bus.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index e5b4830cf3c5..31595017d207 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -950,10 +950,28 @@ struct fsl_mc_device *fsl_mc_get_endpoint(struct fsl_mc_device *mc_dev)
* We know that the device has an endpoint because we verified by
* interrogating the firmware. This is the case when the device was not
* yet discovered by the fsl-mc bus, thus the lookup returned NULL.
- * Differentiate this case by returning EPROBE_DEFER.
+ * Force a rescan of the devices in this container and retry the lookup.
+ */
+ if (!endpoint) {
+ struct fsl_mc_bus *mc_bus = to_fsl_mc_bus(mc_bus_dev);
+
+ if (mutex_trylock(&mc_bus->scan_mutex)) {
+ err = dprc_scan_objects(mc_bus_dev, true);
+ mutex_unlock(&mc_bus->scan_mutex);
+ }
+
+ if (err < 0)
+ return ERR_PTR(err);
+ }
+
+ endpoint = fsl_mc_device_lookup(&endpoint_desc, mc_bus_dev);
+ /*
+ * This means that the endpoint might reside in a different isolation
+ * context (DPRC/container). Not much to do, so return a permission
+ * error.
*/
if (!endpoint)
- return ERR_PTR(-EPROBE_DEFER);
+ return ERR_PTR(-EPERM);

return endpoint;
}
--
2.17.1

2021-07-21 21:03:21

by Diana Madalina Craciun

Subject: Re: [PATCH 1/8] bus: fsl-mc: fix arg in call to dprc_scan_objects()

Reviewed-by: Diana Craciun <[email protected]>

for all patches in the series

On 7/15/2021 5:07 PM, [email protected] wrote:
> From: Laurentiu Tudor <[email protected]>
>
> Second parameter of dprc_scan_objects() is a bool not a pointer
> so change from NULL to false.
>
> Signed-off-by: Laurentiu Tudor <[email protected]>
> ---
> drivers/bus/fsl-mc/fsl-mc-bus.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
> index 09c8ab5e0959..ffec838450f3 100644
> --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
> +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
> @@ -220,7 +220,7 @@ static int scan_fsl_mc_bus(struct device *dev, void *data)
> root_mc_dev = to_fsl_mc_device(dev);
> root_mc_bus = to_fsl_mc_bus(root_mc_dev);
> mutex_lock(&root_mc_bus->scan_mutex);
> - dprc_scan_objects(root_mc_dev, NULL);
> + dprc_scan_objects(root_mc_dev, false);
> mutex_unlock(&root_mc_bus->scan_mutex);
>
> exit:

2021-11-11 17:23:47

by Daniel Thompson

Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case

Hi Laurentiu

On Thu, Jul 15, 2021 at 05:07:12PM +0300, [email protected] wrote:
> From: Laurentiu Tudor <[email protected]>
>
> ACPI DMA configure API may return a defer status code, so handle it.
> On top of this, move the MC firmware resume after the DMA setup
> is completed to avoid crashing due to DMA setup not being done yet or
> being deferred.
>
> Signed-off-by: Laurentiu Tudor <[email protected]>

I saw regressions on my Honeycomb LX2 (NXP LX2060A) when I switched to
v5.15. It seems like it results in so many sMMU errors that the system
cannot function correctly (it's only about a 75% chance the system will
boot to GUI and even if it does boot successfully the system will hang
up soon after).

Bisect took me up a couple of blind alleys (mostly due to unrelated boot
problems in v5.14-rc2) but eventually led me to this patch as the cause.
Applying/unapplying this patch to a v5.14-rc3 tree will provoke/fix the
problem and reverting it against v5.15 also resolves the problem.

Is there some specific firmware version required for this patch to work
correctly?


Daniel.


PS: Below is the revert I applied to the v5.15 kernel (after
a fairly simple merge conflict fix)

From 4162b64e4f361a6a773e065b592dbc5493202524 Mon Sep 17 00:00:00 2001
From: Daniel Thompson <[email protected]>
Date: Thu, 11 Nov 2021 16:50:25 +0000
Subject: [PATCH] Revert "bus: fsl-mc: handle DMA config deferral in ACPI case"

This reverts commit d31e7fe20a2251f87adc6ecefbdaf25e6961ce74 because
it was causing regressions on my Honeycomb LX2 (NXP LX2060A).

All kernels where the problem manifests (as either a boot hang or a desktop
hang) issue the following messages in vast number:

~~~
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm_smmu_context_fault: 1697259 callbacks suppressed
~~~

Signed-off-by: Daniel Thompson <[email protected]>
---
drivers/bus/fsl-mc/fsl-mc-bus.c | 26 ++++++++++++--------------
1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 8fd4a356a86e..429bacc7de20 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -1130,6 +1130,18 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
}

if (mc->fsl_mc_regs) {
+ /*
+ * Some bootloaders pause the MC firmware before booting the
+ * kernel so that MC will not cause faults as soon as the
+ * SMMU probes due to the fact that there's no configuration
+ * in place for MC.
+ * At this point MC should have all its SMMU setup done so make
+ * sure it is resumed.
+ */
+ writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) &
+ (~(GCR1_P1_STOP | GCR1_P2_STOP)),
+ mc->fsl_mc_regs + FSL_MC_GCR1);
+
if (IS_ENABLED(CONFIG_ACPI) && !dev_of_node(&pdev->dev)) {
mc_stream_id = readl(mc->fsl_mc_regs + FSL_MC_FAPR);
/*
@@ -1143,25 +1155,11 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
error = acpi_dma_configure_id(&pdev->dev,
DEV_DMA_COHERENT,
&mc_stream_id);
- if (error == -EPROBE_DEFER)
- return error;
if (error)
dev_warn(&pdev->dev,
"failed to configure dma: %d.\n",
error);
}
-
- /*
- * Some bootloaders pause the MC firmware before booting the
- * kernel so that MC will not cause faults as soon as the
- * SMMU probes due to the fact that there's no configuration
- * in place for MC.
- * At this point MC should have all its SMMU setup done so make
- * sure it is resumed.
- */
- writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) &
- (~(GCR1_P1_STOP | GCR1_P2_STOP)),
- mc->fsl_mc_regs + FSL_MC_GCR1);
}

/*
--
2.33.0


2021-11-11 17:37:41

by Jon Nettleton

Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case

On Thu, Nov 11, 2021 at 6:23 PM Daniel Thompson
<[email protected]> wrote:
>
> Hi Laurentiu
>
> On Thu, Jul 15, 2021 at 05:07:12PM +0300, [email protected] wrote:
> > From: Laurentiu Tudor <[email protected]>
> >
> > ACPI DMA configure API may return a defer status code, so handle it.
> > On top of this, move the MC firmware resume after the DMA setup
> > is completed to avoid crashing due to DMA setup not being done yet or
> > being deferred.
> >
> > Signed-off-by: Laurentiu Tudor <[email protected]>
>
> I saw regressions on my Honeycomb LX2 (NXP LX2060A) when I switched to
> v5.15. It seems like it results in so many sMMU errors that the system
> cannot function correctly (it's only about a 75% chance the system will
> boot to GUI and even if it does boot successfully the system will hang
> up soon after).
>
> Bisect took me up a couple of blind alleys (mostly due to unrelated boot
> problems in v5.14-rc2) but eventually led me to this patch as the cause.
> Applying/unapplying this patch to a v5.14-rc3 tree will provoke/fix the
> problem and reverting it against v5.15 also resolves the problem.
>
> Is there some specific firmware version required for this patch to work
> correctly?
>
>
> Daniel.
>
>
> PS: Below is the revert I applied to the v5.15 kernel (after
> a fairly simple merge conflict fix)
>
> From 4162b64e4f361a6a773e065b592dbc5493202524 Mon Sep 17 00:00:00 2001
> From: Daniel Thompson <[email protected]>
> Date: Thu, 11 Nov 2021 16:50:25 +0000
> Subject: [PATCH] Revert "bus: fsl-mc: handle DMA config deferral in ACPI case"
>
> This reverts commit d31e7fe20a2251f87adc6ecefbdaf25e6961ce74 because
> it was causing regressions on my Honeycomb LX2 (NXP LX2060A).
>
> All kernels where the problem manifests (as either a boot hang or a desktop
> hang) issue the following messages in vast number:
>
> ~~~
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm_smmu_context_fault: 1697259 callbacks suppressed
> ~~~
>
> Signed-off-by: Daniel Thompson <[email protected]>
> ---
> drivers/bus/fsl-mc/fsl-mc-bus.c | 26 ++++++++++++--------------
> 1 file changed, 12 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
> index 8fd4a356a86e..429bacc7de20 100644
> --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
> +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
> @@ -1130,6 +1130,18 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
> }
>
> if (mc->fsl_mc_regs) {
> + /*
> + * Some bootloaders pause the MC firmware before booting the
> + * kernel so that MC will not cause faults as soon as the
> + * SMMU probes due to the fact that there's no configuration
> + * in place for MC.
> + * At this point MC should have all its SMMU setup done so make
> + * sure it is resumed.
> + */
> + writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) &
> + (~(GCR1_P1_STOP | GCR1_P2_STOP)),
> + mc->fsl_mc_regs + FSL_MC_GCR1);
> +
> if (IS_ENABLED(CONFIG_ACPI) && !dev_of_node(&pdev->dev)) {
> mc_stream_id = readl(mc->fsl_mc_regs + FSL_MC_FAPR);
> /*
> @@ -1143,25 +1155,11 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
> error = acpi_dma_configure_id(&pdev->dev,
> DEV_DMA_COHERENT,
> &mc_stream_id);
> - if (error == -EPROBE_DEFER)
> - return error;
> if (error)
> dev_warn(&pdev->dev,
> "failed to configure dma: %d.\n",
> error);
> }
> -
> - /*
> - * Some bootloaders pause the MC firmware before booting the
> - * kernel so that MC will not cause faults as soon as the
> - * SMMU probes due to the fact that there's no configuration
> - * in place for MC.
> - * At this point MC should have all its SMMU setup done so make
> - * sure it is resumed.
> - */
> - writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) &
> - (~(GCR1_P1_STOP | GCR1_P2_STOP)),
> - mc->fsl_mc_regs + FSL_MC_GCR1);
> }
>
> /*
> --
> 2.33.0
>

This patch was merged as a requirement for operational on-board networking;
it was a prerequisite to landing the patches that support MDIO and PHY
initialization in general. The correct solution for the problem you are
seeing is for the ACPI maintainers to figure out how to land the IORT RMR
patchset. Until that is done, the only workaround is setting
"arm-smmu.disable_bypass=0 iommu.passthrough=1" on the kernel command line.
The latter option has been required since 5.15 and I haven't had the time or
energy to figure out why. The proper solution is to just land the IORT RMR
patchset and let HoneyComb run with the SMMU enabled.

-Jon

2021-11-12 17:31:59

by Daniel Thompson

Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case

On Thu, Nov 11, 2021 at 06:36:58PM +0100, Jon Nettleton wrote:
> On Thu, Nov 11, 2021 at 6:23 PM Daniel Thompson
> <[email protected]> wrote:
> >
> > Hi Laurentiu
> >
> > On Thu, Jul 15, 2021 at 05:07:12PM +0300, [email protected] wrote:
> > > From: Laurentiu Tudor <[email protected]>
> > >
> > > ACPI DMA configure API may return a defer status code, so handle it.
> > > On top of this, move the MC firmware resume after the DMA setup
> > > is completed to avoid crashing due to DMA setup not being done yet or
> > > being deferred.
> > >
> > > Signed-off-by: Laurentiu Tudor <[email protected]>
> >
> > I saw regressions on my Honeycomb LX2 (NXP LX2060A) when I switched to
> > v5.15. It seems like it results in so many sMMU errors that the system
> > cannot function correctly (it's only about a 75% chance the system will
> > boot to GUI and even if it does boot successfully the system will hang
> > up soon after).
> >
> > Bisect took me up a couple of blind alleys (mostly due to unrelated boot
> > problems in v5.14-rc2) but eventually led me to this patch as the cause.
> > Applying/unapplying this patch to a v5.14-rc3 tree will provoke/fix the
> > problem and reverting it against v5.15 also resolves the problem.
> >
> > Is there some specific firmware version required for this patch to work
> > correctly?
>
> This patch was merged as a requirement for operational on board networking.
> This was merged as a prerequisite to landing the patches to support MDIO and
> phy initialization in general.

Interesting.

I assumed the change of behaviour comes from properly handling
-EPROBE_DEFER (which can hardly be regarded as a fault with the patch).

Having said that, the patch does not seem to be mandatory to get the 1G
networking working on Honeycomb LX2 (running ACPI). By taking v5.15 and
reverting as I shared previously, I am still able to access the network
using the 1G port on the back of the unit (although I didn't do any
performance tests).


> The correct solution for the problem you are seeing is the ACPI
> maintainers figuring out how to land the IORT RMR patchset. Until
> that is done the only workaround is setting "arm-smmu.disable_bypass=0
> iommu.passthrough=1" on the kernel commandline. The latter option is
> required since 5.15 and I haven't had time or energy to figure out
> why. The proper solution is to just land the IORT RMR patchset and
> let HoneyComb run with the SMMU enabled.

Thanks for the update. I'll probably adopt iommu.passthrough=1 for now.
That allows me to adopt a distro kernel when it updates to v5.15.


Daniel.

2021-11-17 13:03:19

by Laurentiu Tudor

Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case

Hi Daniel,

Sorry for the late reply, please see some comments inline.

On 11/11/2021 7:23 PM, Daniel Thompson wrote:
> Hi Laurentiu
>
> On Thu, Jul 15, 2021 at 05:07:12PM +0300, [email protected] wrote:
>> From: Laurentiu Tudor <[email protected]>
>>
>> ACPI DMA configure API may return a defer status code, so handle it.
>> On top of this, move the MC firmware resume after the DMA setup
>> is completed to avoid crashing due to DMA setup not being done yet or
>> being deferred.
>>
>> Signed-off-by: Laurentiu Tudor <[email protected]>
>
> I saw regressions on my Honeycomb LX2 (NXP LX2060A) when I switched to
> v5.15. It seems like it results in so many sMMU errors that the system
> cannot function correctly (it's only about a 75% chance the system will
> boot to GUI and even if it does boot successfully the system will hang
> up soon after).
>
> Bisect took me up a couple of blind alleys (mostly due to unrelated boot
> problems in v5.14-rc2) but eventually led me to this patch as the cause.
> Applying/unapplying this patch to a v5.14-rc3 tree will provoke/fix the
> problem and reverting it against v5.15 also resolves the problem.

That's pretty strange. Was the DPAA2 based networking working with this
patch reverted?

> Is there some specific firmware version required for this patch to work
> correctly?

It's a bit of a long story. As Jon already mentioned, we're waiting for
the maintainers to agree on the IORT RMR support, which we depend on to
declare reserved memory regions for the MC firmware in UEFI.
For now, the recommended workaround is to use the
"arm-smmu.disable_bypass=0" kernel boot arg.

---
Best Regards, Laurentiu


2021-11-17 13:07:59

by Laurentiu Tudor

[permalink] [raw]
Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case



On 11/12/2021 7:31 PM, Daniel Thompson wrote:
> On Thu, Nov 11, 2021 at 06:36:58PM +0100, Jon Nettleton wrote:
>> On Thu, Nov 11, 2021 at 6:23 PM Daniel Thompson
>> <[email protected]> wrote:
>>>
>>> Hi Laurentiu
>>>
>>> On Thu, Jul 15, 2021 at 05:07:12PM +0300, [email protected] wrote:
>>>> From: Laurentiu Tudor <[email protected]>
>>>>
>>>> ACPI DMA configure API may return a defer status code, so handle it.
>>>> On top of this, move the MC firmware resume after the DMA setup
>>>> is completed to avoid crashing due to DMA setup not being done yet or
>>>> being deferred.
>>>>
>>>> Signed-off-by: Laurentiu Tudor <[email protected]>
>>>
>>> I saw regressions on my Honeycomb LX2 (NXP LX2060A) when I switched to
>>> v5.15. It seems like it results in so many sMMU errors that the system
>>> cannot function correctly (it's only about a 75% chance the system will
>>> boot to GUI and even if it does boot successfully the system will hang
>>> up soon after).
>>>
>>> Bisect took me up a couple of blind alleys (mostly due to unrelated boot
>>> problems in v5.14-rc2) but eventually led me to this patch as the cause.
>>> Applying/unapplying this patch to a v5.14-rc3 tree will provoke/fix the
>>> problem and reverting it against v5.15 also resolves the problem.
>>>
>>> Is there some specific firmware version required for this patch to work
>>> correctly?
>>
>> This patch was merged as a requirement for operational on board networking.
>> This was merged as a prerequisite to landing the patches to support MDIO and
>> phy initialization in general.
>
> Interesting.
>
> I assumed the change of behaviour comes from properly handling
> -EPROBE_DEFER (which can hardly be regarded as a fault with the patch).
>
> Having said that the patch does not seem to be mandatory to get the 1G
> networking working on Honeycomb LX2 (running ACPI). By taking v5.15 and
> reverting as I shared previously, I am still able to access the network
> using the 1G port on the back of the unit (although I didn't do any
> performance tests).
>
>
>> The correct solution for the problem you are seeing is the ACPI
>> maintainers figuring out how to land the IORT RMR patchset. Until
>> that is done the only workaround is setting "arm-smmu.disable_bypass=0
>> iommu.passthrough=1" on the kernel commandline. The latter option is
>> required since 5.15 and I haven't had time or energy to figure out
>> why. The proper solution is to just land the IORT RMR patchset and
>> let HoneyComb run with the SMMU enabled.
>
> Thanks for the update. I'll probably adopt iommu.passthrough=1 for now.
> That allows me to adopt a distro kernel when it updates to v5.15.

The "iommu.passthrough=1" kernel arg shouldn't be needed. By chance, do
you remember what errors you were seeing? What was failing?

---
Thanks & Best Regards, Laurentiu

2021-11-17 13:23:34

by Jon Nettleton

[permalink] [raw]
Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case

On Wed, Nov 17, 2021 at 2:07 PM Laurentiu Tudor <[email protected]> wrote:
>
>
>
> On 11/12/2021 7:31 PM, Daniel Thompson wrote:
> > On Thu, Nov 11, 2021 at 06:36:58PM +0100, Jon Nettleton wrote:
> >> On Thu, Nov 11, 2021 at 6:23 PM Daniel Thompson
> >> <[email protected]> wrote:
> >>>
> >>> Hi Laurentiu
> >>>
> >>> On Thu, Jul 15, 2021 at 05:07:12PM +0300, [email protected] wrote:
> >>>> From: Laurentiu Tudor <[email protected]>
> >>>>
> >>>> ACPI DMA configure API may return a defer status code, so handle it.
> >>>> On top of this, move the MC firmware resume after the DMA setup
> >>>> is completed to avoid crashing due to DMA setup not being done yet or
> >>>> being deferred.
> >>>>
> >>>> Signed-off-by: Laurentiu Tudor <[email protected]>
> >>>
> >>> I saw regressions on my Honeycomb LX2 (NXP LX2060A) when I switched to
> >>> v5.15. It seems like it results in so many sMMU errors that the system
> >>> cannot function correctly (it's only about a 75% chance the system will
> >>> boot to GUI and even if it does boot successfully the system will hang
> >>> up soon after).
> >>>
> >>> Bisect took me up a couple of blind alleys (mostly due to unrelated boot
> >>> problems in v5.14-rc2) but eventually led me to this patch as the cause.
> >>> Applying/unapplying this patch to a v5.14-rc3 tree will provoke/fix the
> >>> problem and reverting it against v5.15 also resolves the problem.
> >>>
> >>> Is there some specific firmware version required for this patch to work
> >>> correctly?
> >>
> >> This patch was merged as a requirement for operational on board networking.
> >> This was merged as a prerequisite to landing the patches to support MDIO and
> >> phy initialization in general.
> >
> > Interesting.
> >
> > I assumed the change of behaviour comes from properly handling
> > -EPROBE_DEFER (which can hardly be regarded as a fault with the patch).
> >
> > Having said that the patch does not seem to be mandatory to get the 1G
> > networking working on Honeycomb LX2 (running ACPI). By taking v5.15 and
> > reverting as I shared previously, I am still able to access the network
> > using the 1G port on the back of the unit (although I didn't do any
> > performance tests).
> >
> >
> >> The correct solution for the problem you are seeing is the ACPI
> >> maintainers figuring out how to land the IORT RMR patchset. Until
> >> that is done the only workaround is setting "arm-smmu.disable_bypass=0
> >> iommu.passthrough=1" on the kernel commandline. The latter option is
> >> required since 5.15 and I haven't had time or energy to figure out
> >> why. The proper solution is to just land the IORT RMR patchset and
> >> let HoneyComb run with the SMMU enabled.
> >
> > Thanks for the update. I'll probably adopt iommu.passthrough=1 for now.
> > That allows me to adopt a distro kernel when it updates to v5.15.
>
> The "iommu.passthrough=1" kernel arg shouldn't be needed. By chance, do
> you remember what errors you were seeing? What was failing?

This wasn't needed prior to 5.15, both are needed now. I have not bothered
to bisect what caused it, since we have a proper solution that just needs
to be merged. Then we won't need any kernel arguments.

-Jon

>
> ---
> Thanks & Best Regards, Laurentiu

2021-11-17 13:59:16

by Daniel Thompson

[permalink] [raw]
Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case

On Wed, Nov 17, 2021 at 03:07:51PM +0200, Laurentiu Tudor wrote:
> On 11/12/2021 7:31 PM, Daniel Thompson wrote:
> > On Thu, Nov 11, 2021 at 06:36:58PM +0100, Jon Nettleton wrote:
> >> On Thu, Nov 11, 2021 at 6:23 PM Daniel Thompson
> >> <[email protected]> wrote:
> >>> Hi Laurentiu
> >>>
> >>> On Thu, Jul 15, 2021 at 05:07:12PM +0300, [email protected] wrote:
> >>>> From: Laurentiu Tudor <[email protected]>
> >>>>
> >>>> ACPI DMA configure API may return a defer status code, so handle it.
> >>>> On top of this, move the MC firmware resume after the DMA setup
> >>>> is completed to avoid crashing due to DMA setup not being done yet or
> >>>> being deferred.
> >>>>
> >>>> Signed-off-by: Laurentiu Tudor <[email protected]>
> >>>
> >>> I saw regressions on my Honeycomb LX2 (NXP LX2060A) when I switched to
> >>> v5.15. It seems like it results in so many sMMU errors that the system
> >>> cannot function correctly (it's only about a 75% chance the system will
> >>> boot to GUI and even if it does boot successfully the system will hang
> >>> up soon after).
> >>>
> >>> Bisect took me up a couple of blind alleys (mostly due to unrelated boot
> >>> problems in v5.14-rc2) but eventually led me to this patch as the cause.
> >>> Applying/unapplying this patch to a v5.14-rc3 tree will provoke/fix the
> >>> problem and reverting it against v5.15 also resolves the problem.
> >>>
> >>> Is there some specific firmware version required for this patch to work
> >>> correctly?
> >>
> >> This patch was merged as a requirement for operational on board networking.
> >> This was merged as a prerequisite to landing the patches to support MDIO and
> >> phy initialization in general.
> >
> > Interesting.
> >
> > I assumed the change of behaviour comes from properly handling
> > -EPROBE_DEFER (which can hardly be regarded as a fault with the patch).
> >
> > Having said that the patch does not seem to be mandatory to get the 1G
> > networking working on Honeycomb LX2 (running ACPI). By taking v5.15 and
> > reverting as I shared previously, I am still able to access the network
> > using the 1G port on the back of the unit (although I didn't do any
> > performance tests).
> >
> >
> >> The correct solution for the problem you are seeing is the ACPI
> >> maintainers figuring out how to land the IORT RMR patchset. Until
> >> that is done the only workaround is setting "arm-smmu.disable_bypass=0
> >> iommu.passthrough=1" on the kernel commandline. The latter option is
> >> required since 5.15 and I haven't had time or energy to figure out
> >> why. The proper solution is to just land the IORT RMR patchset and
> >> let HoneyComb run with the SMMU enabled.
> >
> > Thanks for the update. I'll probably adopt iommu.passthrough=1 for now.
> > That allows me to adopt a distro kernel when it updates to v5.15.
>
> The "iommu.passthrough=1" kernel arg shouldn't be needed. By chance, do
> you remember what errors you were seeing? What was failing?

For all testing of v5.15 I had "arm-smmu.disable_bypass=0" set because I
was guided to enable that by the error messages in older kernels ;-) .

Anyhow without "iommu.passthrough=1" (and without the patch from this thread
reverted) then the logs are being massively spammed with error messages:

~~~
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm_smmu_context_fault: 1697259 callbacks suppressed
~~~

This results in a relatively simple workstation (LX2 + nVidia GT-710 + USB
for networking) becoming unresponsive. How long to fail is a little
unpredictable. I assumed that the weight of such dense log messages
eventually gets into a timing pattern that prevented any useful
interrupts from being serviced... but that is only a guess.


Daniel.

2021-11-17 14:46:43

by Daniel Thompson

[permalink] [raw]
Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case

On Wed, Nov 17, 2021 at 03:03:04PM +0200, Laurentiu Tudor wrote:
> Hi Daniel,
>
> Sorry for the late reply, please see some comments inline.
>
> On 11/11/2021 7:23 PM, Daniel Thompson wrote:
> > Hi Laurentiu
> >
> > On Thu, Jul 15, 2021 at 05:07:12PM +0300, [email protected] wrote:
> >> From: Laurentiu Tudor <[email protected]>
> >>
> >> ACPI DMA configure API may return a defer status code, so handle it.
> >> On top of this, move the MC firmware resume after the DMA setup
> >> is completed to avoid crashing due to DMA setup not being done yet or
> >> being deferred.
> >>
> >> Signed-off-by: Laurentiu Tudor <[email protected]>
> >
> > I saw regressions on my Honeycomb LX2 (NXP LX2060A) when I switched to
> > v5.15. It seems like it results in so many sMMU errors that the system
> > cannot function correctly (it's only about a 75% chance the system will
> > boot to GUI and even if it does boot successfully the system will hang
> > up soon after).
> >
> > Bisect took me up a couple of blind alleys (mostly due to unrelated boot
> > problems in v5.14-rc2) but eventually led me to this patch as the cause.
> > Applying/unapplying this patch to a v5.14-rc3 tree will provoke/fix the
> > problem and reverting it against v5.15 also resolves the problem.
>
> That's pretty strange. Was the DPAA2 based networking working with this
> patch reverted?

I think so. I haven't studied the LX2K architecture too heavily but I
assume the 1G networking socket at the back of Honeycomb LX2 is DPAA2
based? If so, that 1G socket works with the patch reverted.

Note that I was already using "arm-smmu.disable_bypass=0" on this platform
since I was guided into doing that based on the error messages from
older kernels. It was only the new requirement to set iommu.passthrough
that caught me out.


> > Is there some specific firmware version required for this patch to work
> > correctly?
>
> It's a bit of a long story. As Jon already mentioned, we're waiting for
> maintainers to agree on the IORT RMR support on which we depend to
> declare in UEFI reserved memory regions for the MC firmware.
> For now, the recommended workaround is to use the
> "arm-smmu.disable_bypass=0" kernel boot arg.

I see. Looks like, after the traffic on the ML in October, that this
patch set is pending a v8 revision in order to stimulate the next round
of discussion?


Daniel.

2021-11-17 15:30:42

by Laurentiu Tudor

[permalink] [raw]
Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case



On 11/17/2021 3:59 PM, Daniel Thompson wrote:
> On Wed, Nov 17, 2021 at 03:07:51PM +0200, Laurentiu Tudor wrote:
>> On 11/12/2021 7:31 PM, Daniel Thompson wrote:
>>> On Thu, Nov 11, 2021 at 06:36:58PM +0100, Jon Nettleton wrote:
>>>> On Thu, Nov 11, 2021 at 6:23 PM Daniel Thompson
>>>> <[email protected]> wrote:
>>>>> Hi Laurentiu
>>>>>
>>>>> On Thu, Jul 15, 2021 at 05:07:12PM +0300, [email protected] wrote:
>>>>>> From: Laurentiu Tudor <[email protected]>
>>>>>>
>>>>>> ACPI DMA configure API may return a defer status code, so handle it.
>>>>>> On top of this, move the MC firmware resume after the DMA setup
>>>>>> is completed to avoid crashing due to DMA setup not being done yet or
>>>>>> being deferred.
>>>>>>
>>>>>> Signed-off-by: Laurentiu Tudor <[email protected]>
>>>>>
>>>>> I saw regressions on my Honeycomb LX2 (NXP LX2060A) when I switched to
>>>>> v5.15. It seems like it results in so many sMMU errors that the system
>>>>> cannot function correctly (it's only about a 75% chance the system will
>>>>> boot to GUI and even if it does boot successfully the system will hang
>>>>> up soon after).
>>>>>
>>>>> Bisect took me up a couple of blind alleys (mostly due to unrelated boot
>>>>> problems in v5.14-rc2) but eventually led me to this patch as the cause.
>>>>> Applying/unapplying this patch to a v5.14-rc3 tree will provoke/fix the
>>>>> problem and reverting it against v5.15 also resolves the problem.
>>>>>
>>>>> Is there some specific firmware version required for this patch to work
>>>>> correctly?
>>>>
>>>> This patch was merged as a requirement for operational on board networking.
>>>> This was merged as a prerequisite to landing the patches to support MDIO and
>>>> phy initialization in general.
>>>
>>> Interesting.
>>>
>>> I assumed the change of behaviour comes from properly handling
>>> -EPROBE_DEFER (which can hardly be regarded as a fault with the patch).
>>>
>>> Having said that the patch does not seem to be mandatory to get the 1G
>>> networking working on Honeycomb LX2 (running ACPI). By taking v5.15 and
>>> reverting as I shared previously, I am still able to access the network
>>> using the 1G port on the back of the unit (although I didn't do any
>>> performance tests).
>>>
>>>
>>>> The correct solution for the problem you are seeing is the ACPI
>>>> maintainers figuring out how to land the IORT RMR patchset. Until
>>>> that is done the only workaround is setting "arm-smmu.disable_bypass=0
>>>> iommu.passthrough=1" on the kernel commandline. The latter option is
>>>> required since 5.15 and I haven't had time or energy to figure out
>>>> why. The proper solution is to just land the IORT RMR patchset and
>>>> let HoneyComb run with the SMMU enabled.
>>>
>>> Thanks for the update. I'll probably adopt iommu.passthrough=1 for now.
>>> That allows me to adopt a distro kernel when it updates to v5.15.
>>
>> The "iommu.passthrough=1" kernel arg shouldn't be needed. By chance, do
>> you remember what errors you were seeing? What was failing?
>
> For all testing of v5.15 I had "arm-smmu.disable_bypass=0" set because I
> was guided to enable that by the error messages in older kernels ;-) .
>
> Anyhow without "iommu.passthrough=1" (and without the patch from this thread
> reverted) then the logs are being massively spammed with error messages:
>
> ~~~
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm_smmu_context_fault: 1697259 callbacks suppressed
> ~~~
>
> This results in a relatively simple workstation (LX2 + nVidia GT-710 + USB
> for networking) becoming unresponsive. How long to fail is a little
> unpredictable. I assumed that the weight of such dense log messages
> eventually gets into a timing pattern that prevented any useful
> interrupts from being serviced... but that is only a guess.
>

A few comments here:
- I'm suspecting that the PCI video card is triggering the smmu faults.
Would it be possible to give it a try with the card out and without
"iommu.passthrough=1"?
- the IOVAs look weird to me, they should look something like
0xffffxxxxxx or so. Maybe there are issues in the nvidia driver?
- Would it be possible to share a full boot log? I'm thinking that it
would be interesting to see how the devices are allocated in iommu groups.

---
Thanks & Best Regards, Laurentiu

2021-11-17 17:00:22

by Daniel Thompson

[permalink] [raw]
Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case

On Wed, Nov 17, 2021 at 05:30:32PM +0200, Laurentiu Tudor wrote:
> On 11/17/2021 3:59 PM, Daniel Thompson wrote:
> > On Wed, Nov 17, 2021 at 03:07:51PM +0200, Laurentiu Tudor wrote:
> >> On 11/12/2021 7:31 PM, Daniel Thompson wrote:
> >>> On Thu, Nov 11, 2021 at 06:36:58PM +0100, Jon Nettleton wrote:
> >>>> On Thu, Nov 11, 2021 at 6:23 PM Daniel Thompson
> >>>> <[email protected]> wrote:
> >>>> The correct solution for the problem you are seeing is the ACPI
> >>>> maintainers figuring out how to land the IORT RMR patchset. Until
> >>>> that is done the only workaround is setting "arm-smmu.disable_bypass=0
> >>>> iommu.passthrough=1" on the kernel commandline. The latter option is
> >>>> required since 5.15 and I haven't had time or energy to figure out
> >>>> why. The proper solution is to just land the IORT RMR patchset and
> >>>> let HoneyComb run with the SMMU enabled.
> >>>
> >>> Thanks for the update. I'll probably adopt iommu.passthrough=1 for now.
> >>> That allows me to adopt a distro kernel when it updates to v5.15.
> >>
> >> The "iommu.passthrough=1" kernel arg shouldn't be needed. By chance, do
> >> you remember what errors you were seeing? What was failing?
> >
> > For all testing of v5.15 I had "arm-smmu.disable_bypass=0" set because I
> > was guided to enable that by the error messages in older kernels ;-) .
> >
> > Anyhow without "iommu.passthrough=1" (and without the patch from this thread
> > reverted) then the logs are being massively spammed with error messages:
> >
> > ~~~
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm_smmu_context_fault: 1697259 callbacks suppressed
> > ~~~
> >
> > This results in a relatively simple workstation (LX2 + nVidia GT-710 + USB
> > for networking) becoming unresponsive. How long to fail is a little
> > unpredictable. I assumed that the weight of such dense log messages
> > eventually gets into a timing pattern that prevented any useful
> > interrupts from being serviced... but that is only a guess.
> >
>
> A few comments here:
> - I'm suspecting that the PCI video card is triggering the smmu faults.
> Would it be possible to give it a try with the card out and without
> "iommu.passthrough=1"?

The PCIe video card does not cause the smmu faults. These still manifest
when the card is removed (and with same IOVA).


> - the IOVAs look weird to me, they should look something like
> 0xffffxxxxxx or so. Maybe there are issues in the nvidia driver?

I guess there could be, but why would a problem that bisects down to
a change in the fsl-mc-bus initialization configuration alter the
behaviour of the PCIe graphics driver?


> - Would it be possible to share a full boot log? I'm thinking that it
> would be interesting to see how the devices are allocated in iommu groups.

See
https://gist.github.com/daniel-thompson/07489561f14965fd1af7d5bd4340f54b

It contains three files, all gathered with the GPU removed:

* Logs from unmodified v5.15 with iommu.passthrough=1 set
(networking is good).
* Logs from v5.15 patched with the revert I shared earlier in
the thread (networking is good).
* Logs from v5.15 without iommu.passthrough=1 set (many SMMU messages,
networking is broken).


Daniel.

2021-11-18 12:42:02

by Laurentiu Tudor

[permalink] [raw]
Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case



On 11/17/2021 7:00 PM, Daniel Thompson wrote:
> On Wed, Nov 17, 2021 at 05:30:32PM +0200, Laurentiu Tudor wrote:
>> On 11/17/2021 3:59 PM, Daniel Thompson wrote:
>>> On Wed, Nov 17, 2021 at 03:07:51PM +0200, Laurentiu Tudor wrote:
>>>> On 11/12/2021 7:31 PM, Daniel Thompson wrote:
>>>>> On Thu, Nov 11, 2021 at 06:36:58PM +0100, Jon Nettleton wrote:
>>>>>> On Thu, Nov 11, 2021 at 6:23 PM Daniel Thompson
>>>>>> <[email protected]> wrote:
>>>>>> The correct solution for the problem you are seeing is the ACPI
>>>>>> maintainers figuring out how to land the IORT RMR patchset. Until
>>>>>> that is done the only workaround is setting "arm-smmu.disable_bypass=0
>>>>>> iommu.passthrough=1" on the kernel commandline. The latter option is
>>>>>> required since 5.15 and I haven't had time or energy to figure out
>>>>>> why. The proper solution is to just land the IORT RMR patchset and
>>>>>> let HoneyComb run with the SMMU enabled.
>>>>>
>>>>> Thanks for the update. I'll probably adopt iommu.passthrough=1 for now.
>>>>> That allows me to adopt a distro kernel when it updates to v5.15.
>>>>
>>>> The "iommu.passthrough=1" kernel arg shouldn't be needed. By chance, do
>>>> you remember what errors you were seeing? What was failing?
>>>
>>> For all testing of v5.15 I had "arm-smmu.disable_bypass=0" set because I
>>> was guided to enable that by the error messages in older kernels ;-) .
>>>
>>> Anyhow without "iommu.passthrough=1" (and without the patch from this thread
>>> reverted) then the logs are being massively spammed with error messages:
>>>
>>> ~~~
>>> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
>>> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
>>> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
>>> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
>>> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
>>> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
>>> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
>>> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
>>> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
>>> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
>>> arm_smmu_context_fault: 1697259 callbacks suppressed
>>> ~~~
>>>
>>> This results in a relatively simple workstation (LX2 + nVidia GT-710 + USB
>>> for networking) becoming unresponsive. How long to fail is a little
>>> unpredictable. I assumed that the weight of such dense log messages
>>> eventually gets into a timing pattern that prevented any useful
>>> interrupts from being serviced... but that is only a guess.
>>>
>>
>> A few comments here:
>> - I'm suspecting that the PCI video card is triggering the smmu faults.
>> Would it be possible to give it a try with the card out and without
>> "iommu.passthrough=1"?
>
> The PCIe video card does not cause the smmu faults. These still manifest
> when the card is removed (and with same IOVA).
>
>
>> - the IOVAs look weird to me, they should look something like
>> 0xffffxxxxxx or so. Maybe there are issues in the nvidia driver?
>
> I guess there could be, but why would a problem that bisects down to
> a change in the fsl-mc-bus initialization configuration alter the
> behaviour of the PCIe graphics driver?
>
>
>> - Would it be possible to share a full boot log? I'm thinking that it
>> would be interesting to see how the devices are allocated in iommu groups.
>
> See
> https://gist.github.com/daniel-thompson/07489561f14965fd1af7d5bd4340f54b
>
> It contains three files, all gathered with the GPU removed:
>
> * Logs from unmodified v5.15 with iommu.passthrough=1 set
> (networking is good).
> * Logs from v5.15 patched with the revert I shared earlier in
> the thread (networking is good).
> * Logs from v5.15 without iommu.passthrough=1 set (many SMMU messages,
> networking is broken).
>

Ok, it appears there was some confusion on my side, sorry about it.
So, to summarize:
- the "arm-smmu.disable_bypass=0" workaround is not enough in the ACPI
scenario but works for DT based boot
- the result of reverting the patch is that the IOMMU for MC is no
longer configured (MC device does not get configured in SMMU) leading to
"arm-smmu.disable_bypass=0" being sufficient
- for ACPI to boot without "iommu.passthrough=1" the IORT RMR patches
are required
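
To make the second point easier to follow, here is a minimal userspace
sketch of the two probe orderings discussed in this thread. It is not the
driver code: struct fake_mc, fake_dma_configure(), fake_mc_resume() and
both probe_*() helpers are hypothetical stand-ins, the EPROBE_DEFER value
and the GCR1 bit positions are assumptions made for illustration, and the
"revert" path only models one reading of the summary above (the deferral
is ignored, so the MC never gets its DMA/SMMU configuration).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Kernel-internal error code; the value is assumed here for illustration. */
#define EPROBE_DEFER 517

/* Stop bits in the MC GCR1 register; bit positions are assumptions. */
#define GCR1_P1_STOP (UINT32_C(1) << 31)
#define GCR1_P2_STOP (UINT32_C(1) << 30)

/* Hypothetical model of the state touched by fsl_mc_bus_probe(). */
struct fake_mc {
	uint32_t gcr1;        /* stand-in for the FSL_MC_GCR1 register */
	int smmu_probes_left; /* DMA config defers until this reaches 0 */
	bool dma_configured;  /* true once the "SMMU" knows about the MC */
};

/* Hypothetical stand-in for acpi_dma_configure_id(). */
static int fake_dma_configure(struct fake_mc *mc)
{
	assert(mc != NULL);
	if (mc->smmu_probes_left > 0) {
		mc->smmu_probes_left--;
		return -EPROBE_DEFER;
	}
	mc->dma_configured = true;
	return 0;
}

/* Clear both stop bits, as the writel() in the patch does. */
static void fake_mc_resume(struct fake_mc *mc)
{
	mc->gcr1 &= ~(GCR1_P1_STOP | GCR1_P2_STOP);
}

/* Patch applied: honour the deferral, resume the MC only afterwards. */
static int probe_with_patch(struct fake_mc *mc)
{
	int error = fake_dma_configure(mc);

	if (error == -EPROBE_DEFER)
		return error; /* MC stays paused; probe is retried later */
	fake_mc_resume(mc);
	return 0;
}

/* Patch reverted: resume the MC first, a deferral is only warned about. */
static int probe_with_revert(struct fake_mc *mc)
{
	fake_mc_resume(mc);
	(void)fake_dma_configure(mc); /* error is not propagated */
	return 0;
}
```

With smmu_probes_left set to 1, probe_with_patch() returns -EPROBE_DEFER
on the first call and leaves the stop bits set, then configures DMA and
resumes the MC on the retry. probe_with_revert() resumes the MC at once and
never retries, so dma_configured stays false in this model.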

---
Best Regards, Laurentiu