Hi all,
On RT, the function iommu_dma_map_msi_msg expects to be called from preemptible
context. However, this is not always the case resulting a splat with
!CONFIG_DEBUG_ATOMIC_SLEEP:
[ 48.875777] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:974
[ 48.875779] in_atomic(): 1, irqs_disabled(): 128, pid: 2103, name: ip
[ 48.875782] INFO: lockdep is turned off.
[ 48.875784] irq event stamp: 10684
[ 48.875786] hardirqs last enabled at (10683): [<ffff0000110c8d70>] _raw_spin_unlock_irqrestore+0x88/0x90
[ 48.875791] hardirqs last disabled at (10684): [<ffff0000110c8b2c>] _raw_spin_lock_irqsave+0x24/0x68
[ 48.875796] softirqs last enabled at (0): [<ffff0000100ec590>] copy_process.isra.1.part.2+0x8d8/0x1970
[ 48.875801] softirqs last disabled at (0): [<0000000000000000>] (null)
[ 48.875805] Preemption disabled at:
[ 48.875805] [<ffff000010189ae8>] __setup_irq+0xd8/0x6c0
[ 48.875811] CPU: 2 PID: 2103 Comm: ip Not tainted 5.0.3-rt1-00007-g42ede9a0fed6 #45
[ 48.875815] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jan 23 2017
[ 48.875817] Call trace:
[ 48.875818] dump_backtrace+0x0/0x140
[ 48.875821] show_stack+0x14/0x20
[ 48.875823] dump_stack+0xa0/0xd4
[ 48.875827] ___might_sleep+0x16c/0x1f8
[ 48.875831] rt_spin_lock+0x5c/0x70
[ 48.875835] iommu_dma_map_msi_msg+0x5c/0x1d8
[ 48.875839] gicv2m_compose_msi_msg+0x3c/0x48
[ 48.875843] irq_chip_compose_msi_msg+0x40/0x58
[ 48.875846] msi_domain_activate+0x38/0x98
[ 48.875849] __irq_domain_activate_irq+0x58/0xa0
[ 48.875852] irq_domain_activate_irq+0x34/0x58
[ 48.875855] irq_activate+0x28/0x30
[ 48.875858] __setup_irq+0x2b0/0x6c0
[ 48.875861] request_threaded_irq+0xdc/0x188
[ 48.875865] sky2_setup_irq+0x44/0xf8
[ 48.875868] sky2_open+0x1a4/0x240
[ 48.875871] __dev_open+0xd8/0x188
[ 48.875874] __dev_change_flags+0x164/0x1f0
[ 48.875877] dev_change_flags+0x20/0x60
[ 48.875879] do_setlink+0x2a0/0xd30
[ 48.875882] __rtnl_newlink+0x5b4/0x6d8
[ 48.875885] rtnl_newlink+0x50/0x78
[ 48.875888] rtnetlink_rcv_msg+0x178/0x640
[ 48.875891] netlink_rcv_skb+0x58/0x118
[ 48.875893] rtnetlink_rcv+0x14/0x20
[ 48.875896] netlink_unicast+0x188/0x200
[ 48.875898] netlink_sendmsg+0x248/0x3d8
[ 48.875900] sock_sendmsg+0x18/0x40
[ 48.875904] ___sys_sendmsg+0x294/0x2d0
[ 48.875908] __sys_sendmsg+0x68/0xb8
[ 48.875911] __arm64_sys_sendmsg+0x20/0x28
[ 48.875914] el0_svc_common+0x90/0x118
[ 48.875918] el0_svc_handler+0x2c/0x80
[ 48.875922] el0_svc+0x8/0xc
This series is a first attempt to rework how MSI are mapped and composed
when an IOMMU is present.
I was able to test the changes in GICv2m and GICv3 ITS. I don't have
hardware for the other interrupt controllers.
Cheers,
Julien Grall (7):
genirq/msi: Add a new field in msi_desc to store an IOMMU cookie
iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts
irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg
irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg
irqchip/ls-scfg-msi: Don't map the MSI page in ls_scfg_msi_compose_msg
irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b, s}i_msg
iommu/dma-iommu: Remove iommu_dma_map_msi_msg()
drivers/iommu/dma-iommu.c | 45 ++++++++++++++++++++-------------------
drivers/irqchip/irq-gic-v2m.c | 8 ++++++-
drivers/irqchip/irq-gic-v3-its.c | 5 ++++-
drivers/irqchip/irq-gic-v3-mbi.c | 15 +++++++++++--
drivers/irqchip/irq-ls-scfg-msi.c | 7 +++++-
include/linux/dma-iommu.h | 20 +++++++++++++++--
include/linux/msi.h | 3 +++
7 files changed, 74 insertions(+), 29 deletions(-)
--
2.11.0
The function its_irq_compose_msi_msg may be called from non-preemptible
context. However, on RT, iommu_dma_map_msi_msg requires to be called
from a preemptible context.
A recent patch split the function iommu_dma_map_msi_msg in 2 functions:
one that should be called in preemptible context, the other does
not have any requirement.
This patch reworks the GICv3 ITS driver to avoid executing preemptible
code in non-preemptible context by preparing the MSI mapping when
allocating the MSI interrupt.
Signed-off-by: Julien Grall <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 7577755bdcf4..1e8e01797d9b 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1179,7 +1179,7 @@ static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
msg->address_hi = upper_32_bits(addr);
msg->data = its_get_event_id(d);
- iommu_dma_map_msi_msg(d->irq, msg);
+ iommu_dma_compose_msi_msg(d->irq, msg);
}
static int its_irq_set_irqchip_state(struct irq_data *d,
@@ -2566,6 +2566,7 @@ static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
{
msi_alloc_info_t *info = args;
struct its_device *its_dev = info->scratchpad[0].ptr;
+ struct its_node *its = its_dev->its;
irq_hw_number_t hwirq;
int err;
int i;
@@ -2574,6 +2575,8 @@ static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
if (err)
return err;
+ err = iommu_dma_prepare_msi(info->desc, its->get_msi_base(its_dev));
+
for (i = 0; i < nr_irqs; i++) {
err = its_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
--
2.11.0
The function ls_scfg_msi_compose_msg may be called from non-preemptible
context. However, on RT, iommu_dma_map_msi_msg requires to be called
from a preemptible context.
A recent patch split the function iommu_dma_map_msi_msg in 2 functions:
one that should be called in preemptible context, the other does
not have any requirement.
This patch reworks the FreeScale SCFG MSI driver to avoid executing
preemptible code in non-preemptible context by preparing the MSI mapping
when allocating the MSI interrupt.
Signed-off-by: Julien Grall <[email protected]>
---
drivers/irqchip/irq-ls-scfg-msi.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-ls-scfg-msi.c b/drivers/irqchip/irq-ls-scfg-msi.c
index c671b3212010..8099c5b1fcb5 100644
--- a/drivers/irqchip/irq-ls-scfg-msi.c
+++ b/drivers/irqchip/irq-ls-scfg-msi.c
@@ -100,7 +100,7 @@ static void ls_scfg_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
msg->data |= cpumask_first(mask);
}
- iommu_dma_map_msi_msg(data->irq, msg);
+ iommu_dma_compose_msi_msg(data->irq, msg);
}
static int ls_scfg_msi_set_affinity(struct irq_data *irq_data,
@@ -141,6 +141,7 @@ static int ls_scfg_msi_domain_irq_alloc(struct irq_domain *domain,
unsigned int nr_irqs,
void *args)
{
+ msi_alloc_info_t *info = args;
struct ls_scfg_msi *msi_data = domain->host_data;
int pos, err = 0;
@@ -157,6 +158,10 @@ static int ls_scfg_msi_domain_irq_alloc(struct irq_domain *domain,
if (err)
return err;
+ err = iommu_dma_prepare_msi(info->desc, msi_data->msiir_addr);
+ if (err)
+ return err;
+
irq_domain_set_info(domain, virq, pos,
&ls_scfg_msi_parent_chip, msi_data,
handle_simple_irq, NULL, NULL);
--
2.11.0
The functions mbi_compose_m{b, s}i_msg may be called from non-preemptible
context. However, on RT, iommu_dma_map_msi_msg requires to be called
from a preemptible context.
A recent patch split the function iommu_dma_map_msi_msg in 2 functions:
one that should be called in preemptible context, the other does
not have any requirement.
This patch reworks the GICv3 MBI driver to avoid executing preemptible
code in non-preemptible context by preparing the MSI mappings when
allocating the MSI interrupt.
Signed-off-by: Julien Grall <[email protected]>
---
drivers/irqchip/irq-gic-v3-mbi.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v3-mbi.c b/drivers/irqchip/irq-gic-v3-mbi.c
index fbfa7ff6deb1..c812b80e3ce9 100644
--- a/drivers/irqchip/irq-gic-v3-mbi.c
+++ b/drivers/irqchip/irq-gic-v3-mbi.c
@@ -84,6 +84,7 @@ static void mbi_free_msi(struct mbi_range *mbi, unsigned int hwirq,
static int mbi_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
unsigned int nr_irqs, void *args)
{
+ msi_alloc_info_t *info = args;
struct mbi_range *mbi = NULL;
int hwirq, offset, i, err = 0;
@@ -104,6 +105,16 @@ static int mbi_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
hwirq = mbi->spi_start + offset;
+ err = iommu_dma_prepare_msi(info->desc,
+ mbi_phys_base + GICD_CLRSPI_NSR);
+ if (err)
+ return err;
+
+ err = iommu_dma_prepare_msi(info->desc,
+ mbi_phys_base + GICD_SETSPI_NSR);
+ if (err)
+ return err;
+
for (i = 0; i < nr_irqs; i++) {
err = mbi_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
@@ -142,7 +153,7 @@ static void mbi_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
msg[0].address_lo = lower_32_bits(mbi_phys_base + GICD_SETSPI_NSR);
msg[0].data = data->parent_data->hwirq;
- iommu_dma_map_msi_msg(data->irq, msg);
+ iommu_dma_compose_msi_msg(data->irq, msg);
}
#ifdef CONFIG_PCI_MSI
@@ -202,7 +213,7 @@ static void mbi_compose_mbi_msg(struct irq_data *data, struct msi_msg *msg)
msg[1].address_lo = lower_32_bits(mbi_phys_base + GICD_CLRSPI_NSR);
msg[1].data = data->parent_data->hwirq;
- iommu_dma_map_msi_msg(data->irq, &msg[1]);
+ iommu_dma_compose_msi_msg(data->irq, &msg[1]);
}
/* Platform-MSI specific irqchip */
--
2.11.0
On RT, the function iommu_dma_map_msi_msg may be called from
non-preemptible context. This will lead to a splat with
CONFIG_DEBUG_ATOMIC_SLEEP as the function is using spin_lock
(they can sleep on RT).
The function iommu_dma_map_msi_msg is used to map the MSI page in the
IOMMU PT and update the MSI message with the IOVA.
Only the part to lookup for the MSI page requires to be called in
preemptible context. As the MSI page cannot change over the lifecycle
of the MSI interrupt, the lookup can be cached and re-used later on.
This patch split the function iommu_dma_map_msi_msg in two new
functions:
- iommu_dma_prepare_msi: This function will prepare the mapping in
the IOMMU and store the cookie in the structure msi_desc. This
function should be called in preemptible context.
- iommu_dma_compose_msi_msg: This function will update the MSI
message with the IOVA when the device is behind an IOMMU.
Signed-off-by: Julien Grall <[email protected]>
---
drivers/iommu/dma-iommu.c | 43 ++++++++++++++++++++++++++++++++-----------
include/linux/dma-iommu.h | 21 +++++++++++++++++++++
2 files changed, 53 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 77aabe637a60..f5c1f1685095 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -888,17 +888,17 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
return NULL;
}
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
{
- struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
+ struct device *dev = msi_desc_to_dev(desc);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_dma_cookie *cookie;
- struct iommu_dma_msi_page *msi_page;
- phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
unsigned long flags;
- if (!domain || !domain->iova_cookie)
- return;
+ if (!domain || !domain->iova_cookie) {
+ desc->iommu_cookie = NULL;
+ return 0;
+ }
cookie = domain->iova_cookie;
@@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
* of an MSI from within an IPI handler.
*/
spin_lock_irqsave(&cookie->msi_lock, flags);
- msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
+ desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
spin_unlock_irqrestore(&cookie->msi_lock, flags);
- if (WARN_ON(!msi_page)) {
+ return (desc->iommu_cookie) ? 0 : -ENOMEM;
+}
+
+void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg)
+{
+ struct msi_desc *desc = irq_get_msi_desc(irq);
+ struct device *dev = msi_desc_to_dev(desc);
+ const struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+ const struct iommu_dma_msi_page *msi_page = desc->iommu_cookie;
+
+ if (!domain || !domain->iova_cookie || WARN_ON(!msi_page))
+ return;
+
+ msg->address_hi = upper_32_bits(msi_page->iova);
+ msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
+ msg->address_lo += lower_32_bits(msi_page->iova);
+}
+
+void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+{
+ struct msi_desc *desc = irq_get_msi_desc(irq);
+ phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
+
+ if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
/*
* We're called from a void callback, so the best we can do is
* 'fail' by filling the message with obviously bogus values.
@@ -922,8 +945,6 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msg->address_lo = ~0U;
msg->data = ~0U;
} else {
- msg->address_hi = upper_32_bits(msi_page->iova);
- msg->address_lo &= cookie_msi_granule(cookie) - 1;
- msg->address_lo += lower_32_bits(msi_page->iova);
+ iommu_dma_compose_msi_msg(irq, msg);
}
}
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index e760dc5d1fa8..2f4b2c2cc859 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -71,12 +71,23 @@ void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
size_t size, enum dma_data_direction dir, unsigned long attrs);
/* The DMA API isn't _quite_ the whole story, though... */
+/*
+ * Map the MSI page in the IOMMU device and store it in @desc
+ *
+ * Return 0 if succeeded other an error if the preparation has failed.
+ */
+int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr);
+
+/* Update the MSI message if required. */
+void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg);
+
void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
#else
struct iommu_domain;
+struct msi_desc;
struct msi_msg;
struct device;
@@ -99,6 +110,16 @@ static inline void iommu_put_dma_cookie(struct iommu_domain *domain)
{
}
+static inline int iommu_dma_prepare_msi(struct msi_desc *desc,
+ phys_addr_t msi_addr)
+{
+ return 0;
+}
+
+static inline void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg)
+{
+}
+
static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
{
}
--
2.11.0
The function gicv2m_compose_msi_msg may be called from non-preemptible
context. However, on RT, iommu_dma_map_msi_msg requires to be called
from a preemptible context.
A recent patch split the function iommu_dma_map_msi_msg in 2 functions:
one that should be called in preemptible context, the other does
not have any requirement.
This patch reworks the gicv2m driver to avoid executing preemptible code
in non-preemptible context by preparing the MSI mapping when allocating
the MSI interrupt.
Signed-off-by: Julien Grall <[email protected]>
---
drivers/irqchip/irq-gic-v2m.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
index f5fe0100f9ff..e5372acd92c9 100644
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -110,7 +110,7 @@ static void gicv2m_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
msg->data -= v2m->spi_offset;
- iommu_dma_map_msi_msg(data->irq, msg);
+ iommu_dma_compose_msi_msg(data->irq, msg);
}
static struct irq_chip gicv2m_irq_chip = {
@@ -167,6 +167,7 @@ static void gicv2m_unalloc_msi(struct v2m_data *v2m, unsigned int hwirq,
static int gicv2m_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
unsigned int nr_irqs, void *args)
{
+ msi_alloc_info_t *info = args;
struct v2m_data *v2m = NULL, *tmp;
int hwirq, offset, i, err = 0;
@@ -186,6 +187,11 @@ static int gicv2m_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
hwirq = v2m->spi_start + offset;
+ err = iommu_dma_prepare_msi(info->desc,
+ v2m->res.start + V2M_MSI_SETSPI_NS);
+ if (err)
+ return err;
+
for (i = 0; i < nr_irqs; i++) {
err = gicv2m_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
--
2.11.0
When an MSI doorbell is located downstream of an IOMMU, it is required
to swizzle the physical address with an appropriately-mapped IOVA for any
device attached to one of our DMA ops domain.
At the moment, the allocation of the mapping may be done when composing
the message. However, the composing may be done in non-preemtible
context while the allocation requires to be called from preemptible
context.
A follow-up patch will split the current logic in two functions
requiring to keep an IOMMU cookie per MSI.
This patch introduces a new field in msi_desc to store an IOMMU cookie
when CONFIG_IOMMU_DMA is selected.
Signed-off-by: Julien Grall <[email protected]>
---
include/linux/msi.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 7e9b81c3b50d..d7907feef1bb 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -77,6 +77,9 @@ struct msi_desc {
struct device *dev;
struct msi_msg msg;
struct irq_affinity_desc *affinity;
+#ifdef CONFIG_IOMMU_DMA
+ const void *iommu_cookie;
+#endif
union {
/* PCI MSI/X specific data */
--
2.11.0
A recent patch introduced two new functions to replace
iommu_dma_map_msi_msg() to avoid executing preemptible code in
non-preemptible context.
All the existings callers are now using the two new functions, so
iommu_dma_map_msi_msg() can be removed.
Signed-off-by: Julien Grall <[email protected]>
---
drivers/iommu/dma-iommu.c | 20 --------------------
include/linux/dma-iommu.h | 5 -----
2 files changed, 25 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f5c1f1685095..fdc8ded62e87 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -928,23 +928,3 @@ void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg)
msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
msg->address_lo += lower_32_bits(msi_page->iova);
}
-
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
- struct msi_desc *desc = irq_get_msi_desc(irq);
- phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
-
- if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
- /*
- * We're called from a void callback, so the best we can do is
- * 'fail' by filling the message with obviously bogus values.
- * Since we got this far due to an IOMMU being present, it's
- * not like the existing address would have worked anyway...
- */
- msg->address_hi = ~0U;
- msg->address_lo = ~0U;
- msg->data = ~0U;
- } else {
- iommu_dma_compose_msi_msg(irq, msg);
- }
-}
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 2f4b2c2cc859..4fe2b2fb19bf 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -81,7 +81,6 @@ int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr);
/* Update the MSI message if required. */
void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg);
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
#else
@@ -120,10 +119,6 @@ static inline void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg)
{
}
-static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
-}
-
static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
{
}
--
2.11.0
On Thu, 18 Apr 2019, Julien Grall wrote:
> When an MSI doorbell is located downstream of an IOMMU, it is required
> to swizzle the physical address with an appropriately-mapped IOVA for any
> device attached to one of our DMA ops domain.
>
> At the moment, the allocation of the mapping may be done when composing
> the message. However, the composing may be done in non-preemtible
> context while the allocation requires to be called from preemptible
> context.
>
> A follow-up patch will split the current logic in two functions
> requiring to keep an IOMMU cookie per MSI.
>
> This patch introduces a new field in msi_desc to store an IOMMU cookie
> when CONFIG_IOMMU_DMA is selected.
# git grep 'This patch' Documentation/process/
Applied to the whole series.
Thanks
tglx
On Thu, Apr 18, 2019 at 06:26:06PM +0100, Julien Grall wrote:
> +int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
> {
> + struct device *dev = msi_desc_to_dev(desc);
> struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> struct iommu_dma_cookie *cookie;
> unsigned long flags;
>
> + if (!domain || !domain->iova_cookie) {
> + desc->iommu_cookie = NULL;
> + return 0;
> + }
>
> cookie = domain->iova_cookie;
>
> @@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
> * of an MSI from within an IPI handler.
> */
> spin_lock_irqsave(&cookie->msi_lock, flags);
> + desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
> spin_unlock_irqrestore(&cookie->msi_lock, flags);
>
> + return (desc->iommu_cookie) ? 0 : -ENOMEM;
No need for the braces. Also I personally find a:
if (!desc->iommu_cookie)
return -ENOMEM;
return 0;
much more readable, but that might just be personal preference.
Hi Thomas,
On 4/18/19 8:28 PM, Thomas Gleixner wrote:
> On Thu, 18 Apr 2019, Julien Grall wrote:
>
>> When an MSI doorbell is located downstream of an IOMMU, it is required
>> to swizzle the physical address with an appropriately-mapped IOVA for any
>> device attached to one of our DMA ops domain.
>>
>> At the moment, the allocation of the mapping may be done when composing
>> the message. However, the composing may be done in non-preemtible
>> context while the allocation requires to be called from preemptible
>> context.
>>
>> A follow-up patch will split the current logic in two functions
>> requiring to keep an IOMMU cookie per MSI.
>>
>> This patch introduces a new field in msi_desc to store an IOMMU cookie
>> when CONFIG_IOMMU_DMA is selected.
>
> # git grep 'This patch' Documentation/process/
>
> Applied to the whole series.
Sorry for that. I will rework all the commit messages and resend the series.
Cheers,
--
Julien Grall
Hi Julien,
On 18/04/2019 18:26, Julien Grall wrote:
> When an MSI doorbell is located downstream of an IOMMU, it is required
> to swizzle the physical address with an appropriately-mapped IOVA for any
> device attached to one of our DMA ops domain.
>
> At the moment, the allocation of the mapping may be done when composing
> the message. However, the composing may be done in non-preemtible
> context while the allocation requires to be called from preemptible
> context.
>
> A follow-up patch will split the current logic in two functions
> requiring to keep an IOMMU cookie per MSI.
>
> This patch introduces a new field in msi_desc to store an IOMMU cookie
> when CONFIG_IOMMU_DMA is selected.
>
> Signed-off-by: Julien Grall <[email protected]>
> ---
> include/linux/msi.h | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/include/linux/msi.h b/include/linux/msi.h
> index 7e9b81c3b50d..d7907feef1bb 100644
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -77,6 +77,9 @@ struct msi_desc {
> struct device *dev;
> struct msi_msg msg;
> struct irq_affinity_desc *affinity;
> +#ifdef CONFIG_IOMMU_DMA
> + const void *iommu_cookie;
> +#endif
>
> union {
> /* PCI MSI/X specific data */
>
Given that this is the only member in this structure that is dependent
on a config option, you could also add a couple of accessors that would
do nothing when IOMMU_DMA is not selected (and use that in the DMA code).
Thanks,
M.
--
Jazz is not dead. It just smells funny...
On 4/23/19 11:23 AM, Marc Zyngier wrote:
> Hi Julien,
Hi Marc,
> On 18/04/2019 18:26, Julien Grall wrote:
>> When an MSI doorbell is located downstream of an IOMMU, it is required
>> to swizzle the physical address with an appropriately-mapped IOVA for any
>> device attached to one of our DMA ops domain.
>>
>> At the moment, the allocation of the mapping may be done when composing
>> the message. However, the composing may be done in non-preemtible
>> context while the allocation requires to be called from preemptible
>> context.
>>
>> A follow-up patch will split the current logic in two functions
>> requiring to keep an IOMMU cookie per MSI.
>>
>> This patch introduces a new field in msi_desc to store an IOMMU cookie
>> when CONFIG_IOMMU_DMA is selected.
>>
>> Signed-off-by: Julien Grall <[email protected]>
>> ---
>> include/linux/msi.h | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/include/linux/msi.h b/include/linux/msi.h
>> index 7e9b81c3b50d..d7907feef1bb 100644
>> --- a/include/linux/msi.h
>> +++ b/include/linux/msi.h
>> @@ -77,6 +77,9 @@ struct msi_desc {
>> struct device *dev;
>> struct msi_msg msg;
>> struct irq_affinity_desc *affinity;
>> +#ifdef CONFIG_IOMMU_DMA
>> + const void *iommu_cookie;
>> +#endif
>>
>> union {
>> /* PCI MSI/X specific data */
>>
>
> Given that this is the only member in this structure that is dependent
> on a config option, you could also add a couple of accessors that would
> do nothing when IOMMU_DMA is not selected (and use that in the DMA code).
I haven't seen any use of the helpers so far because the DMA code is
also protected by IOMMU_DMA.
I can add the helpers in the next version if you see any use outside of
the DMA code.
Cheers,
--
Julien Grall
On 18/04/2019 18:26, Julien Grall wrote:
> On RT, the function iommu_dma_map_msi_msg may be called from
> non-preemptible context. This will lead to a splat with
> CONFIG_DEBUG_ATOMIC_SLEEP as the function is using spin_lock
> (they can sleep on RT).
>
> The function iommu_dma_map_msi_msg is used to map the MSI page in the
> IOMMU PT and update the MSI message with the IOVA.
>
> Only the part to lookup for the MSI page requires to be called in
> preemptible context. As the MSI page cannot change over the lifecycle
> of the MSI interrupt, the lookup can be cached and re-used later on.
>
> This patch split the function iommu_dma_map_msi_msg in two new
> functions:
> - iommu_dma_prepare_msi: This function will prepare the mapping in
> the IOMMU and store the cookie in the structure msi_desc. This
> function should be called in preemptible context.
> - iommu_dma_compose_msi_msg: This function will update the MSI
> message with the IOVA when the device is behind an IOMMU.
>
> Signed-off-by: Julien Grall <[email protected]>
> ---
> drivers/iommu/dma-iommu.c | 43 ++++++++++++++++++++++++++++++++-----------
> include/linux/dma-iommu.h | 21 +++++++++++++++++++++
> 2 files changed, 53 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 77aabe637a60..f5c1f1685095 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -888,17 +888,17 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
> return NULL;
> }
>
> -void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
> +int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
I quite like the idea of moving from having an irq to having an msi_desc
passed to the IOMMU layer...
> {
> - struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
> + struct device *dev = msi_desc_to_dev(desc);
> struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> struct iommu_dma_cookie *cookie;
> - struct iommu_dma_msi_page *msi_page;
> - phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
> unsigned long flags;
>
> - if (!domain || !domain->iova_cookie)
> - return;
> + if (!domain || !domain->iova_cookie) {
> + desc->iommu_cookie = NULL;
> + return 0;
> + }
>
> cookie = domain->iova_cookie;
>
> @@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
> * of an MSI from within an IPI handler.
> */
> spin_lock_irqsave(&cookie->msi_lock, flags);
> - msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
> + desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
> spin_unlock_irqrestore(&cookie->msi_lock, flags);
>
> - if (WARN_ON(!msi_page)) {
> + return (desc->iommu_cookie) ? 0 : -ENOMEM;
> +}
> +
> +void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg)
... but I'd like it even better if it was uniform. Can you please move
the irq_get_msi_desc() to the callers of iommu_dma_compose_msi_msg(),
and make both functions take a msi_desc?
Thanks,
M.
--
Jazz is not dead. It just smells funny...
Hi,
On 4/23/19 8:08 AM, Christoph Hellwig wrote:
> On Thu, Apr 18, 2019 at 06:26:06PM +0100, Julien Grall wrote:
>> +int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
>> {
>> + struct device *dev = msi_desc_to_dev(desc);
>> struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
>> struct iommu_dma_cookie *cookie;
>> unsigned long flags;
>>
>> + if (!domain || !domain->iova_cookie) {
>> + desc->iommu_cookie = NULL;
>> + return 0;
>> + }
>>
>> cookie = domain->iova_cookie;
>>
>> @@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
>> * of an MSI from within an IPI handler.
>> */
>> spin_lock_irqsave(&cookie->msi_lock, flags);
>> + desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
>> spin_unlock_irqrestore(&cookie->msi_lock, flags);
>>
>> + return (desc->iommu_cookie) ? 0 : -ENOMEM;
>
> No need for the braces. Also I personally find a:
>
> if (!desc->iommu_cookie)
> return -ENOMEM;
> return 0;
>
> much more readable, but that might just be personal preference.
I am happy either way. I will use your suggestion in the next version.
Cheers,
--
Julien Grall
On 23/04/2019 11:51, Julien Grall wrote:
> On 4/23/19 11:23 AM, Marc Zyngier wrote:
>> Hi Julien,
>
> Hi Marc,
>
>> On 18/04/2019 18:26, Julien Grall wrote:
>>> When an MSI doorbell is located downstream of an IOMMU, it is required
>>> to swizzle the physical address with an appropriately-mapped IOVA for any
>>> device attached to one of our DMA ops domain.
>>>
>>> At the moment, the allocation of the mapping may be done when composing
>>> the message. However, the composing may be done in non-preemtible
>>> context while the allocation requires to be called from preemptible
>>> context.
>>>
>>> A follow-up patch will split the current logic in two functions
>>> requiring to keep an IOMMU cookie per MSI.
>>>
>>> This patch introduces a new field in msi_desc to store an IOMMU cookie
>>> when CONFIG_IOMMU_DMA is selected.
>>>
>>> Signed-off-by: Julien Grall <[email protected]>
>>> ---
>>> include/linux/msi.h | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/include/linux/msi.h b/include/linux/msi.h
>>> index 7e9b81c3b50d..d7907feef1bb 100644
>>> --- a/include/linux/msi.h
>>> +++ b/include/linux/msi.h
>>> @@ -77,6 +77,9 @@ struct msi_desc {
>>> struct device *dev;
>>> struct msi_msg msg;
>>> struct irq_affinity_desc *affinity;
>>> +#ifdef CONFIG_IOMMU_DMA
>>> + const void *iommu_cookie;
>>> +#endif
>>>
>>> union {
>>> /* PCI MSI/X specific data */
>>>
>>
>> Given that this is the only member in this structure that is dependent
>> on a config option, you could also add a couple of accessors that would
>> do nothing when IOMMU_DMA is not selected (and use that in the DMA code).
>
> I haven't seen any use of the helpers so far because the DMA code is
> also protected by IOMMU_DMA.
>
> I can add the helpers in the next version if you see any use outside of
> the DMA code.
There may not be any user user yet, but I'd surely like to see the
accessors. This isn't very different from the stub functions you add in
patch #2.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
On 23/04/2019 12:46, Marc Zyngier wrote:
> On 23/04/2019 11:51, Julien Grall wrote:
>> On 4/23/19 11:23 AM, Marc Zyngier wrote:
>>> Hi Julien,
>>
>> Hi Marc,
>>
>>> On 18/04/2019 18:26, Julien Grall wrote:
>>>> When an MSI doorbell is located downstream of an IOMMU, it is required
>>>> to swizzle the physical address with an appropriately-mapped IOVA for any
>>>> device attached to one of our DMA ops domain.
>>>>
>>>> At the moment, the allocation of the mapping may be done when composing
>>>> the message. However, the composing may be done in non-preemtible
>>>> context while the allocation requires to be called from preemptible
>>>> context.
>>>>
>>>> A follow-up patch will split the current logic in two functions
>>>> requiring to keep an IOMMU cookie per MSI.
>>>>
>>>> This patch introduces a new field in msi_desc to store an IOMMU cookie
>>>> when CONFIG_IOMMU_DMA is selected.
>>>>
>>>> Signed-off-by: Julien Grall <[email protected]>
>>>> ---
>>>> include/linux/msi.h | 3 +++
>>>> 1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/include/linux/msi.h b/include/linux/msi.h
>>>> index 7e9b81c3b50d..d7907feef1bb 100644
>>>> --- a/include/linux/msi.h
>>>> +++ b/include/linux/msi.h
>>>> @@ -77,6 +77,9 @@ struct msi_desc {
>>>> struct device *dev;
>>>> struct msi_msg msg;
>>>> struct irq_affinity_desc *affinity;
>>>> +#ifdef CONFIG_IOMMU_DMA
>>>> + const void *iommu_cookie;
>>>> +#endif
>>>>
>>>> union {
>>>> /* PCI MSI/X specific data */
>>>>
>>>
>>> Given that this is the only member in this structure that is dependent
>>> on a config option, you could also add a couple of accessors that would
>>> do nothing when IOMMU_DMA is not selected (and use that in the DMA code).
>>
>> I haven't seen any use of the helpers so far because the DMA code is
>> also protected by IOMMU_DMA.
>>
>> I can add the helpers in the next version if you see any use outside of
>> the DMA code.
>
> There may not be any user user yet, but I'd surely like to see the
> accessors. This isn't very different from the stub functions you add in
> patch #2.
If you foresee this being useful in general, do you reckon it would be
worth decoupling it under its own irqchip-layer Kconfig which can then
be selected by IOMMU_DMA?
Robin.
On 23/04/2019 14:19, Robin Murphy wrote:
> On 23/04/2019 12:46, Marc Zyngier wrote:
>> On 23/04/2019 11:51, Julien Grall wrote:
>>> On 4/23/19 11:23 AM, Marc Zyngier wrote:
>>>> Hi Julien,
>>>
>>> Hi Marc,
>>>
>>>> On 18/04/2019 18:26, Julien Grall wrote:
>>>>> When an MSI doorbell is located downstream of an IOMMU, it is required
>>>>> to swizzle the physical address with an appropriately-mapped IOVA for any
>>>>> device attached to one of our DMA ops domain.
>>>>>
>>>>> At the moment, the allocation of the mapping may be done when composing
>>>>> the message. However, the composing may be done in non-preemtible
>>>>> context while the allocation requires to be called from preemptible
>>>>> context.
>>>>>
>>>>> A follow-up patch will split the current logic in two functions
>>>>> requiring to keep an IOMMU cookie per MSI.
>>>>>
>>>>> This patch introduces a new field in msi_desc to store an IOMMU cookie
>>>>> when CONFIG_IOMMU_DMA is selected.
>>>>>
>>>>> Signed-off-by: Julien Grall <[email protected]>
>>>>> ---
>>>>> include/linux/msi.h | 3 +++
>>>>> 1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/msi.h b/include/linux/msi.h
>>>>> index 7e9b81c3b50d..d7907feef1bb 100644
>>>>> --- a/include/linux/msi.h
>>>>> +++ b/include/linux/msi.h
>>>>> @@ -77,6 +77,9 @@ struct msi_desc {
>>>>> struct device *dev;
>>>>> struct msi_msg msg;
>>>>> struct irq_affinity_desc *affinity;
>>>>> +#ifdef CONFIG_IOMMU_DMA
>>>>> + const void *iommu_cookie;
>>>>> +#endif
>>>>>
>>>>> union {
>>>>> /* PCI MSI/X specific data */
>>>>>
>>>>
>>>> Given that this is the only member in this structure that is dependent
>>>> on a config option, you could also add a couple of accessors that would
>>>> do nothing when IOMMU_DMA is not selected (and use that in the DMA code).
>>>
>>> I haven't seen any use of the helpers so far because the DMA code is
>>> also protected by IOMMU_DMA.
>>>
>>> I can add the helpers in the next version if you see any use outside of
>>> the DMA code.
>>
>> There may not be any user user yet, but I'd surely like to see the
>> accessors. This isn't very different from the stub functions you add in
>> patch #2.
>
> If you foresee this being useful in general, do you reckon it would be
> worth decoupling it under its own irqchip-layer Kconfig which can then
> be selected by IOMMU_DMA?
I think that'd be a useful thing to do, as most architectures do not
require this dynamic mapping of MSIs.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
Hi Marc,
On 23/04/2019 11:54, Marc Zyngier wrote:
> On 18/04/2019 18:26, Julien Grall wrote:
>> On RT, the function iommu_dma_map_msi_msg may be called from
>> non-preemptible context. This will lead to a splat with
>> CONFIG_DEBUG_ATOMIC_SLEEP as the function is using spin_lock
>> (they can sleep on RT).
>>
>> The function iommu_dma_map_msi_msg is used to map the MSI page in the
>> IOMMU PT and update the MSI message with the IOVA.
>>
>> Only the part to lookup for the MSI page requires to be called in
>> preemptible context. As the MSI page cannot change over the lifecycle
>> of the MSI interrupt, the lookup can be cached and re-used later on.
>>
>> This patch split the function iommu_dma_map_msi_msg in two new
>> functions:
>> - iommu_dma_prepare_msi: This function will prepare the mapping in
>> the IOMMU and store the cookie in the structure msi_desc. This
>> function should be called in preemptible context.
>> - iommu_dma_compose_msi_msg: This function will update the MSI
>> message with the IOVA when the device is behind an IOMMU.
>>
>> Signed-off-by: Julien Grall <[email protected]>
>> ---
>> drivers/iommu/dma-iommu.c | 43 ++++++++++++++++++++++++++++++++-----------
>> include/linux/dma-iommu.h | 21 +++++++++++++++++++++
>> 2 files changed, 53 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 77aabe637a60..f5c1f1685095 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -888,17 +888,17 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
>> return NULL;
>> }
>>
>> -void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
>> +int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
>
> I quite like the idea of moving from having an irq to having an msi_desc
> passed to the IOMMU layer...
>
>> {
>> - struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
>> + struct device *dev = msi_desc_to_dev(desc);
>> struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
>> struct iommu_dma_cookie *cookie;
>> - struct iommu_dma_msi_page *msi_page;
>> - phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
>> unsigned long flags;
>>
>> - if (!domain || !domain->iova_cookie)
>> - return;
>> + if (!domain || !domain->iova_cookie) {
>> + desc->iommu_cookie = NULL;
>> + return 0;
>> + }
>>
>> cookie = domain->iova_cookie;
>>
>> @@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
>> * of an MSI from within an IPI handler.
>> */
>> spin_lock_irqsave(&cookie->msi_lock, flags);
>> - msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
>> + desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
>> spin_unlock_irqrestore(&cookie->msi_lock, flags);
>>
>> - if (WARN_ON(!msi_page)) {
>> + return (desc->iommu_cookie) ? 0 : -ENOMEM;
>> +}
>> +
>> +void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg)
>
> ... but I'd like it even better if it was uniform. Can you please move
> the irq_get_msi_desc() to the callers of iommu_dma_compose_msi_msg(),
> and make both functions take a msi_desc?
Make sense. I will modify iommu_dma_compose_msi_msg to take a msi_desc in parameter.
Cheers,
--
Julien Grall