2014-07-08 16:22:44

by Suman Anna

Subject: [PATCH 0/2] couple of generic remoteproc enhancements

Hi Ohad,

The following are a couple of generic enhancements to the remoteproc core.
The changes were developed to support adding new rproc drivers for processors
like the WkupM3 processor on AM335x/AM437x, but are generic enough to apply to
non-OMAP devices as well.

The first patch allows both MMU-enabled processors like the OMAP IPU/DSP and
other non-MMU processors to be supported on multi-arch kernel images.

The second patch was developed as part of enabling internal memory loading
of firmware segments on OMAP DSPs, but will also be used by the WkupM3
remoteproc driver that Dave Gerlach is going to submit soon.

The two patches are based on 3.16-rc4 and are technically independent of each
other, but I am submitting them together as the WkupM3 remoteproc driver needs
both of them.

regards
Suman

Suman Anna (2):
remoteproc: use a flag to detect the presence of IOMMU
remoteproc: add support to handle internal memories

drivers/remoteproc/omap_remoteproc.c | 6 +++
drivers/remoteproc/remoteproc_core.c | 100 ++++++++++++++++++++++++++++++-----
include/linux/remoteproc.h | 45 +++++++++++++++-
3 files changed, 136 insertions(+), 15 deletions(-)

--
2.0.0


2014-07-08 16:22:57

by Suman Anna

Subject: [PATCH 1/2] remoteproc: use a flag to detect the presence of IOMMU

The remoteproc driver core currently relies on iommu_present() on
the bus the device is on, to perform MMU management. However, this
logic doesn't scale for multi-arch, especially for processors that
do not have an IOMMU. Replace this logic instead by using a h/w
capability flag for the presence of IOMMU in the rproc structure.

The individual platform implementations are required to set this
flag appropriately. The default setting is to not have an MMU.

The OMAP remoteproc driver is updated as well, to maintain its
functionality with the IOMMU detection logic change in this
patch.

Cc: Sjur Brændeland <[email protected]>
Cc: Robert Tivy <[email protected]>
Signed-off-by: Suman Anna <[email protected]>
---
drivers/remoteproc/omap_remoteproc.c | 6 ++++++
drivers/remoteproc/remoteproc_core.c | 15 ++-------------
include/linux/remoteproc.h | 2 ++
3 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/remoteproc/omap_remoteproc.c b/drivers/remoteproc/omap_remoteproc.c
index 5168972..858abf0 100644
--- a/drivers/remoteproc/omap_remoteproc.c
+++ b/drivers/remoteproc/omap_remoteproc.c
@@ -199,6 +199,12 @@ static int omap_rproc_probe(struct platform_device *pdev)

oproc = rproc->priv;
oproc->rproc = rproc;
+ /*
+ * All existing OMAP IPU and DSP processors do have an MMU, and
+ * are expected to use MMU, so this statement suffices.
+ * XXX: Replace this logic if and when a need arises.
+ */
+ rproc->has_iommu = true;

platform_set_drvdata(pdev, rproc);

diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index 3cd85a6..11cdb11 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -94,19 +94,8 @@ static int rproc_enable_iommu(struct rproc *rproc)
struct device *dev = rproc->dev.parent;
int ret;

- /*
- * We currently use iommu_present() to decide if an IOMMU
- * setup is needed.
- *
- * This works for simple cases, but will easily fail with
- * platforms that do have an IOMMU, but not for this specific
- * rproc.
- *
- * This will be easily solved by introducing hw capabilities
- * that will be set by the remoteproc driver.
- */
- if (!iommu_present(dev->bus)) {
- dev_dbg(dev, "iommu not found\n");
+ if (!rproc->has_iommu) {
+ dev_dbg(dev, "iommu not present\n");
return 0;
}

diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
index 9e7e745..78b8a9b 100644
--- a/include/linux/remoteproc.h
+++ b/include/linux/remoteproc.h
@@ -404,6 +404,7 @@ enum rproc_crash_type {
* @table_ptr: pointer to the resource table in effect
* @cached_table: copy of the resource table
* @table_csum: checksum of the resource table
+ * @has_iommu: flag to indicate if remote processor is behind an MMU
*/
struct rproc {
struct klist_node node;
@@ -435,6 +436,7 @@ struct rproc {
struct resource_table *table_ptr;
struct resource_table *cached_table;
u32 table_csum;
+ bool has_iommu;
};

/* we currently support only two vrings per rvdev */
--
2.0.0

2014-07-08 16:23:05

by Suman Anna

Subject: [PATCH 2/2] remoteproc: add support to handle internal memories

A remote processor may need to load certain firmware sections into
internal memories (eg: RAM at L1 or L2 levels) for performance or
other reasons. Introduce a new resource type (RSC_INTMEM) and add
an associated handler function to handle such memories. The handler
creates a kernel mapping for the resource's 'pa' (physical address).

Note that no iommu mapping is performed for this resource, as the
resource is primarily used to represent physical internal memories.
If the internal memory region can only be accessed through an iommu,
a devmem resource entry should be used instead.

Signed-off-by: Robert Tivy <[email protected]>
Signed-off-by: Suman Anna <[email protected]>
---
drivers/remoteproc/remoteproc_core.c | 85 +++++++++++++++++++++++++++++++++++-
include/linux/remoteproc.h | 43 +++++++++++++++++-
2 files changed, 126 insertions(+), 2 deletions(-)

diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index 11cdb11..e2bd869 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -664,6 +664,84 @@ free_carv:
return ret;
}

+/**
+ * rproc_handle_intmem() - handle internal memory resource entry
+ * @rproc: rproc handle
+ * @rsc: the intmem resource entry
+ * @offset: offset of the resource data in resource table
+ * @avail: size of available data (for image validation)
+ *
+ * This function will handle firmware requests for mapping a memory region
+ * internal to a remote processor into kernel. It neither allocates any
+ * physical pages, nor performs any iommu mapping, as this resource entry
+ * is primarily used for representing physical internal memories. If the
+ * internal memory region can only be accessed through an iommu, please
+ * use a devmem resource entry.
+ *
+ * These resource entries should be grouped near the carveout entries in
+ * the firmware's resource table, as other firmware entries might request
+ * placing other data objects inside these memory regions (e.g. data/code
+ * segments, trace resource entries, ...).
+ */
+static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
+ int offset, int avail)
+{
+ struct rproc_mem_entry *intmem;
+ struct device *dev = &rproc->dev;
+ void *va;
+ int ret;
+
+ if (sizeof(*rsc) > avail) {
+ dev_err(dev, "intmem rsc is truncated\n");
+ return -EINVAL;
+ }
+
+ if (rsc->version != 1) {
+ dev_err(dev, "intmem rsc version %d is not supported\n",
+ rsc->version);
+ return -EINVAL;
+ }
+
+ if (rsc->reserved) {
+ dev_err(dev, "intmem rsc has non zero reserved bytes\n");
+ return -EINVAL;
+ }
+
+ dev_dbg(dev, "intmem rsc: da 0x%x, pa 0x%x, len 0x%x\n",
+ rsc->da, rsc->pa, rsc->len);
+
+ intmem = kzalloc(sizeof(*intmem), GFP_KERNEL);
+ if (!intmem) {
+ dev_err(dev, "kzalloc intmem failed\n");
+ return -ENOMEM;
+ }
+
+ va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);
+ if (!va) {
+ dev_err(dev, "ioremap_nocache err: %d\n", rsc->len);
+ ret = -ENOMEM;
+ goto free_intmem;
+ }
+
+ dev_dbg(dev, "intmem mapped pa 0x%x of len 0x%x into kernel va %p\n",
+ rsc->pa, rsc->len, va);
+
+ intmem->va = va;
+ intmem->len = rsc->len;
+ intmem->dma = rsc->pa;
+ intmem->da = rsc->da;
+ intmem->priv = (void *)1; /* prevents freeing */
+
+ /* reuse the rproc->carveouts list, so that loading is automatic */
+ list_add_tail(&intmem->node, &rproc->carveouts);
+
+ return 0;
+
+free_intmem:
+ kfree(intmem);
+ return ret;
+}
+
static int rproc_count_vrings(struct rproc *rproc, struct fw_rsc_vdev *rsc,
int offset, int avail)
{
@@ -681,6 +759,7 @@ static rproc_handle_resource_t rproc_loading_handlers[RSC_LAST] = {
[RSC_CARVEOUT] = (rproc_handle_resource_t)rproc_handle_carveout,
[RSC_DEVMEM] = (rproc_handle_resource_t)rproc_handle_devmem,
[RSC_TRACE] = (rproc_handle_resource_t)rproc_handle_trace,
+ [RSC_INTMEM] = (rproc_handle_resource_t)rproc_handle_intmem,
[RSC_VDEV] = NULL, /* VDEVs were handled upon registrarion */
};

@@ -768,7 +847,11 @@ static void rproc_resource_cleanup(struct rproc *rproc)

/* clean up carveout allocations */
list_for_each_entry_safe(entry, tmp, &rproc->carveouts, node) {
- dma_free_coherent(dev->parent, entry->len, entry->va, entry->dma);
+ if (!entry->priv)
+ dma_free_coherent(dev->parent, entry->len, entry->va,
+ entry->dma);
+ else
+ iounmap((__force void __iomem *)entry->va);
list_del(&entry->node);
kfree(entry);
}
diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
index 78b8a9b..2a25ee8 100644
--- a/include/linux/remoteproc.h
+++ b/include/linux/remoteproc.h
@@ -100,6 +100,7 @@ struct fw_rsc_hdr {
* the remote processor will be writing logs.
* @RSC_VDEV: declare support for a virtio device, and serve as its
* virtio header.
+ * @RSC_INTMEM: request to map into kernel an internal memory region.
* @RSC_LAST: just keep this one at the end
*
* For more details regarding a specific resource type, please see its
@@ -115,7 +116,8 @@ enum fw_resource_type {
RSC_DEVMEM = 1,
RSC_TRACE = 2,
RSC_VDEV = 3,
- RSC_LAST = 4,
+ RSC_INTMEM = 4,
+ RSC_LAST = 5,
};

#define FW_RSC_ADDR_ANY (0xFFFFFFFFFFFFFFFF)
@@ -306,6 +308,45 @@ struct fw_rsc_vdev {
} __packed;

/**
+ * struct fw_rsc_intmem - internal memory publishing request
+ * @version: version for this resource type (must be one)
+ * @da: device address
+ * @pa: physical address
+ * @len: length (in bytes)
+ * @reserved: reserved (must be zero)
+ * @name: human-readable name of the region being published
+ *
+ * This resource entry allows a remote processor to publish an internal
+ * memory region to the host. This resource type allows a remote processor
+ * to publish the whole or just a portion of certain internal memories,
+ * while it owns and manages any unpublished portion (eg: a shared L1
+ * memory that can be split configured as RAM and/or cache). This is
+ * primarily provided to allow a host to load code/data into internal
+ * memories, the memory for which is neither allocated nor required to
+ * be mapped into an iommu.
+ *
+ * @da should specify the required address as accessible by the device
+ * without going through an iommu, @pa should specify the physical address
+ * for the region as seen on the bus, @len should specify the size of the
+ * memory region. As always, @name may (optionally) contain a human readable
+ * name of this mapping (mainly for debugging purposes). The @version field
+ * is added for future scalability, and should be 1 for now.
+ *
+ * Note: at this point we just "trust" these intmem entries to contain valid
+ * physical bus addresses. these are not currently intended to be managed
+ * as host-controlled heaps, as it is much better to do that from the remote
+ * processor side.
+ */
+struct fw_rsc_intmem {
+ u32 version;
+ u32 da;
+ u32 pa;
+ u32 len;
+ u32 reserved;
+ u8 name[32];
+} __packed;
+
+/**
* struct rproc_mem_entry - memory entry descriptor
* @va: virtual address
* @dma: dma address
--
2.0.0

2014-07-29 10:57:35

by Ohad Ben Cohen

Subject: Re: [PATCH 1/2] remoteproc: use a flag to detect the presence of IOMMU

Hi Suman,

On Tue, Jul 8, 2014 at 7:22 PM, Suman Anna <[email protected]> wrote:
> The remoteproc driver core currently relies on iommu_present() on
> the bus the device is on, to perform MMU management. However, this
> logic doesn't scale for multi-arch, especially for processors that
> do not have an IOMMU.

Is there a specific hw/scenario where you need this? Can you please
provide more details about it?

Ideally we should add them to the commit log as well.

> The individual platform implementations are required to set this
> flag appropriately. The default setting is to not have an MMU.

Let's explicitly set the default please so this would be clear for
users reading the code.

> Cc: Sjur Brændeland <[email protected]>

Sjur is no longer with STE, so no point in cc'ing his old email address.

> + /*
> + * All existing OMAP IPU and DSP processors do have an MMU, and
> + * are expected to use MMU, so this statement suffices.
> + * XXX: Replace this logic if and when a need arises.

The last XXX comment is always true for any kernel code, so I'd drop it.

Thanks,
Ohad.

2014-07-29 11:01:00

by Ohad Ben Cohen

Subject: Re: [PATCH 2/2] remoteproc: add support to handle internal memories

Hi Suman,

On Tue, Jul 8, 2014 at 7:22 PM, Suman Anna <[email protected]> wrote:
> A remote processor may need to load certain firmware sections into
> internal memories (eg: RAM at L1 or L2 levels) for performance or
> other reasons.

Can you please provide as many details as you can about the scenario
you need this for? What hardware, what sections, which specific
memory, what's the use case, numbers, sizes, everything.

I'd like to better understand the use case please.

Thanks,
Ohad.

2014-07-29 16:10:55

by Suman Anna

Subject: Re: [PATCH 1/2] remoteproc: use a flag to detect the presence of IOMMU

Hi Ohad,

On 07/29/2014 05:57 AM, Ohad Ben-Cohen wrote:
> Hi Suman,
>
> On Tue, Jul 8, 2014 at 7:22 PM, Suman Anna <[email protected]> wrote:
>> The remoteproc driver core currently relies on iommu_present() on
>> the bus the device is on, to perform MMU management. However, this
>> logic doesn't scale for multi-arch, especially for processors that
>> do not have an IOMMU.
>
> Is there a specific hw/scenario where you need this? Can you please
> provide more details about it?

We are trying to add a remoteproc driver for a small Cortex M3 called
the WkupM3, used for suspend/resume management on TI AM335x/AM437x SoCs.
This processor does not have an MMU. The same is true of another
processor subsystem, the PRU-ICSS, on AM335x/AM437x. All of these are
platform devices, and the current iommu_present() check does not scale
when the same kernel image has to support OMAP4/OMAP5 as well as
AM335x/AM437x.

This patch mainly addresses the existing comment in the code:
- * This works for simple cases, but will easily fail with
- * platforms that do have an IOMMU, but not for this specific
- * rproc.
- *
- * This will be easily solved by introducing hw capabilities
- * that will be set by the remoteproc driver.
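
To make the difference concrete, here is a minimal userspace sketch (not
kernel code) of the two detection strategies. The struct rproc_model type,
the setup_by_bus()/setup_by_flag() helpers and the device names are
hypothetical stand-ins invented for illustration; only the has_iommu field
name comes from the patch. It assumes a multi-arch image in which some IOMMU
driver has registered itself on the platform bus, so a bus-wide query answers
"present" for every platform device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of struct rproc; only the new capability flag matters. */
struct rproc_model {
	const char *name;
	bool has_iommu;		/* set by the platform driver, defaults to false */
};

/* Old scheme: one answer per bus, regardless of which rproc is asking. */
static bool setup_by_bus(const struct rproc_model *rproc, bool bus_has_iommu_ops)
{
	(void)rproc;		/* the per-device information is never consulted */
	return bus_has_iommu_ops;
}

/* New scheme: each rproc carries its own hardware capability flag. */
static bool setup_by_flag(const struct rproc_model *rproc)
{
	return rproc->has_iommu;
}

/* Compare both schemes for an MMU-backed and an MMU-less processor. */
static void iommu_detection_demo(void)
{
	const struct rproc_model ipu = { "omap-ipu", true };
	const struct rproc_model wkup_m3 = { "wkup-m3", false };
	const bool bus_has_iommu_ops = true;	/* assumed: IOMMU driver built in */

	/* The bus-wide check cannot tell the two processors apart. */
	assert(setup_by_bus(&ipu, bus_has_iommu_ops));
	assert(setup_by_bus(&wkup_m3, bus_has_iommu_ops));	/* wrong for WkupM3 */

	/* The per-rproc flag gives the right answer for each of them. */
	assert(setup_by_flag(&ipu));
	assert(!setup_by_flag(&wkup_m3));
}
```

Calling iommu_detection_demo() passes all four assertions: the bus-based
check would attempt IOMMU setup for the MMU-less processor, while the
flag-based check skips it.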

>
> Ideally we should add them to the commit log as well.
>
>> The individual platform implementations are required to set this
>> flag appropriately. The default setting is to not have an MMU.
>
> Let's explicitly set the default please so this would be clear for
> users reading the code.

OK, I can update the existing drivers to explicitly set this field.

>
>> Cc: Sjur Brændeland <[email protected]>
>
> Sjur is no longer with STE, so no point in cc'ing his old email address.

Yeah, I wasn't aware until I got a bounced email.

>
>> + /*
>> + * All existing OMAP IPU and DSP processors do have an MMU, and
>> + * are expected to use MMU, so this statement suffices.
>> + * XXX: Replace this logic if and when a need arises.
>
> The last XXX comment is always true for any kernel code, so I'd drop it.

Sure.

regards
Suman

2014-07-29 19:33:38

by Suman Anna

Subject: Re: [PATCH 2/2] remoteproc: add support to handle internal memories

Hi Ohad,

On 07/29/2014 06:00 AM, Ohad Ben-Cohen wrote:
> Hi Suman,
>
> On Tue, Jul 8, 2014 at 7:22 PM, Suman Anna <[email protected]> wrote:
>> A remote processor may need to load certain firmware sections into
>> internal memories (eg: RAM at L1 or L2 levels) for performance or
>> other reasons.
>
> Can you please provide as many details as you can about the scenario
> you need this for? What hardware, what sections, which specific
> memory, what's the use case, numbers, sizes, everything.
>
> I'd like to better understand the use case please.

We currently have two use cases. The primary use case is the WkupM3
processor on TI Sitara AM335x/AM437x SoCs, used for suspend/resume
management. This series is a dependency for the WkupM3 remoteproc driver
that Dave posted [1]. More details are in section 8.1.4.6 of the AM335x
TRM [2]. The program/data sections for this processor all _need_ to be
in the two internal memory RAMs (16K Unified RAM and 8K Data RAM), and
there is no MMU for this processor. The current RSC_CARVEOUT and
RSC_DEVMEM resource types are not suited to describing this type of
memory (we neither allocate memory through the DMA API nor need to map
these into an MMU).
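
As a rough illustration, the two WkupM3 RAMs could be described with the
proposed resource type along the following lines. This is a userspace sketch
only: the struct layout and the three validity checks mirror the patch, but
intmem_check(), the wkupm3_intmem[] table and the region names are
hypothetical, and the da/pa values are placeholders that must be taken from
the TRM for a real firmware image. The packed attribute uses GCC/Clang
syntax.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as the proposed struct fw_rsc_intmem. */
struct fw_rsc_intmem {
	uint32_t version;	/* must be 1 */
	uint32_t da;		/* address as seen by the remote processor */
	uint32_t pa;		/* physical bus address as seen by the host */
	uint32_t len;		/* length in bytes */
	uint32_t reserved;	/* must be zero */
	uint8_t name[32];	/* optional human-readable name */
} __attribute__((packed));

/* Five 32-bit words plus the 32-byte name, with no padding. */
_Static_assert(sizeof(struct fw_rsc_intmem) == 52, "unexpected entry size");

/* Mirrors the sanity checks at the top of rproc_handle_intmem(). */
static int intmem_check(const struct fw_rsc_intmem *rsc, size_t avail)
{
	if (sizeof(*rsc) > avail)
		return -EINVAL;	/* entry is truncated */
	if (rsc->version != 1)
		return -EINVAL;	/* unsupported version */
	if (rsc->reserved)
		return -EINVAL;	/* reserved bytes must be zero */
	return 0;
}

/* Placeholder addresses: check the SoC memory map before using them. */
static const struct fw_rsc_intmem wkupm3_intmem[] = {
	{ .version = 1, .da = 0x00000000, .pa = 0x44d00000,
	  .len = 16 * 1024, .name = "umem" },	/* 16K Unified RAM */
	{ .version = 1, .da = 0x00080000, .pa = 0x44d80000,
	  .len = 8 * 1024, .name = "dmem" },	/* 8K Data RAM */
};

/* Both example entries should pass the handler's checks. */
static void intmem_selfcheck(void)
{
	size_t i;

	for (i = 0; i < sizeof(wkupm3_intmem) / sizeof(wkupm3_intmem[0]); i++)
		assert(intmem_check(&wkupm3_intmem[i],
				    sizeof(wkupm3_intmem[i])) == 0);
}
```

Neither entry asks the host to allocate anything or to program an MMU; the
handler only has to ioremap the pa/len range so that the loader can copy the
firmware segments whose device addresses fall inside da..da+len.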

The second use case is for some code to be loaded directly into the
internal memories of the DSP in existing OMAPs during the remoteproc
loading stage. These memories are, again, accessible to the processor
without going through the L2MMU, through which the external RAM and
peripherals are accessed.

regards
Suman

[1] https://patchwork.kernel.org/patch/4529651/
[2] http://www.ti.com/lit.ug/spruh73k/spruh73k.pdf