2014-12-11 02:58:39

by Abhilash Kesavan

[permalink] [raw]
Subject: [PATCH 1/2] lib: devres: add a helper function for ioremap_wc

Implement a resource managed writecombine ioremap function.

Signed-off-by: Abhilash Kesavan <[email protected]>
---
Documentation/driver-model/devres.txt | 1 +
include/linux/io.h | 2 ++
lib/devres.c | 28 ++++++++++++++++++++++++++++
3 files changed, 31 insertions(+)

diff --git a/Documentation/driver-model/devres.txt b/Documentation/driver-model/devres.txt
index b5ab416..0f80cee 100644
--- a/Documentation/driver-model/devres.txt
+++ b/Documentation/driver-model/devres.txt
@@ -274,6 +274,7 @@ IOMAP
devm_ioport_unmap()
devm_ioremap()
devm_ioremap_nocache()
+ devm_ioremap_wc()
devm_ioremap_resource() : checks resource, requests memory region, ioremaps
devm_iounmap()
pcim_iomap()
diff --git a/include/linux/io.h b/include/linux/io.h
index fa02e55..42b33f0 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -64,6 +64,8 @@ void __iomem *devm_ioremap(struct device *dev, resource_size_t offset,
resource_size_t size);
void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t offset,
resource_size_t size);
+void __iomem *devm_ioremap_wc(struct device *dev, resource_size_t offset,
+ resource_size_t size);
void devm_iounmap(struct device *dev, void __iomem *addr);
int check_signature(const volatile void __iomem *io_addr,
const unsigned char *signature, int length);
diff --git a/lib/devres.c b/lib/devres.c
index 0f1dd2e..e8e1738 100644
--- a/lib/devres.c
+++ b/lib/devres.c
@@ -72,6 +72,34 @@ void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t offset,
EXPORT_SYMBOL(devm_ioremap_nocache);

/**
+ * devm_ioremap_wc - Managed ioremap_wc()
+ * @dev: Generic device to remap IO address for
+ * @offset: BUS offset to map
+ * @size: Size of map
+ *
+ * Managed ioremap_wc(). Map is automatically unmapped on driver detach.
+ */
+void __iomem *devm_ioremap_wc(struct device *dev, resource_size_t offset,
+ resource_size_t size)
+{
+ void __iomem **ptr, *addr;
+
+ ptr = devres_alloc(devm_ioremap_release, sizeof(*ptr), GFP_KERNEL);
+ if (!ptr)
+ return NULL;
+
+ addr = ioremap_wc(offset, size);
+ if (addr) {
+ *ptr = addr;
+ devres_add(dev, ptr);
+ } else
+ devres_free(ptr);
+
+ return addr;
+}
+EXPORT_SYMBOL(devm_ioremap_wc);
+
+/**
* devm_iounmap - Managed iounmap()
* @dev: Generic device to unmap for
* @addr: Address to unmap
--
1.7.9.5


2014-12-11 02:58:58

by Abhilash Kesavan

[permalink] [raw]
Subject: [PATCH 2/2] misc: sram: switch to ioremap_wc from ioremap

Currently, the SRAM allocator returns device memory via ioremap.
This causes issues on ARM64 when the internal SoC SRAM allocated by
the generic sram driver is used for audio playback. The destination
buffer address (which is ioremapped SRAM) is not 64-bit aligned for
certain streams (e.g. 44.1k sampling rate). In such cases we get
unhandled alignment faults. Use ioremap_wc in place of ioremap which
gives us normal non-cacheable memory instead of device memory.

Signed-off-by: Abhilash Kesavan <[email protected]>
---
This is based on the discussion about the crash here:
http://www.spinics.net/lists/arm-kernel/msg384647.html

drivers/misc/sram.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/misc/sram.c b/drivers/misc/sram.c
index 21181fa..15b4d4e 100644
--- a/drivers/misc/sram.c
+++ b/drivers/misc/sram.c
@@ -69,12 +69,23 @@ static int sram_probe(struct platform_device *pdev)
INIT_LIST_HEAD(&reserve_list);

res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
- virt_base = devm_ioremap_resource(&pdev->dev, res);
- if (IS_ERR(virt_base))
- return PTR_ERR(virt_base);
+ if (!res) {
+ dev_err(&pdev->dev, "found no memory resource\n");
+ return -EINVAL;
+ }

size = resource_size(res);

+ if (!devm_request_mem_region(&pdev->dev,
+ res->start, size, pdev->name)) {
+ dev_err(&pdev->dev, "could not request region for resource\n");
+ return -EBUSY;
+ }
+
+ virt_base = devm_ioremap_wc(&pdev->dev, res->start, size);
+ if (IS_ERR(virt_base))
+ return PTR_ERR(virt_base);
+
sram = devm_kzalloc(&pdev->dev, sizeof(*sram), GFP_KERNEL);
if (!sram)
return -ENOMEM;
--
1.7.9.5

2014-12-11 10:09:37

by Philipp Zabel

[permalink] [raw]
Subject: Re: [PATCH 2/2] misc: sram: switch to ioremap_wc from ioremap

Hi Abhilash,

Am Donnerstag, den 11.12.2014, 08:28 +0530 schrieb Abhilash Kesavan:
> Currently, the SRAM allocator returns device memory via ioremap.
> This causes issues on ARM64 when the internal SoC SRAM allocated by
> the generic sram driver is used for audio playback. The destination
> buffer address (which is ioremapped SRAM) is not 64-bit aligned for
> certain streams (e.g. 44.1k sampling rate). In such cases we get
> unhandled alignment faults. Use ioremap_wc in place of ioremap which
> gives us normal non-cacheable memory instead of device memory.

Could this break the omap_bus_sync() implementation in
arch/arm/mach-omap2/omap4-common.c?

void omap_bus_sync(void)
{
if (dram_sync && sram_sync) {
writel_relaxed(readl_relaxed(dram_sync), dram_sync);
writel_relaxed(readl_relaxed(sram_sync), sram_sync);
isb();
}
}

It is used in wmb() and omap_do_wfi() to drain interconnect write
buffers on omap4/5. If sram_sync is mapped with write-combining, could
the last write to sram_sync stay stuck in the write-combining buffer
until after the function returns?

regards
Philipp

> Signed-off-by: Abhilash Kesavan <[email protected]>
> ---
> This is based on the discussion about the crash here:
> http://www.spinics.net/lists/arm-kernel/msg384647.html
>
> drivers/misc/sram.c | 17 ++++++++++++++---
> 1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/misc/sram.c b/drivers/misc/sram.c
> index 21181fa..15b4d4e 100644
> --- a/drivers/misc/sram.c
> +++ b/drivers/misc/sram.c
> @@ -69,12 +69,23 @@ static int sram_probe(struct platform_device *pdev)
> INIT_LIST_HEAD(&reserve_list);
>
> res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> - virt_base = devm_ioremap_resource(&pdev->dev, res);
> - if (IS_ERR(virt_base))
> - return PTR_ERR(virt_base);
> + if (!res) {
> + dev_err(&pdev->dev, "found no memory resource\n");
> + return -EINVAL;
> + }
>
> size = resource_size(res);
>
> + if (!devm_request_mem_region(&pdev->dev,
> + res->start, size, pdev->name)) {
> + dev_err(&pdev->dev, "could not request region for resource\n");
> + return -EBUSY;
> + }
> +
> + virt_base = devm_ioremap_wc(&pdev->dev, res->start, size);
> + if (IS_ERR(virt_base))
> + return PTR_ERR(virt_base);
> +
> sram = devm_kzalloc(&pdev->dev, sizeof(*sram), GFP_KERNEL);
> if (!sram)
> return -ENOMEM;

2014-12-11 10:39:33

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH 2/2] misc: sram: switch to ioremap_wc from ioremap

On Thu, Dec 11, 2014 at 10:08:33AM +0000, Philipp Zabel wrote:
> Hi Abhilash,
>
> Am Donnerstag, den 11.12.2014, 08:28 +0530 schrieb Abhilash Kesavan:
> > Currently, the SRAM allocator returns device memory via ioremap.
> > This causes issues on ARM64 when the internal SoC SRAM allocated by
> > the generic sram driver is used for audio playback. The destination
> > buffer address (which is ioremapped SRAM) is not 64-bit aligned for
> > certain streams (e.g. 44.1k sampling rate). In such cases we get
> > unhandled alignment faults. Use ioremap_wc in place of ioremap which
> > gives us normal non-cacheable memory instead of device memory.
>
> Could this break the omap_bus_sync() implementation in
> arch/arm/mach-omap2/omap4-common.c?
>
> void omap_bus_sync(void)
> {
> if (dram_sync && sram_sync) {
> writel_relaxed(readl_relaxed(dram_sync), dram_sync);
> writel_relaxed(readl_relaxed(sram_sync), sram_sync);
> isb();
> }
> }
>
> It is used in wmb() and omap_do_wfi() to drain interconnect write
> buffers on omap4/5. If sram_sync is mapped with write-combining, could
> the last write to sram_sync stay stuck in the write-combining buffer
> until after the function returns?

I think you have that issue anyway, since you can get an early write
response even if you use ioremap. Does the write to sram_sync have
side-effects that we need to wait for?

Will

2014-12-11 11:41:15

by Philipp Zabel

[permalink] [raw]
Subject: Re: [PATCH 2/2] misc: sram: switch to ioremap_wc from ioremap

Hi Will,

Am Donnerstag, den 11.12.2014, 10:39 +0000 schrieb Will Deacon:
> On Thu, Dec 11, 2014 at 10:08:33AM +0000, Philipp Zabel wrote:
> > Hi Abhilash,
> >
> > Am Donnerstag, den 11.12.2014, 08:28 +0530 schrieb Abhilash Kesavan:
> > > Currently, the SRAM allocator returns device memory via ioremap.
> > > This causes issues on ARM64 when the internal SoC SRAM allocated by
> > > the generic sram driver is used for audio playback. The destination
> > > buffer address (which is ioremapped SRAM) is not 64-bit aligned for
> > > certain streams (e.g. 44.1k sampling rate). In such cases we get
> > > unhandled alignment faults. Use ioremap_wc in place of ioremap which
> > > gives us normal non-cacheable memory instead of device memory.
> >
> > Could this break the omap_bus_sync() implementation in
> > arch/arm/mach-omap2/omap4-common.c?
> >
> > void omap_bus_sync(void)
> > {
> > if (dram_sync && sram_sync) {
> > writel_relaxed(readl_relaxed(dram_sync), dram_sync);
> > writel_relaxed(readl_relaxed(sram_sync), sram_sync);
> > isb();
> > }
> > }
> >
> > It is used in wmb() and omap_do_wfi() to drain interconnect write
> > buffers on omap4/5. If sram_sync is mapped with write-combining, could
> > the last write to sram_sync stay stuck in the write-combining buffer
> > until after the function returns?
>
> I think you have that issue anyway, since you can get an early write
> response even if you use ioremap. Does the write to sram_sync have
> side-effects that we need to wait for?

[Added Tony Lindgren and Santosh Shilimkar to Cc:]
I don't know.

regards
Philipp

2014-12-11 14:59:09

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2/2] misc: sram: switch to ioremap_wc from ioremap

On Thu, Dec 11, 2014 at 11:40:46AM +0000, Philipp Zabel wrote:
> Hi Will,
>
> Am Donnerstag, den 11.12.2014, 10:39 +0000 schrieb Will Deacon:
> > On Thu, Dec 11, 2014 at 10:08:33AM +0000, Philipp Zabel wrote:
> > > Hi Abhilash,
> > >
> > > Am Donnerstag, den 11.12.2014, 08:28 +0530 schrieb Abhilash Kesavan:
> > > > Currently, the SRAM allocator returns device memory via ioremap.
> > > > This causes issues on ARM64 when the internal SoC SRAM allocated by
> > > > the generic sram driver is used for audio playback. The destination
> > > > buffer address (which is ioremapped SRAM) is not 64-bit aligned for
> > > > certain streams (e.g. 44.1k sampling rate). In such cases we get
> > > > unhandled alignment faults. Use ioremap_wc in place of ioremap which
> > > > gives us normal non-cacheable memory instead of device memory.
> > >
> > > Could this break the omap_bus_sync() implementation in
> > > arch/arm/mach-omap2/omap4-common.c?
> > >
> > > void omap_bus_sync(void)
> > > {
> > > if (dram_sync && sram_sync) {
> > > writel_relaxed(readl_relaxed(dram_sync), dram_sync);
> > > writel_relaxed(readl_relaxed(sram_sync), sram_sync);
> > > isb();
> > > }
> > > }
> > >
> > > It is used in wmb() and omap_do_wfi() to drain interconnect write
> > > buffers on omap4/5. If sram_sync is mapped with write-combining, could
> > > the last write to sram_sync stay stuck in the write-combining buffer
> > > until after the function returns?
> >
> > I think you have that issue anyway, since you can get an early write
> > response even if you use ioremap. Does the write to sram_sync have
> > side-effects that we need to wait for?
>
> [Added Tony Lindgren and Santosh Shilimkar to Cc:]
> I don't know.

In addition to Will's question, do you care about the access size?
ioremap() returns Device memory which is bufferable (early
acknowledgement) but it guarantees the access size. With write
combining, you may get a different access size than requested.

--
Catalin

2014-12-17 12:35:04

by Abhilash Kesavan

[permalink] [raw]
Subject: Re: [PATCH 2/2] misc: sram: switch to ioremap_wc from ioremap

Hi,

On Thu, Dec 11, 2014 at 8:28 PM, Catalin Marinas
<[email protected]> wrote:
> On Thu, Dec 11, 2014 at 11:40:46AM +0000, Philipp Zabel wrote:
>> Hi Will,
>>
>> Am Donnerstag, den 11.12.2014, 10:39 +0000 schrieb Will Deacon:
>> > On Thu, Dec 11, 2014 at 10:08:33AM +0000, Philipp Zabel wrote:
>> > > Hi Abhilash,
>> > >
>> > > Am Donnerstag, den 11.12.2014, 08:28 +0530 schrieb Abhilash Kesavan:
>> > > > Currently, the SRAM allocator returns device memory via ioremap.
>> > > > This causes issues on ARM64 when the internal SoC SRAM allocated by
>> > > > the generic sram driver is used for audio playback. The destination
>> > > > buffer address (which is ioremapped SRAM) is not 64-bit aligned for
>> > > > certain streams (e.g. 44.1k sampling rate). In such cases we get
>> > > > unhandled alignment faults. Use ioremap_wc in place of ioremap which
>> > > > gives us normal non-cacheable memory instead of device memory.
>> > >
>> > > Could this break the omap_bus_sync() implementation in
>> > > arch/arm/mach-omap2/omap4-common.c?
>> > >
>> > > void omap_bus_sync(void)
>> > > {
>> > > if (dram_sync && sram_sync) {
>> > > writel_relaxed(readl_relaxed(dram_sync), dram_sync);
>> > > writel_relaxed(readl_relaxed(sram_sync), sram_sync);
>> > > isb();
>> > > }
>> > > }
>> > >
>> > > It is used in wmb() and omap_do_wfi() to drain interconnect write
>> > > buffers on omap4/5. If sram_sync is mapped with write-combining, could
>> > > the last write to sram_sync stay stuck in the write-combining buffer
>> > > until after the function returns?
>> >
>> > I think you have that issue anyway, since you can get an early write
>> > response even if you use ioremap. Does the write to sram_sync have
>> > side-effects that we need to wait for?
>>
>> [Added Tony Lindgren and Santosh Shilimkar to Cc:]
>> I don't know.
>
> In addition to Will's question, do you care about the access size?
> ioremap() returns Device memory which is bufferable (early
> acknowledgement) but it guarantees the access size. With write
> combining, you may get a different access size than requested.

>From the existing dts files, omap, imx, rockchip and exynos seem to be
the only users of the sram allocator code. I have tested this on
Exynos5420, Exynos5800 and Exynos7; there is no change in behavior
seen on these boards. Tested-by for other SoCs would be appreciated.

Regards,
Abhilash
>
> --
> Catalin