2013-09-17 12:43:53

by Russ Dill

Subject: [RFC PATCH 00/11] Embeddable Position Independent Executable

This patch adds support for and demonstrates the usage of an embedded
position independent executable (PIE). The goal is to allow the use of C
code in situations where carefully written position independent assembly
was previously required.

The example used is the suspend/resume code for the am335x, a Texas
Instruments ARM SoC. In order to save the maximum amount of power during
suspend, the am335x has to perform several power saving operations after
SDRAM has been disabled, and undo those steps at resume time. The am335x
provides an SRAM region for the processor to execute such code.

A PIE executable was chosen because it limits the types of relocations
that must be performed before the code is executed. In the case of ARM,
the only required relocation type is a relative relocation of pointers.

The kernel is provided symbols into the PIE executable by way of exporting
those symbols from the PIE and then importing them into vmlinux at final
link time. Because the PIE is loaded dynamically at runtime, any access
to PIE functions or data must pass through special accessor macros that
apply the necessary offset. Code within the PIE does not have access to
symbols outside of the PIE, but it still can access code and data outside
the PIE so long as it is passed pointers to that code/data.
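
The offset applied by the accessors can be pictured with a small
userspace sketch. The names below (pie_chunk_sketch, pie_translate) are
hypothetical and are not the names used in include/linux/pie.h. The
sketch only assumes that a symbol's link-time address is rebased from
where the PIE sits in vmlinux to wherever the PIE was loaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one loaded PIE; not the kernel's struct. */
struct pie_chunk_sketch {
	uintptr_t link_start;	/* address of the PIE as linked into vmlinux */
	uintptr_t load_start;	/* address the PIE was copied to at runtime */
	size_t size;		/* size of the PIE image in bytes */
};

/* Rebase a link-time symbol address into the loaded copy of the PIE. */
static inline uintptr_t pie_translate(const struct pie_chunk_sketch *chunk,
				      uintptr_t link_addr)
{
	/* The symbol must lie inside the PIE image. */
	assert(link_addr >= chunk->link_start);
	assert(link_addr - chunk->link_start < chunk->size);

	return chunk->load_start + (link_addr - chunk->link_start);
}
```

A symbol 0x124 bytes into the linked image ends up 0x124 bytes into the
loaded copy, whatever the two base addresses are.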

The PIE executable is provided its own set of functions required by gcc,
such as memcpy, memmove, etc. The different PIE sections are collected
together in the linker script as an overlay so that the kernel only needs
one copy of these functions. When the PIE is loaded, the library functions
and appropriate sections are copied into a genalloc pool (in the case of
the example, backed by SRAM). The PIE code then applies the necessary
relocations to the loaded code. Because the relocations are just relative
offsets to pointers, the relocations can be reapplied to allow the code
to run with the MMU enabled or disabled.
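
As a rough illustration of why the relocations can be reapplied, here is
a userspace model. It is an assumption-laden sketch: the image is modeled
as an array of words and the relocation table as a list of word indices,
rather than the ELF relocation entries the real code walks, and all names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of a loaded PIE image: an array of words plus a table
 * listing which words hold pointers into the image itself.
 */
struct pie_image_sketch {
	uintptr_t *words;	/* loaded copy of the image */
	size_t nwords;		/* number of words in the image */
	const size_t *relocs;	/* indices of words that hold pointers */
	size_t nrelocs;		/* number of entries in relocs */
	uintptr_t cur_base;	/* base the pointers are currently fixed up for */
};

/*
 * Adjust every recorded pointer from cur_base to new_base. Only the
 * difference between the two bases is added, so the same routine can be
 * run again later with a different base.
 */
static inline void pie_relocate(struct pie_image_sketch *img,
				uintptr_t new_base)
{
	uintptr_t delta = new_base - img->cur_base;
	size_t i;

	for (i = 0; i < img->nrelocs; i++) {
		assert(img->relocs[i] < img->nwords);
		img->words[img->relocs[i]] += delta;
	}
	img->cur_base = new_base;
}
```

Calling pie_relocate() once with the virtual address of the pool and
again with its physical address corresponds to the MMU enabled and MMU
disabled cases. Words that are not listed in the table are left alone.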

This patchset is a complete rethinking of an earlier patchset [1]. Ard
Biesheuvel provided the suggestion to use the PIE executable format for
storing the relocatable code within the kernel. Russell King and Dave
Martin pointed out the shortcomings of my initial naive approach.

This patchset depends on Rajendra Nayak's SRAM DT-ification patch series
[2], Suman Anna's am335x mailbox series [3], and a portion of Dave
Gerlach's am335x suspend/resume patchset [4]. I've collected together
the necessary dependencies and applied this patch series on top of them
here [5].

Because special ioremap variants are required on ARM for io mappings
that allow code execution, the first two patches provide generic
accessors for those variants. The third patch provides a DT and pdata
method for instructing misc/sram to map the memory in such a way that
allows code execution.

The 4th patch provides a generic set of functions for handling function
pointers as addresses and vice versa. This is necessary on ARM because
of the way that Thumb2 function pointers are handled by gcc. The PIE
framework requires this functionality because it performs translations
of function pointers.
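
The Thumb handling amounts to masking and restoring bit 0 of the pointer
value. The sketch below works on plain uintptr_t values so that it runs
anywhere; the sketch_ names are hypothetical stand-ins for the helpers in
patch 4, which operate on real function pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0 of a function pointer value selects Thumb state on ARM. */
#define THUMB_BIT_SKETCH ((uintptr_t)1)

/* Address of the first instruction: the pointer value with bit 0 cleared. */
static inline uintptr_t sketch_fnptr_to_addr(uintptr_t fnptr)
{
	return fnptr & ~THUMB_BIT_SKETCH;
}

/* Pointer value for a copy at new_addr, keeping the original's Thumb bit. */
static inline uintptr_t sketch_fnptr_translate(uintptr_t orig_fnptr,
					       uintptr_t new_addr)
{
	/* The destination is an address, so its bit 0 must be clear. */
	assert((new_addr & THUMB_BIT_SKETCH) == 0);

	return new_addr | (orig_fnptr & THUMB_BIT_SKETCH);
}
```

A pointer to an ARM-mode function passes through unchanged, while a
pointer to a Thumb function keeps bit 0 set after translation so that a
call through it still enters Thumb state.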

The 5th patch is the general PIE framework. The 6th patch adds ARM
support for PIE. The 7th patch lets ARM fix up PIE code on the fly.
This is necessary because the MMU is enabled at suspend time but
disabled at resume time. The 8th patch provides a predefined trampoline
that uses the on-the-fly fixup.

The 9th patch configures the SRAM DT entries for am335x so that they can
be easily found by the PM code, and so that they are mapped with exec
enabled. The 10th patch adds PIE entries for am335x, and the 11th patch
finally adds suspend/resume support for am33xx utilizing C code for
suspend/resume paths.

[1] http://www.spinics.net/lists/arm-kernel/msg271525.html
[2] http://comments.gmane.org/gmane.linux.ports.arm.omap/103774
[3] http://www.spinics.net/lists/devicetree/msg00227.html
[4] http://www.spinics.net/lists/linux-omap/msg95305.html
[5] https://github.com/russdill/linux/commits/sram

Russ Dill (10):
asm-generic: io: Add exec versions of ioremap
lib: devres: Add exec versions of devm_ioremap_resource and friends
misc: SRAM: Add option to map SRAM to allow code execution
asm-generic: fncpy: Add function copying macros
PIE: Support embedding position independent executables
ARM: PIE: Add position independent executable embedding to ARM
ARM: PIE: Add support for updating PIE relocations
ARM: PIE: Add macro for generating PIE resume trampoline
ARM: dts: AM33XX: Associate SRAM with MPU and mark it exec
ARM: OMAP2+: AM33XX: Add PIE support for AM33XX

Vaibhav Bedia (1):
ARM: OMAP2+: AM33XX: Basic suspend resume support

Documentation/devicetree/bindings/misc/sram.txt | 4 +
Documentation/pie.txt | 167 ++++++++
Makefile | 17 +-
arch/alpha/include/asm/fncpy.h | 1 +
arch/arc/include/asm/fncpy.h | 1 +
arch/arm/Kconfig | 1 +
arch/arm/Makefile | 5 +
arch/arm/boot/dts/am33xx.dtsi | 2 +
arch/arm/configs/omap2plus_defconfig | 1 +
arch/arm/include/asm/elf.h | 1 +
arch/arm/include/asm/fncpy.h | 76 +---
arch/arm/include/asm/io.h | 2 +
arch/arm/include/asm/pie.h | 42 ++
arch/arm/include/asm/suspend.h | 25 ++
arch/arm/kernel/.gitignore | 1 +
arch/arm/kernel/Makefile | 4 +-
arch/arm/kernel/pie.c | 92 +++++
arch/arm/kernel/pie.lds.S | 41 ++
arch/arm/kernel/vmlinux.lds.S | 2 +
arch/arm/libpie/.gitignore | 3 +
arch/arm/libpie/Makefile | 32 ++
arch/arm/libpie/empty.S | 12 +
arch/arm/libpie/relocate.S | 76 ++++
arch/arm/mach-omap2/Kconfig | 7 +-
arch/arm/mach-omap2/Makefile | 2 +
arch/arm/mach-omap2/board-generic.c | 1 +
arch/arm/mach-omap2/common.h | 10 +
arch/arm/mach-omap2/io.c | 5 +
arch/arm/mach-omap2/pm.c | 3 +-
arch/arm/mach-omap2/pm33xx.c | 486 ++++++++++++++++++++++++
arch/arm/mach-omap2/pm33xx.h | 68 ++++
arch/arm/mach-omap2/sleep33xx.c | 314 +++++++++++++++
arch/arm/mach-omap2/wkup_m3.c | 183 +++++++++
arch/arm/plat-omap/sram.c | 2 +-
arch/arm64/include/asm/fncpy.h | 1 +
arch/avr32/include/asm/fncpy.h | 1 +
arch/blackfin/include/asm/fncpy.h | 1 +
arch/c6x/include/asm/fncpy.h | 1 +
arch/cris/include/asm/fncpy.h | 1 +
arch/frv/include/asm/fncpy.h | 1 +
arch/h8300/include/asm/fncpy.h | 1 +
arch/hexagon/include/asm/fncpy.h | 1 +
arch/ia64/include/asm/fncpy.h | 1 +
arch/m32r/include/asm/fncpy.h | 1 +
arch/m68k/include/asm/fncpy.h | 1 +
arch/metag/include/asm/fncpy.h | 1 +
arch/microblaze/include/asm/fncpy.h | 1 +
arch/mips/include/asm/fncpy.h | 1 +
arch/mn10300/include/asm/fncpy.h | 1 +
arch/openrisc/include/asm/fncpy.h | 1 +
arch/parisc/include/asm/fncpy.h | 1 +
arch/powerpc/include/asm/fncpy.h | 1 +
arch/s390/include/asm/fncpy.h | 1 +
arch/score/include/asm/fncpy.h | 1 +
arch/sh/include/asm/fncpy.h | 1 +
arch/sparc/include/asm/fncpy.h | 1 +
arch/tile/include/asm/fncpy.h | 1 +
arch/um/include/asm/fncpy.h | 1 +
arch/unicore32/include/asm/fncpy.h | 1 +
arch/x86/include/asm/fncpy.h | 1 +
arch/xtensa/include/asm/fncpy.h | 1 +
drivers/misc/sram.c | 13 +-
include/asm-generic/fncpy.h | 104 +++++
include/asm-generic/iomap.h | 5 +
include/asm-generic/pie.lds.h | 82 ++++
include/asm-generic/vmlinux.lds.h | 1 +
include/linux/device.h | 17 +-
include/linux/io.h | 4 +
include/linux/pie.h | 196 ++++++++++
include/linux/platform_data/sram.h | 8 +
lib/Kconfig | 14 +
lib/Makefile | 2 +
lib/devres.c | 97 ++++-
lib/pie.c | 138 +++++++
pie/.gitignore | 3 +
pie/Makefile | 85 +++++
scripts/link-vmlinux.sh | 11 +-
77 files changed, 2425 insertions(+), 71 deletions(-)
create mode 100644 Documentation/pie.txt
create mode 100644 arch/alpha/include/asm/fncpy.h
create mode 100644 arch/arc/include/asm/fncpy.h
create mode 100644 arch/arm/include/asm/pie.h
create mode 100644 arch/arm/kernel/pie.c
create mode 100644 arch/arm/kernel/pie.lds.S
create mode 100644 arch/arm/libpie/.gitignore
create mode 100644 arch/arm/libpie/Makefile
create mode 100644 arch/arm/libpie/empty.S
create mode 100644 arch/arm/libpie/relocate.S
create mode 100644 arch/arm/mach-omap2/pm33xx.c
create mode 100644 arch/arm/mach-omap2/pm33xx.h
create mode 100644 arch/arm/mach-omap2/sleep33xx.c
create mode 100644 arch/arm/mach-omap2/wkup_m3.c
create mode 100644 arch/arm64/include/asm/fncpy.h
create mode 100644 arch/avr32/include/asm/fncpy.h
create mode 100644 arch/blackfin/include/asm/fncpy.h
create mode 100644 arch/c6x/include/asm/fncpy.h
create mode 100644 arch/cris/include/asm/fncpy.h
create mode 100644 arch/frv/include/asm/fncpy.h
create mode 100644 arch/h8300/include/asm/fncpy.h
create mode 100644 arch/hexagon/include/asm/fncpy.h
create mode 100644 arch/ia64/include/asm/fncpy.h
create mode 100644 arch/m32r/include/asm/fncpy.h
create mode 100644 arch/m68k/include/asm/fncpy.h
create mode 100644 arch/metag/include/asm/fncpy.h
create mode 100644 arch/microblaze/include/asm/fncpy.h
create mode 100644 arch/mips/include/asm/fncpy.h
create mode 100644 arch/mn10300/include/asm/fncpy.h
create mode 100644 arch/openrisc/include/asm/fncpy.h
create mode 100644 arch/parisc/include/asm/fncpy.h
create mode 100644 arch/powerpc/include/asm/fncpy.h
create mode 100644 arch/s390/include/asm/fncpy.h
create mode 100644 arch/score/include/asm/fncpy.h
create mode 100644 arch/sh/include/asm/fncpy.h
create mode 100644 arch/sparc/include/asm/fncpy.h
create mode 100644 arch/tile/include/asm/fncpy.h
create mode 100644 arch/um/include/asm/fncpy.h
create mode 100644 arch/unicore32/include/asm/fncpy.h
create mode 100644 arch/x86/include/asm/fncpy.h
create mode 100644 arch/xtensa/include/asm/fncpy.h
create mode 100644 include/asm-generic/fncpy.h
create mode 100644 include/asm-generic/pie.lds.h
create mode 100644 include/linux/pie.h
create mode 100644 include/linux/platform_data/sram.h
create mode 100644 lib/pie.c
create mode 100644 pie/.gitignore
create mode 100644 pie/Makefile

--
1.8.3.2


2013-09-17 12:43:56

by Russ Dill

Subject: [RFC PATCH 01/11] asm-generic: io: Add exec versions of ioremap

If code is to be copied into an area (such as SRAM) and run,
it needs to be marked as exec. Currently only an ARM version
of this exists.

Signed-off-by: Russ Dill <[email protected]>
---
arch/arm/include/asm/io.h | 2 ++
include/asm-generic/iomap.h | 5 +++++
2 files changed, 7 insertions(+)

diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index d070741..212d095 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -329,6 +329,8 @@ extern void _memset_io(volatile void __iomem *, int, size_t);
#define ioremap_nocache(cookie,size) __arm_ioremap((cookie), (size), MT_DEVICE)
#define ioremap_cached(cookie,size) __arm_ioremap((cookie), (size), MT_DEVICE_CACHED)
#define ioremap_wc(cookie,size) __arm_ioremap((cookie), (size), MT_DEVICE_WC)
+#define ioremap_exec(cookie,size) __arm_ioremap_exec((cookie), (size), true)
+#define ioremap_exec_nocache(cookie,size) __arm_ioremap_exec((cookie), (size), false)
#define iounmap __arm_iounmap

/*
diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h
index 6afd7d6..e72c451 100644
--- a/include/asm-generic/iomap.h
+++ b/include/asm-generic/iomap.h
@@ -66,6 +66,11 @@ extern void ioport_unmap(void __iomem *);
#define ioremap_wc ioremap_nocache
#endif

+#ifndef ARCH_HAS_IOREMAP_EXEC
+#define ioremap_exec ioremap
+#define ioremap_exec_nocache ioremap_nocache
+#endif
+
#ifdef CONFIG_PCI
/* Destroy a virtual mapping cookie for a PCI BAR (memory or IO) */
struct pci_dev;
--
1.8.3.2

2013-09-17 12:44:00

by Russ Dill

Subject: [RFC PATCH 02/11] lib: devres: Add exec versions of devm_ioremap_resource and friends

Now that there is an _exec version of ioremap, add devm support for it.

Signed-off-by: Russ Dill <[email protected]>
---
include/linux/device.h | 17 ++++++++-
include/linux/io.h | 4 +++
lib/devres.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++++--
3 files changed, 114 insertions(+), 4 deletions(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index 22b546a..204180a 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -596,10 +596,25 @@ extern int devres_release_group(struct device *dev, void *id);
extern void *devm_kzalloc(struct device *dev, size_t size, gfp_t gfp);
extern void devm_kfree(struct device *dev, void *p);

-void __iomem *devm_ioremap_resource(struct device *dev, struct resource *res);
+void __iomem *__devm_ioremap_resource(struct device *dev, struct resource *res,
+ bool exec);
+static inline void __iomem *devm_ioremap_resource(struct device *dev,
+ struct resource *res)
+{
+ return __devm_ioremap_resource(dev, res, false);
+}
void __iomem *devm_request_and_ioremap(struct device *dev,
struct resource *res);

+static inline void __iomem *devm_ioremap_exec_resource(struct device *dev,
+ struct resource *res)
+{
+ return __devm_ioremap_resource(dev, res, true);
+}
+
+void __iomem *devm_request_and_ioremap_exec(struct device *dev,
+ struct resource *res);
+
/* allows to add/remove a custom action to devres stack */
int devm_add_action(struct device *dev, void (*action)(void *), void *data);
void devm_remove_action(struct device *dev, void (*action)(void *), void *data);
diff --git a/include/linux/io.h b/include/linux/io.h
index f4f42fa..582b207 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -62,6 +62,10 @@ void __iomem *devm_ioremap(struct device *dev, resource_size_t offset,
unsigned long size);
void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t offset,
unsigned long size);
+void __iomem *devm_ioremap_exec(struct device *dev, resource_size_t offset,
+ unsigned long size);
+void __iomem *devm_ioremap_exec_nocache(struct device *dev, resource_size_t offset,
+ unsigned long size);
void devm_iounmap(struct device *dev, void __iomem *addr);
int check_signature(const volatile void __iomem *io_addr,
const unsigned char *signature, int length);
diff --git a/lib/devres.c b/lib/devres.c
index 8235331..06cafe5 100644
--- a/lib/devres.c
+++ b/lib/devres.c
@@ -72,6 +72,64 @@ void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t offset,
EXPORT_SYMBOL(devm_ioremap_nocache);

/**
+ * devm_ioremap_exec - Managed ioremap_exec()
+ * @dev: Generic device to remap IO address for
+ * @offset: BUS offset to map
+ * @size: Size of map
+ *
+ * Managed ioremap_exec(). Map is automatically unmapped on driver detach.
+ */
+void __iomem *devm_ioremap_exec(struct device *dev, resource_size_t offset,
+ unsigned long size)
+{
+ void __iomem **ptr, *addr;
+
+ ptr = devres_alloc(devm_ioremap_release, sizeof(*ptr), GFP_KERNEL);
+ if (!ptr)
+ return NULL;
+
+ addr = ioremap_exec(offset, size);
+ if (addr) {
+ *ptr = addr;
+ devres_add(dev, ptr);
+ } else
+ devres_free(ptr);
+
+ return addr;
+}
+EXPORT_SYMBOL(devm_ioremap_exec);
+
+/**
+ * devm_ioremap_exec_nocache - Managed ioremap_exec_nocache()
+ * @dev: Generic device to remap IO address for
+ * @offset: BUS offset to map
+ * @size: Size of map
+ *
+ * Managed ioremap_exec_nocache(). Map is automatically unmapped on driver
+ * detach.
+ */
+void __iomem *devm_ioremap_exec_nocache(struct device *dev,
+ resource_size_t offset,
+ unsigned long size)
+{
+ void __iomem **ptr, *addr;
+
+ ptr = devres_alloc(devm_ioremap_release, sizeof(*ptr), GFP_KERNEL);
+ if (!ptr)
+ return NULL;
+
+ addr = ioremap_exec_nocache(offset, size);
+ if (addr) {
+ *ptr = addr;
+ devres_add(dev, ptr);
+ } else
+ devres_free(ptr);
+
+ return addr;
+}
+EXPORT_SYMBOL(devm_ioremap_exec_nocache);
+
+/**
* devm_iounmap - Managed iounmap()
* @dev: Generic device to unmap for
* @addr: Address to unmap
@@ -104,7 +162,8 @@ EXPORT_SYMBOL(devm_iounmap);
* if (IS_ERR(base))
* return PTR_ERR(base);
*/
-void __iomem *devm_ioremap_resource(struct device *dev, struct resource *res)
+void __iomem *__devm_ioremap_resource(struct device *dev, struct resource *res,
+ bool exec)
{
resource_size_t size;
const char *name;
@@ -125,7 +184,11 @@ void __iomem *devm_ioremap_resource(struct device *dev, struct resource *res)
return ERR_PTR(-EBUSY);
}

- if (res->flags & IORESOURCE_CACHEABLE)
+ if (exec && res->flags & IORESOURCE_CACHEABLE)
+ dest_ptr = devm_ioremap_exec(dev, res->start, size);
+ else if (exec)
+ dest_ptr = devm_ioremap_exec_nocache(dev, res->start, size);
+ else if (res->flags & IORESOURCE_CACHEABLE)
dest_ptr = devm_ioremap(dev, res->start, size);
else
dest_ptr = devm_ioremap_nocache(dev, res->start, size);
@@ -138,7 +201,7 @@ void __iomem *devm_ioremap_resource(struct device *dev, struct resource *res)

return dest_ptr;
}
-EXPORT_SYMBOL(devm_ioremap_resource);
+EXPORT_SYMBOL(__devm_ioremap_resource);

/**
* devm_request_and_ioremap() - Check, request region, and ioremap resource
@@ -168,6 +231,34 @@ void __iomem *devm_request_and_ioremap(struct device *device,
}
EXPORT_SYMBOL(devm_request_and_ioremap);

+/**
+ * devm_request_and_ioremap_exec() - Check, request region, and ioremap resource
+ * @dev: Generic device to handle the resource for
+ * @res: resource to be handled
+ *
+ * Takes all necessary steps to ioremap a mem resource. Uses managed device, so
+ * everything is undone on driver detach. Checks arguments, so you can feed
+ * it the result from e.g. platform_get_resource() directly. Returns the
+ * remapped pointer or NULL on error. Usage example:
+ *
+ * res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ * base = devm_request_and_ioremap_exec(&pdev->dev, res);
+ * if (!base)
+ * return -EADDRNOTAVAIL;
+ */
+void __iomem *devm_request_and_ioremap_exec(struct device *device,
+ struct resource *res)
+{
+ void __iomem *dest_ptr;
+
+ dest_ptr = devm_ioremap_exec_resource(device, res);
+ if (IS_ERR(dest_ptr))
+ return NULL;
+
+ return dest_ptr;
+}
+EXPORT_SYMBOL(devm_request_and_ioremap_exec);
+
#ifdef CONFIG_HAS_IOPORT
/*
* Generic iomap devres
--
1.8.3.2

2013-09-17 12:44:05

by Russ Dill

Subject: [RFC PATCH 04/11] asm-generic: fncpy: Add function copying macros

On some architectures (ARM, for example) function pointers cannot be
used naively as addresses. Specifically, a pointer to a Thumb function
has bit 0 set, although the function body starts at the address with
that bit cleared.

Add a fncpy macro to perform function copies correctly
along with two helpers, fnptr_to_addr and fnptr_translate.
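
The alignment check can be sketched in userspace as follows. As in this
patch, the alignment is expressed as a power-of-two exponent, so 3 means
an 8-byte boundary. The SKETCH_ and sketch_ names are hypothetical and
the function takes plain addresses, with the Thumb bit of the source
assumed to be cleared already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Alignment as a power-of-two exponent: 3 means an 8-byte boundary. */
#define SKETCH_FNCPY_ALIGN 3
#define SKETCH_FNCPY_MASK (((uintptr_t)1 << SKETCH_FNCPY_ALIGN) - 1)

static_assert(SKETCH_FNCPY_MASK == 7, "exponent 3 must give a mask of 7");

/*
 * Return true when either address breaks the alignment requirement. The
 * source address is taken with the Thumb bit already cleared.
 */
static inline bool sketch_fn_dest_invalid(uintptr_t src_addr, uintptr_t dest)
{
	return (dest & SKETCH_FNCPY_MASK) || (src_addr & SKETCH_FNCPY_MASK);
}
```

With an exponent of 0, the generic default, the mask is 0 and no address
is ever rejected.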

Signed-off-by: Russ Dill <[email protected]>
---
arch/alpha/include/asm/fncpy.h | 1 +
arch/arc/include/asm/fncpy.h | 1 +
arch/arm/include/asm/fncpy.h | 76 +++++++-------------------
arch/arm/plat-omap/sram.c | 2 +-
arch/arm64/include/asm/fncpy.h | 1 +
arch/avr32/include/asm/fncpy.h | 1 +
arch/blackfin/include/asm/fncpy.h | 1 +
arch/c6x/include/asm/fncpy.h | 1 +
arch/cris/include/asm/fncpy.h | 1 +
arch/frv/include/asm/fncpy.h | 1 +
arch/h8300/include/asm/fncpy.h | 1 +
arch/hexagon/include/asm/fncpy.h | 1 +
arch/ia64/include/asm/fncpy.h | 1 +
arch/m32r/include/asm/fncpy.h | 1 +
arch/m68k/include/asm/fncpy.h | 1 +
arch/metag/include/asm/fncpy.h | 1 +
arch/microblaze/include/asm/fncpy.h | 1 +
arch/mips/include/asm/fncpy.h | 1 +
arch/mn10300/include/asm/fncpy.h | 1 +
arch/openrisc/include/asm/fncpy.h | 1 +
arch/parisc/include/asm/fncpy.h | 1 +
arch/powerpc/include/asm/fncpy.h | 1 +
arch/s390/include/asm/fncpy.h | 1 +
arch/score/include/asm/fncpy.h | 1 +
arch/sh/include/asm/fncpy.h | 1 +
arch/sparc/include/asm/fncpy.h | 1 +
arch/tile/include/asm/fncpy.h | 1 +
arch/um/include/asm/fncpy.h | 1 +
arch/unicore32/include/asm/fncpy.h | 1 +
arch/x86/include/asm/fncpy.h | 1 +
arch/xtensa/include/asm/fncpy.h | 1 +
include/asm-generic/fncpy.h | 104 ++++++++++++++++++++++++++++++++++++
32 files changed, 154 insertions(+), 57 deletions(-)
create mode 100644 arch/alpha/include/asm/fncpy.h
create mode 100644 arch/arc/include/asm/fncpy.h
create mode 100644 arch/arm64/include/asm/fncpy.h
create mode 100644 arch/avr32/include/asm/fncpy.h
create mode 100644 arch/blackfin/include/asm/fncpy.h
create mode 100644 arch/c6x/include/asm/fncpy.h
create mode 100644 arch/cris/include/asm/fncpy.h
create mode 100644 arch/frv/include/asm/fncpy.h
create mode 100644 arch/h8300/include/asm/fncpy.h
create mode 100644 arch/hexagon/include/asm/fncpy.h
create mode 100644 arch/ia64/include/asm/fncpy.h
create mode 100644 arch/m32r/include/asm/fncpy.h
create mode 100644 arch/m68k/include/asm/fncpy.h
create mode 100644 arch/metag/include/asm/fncpy.h
create mode 100644 arch/microblaze/include/asm/fncpy.h
create mode 100644 arch/mips/include/asm/fncpy.h
create mode 100644 arch/mn10300/include/asm/fncpy.h
create mode 100644 arch/openrisc/include/asm/fncpy.h
create mode 100644 arch/parisc/include/asm/fncpy.h
create mode 100644 arch/powerpc/include/asm/fncpy.h
create mode 100644 arch/s390/include/asm/fncpy.h
create mode 100644 arch/score/include/asm/fncpy.h
create mode 100644 arch/sh/include/asm/fncpy.h
create mode 100644 arch/sparc/include/asm/fncpy.h
create mode 100644 arch/tile/include/asm/fncpy.h
create mode 100644 arch/um/include/asm/fncpy.h
create mode 100644 arch/unicore32/include/asm/fncpy.h
create mode 100644 arch/x86/include/asm/fncpy.h
create mode 100644 arch/xtensa/include/asm/fncpy.h
create mode 100644 include/asm-generic/fncpy.h

diff --git a/arch/alpha/include/asm/fncpy.h b/arch/alpha/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/alpha/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/arc/include/asm/fncpy.h b/arch/arc/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/arc/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/arm/include/asm/fncpy.h b/arch/arm/include/asm/fncpy.h
index de53547..f165f20 100644
--- a/arch/arm/include/asm/fncpy.h
+++ b/arch/arm/include/asm/fncpy.h
@@ -17,16 +17,12 @@
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/

+#ifndef __ASM_FNCPY_H
+#define __ASM_FNCPY_H
+
+#include <linux/types.h>
+
/*
- * These macros are intended for use when there is a need to copy a low-level
- * function body into special memory.
- *
- * For example, when reconfiguring the SDRAM controller, the code doing the
- * reconfiguration may need to run from SRAM.
- *
- * NOTE: that the copied function body must be entirely self-contained and
- * position-independent in order for this to work properly.
- *
* NOTE: in order for embedded literals and data to get referenced correctly,
* the alignment of functions must be preserved when copying. To ensure this,
* the source and destination addresses for fncpy() must be aligned to a
@@ -34,61 +30,29 @@
* You will typically need a ".align 3" directive in the assembler where the
* function to be copied is defined, and ensure that your allocator for the
* destination buffer returns 8-byte-aligned pointers.
- *
- * Typical usage example:
- *
- * extern int f(args);
- * extern uint32_t size_of_f;
- * int (*copied_f)(args);
- * void *sram_buffer;
- *
- * copied_f = fncpy(sram_buffer, &f, size_of_f);
- *
- * ... later, call the function: ...
- *
- * copied_f(args);
- *
- * The size of the function to be copied can't be determined from C:
- * this must be determined by other means, such as adding assmbler directives
- * in the file where f is defined.
- */
+*/
+#define ARCH_FNCPY_ALIGN 3

-#ifndef __ASM_FNCPY_H
-#define __ASM_FNCPY_H
-
-#include <linux/types.h>
-#include <linux/string.h>
-
-#include <asm/bug.h>
-#include <asm/cacheflush.h>
-
-/*
- * Minimum alignment requirement for the source and destination addresses
- * for function copying.
- */
-#define FNCPY_ALIGN 8
-
-#define fncpy(dest_buf, funcp, size) ({ \
+/* Clear the Thumb bit */
+#define fnptr_to_addr(funcp) ({ \
uintptr_t __funcp_address; \
- typeof(funcp) __result; \
\
asm("" : "=r" (__funcp_address) : "0" (funcp)); \
+ __funcp_address & ~1; \
+})
+
+/* Put the Thumb bit back */
+#define fnptr_translate(orig_funcp, new_addr) ({ \
+ uintptr_t __funcp_address; \
+ typeof(orig_funcp) __result; \
\
- /* \
- * Ensure alignment of source and destination addresses, \
- * disregarding the function's Thumb bit: \
- */ \
- BUG_ON((uintptr_t)(dest_buf) & (FNCPY_ALIGN - 1) || \
- (__funcp_address & ~(uintptr_t)1 & (FNCPY_ALIGN - 1))); \
- \
- memcpy(dest_buf, (void const *)(__funcp_address & ~1), size); \
- flush_icache_range((unsigned long)(dest_buf), \
- (unsigned long)(dest_buf) + (size)); \
- \
+ asm("" : "=r" (__funcp_address) : "0" (orig_funcp)); \
asm("" : "=r" (__result) \
- : "0" ((uintptr_t)(dest_buf) | (__funcp_address & 1))); \
+ : "0" ((uintptr_t)(new_addr) | (__funcp_address & 1))); \
\
__result; \
})

+#include <asm-generic/fncpy.h>
+
#endif /* !__ASM_FNCPY_H */
diff --git a/arch/arm/plat-omap/sram.c b/arch/arm/plat-omap/sram.c
index a5bc92d..90ccd74 100644
--- a/arch/arm/plat-omap/sram.c
+++ b/arch/arm/plat-omap/sram.c
@@ -54,7 +54,7 @@ void *omap_sram_push_address(unsigned long size)
}

new_ceil -= size;
- new_ceil = ROUND_DOWN(new_ceil, FNCPY_ALIGN);
+ new_ceil = ROUND_DOWN(new_ceil, 1 << ARCH_FNCPY_ALIGN);
omap_sram_ceil = IOMEM(new_ceil);

return (void *)omap_sram_ceil;
diff --git a/arch/arm64/include/asm/fncpy.h b/arch/arm64/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/arm64/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/avr32/include/asm/fncpy.h b/arch/avr32/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/avr32/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/blackfin/include/asm/fncpy.h b/arch/blackfin/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/blackfin/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/c6x/include/asm/fncpy.h b/arch/c6x/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/c6x/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/cris/include/asm/fncpy.h b/arch/cris/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/cris/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/frv/include/asm/fncpy.h b/arch/frv/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/frv/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/h8300/include/asm/fncpy.h b/arch/h8300/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/h8300/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/hexagon/include/asm/fncpy.h b/arch/hexagon/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/hexagon/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/ia64/include/asm/fncpy.h b/arch/ia64/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/ia64/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/m32r/include/asm/fncpy.h b/arch/m32r/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/m32r/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/m68k/include/asm/fncpy.h b/arch/m68k/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/m68k/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/metag/include/asm/fncpy.h b/arch/metag/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/metag/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/microblaze/include/asm/fncpy.h b/arch/microblaze/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/microblaze/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/mips/include/asm/fncpy.h b/arch/mips/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/mips/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/mn10300/include/asm/fncpy.h b/arch/mn10300/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/mn10300/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/openrisc/include/asm/fncpy.h b/arch/openrisc/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/openrisc/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/parisc/include/asm/fncpy.h b/arch/parisc/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/parisc/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/powerpc/include/asm/fncpy.h b/arch/powerpc/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/powerpc/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/s390/include/asm/fncpy.h b/arch/s390/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/s390/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/score/include/asm/fncpy.h b/arch/score/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/score/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/sh/include/asm/fncpy.h b/arch/sh/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/sh/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/sparc/include/asm/fncpy.h b/arch/sparc/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/sparc/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/tile/include/asm/fncpy.h b/arch/tile/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/tile/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/um/include/asm/fncpy.h b/arch/um/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/um/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/unicore32/include/asm/fncpy.h b/arch/unicore32/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/unicore32/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/x86/include/asm/fncpy.h b/arch/x86/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/x86/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/arch/xtensa/include/asm/fncpy.h b/arch/xtensa/include/asm/fncpy.h
new file mode 100644
index 0000000..ee4741c
--- /dev/null
+++ b/arch/xtensa/include/asm/fncpy.h
@@ -0,0 +1 @@
+#include <asm-generic/fncpy.h>
diff --git a/include/asm-generic/fncpy.h b/include/asm-generic/fncpy.h
new file mode 100644
index 0000000..1a25282
--- /dev/null
+++ b/include/asm-generic/fncpy.h
@@ -0,0 +1,104 @@
+/*
+ * include/asm-generic/fncpy.h - helper macros for function body copying
+ *
+ * Copyright (C) 2011 Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+/*
+ * These macros are intended for use when there is a need to copy a low-level
+ * function body into special memory.
+ *
+ * For example, when reconfiguring the SDRAM controller, the code doing the
+ * reconfiguration may need to run from SRAM.
+ *
+ * NOTE: the copied function body must be entirely self-contained and
+ * position-independent in order for this to work properly.
+ *
+ * Typical usage example:
+ *
+ * extern int f(args);
+ * extern uint32_t size_of_f;
+ * int (*copied_f)(args);
+ * void *sram_buffer;
+ *
+ * copied_f = fncpy(sram_buffer, &f, size_of_f);
+ *
+ * ... later, call the function: ...
+ *
+ * copied_f(args);
+ *
+ * The size of the function to be copied can't be determined from C:
+ * this must be determined by other means, such as adding assembler directives
+ * in the file where f is defined.
+ */
+
+#ifndef __ASM_GENERIC_FNCPY_H
+#define __ASM_GENERIC_FNCPY_H
+
+#include <linux/types.h>
+#include <linux/string.h>
+
+#include <asm/bug.h>
+#include <asm/cacheflush.h>
+
+/*
+ * Minimum alignment requirement for the source and destination addresses
+ * for function copying.
+ */
+#ifndef ARCH_FNCPY_ALIGN
+#define ARCH_FNCPY_ALIGN 0
+#endif
+
+#define ARCH_FNCPY_MASK ((1 << (ARCH_FNCPY_ALIGN)) - 1)
+
+#ifndef fnptr_to_addr
+#define fnptr_to_addr(funcp) ({ \
+ (uintptr_t) (funcp); \
+})
+#endif
+
+#ifndef fnptr_translate
+#define fnptr_translate(orig_funcp, new_addr) ({ \
+ (typeof(orig_funcp)) (new_addr); \
+})
+#endif
+
+/* Ensure alignment of source and destination addresses */
+#ifndef fn_dest_invalid
+#define fn_dest_invalid(funcp, dest_buf) ({ \
+ uintptr_t __funcp_address; \
+ \
+ __funcp_address = fnptr_to_addr(funcp); \
+ \
+ ((uintptr_t)(dest_buf) & ARCH_FNCPY_MASK) || \
+ (__funcp_address & ARCH_FNCPY_MASK); \
+})
+#endif
+
+#ifndef fncpy
+#define fncpy(dest_buf, funcp, size) ({ \
+ BUG_ON(fn_dest_invalid(funcp, dest_buf)); \
+ \
+ memcpy(dest_buf, (void const *)(funcp), size); \
+ flush_icache_range((unsigned long)(dest_buf), \
+ (unsigned long)(dest_buf) + (size)); \
+ \
+ fnptr_translate(funcp, dest_buf); \
+})
+#endif
+
+#endif /* !__ASM_GENERIC_FNCPY_H */
+
--
1.8.3.2

2013-09-17 12:44:09

by Russ Dill

Subject: [RFC PATCH 06/11] ARM: PIE: Add position independent executable embedding to ARM

Add support to ARM for embedding PIEs into the kernel, loading them into
genalloc pools (such as SRAM) and executing them. Support for ARM means
performing R_ARM_RELATIVE fixups within the .rel.dyn section.
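
As a rough illustration of what an R_ARM_RELATIVE fixup amounts to, here is
a user-space sketch (not the kernel code from this patch). It assumes each
relocation record names the byte offset of one pointer-sized word, and that
relocating means adding the load-address delta to that word. The names
demo_rel and demo_apply_relative are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for Elf32_Rel; only the offset matters here. */
struct demo_rel {
	uint32_t r_offset;	/* byte offset of a pointer inside the image */
};

/*
 * Add 'delta' (new load address minus old load address) to every
 * pointer-sized word named by a relocation record. memcpy() is used so
 * the sketch does not depend on the alignment of 'image'.
 */
void demo_apply_relative(uint8_t *image, size_t image_size,
			 const struct demo_rel *rel, size_t count,
			 uintptr_t delta)
{
	size_t i;

	for (i = 0; i < count; i++) {
		uintptr_t word;

		/* The record must point inside the loaded image. */
		assert(rel[i].r_offset + sizeof(word) <= image_size);

		memcpy(&word, image + rel[i].r_offset, sizeof(word));
		word += delta;
		memcpy(image + rel[i].r_offset, &word, sizeof(word));
	}
}
```

Because the operation is a plain addition, calling it again with the
negated delta restores the original words, which is why the fixups can be
reapplied when the code is moved.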

Signed-off-by: Russ Dill <[email protected]>
---
arch/arm/Kconfig | 1 +
arch/arm/Makefile | 5 +++
arch/arm/include/asm/elf.h | 1 +
arch/arm/kernel/.gitignore | 1 +
arch/arm/kernel/Makefile | 4 ++-
arch/arm/kernel/pie.c | 83 +++++++++++++++++++++++++++++++++++++++++++
arch/arm/kernel/pie.lds.S | 40 +++++++++++++++++++++
arch/arm/kernel/vmlinux.lds.S | 2 ++
arch/arm/libpie/.gitignore | 3 ++
arch/arm/libpie/Makefile | 32 +++++++++++++++++
arch/arm/libpie/empty.S | 12 +++++++
11 files changed, 183 insertions(+), 1 deletion(-)
create mode 100644 arch/arm/kernel/pie.c
create mode 100644 arch/arm/kernel/pie.lds.S
create mode 100644 arch/arm/libpie/.gitignore
create mode 100644 arch/arm/libpie/Makefile
create mode 100644 arch/arm/libpie/empty.S

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 43594d5..de7b7603 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -49,6 +49,7 @@ config ARM
select HAVE_MEMBLOCK
select HAVE_OPROFILE if (HAVE_PERF_EVENTS)
select HAVE_PERF_EVENTS
+ select HAVE_PIE
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_UID16
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 6fd2cea..a673d36 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -128,6 +128,8 @@ KBUILD_AFLAGS +=$(CFLAGS_ABI) $(AFLAGS_ISA) $(arch-y) $(tune-y) -include asm/uni

CHECKFLAGS += -D__arm__

+OBJCOPY_OUTPUT_FORMAT := elf32-littlearm
+
#Default value
head-y := arch/arm/kernel/head$(MMUEXT).o
textofs-y := 0x00008000
@@ -273,6 +275,9 @@ drivers-$(CONFIG_OPROFILE) += arch/arm/oprofile/

libs-y := arch/arm/lib/ $(libs-y)

+PIE_LDS := arch/arm/kernel/pie.lds
+libpie-$(CONFIG_PIE) += arch/arm/libpie/
+
# Default target when executing plain make
ifeq ($(CONFIG_XIP_KERNEL),y)
KBUILD_IMAGE := xipImage
diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index 56211f2..a8d036b 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -50,6 +50,7 @@ typedef struct user_fp elf_fpregset_t;
#define R_ARM_NONE 0
#define R_ARM_PC24 1
#define R_ARM_ABS32 2
+#define R_ARM_RELATIVE 23
#define R_ARM_CALL 28
#define R_ARM_JUMP24 29
#define R_ARM_V4BX 40
diff --git a/arch/arm/kernel/.gitignore b/arch/arm/kernel/.gitignore
index c5f676c..a055a48 100644
--- a/arch/arm/kernel/.gitignore
+++ b/arch/arm/kernel/.gitignore
@@ -1 +1,2 @@
vmlinux.lds
+pie.lds
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 86d10dd..652312e 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -96,4 +96,6 @@ obj-y += psci.o
obj-$(CONFIG_SMP) += psci_smp.o
endif

-extra-y := $(head-y) vmlinux.lds
+obj-$(CONFIG_PIE) += pie.o
+
+extra-y := $(head-y) vmlinux.lds pie.lds
diff --git a/arch/arm/kernel/pie.c b/arch/arm/kernel/pie.c
new file mode 100644
index 0000000..5dff5d6
--- /dev/null
+++ b/arch/arm/kernel/pie.c
@@ -0,0 +1,83 @@
+/*
+ * Copyright 2013 Texas Instruments, Inc.
+ * Russ Dill <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/pie.h>
+#include <linux/elf.h>
+
+#include <asm/elf.h>
+
+extern char __pie_rel_dyn_start[];
+extern char __pie_rel_dyn_end[];
+extern char __pie_tail_offset[];
+
+struct arm_pie_tail {
+ int count;
+ uintptr_t offset[0];
+};
+
+int pie_arch_fill_tail(void *tail, void *common_start, void *common_end,
+ void *overlay_start, void *code_start, void *code_end)
+{
+ Elf32_Rel *rel;
+ int records;
+ int i;
+ struct arm_pie_tail *pie_tail = tail;
+ int count;
+
+ rel = (Elf32_Rel *) __pie_rel_dyn_start;
+ records = (__pie_rel_dyn_end - __pie_rel_dyn_start) /
+ sizeof(*rel);
+
+ count = 0;
+ for (i = 0; i < records; i++, rel++) {
+ void *kern_off;
+ if (ELF32_R_TYPE(rel->r_info) != R_ARM_RELATIVE)
+ return -ENOEXEC;
+
+ /* Adjust offset to match area in kernel */
+ kern_off = common_start + rel->r_offset;
+
+ if (kern_off >= common_start && kern_off < common_end) {
+ if (tail)
+ pie_tail->offset[count] = rel->r_offset;
+ count++;
+ } else if (kern_off >= code_start && kern_off < code_end) {
+ if (tail)
+ pie_tail->offset[count] = rel->r_offset -
+ (code_start - overlay_start);
+ count++;
+ }
+ }
+
+ if (tail)
+ pie_tail->count = count;
+
+ return count * sizeof(uintptr_t) + sizeof(*pie_tail);
+}
+EXPORT_SYMBOL_GPL(pie_arch_fill_tail);
+
+/*
+ * R_ARM_RELATIVE: B(S) + A
+ * B(S) - Addressing origin of the output segment defining the symbol S.
+ * A - Addend for the relocation.
+ */
+int pie_arch_fixup(struct pie_chunk *chunk, void *base, void *tail,
+ unsigned long offset)
+{
+ struct arm_pie_tail *pie_tail = tail;
+ int i;
+
+ /* Perform relocation fixups for given offset */
+ for (i = 0; i < pie_tail->count; i++)
+ *((uintptr_t *) (pie_tail->offset[i] + base)) += offset;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(pie_arch_fixup);
diff --git a/arch/arm/kernel/pie.lds.S b/arch/arm/kernel/pie.lds.S
new file mode 100644
index 0000000..4fd5ac5
--- /dev/null
+++ b/arch/arm/kernel/pie.lds.S
@@ -0,0 +1,40 @@
+/*
+ * ld script to make ARM PIEs
+ * taken from the ARM vmlinux.lds.S version by Russ Dill <[email protected]>.
+ */
+
+#include <asm-generic/pie.lds.h>
+
+OUTPUT_ARCH(arm)
+
+SECTIONS
+{
+ . = 0x0;
+
+ PIE_COMMON_START
+ .got.plt : {
+ *(.got)
+ *(.got.plt)
+ }
+ .text : {
+ PIE_TEXT_TEXT
+ }
+ PIE_COMMON_END
+
+ PIE_OVERLAY_START
+ OVERLAY : NOCROSSREFS {
+ }
+ PIE_OVERLAY_SEND
+
+ __pie_rel_dyn_start : {
+ VMLINUX_SYMBOL(__pie_rel_dyn_start) = .;
+ }
+ .rel.dyn : {
+ KEEP(*(.rel*))
+ }
+ __pie_rel_dyn_end : {
+ VMLINUX_SYMBOL(__pie_rel_dyn_end) = .;
+ }
+
+ PIE_DISCARDS
+}
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 7bcee5c..8c11235 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -77,6 +77,8 @@ SECTIONS
#ifndef CONFIG_SMP_ON_UP
*(.alt.smp.init)
#endif
+ *(.pie.*)
+ *(.ARM.exidx.pie.*.text)
*(.discard)
*(.discard.*)
}
diff --git a/arch/arm/libpie/.gitignore b/arch/arm/libpie/.gitignore
new file mode 100644
index 0000000..02e3cd5
--- /dev/null
+++ b/arch/arm/libpie/.gitignore
@@ -0,0 +1,3 @@
+lib1funcs.S
+ashldi3.S
+string.c
diff --git a/arch/arm/libpie/Makefile b/arch/arm/libpie/Makefile
new file mode 100644
index 0000000..5662e99
--- /dev/null
+++ b/arch/arm/libpie/Makefile
@@ -0,0 +1,32 @@
+#
+# linux/arch/arm/libpie/Makefile
+#
+ccflags-y := -fpic -mno-single-pic-base -fno-builtin
+
+obj-y := empty.o
+obj-y += lib1funcs.o ashldi3.o string.o
+
+# string library code (-Os is enforced to keep it much smaller)
+string = $(obj)/string.o
+CFLAGS_string.o := -Os
+
+$(obj)/string.c: $(srctree)/arch/$(SRCARCH)/boot/compressed/string.c
+ $(call cmd,shipped)
+
+# For __aeabi_uidivmod
+lib1funcs = $(obj)/lib1funcs.o
+
+$(obj)/lib1funcs.S: $(srctree)/arch/$(SRCARCH)/lib/lib1funcs.S
+ $(call cmd,shipped)
+
+# For __aeabi_llsl
+ashldi3 = $(obj)/ashldi3.o
+
+$(obj)/ashldi3.S: $(srctree)/arch/$(SRCARCH)/lib/ashldi3.S
+ $(call cmd,shipped)
+
+$(obj)/libpie.o: $(string) $(lib1funcs) $(ashldi3) $(addprefix $(obj)/,$(OBJS))
+ $(call if_changed,ld)
+
+# Make sure files are removed during clean
+extra-y += string.c lib1funcs.S ashldi3.S
diff --git a/arch/arm/libpie/empty.S b/arch/arm/libpie/empty.S
new file mode 100644
index 0000000..2416862
--- /dev/null
+++ b/arch/arm/libpie/empty.S
@@ -0,0 +1,12 @@
+#include <linux/linkage.h>
+
+ENTRY(__div0)
+ENTRY(__aeabi_unwind_cpp_pr0)
+ENTRY(__aeabi_unwind_cpp_pr1)
+ENTRY(__aeabi_unwind_cpp_pr2)
+ mov pc, lr
+ENDPROC(__div0)
+ENDPROC(__aeabi_unwind_cpp_pr0)
+ENDPROC(__aeabi_unwind_cpp_pr1)
+ENDPROC(__aeabi_unwind_cpp_pr2)
+
--
1.8.3.2

2013-09-17 12:44:12

by Russ Dill

Subject: [RFC PATCH 08/11] ARM: PIE: Add macro for generating PIE resume trampoline

Add a helper that generates a short snippet of code that updates PIE
relocations, loads the stack pointer and calls a C (or asm) function.
The code gets placed into a PIE section.
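
To show the shape of the generated code, here is a portable user-space
sketch of a macro that emits the same two-stage trampoline. It is only an
approximation: the stack pointer load is left out because it cannot be
written in portable C, and demo_relocate() is a hypothetical stand-in for
pie_relocate_from_pie(). All demo_ names are invented for this example.

```c
#include <assert.h>

/* Hypothetical log of the order in which the stages ran. */
int demo_step;
int demo_relocated_at;
int demo_resumed_at;

/* Stand-in for pie_relocate_from_pie(): redo fixups for the new address. */
void demo_relocate(void)
{
	demo_relocated_at = ++demo_step;
}

/*
 * Generate proc_resume_trampoline(), which fixes up relocations first and
 * only then enters the second stage that calls func. The real macro also
 * loads sp from 'stack' in the second stage; that step is omitted here.
 */
#define DEMO_PIE_RESUME(proc, func)				\
void proc##_resume_trampoline2(void)				\
{								\
	func();							\
}								\
								\
void proc##_resume_trampoline(void)				\
{								\
	demo_relocate();					\
	proc##_resume_trampoline2();				\
}

void demo_soc_resume(void)
{
	/* Relocations must already be valid when the C code runs. */
	assert(demo_relocated_at != 0);
	demo_resumed_at = ++demo_step;
}

DEMO_PIE_RESUME(demo_soc, demo_soc_resume)
```

Calling demo_soc_resume_trampoline() runs the relocation step before the
resume function. Splitting the work into two functions keeps the
relocation call separate from the stage that switches stacks.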

Signed-off-by: Russ Dill <[email protected]>
---
arch/arm/include/asm/suspend.h | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/arch/arm/include/asm/suspend.h b/arch/arm/include/asm/suspend.h
index cd20029..92996f0 100644
--- a/arch/arm/include/asm/suspend.h
+++ b/arch/arm/include/asm/suspend.h
@@ -1,6 +1,8 @@
#ifndef __ASM_ARM_SUSPEND_H
#define __ASM_ARM_SUSPEND_H

+#include <asm/pie.h>
+
struct sleep_save_sp {
u32 *save_ptr_stash;
u32 save_ptr_stash_phys;
@@ -9,4 +11,27 @@ struct sleep_save_sp {
extern void cpu_resume(void);
extern int cpu_suspend(unsigned long, int (*)(unsigned long));

+/**
+ * ARM_PIE_RESUME - generate a PIE trampoline for resume
+ * @proc: SoC, should match argument used with PIE_OVERLAY_SECTION()
+ * @func: C or asm function to call at resume
+ * @stack: stack to use before calling func
+ */
+#define ARM_PIE_RESUME(proc, func, stack) \
+static void __naked __noreturn __pie(proc) proc##_resume_trampoline2(void) \
+{ \
+ __asm__ __volatile__( \
+ " mov sp, %0\n" \
+ : : "r"((stack)) : "sp"); \
+ \
+ func(); \
+} \
+ \
+void __naked __noreturn __pie(proc) proc##_resume_trampoline(void) \
+{ \
+ pie_relocate_from_pie(); \
+ proc##_resume_trampoline2(); \
+} \
+EXPORT_PIE_SYMBOL(proc##_resume_trampoline)
+
#endif
--
1.8.3.2

2013-09-17 12:44:21

by Russ Dill

Subject: [RFC PATCH 10/11] ARM: OMAP2+: AM33XX: Add PIE support for AM33XX

This enables CONFIG_PIE for omap2plus_defconfig and adds
an am33xx PIE section group. This is necessary for am33xx
suspend/resume code as it is written in C.

Signed-off-by: Russ Dill <[email protected]>
---
arch/arm/configs/omap2plus_defconfig | 1 +
arch/arm/kernel/pie.lds.S | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/arm/configs/omap2plus_defconfig b/arch/arm/configs/omap2plus_defconfig
index 5d4c9b8..f342f85 100644
--- a/arch/arm/configs/omap2plus_defconfig
+++ b/arch/arm/configs/omap2plus_defconfig
@@ -102,6 +102,7 @@ CONFIG_SENSORS_TSL2550=m
CONFIG_SENSORS_LIS3_I2C=m
CONFIG_BMP085_I2C=m
CONFIG_SRAM=y
+CONFIG_PIE=y
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_SCSI_MULTI_LUN=y
diff --git a/arch/arm/kernel/pie.lds.S b/arch/arm/kernel/pie.lds.S
index 4fd5ac5..92fba61 100644
--- a/arch/arm/kernel/pie.lds.S
+++ b/arch/arm/kernel/pie.lds.S
@@ -23,6 +23,7 @@ SECTIONS

PIE_OVERLAY_START
OVERLAY : NOCROSSREFS {
+ PIE_OVERLAY_SECTION(am33xx)
}
PIE_OVERLAY_SEND

--
1.8.3.2

2013-09-17 12:44:54

by Russ Dill

Subject: [RFC PATCH 11/11] ARM: OMAP2+: AM33XX: Basic suspend resume support

From: Vaibhav Bedia <[email protected]>

AM335x supports various low power modes as documented
in section 8.1.4.3 of the AM335x TRM which is available
@ http://www.ti.com/litv/pdf/spruh73f

DeepSleep0 mode offers the lowest power mode with limited
wakeup sources without a system reboot and is mapped as
the suspend state in the kernel. In this state, MPU and
PER domains are turned off with the internal RAM held in
retention to facilitate resume process. As part of the boot
process, the assembly code is copied over to OCMCRAM using
the OMAP SRAM code.

AM335x has a Cortex-M3 (WKUP_M3) which assists the MPU
in DeepSleep0 entry and exit. WKUP_M3 takes care of the
clockdomain and powerdomain transitions based on the
intended low power state. MPU needs to load the appropriate
WKUP_M3 binary onto the WKUP_M3 memory space before it can
leverage any of the PM features like DeepSleep.

The IPC mechanism between MPU and WKUP_M3 uses a mailbox
sub-module and 8 IPC registers in the Control module. MPU
uses the assigned Mailbox for issuing an interrupt to
WKUP_M3 which then goes and checks the IPC registers for
the payload. WKUP_M3 has the ability to trigger an interrupt
to MPU by executing the "sev" instruction.
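
The two-step handshake (payload in the IPC registers, then a mailbox write
as a doorbell) can be modelled in user space roughly as follows. This is a
simulation only: the mapping of payload fields to register indices is an
assumption made for the example, and struct demo_hw merely stands in for
the Control module and mailbox hardware.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_IPC_REGS	8		/* eight IPC registers in the Control module */
#define DEMO_MBOX_DUMMY	0xABCDABCDu	/* content is ignored; the write raises the IRQ */

/* Simulated hardware: IPC registers plus a one-deep mailbox. */
struct demo_hw {
	uint32_t ipc[DEMO_IPC_REGS];
	uint32_t mbox_msg;
	int mbox_full;
};

/* Payload fields named after am33xx_pm->ipc; the register order is assumed. */
struct demo_ipc_data {
	uint32_t resume_addr;
	uint32_t sleep_mode;
	uint32_t param1;
	uint32_t param2;
};

/* Step 1: place the payload in the IPC registers. */
void demo_ipc_cmd(struct demo_hw *hw, const struct demo_ipc_data *d)
{
	hw->ipc[0] = d->resume_addr;
	hw->ipc[1] = d->sleep_mode;
	hw->ipc[2] = d->param1;
	hw->ipc[3] = d->param2;
}

/* Step 2: ring the doorbell. Returns 0 on success, -1 if the mailbox is busy. */
int demo_ping(struct demo_hw *hw)
{
	if (hw->mbox_full)
		return -1;
	hw->mbox_msg = DEMO_MBOX_DUMMY;
	hw->mbox_full = 1;
	return 0;
}

/* WKUP_M3 side: on the mailbox interrupt, drain it and read the command. */
uint32_t demo_m3_irq(struct demo_hw *hw)
{
	assert(hw->mbox_full);
	hw->mbox_full = 0;
	return hw->ipc[1];
}
```

The mailbox message carries no information of its own; the ordering
(registers first, doorbell second) is what guarantees the M3 sees a
complete payload when it takes the interrupt.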

In the current implementation, when the suspend process
is initiated, MPU interrupts the WKUP_M3 to let it know about
the intent of entering DeepSleep0 and waits for an ACK. When
the ACK is received, MPU continues with its suspend process
to suspend all the drivers and then jumps to assembly in
OCMC RAM. The assembly code puts the PLLs in bypass, puts the
external RAM in self-refresh mode and then finally executes the
WFI instruction. Execution of the WFI instruction triggers another
interrupt to the WKUP_M3, which then continues with the power-down
sequence wherein the clockdomain and powerdomain transitions take
place. As part of the sleep sequence, WKUP_M3 unmasks the interrupt
lines for the wakeup sources. WFI execution on WKUP_M3 causes the
hardware to disable the main oscillator of the SoC.

When a wakeup event occurs, WKUP_M3 starts the power-up
sequence by switching on the power domains and finally
enabling the clock to MPU. Since the MPU gets powered down
as part of the sleep sequence, the ROM code starts executing
in the resume path. The ROM code detects a wakeup from sleep
and then jumps to the resume location in OCMC, which was
populated in one of the IPC registers as part of the suspend
sequence.

The low level code in OCMC relocks the PLLs, enables access
to external RAM and then jumps to the cpu_resume code of
the kernel to finish the resume process.
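
The MPU-side bookkeeping for this handshake can be summarised as a small
state machine. The sketch below is a user-space model loosely following
the M3_STATE_* values and am33xx_txev_handler() in this patch; the 'acked'
flag stands in for the completion, and the firmware version check is left
out. The demo_ names are hypothetical.

```c
#include <assert.h>

enum demo_m3_state {
	DEMO_M3_UNKNOWN = 0,
	DEMO_M3_RESET,
	DEMO_M3_INITED,
	DEMO_M3_MSG_FOR_LP,
	DEMO_M3_MSG_FOR_RESET,
};

struct demo_pm {
	enum demo_m3_state state;
	int acked;	/* stands in for complete(&am33xx_pm_sync) */
};

/* MPU side: announce the DeepSleep0 intent, then wait for the ACK. */
void demo_pm_begin(struct demo_pm *pm)
{
	pm->acked = 0;
	pm->state = DEMO_M3_MSG_FOR_LP;
}

/* MPU side: after resume, ask the M3 to reset its state machine. */
void demo_pm_end(struct demo_pm *pm)
{
	pm->acked = 0;
	pm->state = DEMO_M3_MSG_FOR_RESET;
}

/* Called when the M3 signals the MPU (the TXEV event). */
void demo_txev(struct demo_pm *pm)
{
	assert(pm);

	switch (pm->state) {
	case DEMO_M3_RESET:
		/* First event after the firmware has booted. */
		pm->state = DEMO_M3_INITED;
		break;
	case DEMO_M3_MSG_FOR_RESET:
		pm->state = DEMO_M3_INITED;
		pm->acked = 1;
		break;
	case DEMO_M3_MSG_FOR_LP:
		pm->acked = 1;
		break;
	default:
		break;
	}
}
```

A suspend cycle in this model is demo_pm_begin(), one event from the M3,
the actual sleep, then demo_pm_end() and a second event that returns the
state to DEMO_M3_INITED.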

Signed-off-by: Vaibhav Bedia <[email protected]>
Signed-off-by: Dave Gerlach <[email protected]>
Signed-off-by: Russ Dill <[email protected]>
Cc: Tony Lindgren <[email protected]>
Cc: Santosh Shilimkar <[email protected]>
Cc: Benoit Cousson <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Kevin Hilman <[email protected]>
---
arch/arm/mach-omap2/Kconfig | 7 +-
arch/arm/mach-omap2/Makefile | 2 +
arch/arm/mach-omap2/board-generic.c | 1 +
arch/arm/mach-omap2/common.h | 10 +
arch/arm/mach-omap2/io.c | 5 +
arch/arm/mach-omap2/pm.c | 3 +-
arch/arm/mach-omap2/pm33xx.c | 486 ++++++++++++++++++++++++++++++++++++
arch/arm/mach-omap2/pm33xx.h | 68 +++++
arch/arm/mach-omap2/sleep33xx.c | 314 +++++++++++++++++++++++
arch/arm/mach-omap2/wkup_m3.c | 183 ++++++++++++++
10 files changed, 1076 insertions(+), 3 deletions(-)
create mode 100644 arch/arm/mach-omap2/pm33xx.c
create mode 100644 arch/arm/mach-omap2/pm33xx.h
create mode 100644 arch/arm/mach-omap2/sleep33xx.c
create mode 100644 arch/arm/mach-omap2/wkup_m3.c

diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig
index 3eed000..ef3fe40 100644
--- a/arch/arm/mach-omap2/Kconfig
+++ b/arch/arm/mach-omap2/Kconfig
@@ -67,11 +67,14 @@ config SOC_OMAP5
config SOC_AM33XX
bool "AM33XX support"
depends on ARCH_MULTI_V7
- select ARCH_OMAP2PLUS
+ default y
select ARM_CPU_SUSPEND if PM
+ select COMMON_CLK
select CPU_V7
+ select MAILBOX if PM
select MULTI_IRQ_HANDLER
- select COMMON_CLK
+ select OMAP_MBOX_FWK if PM
+ select OMAP2PLUS_MBOX if PM

config SOC_AM43XX
bool "TI AM43x"
diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
index d4f6715..42442c4 100644
--- a/arch/arm/mach-omap2/Makefile
+++ b/arch/arm/mach-omap2/Makefile
@@ -87,6 +87,7 @@ obj-$(CONFIG_ARCH_OMAP2) += sleep24xx.o
obj-$(CONFIG_ARCH_OMAP3) += pm34xx.o sleep34xx.o
obj-$(CONFIG_ARCH_OMAP4) += pm44xx.o omap-mpuss-lowpower.o
obj-$(CONFIG_SOC_OMAP5) += omap-mpuss-lowpower.o
+obj-$(CONFIG_SOC_AM33XX) += pm33xx.o sleep33xx.o wkup_m3.o
obj-$(CONFIG_PM_DEBUG) += pm-debug.o

obj-$(CONFIG_POWER_AVS_OMAP) += sr_device.o
@@ -94,6 +95,7 @@ obj-$(CONFIG_POWER_AVS_OMAP_CLASS3) += smartreflex-class3.o

AFLAGS_sleep24xx.o :=-Wa,-march=armv6
AFLAGS_sleep34xx.o :=-Wa,-march=armv7-a$(plus_sec)
+CFLAGS_sleep33xx.o :=-march=armv7-a

endif

diff --git a/arch/arm/mach-omap2/board-generic.c b/arch/arm/mach-omap2/board-generic.c
index aed750c..3f2d6a7 100644
--- a/arch/arm/mach-omap2/board-generic.c
+++ b/arch/arm/mach-omap2/board-generic.c
@@ -159,6 +159,7 @@ DT_MACHINE_START(AM33XX_DT, "Generic AM33XX (Flattened Device Tree)")
.reserve = am33xx_reserve,
.map_io = am33xx_map_io,
.init_early = am33xx_init_early,
+ .init_late = am33xx_init_late,
.init_irq = omap_intc_of_init,
.handle_irq = omap3_intc_handle_irq,
.init_machine = omap_generic_init,
diff --git a/arch/arm/mach-omap2/common.h b/arch/arm/mach-omap2/common.h
index 6b8ef74..80bf0da 100644
--- a/arch/arm/mach-omap2/common.h
+++ b/arch/arm/mach-omap2/common.h
@@ -69,6 +69,15 @@ static inline int omap4_pm_init(void)
}
#endif

+#if defined(CONFIG_PM) && defined(CONFIG_SOC_AM33XX)
+int am33xx_pm_init(void);
+#else
+static inline int am33xx_pm_init(void)
+{
+ return 0;
+}
+#endif
+
#ifdef CONFIG_OMAP_MUX
int omap_mux_late_init(void);
#else
@@ -107,6 +116,7 @@ void omap2430_init_late(void);
void omap3430_init_late(void);
void omap35xx_init_late(void);
void omap3630_init_late(void);
+void am33xx_init_late(void);
void am35xx_init_late(void);
void ti81xx_init_late(void);
int omap2_common_pm_late_init(void);
diff --git a/arch/arm/mach-omap2/io.c b/arch/arm/mach-omap2/io.c
index 11583a6d..fca216d 100644
--- a/arch/arm/mach-omap2/io.c
+++ b/arch/arm/mach-omap2/io.c
@@ -567,6 +567,11 @@ void __init am33xx_init_early(void)
omap_hwmod_init_postsetup();
omap_clk_init = am33xx_clk_init;
}
+
+void __init am33xx_init_late(void)
+{
+ am33xx_pm_init();
+}
#endif

#ifdef CONFIG_SOC_AM43XX
diff --git a/arch/arm/mach-omap2/pm.c b/arch/arm/mach-omap2/pm.c
index e742118..f8bd883 100644
--- a/arch/arm/mach-omap2/pm.c
+++ b/arch/arm/mach-omap2/pm.c
@@ -305,7 +305,8 @@ int __init omap2_common_pm_late_init(void)
}

#ifdef CONFIG_SUSPEND
- suspend_set_ops(&omap_pm_ops);
+ if (!soc_is_am33xx())
+ suspend_set_ops(&omap_pm_ops);
#endif

return 0;
diff --git a/arch/arm/mach-omap2/pm33xx.c b/arch/arm/mach-omap2/pm33xx.c
new file mode 100644
index 0000000..11d3173
--- /dev/null
+++ b/arch/arm/mach-omap2/pm33xx.c
@@ -0,0 +1,486 @@
+/*
+ * AM33XX Power Management Routines
+ *
+ * Copyright (C) 2012 Texas Instruments Incorporated - http://www.ti.com/
+ * Vaibhav Bedia <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/cpu.h>
+#include <linux/err.h>
+#include <linux/firmware.h>
+#include <linux/io.h>
+#include <linux/platform_device.h>
+#include <linux/sched.h>
+#include <linux/suspend.h>
+#include <linux/completion.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+#include <linux/ti_emif.h>
+#include <linux/pie.h>
+#include <linux/genalloc.h>
+#include <linux/omap-mailbox.h>
+
+#include <asm/suspend.h>
+#include <asm/proc-fns.h>
+#include <asm/sizes.h>
+#include <asm/fncpy.h>
+#include <asm/system_misc.h>
+
+#include "pm.h"
+#include "cm33xx.h"
+#include "pm33xx.h"
+#include "control.h"
+#include "common.h"
+#include "clockdomain.h"
+#include "powerdomain.h"
+#include "omap_hwmod.h"
+#include "omap_device.h"
+#include "soc.h"
+
+static unsigned long am33xx_mem_type;
+static void __iomem *am33xx_emif_base;
+static struct pie_chunk *am33xx_pie_chunk;
+static struct powerdomain *cefuse_pwrdm, *gfx_pwrdm, *per_pwrdm, *mpu_pwrdm;
+static struct clockdomain *gfx_l4ls_clkdm;
+
+struct wakeup_src wakeups[] = {
+ {.irq_nr = 35, .src = "USB0_PHY"},
+ {.irq_nr = 36, .src = "USB1_PHY"},
+ {.irq_nr = 40, .src = "I2C0"},
+ {.irq_nr = 41, .src = "RTC Timer"},
+ {.irq_nr = 42, .src = "RTC Alarm"},
+ {.irq_nr = 43, .src = "Timer0"},
+ {.irq_nr = 44, .src = "Timer1"},
+ {.irq_nr = 45, .src = "UART"},
+ {.irq_nr = 46, .src = "GPIO0"},
+ {.irq_nr = 48, .src = "MPU_WAKE"},
+ {.irq_nr = 49, .src = "WDT0"},
+ {.irq_nr = 50, .src = "WDT1"},
+ {.irq_nr = 51, .src = "ADC_TSC"},
+};
+
+struct forced_standby_module am33xx_mod[] = {
+ {.oh_name = "usb_otg_hs"},
+ {.oh_name = "tptc0"},
+ {.oh_name = "tptc1"},
+ {.oh_name = "tptc2"},
+ {.oh_name = "cpgmac0"},
+};
+
+static struct am33xx_pm_context *am33xx_pm;
+
+static DECLARE_COMPLETION(am33xx_pm_sync);
+
+#ifdef CONFIG_SUSPEND
+
+static int am33xx_pm_suspend(void)
+{
+ int i, j, ret = 0;
+
+ int status = 0;
+ struct platform_device *pdev;
+ struct omap_device *od;
+
+ /*
+ * By default the following IPs do not have MSTANDBY asserted
+ * which is necessary for PER domain transition. If the drivers
+ * are not compiled into the kernel HWMOD code will not change the
+ * state of the IPs if the IP was never enabled. To ensure
+ * that there are no issues with or without the drivers being compiled
+ * in the kernel, we forcefully put these IPs to idle.
+ */
+ for (i = 0; i < ARRAY_SIZE(am33xx_mod); i++) {
+ pdev = to_platform_device(am33xx_mod[i].dev);
+ od = to_omap_device(pdev);
+ if (od->_driver_status != BUS_NOTIFY_BOUND_DRIVER) {
+ omap_device_enable_hwmods(od);
+ omap_device_idle_hwmods(od);
+ }
+ }
+
+ /* Try to put GFX to sleep */
+ omap_set_pwrdm_state(gfx_pwrdm, PWRDM_POWER_OFF);
+ ret = cpu_suspend(am33xx_mem_type, am33xx_suspend);
+
+ status = pwrdm_read_prev_pwrst(gfx_pwrdm);
+ if (status != PWRDM_POWER_OFF)
+ pr_err("PM: GFX domain did not transition\n");
+ else
+ pr_info("PM: GFX domain entered low power state\n");
+
+ /*
+ * BUG: GFX_L4LS clock domain needs to be woken up to
+ * ensure that L4LS clock domain does not get stuck in transition
+ * If that happens L3 module does not get disabled, thereby leading
+ * to PER power domain transition failing
+ */
+ clkdm_wakeup(gfx_l4ls_clkdm);
+ clkdm_sleep(gfx_l4ls_clkdm);
+
+ if (ret) {
+ pr_err("PM: Kernel suspend failure\n");
+ } else {
+ i = am33xx_pm_status();
+ switch (i) {
+ case 0:
+ pr_info("PM: Successfully put all powerdomains to target state\n");
+
+ /*
+ * The PRCM registers on AM335x do not contain previous state
+ * information like those present on OMAP4 so we must manually
+ * indicate transition so state counters are properly incremented
+ */
+ pwrdm_post_transition(mpu_pwrdm);
+ pwrdm_post_transition(per_pwrdm);
+ break;
+ case 1:
+ pr_err("PM: Could not transition all powerdomains to target state\n");
+ ret = -1;
+ break;
+ default:
+ pr_err("PM: CM3 returned unknown result :(\nStatus = %d\n", i);
+ ret = -1;
+ }
+
+ /* print the wakeup reason */
+ i = am33xx_pm_wake_src();
+ for (j = 0; j < ARRAY_SIZE(wakeups); j++) {
+ if (wakeups[j].irq_nr == i) {
+ pr_info("PM: Wakeup source %s\n", wakeups[j].src);
+ break;
+ }
+ }
+
+ if (j == ARRAY_SIZE(wakeups))
+ pr_info("PM: Unknown wakeup source %d!\n", i);
+ }
+
+ return ret;
+}
+
+static int am33xx_pm_enter(suspend_state_t suspend_state)
+{
+ int ret = 0;
+
+ switch (suspend_state) {
+ case PM_SUSPEND_STANDBY:
+ case PM_SUSPEND_MEM:
+ ret = am33xx_pm_suspend();
+ break;
+ default:
+ ret = -EINVAL;
+ }
+
+ return ret;
+}
+
+/* returns the error code from msg_send - 0 for success, failure otherwise */
+static int am33xx_ping_wkup_m3(void)
+{
+ int ret = 0;
+
+ /*
+ * Write a dummy message to the mailbox in order to trigger the RX
+ * interrupt to alert the M3 that data is available in the IPC
+ * registers.
+ */
+ ret = omap_mbox_msg_send(am33xx_pm->mbox, 0xABCDABCD);
+
+ return ret;
+}
+
+static void am33xx_m3_state_machine_reset(void)
+{
+ int i;
+
+ am33xx_pm->ipc.sleep_mode = IPC_CMD_RESET;
+
+ am33xx_pm_ipc_cmd(&am33xx_pm->ipc);
+
+ am33xx_pm->state = M3_STATE_MSG_FOR_RESET;
+
+ pr_info("PM: Sending message for resetting M3 state machine\n");
+
+ if (!am33xx_ping_wkup_m3()) {
+ i = wait_for_completion_timeout(&am33xx_pm_sync,
+ msecs_to_jiffies(500));
+ if (WARN(i == 0, "PM: MPU<->CM3 sync failure\n"))
+ am33xx_pm->state = M3_STATE_UNKNOWN;
+ } else {
+ pr_warn("PM: Unable to ping CM3\n");
+ }
+}
+
+static int am33xx_pm_begin(suspend_state_t state)
+{
+ int i;
+
+ cpu_idle_poll_ctrl(true);
+
+ am33xx_pm->ipc.sleep_mode = IPC_CMD_DS0;
+ am33xx_pm->ipc.param1 = DS_IPC_DEFAULT;
+ am33xx_pm->ipc.param2 = DS_IPC_DEFAULT;
+
+ am33xx_pm_ipc_cmd(&am33xx_pm->ipc);
+
+ am33xx_pm->state = M3_STATE_MSG_FOR_LP;
+
+ pr_info("PM: Sending message for entering DeepSleep mode\n");
+
+ if (!am33xx_ping_wkup_m3()) {
+ i = wait_for_completion_timeout(&am33xx_pm_sync,
+ msecs_to_jiffies(500));
+ if (WARN(i == 0, "PM: MPU<->CM3 sync failure\n"))
+ return -1;
+ } else {
+ pr_warn("PM: Unable to ping CM3\n");
+ }
+
+ return 0;
+}
+
+static void am33xx_pm_end(void)
+{
+ am33xx_m3_state_machine_reset();
+
+ cpu_idle_poll_ctrl(false);
+
+ return;
+}
+
+static struct platform_suspend_ops am33xx_pm_ops = {
+ .begin = am33xx_pm_begin,
+ .end = am33xx_pm_end,
+ .enter = am33xx_pm_enter,
+};
+
+/*
+ * Dummy notifier for the mailbox
+ */
+
+static int wkup_mbox_msg(struct notifier_block *self, unsigned long len,
+ void *msg)
+{
+ return 0;
+}
+
+static struct notifier_block wkup_mbox_notifier = {
+ .notifier_call = wkup_mbox_msg,
+};
+
+void am33xx_txev_handler(void)
+{
+ switch (am33xx_pm->state) {
+ case M3_STATE_RESET:
+ am33xx_pm->state = M3_STATE_INITED;
+ am33xx_pm->ver = am33xx_pm_version_get();
+ if (am33xx_pm->ver == M3_VERSION_UNKNOWN ||
+ am33xx_pm->ver < M3_BASELINE_VERSION) {
+ pr_warn("PM: CM3 Firmware Version %x not supported\n",
+ am33xx_pm->ver);
+ } else {
+ pr_info("PM: CM3 Firmware Version = 0x%x\n",
+ am33xx_pm->ver);
+ am33xx_pm_ops.valid = suspend_valid_only_mem;
+ }
+ break;
+ case M3_STATE_MSG_FOR_RESET:
+ am33xx_pm->state = M3_STATE_INITED;
+ complete(&am33xx_pm_sync);
+ break;
+ case M3_STATE_MSG_FOR_LP:
+ complete(&am33xx_pm_sync);
+ break;
+ case M3_STATE_UNKNOWN:
+ pr_warn("PM: Unknown CM3 State\n");
+ }
+
+ return;
+}
+
+static void am33xx_pm_firmware_cb(const struct firmware *fw, void *context)
+{
+ struct am33xx_pm_context *am33xx_pm = context;
+ int ret = 0;
+ unsigned long pie_trampoline;
+
+ /* no firmware found */
+ if (!fw) {
+ pr_err("PM: request_firmware failed\n");
+ return;
+ }
+
+ wkup_m3_copy_code(fw->data, fw->size);
+
+ wkup_m3_register_txev_handler(am33xx_txev_handler);
+
+ pr_info("PM: Copied the M3 firmware to UMEM\n");
+
+ /*
+ * Invalidate M3 firmware version before hardreset.
+ * Write invalid version in lower 4 nibbles of parameter
+ * register (ipc_regs + 0x8).
+ */
+ am33xx_pm_version_clear();
+
+ am33xx_pm->state = M3_STATE_RESET;
+
+ ret = wkup_m3_prepare();
+ if (ret) {
+ pr_err("PM: Could not prepare WKUP_M3\n");
+ return;
+ }
+
+ /* Physical resume address to be used by ROM code */
+ pie_trampoline = (long) fn_to_pie(am33xx_pie_chunk,
+ &am33xx_resume_trampoline);
+ am33xx_pm->ipc.resume_addr = pie_to_phys(am33xx_pie_chunk,
+ pie_trampoline);
+
+ am33xx_pm->mbox = omap_mbox_get("wkup_m3", &wkup_mbox_notifier);
+
+ if (IS_ERR(am33xx_pm->mbox)) {
+ ret = -EBUSY;
+ pr_err("PM: IPC Request for A8->M3 Channel failed!\n");
+ return;
+ } else {
+ suspend_set_ops(&am33xx_pm_ops);
+ }
+
+ return;
+}
+
+#endif /* CONFIG_SUSPEND */
+
+static int __init am33xx_map_emif(void)
+{
+ am33xx_emif_base = ioremap(AM33XX_EMIF_BASE, SZ_32K);
+
+ if (!am33xx_emif_base)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int __init am33xx_pie_chunk_init(void)
+{
+ struct device_node *np;
+ struct gen_pool *pool;
+
+ np = of_find_compatible_node(NULL, NULL, "ti,omap3-mpu");
+ if (!np)
+ return -ENOENT;
+
+ pool = of_get_named_gen_pool(np, "sram", 0);
+ if (!pool)
+ return -ENOENT;
+
+ am33xx_pie_chunk = pie_load_sections(pool, am33xx);
+ if (!IS_ERR(am33xx_pie_chunk))
+ am33xx_pie_init(am33xx_pie_chunk, am33xx_emif_base,
+ am33xx_dram_sync);
+
+ return PTR_RET(am33xx_pie_chunk);
+}
+
+int __init am33xx_pm_init(void)
+{
+ int ret;
+ u32 temp;
+ struct device_node *np;
+ int i;
+
+ if (!soc_is_am33xx())
+ return -ENODEV;
+
+ pr_info("Power Management for AM33XX family\n");
+
+ /*
+ * By default the following IPs do not have MSTANDBY asserted
+ * which is necessary for PER domain transition. If the drivers
+ * are not compiled into the kernel HWMOD code will not change the
+ * state of the IPs if the IP was never enabled
+ */
+ for (i = 0; i < ARRAY_SIZE(am33xx_mod); i++)
+ am33xx_mod[i].dev = omap_device_get_by_hwmod_name(am33xx_mod[i].oh_name);
+
+ gfx_pwrdm = pwrdm_lookup("gfx_pwrdm");
+ per_pwrdm = pwrdm_lookup("per_pwrdm");
+ mpu_pwrdm = pwrdm_lookup("mpu_pwrdm");
+
+ gfx_l4ls_clkdm = clkdm_lookup("gfx_l4ls_gfx_clkdm");
+
+ if ((!gfx_pwrdm) || (!per_pwrdm) || (!mpu_pwrdm) || (!gfx_l4ls_clkdm)) {
+ ret = -ENODEV;
+ goto err;
+ }
+
+ am33xx_pm = kzalloc(sizeof(*am33xx_pm), GFP_KERNEL);
+ if (!am33xx_pm) {
+ pr_err("Memory allocation failed\n");
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ ret = am33xx_map_emif();
+ if (ret) {
+ pr_err("PM: Could not ioremap EMIF\n");
+ goto err;
+ }
+ /* Determine Memory Type */
+ temp = readl(am33xx_emif_base + EMIF_SDRAM_CONFIG);
+ temp = (temp & SDRAM_TYPE_MASK) >> SDRAM_TYPE_SHIFT;
+ /* Parameters to pass to assembly code */
+ am33xx_mem_type = temp;
+ am33xx_pm->ipc.param3 = temp;
+
+ np = of_find_compatible_node(NULL, NULL, "ti,am3353-wkup-m3");
+ if (np) {
+ if (of_find_property(np, "ti,needs_vtt_toggle", NULL) &&
+ (!(of_property_read_u32(np, "vtt-gpio-pin",
+ &temp)))) {
+ if (temp >= 0 && temp <= 31)
+ am33xx_pm->ipc.param3 |=
+ ((1 << VTT_STAT_SHIFT) |
+ (temp << VTT_GPIO_PIN_SHIFT));
+ }
+ }
+
+ ret = am33xx_pie_chunk_init();
+ if (ret) {
+ pr_err("PM: Could not load suspend/resume code into SRAM\n");
+ goto err;
+ }
+
+ (void) clkdm_for_each(omap_pm_clkdms_setup, NULL);
+
+ /* CEFUSE domain can be turned off post bootup */
+ cefuse_pwrdm = pwrdm_lookup("cefuse_pwrdm");
+ if (cefuse_pwrdm)
+ omap_set_pwrdm_state(cefuse_pwrdm, PWRDM_POWER_OFF);
+ else
+ pr_err("PM: Failed to get cefuse_pwrdm\n");
+
+#ifdef CONFIG_SUSPEND
+ pr_info("PM: Trying to load am335x-pm-firmware.bin\n");
+
+ /* We don't want to delay boot */
+ request_firmware_nowait(THIS_MODULE, 0, "am335x-pm-firmware.bin",
+ NULL, GFP_KERNEL, am33xx_pm,
+ am33xx_pm_firmware_cb);
+#endif /* CONFIG_SUSPEND */
+
+err:
+ return ret;
+}
diff --git a/arch/arm/mach-omap2/pm33xx.h b/arch/arm/mach-omap2/pm33xx.h
new file mode 100644
index 0000000..b470fa5
--- /dev/null
+++ b/arch/arm/mach-omap2/pm33xx.h
@@ -0,0 +1,68 @@
+/*
+ * AM33XX Power Management Routines
+ *
+ * Copyright (C) 2012 Texas Instruments Inc.
+ * Vaibhav Bedia <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+#ifndef __ARCH_ARM_MACH_OMAP2_PM33XX_H
+#define __ARCH_ARM_MACH_OMAP2_PM33XX_H
+
+#include <linux/kernel.h>
+
+#include "control.h"
+
+struct am33xx_pm_context {
+ struct am33xx_ipc_data ipc;
+ struct firmware *firmware;
+ struct omap_mbox *mbox;
+ u8 state;
+ u32 ver;
+};
+
+struct wakeup_src {
+ int irq_nr;
+ char src[10];
+};
+
+struct forced_standby_module {
+ char oh_name[15];
+ struct device *dev;
+};
+
+struct pie_chunk;
+
+int wkup_m3_copy_code(const u8 *data, size_t size);
+int wkup_m3_prepare(void);
+void wkup_m3_register_txev_handler(void (*txev_handler)(void));
+int am33xx_suspend(long unsigned int flags);
+void am33xx_resume_trampoline(void);
+void am33xx_pie_init(struct pie_chunk *chunk, void __iomem *emif_base,
+ void __iomem *dram_sync);
+
+#define IPC_CMD_DS0 0x4
+#define IPC_CMD_RESET 0xe
+#define DS_IPC_DEFAULT 0xffffffff
+#define M3_VERSION_UNKNOWN 0x0000ffff
+#define M3_BASELINE_VERSION 0x21
+
+#define M3_STATE_UNKNOWN 0
+#define M3_STATE_RESET 1
+#define M3_STATE_INITED 2
+#define M3_STATE_MSG_FOR_LP 3
+#define M3_STATE_MSG_FOR_RESET 4
+
+#define AM33XX_OCMC_END 0x40310000
+#define AM33XX_EMIF_BASE 0x4C000000
+
+#define MEM_TYPE_DDR2 2
+
+#endif
diff --git a/arch/arm/mach-omap2/sleep33xx.c b/arch/arm/mach-omap2/sleep33xx.c
new file mode 100644
index 0000000..2a4322c
--- /dev/null
+++ b/arch/arm/mach-omap2/sleep33xx.c
@@ -0,0 +1,314 @@
+/*
+ * AM33XX Power Management Routines
+ *
+ * Copyright (C) 2012 Texas Instruments Incorporated - http://www.ti.com/
+ * Vaibhav Bedia <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+#include <linux/kernel.h>
+#include <linux/io.h>
+#include <linux/ti_emif.h>
+#include <linux/platform_data/emif_plat.h>
+#include <linux/pie.h>
+
+#include <asm/suspend.h>
+#include <asm/cp15.h>
+#include <asm/pie.h>
+
+#include "pm33xx.h"
+#include "cm33xx.h"
+#include "cm-regbits-33xx.h"
+#include "omap_hwmod.h"
+
+#define CLKCTRL_IDLEST_FUNCTIONAL 0x0
+#define CLKCTRL_IDLEST_DISABLED 0x3
+
+struct emif_regs {
+ u32 sdcfg;
+ u32 ref_ctrl;
+ u32 timing1;
+ u32 timing2;
+ u32 timing3;
+ u32 pmcr;
+ u32 pmcr_shdw;
+ u32 zqcfg;
+ u32 rd_lat;
+};
+
+extern int call_with_stack(int (*fn)(void *), void *arg, void *sp);
+extern void v7_flush_dcache_all(void);
+
+void (*__abs_v7_flush_dcache_all)(void) __pie_data(am33xx);
+char sram_stack[1024] __pie_data(am33xx);
+void __noreturn (*__cpu_resume_phys)(void) __pie_data(am33xx);
+void __iomem *emif_virt_base __pie_data(am33xx);
+void __iomem *dram_sync_addr __pie_data(am33xx);
+
+EXPORT_PIE_SYMBOL(__abs_v7_flush_dcache_all);
+EXPORT_PIE_SYMBOL(sram_stack);
+EXPORT_PIE_SYMBOL(__cpu_resume_phys);
+EXPORT_PIE_SYMBOL(emif_virt_base);
+EXPORT_PIE_SYMBOL(dram_sync_addr);
+
+static struct emif_regs emif_regs __pie_data(am33xx);
+static void __iomem *emif_base __pie_data(am33xx);
+static u32 mem_type __pie_data(am33xx);
+static u32 cm_offset __pie_data(am33xx);
+
+static struct pie_chunk *am33xx_chunk;
+
+static inline void flush_dcache_all(void)
+{
+ __asm__ __volatile__("" : : : "r0", "r1", "r2", "r3", "r4", "r5",
+ "r6", "r7", "r9", "r10", "r11");
+ __abs_v7_flush_dcache_all();
+}
+
+static u32 __pie(am33xx) emif_read(u16 idx)
+{
+ return __raw_readl(emif_base + idx);
+}
+
+static void __pie(am33xx) emif_write(u32 val, u16 idx)
+{
+ __raw_writel(val, emif_base + idx);
+}
+
+static inline void am33xx_wkup_write(u32 val, void __iomem *reg)
+{
+ __raw_writel(val, reg + cm_offset);
+}
+
+static inline u32 am33xx_wkup_read(void __iomem *reg)
+{
+ return __raw_readl(reg + cm_offset);
+}
+
+static void __pie(am33xx) am33xx_module_set(u16 mode, void __iomem *reg)
+{
+ u32 val = am33xx_wkup_read(reg) & ~AM33XX_MODULEMODE_MASK;
+ am33xx_wkup_write(val | mode, reg);
+}
+
+static void __pie(am33xx) am33xx_module_disable(void __iomem *reg)
+{
+ am33xx_module_set(0, reg);
+}
+
+static void __pie(am33xx) am33xx_module_disable_wait(void __iomem *reg)
+{
+ u32 val;
+ am33xx_module_disable(reg);
+ do {
+ val = am33xx_wkup_read(reg) & AM33XX_IDLEST_MASK;
+ val >>= AM33XX_IDLEST_SHIFT;
+ } while (val != CLKCTRL_IDLEST_DISABLED);
+}
+
+static void __pie(am33xx) am33xx_module_enable(void __iomem *reg)
+{
+ am33xx_module_set(MODULEMODE_SWCTRL, reg);
+}
+
+static void __pie(am33xx) am33xx_module_enable_wait(void __iomem *reg)
+{
+ u32 val;
+ am33xx_module_enable(reg);
+ do {
+ val = am33xx_wkup_read(reg) & AM33XX_IDLEST_MASK;
+ val >>= AM33XX_IDLEST_SHIFT;
+ } while (val != CLKCTRL_IDLEST_FUNCTIONAL);
+}
+
+static void __pie(am33xx) noinline am33xx_enable_sr(void)
+{
+ u32 val;
+
+ emif_regs.sdcfg = emif_read(EMIF_SDRAM_CONFIG);
+ val = emif_read(EMIF_POWER_MANAGEMENT_CONTROL);
+ val &= ~SR_TIM_MASK;
+ val |= 0xa << SR_TIM_SHIFT;
+ emif_write(val, EMIF_POWER_MANAGEMENT_CONTROL);
+ emif_write(val, EMIF_POWER_MANAGEMENT_CTRL_SHDW);
+
+ __raw_readl(dram_sync_addr);
+ val &= ~LP_MODE_MASK;
+ val |= EMIF_LP_MODE_SELF_REFRESH << LP_MODE_SHIFT;
+ emif_write(val, EMIF_POWER_MANAGEMENT_CONTROL);
+}
+
+static void __pie(am33xx) noinline am33xx_disable_sr(void)
+{
+ u32 val;
+
+ val = emif_read(EMIF_POWER_MANAGEMENT_CONTROL);
+ val &= ~LP_MODE_MASK;
+ val |= EMIF_LP_MODE_DISABLE << LP_MODE_SHIFT;
+ emif_write(val, EMIF_POWER_MANAGEMENT_CONTROL);
+ emif_write(val, EMIF_POWER_MANAGEMENT_CTRL_SHDW);
+
+ /*
+ * A write to SDRAM CONFIG register triggers
+ * an init sequence and hence it must be done
+ * at the end for DDR2
+ */
+ emif_write(emif_regs.sdcfg, EMIF_SDRAM_CONFIG);
+}
+
+static void __pie(am33xx) noinline am33xx_emif_save(void)
+{
+ emif_regs.ref_ctrl = emif_read(EMIF_SDRAM_REFRESH_CONTROL);
+ emif_regs.timing1 = emif_read(EMIF_SDRAM_TIMING_1);
+ emif_regs.timing2 = emif_read(EMIF_SDRAM_TIMING_2);
+ emif_regs.timing3 = emif_read(EMIF_SDRAM_TIMING_3);
+ emif_regs.pmcr = emif_read(EMIF_POWER_MANAGEMENT_CONTROL);
+ emif_regs.pmcr_shdw = emif_read(EMIF_POWER_MANAGEMENT_CTRL_SHDW);
+ emif_regs.zqcfg = emif_read(EMIF_SDRAM_OUTPUT_IMPEDANCE_CALIBRATION_CONFIG);
+ emif_regs.rd_lat = emif_read(EMIF_DDR_PHY_CTRL_1);
+}
+
+static void __pie(am33xx) noinline am33xx_emif_restore(void)
+{
+ emif_write(emif_regs.rd_lat, EMIF_DDR_PHY_CTRL_1);
+ emif_write(emif_regs.rd_lat, EMIF_DDR_PHY_CTRL_1_SHDW);
+ emif_write(emif_regs.timing1, EMIF_SDRAM_TIMING_1);
+ emif_write(emif_regs.timing1, EMIF_SDRAM_TIMING_1_SHDW);
+ emif_write(emif_regs.timing2, EMIF_SDRAM_TIMING_2);
+ emif_write(emif_regs.timing2, EMIF_SDRAM_TIMING_2_SHDW);
+ emif_write(emif_regs.timing3, EMIF_SDRAM_TIMING_3);
+ emif_write(emif_regs.timing3, EMIF_SDRAM_TIMING_3_SHDW);
+ emif_write(emif_regs.ref_ctrl, EMIF_SDRAM_REFRESH_CONTROL);
+ emif_write(emif_regs.ref_ctrl, EMIF_SDRAM_REFRESH_CTRL_SHDW);
+ emif_write(emif_regs.pmcr, EMIF_POWER_MANAGEMENT_CONTROL);
+ emif_write(emif_regs.pmcr_shdw, EMIF_POWER_MANAGEMENT_CTRL_SHDW);
+ /*
+ * Output impedance calib needed only for DDR3
+ * but since the initial state of this will be
+ * disabled for DDR2 no harm in restoring the
+ * old configuration
+ */
+ emif_write(emif_regs.zqcfg, EMIF_SDRAM_OUTPUT_IMPEDANCE_CALIBRATION_CONFIG);
+
+ /* Write to SDRAM_CONFIG only for DDR2 */
+ if (mem_type == MEM_TYPE_DDR2)
+ emif_write(emif_regs.sdcfg, EMIF_SDRAM_CONFIG);
+}
+
+int __pie(am33xx) am33xx_wfi_sram(void *data)
+{
+ mem_type = (unsigned long) data;
+ emif_base = emif_virt_base;
+ cm_offset = 0;
+
+ /*
+ * Flush all data from the L1 data cache before disabling
+ * SCTLR.C bit.
+ */
+ flush_dcache_all();
+ /*
+ * Clear the SCTLR.C bit to prevent further data cache
+ * allocation. Clearing SCTLR.C would make all the data
+ * accesses strongly ordered and would not hit the cache.
+ */
+ set_cr(get_cr() & ~CR_C);
+ /*
+ * Invalidate L1 data cache. Even though only invalidate is
+ * necessary, the exported flush API is used here. Doing a clean
+ * on an already clean cache would be almost a NOP.
+ */
+ flush_dcache_all();
+
+ am33xx_emif_save();
+ am33xx_enable_sr();
+
+ am33xx_module_disable_wait(AM33XX_CM_PER_EMIF_CLKCTRL);
+
+ /*
+ * For the MPU WFI to be registered as an interrupt
+ * to WKUP_M3, MPU_CLKCTRL.MODULEMODE needs to be set
+ * to DISABLED
+ */
+ am33xx_module_disable(AM33XX_CM_MPU_MPU_CLKCTRL);
+
+ __asm__ __volatile__ (
+ /*
+ * Execute an ISB instruction to ensure that all of the
+ * CP15 register changes have been committed.
+ */
+ "isb\n\t"
+ /*
+ * Execute a barrier instruction to ensure that all cache,
+ * TLB and branch predictor maintenance operations issued
+ * have completed.
+ */
+ "dsb\n\t"
+ "dmb\n\t"
+ /*
+ * Execute a WFI instruction and wait until the
+ * STANDBYWFI output is asserted to indicate that the
+ * CPU is in idle and low power state. CPU can speculatively
+ * prefetch the instructions so add NOPs after WFI. Thirteen
+ * NOPs as per Cortex-A8 pipeline.
+ */
+ "wfi\n\t"
+ ".rept 13\n\t"
+ "nop\n\t"
+ ".endr" : : : "memory");
+
+ /* We come here in case of an abort due to a late interrupt */
+
+ am33xx_module_enable(AM33XX_CM_MPU_MPU_CLKCTRL);
+
+ am33xx_module_enable_wait(AM33XX_CM_PER_EMIF_CLKCTRL);
+ am33xx_disable_sr();
+ /* Set SCTLR.C bit to allow data cache allocation */
+ set_cr(get_cr() | CR_C);
+
+ /* Let the suspend code know about the abort */
+ return 1;
+}
+EXPORT_PIE_SYMBOL(am33xx_wfi_sram);
+
+int am33xx_suspend(long unsigned int mem_type)
+{
+ pie_relocate_from_kern(am33xx_chunk);
+ return call_with_stack(fn_to_pie(am33xx_chunk, &am33xx_wfi_sram),
+ (void *) mem_type,
+ kern_to_pie(am33xx_chunk, (char *) sram_stack) +
+ sizeof(sram_stack));
+}
+
+static void __pie(am33xx) __noreturn noinline am33xx_resume(void)
+{
+ emif_base = (void *) AM33XX_EMIF_BASE;
+ /* Undo the offset built into the register defines */
+ cm_offset = -AM33XX_L4_WK_IO_OFFSET;
+
+ am33xx_module_enable_wait(AM33XX_CM_PER_EMIF_CLKCTRL);
+ am33xx_emif_restore();
+
+ /* We are back. Branch to the common CPU resume routine */
+ __cpu_resume_phys();
+}
+
+ARM_PIE_RESUME(am33xx, am33xx_resume, sram_stack + ARRAY_SIZE(sram_stack));
+
+void am33xx_pie_init(struct pie_chunk *chunk, void __iomem *emif_base,
+ void __iomem *dram_sync)
+{
+ am33xx_chunk = chunk;
+
+ *kern_to_pie(chunk, &__abs_v7_flush_dcache_all) = v7_flush_dcache_all;
+ *kern_to_pie(chunk, &__cpu_resume_phys) =
+ (void *) virt_to_phys(cpu_resume);
+ *kern_to_pie(chunk, &emif_virt_base) = emif_base;
+ *kern_to_pie(chunk, &dram_sync_addr) = dram_sync;
+}
diff --git a/arch/arm/mach-omap2/wkup_m3.c b/arch/arm/mach-omap2/wkup_m3.c
new file mode 100644
index 0000000..8eaa7f3
--- /dev/null
+++ b/arch/arm/mach-omap2/wkup_m3.c
@@ -0,0 +1,183 @@
+/*
+ * AM33XX Power Management Routines
+ *
+ * Copyright (C) 2012 Texas Instruments Incorporated - http://www.ti.com/
+ * Vaibhav Bedia <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/cpu.h>
+#include <linux/err.h>
+#include <linux/firmware.h>
+#include <linux/io.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+#include <linux/of.h>
+
+#include "pm33xx.h"
+#include "control.h"
+#include "omap_device.h"
+#include "soc.h"
+
+struct wkup_m3_context {
+ struct device *dev;
+ void __iomem *code;
+ void (*txev_handler)(void);
+};
+
+struct wkup_m3_context *wkup_m3;
+
+int wkup_m3_copy_code(const u8 *data, size_t size)
+{
+ if (size > SZ_16K)
+ return -ENOMEM;
+
+ memcpy_toio(wkup_m3->code, data, size);
+
+ return 0;
+}
+
+
+void wkup_m3_register_txev_handler(void (*txev_handler)(void))
+{
+ wkup_m3->txev_handler = txev_handler;
+}
+
+/* have platforms do what they want in atomic context over here? */
+static irqreturn_t wkup_m3_txev_handler(int irq, void *unused)
+{
+ am33xx_txev_eoi();
+
+ /* callback to be executed in atomic context */
+ /* return 0 implies IRQ_HANDLED else IRQ_NONE */
+ wkup_m3->txev_handler();
+
+ am33xx_txev_enable();
+
+ return IRQ_HANDLED;
+}
+
+int wkup_m3_prepare(void)
+{
+ struct platform_device *pdev = to_platform_device(wkup_m3->dev);
+
+ /* check that the code is loaded */
+ omap_device_deassert_hardreset(pdev, "wkup_m3");
+
+ return 0;
+}
+
+static int wkup_m3_probe(struct platform_device *pdev)
+{
+ int irq, ret = 0;
+ struct resource *mem;
+
+ pm_runtime_enable(&pdev->dev);
+
+ ret = pm_runtime_get_sync(&pdev->dev);
+ if (IS_ERR_VALUE(ret)) {
+ dev_err(&pdev->dev, "pm_runtime_get_sync() failed\n");
+ return ret;
+ }
+
+ irq = platform_get_irq(pdev, 0);
+ if (!irq) {
+ dev_err(&pdev->dev, "no irq resource\n");
+ ret = -ENXIO;
+ goto err;
+ }
+
+ mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!mem) {
+ dev_err(&pdev->dev, "no memory resource\n");
+ ret = -ENXIO;
+ goto err;
+ }
+
+ wkup_m3 = kzalloc(sizeof(*wkup_m3), GFP_KERNEL);
+ if (!wkup_m3) {
+ pr_err("Memory allocation failed\n");
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ wkup_m3->dev = &pdev->dev;
+
+ wkup_m3->code = devm_request_and_ioremap(wkup_m3->dev, mem);
+ if (!wkup_m3->code) {
+ dev_err(wkup_m3->dev, "could not ioremap\n");
+ ret = -EADDRNOTAVAIL;
+ goto err;
+ }
+
+ ret = devm_request_irq(wkup_m3->dev, irq, wkup_m3_txev_handler,
+ IRQF_DISABLED, "wkup_m3_txev", NULL);
+ if (ret) {
+ dev_err(wkup_m3->dev, "request_irq failed\n");
+ goto err;
+ }
+
+err:
+ return ret;
+}
+
+static int wkup_m3_remove(struct platform_device *pdev)
+{
+ return 0;
+}
+
+static struct of_device_id wkup_m3_dt_ids[] = {
+ { .compatible = "ti,am3353-wkup-m3" },
+ { }
+};
+MODULE_DEVICE_TABLE(of, wkup_m3_dt_ids);
+
+static int wkup_m3_rpm_suspend(struct device *dev)
+{
+ return -EBUSY;
+}
+
+static int wkup_m3_rpm_resume(struct device *dev)
+{
+ return 0;
+}
+
+static const struct dev_pm_ops wkup_m3_ops = {
+ SET_RUNTIME_PM_OPS(wkup_m3_rpm_suspend, wkup_m3_rpm_resume, NULL)
+};
+
+static struct platform_driver wkup_m3_driver = {
+ .probe = wkup_m3_probe,
+ .remove = wkup_m3_remove,
+ .driver = {
+ .name = "wkup_m3",
+ .owner = THIS_MODULE,
+ .of_match_table = of_match_ptr(wkup_m3_dt_ids),
+ .pm = &wkup_m3_ops,
+ },
+};
+
+static __init int wkup_m3_init(void)
+{
+ return platform_driver_register(&wkup_m3_driver);
+}
+
+static __exit void wkup_m3_exit(void)
+{
+ platform_driver_unregister(&wkup_m3_driver);
+}
+omap_postcore_initcall(wkup_m3_init);
+module_exit(wkup_m3_exit);
--
1.8.3.2

2013-09-17 12:45:35

by Russ Dill

Subject: [RFC PATCH 09/11] ARM: dts: AM33XX: Associate SRAM with MPU and mark it exec

The SRAM is for use by the MPU. Marking it as such makes it
easier for PM initialization code to locate the SRAM in order to
load a PIE section into it.

Additionally, set the map-exec flag to allow code to be run
from SRAM. This is necessary for suspend/resume.

Signed-off-by: Russ Dill <[email protected]>
---
arch/arm/boot/dts/am33xx.dtsi | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
index 302463d..20942a2 100644
--- a/arch/arm/boot/dts/am33xx.dtsi
+++ b/arch/arm/boot/dts/am33xx.dtsi
@@ -62,6 +62,7 @@
mpu {
compatible = "ti,omap3-mpu";
ti,hwmods = "mpu";
+ sram = <&ocmcram>;
};
};

@@ -507,6 +508,7 @@
ocmcram: ocmcram@40300000 {
compatible = "mmio-sram";
reg = <0x40300000 0x10000>; /* 64k */
+ map-exec;
};

wkup_m3: wkup_m3@44d00000 {
--
1.8.3.2

2013-09-17 12:46:02

by Russ Dill

Subject: [RFC PATCH 07/11] ARM: PIE: Add support for updating PIE relocations

This adds support for updating PIE relocations under ARM. This
is necessary in the case that the same PIE must run both with
virtual mapping (MMU enabled) and physical mapping (MMU
disabled).
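
As a rough illustration of why relative relocations can be reapplied, the
following user-space sketch models the fixup loop. It is not the kernel code:
struct reloc_tail, apply_relocs and retarget are hypothetical names, and the
tail layout (a count plus byte offsets from the image base) is an assumption
based on the description in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the tail: a count and a list of byte offsets,
 * each pointing at a pointer-sized slot inside the loaded image. */
struct reloc_tail {
	size_t count;
	const size_t *offset;
};

/* Add delta to every pointer slot listed in the tail. */
static void apply_relocs(unsigned char *base, const struct reloc_tail *tail,
			 uintptr_t delta)
{
	size_t i;

	for (i = 0; i < tail->count; i++) {
		uintptr_t slot;

		/* memcpy avoids alignment and aliasing assumptions */
		memcpy(&slot, base + tail->offset[i], sizeof(slot));
		slot += delta;
		memcpy(base + tail->offset[i], &slot, sizeof(slot));
	}
}

/* Switch the image from running at address 'from' to running at 'to'.
 * Unsigned wrap-around makes this work in either direction. */
static void retarget(unsigned char *base, const struct reloc_tail *tail,
		     uintptr_t from, uintptr_t to)
{
	apply_relocs(base, tail, to - from);
}
```

Because each fixup is only an addition, calling retarget() with the two
addresses swapped restores the original pointer values.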

Signed-off-by: Russ Dill <[email protected]>
---
arch/arm/include/asm/pie.h | 42 +++++++++++++++++++++++++
arch/arm/kernel/pie.c | 9 ++++++
arch/arm/libpie/Makefile | 2 +-
arch/arm/libpie/relocate.S | 76 ++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 128 insertions(+), 1 deletion(-)
create mode 100644 arch/arm/include/asm/pie.h
create mode 100644 arch/arm/libpie/relocate.S

diff --git a/arch/arm/include/asm/pie.h b/arch/arm/include/asm/pie.h
new file mode 100644
index 0000000..977f11d
--- /dev/null
+++ b/arch/arm/include/asm/pie.h
@@ -0,0 +1,42 @@
+/*
+ * arch/arm/include/asm/pie.h
+ *
+ * Copyright 2013 Texas Instruments, Inc
+ * Russ Dill <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _ASMARM_PIE_H
+#define _ASMARM_PIE_H
+
+#include <linux/pie.h>
+
+#ifdef CONFIG_PIE
+extern void __pie_relocate(void);
+extern void __pie___pie_relocate(void);
+
+#define pie_relocate_from_pie() \
+ __asm__ __volatile__("bl __pie_relocate\n" \
+ : : : "cc", "memory", "lr", "r4", "r5", "r6", "r7", "r8", "r9");
+
+static inline void pie_relocate_from_kern(struct pie_chunk *chunk)
+{
+ void (*fn)(void) = fn_to_pie(chunk, &__pie___pie_relocate);
+ __asm__ __volatile__("" : : : "cc", "memory", "r4", "r5", "r6",
+ "r7", "r8", "r9");
+ fn();
+}
+#else
+
+#define pie_relocate_from_pie() do {} while(0)
+
+static inline void pie_relocate_from_kern(struct pie_chunk *chunk)
+{
+}
+
+#endif
+
+#endif
diff --git a/arch/arm/kernel/pie.c b/arch/arm/kernel/pie.c
index 5dff5d6..598562f 100644
--- a/arch/arm/kernel/pie.c
+++ b/arch/arm/kernel/pie.c
@@ -12,10 +12,12 @@
#include <linux/elf.h>

#include <asm/elf.h>
+#include <asm/pie.h>

extern char __pie_rel_dyn_start[];
extern char __pie_rel_dyn_end[];
extern char __pie_tail_offset[];
+extern char __pie_reloc_offset[];

struct arm_pie_tail {
int count;
@@ -72,12 +74,19 @@ int pie_arch_fixup(struct pie_chunk *chunk, void *base, void *tail,
unsigned long offset)
{
struct arm_pie_tail *pie_tail = tail;
+ void *reloc;
int i;

/* Perform relocation fixups for given offset */
for (i = 0; i < pie_tail->count; i++)
*((uintptr_t *) (pie_tail->offset[i] + base)) += offset;

+ /* Store the PIE offset to tail and reloc func */
+ *kern_to_pie(chunk, (uintptr_t *) __pie_tail_offset) = tail - base;
+ reloc = kern_to_pie(chunk,
+ (void *) fnptr_to_addr(&__pie___pie_relocate));
+ *kern_to_pie(chunk, (uintptr_t *) __pie_reloc_offset) = reloc - base;
+
return 0;
}
EXPORT_SYMBOL_GPL(pie_arch_fixup);
diff --git a/arch/arm/libpie/Makefile b/arch/arm/libpie/Makefile
index 5662e99..b1ac52a 100644
--- a/arch/arm/libpie/Makefile
+++ b/arch/arm/libpie/Makefile
@@ -3,7 +3,7 @@
#
ccflags-y := -fpic -mno-single-pic-base -fno-builtin

-obj-y := empty.o
+obj-y := relocate.o empty.o
obj-y += lib1funcs.o ashldi3.o string.o

# string library code (-Os is enforced to keep it much smaller)
diff --git a/arch/arm/libpie/relocate.S b/arch/arm/libpie/relocate.S
new file mode 100644
index 0000000..70cc36e
--- /dev/null
+++ b/arch/arm/libpie/relocate.S
@@ -0,0 +1,76 @@
+/*
+ * arch/arm/libpie/relocate.S - Relocation updating for PIEs
+ *
+ * Copyright 2013 Texas Instruments, Inc.
+ * Russ Dill <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+
+/*
+ * Update relocations based on current pc
+ *
+ * On exit:
+ * r4-r9 corrupted
+ */
+
+ENTRY(__pie_relocate)
+ /* Calculate offset of our code compared to existing relocations */
+ ldr r4, pie_relocate_address
+ adr r5, __pie_relocate
+ subs r6, r5, r4
+ moveq pc, lr /* 0 offset, no need to do anything */
+
+ /* Base of PIE group */
+ ldr r7, reloc_offset
+ sub r5, r5, r7
+
+ /* Calculate address of tail */
+ ldr r7, tail_offset
+ add r7, r7, r5
+
+ /* First word of tail is number of entries */
+ ldr r8, [r7], #4
+ add r8, r7, r8, lsl #2
+
+ /*
+ * r5 - current base address of PIE group
+ * r6 - fixup offset needed for relocs
+ * r7 - relocs start
+ * r8 - relocs end
+ */
+
+1:
+ cmp r7, r8
+ ldrne r4, [r7], #4 /* Load next reloc offset */
+
+ addne r4, r4, r5 /* Calculate address of reloc entry */
+ ldrne r9, [r4]
+ addne r9, r9, r6 /* Fixup reloc entry */
+ strne r9, [r4]
+
+ bne 1b
+
+ mov pc, lr
+ENDPROC(__pie_relocate)
+
+/*
+ * This ends up in the .rel.dyn section and can be used to read the current
+ * relocation offset
+ */
+pie_relocate_address:
+ .long __pie_relocate
+
+/* Offset from PIE section start to reloc function */
+.global reloc_offset
+reloc_offset:
+ .space 4
+
+/* Offset from PIE section start to tail */
+.globl tail_offset
+tail_offset:
+ .space 4
--
1.8.3.2

2013-09-17 12:46:35

by Russ Dill

Subject: [RFC PATCH 05/11] PIE: Support embedding position independent executables

This commit adds support for embedding PIEs into the kernel, loading them
into genalloc sections, performing necessary relocations, and running code
from them. This allows platforms that need to run code from SRAM, such
as during suspend/resume, to develop that code in C instead of assembly.

Functions and data for each PIE should be grouped into sections with the
__pie(<group>) and __pie_data(<group>) macros respectively. Any symbols or
functions that are to be accessed from outside the PIE should be marked with
EXPORT_PIE_SYMBOL(<sym>). For example:

static struct ddr_timings xyz_timings __pie_data(platformxyz) = {
[...]
};

void __pie(platformxyz) xyz_ddr_on(void *addr)
{
[...]
}
EXPORT_PIE_SYMBOL(xyz_ddr_on);

While the kernel can access exported symbols from the PIE, the PIE cannot
access symbols from the kernel, but can access data from the kernel and
call functions in the kernel so long as addresses are passed into the PIE.

PIEs are loaded from the kernel into a genalloc pool with pie_load_sections.
pie_load_sections allocates space within the pool, copies the necessary
code/data, and performs any necessary relocations. A chunk identifier is
returned for removing the PIE from the pool, and for translating symbols.

Because the PIEs are dynamically relocated, special accessors must be used
to access PIE symbols from kernel code:

- kern_to_pie(chunk, ptr): Translate a PIE symbol to the virtual address
it is loaded into within the pool.

- fn_to_pie(chunk, ptr): Same as above, but for function pointers.

- pie_to_phys(chunk, addr): Translate a virtual address within a loaded PIE
to a physical address.
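
The arithmetic behind these accessors can be sketched in user space as plain
base-plus-offset translation. This is only a model under simplifying
assumptions: struct chunk_model and the model_* helpers are hypothetical, and
the real accessors also have to account for the libpie copy placed ahead of
the group.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the chunk identifier. */
struct chunk_model {
	uintptr_t link_base;	/* address of the group inside vmlinux */
	uintptr_t load_base;	/* virtual address of the copy in the pool */
	uintptr_t phys_base;	/* physical address of the copy in the pool */
};

/* Translate a symbol address in vmlinux to its loaded copy. */
static uintptr_t model_kern_to_pie(const struct chunk_model *c, uintptr_t sym)
{
	return sym - c->link_base + c->load_base;
}

/* Translate a virtual address inside the loaded copy to a physical one. */
static uintptr_t model_to_phys(const struct chunk_model *c, uintptr_t addr)
{
	return addr - c->load_base + c->phys_base;
}
```

The offset of a symbol from the start of its group is the same before and
after loading, so only the base address changes in each translation.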

Loading a PIE involves three main steps. First a set of common functions to
cover built-ins emitted by gcc (memcpy, memmove, etc) is copied into the pool.
Then the actual PIE code and data is copied into the pool. Because the PIE
code is contained within an overlay with other PIEs, offsets to the common
functions are maintained. Finally, relocations are performed as necessary.
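
The copy steps can be modelled with an ordinary buffer standing in for the
genalloc pool. This sketch is an assumption-laden illustration, not the
lib/pie.c implementation: struct loaded_layout and model_load are hypothetical
names, and the relocation step is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Where each piece ended up inside the pool (byte offsets). */
struct loaded_layout {
	size_t common_off;	/* copy of the common functions */
	size_t group_off;	/* the selected group's code and data */
	size_t tail_off;	/* space reserved for relocation data */
	size_t total;
};

/* Copy common code, then one group, then reserve a zeroed tail.
 * Returns 0 on success, -1 if the pool is too small. */
static int model_load(unsigned char *pool, size_t pool_len,
		      const unsigned char *common, size_t common_len,
		      const unsigned char *group, size_t group_len,
		      size_t tail_len, struct loaded_layout *out)
{
	size_t total = common_len + group_len + tail_len;

	if (total > pool_len)
		return -1;

	memcpy(pool, common, common_len);
	/* Every group in the overlay is linked to start right after the
	 * common code, so placing it here keeps relative offsets intact. */
	memcpy(pool + common_len, group, group_len);
	memset(pool + common_len + group_len, 0, tail_len);

	out->common_off = 0;
	out->group_off = common_len;
	out->tail_off = common_len + group_len;
	out->total = total;
	return 0;
}
```

Only the selected group is copied, which is why the loaded image is smaller
than the full overlay embedded in the kernel.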

Signed-off-by: Russ Dill <[email protected]>
---
Documentation/pie.txt | 167 ++++++++++++++++++++++++++++++++
Makefile | 17 +++-
include/asm-generic/pie.lds.h | 82 ++++++++++++++++
include/asm-generic/vmlinux.lds.h | 1 +
include/linux/pie.h | 196 ++++++++++++++++++++++++++++++++++++++
lib/Kconfig | 14 +++
lib/Makefile | 2 +
lib/pie.c | 138 +++++++++++++++++++++++++++
pie/.gitignore | 3 +
pie/Makefile | 85 +++++++++++++++++
scripts/link-vmlinux.sh | 11 ++-
11 files changed, 711 insertions(+), 5 deletions(-)
create mode 100644 Documentation/pie.txt
create mode 100644 include/asm-generic/pie.lds.h
create mode 100644 include/linux/pie.h
create mode 100644 lib/pie.c
create mode 100644 pie/.gitignore
create mode 100644 pie/Makefile

diff --git a/Documentation/pie.txt b/Documentation/pie.txt
new file mode 100644
index 0000000..54a1646
--- /dev/null
+++ b/Documentation/pie.txt
@@ -0,0 +1,167 @@
+Position Independent Executables (PIEs)
+=======================================
+
+About
+=====
+
+The PIE framework is designed to allow normal C code from the kernel to be
+embedded into the kernel, loaded at arbitrary addresses, and executed.
+
+A PIE (position independent executable) is a piece of self-contained code
+that can be relocated to any address. Before the code is run, a simple list
+of offset based relocations has to be performed.
+
+Copyright 2013 Texas Instruments, Inc
+ Russ Dill <[email protected]>
+
+Motivation
+==========
+
+Without the PIE framework, the only way to support platforms that require
+code loaded to and run from arbitrary addresses was to write the code in
+assembly. For example, a platform may have suspend/resume steps that
+disable/enable SDRAM and must be run from on chip SRAM.
+
+In addition to the SRAM virtual address not being known at compile time
+for device tree platforms, the code must often run with the MMU enabled or
+disabled (physical vs virtual address).
+
+Design
+======
+
+The PIE code is separated into two main pieces. libpie satisfies various
+function calls emitted by gcc. The kernel contains only one copy of libpie
+but whenever a PIE is loaded, a copy of libpie is copied along with the PIE
+code. The second piece is the PIE code and data marked with special PIE
+sections. At build time, libpie and the PIE sections are collected together
+into a single PIE executable:
+
+ +---------------------------------------+
+ | __pie_common_start |
+ | <libpie> |
+ | __pie_common_end |
+ +---------------------------------------+
+ | __pie_overlay_start |
+ | +-----------------------------+ |
+ | | __pie_groupxyz_start | |
+ | | <groupxyz functions/data> | |
+ | | __pie_groupxyz_end | |
+ | +-----------------------------+ |
+ | | __pie_groupabc_start | |
+ | | <groupabc functions/data> | |
+ | | __pie_groupabc_end | |
+ | +-----------------------------+ |
+ | | __pie_groupijk_start | |
+ | | <groupijk functions/data> | |
+ | | __pie_groupijk_end | |
+ | +-----------------------------+ |
+ | __pie_overlay_end |
+ +---------------------------------------+
+ | <Architecture specific relocations> |
+ +---------------------------------------+
+
+The PIE executable is then embedded into the kernel. Symbols are exported
+from the PIE executable and passed back into the kernel at link time. When
+the PIE is loaded, the memory layout then looks like the following:
+
+ +---------------------------------------+
+ | <libpie> |
+ +---------------------------------------+
+ | <groupabc_functions/data> |
+ +---------------------------------------+
+ | Tail (Arch specific data/relocations) |
+ +---------------------------------------+
+
+The architecture specific code is responsible for reading the relocations
+and performing the necessary fixups.
+
+Marking code/data
+=================
+
+Marking code and data for inclusion into a PIE group is done with the PIE
+section markers, __pie(<group>) and __pie_data(<group>). Any symbols that
+will be used outside of the PIE must be exported with EXPORT_PIE_SYMBOL:
+
+ static struct ddr_timings xyz_timings __pie_data(platformxyz) = {
+ [...]
+ };
+
+ void __pie(platformxyz) xyz_ddr_on(void *addr)
+ {
+ [...]
+ }
+ EXPORT_PIE_SYMBOL(xyz_ddr_on);
+
+Loading PIEs
+============
+
+PIEs can be loaded into a genalloc pool (such as one backed by SRAM). The
+following functions are provided:
+
+ - pie_load_sections(pool, <group>)
+ - pie_load_sections_phys(pool, <group>)
+ - pie_free(chunk)
+
+pie_load_sections/pie_load_sections_phys load a PIE section group into the
+given pool. Any necessary fixups are performed and a chunk identifier is
+returned. The first variant performs fixups such that the code can be run
+with the current address layout. The second (phys) variant performs fixups
+such that the code can be executed with the MMU disabled.
+
+The pie_free function unloads a PIE from a pool.
+
+Utilizing PIEs
+==============
+
+In order to translate between symbols and addresses within a loaded PIE, the
+following macros/functions are provided:
+
+ - kern_to_pie(chunk, sym)
+ - fn_to_pie(chunk, fn)
+ - pie_to_phys(chunk, addr)
+
+All three take as the first argument the chunk returned by pie_load_sections.
+Data symbols can be translated with kern_to_pie. The macro is made so that
+the type returned is the type passed:
+
+ kern_to_pie(chunk, xyz_struct_ptr)->foo = 15;
+ *kern_to_pie(chunk, &xyz_flags) = XYZ_DO_THE_THING;
+
+Because certain architectures require special handling of function pointers,
+a special variant is provided:
+
+ ret = fn_to_pie(chunk, &xyz_ddr_on)(addr);
+ fnptr = fn_to_pie(chunk, &abc_fn);
+
+In the case that a PIE has been configured to run with the MMU disabled,
+physical addresses can be translated with pie_to_phys. For instance, if
+the resume ROM jumps to a given physical address:
+
+ trampoline = fn_to_pie(chunk, resume_trampoline);
+ writel(pie_to_phys(chunk, trampoline), XYZ_RESUME_ADDR_REG);
+
+On the Fly Fixup
+================
+
+The tail portion of the PIE can be used to store data necessary to perform
+on the fly fixups. This is necessary for code that needs to run from
+different address spaces at different times. Any on the fly fixup support
+is architecture specific.
+
+Architecture Requirements
+=========================
+
+Individual architectures must provide the following:
+
+pie_arch_fill_tail - This function examines the architecture specific
+relocation entries and copies the ones necessary for the given PIE.
+
+pie_arch_fixup - This function performs fixups of the PIE code based
+on the tail data generated above.
+
+pie.lds - A linker script for the PIE executable must be provided.
+include/asm-generic/pie.lds.h provides a template.
+
+libpie.o - The architecture must also provide a library of functions that
+gcc may expect as a built-in, such as memcpy, memmove, etc. The list of
+functions is architecture specific.
diff --git a/Makefile b/Makefile
index fe8204b..4791a0f 100644
--- a/Makefile
+++ b/Makefile
@@ -396,7 +396,7 @@ export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_GCOV
export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE
export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE
export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL
-export KBUILD_ARFLAGS
+export KBUILD_ARFLAGS OBJCOPY_OUTPUT_FORMAT

# When compiling out-of-tree modules, put MODVERDIR in the module
# tree rather than in the kernel tree. The kernel tree might
@@ -682,6 +682,10 @@ ifeq ($(CONFIG_STRIP_ASM_SYMS),y)
LDFLAGS_vmlinux += $(call ld-option, -X,)
endif

+ifeq ($(CONFIG_PIE),y)
+LDFLAGS_vmlinux += --just-symbols=pie/pie.syms
+endif
+
# Default kernel image to build when no specific target is given.
# KBUILD_IMAGE may be overruled on the command line or
# set in the environment
@@ -737,13 +741,15 @@ core-y += kernel/ mm/ fs/ ipc/ security/ crypto/ block/

vmlinux-dirs := $(patsubst %/,%,$(filter %/, $(init-y) $(init-m) \
$(core-y) $(core-m) $(drivers-y) $(drivers-m) \
- $(net-y) $(net-m) $(libs-y) $(libs-m)))
+ $(net-y) $(net-m) $(libs-y) $(libs-m) $(libpie-y)))

vmlinux-alldirs := $(sort $(vmlinux-dirs) $(patsubst %/,%,$(filter %/, \
$(init-n) $(init-) \
$(core-n) $(core-) $(drivers-n) $(drivers-) \
$(net-n) $(net-) $(libs-n) $(libs-))))

+pie-$(CONFIG_PIE) := pie/
+
init-y := $(patsubst %/, %/built-in.o, $(init-y))
core-y := $(patsubst %/, %/built-in.o, $(core-y))
drivers-y := $(patsubst %/, %/built-in.o, $(drivers-y))
@@ -751,16 +757,21 @@ net-y := $(patsubst %/, %/built-in.o, $(net-y))
libs-y1 := $(patsubst %/, %/lib.a, $(libs-y))
libs-y2 := $(patsubst %/, %/built-in.o, $(libs-y))
libs-y := $(libs-y1) $(libs-y2)
+pie-y := $(patsubst %/, %/built-in.o, $(pie-y))
+libpie-y := $(patsubst %/, %/built-in.o, $(libpie-y))

# Externally visible symbols (used by link-vmlinux.sh)
export KBUILD_VMLINUX_INIT := $(head-y) $(init-y)
export KBUILD_VMLINUX_MAIN := $(core-y) $(libs-y) $(drivers-y) $(net-y)
+export KBUILD_VMLINUX_PIE := $(pie-y)
+export KBUILD_LIBPIE := $(libpie-y)
+export KBUILD_PIE_LDS := $(PIE_LDS)
export KBUILD_LDS := arch/$(SRCARCH)/kernel/vmlinux.lds
export LDFLAGS_vmlinux
# used by scripts/pacmage/Makefile
export KBUILD_ALLDIRS := $(sort $(filter-out arch/%,$(vmlinux-alldirs)) arch Documentation include samples scripts tools virt)

-vmlinux-deps := $(KBUILD_LDS) $(KBUILD_VMLINUX_INIT) $(KBUILD_VMLINUX_MAIN)
+vmlinux-deps := $(KBUILD_LDS) $(KBUILD_PIE_LDS) $(KBUILD_VMLINUX_INIT) $(KBUILD_VMLINUX_MAIN) $(KBUILD_VMLINUX_PIE)

# Final link of vmlinux
cmd_link-vmlinux = $(CONFIG_SHELL) $< $(LD) $(LDFLAGS) $(LDFLAGS_vmlinux)
diff --git a/include/asm-generic/pie.lds.h b/include/asm-generic/pie.lds.h
new file mode 100644
index 0000000..2f8d20e
--- /dev/null
+++ b/include/asm-generic/pie.lds.h
@@ -0,0 +1,82 @@
+/*
+ * Helper macros to support writing architecture specific
+ * pie linker scripts.
+ *
+ * A minimal linker script has the following content:
+ * [This is a sample, architectures may have special requirements]
+ *
+ * OUTPUT_FORMAT(...)
+ * OUTPUT_ARCH(...)
+ * SECTIONS
+ * {
+ * . = 0x0;
+ *
+ * PIE_COMMON_START
+ * .text {
+ * PIE_TEXT_TEXT
+ * }
+ * PIE_COMMON_END
+ *
+ * PIE_OVERLAY_START
+ * OVERLAY : NOCROSSREFS {
+ * PIE_OVERLAY_SECTION(am33xx)
+ * PIE_OVERLAY_SECTION(am347x)
+ * [...]
+ * }
+ * PIE_OVERLAY_END
+ *
+ * PIE_DISCARDS // must be the last
+ * }
+ */
+
+#include <asm-generic/vmlinux.lds.h>
+
+#define PIE_COMMON_START \
+ __pie_common_start : { \
+ VMLINUX_SYMBOL(__pie_common_start) = .; \
+ }
+
+#define PIE_COMMON_END \
+ __pie_common_end : { \
+ VMLINUX_SYMBOL(__pie_common_end) = .; \
+ }
+
+#define PIE_OVERLAY_START \
+ __pie_overlay_start : { \
+ VMLINUX_SYMBOL(__pie_overlay_start) = .; \
+ }
+
+#define PIE_OVERLAY_END \
+ __pie_overlay_end : { \
+ VMLINUX_SYMBOL(__pie_overlay_end) = .; \
+ }
+
+#define PIE_TEXT_TEXT \
+ KEEP(*(.pie.text))
+
+#define PIE_OVERLAY_SECTION(name) \
+ .pie.##name { \
+ KEEP(*(.pie.##name##.*)) \
+ VMLINUX_SYMBOL(__pie_##name##_start) = \
+ LOADADDR(.pie.##name##); \
+ VMLINUX_SYMBOL(__pie_##name##_end) = \
+ LOADADDR(.pie.##name##) + \
+ SIZEOF(.pie.##name##); \
+ }
+
+#define PIE_DISCARDS \
+ /DISCARD/ : { \
+ *(.dynsym) \
+ *(.dynstr*) \
+ *(.dynamic*) \
+ *(.plt*) \
+ *(.interp*) \
+ *(.gnu*) \
+ *(.hash) \
+ *(.comment) \
+ *(.bss*) \
+ *(.data) \
+ *(.discard) \
+ *(.discard.*) \
+ }
+
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 69732d2..5a21cfe 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -666,6 +666,7 @@
EXIT_CALL \
*(.discard) \
*(.discard.*) \
+ *(.pie.*) \
}

/**
diff --git a/include/linux/pie.h b/include/linux/pie.h
new file mode 100644
index 0000000..66450c1
--- /dev/null
+++ b/include/linux/pie.h
@@ -0,0 +1,196 @@
+/*
+ * Copyright 2013 Texas Instruments, Inc.
+ * Russ Dill <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+#ifndef _LINUX_PIE_H
+#define _LINUX_PIE_H
+
+#include <linux/kernel.h>
+#include <linux/err.h>
+
+#include <asm/fncpy.h>
+#include <asm/bug.h>
+
+struct gen_pool;
+struct pie_chunk;
+
+/**
+ * pie_arch_fixup - arch specific fixups of copied PIE code
+ * @chunk: identifier to be used with kern_to_pie/pie_to_phys
+ * @base: virtual address of start of copied PIE section
+ * @tail: virtual address of tail data in copied PIE
+ * @offset: offset to apply to relocation entries.
+ *
+ * When this code is done executing, it should be possible to jump to code
+ * so long as it is located at the given offset.
+ */
+extern int pie_arch_fixup(struct pie_chunk *chunk, void *base, void *tail,
+ unsigned long offset);
+
+/**
+ * pie_arch_fill_tail - arch specific tail information for copied PIE
+ * @tail: virtual address of tail data in copied PIE to be filled
+ * @common_start: virtual address of common code within kernel data
+ * @common_end: virtual end address of common code within kernel data
+ * @overlay_start: virtual address of first overlay within kernel data
+ * @code_start: virtual address of this overlay within kernel data
+ * @code_end: virtual end address of this overlay within kernel data
+ *
+ * Fill tail data with the data necessary for pie_arch_fixup to perform
+ * relocations. If tail is NULL, do not update data, but still calculate
+ * the number of bytes required.
+ *
+ * Returns number of bytes required/used for tail on success, -EERROR otherwise.
+ */
+extern int pie_arch_fill_tail(void *tail, void *common_start, void *common_end,
+ void *overlay_start, void *code_start, void *code_end);
+
+#ifdef CONFIG_PIE
+
+/**
+ * __pie_load_data - load and fixup PIE code from kernel data
+ * @pool: pool to allocate memory from and copy code into
+ * @start: virtual start address in kernel of chunk specific code
+ * @end: virtual end address in kernel of chunk specific code
+ * @phys: %true to fixup to physical address of destination, %false to
+ * fixup to virtual address of destination
+ *
+ * Returns a pie_chunk pointer on success, ERR_PTR(-EERROR) otherwise
+ */
+extern struct pie_chunk *__pie_load_data(struct gen_pool *pool,
+ void *start, void *end, bool phys);
+
+/**
+ * pie_to_phys - translate a virtual PIE address into a physical one
+ * @chunk: identifier returned by pie_load_sections
+ * @addr: virtual address within pie chunk
+ *
+ * Returns physical address on success, -1 otherwise
+ */
+extern phys_addr_t pie_to_phys(struct pie_chunk *chunk, unsigned long addr);
+
+extern void __iomem *__kern_to_pie(struct pie_chunk *chunk, void *ptr);
+
+/**
+ * pie_free - free the pool space used by a PIE chunk
+ * @chunk: identifier returned by pie_load_sections
+ */
+extern void pie_free(struct pie_chunk *chunk);
+
+#define __pie_load_sections(pool, name, phys) ({ \
+ extern char __pie_##name##_start[]; \
+ extern char __pie_##name##_end[]; \
+ \
+ __pie_load_data(pool, __pie_##name##_start, \
+ __pie_##name##_end, phys); \
+})
+
+/*
+ * Required for any symbol within a PIE section that is referenced by the
+ * kernel
+ */
+#define EXPORT_PIE_SYMBOL(sym) extern typeof(sym) sym __weak
+
+/* For marking data and functions that should be part of a PIE */
+#define __pie(name) __attribute__ ((__section__(".pie." #name ".text")))
+#define __pie_data(name) __attribute__ ((__section__(".pie." #name ".data")))
+
+#else
+
+static inline struct pie_chunk *__pie_load_data(struct gen_pool *pool,
+ void *start, void *end, bool phys)
+{
+ return ERR_PTR(-EINVAL);
+}
+
+static inline phys_addr_t pie_to_phys(struct pie_chunk *chunk,
+ unsigned long addr)
+{
+ return -1;
+}
+
+static inline void __iomem *__kern_to_pie(struct pie_chunk *chunk, void *ptr)
+{
+ return NULL;
+}
+
+static inline void pie_free(struct pie_chunk *chunk)
+{
+}
+
+#define __pie_load_sections(pool, name, phys) ({ ERR_PTR(-EINVAL); })
+
+#define EXPORT_PIE_SYMBOL(sym)
+
+#define __pie(name)
+#define __pie_data(name)
+
+#endif
+
+/**
+ * pie_load_sections - load and fixup sections associated with the given name
+ * @pool: pool to allocate memory from and copy code into; the code is
+ * fixed up to the virtual address of the destination
+ * @name: the name given to __pie() and __pie_data() when marking
+ * data and code
+ *
+ * Returns a pie_chunk pointer on success, ERR_PTR(-EERROR) otherwise
+ */
+#define pie_load_sections(pool, name) ({ \
+ __pie_load_sections(pool, name, false); \
+})
+
+/**
+ * pie_load_sections_phys - load and fixup sections associated with the given
+ * name for execution with the MMU off
+ *
+ * @pool: pool to allocate memory from and copy code into; the code is
+ * fixed up to the physical address of the destination
+ * @name: the name given to __pie() and __pie_data() when marking
+ * data and code
+ *
+ * Returns a pie_chunk pointer on success, ERR_PTR(-EERROR) otherwise
+ */
+#define pie_load_sections_phys(pool, name) ({ \
+ __pie_load_sections(pool, name, true); \
+})
+
+/**
+ * kern_to_pie - convert a kernel symbol to the virtual address of where
+ * that symbol is loaded into the given PIE chunk.
+ *
+ * @chunk: identifier returned by pie_load_sections
+ * @p: symbol to convert
+ *
+ * Return type is the same as type passed
+ */
+#define kern_to_pie(chunk, p) ({ \
+ void *__ptr = (void *) (p); \
+ typeof(p) __result = (typeof(p)) __kern_to_pie(chunk, __ptr); \
+ __result; \
+})
+
+/**
+ * fn_to_pie - convert a kernel function symbol to the virtual address of where
+ * that symbol is loaded into the given PIE chunk
+ *
+ * @chunk: identifier returned by pie_load_sections
+ * @funcp: function to convert
+ *
+ * Return type is the same as type passed
+ */
+#define fn_to_pie(chunk, funcp) ({ \
+ uintptr_t __kern_addr, __pie_addr; \
+ \
+ __kern_addr = fnptr_to_addr(funcp); \
+ __pie_addr = kern_to_pie(chunk, __kern_addr); \
+ \
+ fnptr_translate(funcp, __pie_addr); \
+})
+
+#endif
diff --git a/lib/Kconfig b/lib/Kconfig
index 71d9f81..d47df14 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -353,6 +353,20 @@ config DQL
config NLATTR
bool

+config HAVE_PIE
+ bool
+ help
+ See Documentation/pie.txt for details.
+
+config PIE
+ bool "Embedded position independent executables"
+ depends on HAVE_PIE
+ help
+ This option adds support for embedding position independent (PIE)
+ executables into the kernel. The PIEs can then be copied into
+ genalloc regions such as SRAM and executed. Some platforms require
+ this for suspend/resume support.
+
#
# Generic 64-bit atomic support is selected if needed
#
diff --git a/lib/Makefile b/lib/Makefile
index 7baccfd..2b6123d 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -145,6 +145,8 @@ obj-$(CONFIG_GENERIC_NET_UTILS) += net_utils.o

obj-$(CONFIG_STMP_DEVICE) += stmp_device.o

+obj-$(CONFIG_PIE) += pie.o
+
libfdt_files = fdt.o fdt_ro.o fdt_wip.o fdt_rw.o fdt_sw.o fdt_strerror.o
$(foreach file, $(libfdt_files), \
$(eval CFLAGS_$(file) = -I$(src)/../scripts/dtc/libfdt))
diff --git a/lib/pie.c b/lib/pie.c
new file mode 100644
index 0000000..c0190dd
--- /dev/null
+++ b/lib/pie.c
@@ -0,0 +1,138 @@
+/*
+ * Copyright 2013 Texas Instruments, Inc.
+ * Russ Dill <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/err.h>
+#include <linux/io.h>
+#include <linux/genalloc.h>
+#include <linux/pie.h>
+#include <asm/cacheflush.h>
+
+struct pie_chunk {
+ struct gen_pool *pool;
+ unsigned long addr;
+ size_t sz;
+};
+
+extern char __pie_common_start[];
+extern char __pie_common_end[];
+extern char __pie_overlay_start[];
+
+int __weak pie_arch_fill_tail(void *tail, void *common_start, void *common_end,
+ void *overlay_start, void *code_start, void *code_end)
+{
+ return 0;
+}
+
+int __weak pie_arch_fixup(struct pie_chunk *chunk, void *base, void *tail,
+ unsigned long offset)
+{
+ return 0;
+}
+
+struct pie_chunk *__pie_load_data(struct gen_pool *pool, void *code_start,
+ void *code_end, bool phys)
+{
+ struct pie_chunk *chunk;
+ unsigned long offset;
+ int ret;
+ char *tail;
+ size_t common_sz;
+ size_t code_sz;
+ size_t tail_sz;
+
+ /* Calculate the tail size */
+ ret = pie_arch_fill_tail(NULL, __pie_common_start, __pie_common_end,
+ __pie_overlay_start, code_start, code_end);
+ if (ret < 0)
+ goto err;
+ tail_sz = ret;
+
+ chunk = kzalloc(sizeof(*chunk), GFP_KERNEL);
+ if (!chunk) {
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ common_sz = __pie_overlay_start - __pie_common_start;
+ code_sz = code_end - code_start;
+
+ chunk->pool = pool;
+ chunk->sz = common_sz + code_sz + tail_sz;
+
+ chunk->addr = gen_pool_alloc(pool, chunk->sz);
+ if (!chunk->addr) {
+ ret = -ENOMEM;
+ goto err_free;
+ }
+
+ /* Copy common code/data */
+ tail = (char *) chunk->addr;
+ memcpy(tail, __pie_common_start, common_sz);
+ tail += common_sz;
+
+ /* Copy chunk specific code/data */
+ memcpy(tail, code_start, code_sz);
+ tail += code_sz;
+
+ /* Fill in tail data */
+ ret = pie_arch_fill_tail(tail, __pie_common_start, __pie_common_end,
+ __pie_overlay_start, code_start, code_end);
+ if (ret < 0)
+ goto err_alloc;
+
+ /* Calculate initial offset */
+ if (phys)
+ offset = gen_pool_virt_to_phys(pool, chunk->addr);
+ else
+ offset = chunk->addr;
+
+ /* Perform arch specific code fixups */
+ ret = pie_arch_fixup(chunk, (void *) chunk->addr, tail, offset);
+ if (ret < 0)
+ goto err_alloc;
+
+ flush_icache_range(chunk->addr, chunk->addr + chunk->sz);
+
+ return chunk;
+
+err_alloc:
+ gen_pool_free(chunk->pool, chunk->addr, chunk->sz);
+
+err_free:
+ kfree(chunk);
+err:
+ return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(__pie_load_data);
+
+phys_addr_t pie_to_phys(struct pie_chunk *chunk, unsigned long addr)
+{
+ return gen_pool_virt_to_phys(chunk->pool, addr);
+}
+EXPORT_SYMBOL_GPL(pie_to_phys);
+
+void __iomem *__kern_to_pie(struct pie_chunk *chunk, void *ptr)
+{
+ uintptr_t offset = (uintptr_t) ptr;
+ offset -= (uintptr_t) __pie_common_start;
+ if (offset >= chunk->sz)
+ return NULL;
+ else
+ return (void *) (chunk->addr + offset);
+}
+EXPORT_SYMBOL_GPL(__kern_to_pie);
+
+void pie_free(struct pie_chunk *chunk)
+{
+ gen_pool_free(chunk->pool, chunk->addr, chunk->sz);
+ kfree(chunk);
+}
+EXPORT_SYMBOL_GPL(pie_free);
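
For readers following the thread, the address translation done by
__kern_to_pie() in lib/pie.c above can be modelled in user space. This is
only a sketch under stated assumptions: struct chunk_model and
model_kern_to_pie() are made-up names, image_start stands in for
__pie_common_start, and a plain byte buffer stands in for the
genalloc-backed SRAM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct pie_chunk. */
struct chunk_model {
	unsigned char *addr;	/* start of the loaded copy (the "SRAM") */
	size_t sz;		/* bytes copied: common + overlay + tail */
};

/*
 * Translate a pointer into the embedded image to the matching address in
 * the loaded copy. As in __kern_to_pie(), the offset is computed as an
 * unsigned value, so a pointer below image_start wraps to a huge offset
 * and is rejected together with anything at or past the end of the chunk.
 */
static void *model_kern_to_pie(const struct chunk_model *chunk,
			       const unsigned char *image_start,
			       const void *ptr)
{
	uintptr_t offset;

	assert(chunk != NULL && chunk->addr != NULL);

	offset = (uintptr_t)ptr - (uintptr_t)image_start;
	if (offset >= chunk->sz)
		return NULL;

	return chunk->addr + offset;
}
```

The single unsigned comparison covers both the "before the image" and the
"past the end" case, which is why the kernel version needs no second test.
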
diff --git a/pie/.gitignore b/pie/.gitignore
new file mode 100644
index 0000000..4f29803
--- /dev/null
+++ b/pie/.gitignore
@@ -0,0 +1,3 @@
+*.syms
+pie.lds
+pie.lds.S
diff --git a/pie/Makefile b/pie/Makefile
new file mode 100644
index 0000000..9afed70
--- /dev/null
+++ b/pie/Makefile
@@ -0,0 +1,85 @@
+#
+# linux/pie/Makefile
+#
+# Copyright 2013 Texas Instruments, Inc.
+# Russ Dill <[email protected]>
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms and conditions of the GNU General Public License,
+# version 2, as published by the Free Software Foundation.
+#
+
+obj-y := pie.bin.o
+
+# Report unresolved symbol references
+ldflags-y += --no-undefined
+# Delete all temporary local symbols
+ldflags-y += -X
+
+# Reset objcopy flags, ARM puts "-O binary" here
+OBJCOPYFLAGS =
+
+# Reference gcc builtins for use in PIE with __pie_
+$(obj)/pie_rename.syms: $(KBUILD_LIBPIE)
+ @$(NM) $^ | awk '{if ($$3) print $$3,"__pie_"$$3}' > $@
+
+# For weakening the links to the original gcc builtins
+$(obj)/pie_weaken.syms: $(KBUILD_LIBPIE)
+ @$(NM) $^ | awk '{if ($$3) print "__pie_"$$3}' > $@
+
+# For embedding address of the symbols copied from the PIE into the kernel
+$(obj)/pie.syms: $(obj)/pie.elf
+ @$(NM) $^ | awk '{if ($$3 && $$2 == toupper($$2)) print $$3,"=","0x"$$1" + _binary_pie_pie_bin_start;"}' > $@
+
+# Collect together the libpie objects
+LDFLAGS_libpie_stage1.o += -r
+
+$(obj)/libpie_stage1.o: $(KBUILD_LIBPIE)
+ $(call if_changed,ld)
+
+# Rename the libpie gcc builtins with a __pie_ prefix
+OBJCOPYFLAGS_libpie_stage2.o += --redefine-syms=$(obj)/pie_rename.syms
+OBJCOPYFLAGS_libpie_stage2.o += --rename-section .text=.pie.text
+
+$(obj)/libpie_stage2.o: $(obj)/libpie_stage1.o
+ $(call if_changed,objcopy)
+
+# Generate a version of vmlinux.o with weakened and renamed references to gcc
+# builtins.
+OBJCOPYFLAGS_pie_stage1.o += --weaken-symbols=$(obj)/pie_weaken.syms
+OBJCOPYFLAGS_pie_stage1.o += --redefine-syms=$(obj)/pie_rename.syms
+
+$(obj)/pie_stage1.o: $(obj)/../vmlinux.o $(obj)/pie_rename.syms $(obj)/pie_weaken.syms
+ $(call if_changed,objcopy)
+
+# Drop in the PIE versions instead
+LDFLAGS_pie_stage2.o += -r
+# Allow _GLOBAL_OFFSET_TABLE_ to be redefined
+LDFLAGS_pie_stage2.o += --defsym=_GLOBAL_OFFSET_TABLE_=_GLOBAL_OFFSET_TABLE_
+
+$(obj)/pie_stage2.o: $(obj)/pie_stage1.o $(obj)/libpie_stage2.o
+ $(call if_changed,ld)
+
+# Drop everything but the pie sections
+OBJCOPYFLAGS_pie_stage3.o += -j ".pie.*"
+
+$(obj)/pie_stage3.o: $(obj)/pie_stage2.o
+ $(call if_changed,objcopy)
+
+# Create the position independent executable
+LDFLAGS_pie.elf += -T $(KBUILD_PIE_LDS) --pie --gc-sections
+
+$(obj)/pie.elf: $(obj)/pie_stage3.o $(KBUILD_PIE_LDS)
+ $(call if_changed,ld)
+
+# Create binary data for the kernel
+OBJCOPYFLAGS_pie.bin += -O binary
+
+$(obj)/pie.bin: $(obj)/pie.elf $(obj)/pie.syms
+ $(call if_changed,objcopy)
+
+# Import the data into the kernel
+OBJCOPYFLAGS_pie.bin.o += -B $(ARCH) -I binary -O $(OBJCOPY_OUTPUT_FORMAT)
+
+$(obj)/pie.bin.o: $(obj)/pie.bin
+ $(call if_changed,objcopy)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 0149949..8cf4971 100644
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -55,12 +55,16 @@ vmlinux_link()
if [ "${SRCARCH}" != "um" ]; then
${LD} ${LDFLAGS} ${LDFLAGS_vmlinux} -o ${2} \
-T ${lds} ${KBUILD_VMLINUX_INIT} \
- --start-group ${KBUILD_VMLINUX_MAIN} --end-group ${1}
+ --start-group \
+ ${KBUILD_VMLINUX_MAIN} \
+ ${KBUILD_VMLINUX_PIE} \
+ --end-group ${1}
else
${CC} ${CFLAGS_vmlinux} -o ${2} \
-Wl,-T,${lds} ${KBUILD_VMLINUX_INIT} \
-Wl,--start-group \
${KBUILD_VMLINUX_MAIN} \
+ ${KBUILD_VMLINUX_PIE} \
-Wl,--end-group \
-lutil ${1}
rm -f linux
@@ -143,10 +147,13 @@ esac
#link vmlinux.o
info LD vmlinux.o
modpost_link vmlinux.o
-
# modpost vmlinux.o to check for section mismatches
${MAKE} -f "${srctree}/scripts/Makefile.modpost" vmlinux.o

+if [ -n "${CONFIG_PIE}" ]; then
+ ${MAKE} -f "${srctree}/scripts/Makefile.build" obj=pie
+fi
+
# Update version
info GEN .version
if [ ! -r .version ]; then
--
1.8.3.2
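
The ARM implementations of pie_arch_fill_tail() and pie_arch_fixup() live
in a different patch of the series and are not shown here. As a rough
illustration only, the two-pass tail sizing used by __pie_load_data() and
the re-applicable relative relocation described in the cover letter might
be modelled in user space as below. The tail layout (a count followed by
byte offsets) and all model_* names are assumptions made for this example,
not the format the patch uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Assumed tail layout: tail[0] is the number of relocations, followed by
 * the byte offset of each pointer-sized slot that needs adjusting.
 *
 * Like pie_arch_fill_tail(), a NULL tail means "only report the size".
 */
static long model_fill_tail(size_t *tail, const size_t *relocs, size_t count)
{
	size_t need = (count + 1) * sizeof(size_t);

	if (tail) {
		tail[0] = count;
		memcpy(&tail[1], relocs, count * sizeof(size_t));
	}

	return (long)need;
}

/*
 * Relative relocation: add delta to every slot listed in the tail. For an
 * image linked at address 0, the first call passes the load address. A
 * later call can pass the difference between the old and the new run
 * address to move the pointers again.
 */
static void model_fixup(unsigned char *base, const size_t *tail,
			uintptr_t delta)
{
	size_t i;

	assert(base != NULL && tail != NULL);

	for (i = 0; i < tail[0]; i++) {
		uintptr_t value;

		/* memcpy avoids assuming the slot is suitably aligned */
		memcpy(&value, base + tail[i + 1], sizeof(value));
		value += delta;
		memcpy(base + tail[i + 1], &value, sizeof(value));
	}
}
```

A caller would first invoke model_fill_tail(NULL, ...) to learn how much
space to reserve, then call it again with the real buffer, mirroring the
two pie_arch_fill_tail() calls in __pie_load_data().
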

2013-09-17 12:46:58

by Russ Dill

Subject: [RFC PATCH 03/11] misc: SRAM: Add option to map SRAM to allow code execution

This is necessary for platforms that use SRAM to execute suspend/resume stubs.

Signed-off-by: Russ Dill <[email protected]>
---
Documentation/devicetree/bindings/misc/sram.txt | 4 ++++
drivers/misc/sram.c | 13 ++++++++++++-
include/linux/platform_data/sram.h | 8 ++++++++
3 files changed, 24 insertions(+), 1 deletion(-)
create mode 100644 include/linux/platform_data/sram.h

diff --git a/Documentation/devicetree/bindings/misc/sram.txt b/Documentation/devicetree/bindings/misc/sram.txt
index 4d0a00e..4fa9af3 100644
--- a/Documentation/devicetree/bindings/misc/sram.txt
+++ b/Documentation/devicetree/bindings/misc/sram.txt
@@ -8,6 +8,10 @@ Required properties:

- reg : SRAM iomem address range

+Optional properties:
+
+- map-exec: Map range to allow code execution
+
Example:

sram: sram@5c000000 {
diff --git a/drivers/misc/sram.c b/drivers/misc/sram.c
index d87cc91..baa5008 100644
--- a/drivers/misc/sram.c
+++ b/drivers/misc/sram.c
@@ -28,6 +28,7 @@
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/genalloc.h>
+#include <linux/platform_data/sram.h>

#define SRAM_GRANULARITY 32

@@ -38,14 +39,24 @@ struct sram_dev {

static int sram_probe(struct platform_device *pdev)
{
+ struct sram_platform_data *pdata = pdev->dev.platform_data;
void __iomem *virt_base;
struct sram_dev *sram;
struct resource *res;
unsigned long size;
+ bool map_exec = false;
int ret;

+ if (of_get_property(pdev->dev.of_node, "map-exec", NULL))
+ map_exec = true;
+ if (pdata && pdata->map_exec)
+ map_exec |= true;
+
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
- virt_base = devm_ioremap_resource(&pdev->dev, res);
+ if (map_exec)
+ virt_base = devm_ioremap_exec_resource(&pdev->dev, res);
+ else
+ virt_base = devm_ioremap_resource(&pdev->dev, res);
if (IS_ERR(virt_base))
return PTR_ERR(virt_base);

diff --git a/include/linux/platform_data/sram.h b/include/linux/platform_data/sram.h
new file mode 100644
index 0000000..8f5c4ba
--- /dev/null
+++ b/include/linux/platform_data/sram.h
@@ -0,0 +1,8 @@
+#ifndef _LINUX_SRAM_H
+#define _LINUX_SRAM_H
+
+struct sram_platform_data {
+ bool map_exec;
+};
+
+#endif
--
1.8.3.2

2013-09-17 14:23:53

by Geert Uytterhoeven

Subject: Re: [RFC PATCH 04/11] asm-generic: fncpy: Add function copying macros

On Tue, Sep 17, 2013 at 2:43 PM, Russ Dill <[email protected]> wrote:
> +++ b/arch/alpha/include/asm/fncpy.h
> @@ -0,0 +1 @@
> +#include <asm-generic/fncpy.h>

Please add

generic-y += fncpy.h

to arch/<arch>/include/asm/Kbuild instead.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2013-10-23 14:11:48

by Linus Walleij

Subject: Re: [RFC PATCH 00/11] Embeddable Position Independent Executable

On Tue, Sep 17, 2013 at 2:43 PM, Russ Dill <[email protected]> wrote:

> This patch adds support for and demonstrates the usage of an embedded
> position independent executable (PIE). The goal is to allow the use of C
> code in situations where carefully written position independent assembly
> was previously required.

This is perfectly applicable to the ARM TCM memory as well.

Currently we have arch/arm/kernel/tcm.c and related extensions to
kernel/vmlinux.lds.S, which enables us to tag some certain code
to be compiled into TCM memory which we map statically to
0xfffe0000 thru 0xfffeffff, but this is a better approach, especially
nice since it is the solution to the multiplatform situation, as we
need to select to copy code into that memory only on the specific
target.

I'll try to have a deeper look at this, please keep me on CC for
this series.

Yours,
Linus Walleij

2013-10-24 08:09:30

by Heiko Stübner

Subject: Re: [RFC PATCH 00/11] Embeddable Position Independent Executable

On Tuesday, 17 September 2013, 14:43:26, Russ Dill wrote:
> This patch adds support for and demonstrates the usage of an embedded
> position independent executable (PIE). The goal is to allow the use of C
> code in situations where carefully written position independent assembly
> was previously required.

As suggested yesterday evening by Kevin Hilman, just adding my 2ct of support.

This series looks exactly like the foundation I'll need at some point in the
(probably still distant) future to handle suspend on my Rockchip platform -
where like in your example stuff like putting the ram into selfrefresh has to
be done by the os.


Thanks for the work on this
Heiko

2014-04-22 08:13:42

by Heiko Stübner

Subject: Re: [RFC PATCH 00/11] Embeddable Position Independent Executable

Hi Russ,

On Thursday, 24 October 2013, 10:09:12, Heiko Stübner wrote:
> On Tuesday, 17 September 2013, 14:43:26, Russ Dill wrote:
> > This patch adds support for and demonstrates the usage of an embedded
> > position independent executable (PIE). The goal is to allow the use of C
> > code in situations where carefully written position independent assembly
> > was previously required.
>
> As suggested yesterday evening by Kevin Hilman, just adding my 2ct of
> support.
>
> This series looks exactly like the foundation I'll need at some point in the
> (probably still distant) future to handle suspend on my Rockchip platform -
> where like in your example stuff like putting the ram into selfrefresh has
> to be done by the os.

This came up again for me recently and I couldn't find any newer version.
Is this series still on the table?


Thanks
Heiko

2014-06-04 13:36:34

by Pascal Huerst

Subject: Re: [RFC PATCH 00/11] Embeddable Position Independent Executable

Hi Russ,

On 22.04.2014 10:15, Heiko Stübner wrote:
> Hi Russ,
>
> On Thursday, 24 October 2013, 10:09:12, Heiko Stübner wrote:
>> On Tuesday, 17 September 2013, 14:43:26, Russ Dill wrote:
>>> This patch adds support for and demonstrates the usage of an embedded
>>> position independent executable (PIE). The goal is to allow the use of C
>>> code in situations where carefully written position independent assembly
>>> was previously required.
>>
>> As suggested yesterday evening by Kevin Hilman, just adding my 2ct of
>> support.
>>
>> This series looks exactly like the foundation I'll need at some point in the
>> (probably still distant) future to handle suspend on my Rockchip platform -
>> where like in your example stuff like putting the ram into selfrefresh has
>> to be done by the os.
>
> This came up again for me recently and I couldn't find any newer version.
> Is this series still on the table?

We have to maintain quite a few patches to keep PM working for our
project, so we're interested in getting this into mainline, too. Is someone
taking care of these issues? Or is there something I can do to help
get this done?

regards
pascal