OMAP4 introduces a Hardware Spinlock device, which provides hardware
assistance for synchronization and mutual exclusion between heterogeneous
processors and those not operating under a single, shared operating system
(e.g. OMAP4 has dual Cortex-A9, dual Cortex-M3 and a C64x+ DSP).
The intention of this hardware device is to allow remote processors,
which have no alternative mechanism to accomplish synchronization and
mutual exclusion operations, to share resources (such as memory and/or
any other hardware resource).
This patchset adds a generic hwspinlock framework that makes it possible
for drivers to use those hwspinlock devices and stay platform-independent.
Currently there are two users for this hwspinlock interface:
1. Inter-processor communications: on OMAP4, cpu-intensive multimedia
tasks are offloaded by the host to the remote M3 and/or C64x+ slave
processors.
To achieve fast message-based communications, a minimal kernel support
is needed to deliver messages arriving from a remote processor to the
appropriate user process.
This communication is based on a simple data structure that is shared between
the remote processors, and access to it is synchronized using the hwspinlock
module (the remote processor directly places new messages in this shared data
structure); a rough sketch of this scheme is shown below.
2. On some OMAP4 boards, the I2C bus is shared between the A9 and the M3,
and the hwspinlock is used to synchronize access to it.
While (2) can get away with an omap-specific hwspinlock implementation,
(1) is by no means omap-specific, and a common hwspinlock interface is
needed to keep it generic, in hopes that it will be useful for other
vendors as well.
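For the sake of illustration, here is a rough, hypothetical sketch of the
consumer side of use case (1). The ring layout, names and error handling
below are illustrative only, and not part of this patchset:

#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/types.h>
#include <linux/hwspinlock.h>

/*
 * An illustrative shared ring, placed in memory that both the host and
 * the remote processor can access.
 */
struct ipc_ring {
	u32 head;		/* written by the remote producer */
	u32 tail;		/* written by the local consumer */
	u32 msgs[62];
};

/* pop one pending message, if any; hwlock guards the head/tail updates */
static int ipc_pop(struct hwspinlock *hwlock, struct ipc_ring *ring, u32 *msg)
{
	int ret;

	ret = hwspin_lock(hwlock);	/* busy loops; never sleeps */
	if (ret)
		return ret;

	if (ring->tail == ring->head) {
		ret = -EAGAIN;		/* no pending messages */
	} else {
		*msg = ring->msgs[ring->tail % ARRAY_SIZE(ring->msgs)];
		ring->tail++;
	}

	hwspin_unlock(hwlock);
	return ret;
}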
Changes since v1
================
- Convert to a generic interface (Tony)
- Provide API flexibility regarding irq handling (Arnd, Grant)
Note: after reviewing OMAP's L4 access times, and comparing them with
external memory latencies, I can say that there is no notable difference.
Because of that, we can safely treat the hwspinlock like we do
with regular spinlocks: preemption should be disabled, but whether
to disable interrupts or not is up to the caller.
So despite the TRM's recommendation to always disable local interrupts
when taking an OMAP Hardware Spinlock, I have decided to allow callers
not to do that (by providing the full extent of the hwspin_lock(),
hwspin_lock_irq() and hwspin_lock_irqsave() API; a short sketch follows
this change list).
Just like regular spinlocks, it's up to the callers to decide whether
interrupts should be disabled or not.
Sleeping, btw, is still prohibited of course.
- API should silently succeed if framework isn't built (Greg)
- Don't use ERR_PTR pattern (Grant)
- Use tristate, fix and extend commentary (Kevin)
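Here is a short sketch of that flexibility; hwlock is assumed to be a
previously-requested lock, and error handling is elided for brevity:

	unsigned long flags;

	/* caller doesn't care about interrupts (e.g. already disabled) */
	hwspin_lock(hwlock);
	/* ... critical section, no sleeping ... */
	hwspin_unlock(hwlock);

	/* caller wants interrupts disabled, previous state isn't needed */
	hwspin_lock_irq(hwlock);
	/* ... critical section, no sleeping ... */
	hwspin_unlock_irq(hwlock);

	/* caller wants interrupts disabled, and their state saved/restored */
	hwspin_lock_irqsave(hwlock, &flags);
	/* ... critical section, no sleeping ... */
	hwspin_unlock_irqrestore(hwlock, &flags);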
Patches are tested against linux-omap-2.6 master, which is 2.6.37-rc2 plus
omap material, but they apply on top of mainline 2.6.37-rc3 as well.
Contributions
=============
Previous versions of an omap-specific hwspinlock driver circulated in
linux-omap several times, and received substantial attention and contribution
from many developers (see [1][2][3][4][5][6]):
Simon Que did the initial implementation and pushed several iterations
Benoit Cousson provided extensive review, help, improvements and hwmod support
Hari Kanigeri helped out when Simon was away
Sanjeev Premi, Santosh Shilimkar and Nishanth Menon did lots of review
I'd like to thank Benoit Cousson, Steve Krueger, Hari Kanigeri,
Nourredine Hamoudi and Richard Woodruff for useful discussions about
the OMAP Spinlock requirements and use-cases.
Relevant linux-omap threads:
[1] http://thread.gmane.org/gmane.linux.ports.arm.omap/38755
[2] http://thread.gmane.org/gmane.linux.ports.arm.omap/38917
[3] http://thread.gmane.org/gmane.linux.ports.arm.omap/39187
[4] http://thread.gmane.org/gmane.linux.ports.arm.omap/39365
[5] http://thread.gmane.org/gmane.linux.ports.arm.omap/39815
[6] http://thread.gmane.org/gmane.linux.ports.arm.omap/40901
Benoit Cousson (1):
OMAP4: hwmod data: Add hwspinlock
Ohad Ben-Cohen (1):
drivers: hwspinlock: add generic framework
Simon Que (2):
drivers: hwspinlock: add OMAP implementation
omap: add hwspinlock device
Documentation/hwspinlock.txt | 339 +++++++++++++++++
arch/arm/mach-omap2/Makefile | 1 +
arch/arm/mach-omap2/hwspinlock.c | 63 +++
arch/arm/mach-omap2/omap_hwmod_44xx_data.c | 63 +++
drivers/Kconfig | 2 +
drivers/Makefile | 1 +
drivers/hwspinlock/Kconfig | 22 ++
drivers/hwspinlock/Makefile | 6 +
drivers/hwspinlock/hwspinlock.h | 61 +++
drivers/hwspinlock/hwspinlock_core.c | 561 ++++++++++++++++++++++++++++
drivers/hwspinlock/omap_hwspinlock.c | 231 ++++++++++++
include/linux/hwspinlock.h | 376 +++++++++++++++++++
12 files changed, 1726 insertions(+), 0 deletions(-)
create mode 100644 Documentation/hwspinlock.txt
create mode 100644 arch/arm/mach-omap2/hwspinlock.c
create mode 100644 drivers/hwspinlock/Kconfig
create mode 100644 drivers/hwspinlock/Makefile
create mode 100644 drivers/hwspinlock/hwspinlock.h
create mode 100644 drivers/hwspinlock/hwspinlock_core.c
create mode 100644 drivers/hwspinlock/omap_hwspinlock.c
create mode 100644 include/linux/hwspinlock.h
Add a common, platform-independent hwspinlock framework.
Hardware spinlock devices are needed, e.g., in order to access data
that is shared between remote processors that otherwise have no
alternative mechanism to accomplish synchronization and mutual exclusion
operations.
Signed-off-by: Ohad Ben-Cohen <[email protected]>
Cc: Hari Kanigeri <[email protected]>
Cc: Benoit Cousson <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Grant Likely <[email protected]>
Cc: Tony Lindgren <[email protected]>
---
Documentation/hwspinlock.txt | 339 ++++++++++++++++++++
drivers/Kconfig | 2 +
drivers/Makefile | 1 +
drivers/hwspinlock/Kconfig | 13 +
drivers/hwspinlock/Makefile | 5 +
drivers/hwspinlock/hwspinlock.h | 61 ++++
drivers/hwspinlock/hwspinlock_core.c | 561 ++++++++++++++++++++++++++++++++++
include/linux/hwspinlock.h | 376 +++++++++++++++++++++++
8 files changed, 1358 insertions(+), 0 deletions(-)
create mode 100644 Documentation/hwspinlock.txt
create mode 100644 drivers/hwspinlock/Kconfig
create mode 100644 drivers/hwspinlock/Makefile
create mode 100644 drivers/hwspinlock/hwspinlock.h
create mode 100644 drivers/hwspinlock/hwspinlock_core.c
create mode 100644 include/linux/hwspinlock.h
diff --git a/Documentation/hwspinlock.txt b/Documentation/hwspinlock.txt
new file mode 100644
index 0000000..ccbf5de
--- /dev/null
+++ b/Documentation/hwspinlock.txt
@@ -0,0 +1,339 @@
+Common Hardware Spinlock Framework
+
+1. Introduction
+
+Hardware spinlock modules provide hardware assistance for synchronization
+and mutual exclusion between heterogeneous processors and those not operating
+under a single, shared operating system.
+
+For example, OMAP4 has dual Cortex-A9, dual Cortex-M3 and a C64x+ DSP,
+each of which is running a different Operating System (the master, A9,
+is usually running Linux and the slave processors, the M3 and the DSP,
+are running some flavor of RTOS).
+
+A generic hwspinlock framework allows platform-independent drivers to use
+the hwspinlock device in order to access data structures that are shared
+between remote processors that otherwise have no alternative mechanism
+to accomplish synchronization and mutual exclusion operations.
+
+This is necessary, for example, for Inter-processor communications:
+on OMAP4, cpu-intensive multimedia tasks are offloaded by the host to the
+remote M3 and/or C64x+ slave processors (by an IPC subsystem called Syslink).
+
+To achieve fast message-based communications, a minimal kernel support
+is needed to deliver messages arriving from a remote processor to the
+appropriate user process.
+
+This communication is based on a simple data structure that is shared
+between the remote processors, and access to it is synchronized using the
+hwspinlock module (the remote processor directly places new messages in
+this shared data structure).
+
+A common hwspinlock interface makes it possible to have generic,
+platform-independent drivers.
+
+2. User API
+
+ struct hwspinlock *hwspinlock_request(void);
+ - dynamically assign an hwspinlock and return its address, or NULL
+ in case an unused hwspinlock isn't available. Users of this
+ API will usually want to communicate the lock's id to the remote core
+ before it can be used to achieve synchronization.
+ Can be called from an atomic context (this function will not sleep) but
+ not from within interrupt context.
+
+ struct hwspinlock *hwspinlock_request_specific(unsigned int id);
+ - assign a specific hwspinlock id and return its address, or NULL
+ if that hwspinlock is already in use. Usually board code will
+ be calling this function in order to reserve specific hwspinlock
+ ids for predefined purposes.
+ Can be called from an atomic context (this function will not sleep) but
+ not from within interrupt context.
+
+ int hwspinlock_free(struct hwspinlock *hwlock);
+ - free a previously-assigned hwspinlock; returns 0 on success, or an
+ appropriate error code on failure (e.g. -EINVAL if the hwspinlock
+ is already free).
+ Can be called from an atomic context (this function will not sleep) but
+ not from within interrupt context.
+
+ int hwspin_lock(struct hwspinlock *hwlock);
+ - lock a previously assigned hwspinlock. If the hwspinlock is already
+ taken, the function will busy loop waiting for it to be released.
+ Note: if a faulty remote core never releases this lock, this function
+ will deadlock.
+ This function will fail only if hwlock is invalid. Otherwise, it will
+ always succeed (or deadlock; see above) and it will never sleep.
+ Upon a successful return from this function, preemption is disabled so
+ the caller must not sleep, and is advised to release the hwspinlock as
+ soon as possible, in order to minimize remote cores polling on the
+ hardware interconnect.
+
+ int hwspin_lock_irq(struct hwspinlock *hwlock);
+ - lock a previously assigned hwspinlock. If the hwspinlock is already
+ taken, the function will busy loop waiting for it to be released.
+ Note: if a faulty remote core never releases this lock, this function
+ will deadlock.
+ This function will fail only if hwlock is invalid. Otherwise, it will
+ always succeed (or deadlock; see above) and it will never sleep.
+ Upon a successful return from this function, preemption is disabled and
+ the local interrupts are disabled. The caller must not sleep, and is
+ advised to release the hwspinlock as soon as possible.
+
+ int hwspin_lock_irqsave(struct hwspinlock *hwlock, unsigned long *flags);
+ - lock a previously assigned hwspinlock. If the hwspinlock is already
+ taken, the function will busy loop waiting for it to be released.
+ Note: if a faulty remote core never releases this lock, this function
+ will deadlock.
+ This function will fail only if hwlock is invalid. Otherwise, it will
+ always succeed (or deadlock; see above) and it will never sleep.
+ Upon a successful return from this function, preemption is disabled,
+ the local interrupts are disabled, and their previous state is saved
+ in the given flags placeholder. The caller must not sleep, and is
+ advised to release the hwspinlock as soon as possible.
+
+ int hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned long timeout);
+ - lock a previously-assigned hwspinlock with a timeout limit (specified in
+ jiffies). If the hwspinlock is already taken, the function will busy loop
+ waiting for it to be released, but give up when the timeout meets jiffies.
+ If timeout is 0, the function will never give up (therefore if a faulty
+ remote core never releases the hwspinlock, it will deadlock).
+ Upon a successful return from this function, preemption is disabled so
+ the caller must not sleep, and is advised to release the hwspinlock as
+ soon as possible, in order to minimize remote cores polling on the
+ hardware interconnect.
+ Returns 0 when successful and an appropriate error code otherwise (most
+ notably -ETIMEDOUT if the hwspinlock is still busy after timeout meets
+ jiffies). The function will never sleep.
+
+ int hwspin_lock_timeout_irq(struct hwspinlock *hwlock, unsigned long timeout);
+ - lock a previously-assigned hwspinlock with a timeout limit (specified in
+ jiffies). If the hwspinlock is already taken, the function will busy loop
+ waiting for it to be released, but give up when the timeout meets jiffies.
+ If timeout is 0, the function will never give up (therefore if a faulty
+ remote core never releases the hwspinlock, it will deadlock).
+ Upon a successful return from this function, preemption and the local
+ interrupts are disabled, so the caller must not sleep, and is advised to
+ release the hwspinlock as soon as possible.
+ Returns 0 when successful and an appropriate error code otherwise (most
+ notably -ETIMEDOUT if the hwspinlock is still busy after timeout meets
+ jiffies). The function will never sleep.
+
+ int hwspin_lock_timeout_irqsave(struct hwspinlock *hwlock, unsigned long to,
+ unsigned long *flags);
+ - lock a previously-assigned hwspinlock with a timeout limit (specified in
+ jiffies). If the hwspinlock is already taken, the function will busy loop
+ waiting for it to be released, but give up when the timeout meets jiffies.
+ If timeout is 0, the function will never give up (therefore if a faulty
+ remote core never releases the hwspinlock, it will deadlock).
+ Upon a successful return from this function, preemption is disabled,
+ local interrupts are disabled and their previous state is saved at the
+ given flags placeholder. The caller must not sleep, and is advised to
+ release the hwspinlock as soon as possible.
+ Returns 0 when successful and an appropriate error code otherwise (most
+ notably -ETIMEDOUT if the hwspinlock is still busy after timeout meets
+ jiffies). The function will never sleep.
+
+ int hwspin_trylock(struct hwspinlock *hwlock);
+ - attempt to lock a previously-assigned hwspinlock, but immediately fail if
+ it is already taken.
+ Upon a successful return from this function, preemption is disabled so
+ the caller must not sleep, and is advised to release the hwspinlock as
+ soon as possible, in order to minimize remote cores polling on the
+ hardware interconnect.
+ Returns 0 on success and an appropriate error code otherwise (most
+ notably -EBUSY if the hwspinlock was already taken).
+ The function will never sleep.
+
+ int hwspin_trylock_irq(struct hwspinlock *hwlock);
+ - attempt to lock a previously-assigned hwspinlock, but immediately fail if
+ it is already taken.
+ Upon a successful return from this function, preemption and the local
+ interrupts are disabled so the caller must not sleep, and is advised to
+ release the hwspinlock as soon as possible.
+ Returns 0 on success and an appropriate error code otherwise (most
+ notably -EBUSY if the hwspinlock was already taken).
+ The function will never sleep.
+
+ int hwspin_trylock_irqsave(struct hwspinlock *hwlock, unsigned long *flags);
+ - attempt to lock a previously-assigned hwspinlock, but immediately fail if
+ it is already taken.
+ Upon a successful return from this function, preemption is disabled,
+ the local interrupts are disabled and their previous state is saved
+ at the given flags placeholder. The caller must not sleep, and is advised
+ to release the hwspinlock as soon as possible.
+ Returns 0 on success and an appropriate error code otherwise (most
+ notably -EBUSY if the hwspinlock was already taken).
+ The function will never sleep.
+
+ int hwspin_unlock(struct hwspinlock *hwlock);
+ - unlock a previously-locked hwspinlock. Always succeeds, and can be called
+ from any context (the function never sleeps). Note: code should _never_
+ unlock an hwspinlock which is already unlocked (there is no protection
+ against this).
+
+ int hwspin_unlock_irq(struct hwspinlock *hwlock);
+ - unlock a previously-locked hwspinlock and enable local interrupts.
+ The caller should _never_ unlock an hwspinlock which is already unlocked.
+ Doing so is considered a bug (there is no protection against this).
+ Upon a successful return from this function, preemption and local
+ interrupts are enabled. This function will never sleep.
+ Returns 0 when successful and -EINVAL if hwlock is invalid.
+
+ int hwspin_unlock_irqrestore(struct hwspinlock *hwlock, unsigned long *flags);
+ - unlock a previously-locked hwspinlock.
+ The caller should _never_ unlock an hwspinlock which is already unlocked.
+ Doing so is considered a bug (there is no protection against this).
+ Upon a successful return from this function, preemption is reenabled,
+ and the state of the local interrupts is restored to the state saved at
+ the given flags. This function will never sleep.
+ Returns 0 when successful and -EINVAL if hwlock is invalid.
+
+ int hwspinlock_get_id(struct hwspinlock *hwlock);
+ - retrieve id number of a given hwspinlock. This is needed when an
+ hwspinlock is dynamically assigned: before it can be used to achieve
+ mutual exclusion with a remote cpu, the id number should be communicated
+ to the remote task with which we want to synchronize.
+ Returns the hwspinlock id number, or -EINVAL if hwlock is null.
+
+3. Typical usage
+
+#include <linux/hwspinlock.h>
+#include <linux/err.h>
+
+int hwspinlock_example1(void)
+{
+ struct hwspinlock *hwlock;
+ int id, ret;
+
+ /* dynamically assign a hwspinlock */
+ hwlock = hwspinlock_request();
+ if (!hwlock)
+ ...
+
+ id = hwspinlock_get_id(hwlock);
+ /* probably need to communicate id to a remote processor now */
+
+ /* take the lock, spin if it's already taken */
+ ret = hwspin_lock(hwlock);
+ if (ret)
+ ...
+
+ /*
+ * we took the lock, do our thing now, but do NOT sleep
+ */
+
+ /* release the lock */
+ ret = hwspin_unlock(hwlock);
+ if (ret)
+ ...
+
+ /* free the lock */
+ ret = hwspinlock_free(hwlock);
+ if (ret)
+ ...
+
+ return ret;
+}
+
+int hwspinlock_example2(void)
+{
+ struct hwspinlock *hwlock;
+ int ret;
+
+ /*
+ * assign a specific hwspinlock id - this should be called early
+ * by board init code.
+ */
+ hwlock = hwspinlock_request_specific(PREDEFINED_LOCK_ID);
+ if (!hwlock)
+ ...
+
+ /* try to take it, but don't spin on it */
+ ret = hwspin_trylock(hwlock);
+ if (ret) {
+ pr_info("lock is already taken\n");
+ return ret;
+ }
+
+ /*
+ * we took the lock, do our thing now, but do NOT sleep
+ */
+
+ /* release the lock */
+ ret = hwspin_unlock(hwlock);
+ if (ret)
+ ...
+
+ /* free the lock */
+ ret = hwspinlock_free(hwlock);
+ if (ret)
+ ...
+
+ return ret;
+}
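+
+The following example is a sketch only (PREDEFINED_LOCK_ID and the chosen
+timeout are illustrative); it demonstrates the timeout variant, which is
+usually preferable to the potentially-deadlocking hwspin_lock(). Note that
+the timeout argument is compared against jiffies directly (so
+<linux/jiffies.h> is needed), i.e. it is an absolute deadline:
+
+int hwspinlock_example3(void)
+{
+	struct hwspinlock *hwlock;
+	unsigned long flags;
+	int ret;
+
+	hwlock = hwspinlock_request_specific(PREDEFINED_LOCK_ID);
+	if (!hwlock)
+		return -EBUSY;
+
+	/* take the lock, give up if it's still held 100 jiffies from now */
+	ret = hwspin_lock_timeout_irqsave(hwlock, jiffies + 100, &flags);
+	if (ret)
+		goto out;
+
+	/*
+	 * we took the lock, do our thing now, but do NOT sleep
+	 */
+
+	ret = hwspin_unlock_irqrestore(hwlock, &flags);
+
+out:
+	hwspinlock_free(hwlock);
+	return ret;
+}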
+
+
+4. API for implementors
+
+ int hwspinlock_register(struct hwspinlock *hwlock);
+ - to be called from the underlying platform-specific implementation, in
+ order to register a new hwspinlock instance. Can be called from an atomic
+ context (this function will not sleep) but not from within interrupt
+ context. Returns 0 on success, or appropriate error code on failure.
+
+ struct hwspinlock *hwspinlock_unregister(unsigned int id);
+ - to be called from the underlying vendor-specific implementation, in order
+ to unregister an existing (and unused) hwspinlock instance.
+ Can be called from an atomic context (will not sleep) but not from
+ within interrupt context.
+ Returns the address of the hwspinlock on success, or NULL on error (e.g.
+ if the hwspinlock is still in use).
+
+5. struct hwspinlock
+
+This struct represents an hwspinlock instance. It is registered by the
+underlying hwspinlock implementation using the hwspinlock_register() API.
+
+/**
+ * struct hwspinlock - vendor-specific hwspinlock implementation
+ *
+ * @dev: underlying device, will be used with runtime PM api
+ * @ops: vendor-specific hwspinlock handlers
+ * @id: a global, unique, system-wide, index of the lock.
+ * @lock: initialized and used by hwspinlock core
+ * @owner: underlying implementation module, used to maintain module ref count
+ */
+struct hwspinlock {
+ struct device *dev;
+ const struct hwspinlock_ops *ops;
+ int id;
+ spinlock_t lock;
+ struct module *owner;
+};
+
+The underlying implementation is responsible for assigning the dev, ops, id
+and owner members. The lock member, on the other hand, is initialized and
+used by the hwspinlock core.
+
+6. Implementation callbacks
+
+There are three possible callbacks defined in 'struct hwspinlock_ops':
+
+struct hwspinlock_ops {
+ int (*trylock)(struct hwspinlock *lock);
+ void (*unlock)(struct hwspinlock *lock);
+ void (*relax)(struct hwspinlock *lock);
+};
+
+The first two callbacks are mandatory:
+
+The ->trylock() callback should make a single attempt to take the lock, and
+return 0 on failure and 1 on success. This callback may _not_ sleep.
+
+The ->unlock() callback releases the lock. It always succeeds, and it, too,
+may _not_ sleep.
+
+The ->relax() callback is optional. It is called by hwspinlock core while
+spinning on a lock, and can be used by the underlying implementation to force
+a delay between two successive invocations of ->trylock(). It may _not_ sleep.
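+
+7. Example implementation sketch
+
+The following is a minimal, hypothetical implementation-side sketch. It
+assumes a memory-mapped device where reading a per-lock register returns 1
+when the lock was successfully acquired, and writing 0 releases it; the
+register semantics and all names here are illustrative only:
+
+#include <linux/kernel.h>
+#include <linux/io.h>
+
+#include "hwspinlock.h"	/* struct hwspinlock, struct hwspinlock_ops */
+
+struct example_hwspinlock {
+	struct hwspinlock lock;
+	void __iomem *addr;	/* this lock's register */
+};
+
+static int example_hwspinlock_trylock(struct hwspinlock *lock)
+{
+	struct example_hwspinlock *e;
+
+	e = container_of(lock, struct example_hwspinlock, lock);
+
+	/* a single attempt; the core busy loops and calls ->relax as needed */
+	return readl(e->addr) == 1;
+}
+
+static void example_hwspinlock_unlock(struct hwspinlock *lock)
+{
+	struct example_hwspinlock *e;
+
+	e = container_of(lock, struct example_hwspinlock, lock);
+
+	writel(0, e->addr);
+}
+
+static const struct hwspinlock_ops example_hwspinlock_ops = {
+	.trylock	= example_hwspinlock_trylock,
+	.unlock		= example_hwspinlock_unlock,
+	/* ->relax is optional, and omitted here */
+};
+
+The probe code of such a driver would then assign the dev, ops, id and
+owner members of each instance, and hand every lock to the core with
+hwspinlock_register().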
diff --git a/drivers/Kconfig b/drivers/Kconfig
index a2b902f..c0e2eab 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -111,4 +111,6 @@ source "drivers/xen/Kconfig"
source "drivers/staging/Kconfig"
source "drivers/platform/Kconfig"
+
+source "drivers/hwspinlock/Kconfig"
endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index cafb881..bb0c472 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -115,3 +115,4 @@ obj-$(CONFIG_VLYNQ) += vlynq/
obj-$(CONFIG_STAGING) += staging/
obj-y += platform/
obj-y += ieee802154/
+obj-$(CONFIG_HWSPINLOCK) += hwspinlock/
diff --git a/drivers/hwspinlock/Kconfig b/drivers/hwspinlock/Kconfig
new file mode 100644
index 0000000..9dd8db4
--- /dev/null
+++ b/drivers/hwspinlock/Kconfig
@@ -0,0 +1,13 @@
+#
+# Generic HWSPINLOCK framework
+#
+
+config HWSPINLOCK
+ tristate "Generic Hardware Spinlock framework"
+ help
+ Say y here to support the generic hardware spinlock framework.
+ You only need to enable this if you have hardware spinlock module
+ on your system (usually only relevant if your system has remote slave
+ coprocessors).
+
+ If unsure, say N.
diff --git a/drivers/hwspinlock/Makefile b/drivers/hwspinlock/Makefile
new file mode 100644
index 0000000..b9d2b9f
--- /dev/null
+++ b/drivers/hwspinlock/Makefile
@@ -0,0 +1,5 @@
+#
+# Generic Hardware Spinlock framework
+#
+
+obj-$(CONFIG_HWSPINLOCK) += hwspinlock_core.o
diff --git a/drivers/hwspinlock/hwspinlock.h b/drivers/hwspinlock/hwspinlock.h
new file mode 100644
index 0000000..69935e6
--- /dev/null
+++ b/drivers/hwspinlock/hwspinlock.h
@@ -0,0 +1,61 @@
+/*
+ * Hardware spinlocks internal header
+ *
+ * Copyright (C) 2010 Texas Instruments Incorporated - http://www.ti.com
+ *
+ * Contact: Ohad Ben-Cohen <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __HWSPINLOCK_HWSPINLOCK_H
+#define __HWSPINLOCK_HWSPINLOCK_H
+
+#include <linux/spinlock.h>
+#include <linux/device.h>
+
+/**
+ * struct hwspinlock_ops - platform-specific hwspinlock handlers
+ *
+ * @trylock: make a single attempt to take the lock. returns 0 on
+ * failure and 1 on success. may _not_ sleep.
+ * @unlock: release the lock. always succeeds. may _not_ sleep.
+ * @relax: optional, platform-specific relax handler, called by hwspinlock
+ * core while spinning on a lock, between two successive
+ * invocations of @trylock. may _not_ sleep.
+ */
+struct hwspinlock_ops {
+ int (*trylock)(struct hwspinlock *lock);
+ void (*unlock)(struct hwspinlock *lock);
+ void (*relax)(struct hwspinlock *lock);
+};
+
+/**
+ * struct hwspinlock - this struct represents a single hwspinlock instance
+ *
+ * @dev: underlying device, will be used to invoke runtime PM api
+ * @ops: platform-specific hwspinlock handlers
+ * @id: a global, unique, system-wide, index of the lock.
+ * @lock: initialized and used by hwspinlock core
+ * @owner: underlying implementation module, used to maintain module ref count
+ *
+ * Note: we currently opt for simplicity, but later we can squeeze some
+ * memory bytes by grouping the dev, ops and owner members in a single
+ * per-platform struct, and have all hwspinlocks point at it.
+ */
+struct hwspinlock {
+ struct device *dev;
+ const struct hwspinlock_ops *ops;
+ int id;
+ spinlock_t lock;
+ struct module *owner;
+};
+
+#endif /* __HWSPINLOCK_HWSPINLOCK_H */
diff --git a/drivers/hwspinlock/hwspinlock_core.c b/drivers/hwspinlock/hwspinlock_core.c
new file mode 100644
index 0000000..cd230bc
--- /dev/null
+++ b/drivers/hwspinlock/hwspinlock_core.c
@@ -0,0 +1,561 @@
+/*
+ * Generic hardware spinlock framework
+ *
+ * Copyright (C) 2010 Texas Instruments Incorporated - http://www.ti.com
+ *
+ * Contact: Ohad Ben-Cohen <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#define pr_fmt(fmt) "%s: " fmt, __func__
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/spinlock.h>
+#include <linux/types.h>
+#include <linux/err.h>
+#include <linux/jiffies.h>
+#include <linux/radix-tree.h>
+#include <linux/hwspinlock.h>
+#include <linux/pm_runtime.h>
+
+#include "hwspinlock.h"
+
+/* radix tree tags */
+#define HWSPINLOCK_UNUSED (0) /* tags an hwspinlock as unused */
+
+/*
+ * A radix tree is used to maintain the available hwspinlock instances.
+ * The tree associates hwspinlock pointers with their integer key id,
+ * and provides easy-to-use API which makes the hwspinlock core code simple
+ * and easy to read.
+ *
+ * Radix trees are quick on lookups, and reasonably efficient in terms of
+ * storage, especially with high density usages such as this framework
+ * requires (a continuous range of integer keys, beginning with zero, is
+ * used as the IDs of the hwspinlock instances).
+ *
+ * The radix tree API supports tagging items in the tree, which this
+ * framework uses to mark unused hwspinlock instances (see the
+ * HWSPINLOCK_UNUSED tag above). As a result, the process of querying the
+ * tree, looking for an unused hwspinlock instance, is now reduced to a
+ * single radix tree API call.
+ */
+static RADIX_TREE(hwspinlock_tree, GFP_KERNEL);
+
+/*
+ * Synchronization of access to the tree is achieved using this spinlock,
+ * as the radix-tree API requires that users provide all synchronization.
+ */
+static DEFINE_SPINLOCK(hwspinlock_tree_lock);
+
+/**
+ * __hwspin_trylock() - attempt to lock a specific hwspinlock
+ * @hwlock: an hwspinlock which we want to trylock
+ * @mode: controls whether local interrupts are disabled or not
+ * @flags: a pointer to where the caller's interrupt state will be saved
+ * at (if requested)
+ *
+ * This function attempts to lock an hwspinlock, and will immediately
+ * fail if the hwspinlock is already taken.
+ *
+ * Upon a successful return from this function, preemption (and possibly
+ * interrupts) is disabled, so the caller must not sleep, and is advised to
+ * release the hwspinlock as soon as possible. This is required in order to
+ * minimize remote cores polling on the hardware interconnect.
+ *
+ * The user decides whether local interrupts are disabled or not, and if yes,
+ * whether he wants their previous state to be saved. It is up to the user
+ * to choose the appropriate @mode of operation, exactly the same way users
+ * should decide between spin_trylock, spin_trylock_irq and
+ * spin_trylock_irqsave.
+ *
+ * Returns 0 if we successfully locked the hwspinlock, -EBUSY if
+ * the hwspinlock was already taken, and -EINVAL if @hwlock is invalid.
+ * This function will never sleep.
+ */
+int __hwspin_trylock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
+{
+ int ret;
+
+ if (unlikely(!hwlock)) {
+ pr_err("invalid hwlock\n");
+ return -EINVAL;
+ }
+ if (unlikely(mode == HWLOCK_IRQSTATE && flags == NULL)) {
+ pr_err("invalid flags pointer (null)\n");
+ return -EINVAL;
+ }
+
+ /*
+ * This spin_lock{_irq, _irqsave} serves three purposes:
+ *
+ * 1. Disable preemption, in order to minimize the period of time
+ * in which the hwspinlock is taken. This is important in order
+ * to minimize the possible polling on the hardware interconnect
+ * by a remote user of this lock.
+ * 2. Make the hwspinlock SMP-safe (so we can take it from
+ * additional contexts on the local host).
+ * 3. Ensure that in_atomic/might_sleep checks catch potential
+ * problems with hwspinlock usage (e.g. scheduler checks like
+ * 'scheduling while atomic' etc.)
+ */
+ if (mode == HWLOCK_IRQSTATE)
+ ret = spin_trylock_irqsave(&hwlock->lock, *flags);
+ else if (mode == HWLOCK_IRQ)
+ ret = spin_trylock_irq(&hwlock->lock);
+ else
+ ret = spin_trylock(&hwlock->lock);
+
+ /* is lock already taken by another context on the local cpu ? */
+ if (!ret)
+ return -EBUSY;
+
+ /* try to take the hwspinlock device */
+ ret = hwlock->ops->trylock(hwlock);
+
+ /* if hwlock is already taken, undo spin_trylock_* and exit */
+ if (!ret) {
+ if (mode == HWLOCK_IRQSTATE)
+ spin_unlock_irqrestore(&hwlock->lock, *flags);
+ else if (mode == HWLOCK_IRQ)
+ spin_unlock_irq(&hwlock->lock);
+ else
+ spin_unlock(&hwlock->lock);
+
+ return -EBUSY;
+ }
+
+ /*
+ * We can be sure the other core's memory operations
+ * are observable to us only _after_ we successfully take
+ * the hwspinlock, so we must make sure that subsequent memory
+ * operations will not be reordered before we actually took the
+ * hwspinlock.
+ *
+ * Note: the implicit memory barrier of the spinlock above is too
+ * early, so we need this additional explicit memory barrier.
+ */
+ mb();
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(__hwspin_trylock);
+
+/**
+ * __hwspin_lock_timeout() - lock an hwspinlock with timeout limit
+ * @hwlock: the hwspinlock to be locked
+ * @timeout: timeout, in jiffies
+ * @mode: mode which controls whether local interrupts are disabled or not
+ * @flags: a pointer to where the caller's interrupt state will be saved at (if
+ * requested)
+ *
+ * This function locks the given @hwlock. If the @hwlock
+ * is already taken, the function will busy loop waiting for it to
+ * be released, but give up when @timeout meets jiffies. If @timeout
+ * is 0, the function will never give up (therefore if a
+ * faulty remote core never releases the @hwlock, it will deadlock).
+ *
+ * Upon a successful return from this function, preemption is disabled
+ * (and possibly local interrupts, too), so the caller must not sleep,
+ * and is advised to release the hwspinlock as soon as possible.
+ * This is required in order to minimize remote cores polling on the
+ * hardware interconnect.
+ *
+ * The user decides whether local interrupts are disabled or not, and if yes,
+ * whether he wants their previous state to be saved. It is up to the user
+ * to choose the appropriate @mode of operation, exactly the same way users
+ * should decide between spin_lock, spin_lock_irq and spin_lock_irqsave.
+ *
+ * Returns 0 when the @hwlock was successfully taken, and an appropriate
+ * error code otherwise (most notably -ETIMEDOUT if the @hwlock is still
+ * busy after @timeout meets jiffies). The function will never sleep.
+ */
+int __hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned long to,
+ int mode, unsigned long *flags)
+{
+ int ret;
+
+ for (;;) {
+ /* Try to take the hwspinlock */
+ ret = __hwspin_trylock(hwlock, mode, flags);
+ if (ret != -EBUSY)
+ break;
+
+ /*
+ * The lock is already taken, let's check if the user wants
+ * us to try again
+ */
+ if (to && time_is_before_eq_jiffies(to))
+ return -ETIMEDOUT;
+
+ /*
+ * Allow platform-specific relax handlers to prevent
+ * hogging the interconnect (no sleeping, though)
+ */
+ if (hwlock->ops->relax)
+ hwlock->ops->relax(hwlock);
+ }
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(__hwspin_lock_timeout);
+
+/**
+ * __hwspin_unlock() - unlock a specific hwspinlock
+ * @hwlock: a previously-acquired hwspinlock which we want to unlock
+ * @mode: controls whether local interrupts need to be restored or not
+ * @flags: previous caller's interrupt state to restore (if requested)
+ *
+ * This function will unlock a specific hwspinlock, enable preemption and
+ * (possibly) enable interrupts or restore their previous state.
+ * @hwlock must be already locked before calling this function: it is a bug
+ * to call unlock on a @hwlock that is already unlocked.
+ *
+ * The user decides whether local interrupts should be enabled or not, and
+ * if yes, whether he wants their previous state to be restored. It is up
+ * to the user to choose the appropriate @mode of operation, exactly the
+ * same way users decide between spin_unlock, spin_unlock_irq and
+ * spin_unlock_irqrestore.
+ *
+ * The function will never sleep.
+ *
+ * Returns 0 on success, -EINVAL if @hwlock or @flags are invalid.
+ */
+int __hwspin_unlock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
+{
+ if (unlikely(!hwlock)) {
+ pr_err("invalid hwlock\n");
+ return -EINVAL;
+ }
+ if (unlikely(mode == HWLOCK_IRQSTATE && flags == NULL)) {
+ pr_err("invalid flags pointer (null)\n");
+ return -EINVAL;
+ }
+
+ /*
+ * We must make sure that memory operations, done before unlocking
+ * the hwspinlock, will not be reordered after the lock is released.
+ *
+ * That's the purpose of this explicit memory barrier.
+ *
+ * Note: the memory barrier induced by the spin_unlock below is too
+ * late; the other core is going to access memory soon after it will
+ * take the hwspinlock, and by then we want to be sure our memory
+ * operations are already observable.
+ */
+ mb();
+
+ hwlock->ops->unlock(hwlock);
+
+ /* Undo the spin_trylock{_irq, _irqsave} called while locking */
+ if (mode == HWLOCK_IRQSTATE)
+ spin_unlock_irqrestore(&hwlock->lock, *flags);
+ else if (mode == HWLOCK_IRQ)
+ spin_unlock_irq(&hwlock->lock);
+ else
+ spin_unlock(&hwlock->lock);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(__hwspin_unlock);
+
+/**
+ * hwspinlock_register() - register a new hw spinlock
+ * @hwlock: hwspinlock to register.
+ *
+ * This function should be called from the underlying platform-specific
+ * implementation, to register a new hwspinlock instance.
+ *
+ * Can be called from an atomic context (will not sleep) but not from
+ * within interrupt context.
+ *
+ * Returns 0 on success, or an appropriate error code on failure
+ */
+int hwspinlock_register(struct hwspinlock *hwlock)
+{
+ struct hwspinlock *tmp;
+ int ret;
+
+ if (!hwlock || !hwlock->ops ||
+ !hwlock->ops->trylock || !hwlock->ops->unlock) {
+ pr_err("invalid parameters\n");
+ return -EINVAL;
+ }
+
+ spin_lock_init(&hwlock->lock);
+
+ spin_lock(&hwspinlock_tree_lock);
+
+ ret = radix_tree_insert(&hwspinlock_tree, hwlock->id, hwlock);
+ if (ret)
+ goto out;
+
+ /* mark this hwspinlock as available */
+ tmp = radix_tree_tag_set(&hwspinlock_tree, hwlock->id,
+ HWSPINLOCK_UNUSED);
+
+ /* this implies an unrecoverable bug. at least rant */
+ WARN_ON(tmp != hwlock);
+
+out:
+ spin_unlock(&hwspinlock_tree_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(hwspinlock_register);
+
+/**
+ * hwspinlock_unregister() - unregister an hw spinlock
+ * @id: index of the specific hwspinlock to unregister
+ *
+ * This function should be called from the underlying platform-specific
+ * implementation, to unregister an existing (and unused) hwspinlock.
+ *
+ * Can be called from an atomic context (will not sleep) but not from
+ * within interrupt context.
+ *
+ * Returns the address of hwspinlock @id on success, or NULL on failure
+ */
+struct hwspinlock *hwspinlock_unregister(unsigned int id)
+{
+ struct hwspinlock *hwlock = NULL;
+ int ret;
+
+ spin_lock(&hwspinlock_tree_lock);
+
+ /* make sure the hwspinlock is not in use (tag is set) */
+ ret = radix_tree_tag_get(&hwspinlock_tree, id, HWSPINLOCK_UNUSED);
+ if (ret == 0) {
+ pr_err("hwspinlock %d still in use (or not present)\n", id);
+ goto out;
+ }
+
+ hwlock = radix_tree_delete(&hwspinlock_tree, id);
+ if (!hwlock) {
+ pr_err("failed to delete hwspinlock %d\n", id);
+ goto out;
+ }
+
+out:
+ spin_unlock(&hwspinlock_tree_lock);
+ return hwlock;
+}
+EXPORT_SYMBOL_GPL(hwspinlock_unregister);
+
+/**
+ * __hwspinlock_request() - tag an hwspinlock as used and power it up
+ *
+ * This is an internal function that prepares an hwspinlock instance
+ * before it is given to the user. The function assumes that
+ * hwspinlock_tree_lock is taken.
+ *
+ * Returns 0 or positive to indicate success, and a negative value to
+ * indicate an error (with the appropriate error code)
+ */
+static int __hwspinlock_request(struct hwspinlock *hwlock)
+{
+ struct hwspinlock *tmp;
+ int ret;
+
+ /* prevent underlying implementation from being removed */
+ if (!try_module_get(hwlock->owner)) {
+ dev_err(hwlock->dev, "%s: can't get owner\n", __func__);
+ return -EINVAL;
+ }
+
+ /* notify PM core that power is now needed */
+ ret = pm_runtime_get_sync(hwlock->dev);
+ if (ret < 0) {
+ dev_err(hwlock->dev, "%s: can't power on device\n", __func__);
+ return ret;
+ }
+
+ /* mark hwspinlock as used, should not fail */
+ tmp = radix_tree_tag_clear(&hwspinlock_tree, hwlock->id,
+ HWSPINLOCK_UNUSED);
+
+ /* this implies an unrecoverable bug. at least rant */
+ WARN_ON(tmp != hwlock);
+
+ return ret;
+}
+
+/**
+ * hwspinlock_get_id() - retrieve id number of a given hwspinlock
+ * @hwlock: a valid hwspinlock instance
+ *
+ * Returns the id number of a given @hwlock, or -EINVAL if @hwlock is invalid.
+ */
+int hwspinlock_get_id(struct hwspinlock *hwlock)
+{
+ if (!hwlock) {
+ pr_err("invalid hwlock\n");
+ return -EINVAL;
+ }
+
+ return hwlock->id;
+}
+EXPORT_SYMBOL_GPL(hwspinlock_get_id);
+
+/**
+ * hwspinlock_request() - request an hwspinlock
+ *
+ * This function should be called by users of the hwspinlock device,
+ * in order to dynamically assign them an unused hwspinlock.
+ * Usually the user of this lock will then have to communicate the lock's id
+ * to the remote core before it can be used for synchronization (to get the
+ * id of a given hwlock, use hwspinlock_get_id()).
+ *
+ * Can be called from an atomic context (will not sleep) but not from
+ * within interrupt context (simply because there is no use case for
+ * that yet).
+ *
+ * Returns the address of the assigned hwspinlock, or NULL on error
+ */
+struct hwspinlock *hwspinlock_request(void)
+{
+ struct hwspinlock *hwlock;
+ int ret;
+
+ spin_lock(&hwspinlock_tree_lock);
+
+ /* look for an unused lock */
+ ret = radix_tree_gang_lookup_tag(&hwspinlock_tree, (void **)&hwlock,
+ 0, 1, HWSPINLOCK_UNUSED);
+ if (ret == 0) {
+ pr_warn("a free hwspinlock is not available\n");
+ hwlock = NULL;
+ goto out;
+ }
+
+ /* sanity check: we shouldn't get more than we asked for */
+ WARN_ON(ret > 1);
+
+ /* mark as used and power up */
+ ret = __hwspinlock_request(hwlock);
+ if (ret < 0)
+ hwlock = NULL;
+
+out:
+ spin_unlock(&hwspinlock_tree_lock);
+ return hwlock;
+}
+EXPORT_SYMBOL_GPL(hwspinlock_request);
+
+/**
+ * hwspinlock_request_specific() - request for a specific hwspinlock
+ * @id: index of the specific hwspinlock that is requested
+ *
+ * This function should be called by users of the hwspinlock module,
+ * in order to assign them a specific hwspinlock.
+ * Usually early board code will be calling this function in order to
+ * reserve specific hwspinlock ids for predefined purposes.
+ *
+ * Can be called from an atomic context (will not sleep) but not from
+ * within interrupt context (simply because there is no use case for
+ * that yet).
+ *
+ * Returns the address of the assigned hwspinlock, or NULL on error
+ */
+struct hwspinlock *hwspinlock_request_specific(unsigned int id)
+{
+ struct hwspinlock *hwlock;
+ int ret;
+
+ spin_lock(&hwspinlock_tree_lock);
+
+ /* make sure this hwspinlock exists */
+ hwlock = radix_tree_lookup(&hwspinlock_tree, id);
+ if (!hwlock) {
+ pr_warn("hwspinlock %u does not exist\n", id);
+ goto out;
+ }
+
+ /* sanity check (this shouldn't happen) */
+ WARN_ON(hwlock->id != id);
+
+ /* make sure this hwspinlock is unused */
+ ret = radix_tree_tag_get(&hwspinlock_tree, id, HWSPINLOCK_UNUSED);
+ if (ret == 0) {
+ pr_warn("hwspinlock %u is already in use\n", id);
+ hwlock = NULL;
+ goto out;
+ }
+
+ /* mark as used and power up */
+ ret = __hwspinlock_request(hwlock);
+ if (ret < 0)
+ hwlock = NULL;
+
+out:
+ spin_unlock(&hwspinlock_tree_lock);
+ return hwlock;
+}
+EXPORT_SYMBOL_GPL(hwspinlock_request_specific);
+
+/**
+ * hwspinlock_free() - free a specific hwspinlock
+ * @hwlock: the specific hwspinlock to free
+ *
+ * This function marks @hwlock as free again.
+ * Should only be called with an @hwlock that was retrieved from
+ * an earlier call to hwspinlock_request{_specific}.
+ *
+ * Can be called from an atomic context (will not sleep) but not from
+ * within interrupt context (simply because there is no use case for
+ * that yet).
+ *
+ * Returns 0 on success, or an appropriate error code on failure
+ */
+int hwspinlock_free(struct hwspinlock *hwlock)
+{
+ struct hwspinlock *tmp;
+ int ret;
+
+ if (!hwlock) {
+ pr_err("invalid hwlock\n");
+ return -EINVAL;
+ }
+
+ spin_lock(&hwspinlock_tree_lock);
+
+ /* make sure the hwspinlock is used */
+ ret = radix_tree_tag_get(&hwspinlock_tree, hwlock->id,
+ HWSPINLOCK_UNUSED);
+ if (ret == 1) {
+ dev_err(hwlock->dev, "%s: hwlock is already free\n", __func__);
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /* notify the underlying device that power is not needed */
+ ret = pm_runtime_put(hwlock->dev);
+ if (ret < 0)
+ goto out;
+
+ /* mark this hwspinlock as available */
+ tmp = radix_tree_tag_set(&hwspinlock_tree, hwlock->id,
+ HWSPINLOCK_UNUSED);
+
+ /* sanity check (this shouldn't happen) */
+ WARN_ON(tmp != hwlock);
+
+ module_put(hwlock->owner);
+
+out:
+ spin_unlock(&hwspinlock_tree_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(hwspinlock_free);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Generic hardware spinlock interface");
+MODULE_AUTHOR("Ohad Ben-Cohen <[email protected]>");
diff --git a/include/linux/hwspinlock.h b/include/linux/hwspinlock.h
new file mode 100644
index 0000000..f7f4156
--- /dev/null
+++ b/include/linux/hwspinlock.h
@@ -0,0 +1,376 @@
+/*
+ * Hardware spinlock public header
+ *
+ * Copyright (C) 2010 Texas Instruments Incorporated - http://www.ti.com
+ *
+ * Contact: Ohad Ben-Cohen <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __LINUX_HWSPINLOCK_H
+#define __LINUX_HWSPINLOCK_H
+
+#include <linux/err.h>
+
+/* hwspinlock mode argument */
+#define HWLOCK_IRQSTATE 0x01 /* Disable interrupts, save state */
+#define HWLOCK_IRQ 0x02 /* Disable interrupts, don't save state */
+
+struct hwspinlock;
+
+#if defined(CONFIG_HWSPINLOCK) || defined(CONFIG_HWSPINLOCK_MODULE)
+
+int hwspinlock_register(struct hwspinlock *lock);
+struct hwspinlock *hwspinlock_unregister(unsigned int id);
+struct hwspinlock *hwspinlock_request(void);
+struct hwspinlock *hwspinlock_request_specific(unsigned int id);
+int hwspinlock_free(struct hwspinlock *hwlock);
+int hwspinlock_get_id(struct hwspinlock *hwlock);
+int __hwspin_lock_timeout(struct hwspinlock *, unsigned long, int,
+ unsigned long *);
+int __hwspin_trylock(struct hwspinlock *, int, unsigned long *);
+int __hwspin_unlock(struct hwspinlock *, int, unsigned long *);
+
+#else /* !CONFIG_HWSPINLOCK */
+
+/*
+ * We don't want these functions to fail if CONFIG_HWSPINLOCK is not
+ * enabled. We prefer to silently succeed in this case, and let the
+ * code path get compiled away. This way, if CONFIG_HWSPINLOCK is not
+ * required on a given setup, user code will still build and run.
+ *
+ * The only exception is hwspinlock_register/hwspinlock_unregister, with which
+ * we _do_ want users to fail (no point in registering hwspinlock instances if
+ * the framework is not available).
+ *
+ * Note: ERR_PTR(-ENODEV) will still be considered a success for NULL-checking
+ * users. Others, which care, can still check this with IS_ERR.
+ */
+static inline struct hwspinlock *hwspinlock_request(void)
+{
+ return ERR_PTR(-ENODEV);
+}
+
+static inline struct hwspinlock *hwspinlock_request_specific(unsigned int id)
+{
+ return ERR_PTR(-ENODEV);
+}
+
+static inline int hwspinlock_free(struct hwspinlock *hwlock)
+{
+ return 0;
+}
+
+static inline
+int __hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned long to,
+ int mode, unsigned long *flags)
+{
+ return 0;
+}
+
+static inline
+int __hwspin_trylock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
+{
+ return 0;
+}
+
+static inline
+int __hwspin_unlock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
+{
+ return 0;
+}
+
+static inline int hwspinlock_get_id(struct hwspinlock *hwlock)
+{
+ return 0;
+}
+
+static inline int hwspinlock_register(struct hwspinlock *hwlock)
+{
+ return -ENODEV;
+}
+
+static inline struct hwspinlock *hwspinlock_unregister(unsigned int id)
+{
+ return NULL;
+}
+
+#endif /* !CONFIG_HWSPINLOCK */
+
+/**
+ * hwspin_trylock_irqsave() - try to lock an hwspinlock, disable interrupts
+ * @hwlock: an hwspinlock which we want to trylock
+ * @flags: a pointer to where the caller's interrupt state will be saved at
+ *
+ * This function attempts to lock the underlying hwspinlock, and will
+ * immediately fail if the hwspinlock is already locked.
+ *
+ * Upon a successful return from this function, preemption and local
+ * interrupts are disabled (previous interrupts state is saved at @flags),
+ * so the caller must not sleep, and is advised to release the hwspinlock
+ * as soon as possible.
+ *
+ * Returns 0 if we successfully locked the hwspinlock, -EBUSY if
+ * the hwspinlock was already taken, and -EINVAL if @hwlock is invalid.
+ */
+static inline
+int hwspin_trylock_irqsave(struct hwspinlock *hwlock, unsigned long *flags)
+{
+ return __hwspin_trylock(hwlock, HWLOCK_IRQSTATE, flags);
+}
+
+/**
+ * hwspin_trylock_irq() - try to lock an hwspinlock, disable interrupts
+ * @hwlock: an hwspinlock which we want to trylock
+ *
+ * This function attempts to lock the underlying hwspinlock, and will
+ * immediately fail if the hwspinlock is already locked.
+ *
+ * Upon a successful return from this function, preemption and local
+ * interrupts are disabled, so the caller must not sleep, and is advised
+ * to release the hwspinlock as soon as possible.
+ *
+ * Returns 0 if we successfully locked the hwspinlock, -EBUSY if
+ * the hwspinlock was already taken, and -EINVAL if @hwlock is invalid.
+ */
+static inline int hwspin_trylock_irq(struct hwspinlock *hwlock)
+{
+ return __hwspin_trylock(hwlock, HWLOCK_IRQ, NULL);
+}
+
+/**
+ * hwspin_trylock() - attempt to lock a specific hwspinlock
+ * @hwlock: an hwspinlock which we want to trylock
+ *
+ * This function attempts to lock the underlying hwspinlock. Unlike
+ * hwspin_lock(), this function will immediately fail if the hwspinlock
+ * is already taken.
+ *
+ * Upon a successful return from this function, preemption is disabled,
+ * so the caller must not sleep, and is advised to release the hwspinlock
+ * as soon as possible. This is required in order to minimize remote cores
+ * polling on the hardware interconnect.
+ *
+ * Returns 0 if we successfully locked the hwspinlock, -EBUSY if
+ * the hwspinlock was already taken, and -EINVAL if @hwlock is invalid.
+ */
+static inline int hwspin_trylock(struct hwspinlock *hwlock)
+{
+ return __hwspin_trylock(hwlock, 0, NULL);
+}
+
+/**
+ * hwspin_lock_timeout_irqsave() - lock hwspinlock, with timeout, disable irqs
+ * @hwlock: the hwspinlock to be locked
+ * @timeout: timeout, in jiffies
+ * @flags: a pointer to where the caller's interrupt state will be saved at
+ *
+ * This function locks the underlying @hwlock. If the @hwlock
+ * is already taken, the function will busy loop waiting for it to
+ * be released, but give up when @timeout meets jiffies. If @timeout
+ * is 0, the function will never give up (therefore if a
+ * faulty remote core never releases the @hwlock, it will deadlock).
+ *
+ * Upon a successful return from this function, preemption and local interrupts
+ * are disabled (plus previous interrupt state is saved), so the caller must
+ * not sleep, and is advised to release the hwspinlock as soon as possible.
+ *
+ * Returns 0 when the @hwlock was successfully taken, and an appropriate
+ * error code otherwise (most notably -ETIMEDOUT if the @hwlock is still
+ * busy after @timeout meets jiffies). The function will never sleep.
+ */
+static inline int hwspin_lock_timeout_irqsave(struct hwspinlock *hwlock,
+ unsigned long to, unsigned long *flags)
+{
+ return __hwspin_lock_timeout(hwlock, to, HWLOCK_IRQSTATE, flags);
+}
+
+/**
+ * hwspin_lock_timeout_irq() - lock hwspinlock, with timeout, disable irqs
+ * @hwlock: the hwspinlock to be locked
+ * @timeout: timeout, in jiffies
+ *
+ * This function locks the underlying @hwlock. If the @hwlock
+ * is already taken, the function will busy loop waiting for it to
+ * be released, but give up when @timeout meets jiffies. If @timeout
+ * is 0, the function will never give up (therefore if a
+ * faulty remote core never releases the @hwlock, it will deadlock).
+ *
+ * Upon a successful return from this function, preemption and local interrupts
+ * are disabled so the caller must not sleep, and is advised to release the
+ * hwspinlock as soon as possible.
+ *
+ * Returns 0 when the @hwlock was successfully taken, and an appropriate
+ * error code otherwise (most notably -ETIMEDOUT if the @hwlock is still
+ * busy after @timeout meets jiffies). The function will never sleep.
+ */
+static inline
+int hwspin_lock_timeout_irq(struct hwspinlock *hwlock, unsigned long to)
+{
+ return __hwspin_lock_timeout(hwlock, to, HWLOCK_IRQ, NULL);
+}
+
+/**
+ * hwspin_lock_timeout() - lock an hwspinlock with timeout limit
+ * @hwlock: the hwspinlock to be locked
+ * @timeout: timeout, in jiffies
+ *
+ * This function locks the underlying @hwlock. If the @hwlock
+ * is already taken, the function will busy loop waiting for it to
+ * be released, but give up when @timeout meets jiffies. If @timeout
+ * is 0, the function will never give up (therefore if a
+ * faulty remote core never releases the @hwlock, it will deadlock).
+ *
+ * Upon a successful return from this function, preemption is disabled
+ * so the caller must not sleep, and is advised to release the hwspinlock
+ * as soon as possible.
+ * This is required in order to minimize remote cores polling on the
+ * hardware interconnect.
+ *
+ * Returns 0 when the @hwlock was successfully taken, and an appropriate
+ * error code otherwise (most notably -ETIMEDOUT if the @hwlock is still
+ * busy after @timeout meets jiffies). The function will never sleep.
+ */
+static inline
+int hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned long to)
+{
+ return __hwspin_lock_timeout(hwlock, to, 0, NULL);
+}
+
+/**
+ * hwspin_unlock_irqrestore() - unlock hwspinlock, restore irq state
+ * @hwlock: a previously-acquired hwspinlock which we want to unlock
+ * @flags: previous caller's interrupt state to restore
+ *
+ * This function will unlock a specific hwspinlock, enable preemption and
+ * restore the previous state of the local interrupts. It should be used
+ * to undo hwspin_lock_irqsave().
+ *
+ * @hwlock must be already locked before calling this function: it is a bug
+ * to call unlock on a @hwlock that is already unlocked.
+ *
+ * Returns 0 on success, or -EINVAL if @hwlock is invalid.
+ */
+static inline int hwspin_unlock_irqrestore(struct hwspinlock *hwlock,
+ unsigned long *flags)
+{
+ return __hwspin_unlock(hwlock, HWLOCK_IRQSTATE, flags);
+}
+
+/**
+ * hwspin_unlock_irq() - unlock hwspinlock, enable interrupts
+ * @hwlock: a previously-acquired hwspinlock which we want to unlock
+ *
+ * This function will unlock a specific hwspinlock, enable preemption and
+ * enable local interrupts. Should be used to undo hwspin_lock_irq().
+ *
+ * @hwlock must be already locked (by hwspin_lock_irq()) before calling this
+ * function: it is a bug to call unlock on a @hwlock that is already unlocked.
+ *
+ * Returns 0 on success, or -EINVAL if @hwlock is invalid.
+ */
+static inline int hwspin_unlock_irq(struct hwspinlock *hwlock)
+{
+ return __hwspin_unlock(hwlock, HWLOCK_IRQ, NULL);
+}
+
+/**
+ * hwspin_unlock() - unlock hwspinlock
+ * @hwlock: a previously-acquired hwspinlock which we want to unlock
+ *
+ * This function will unlock a specific hwspinlock and re-enable
+ * preemption.
+ *
+ * @hwlock must be already locked (by hwspin_lock()) before calling this
+ * function: it is a bug to call unlock on a @hwlock that is already unlocked.
+ *
+ * Returns 0 on success, or -EINVAL if @hwlock is invalid.
+ */
+static inline int hwspin_unlock(struct hwspinlock *hwlock)
+{
+ return __hwspin_unlock(hwlock, 0, NULL);
+}
+
+/**
+ * hwspin_lock_irqsave() - lock hwspinlock, no time limits, disable interrupts
+ * @hwlock: the hwspinlock to be locked
+ * @flags: a pointer to where the caller's interrupt state will be saved at
+ *
+ * This function locks the underlying @hwlock. If the @hwlock
+ * is already taken, the function will busy loop waiting for it to
+ * be released. Note: if a faulty remote core never releases the
+ * @hwlock, this function will deadlock.
+ *
+ * Upon a successful return from this function, preemption and interrupts
+ * are disabled (and the previous interrupt state is saved), so the caller
+ * must not sleep, and is advised to release the hwspinlock as soon as
+ * possible.
+ *
+ * Can be called from any context.
+ *
+ * Returns 0 when the @hwlock was successfully taken, and -EINVAL if
+ * @hwlock is invalid (the only possible error code).
+ */
+static inline int hwspin_lock_irqsave(struct hwspinlock *hwlock,
+ unsigned long *flags)
+{
+ return hwspin_lock_timeout_irqsave(hwlock, 0, flags);
+}
+
+/**
+ * hwspin_lock_irq() - lock hwspinlock, no time limits, disable interrupts
+ * @hwlock: the hwspinlock to be locked
+ *
+ * This function locks the underlying @hwlock. If the @hwlock
+ * is already taken, the function will busy loop waiting for it to
+ * be released. Note: if a faulty remote core never releases the
+ * @hwlock, this function will deadlock.
+ *
+ * Upon a successful return from this function, preemption and interrupts
+ * are disabled, so the caller must not sleep, and is advised to release
+ * the hwspinlock as soon as possible. This is required in order to minimize
+ * remote cores polling on the hardware interconnect.
+ *
+ * Can be called from any context.
+ *
+ * Returns 0 when the @hwlock was successfully taken, and -EINVAL if
+ * @hwlock is invalid (the only possible error code).
+ */
+static inline int hwspin_lock_irq(struct hwspinlock *hwlock)
+{
+ return hwspin_lock_timeout_irq(hwlock, 0);
+}
+
+/**
+ * hwspin_lock() - lock hwspinlock, no time limits
+ * @hwlock: the hwspinlock to be locked
+ *
+ * This function locks the underlying @hwlock. If the @hwlock
+ * is already taken, the function will busy loop waiting for it to
+ * be released. Note: if a faulty remote core never releases the
+ * @hwlock, this function will deadlock.
+ *
+ * Upon a successful return from this function, preemption is disabled,
+ * so the caller must not sleep, and is advised to release
+ * the hwspinlock as soon as possible. This is required in order to minimize
+ * remote cores polling on the hardware interconnect.
+ *
+ * This function does not sleep.
+ *
+ * Returns 0 when the @hwlock was successfully taken, and -EINVAL if
+ * @hwlock is invalid (the only possible error code).
+ */
+static inline int hwspin_lock(struct hwspinlock *hwlock)
+{
+ return hwspin_lock_timeout(hwlock, 0);
+}
+
+#endif /* __LINUX_HWSPINLOCK_H */
--
1.7.0.4
From: Benoit Cousson <[email protected]>
Add hwspinlock hwmod data for OMAP4 chip
Signed-off-by: Cousson, Benoit <[email protected]>
Signed-off-by: Hari Kanigeri <[email protected]>
Signed-off-by: Ohad Ben-Cohen <[email protected]>
Cc: Paul Walmsley <[email protected]>
---
arch/arm/mach-omap2/omap_hwmod_44xx_data.c | 63 ++++++++++++++++++++++++++++
1 files changed, 63 insertions(+), 0 deletions(-)
diff --git a/arch/arm/mach-omap2/omap_hwmod_44xx_data.c b/arch/arm/mach-omap2/omap_hwmod_44xx_data.c
index 0d5c6eb..07c3654 100644
--- a/arch/arm/mach-omap2/omap_hwmod_44xx_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_44xx_data.c
@@ -1043,6 +1043,66 @@ static struct omap_hwmod omap44xx_uart4_hwmod = {
.omap_chip = OMAP_CHIP_INIT(CHIP_IS_OMAP4430),
};
+/*
+ * 'spinlock' class
+ * spinlock provides hardware assistance for synchronizing the processes
+ * running on multiple processors
+ */
+
+static struct omap_hwmod_class_sysconfig omap44xx_spinlock_sysc = {
+ .rev_offs = 0x0000,
+ .sysc_offs = 0x0010,
+ .syss_offs = 0x0014,
+ .sysc_flags = (SYSC_HAS_AUTOIDLE | SYSC_HAS_CLOCKACTIVITY |
+ SYSC_HAS_ENAWAKEUP | SYSC_HAS_SIDLEMODE |
+ SYSC_HAS_SOFTRESET | SYSS_HAS_RESET_STATUS),
+ .idlemodes = (SIDLE_FORCE | SIDLE_NO | SIDLE_SMART),
+ .sysc_fields = &omap_hwmod_sysc_type1,
+};
+
+static struct omap_hwmod_class omap44xx_spinlock_hwmod_class = {
+ .name = "spinlock",
+ .sysc = &omap44xx_spinlock_sysc,
+};
+
+/* spinlock */
+static struct omap_hwmod omap44xx_spinlock_hwmod;
+static struct omap_hwmod_addr_space omap44xx_spinlock_addrs[] = {
+ {
+ .pa_start = 0x4a0f6000,
+ .pa_end = 0x4a0f6fff,
+ .flags = ADDR_TYPE_RT
+ },
+};
+
+/* l4_cfg -> spinlock */
+static struct omap_hwmod_ocp_if omap44xx_l4_cfg__spinlock = {
+ .master = &omap44xx_l4_cfg_hwmod,
+ .slave = &omap44xx_spinlock_hwmod,
+ .clk = "l4_div_ck",
+ .addr = omap44xx_spinlock_addrs,
+ .addr_cnt = ARRAY_SIZE(omap44xx_spinlock_addrs),
+ .user = OCP_USER_MPU | OCP_USER_SDMA,
+};
+
+/* spinlock slave ports */
+static struct omap_hwmod_ocp_if *omap44xx_spinlock_slaves[] = {
+ &omap44xx_l4_cfg__spinlock,
+};
+
+static struct omap_hwmod omap44xx_spinlock_hwmod = {
+ .name = "spinlock",
+ .class = &omap44xx_spinlock_hwmod_class,
+ .prcm = {
+ .omap4 = {
+ .clkctrl_reg = OMAP4430_CM_L4CFG_HW_SEM_CLKCTRL,
+ },
+ },
+ .slaves = omap44xx_spinlock_slaves,
+ .slaves_cnt = ARRAY_SIZE(omap44xx_spinlock_slaves),
+ .omap_chip = OMAP_CHIP_INIT(CHIP_IS_OMAP4430),
+};
+
static __initdata struct omap_hwmod *omap44xx_hwmods[] = {
/* dmm class */
&omap44xx_dmm_hwmod,
@@ -1077,6 +1137,9 @@ static __initdata struct omap_hwmod *omap44xx_hwmods[] = {
&omap44xx_uart2_hwmod,
&omap44xx_uart3_hwmod,
&omap44xx_uart4_hwmod,
+
+ /* spinlock class */
+ &omap44xx_spinlock_hwmod,
NULL,
};
--
1.7.0.4
From: Simon Que <[email protected]>
Build and register an hwspinlock platform device.
Although only OMAP4 supports the hardware spinlock module (for now),
it is still safe to run this initcall on all omaps, because hwmod lookup
will simply fail on hwspinlock-less platforms.
Signed-off-by: Simon Que <[email protected]>
Signed-off-by: Hari Kanigeri <[email protected]>
Signed-off-by: Ohad Ben-Cohen <[email protected]>
Cc: Benoit Cousson <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Paul Walmsley <[email protected]>
---
arch/arm/mach-omap2/Makefile | 1 +
arch/arm/mach-omap2/hwspinlock.c | 63 ++++++++++++++++++++++++++++++++++++++
2 files changed, 64 insertions(+), 0 deletions(-)
create mode 100644 arch/arm/mach-omap2/hwspinlock.c
diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
index ce7b1f0..5162665 100644
--- a/arch/arm/mach-omap2/Makefile
+++ b/arch/arm/mach-omap2/Makefile
@@ -197,3 +197,4 @@ obj-y += $(smc91x-m) $(smc91x-y)
smsc911x-$(CONFIG_SMSC911X) := gpmc-smsc911x.o
obj-y += $(smsc911x-m) $(smsc911x-y)
+obj-$(CONFIG_ARCH_OMAP4) += hwspinlock.o
diff --git a/arch/arm/mach-omap2/hwspinlock.c b/arch/arm/mach-omap2/hwspinlock.c
new file mode 100644
index 0000000..06d4a80
--- /dev/null
+++ b/arch/arm/mach-omap2/hwspinlock.c
@@ -0,0 +1,63 @@
+/*
+ * OMAP hardware spinlock device initialization
+ *
+ * Copyright (C) 2010 Texas Instruments Incorporated - http://www.ti.com
+ *
+ * Contact: Simon Que <[email protected]>
+ * Hari Kanigeri <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/err.h>
+
+#include <plat/omap_hwmod.h>
+#include <plat/omap_device.h>
+
+struct omap_device_pm_latency omap_spinlock_latency[] = {
+ {
+ .deactivate_func = omap_device_idle_hwmods,
+ .activate_func = omap_device_enable_hwmods,
+ .flags = OMAP_DEVICE_LATENCY_AUTO_ADJUST,
+ }
+};
+
+int __init hwspinlocks_init(void)
+{
+ int retval = 0;
+ struct omap_hwmod *oh;
+ struct omap_device *od;
+ const char *oh_name = "spinlock";
+ const char *dev_name = "omap_hwspinlock";
+
+ /*
+ * Hwmod lookup will fail in case our platform doesn't support the
+ * hardware spinlock module, so it is safe to run this initcall
+ * on all omaps
+ */
+ oh = omap_hwmod_lookup(oh_name);
+ if (oh == NULL)
+ return -EINVAL;
+
+ od = omap_device_build(dev_name, 0, oh, NULL, 0,
+ omap_spinlock_latency,
+ ARRAY_SIZE(omap_spinlock_latency), false);
+ if (IS_ERR(od)) {
+ pr_err("Can't build omap_device for %s:%s\n", dev_name,
+ oh_name);
+ retval = PTR_ERR(od);
+ }
+
+ return retval;
+}
+/* early board code might need to reserve specific hwspinlock instances */
+postcore_initcall(hwspinlocks_init);
--
1.7.0.4
From: Simon Que <[email protected]>
Add hwspinlock support for the OMAP4 Hardware Spinlock device.
The Hardware Spinlock device on OMAP4 provides hardware assistance
for synchronization between the multiple processors in the system
(dual Cortex-A9, dual Cortex-M3 and a C64x+ DSP).
[[email protected]: adapt to generic hwspinlock api, tidy up]
Signed-off-by: Simon Que <[email protected]>
Signed-off-by: Hari Kanigeri <[email protected]>
Signed-off-by: Krishnamoorthy, Balaji T <[email protected]>
Signed-off-by: Ohad Ben-Cohen <[email protected]>
Cc: Benoit Cousson <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Grant Likely <[email protected]>
Cc: Tony Lindgren <[email protected]>
---
drivers/hwspinlock/Kconfig | 9 ++
drivers/hwspinlock/Makefile | 1 +
drivers/hwspinlock/omap_hwspinlock.c | 231 ++++++++++++++++++++++++++++++++++
3 files changed, 241 insertions(+), 0 deletions(-)
create mode 100644 drivers/hwspinlock/omap_hwspinlock.c
diff --git a/drivers/hwspinlock/Kconfig b/drivers/hwspinlock/Kconfig
index 9dd8db4..eb4af28 100644
--- a/drivers/hwspinlock/Kconfig
+++ b/drivers/hwspinlock/Kconfig
@@ -11,3 +11,12 @@ config HWSPINLOCK
coprocessors).
If unsure, say N.
+
+config HWSPINLOCK_OMAP
+ tristate "OMAP Hardware Spinlock device"
+ depends on HWSPINLOCK && ARCH_OMAP4
+ help
+ Say y here to support the OMAP Hardware Spinlock device (first
+ introduced in OMAP4).
+
+ If unsure, say N.
diff --git a/drivers/hwspinlock/Makefile b/drivers/hwspinlock/Makefile
index b9d2b9f..5729a3f 100644
--- a/drivers/hwspinlock/Makefile
+++ b/drivers/hwspinlock/Makefile
@@ -3,3 +3,4 @@
#
obj-$(CONFIG_HWSPINLOCK) += hwspinlock_core.o
+obj-$(CONFIG_HWSPINLOCK_OMAP) += omap_hwspinlock.o
diff --git a/drivers/hwspinlock/omap_hwspinlock.c b/drivers/hwspinlock/omap_hwspinlock.c
new file mode 100644
index 0000000..0525d59
--- /dev/null
+++ b/drivers/hwspinlock/omap_hwspinlock.c
@@ -0,0 +1,231 @@
+/*
+ * OMAP hardware spinlock driver
+ *
+ * Copyright (C) 2010 Texas Instruments Incorporated - http://www.ti.com
+ *
+ * Contact: Simon Que <[email protected]>
+ * Hari Kanigeri <[email protected]>
+ * Ohad Ben-Cohen <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/delay.h>
+#include <linux/io.h>
+#include <linux/log2.h>
+#include <linux/pm_runtime.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/hwspinlock.h>
+#include <linux/platform_device.h>
+
+#include "hwspinlock.h"
+
+/* Spinlock register offsets */
+#define SYSSTATUS_OFFSET 0x0014
+#define LOCK_BASE_OFFSET 0x0800
+
+#define SPINLOCK_NUMLOCKS_BIT_OFFSET (24)
+
+/* Possible values of SPINLOCK_LOCK_REG */
+#define SPINLOCK_NOTTAKEN (0) /* free */
+#define SPINLOCK_TAKEN (1) /* locked */
+
+#define to_omap_hwspinlock(lock) \
+ container_of(lock, struct omap_hwspinlock, lock)
+
+struct omap_hwspinlock {
+ struct hwspinlock lock;
+ void __iomem *addr;
+};
+
+struct omap_hwspinlock_state {
+ int num_locks; /* Total number of locks in system */
+ void __iomem *io_base; /* Mapped base address */
+};
+
+static int omap_hwspinlock_trylock(struct hwspinlock *lock)
+{
+ struct omap_hwspinlock *omap_lock = to_omap_hwspinlock(lock);
+
+ /* attempt to acquire the lock by reading its value */
+ return (SPINLOCK_NOTTAKEN == readl(omap_lock->addr));
+}
+
+static void omap_hwspinlock_unlock(struct hwspinlock *lock)
+{
+ struct omap_hwspinlock *omap_lock = to_omap_hwspinlock(lock);
+
+ /* release the lock by writing 0 to it */
+ writel(SPINLOCK_NOTTAKEN, omap_lock->addr);
+}
+
+/*
+ * relax the OMAP interconnect while spinning on it.
+ *
+ * The specs recommend that the retry delay be just over half of
+ * the time that a requester is expected to hold the lock.
+ *
+ * The number below is taken from a hardware specs example;
+ * obviously it is somewhat arbitrary.
+ */
+static void omap_hwspinlock_relax(struct hwspinlock *lock)
+{
+ ndelay(50);
+}
+
+static const struct hwspinlock_ops omap_hwspinlock_ops = {
+ .trylock = omap_hwspinlock_trylock,
+ .unlock = omap_hwspinlock_unlock,
+ .relax = omap_hwspinlock_relax,
+};
+
+static int __devinit omap_hwspinlock_probe(struct platform_device *pdev)
+{
+ struct omap_hwspinlock *omap_lock;
+ struct omap_hwspinlock_state *state;
+ struct hwspinlock *lock;
+ struct resource *res;
+ void __iomem *io_base;
+ int i, ret;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res)
+ return -ENODEV;
+
+ state = kzalloc(sizeof(*state), GFP_KERNEL);
+ if (!state)
+ return -ENOMEM;
+
+ io_base = ioremap(res->start, resource_size(res));
+ if (!io_base) {
+ ret = -ENOMEM;
+ goto free_state;
+ }
+
+ /* Determine number of locks */
+ i = readl(io_base + SYSSTATUS_OFFSET);
+ i >>= SPINLOCK_NUMLOCKS_BIT_OFFSET;
+
+ /* exactly one of the four least significant bits must be 1 */
+ if (!i || !is_power_of_2(i) || i > 8) {
+ ret = -EINVAL;
+ goto iounmap_base;
+ }
+
+ state->num_locks = i * 32;
+ state->io_base = io_base;
+
+ platform_set_drvdata(pdev, state);
+
+ /*
+ * runtime PM will make sure the clock of this module is
+ * enabled iff at least one lock is requested
+ */
+ pm_runtime_enable(&pdev->dev);
+
+ for (i = 0; i < state->num_locks; i++) {
+ omap_lock = kzalloc(sizeof(*omap_lock), GFP_KERNEL);
+ if (!omap_lock) {
+ ret = -ENOMEM;
+ goto free_locks;
+ }
+
+ omap_lock->lock.dev = &pdev->dev;
+ omap_lock->lock.owner = THIS_MODULE;
+ omap_lock->lock.id = i;
+ omap_lock->lock.ops = &omap_hwspinlock_ops;
+ omap_lock->addr = io_base + LOCK_BASE_OFFSET + sizeof(u32) * i;
+
+ ret = hwspinlock_register(&omap_lock->lock);
+ if (ret) {
+ kfree(omap_lock);
+ goto free_locks;
+ }
+ }
+
+ return 0;
+
+free_locks:
+ while (--i >= 0) {
+ lock = hwspinlock_unregister(i);
+ /* this shouldn't happen, but let's give it our best effort */
+ if (!lock) {
+ dev_err(&pdev->dev, "%s: cleanups failed\n", __func__);
+ continue;
+ }
+ omap_lock = to_omap_hwspinlock(lock);
+ kfree(omap_lock);
+ }
+ pm_runtime_disable(&pdev->dev);
+iounmap_base:
+ iounmap(state->io_base);
+free_state:
+ kfree(state);
+ return ret;
+}
+
+static int omap_hwspinlock_remove(struct platform_device *pdev)
+{
+ struct omap_hwspinlock_state *state = platform_get_drvdata(pdev);
+ struct hwspinlock *lock;
+ struct omap_hwspinlock *omap_lock;
+ int i;
+
+ for (i = 0; i < state->num_locks; i++) {
+ lock = hwspinlock_unregister(i);
+ /* this shouldn't happen at this point. if it does, at least
+ * don't continue with the remove */
+ if (!lock) {
+ dev_err(&pdev->dev, "%s: failed on %d\n", __func__, i);
+ return -EBUSY;
+ }
+
+ omap_lock = to_omap_hwspinlock(lock);
+ kfree(omap_lock);
+ }
+
+ pm_runtime_disable(&pdev->dev);
+ iounmap(state->io_base);
+ kfree(state);
+
+ return 0;
+}
+
+static struct platform_driver omap_hwspinlock_driver = {
+ .probe = omap_hwspinlock_probe,
+ .remove = omap_hwspinlock_remove,
+ .driver = {
+ .name = "omap_hwspinlock",
+ },
+};
+
+static int __init omap_hwspinlock_init(void)
+{
+ return platform_driver_register(&omap_hwspinlock_driver);
+}
+/* board init code might need to reserve hwspinlocks for predefined purposes */
+postcore_initcall(omap_hwspinlock_init);
+
+static void __exit omap_hwspinlock_exit(void)
+{
+ platform_driver_unregister(&omap_hwspinlock_driver);
+}
+module_exit(omap_hwspinlock_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Hardware spinlock driver for OMAP");
+MODULE_AUTHOR("Simon Que <[email protected]>");
+MODULE_AUTHOR("Hari Kanigeri <[email protected]>");
+MODULE_AUTHOR("Ohad Ben-Cohen <[email protected]>");
--
1.7.0.4
Hi Ohad,
On Tue, 2010-11-23 at 17:38 +0200, Ohad Ben-Cohen wrote:
> From: Simon Que <[email protected]>
>
> Add hwspinlock support for the OMAP4 Hardware Spinlock device.
>
<snip>
> +
> + io_base = ioremap(res->start, resource_size(res));
> + if (!io_base) {
> + ret = -ENOMEM;
> + goto free_state;
> + }
> +
> + /* Determine number of locks */
> + i = readl(io_base + SYSSTATUS_OFFSET);
> + i >>= SPINLOCK_NUMLOCKS_BIT_OFFSET;
> +
> + /* exactly one of the four least significant bits must be 1 */
> + if (!i || !is_power_of_2(i) || i > 8) {
> + ret = -EINVAL;
> + goto iounmap_base;
> + }
> +
At iounmap_base you unmap state->io_base, but that's set only after this
check. You should probably iounmap(io_base).
Regards,
Ionut.
Ohad,
> -----Original Message-----
> From: [email protected] [mailto:linux-omap-
> [email protected]] On Behalf Of Ohad Ben-Cohen
> Sent: Tuesday, November 23, 2010 9:09 PM
> To: [email protected]; [email protected]; linux-arm-
> [email protected]
> Cc: [email protected]; Greg KH; Tony Lindgren; Cousson, Benoit;
> Grant Likely; Kanigeri, Hari; Anna, Suman; Kevin Hilman; Arnd Bergmann;
> Ohad Ben-Cohen
> Subject: [PATCH v2 1/4] drivers: hwspinlock: add generic framework
>
> Add a common, platform-independent, hwspinlock framework.
>
> Hardware spinlock devices are needed, e.g., in order to access data
> that is shared between remote processors, that otherwise have no
> alternative mechanism to accomplish synchronization and mutual exclusion
> operations.
>
> Signed-off-by: Ohad Ben-Cohen <[email protected]>
> Cc: Hari Kanigeri <[email protected]>
> Cc: Benoit Cousson <[email protected]>
> Cc: Kevin Hilman <[email protected]>
> Cc: Grant Likely <[email protected]>
> Cc: Tony Lindgren <[email protected]>
> ---
> Documentation/hwspinlock.txt | 339 ++++++++++++++++++++
> drivers/Kconfig | 2 +
> drivers/Makefile | 1 +
> drivers/hwspinlock/Kconfig | 13 +
> drivers/hwspinlock/Makefile | 5 +
> drivers/hwspinlock/hwspinlock.h | 61 ++++
> drivers/hwspinlock/hwspinlock_core.c | 561 ++++++++++++++++++++++++++++++++++
> include/linux/hwspinlock.h | 376 +++++++++++++++++++++++
> 8 files changed, 1358 insertions(+), 0 deletions(-)
> create mode 100644 Documentation/hwspinlock.txt
> create mode 100644 drivers/hwspinlock/Kconfig
> create mode 100644 drivers/hwspinlock/Makefile
> create mode 100644 drivers/hwspinlock/hwspinlock.h
> create mode 100644 drivers/hwspinlock/hwspinlock_core.c
> create mode 100644 include/linux/hwspinlock.h
>
> diff --git a/Documentation/hwspinlock.txt b/Documentation/hwspinlock.txt
> new file mode 100644
> index 0000000..ccbf5de
> --- /dev/null
> +++ b/Documentation/hwspinlock.txt
> @@ -0,0 +1,339 @@
> +Common Hardware Spinlock Framework
> +
> +1. Introduction
> +
> +Hardware spinlock modules provide hardware assistance for synchronization
> +and mutual exclusion between heterogeneous processors and those not operating
> +under a single, shared operating system.
> +
> +For example, OMAP4 has dual Cortex-A9, dual Cortex-M3 and a C64x+ DSP,
> +each of which is running a different Operating System (the master, A9,
> +is usually running Linux and the slave processors, the M3 and the DSP,
> +are running some flavor of RTOS).
> +
> +A generic hwspinlock framework allows platform-independent drivers to use
> +the hwspinlock device in order to access data structures that are shared
> +between remote processors, that otherwise have no alternative mechanism
> +to accomplish synchronization and mutual exclusion operations.
> +
> +This is necessary, for example, for Inter-processor communications:
> +on OMAP4, cpu-intensive multimedia tasks are offloaded by the host to the
> +remote M3 and/or C64x+ slave processors (by an IPC subsystem called Syslink).
> +
> +To achieve fast message-based communications, a minimal kernel support
> +is needed to deliver messages arriving from a remote processor to the
> +appropriate user process.
> +
> +This communication is based on a simple data structure that is shared between
> +the remote processors, and access to it is synchronized using the hwspinlock
> +module (remote processor directly places new messages in this shared data
> +structure).
> +
> +A common hwspinlock interface makes it possible to have generic, platform-
> +independent, drivers.
> +
> +2. User API
> +
> + struct hwspinlock *hwspinlock_request(void);
> + - dynamically assign an hwspinlock and return its address, or NULL
> + in case an unused hwspinlock isn't available. Users of this
> + API will usually want to communicate the lock's id to the remote core
> + before it can be used to achieve synchronization.
> + Can be called from an atomic context (this function will not sleep) but
> + not from within interrupt context.
How do multiple clients get a handle that they can use? Are they expected to
share the handle they get from the call above? What if they are independent
clients with no means of communication between them? There may be a need of
an API that returns the handle for a specific ID. For example, a module over
the hwspinlock driver that does some level of management of IDs (e.g. name
to ID mapping) and enables users to get multi-core as well as multi-client
protection on Linux.
For example:
struct hwspinlock *hwspinlock_get_handle(unsigned int id);
> +
> + struct hwspinlock *hwspinlock_request_specific(unsigned int id);
> + - assign a specific hwspinlock id and return its address, or NULL
> + if that hwspinlock is already in use. Usually board code will
> + be calling this function in order to reserve specific hwspinlock
> + ids for predefined purposes.
> + Can be called from an atomic context (this function will not sleep) but
> + not from within interrupt context.
> +
> + int hwspinlock_free(struct hwspinlock *hwlock);
> + - free a previously-assigned hwspinlock; returns 0 on success, or an
> + appropriate error code on failure (e.g. -EINVAL if the hwspinlock
> + is already free).
> + Can be called from an atomic context (this function will not sleep) but
> + not from within interrupt context.
> +
> + int hwspin_lock(struct hwspinlock *hwlock);
> + - lock a previously assigned hwspinlock. If the hwspinlock is already
> + taken, the function will busy loop waiting for it to be released.
> + Note: if a faulty remote core never releases this lock, this function
> + will deadlock.
> + This function will fail only if hwlock is invalid. Otherwise, it will
> + always succeed (or deadlock; see above) and it will never sleep.
> + Upon a successful return from this function, preemption is disabled so
> + the caller must not sleep, and is advised to release the hwspinlock as
> + soon as possible, in order to minimize remote cores polling on the
> + hardware interconnect.
Why are some of the APIs hwspinlock_ and others hwspin_?
> +
> + int hwspin_lock_irq(struct hwspinlock *hwlock);
> + - lock a previously assigned hwspinlock. If the hwspinlock is already
> + taken, the function will busy loop waiting for it to be released.
> + Note: if a faulty remote core never releases this lock, this function
> + will deadlock.
> + This function will fail only if hwlock is invalid. Otherwise, it will
> + always succeed (or deadlock; see above) and it will never sleep.
> + Upon a successful return from this function, preemption is disabled and
> + the local interrupts are disabled. The caller must not sleep, and is
> + advised to release the hwspinlock as soon as possible.
> +
> + int hwspin_lock_irqsave(struct hwspinlock *hwlock, unsigned long *flags);
> + - lock a previously assigned hwspinlock. If the hwspinlock is already
> + taken, the function will busy loop waiting for it to be released.
> + Note: if a faulty remote core never releases this lock, this function
> + will deadlock.
> + This function will fail only if hwlock is invalid. Otherwise, it will
> + always succeed (or deadlock; see above) and it will never sleep.
> + Upon a successful return from this function, preemption is disabled,
> + the local interrupts are disabled, and their previous state is saved
> + in the given flags placeholder. The caller must not sleep, and is
> + advised to release the hwspinlock as soon as possible.
> +
> + int hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned long timeout);
> + - lock a previously-assigned hwspinlock with a timeout limit (specified in
> + jiffies). If the hwspinlock is already taken, the function will busy loop
> + waiting for it to be released, but give up when the timeout meets jiffies.
> + If timeout is 0, the function will never give up (therefore if a faulty
> + remote core never releases the hwspinlock, it will deadlock).
If timeout is 0, shouldn't it rather behave as a trylock? If timeout is
MAX_SCHEDULE_TIMEOUT, then it should behave as a wait-forever.
> + Upon a successful return from this function, preemption is disabled so
> + the caller must not sleep, and is advised to release the hwspinlock as
> + soon as possible, in order to minimize remote cores polling on the
> + hardware interconnect.
> + Returns 0 when successful and an appropriate error code otherwise (most
> + notably -ETIMEDOUT if the hwspinlock is still busy after timeout meets
> + jiffies). The function will never sleep.
> +
> + int hwspin_lock_timeout_irq(struct hwspinlock *hwlock, unsigned long timeout);
> + - lock a previously-assigned hwspinlock with a timeout limit (specified in
> + jiffies). If the hwspinlock is already taken, the function will busy loop
> + waiting for it to be released, but give up when the timeout meets jiffies.
> + If timeout is 0, the function will never give up (therefore if a faulty
> + remote core never releases the hwspinlock, it will deadlock).
Same comment as above for 0 timeout.
> + Upon a successful return from this function, preemption and the local
> + interrupts are disabled, so the caller must not sleep, and is advised to
> + release the hwspinlock as soon as possible.
> + Returns 0 when successful and an appropriate error code otherwise (most
> + notably -ETIMEDOUT if the hwspinlock is still busy after timeout meets
> + jiffies). The function will never sleep.
> +
> + int hwspin_lock_timeout_irqsave(struct hwspinlock *hwlock, unsigned long to,
> + unsigned long *flags);
> + - lock a previously-assigned hwspinlock with a timeout limit (specified in
> + jiffies). If the hwspinlock is already taken, the function will busy loop
> + waiting for it to be released, but give up when the timeout meets jiffies.
> + If timeout is 0, the function will never give up (therefore if a faulty
> + remote core never releases the hwspinlock, it will deadlock).
Same comment as above for 0 timeout.
> + Upon a successful return from this function, preemption is disabled,
> + local interrupts are disabled and their previous state is saved at the
> + given flags placeholder. The caller must not sleep, and is advised to
> + release the hwspinlock as soon as possible.
> + Returns 0 when successful and an appropriate error code otherwise (most
> + notably -ETIMEDOUT if the hwspinlock is still busy after timeout meets
> + jiffies). The function will never sleep.
> +
> + int hwspin_trylock(struct hwspinlock *hwlock);
> + - attempt to lock a previously-assigned hwspinlock, but immediately fail if
> + it is already taken.
> + Upon a successful return from this function, preemption is disabled so
> + caller must not sleep, and is advised to release the hwspinlock as soon as
> + possible, in order to minimize remote cores polling on the hardware
> + interconnect.
> + Returns 0 on success and an appropriate error code otherwise (most
> + notably -EBUSY if the hwspinlock was already taken).
> + The function will never sleep.
Is this function needed at all if timeout 0 behaves similar to trylock?
> +
> + int hwspin_trylock_irq(struct hwspinlock *hwlock);
> + - attempt to lock a previously-assigned hwspinlock, but immediately fail if
> + it is already taken.
> + Upon a successful return from this function, preemption and the local
> + interrupts are disabled so caller must not sleep, and is advised to
> + release the hwspinlock as soon as possible.
> + Returns 0 on success and an appropriate error code otherwise (most
> + notably -EBUSY if the hwspinlock was already taken).
> + The function will never sleep.
Same comment as above for trylock.
> +
> + int hwspin_trylock_irqsave(struct hwspinlock *hwlock, unsigned long
> *flags);
> + - attempt to lock a previously-assigned hwspinlock, but immediately fail if
> + it is already taken.
> + Upon a successful return from this function, preemption is disabled,
> + the local interrupts are disabled and their previous state is saved
> + at the given flags placeholder. The caller must not sleep, and is
> advised
> + to release the hwspinlock as soon as possible.
> + Returns 0 on success and an appropriate error code otherwise (most
> + notably -EBUSY if the hwspinlock was already taken).
> + The function will never sleep.
Same comment as above for trylock.
<snip>
> +
> +4. API for implementors
> +
> + int hwspinlock_register(struct hwspinlock *hwlock);
> + - to be called from the underlying platform-specific implementation, in
> + order to register a new hwspinlock instance. Can be called from an atomic
> + context (this function will not sleep) but not from within interrupt
> + context. Returns 0 on success, or appropriate error code on failure.
> +
> + struct hwspinlock *hwspinlock_unregister(unsigned int id);
> + - to be called from the underlying vendor-specific implementation, in order
> + to unregister an existing (and unused) hwspinlock instance.
> + Can be called from an atomic context (will not sleep) but not from
> + within interrupt context.
> + Returns the address of hwspinlock on success, or NULL on error (e.g.
> + if the hwspinlock is still in use).
Does this need to be called for all hwspinlocks? For example, if there are
64 hwspinlocks on a device, do these APIs need to be called 64 times? Is there
any other way to directly register all 64 hwspinlocks in a single shot?
<snip>
> +int __hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned long to,
> + int mode, unsigned long *flags)
> +{
> + int ret;
> +
> + for (;;) {
> + /* Try to take the hwspinlock */
> + ret = __hwspin_trylock(hwlock, mode, flags);
> + if (ret != -EBUSY)
> + break;
> +
> + /*
> + * The lock is already taken, let's check if the user wants
> + * us to try again
> + */
> + if (to && time_is_before_eq_jiffies(to))
> + return -ETIMEDOUT;
> +
> + /*
> + * Allow platform-specific relax handlers to prevent
> + * hogging the interconnect (no sleeping, though)
> + */
> + if (hwlock->ops->relax)
> + hwlock->ops->relax(hwlock);
There should be a way to have an escape mechanism for the case where a deadlock
has occurred due to remote side taking the spinlock and crashing.
Is it possible to have a max loop or time check here and then exit with error
if that max is reached, because there was an unexpected condition? Hanging the
kernel because of a remote crash causes problems in implementing slave error
handling.
> + }
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(__hwspin_lock_timeout);
<snip>
On Wed, Nov 24, 2010 at 1:23 AM, Ionut Nicu <[email protected]> wrote:
> At iounmap_base you unmap state->io_base, but that's set only after this
> check. You should probably iounmap(io_base).
Of course. Thanks.
>
> Regards,
> Ionut.
Hi Mugdha,
On Wed, Nov 24, 2010 at 9:44 AM, Kamoolkar, Mugdha <[email protected]> wrote:
> How do multiple clients get a handle that they can use? Are they expected to
> share the handle they get from the call above?
Currently, yes.
> What if they are independent
> clients with no means of communication between them? There may be a need of
> an API that returns the handle for a specific ID. For example, a module over
> the hwspinlock driver that does some level of management of IDs (e.g. name
> to ID mapping) and enables users to get multi-core as well as multi-client
> protection on Linux.
I'm not sure I understand the use case. Can you please elaborate ?
> For example:
> struct hwspinlock *hwspinlock_get_handle(unsigned int id);
I'm afraid such an API will be easily abused, e.g., drivers that will
try to use a predefined hwspinlock id without taking care of reserving
it early enough will probably use it to overcome an "oops this
hwspinlock has already been assigned to someone else".
So let me suggest we should first understand the use case for the API
you propose, and then see how we solve it ?
> Why are some of the APIs hwspinlock_ and others hwspin_?
I'm following the regular spinlock naming, which nicely avoids having
the word 'lock' twice in the API. So you get APIs like hwspin_lock,
hwspin_unlock, hwspin_trylock. In our case we still have stuff like
hwspinlock_request and hwspinlock_free. I can probably make it
hwspin_lock_request, does that look nicer ?
>> +  int hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned long timeout);
>> +   - lock a previously-assigned hwspinlock with a timeout limit (specified in
>> +     jiffies). If the hwspinlock is already taken, the function will busy loop
>> +     waiting for it to be released, but give up when the timeout meets jiffies.
>> +     If timeout is 0, the function will never give up (therefore if a faulty
>> +     remote core never releases the hwspinlock, it will deadlock).
> If timeout is 0, shouldn't it rather behave as a trylock? If timeout is
> MAX_SCHEDULE_TIMEOUT, then it should behave as a wait-forever.
Good point, thanks!
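The agreed-upon semantics would presumably end up looking roughly like this
(a sketch only, not the code posted in this series):

static int hwspin_lock_timeout_sketch(struct hwspinlock *hwlock,
                                      unsigned long to, int mode,
                                      unsigned long *flags)
{
        int ret;

        for (;;) {
                ret = __hwspin_trylock(hwlock, mode, flags);
                if (ret != -EBUSY)
                        break;

                /* suggested: 0 now gives trylock semantics... */
                if (to == 0)
                        return -EBUSY;

                /* ...and MAX_SCHEDULE_TIMEOUT means wait forever */
                if (to != MAX_SCHEDULE_TIMEOUT &&
                    time_is_before_eq_jiffies(to))
                        return -ETIMEDOUT;

                if (hwlock->ops->relax)
                        hwlock->ops->relax(hwlock);
        }

        return ret;
}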
>> +  int hwspin_trylock(struct hwspinlock *hwlock);
>> +   - attempt to lock a previously-assigned hwspinlock, but immediately fail if
>> +     it is already taken.
>> +     Upon a successful return from this function, preemption is disabled so
>> +     caller must not sleep, and is advised to release the hwspinlock as soon as
>> +     possible, in order to minimize remote cores polling on the hardware
>> +     interconnect.
>> +     Returns 0 on success and an appropriate error code otherwise (most
>> +     notably -EBUSY if the hwspinlock was already taken).
>> +     The function will never sleep.
> Is this function needed at all if timeout 0 behaves similar to trylock?
Yeah. Drivers should use the _trylock version when applicable because
it'd make the code more readable, and it's more intuitive (just like
the spin_trylock API).
>> +  struct hwspinlock *hwspinlock_unregister(unsigned int id);
>> +   - to be called from the underlying vendor-specific implementation, in order
>> +     to unregister an existing (and unused) hwspinlock instance.
>> +     Can be called from an atomic context (will not sleep) but not from
>> +     within interrupt context.
>> +     Returns the address of hwspinlock on success, or NULL on error (e.g.
>> +     if the hwspinlock is still in use).
> Does this need to be called for all hwspinlocks?
Currently, yes.
I actually am planning an improvement that would allow registering a
block of hwspinlocks; I don't think that the multiple calls of
register/unregister is that bad (it happens only once at boot), but I
want to allow sharing of the owner, ops and dev members of the
hwspinlock instances (to save some memory).
Anyway, it's not a big improvement, so I planned first to get things
merged and then look into stuff like that.
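Roughly along these lines, perhaps (a hypothetical sketch of that future
improvement; none of these names exist in the posted patches):

/* hypothetical: a bank of locks sharing dev/ops, registered in one call */
struct hwspinlock_bank {
        struct device *dev;
        const struct hwspinlock_ops *ops;
        int base_id;
        int num_locks;
        struct hwspinlock lock[];       /* num_locks instances */
};

static int hwspinlock_register_bank(struct hwspinlock_bank *bank)
{
        int i, ret = 0;

        for (i = 0; i < bank->num_locks; i++) {
                bank->lock[i].dev = bank->dev;
                bank->lock[i].ops = bank->ops;
                bank->lock[i].id = bank->base_id + i;

                ret = hwspinlock_register(&bank->lock[i]);
                if (ret)
                        break;  /* caller unregisters whatever did register */
        }
        return ret;
}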
>> +int __hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned long to,
>> +                                        int mode, unsigned long *flags)
>> +{
>> +        int ret;
>> +
>> +        for (;;) {
>> +                /* Try to take the hwspinlock */
>> +                ret = __hwspin_trylock(hwlock, mode, flags);
>> +                if (ret != -EBUSY)
>> +                        break;
>> +
>> +                /*
>> +                 * The lock is already taken, let's check if the user wants
>> +                 * us to try again
>> +                 */
>> +                if (to && time_is_before_eq_jiffies(to))
>> +                        return -ETIMEDOUT;
>> +
>> +                /*
>> +                 * Allow platform-specific relax handlers to prevent
>> +                 * hogging the interconnect (no sleeping, though)
>> +                 */
>> +                if (hwlock->ops->relax)
>> +                        hwlock->ops->relax(hwlock);
> There should be a way to have an escape mechanism for the case where a deadlock
> has occurred due to remote side taking the spinlock and crashing.
This is exactly what the timeout parameter is for..
Thanks,
Ohad.
My rule of thumb is that nothing is "generic"
until at least three whatever-it-is instances
plug in to it. Sometimes this is called
the "Rule of Three".
Other than OMAP, what's providing hardware
spinlocks that plug into this framework?
- Dave
Ohad,
> -----Original Message-----
> From: Ohad Ben-Cohen [mailto:[email protected]]
> Sent: Thursday, November 25, 2010 1:29 AM
> To: Kamoolkar, Mugdha
> Cc: [email protected]; [email protected]; linux-arm-
> [email protected]; [email protected]; Greg KH; Tony
> Lindgren; Cousson, Benoit; Grant Likely; Kanigeri, Hari; Anna, Suman;
> Kevin Hilman; Arnd Bergmann
> Subject: Re: [PATCH v2 1/4] drivers: hwspinlock: add generic framework
>
> Hi Mugdha,
>
> On Wed, Nov 24, 2010 at 9:44 AM, Kamoolkar, Mugdha <[email protected]> wrote:
> > How do multiple clients get a handle that they can use? Are they expected to
> > share the handle they get from the call above?
>
> Currently, yes.
>
> > What if they are independent
> > clients with no means of communication between them? There may be a need of
> > an API that returns the handle for a specific ID. For example, a module over
> > the hwspinlock driver that does some level of management of IDs (e.g. name
> > to ID mapping) and enables users to get multi-core as well as multi-client
> > protection on Linux.
>
> I'm not sure I understand the use case. Can you please elaborate ?
>
Consider a software module on top of the hwspinlock, which provides a
generic way for multi-core critical section, say GateMP. This module enables
users to create critical section objects by name. Any other client can open
the critical section by name and get a handle to it. I would expect this
module to make a call to request a resource when creating the GateMP object.
Suppose that the object is actually created by the remote core, and the call
comes to Linux on the host processor to allocate the system resource (as the
owner of the system resources). It will call hwspinlock_request, get a
handle, get the ID from it, and return the ID to the remote processor. There
is no point in the remote processor holding the handle that's not valid in
its virtual space. The ID, in this case, is the single portable value that
every processor understands in a different way. When this object is being
deleted, the ID would be passed to Linux, and a corresponding Linux entity
would then have to get the handle from the ID and call _free.
Similarly, suppose the creation is happening from user-space, the user-space
object should store the ID in the user object, and get the handle from the
ID when performing any actions on the lock from kernel-side.
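In code, the delete path might look something like this (purely illustrative;
hwspinlock_get_handle() is the proposed API, not something that exists in
this patchset):

/* illustrative only: tear down a GateMP-style object by its portable id */
int gatemp_delete(unsigned int lock_id)
{
        struct hwspinlock *hwlock;

        /* map the id received from the remote core back to a local handle */
        hwlock = hwspinlock_get_handle(lock_id);        /* proposed API */
        if (!hwlock)
                return -EINVAL;

        /* release the underlying system resource */
        return hwspinlock_free(hwlock);
}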
> > For example:
> > struct hwspinlock *hwspinlock_get_handle(unsigned int id);
>
> I'm afraid such an API will be easily abused, e.g., drivers that will
> try to use a predefined hwspinlock id without taking care of reserving
> it early enough will probably use it to overcome an "oops this
> hwspinlock has already been assigned to someone else".
>
Really? Why would they intentionally destabilize the system like this???
> So let me suggest we should first understand the use case for the API
> you propose, and then see how we solve it ?
>
> > Why are some of the APIs hwspinlock_ and others hwspin_?
>
> I'm following the regular spinlock naming, which nicely avoids having
> the word 'lock' twice in the API. So you get APIs like hwspin_lock,
> hwspin_unlock, hwspin_trylock. In our case we still have stuff like
> hwspinlock_request and hwspinlock_free. I can probably make it
> hwspin_lock_request, does that look nicer ?
>
No real issues with this, just seems less intuitive. I just expected all the
module APIs to follow same convention <module>_API to make them more
intuitive.
> >> +  int hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned long timeout);
> >> +   - lock a previously-assigned hwspinlock with a timeout limit (specified in
> >> +     jiffies). If the hwspinlock is already taken, the function will busy loop
> >> +     waiting for it to be released, but give up when the timeout meets jiffies.
> >> +     If timeout is 0, the function will never give up (therefore if a faulty
> >> +     remote core never releases the hwspinlock, it will deadlock).
> > If timeout is 0, shouldn't it rather behave as a trylock? If timeout is
> > MAX_SCHEDULE_TIMEOUT, then it should behave as a wait-forever.
>
> Good point, thanks!
>
> >> +  int hwspin_trylock(struct hwspinlock *hwlock);
> >> +   - attempt to lock a previously-assigned hwspinlock, but immediately fail if
> >> +     it is already taken.
> >> +     Upon a successful return from this function, preemption is disabled so
> >> +     caller must not sleep, and is advised to release the hwspinlock as soon as
> >> +     possible, in order to minimize remote cores polling on the hardware
> >> +     interconnect.
> >> +     Returns 0 on success and an appropriate error code otherwise (most
> >> +     notably -EBUSY if the hwspinlock was already taken).
> >> +     The function will never sleep.
> > Is this function needed at all if timeout 0 behaves similar to trylock?
>
> Yeah. Drivers should use the _trylock version when applicable because
> it'd make the code more readable, and it's more intuitive (just like
> the spin_trylock API).
Agreed.
>
> >> +  struct hwspinlock *hwspinlock_unregister(unsigned int id);
> >> +   - to be called from the underlying vendor-specific implementation, in order
> >> +     to unregister an existing (and unused) hwspinlock instance.
> >> +     Can be called from an atomic context (will not sleep) but not from
> >> +     within interrupt context.
> >> +     Returns the address of hwspinlock on success, or NULL on error (e.g.
> >> +     if the hwspinlock is still in use).
> > Does this need to be called for all hwspinlocks?
>
> Currently, yes.
>
Ok, that should be fine.
> I actually am planning an improvement that would allow registering a
> block of hwspinlocks; I don't think that the multiple calls of
> register/unregister is that bad (it happens only once at boot), but I
> want to allow sharing of the owner, ops and dev members of the
> hwspinlock instances (to save some memory).
>
> Anyway, it's not a big improvement, so I planned first to get things
> merged and then look into stuff like that.
>
> >> +int __hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned long to,
> >> +                                        int mode, unsigned long *flags)
> >> +{
> >> +        int ret;
> >> +
> >> +        for (;;) {
> >> +                /* Try to take the hwspinlock */
> >> +                ret = __hwspin_trylock(hwlock, mode, flags);
> >> +                if (ret != -EBUSY)
> >> +                        break;
> >> +
> >> +                /*
> >> +                 * The lock is already taken, let's check if the user wants
> >> +                 * us to try again
> >> +                 */
> >> +                if (to && time_is_before_eq_jiffies(to))
> >> +                        return -ETIMEDOUT;
> >> +
> >> +                /*
> >> +                 * Allow platform-specific relax handlers to prevent
> >> +                 * hogging the interconnect (no sleeping, though)
> >> +                 */
> >> +                if (hwlock->ops->relax)
> >> +                        hwlock->ops->relax(hwlock);
> > There should be a way to have an escape mechanism for the case where a
> deadlock
> > has occurred due to remote side taking the spinlock and crashing.
>
> This is exactly what the timeout parameter is for..
>
I was looking at the situation where someone does not use the lock with
timeout. In that case, do we say that there's no way out? And the system
will hang and need a re-boot? Or do we still want to enable some way to
forcibly break the wait after it has happened for an unreasonably long time?
> Thanks,
> Ohad.
On Thu, Nov 25, 2010 at 5:59 AM, David Brownell <[email protected]> wrote:
> My rule of thumb is that nothing is "generic"
> until at least three whatever-it-is instances
> plug in to it. Sometimes this is called
> the "Rule of Three".
>
> Other than OMAP, what's providing hardware
> spinlocks that plug into this framework?
We are not aware of any.
That's why the first iteration was just an omap-specific misc driver.
But we were asked not to pollute device drivers with omap-specific
interfaces (see the discussion on [1]). I think it's a good goal (it
will keep the IPC drivers that will come from TI platform-agnostic),
so we split the driver into a generic interface plus small
omap-specific implementation.
This way platforms [2] can easily plug into the framework anything
they need to achieve multi-core synchronization. E.g., even in case a
platform doesn't have dedicated silicon, but still needs this
functionality, it can still plug in an implementation which is based
on Peterson's shared memory mutual exclusion algorithm (see
http://en.wikipedia.org/wiki/Peterson's_algorithm).
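For instance, a rough sketch of such a shared-memory backend for two cores
(hypothetical code; a real implementation would also have to worry about
cache coherency and the memory model on both sides):

/* hypothetical Peterson's-algorithm backend, kept in uncached shared memory */
struct peterson_lock {
        volatile u32 flag[2];
        volatile u32 turn;
};

struct peterson_hwspinlock {
        struct hwspinlock lock;
        struct peterson_lock *pl;       /* points into shared memory */
        int self, other;                /* 0 or 1, per core */
};

static int peterson_trylock(struct hwspinlock *lock)
{
        struct peterson_hwspinlock *p =
                container_of(lock, struct peterson_hwspinlock, lock);

        p->pl->flag[p->self] = 1;
        p->pl->turn = p->other;
        mb();   /* publish our intent before testing the other side */

        if (p->pl->flag[p->other] && p->pl->turn == p->other) {
                /* contended: withdraw, so this remains a trylock */
                p->pl->flag[p->self] = 0;
                return 0;
        }
        return 1;       /* lock taken */
}

static void peterson_unlock(struct hwspinlock *lock)
{
        struct peterson_hwspinlock *p =
                container_of(lock, struct peterson_hwspinlock, lock);

        p->pl->flag[p->self] = 0;
}

static const struct hwspinlock_ops peterson_hwspinlock_ops = {
        .trylock        = peterson_trylock,
        .unlock         = peterson_unlock,
};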
The third alternative is to have this driver completely hidden inside
the omap folders and deliver a pdata func pointer to drivers that use
it. I am not fond of this, since the driver really only has a tiny
omap-specific part, and most of it should really sit in drivers/. In
addition, it will probably kill the chance of others using it too.
[1] http://lkml.org/lkml/2010/10/22/317
[2] this is mainly aimed at non-coherent heterogeneous processors that
do not support fancy synchronization primitives
>
> - Dave
>
>
On Thu, Nov 25, 2010 at 8:05 AM, Kamoolkar, Mugdha <[email protected]> wrote:
> Consider a software module on top of the hwspinlock, which provides a
> generic way for multi-core critical section, say GateMP. This module enables
> users to create critical section objects by name. Any other client can open
> the critical section by name and get a handle to it. I would expect this
> module to make a call to request a resource when creating the GateMP object.
> Suppose that the object is actually created by the remote core, and the call
> comes to Linux on the host processor to allocate the system resource (as the
> owner of the system resources). It will call hwspinlock_request, get a
> handle, get the ID from it, and return the ID to the remote processor. There
> is no point in the remote processor holding the handle that's not valid in
> its virtual space. The ID, in this case, is the single portable value that
> every processor understands in a different way. When this object were being
> deleted, the ID would be passed to Linux, and a corresponding Linux entity
> would then have to get the handle from the ID and call _free.
How is it different from any other hardware resource that the host
will allocate on behalf of the slaves ? Do we have such _open API for,
e.g., dmtimer ?
It looks like this is a broader problem that needs to be solved by the
kernel module that will manage the resources of the remote processors.
>
> Similarly, suppose the creation is happening from user-space, the user-space
> object should store the ID in the user object, and get the handle from the
> ID when performing any actions on the lock from kernel-side.
The user space interface is not yet defined, but -
One option is to expose a char device for every hwspinlock device,
that can only be opened by a single user (which, after successfully
opening it, will become its owner). Once that user will close the
device (either consciously or by crashing), the kernel will free the
resource. It doesn't look like an _open API is needed here.
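A sketch of that idea (hypothetical; no userspace interface is part of this
series):

#include <linux/fs.h>
#include <linux/atomic.h>
#include <linux/hwspinlock.h>

/* hypothetical single-owner char device wrapping one hwspinlock */
static atomic_t hwspin_dev_users = ATOMIC_INIT(0);
static struct hwspinlock *hwspin_dev_lock;

static int hwspin_dev_open(struct inode *inode, struct file *filp)
{
        /* the first (and only) opener becomes the owner */
        if (atomic_cmpxchg(&hwspin_dev_users, 0, 1))
                return -EBUSY;

        hwspin_dev_lock = hwspinlock_request();
        if (!hwspin_dev_lock) {
                atomic_set(&hwspin_dev_users, 0);
                return -EBUSY;
        }
        return 0;
}

static int hwspin_dev_release(struct inode *inode, struct file *filp)
{
        /* runs on close(), whether the owner exited cleanly or crashed */
        hwspinlock_free(hwspin_dev_lock);
        atomic_set(&hwspin_dev_users, 0);
        return 0;
}

static const struct file_operations hwspin_dev_fops = {
        .open           = hwspin_dev_open,
        .release        = hwspin_dev_release,
};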
>> > For example:
>> > struct hwspinlock *hwspinlock_get_handle(unsigned int id);
>>
>> I'm afraid such an API will be easily abused, e.g., drivers that will
>> try to use a predefined hwspinlock id without taking care of reserving
>> it early enough will probably use it to overcome an "oops this
>> hwspinlock has already been assigned to someone else".
>>
> Really? Why would they intentionally destabilize the system like this???
These things just happen.. even by mistake.
But I'd rather focus on the "why yes" than on the "why not".
Let's define the problem exactly, and then see how to solve it
correctly. Btw it might still end up being an _open API if we find it
to be the best solution (but let's first make sure we understand the
problem).
> No real issues with this, just seems less intuitive. I just expected all the
> module APIs to follow same convention <module>_API to make them more
> intuitive.
Then I'll change it; it might indeed look better.
>> >> +int __hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned long to,
>> >> +                                        int mode, unsigned long *flags)
>> >> +{
>> >> +        int ret;
>> >> +
>> >> +        for (;;) {
>> >> +                /* Try to take the hwspinlock */
>> >> +                ret = __hwspin_trylock(hwlock, mode, flags);
>> >> +                if (ret != -EBUSY)
>> >> +                        break;
>> >> +
>> >> +                /*
>> >> +                 * The lock is already taken, let's check if the user wants
>> >> +                 * us to try again
>> >> +                 */
>> >> +                if (to && time_is_before_eq_jiffies(to))
>> >> +                        return -ETIMEDOUT;
>> >> +
>> >> +                /*
>> >> +                 * Allow platform-specific relax handlers to prevent
>> >> +                 * hogging the interconnect (no sleeping, though)
>> >> +                 */
>> >> +                if (hwlock->ops->relax)
>> >> +                        hwlock->ops->relax(hwlock);
>> > There should be a way to have an escape mechanism for the case where a
>> deadlock
>> > has occurred due to remote side taking the spinlock and crashing.
>>
>> This is exactly what the timeout parameter is for..
>>
> I was looking at the situation where someone does not use the lock with
> timeout.
> In that case, do we say that there's no way out?
If one decides to use the simple _lock version, then yes (just like
with spin_lock()).
But if the user cares about an optional crash of the remote task, then
definitely the _timeout version should be used - that's why we
provided it.
We can remove the _lock version altogether, but I don't think it's a
good thing. A user can still use the _timeout version with infinite
timeout, which would yield the same result. The _lock version is
provided for simple cases, where a timeout isn't required.
Thanks,
Ohad.
> And the system
> will hang and need a re-boot? Or do we still want to enable some way to
> forcibly break the wait after it has happened for an unreasonably long time?
>
>> Thanks,
>> Ohad.
>
On Thu, 2010-11-25 at 08:40 +0200, Ohad Ben-Cohen wrote:
> On Thu, Nov 25, 2010 at 5:59 AM, David Brownell <[email protected]> wrote:
> > My rule of thumb is that nothing is "generic"
> > until at least three whatever-it-is instances
> > plug in to it. Sometimes this is called
> > the "Rule of Three".
> >
> > Other than OMAP, what's providing hardware
> > spinlocks that plug into this framework?
>
> We are not aware of any.
So there's no strong reason to think this is
actually "ggeneric". Function names no longer
specify OMAP, but without other hardware under
the interface, calling it "generic" reflects
more optimism than reality. (That was the
implication of my observations...)
To find other hardware spinlocks, you might be
able to look at fault tolerant multiprocessors.
Ages ago I worked with one of those, where any
spinlock failures integrated with CPU/OS fault
detection; HW could yank (checkpointed) CPU boards
off the bus so they could be recovered (some
might continue later from checkpoints, etc.)...
> This way platforms [2] can easily plug into the framework anything
> they need to achieve multi-core synchronization. E.g., even in case a
> platform doesn't have dedicated silicon, but still need this
> functionality, it can still plug in an implementation which is based
> on Peterson's shared memory mutual exclusion algorithm
And maybe also standard Linux spinlocks?
I seem to recall some iterations of the real-time patches doing a lot of
work to generalize spinlocks, since they needed multiple variants. It
might be worth following in those footsteps. (Though I'm not sure they
were thinking much about hardware support.)
- Dave
Hi,
In addition to other comments from others, here are a few on the
implementation.
There's a fair amount of potentially spammy and redundant debug code
left in the generic code. I've commented on some of them below, but the
same comments would apply to other locations as well.
On Tue, Nov 23, 2010 at 05:38:57PM +0200, Ohad Ben-Cohen wrote:
> diff --git a/drivers/hwspinlock/hwspinlock_core.c b/drivers/hwspinlock/hwspinlock_core.c
> new file mode 100644
> index 0000000..cd230bc
> --- /dev/null
> +++ b/drivers/hwspinlock/hwspinlock_core.c
> @@ -0,0 +1,561 @@
> +/*
> + * Generic hardware spinlock framework
> + *
> + * Copyright (C) 2010 Texas Instruments Incorporated - http://www.ti.com
> + *
> + * Contact: Ohad Ben-Cohen <[email protected]>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + */
> +
> +#define pr_fmt(fmt) "%s: " fmt, __func__
Not used.
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/spinlock.h>
> +#include <linux/types.h>
> +#include <linux/err.h>
> +#include <linux/jiffies.h>
> +#include <linux/radix-tree.h>
> +#include <linux/hwspinlock.h>
> +#include <linux/pm_runtime.h>
> +
> +#include "hwspinlock.h"
> +
> +/* radix tree tags */
> +#define HWSPINLOCK_UNUSED (0) /* tags an hwspinlock as unused */
> +
> +/*
> + * A radix tree is used to maintain the available hwspinlock instances.
> + * The tree associates hwspinlock pointers with their integer key id,
> + * and provides easy-to-use API which makes the hwspinlock core code simple
> + * and easy to read.
> + *
> + * Radix trees are quick on lookups, and reasonably efficient in terms of
> + * storage, especially with high density usages such as this framework
> + * requires (a continuous range of integer keys, beginning with zero, is
> + * used as the ID's of the hwspinlock instances).
> + *
> + * The radix tree API supports tagging items in the tree, which this
> + * framework uses to mark unused hwspinlock instances (see the
> + * HWSPINLOCK_UNUSED tag above). As a result, the process of querying the
> + * tree, looking for an unused hwspinlock instance, is now reduced to a
> + * single radix tree API call.
> + */
> +static RADIX_TREE(hwspinlock_tree, GFP_KERNEL);
> +
> +/*
> + * Synchronization of access to the tree is achieved using this spinlock,
> + * as the radix-tree API requires that users provide all synchronisation.
> + */
> +static DEFINE_SPINLOCK(hwspinlock_tree_lock);
> +
> +/**
> + * __hwspin_trylock() - attempt to lock a specific hwspinlock
> + * @hwlock: an hwspinlock which we want to trylock
> + * @mode: controls whether local interrupts are disabled or not
> + * @flags: a pointer where the caller's interrupt state will be saved at (if
> + * requested)
> + *
> + * This function attempts to lock an hwspinlock, and will immediately
> + * fail if the hwspinlock is already taken.
> + *
> + * Upon a successful return from this function, preemption (and possibly
> + * interrupts) is disabled, so the caller must not sleep, and is advised to
> + * release the hwspinlock as soon as possible. This is required in order to
> + * minimize remote cores polling on the hardware interconnect.
> + *
> + * The user decides whether local interrupts are disabled or not, and if yes,
> + * whether he wants their previous state to be saved. It is up to the user
> + * to choose the appropriate @mode of operation, exactly the same way users
> + * should decide between spin_trylock, spin_trylock_irq and
> + * spin_trylock_irqsave.
> + *
> + * Returns 0 if we successfully locked the hwspinlock, -EBUSY if
> + * the hwspinlock was already taken, and -EINVAL if @hwlock is invalid.
> + * This function will never sleep.
> + */
> +int __hwspin_trylock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
> +{
> + int ret;
> +
> + if (unlikely(!hwlock)) {
> + pr_err("invalid hwlock\n");
These kinds of errors can get very spammy for buggy drivers. It's likely
more useful to either do a WARN_ON(), and/or move them under a debug
config option.
> + return -EINVAL;
> + }
> + if (unlikely(mode == HWLOCK_IRQSTATE && flags == NULL)) {
> + pr_err("invalid flags pointer (null)\n");
> + return -EINVAL;
> + }
> +
> + /*
> + * This spin_lock{_irq, _irqsave} serves three purposes:
> + *
> + * 1. Disable preemption, in order to minimize the period of time
> + * in which the hwspinlock is taken. This is important in order
> + * to minimize the possible polling on the hardware interconnect
> + * by a remote user of this lock.
> + * 2. Make the hwspinlock SMP-safe (so we can take it from
> + * additional contexts on the local host).
> + * 3. Ensure that in_atomic/might_sleep checks catch potential
> + * problems with hwspinlock usage (e.g. scheduler checks like
> + * 'scheduling while atomic' etc.)
> + */
> + if (mode == HWLOCK_IRQSTATE)
> + ret = spin_trylock_irqsave(&hwlock->lock, *flags);
> + else if (mode == HWLOCK_IRQ)
> + ret = spin_trylock_irq(&hwlock->lock);
> + else
> + ret = spin_trylock(&hwlock->lock);
> +
> + /* is lock already taken by another context on the local cpu ? */
> + if (!ret)
> + return -EBUSY;
> +
> + /* try to take the hwspinlock device */
> + ret = hwlock->ops->trylock(hwlock);
> +
> + /* if hwlock is already taken, undo spin_trylock_* and exit */
> + if (!ret) {
> + if (mode == HWLOCK_IRQSTATE)
> + spin_unlock_irqrestore(&hwlock->lock, *flags);
> + else if (mode == HWLOCK_IRQ)
> + spin_unlock_irq(&hwlock->lock);
> + else
> + spin_unlock(&hwlock->lock);
> +
> + return -EBUSY;
> + }
> +
> + /*
> + * We can be sure the other core's memory operations
> + * are observable to us only _after_ we successfully take
> + * the hwspinlock, so we must make sure that subsequent memory
> + * operations will not be reordered before we actually took the
> + * hwspinlock.
> + *
> + * Note: the implicit memory barrier of the spinlock above is too
> + * early, so we need this additional explicit memory barrier.
> + */
> + mb();
rmb() should be sufficient here.
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(__hwspin_trylock);
> +
[...]
> +/**
> + * __hwspin_unlock() - unlock a specific hwspinlock
> + * @hwlock: a previously-acquired hwspinlock which we want to unlock
> + * @mode: controls whether local interrupts need to be restored or not
> + * @flags: previous caller's interrupt state to restore (if requested)
> + *
> + * This function will unlock a specific hwspinlock, enable preemption and
> + * (possibly) enable interrupts or restore their previous state.
> + * @hwlock must be already locked before calling this function: it is a bug
> + * to call unlock on a @hwlock that is already unlocked.
> + *
> + * The user decides whether local interrupts should be enabled or not, and
> + * if yes, whether he wants their previous state to be restored. It is up
> + * to the user to choose the appropriate @mode of operation, exactly the
> + * same way users decide between spin_unlock, spin_unlock_irq and
> + * spin_unlock_irqrestore.
> + *
> + * The function will never sleep.
> + *
> + * Returns 0 on success, -EINVAL if @hwlock or @flags are invalid.
> + */
> +int __hwspin_unlock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
> +{
> + if (unlikely(!hwlock)) {
> + pr_err("invalid hwlock\n");
> + return -EINVAL;
> + }
> + if (unlikely(mode == HWLOCK_IRQSTATE && flags == NULL)) {
> + pr_err("invalid flags pointer (null)\n");
> + return -EINVAL;
> + }
> +
> + /*
> + * We must make sure that memory operations, done before unlocking
> + * the hwspinlock, will not be reordered after the lock is released.
> + *
> + * That's the purpose of this explicit memory barrier.
> + *
> + * Note: the memory barrier induced by the spin_unlock below is too
> + * late; the other core is going to access memory soon after it will
> + * take the hwspinlock, and by then we want to be sure our memory
> + * operations are already observable.
> + */
> + mb();
wmb() should be sufficient here.
> +
> + hwlock->ops->unlock(hwlock);
> +
> + /* Undo the spin_trylock{_irq, _irqsave} called while locking */
> + if (mode == HWLOCK_IRQSTATE)
> + spin_unlock_irqrestore(&hwlock->lock, *flags);
> + else if (mode == HWLOCK_IRQ)
> + spin_unlock_irq(&hwlock->lock);
> + else
> + spin_unlock(&hwlock->lock);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(__hwspin_unlock);
> +
> +/**
> + * hwspinlock_register() - register a new hw spinlock
> + * @hwlock: hwspinlock to register.
> + *
> + * This function should be called from the underlying platform-specific
> + * implementation, to register a new hwspinlock instance.
> + *
> + * Can be called from an atomic context (will not sleep) but not from
> + * within interrupt context.
> + *
> + * Returns 0 on success, or an appropriate error code on failure
> + */
> +int hwspinlock_register(struct hwspinlock *hwlock)
> +{
> + struct hwspinlock *tmp;
> + int ret;
> +
> + if (!hwlock || !hwlock->ops ||
> + !hwlock->ops->trylock || !hwlock->ops->unlock) {
> + pr_err("invalid parameters\n");
> + return -EINVAL;
> + }
> +
> + spin_lock_init(&hwlock->lock);
> +
> + spin_lock(&hwspinlock_tree_lock);
> +
> + ret = radix_tree_insert(&hwspinlock_tree, hwlock->id, hwlock);
> + if (ret)
> + goto out;
> +
> + /* mark this hwspinlock as available */
> + tmp = radix_tree_tag_set(&hwspinlock_tree, hwlock->id,
> + HWSPINLOCK_UNUSED);
> +
> + /* this implies an unrecoverable bug. at least rant */
> + WARN_ON(tmp != hwlock);
I don't see how this could ever happen?
> +
> +out:
> + spin_unlock(&hwspinlock_tree_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(hwspinlock_register);
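(For the implementer side -- a hypothetical platform driver wiring up
one lock; the myplat_* names and the ops struct name are made up here,
only the ->trylock/->unlock/->id members are taken from the patch:)

        static int myplat_hwspinlock_trylock(struct hwspinlock *lock)
        {
                /* a real driver would read a lock register here;
                 * return nonzero iff the lock was acquired */
                return 1;
        }

        static void myplat_hwspinlock_unlock(struct hwspinlock *lock)
        {
                /* a real driver would write the release register here */
        }

        static const struct hwspinlock_ops myplat_ops = {
                .trylock = myplat_hwspinlock_trylock,
                .unlock  = myplat_hwspinlock_unlock,
        };

        hwlock->id = 0;
        hwlock->ops = &myplat_ops;
        ret = hwspinlock_register(hwlock);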
> +
> +/**
> + * hwspinlock_unregister() - unregister an hw spinlock
> + * @id: index of the specific hwspinlock to unregister
> + *
> + * This function should be called from the underlying platform-specific
> + * implementation, to unregister an existing (and unused) hwspinlock.
> + *
> + * Can be called from an atomic context (will not sleep) but not from
> + * within interrupt context.
> + *
> + * Returns the address of hwspinlock @id on success, or NULL on failure
Why not just return int for success / fail and have the driver keep track
of the lock pointers too?
Or, if you for some reason are attached to the ptr-return case, please make it
return a standard ERR_PTR instead.
[...]
> +struct hwspinlock *hwspinlock_request(void)
> +{
> + struct hwspinlock *hwlock;
> + int ret;
> +
> + spin_lock(&hwspinlock_tree_lock);
> +
> + /* look for an unused lock */
> + ret = radix_tree_gang_lookup_tag(&hwspinlock_tree, (void **)&hwlock,
> + 0, 1, HWSPINLOCK_UNUSED);
> + if (ret == 0) {
> + pr_warn("a free hwspinlock is not available\n");
> + hwlock = NULL;
> + goto out;
> + }
> +
> + /* sanity check: we shouldn't get more than we requested for */
> + WARN_ON(ret > 1);
No need to check, unless you're debugging the radix code.
> + /* mark as used and power up */
> + ret = __hwspinlock_request(hwlock);
> + if (ret < 0)
> + hwlock = NULL;
> +
> +out:
> + spin_unlock(&hwspinlock_tree_lock);
> + return hwlock;
> +}
> +EXPORT_SYMBOL_GPL(hwspinlock_request);
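(Putting the consumer calls together -- a sketch of the intended
request/use/free lifecycle, per the API above; the -EBUSY error code
is this sketch's choice:)

        struct hwspinlock *hwlock;

        hwlock = hwspinlock_request();  /* grab any unused lock */
        if (!hwlock)
                return -EBUSY;          /* no free lock available */

        /* ... take and release it as needed via __hwspin_trylock() ... */

        hwspinlock_free(hwlock);        /* mark it unused again */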
> +
[...]
> +/**
> + * hwspinlock_free() - free a specific hwspinlock
> + * @hwlock: the specific hwspinlock to free
> + *
> + * This function marks @hwlock as free again.
> + * Should only be called with an @hwlock that was retrieved from
> + * an earlier call to omap_hwspinlock_request{_specific}.
> + *
> + * Can be called from an atomic context (will not sleep) but not from
> + * within interrupt context (simply because there is no use case for
> + * that yet).
> + *
> + * Returns 0 on success, or an appropriate error code on failure
> + */
> +int hwspinlock_free(struct hwspinlock *hwlock)
> +{
> + struct hwspinlock *tmp;
> + int ret;
> +
> + if (!hwlock) {
Some alloc/free APIs will just silently return for these cases.
> + pr_err("invalid hwlock\n");
> + return -EINVAL;
> + }
> +
> + spin_lock(&hwspinlock_tree_lock);
> +
> + /* make sure the hwspinlock is used */
> + ret = radix_tree_tag_get(&hwspinlock_tree, hwlock->id,
> + HWSPINLOCK_UNUSED);
> + if (ret == 1) {
> + dev_err(hwlock->dev, "%s: hwlock is already free\n", __func__);
Double free. WARN_ON() seems appropriate?
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + /* notify the underlying device that power is not needed */
> + ret = pm_runtime_put(hwlock->dev);
> + if (ret < 0)
> + goto out;
> +
> + /* mark this hwspinlock as available */
> + tmp = radix_tree_tag_set(&hwspinlock_tree, hwlock->id,
> + HWSPINLOCK_UNUSED);
> +
> + /* sanity check (this shouldn't happen) */
> + WARN_ON(tmp != hwlock);
No need for this.
> +
> + module_put(hwlock->owner);
> +
> +out:
> + spin_unlock(&hwspinlock_tree_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(hwspinlock_free);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_DESCRIPTION("Generic hardware spinlock interface");
> +MODULE_AUTHOR("Ohad Ben-Cohen <[email protected]>");
On Thu, Nov 25, 2010 at 9:59 PM, Olof Johansson <[email protected]> wrote:
> On Tue, Nov 23, 2010 at 05:38:57PM +0200, Ohad Ben-Cohen wrote:
>> +#define pr_fmt(fmt)    "%s: " fmt, __func__
>
> Not used.
pr_fmt() is a magic #define that changes the behaviour of the pr_*()
macros. See include/linux/kernel.h
g.
On Thu, Nov 25, 2010 at 10:22 PM, David Brownell <[email protected]> wrote:
> So there's no strong reason to think this is
> actually "generic". Function names no longer
> specify OMAP, but without other hardware under
> the interface, calling it "generic" reflects
> more optimism than reality. (That was the
> implication of my observations...)
Well, it's not omap-specific anymore. Drivers can stay
platform-agnostic, and other vendors can plug in anything they need in
order to use the same drivers (including software implementations as
mentioned below).
As I mentioned, the other two options are going back to an omap-specific
driver (which will omapify the drivers that use it), or putting
the driver in the omap arch folders (which some people don't seem to
like, e.g. http://www.spinics.net/lists/linux-arm-msm/msg00324.html).
Which one of those would you prefer to see ?
I guess that by now we have all three implementations so it's pretty
easy to switch :)
> To find other hardware spinlocks, you might be
> able to look at fault tolerant multiprocessors.
> Ages ago I worked with one of those, where any
> spinlock failures integrated with CPU/OS fault
> detection; HW would yank (checkpointed) CPU boards
> off the bus so they could be recovered (some
> might continue later from checkpoints, etc.)...
Is that HW supported by Linux today ?
Any chance you can share a link or any other info about it ?
>> This way platforms [2] can easily plug into the framework anything
>> they need to achieve multi-core synchronization. E.g., even in case a
>> platform doesn't have dedicated silicon, but still needs this
>> functionality, it can still plug in an implementation which is based
>> on Peterson's shared memory mutual exclusion algorithm
>
> And maybe also standard Linux spinlocks?
Umm, possibly. Though I admit it sounds a bit awkward. It feels like
platforms, on which a mere spinlock is enough to achieve multi-core
synchronization, will not really need any of the IPC drivers that are
going to use this hwspinlock framework. But I guess we need to see a
use case first.
>
> I seem to recall some iterations of the real-time patches doing a lot of
> work to generalize spinlocks, since they needed multiple variants. It
> might be worth following in those footsteps. (Though I'm not sure they
> were thinking much about hardware support.)
Any chance you can point me at a specific discussion or patchset that
you feel may be relevant ?
Thanks!
Ohad.
>
> - Dave
On Fri, Nov 26, 2010 at 6:59 AM, Olof Johansson <[email protected]> wrote:
>> +#define pr_fmt(fmt)    "%s: " fmt, __func__
>
> Not used.
Yes, it is, check out how the pr_* macros are implemented:
#define pr_info(fmt, ...) \
printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
I use it to ensure that the function name is printed with any pr_*
macro, without having to explicitly specify the __func__ param every
time.
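(Concretely -- with that pr_fmt() in effect, the pr_err("invalid
hwlock\n") call in __hwspin_trylock() expands to roughly:

        printk(KERN_ERR "%s: " "invalid hwlock\n", __func__);

so the log line reads "__hwspin_trylock: invalid hwlock".)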
>> +int __hwspin_trylock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
>> +{
>> +        int ret;
>> +
>> +        if (unlikely(!hwlock)) {
>> +                pr_err("invalid hwlock\n");
>
> These kinds of errors can get very spammy for buggy drivers.
Yeah, but that's the purpose - I want to catch such egregious drivers
who try to crash the kernel.
> It's likely
> more useful to do a WARN_ON(), and/or move them under a debug
> config option.
Why would you prefer to compile out reporting of such extremely buggy behavior ?
>
>> +                return -EINVAL;
>> +        }
>> +        if (unlikely(mode == HWLOCK_IRQSTATE && flags == NULL)) {
>> +                pr_err("invalid flags pointer (null)\n");
>> +                return -EINVAL;
>> +        }
>> +
>> +        /*
>> +         * This spin_lock{_irq, _irqsave} serves three purposes:
>> +         *
>> +         * 1. Disable preemption, in order to minimize the period of time
>> +         *    in which the hwspinlock is taken. This is important in order
>> +         *    to minimize the possible polling on the hardware interconnect
>> +         *    by a remote user of this lock.
>> +         * 2. Make the hwspinlock SMP-safe (so we can take it from
>> +         *    additional contexts on the local host).
>> +         * 3. Ensure that in_atomic/might_sleep checks catch potential
>> +         *    problems with hwspinlock usage (e.g. scheduler checks like
>> +         *    'scheduling while atomic' etc.)
>> +         */
>> +        if (mode == HWLOCK_IRQSTATE)
>> +                ret = spin_trylock_irqsave(&hwlock->lock, *flags);
>> +        else if (mode == HWLOCK_IRQ)
>> +                ret = spin_trylock_irq(&hwlock->lock);
>> +        else
>> +                ret = spin_trylock(&hwlock->lock);
>> +
>> +        /* is lock already taken by another context on the local cpu ? */
>> +        if (!ret)
>> +                return -EBUSY;
>> +
>> +        /* try to take the hwspinlock device */
>> +        ret = hwlock->ops->trylock(hwlock);
>> +
>> +        /* if hwlock is already taken, undo spin_trylock_* and exit */
>> +        if (!ret) {
>> +                if (mode == HWLOCK_IRQSTATE)
>> +                        spin_unlock_irqrestore(&hwlock->lock, *flags);
>> +                else if (mode == HWLOCK_IRQ)
>> +                        spin_unlock_irq(&hwlock->lock);
>> +                else
>> +                        spin_unlock(&hwlock->lock);
>> +
>> +                return -EBUSY;
>> +        }
>> +
>> +        /*
>> +         * We can be sure the other core's memory operations
>> +         * are observable to us only _after_ we successfully take
>> +         * the hwspinlock, so we must make sure that subsequent memory
>> +         * operations will not be reordered before we actually took the
>> +         * hwspinlock.
>> +         *
>> +         * Note: the implicit memory barrier of the spinlock above is too
>> +         * early, so we need this additional explicit memory barrier.
>> +         */
>> +        mb();
>
> rmb() should be sufficient here.
It's not.
We need to make sure that our writes, too, will not be reordered
before that barrier. Otherwise, we might end up changing protected
shared memory during the critical section of the remote processor
(before we acquire the lock we must not read from, or write to, the
protected shared memory).
I guess I need to add a comment about this.
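(To spell the hazard out -- a sketch; shared_mem stands for whatever
memory the lock protects, and is hypothetical:)

        if (hwlock->ops->trylock(hwlock)) {
                /*
                 * with only an rmb() here, the CPU would still be free
                 * to issue the store below before the lock was actually
                 * taken, i.e. inside the remote core's critical section
                 */
                mb();
                shared_mem->head = new_head;    /* hypothetical write */
        }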
>> +int __hwspin_unlock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
>> +{
>> +        if (unlikely(!hwlock)) {
>> +                pr_err("invalid hwlock\n");
>> +                return -EINVAL;
>> +        }
>> +        if (unlikely(mode == HWLOCK_IRQSTATE && flags == NULL)) {
>> +                pr_err("invalid flags pointer (null)\n");
>> +                return -EINVAL;
>> +        }
>> +
>> +        /*
>> +         * We must make sure that memory operations, done before unlocking
>> +         * the hwspinlock, will not be reordered after the lock is released.
>> +         *
>> +         * That's the purpose of this explicit memory barrier.
>> +         *
>> +         * Note: the memory barrier induced by the spin_unlock below is too
>> +         * late; the other core is going to access memory soon after it will
>> +         * take the hwspinlock, and by then we want to be sure our memory
>> +         * operations are already observable.
>> +         */
>> +        mb();
>
> wmb() should be sufficient here.
No - here, too, we need to make sure that our read operations will
not be reordered after the barrier. Otherwise, we might end up
reading memory that has already been changed by a remote processor
that is just about to acquire the lock.
>
>> +
>> +        hwlock->ops->unlock(hwlock);
>> +
>> +        /* Undo the spin_trylock{_irq, _irqsave} called while locking */
>> +        if (mode == HWLOCK_IRQSTATE)
>> +                spin_unlock_irqrestore(&hwlock->lock, *flags);
>> +        else if (mode == HWLOCK_IRQ)
>> +                spin_unlock_irq(&hwlock->lock);
>> +        else
>> +                spin_unlock(&hwlock->lock);
>> +
>> +        return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(__hwspin_unlock);
>> +
>> +        /* mark this hwspinlock as available */
>> +        tmp = radix_tree_tag_set(&hwspinlock_tree, hwlock->id,
>> +                                 HWSPINLOCK_UNUSED);
>> +
>> +        /* this implies an unrecoverable bug. at least rant */
>> +        WARN_ON(tmp != hwlock);
>
> I don't see how this could ever happen?
It can't. Unless there's a bug somewhere.
That's why:
1. It is a WARN_ON - I don't care if someone compiles this out
2. There is no error handler for this (simply because there can never
be an error handler for buggy code). It's just a rant, meant to catch
an unlikely buggy event.
The reason I did this is because I like to check return values. Even
if it's only for sanity's sake.
Since it's not a hot path this shouldn't matter to anyone, but if
people are still bothered by it, it can be removed.
>> +/**
>> + * hwspinlock_unregister() - unregister an hw spinlock
>> + * @id: index of the specific hwspinlock to unregister
>> + *
>> + * This function should be called from the underlying platform-specific
>> + * implementation, to unregister an existing (and unused) hwspinlock.
>> + *
>> + * Can be called from an atomic context (will not sleep) but not from
>> + * within interrupt context.
>> + *
>> + * Returns the address of hwspinlock @id on success, or NULL on failure
>
> Why not just return int for success / fail and have the driver keep track
> of the lock pointers too?
Because it's elegant. This way drivers don't need to keep track of the pointers.
It can be changed, with an extra cost of code (duplicated for each
implementation) and memory, but why would we want to do that ?
> Or, if you for some reason are attached to the ptr-return case, please make it
> return a standard ERR_PTR instead.
ERR_PTR patterns have been reasonably argued against in the previous
discussion, and I heartily agreed (see
http://www.mail-archive.com/[email protected]/msg37453.html).
>> +        /* look for an unused lock */
>> +        ret = radix_tree_gang_lookup_tag(&hwspinlock_tree, (void **)&hwlock,
>> +                                         0, 1, HWSPINLOCK_UNUSED);
>> +        if (ret == 0) {
>> +                pr_warn("a free hwspinlock is not available\n");
>> +                hwlock = NULL;
>> +                goto out;
>> +        }
>> +
>> +        /* sanity check: we shouldn't get more than we requested for */
>> +        WARN_ON(ret > 1);
>
> No need to check, unless you're debugging the radix code.
Again, this is me preferring to check return values for sanity's sake.
The one time that this may yield true (e.g. after an erroneous code
change) makes it well worth it.
It shouldn't bother anyone since it's not a hot path, but as I said
earlier, if people don't like it that much, it can be removed.
>
>> +        /* mark as used and power up */
>> +        ret = __hwspinlock_request(hwlock);
>> +        if (ret < 0)
>> +                hwlock = NULL;
>> +
>> +out:
>> +        spin_unlock(&hwspinlock_tree_lock);
>> +        return hwlock;
>> +}
>> +EXPORT_SYMBOL_GPL(hwspinlock_request);
>> +
>
> [...]
>
>> +/**
>> + * hwspinlock_free() - free a specific hwspinlock
>> + * @hwlock: the specific hwspinlock to free
>> + *
>> + * This function marks @hwlock as free again.
>> + * Should only be called with an @hwlock that was retrieved from
>> + * an earlier call to omap_hwspinlock_request{_specific}.
>> + *
>> + * Can be called from an atomic context (will not sleep) but not from
>> + * within interrupt context (simply because there is no use case for
>> + * that yet).
>> + *
>> + * Returns 0 on success, or an appropriate error code on failure
>> + */
>> +int hwspinlock_free(struct hwspinlock *hwlock)
>> +{
>> +        struct hwspinlock *tmp;
>> +        int ret;
>> +
>> +        if (!hwlock) {
>
> Some alloc/free APIs will just silently return for these cases.
True.
Some other free API will simply crash.
Does it bother so much if we stick to the safe side here ?
>
>> +                pr_err("invalid hwlock\n");
>> +                return -EINVAL;
>> +        }
>> +
>> +        spin_lock(&hwspinlock_tree_lock);
>> +
>> +        /* make sure the hwspinlock is used */
>> +        ret = radix_tree_tag_get(&hwspinlock_tree, hwlock->id,
>> +                                 HWSPINLOCK_UNUSED);
>> +        if (ret == 1) {
>> +                dev_err(hwlock->dev, "%s: hwlock is already free\n", __func__);
>
> Double free. WARN_ON() seems appropriate?
I don't want this message to ever get compiled out.
>
>> +                ret = -EINVAL;
>> +                goto out;
>> +        }
>> +
>> +        /* notify the underlying device that power is not needed */
>> +        ret = pm_runtime_put(hwlock->dev);
>> +        if (ret < 0)
>> +                goto out;
>> +
>> +        /* mark this hwspinlock as available */
>> +        tmp = radix_tree_tag_set(&hwspinlock_tree, hwlock->id,
>> +                                 HWSPINLOCK_UNUSED);
>> +
>> +        /* sanity check (this shouldn't happen) */
>> +        WARN_ON(tmp != hwlock);
>
> No need for this.
Please see explanations above why I added those sanity checks (despite
my explicit comment saying this shouldn't happen).
Thanks!
Ohad.
>
>> +
>> +        module_put(hwlock->owner);
>> +
>> +out:
>> +        spin_unlock(&hwspinlock_tree_lock);
>> +        return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(hwspinlock_free);
>> +
>> +MODULE_LICENSE("GPL v2");
>> +MODULE_DESCRIPTION("Generic hardware spinlock interface");
>> +MODULE_AUTHOR("Ohad Ben-Cohen <[email protected]>");
>
>
>
>
On Fri, Nov 26, 2010 at 10:53:10AM +0200, Ohad Ben-Cohen wrote:
> >> +int __hwspin_trylock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
> >> +{
> >> +        int ret;
> >> +
> >> +        if (unlikely(!hwlock)) {
> >> +                pr_err("invalid hwlock\n");
> >
> > These kinds of errors can get very spammy for buggy drivers.
>
> Yeah, but that's the purpose - I want to catch such egregious drivers
> who try to crash the kernel.
That can be better - because you get a backtrace, and it causes people
to report the problem rather than just ignore it. It may also prevent
the driver author from releasing their code (as it won't work in their
initial testing).
> > It's likely
> > more useful to do a WARN_ON(), and/or move them under a debug
> > config option.
>
> Why would you prefer to compile out reporting of such extremely buggy
> behavior ?
If it's "extremely buggy behaviour" then the drivers deserve to crash.
Such stuff should cause them not to get out the door. A simple printk
with an error return can just be ignored.
On Fri, Nov 26, 2010 at 11:18 AM, Russell King - ARM Linux
<[email protected]> wrote:
> On Fri, Nov 26, 2010 at 10:53:10AM +0200, Ohad Ben-Cohen wrote:
>> >> +int __hwspin_trylock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
>> >> +{
>> >> +        int ret;
>> >> +
>> >> +        if (unlikely(!hwlock)) {
>> >> +                pr_err("invalid hwlock\n");
>> >
>> > These kinds of errors can get very spammy for buggy drivers.
>>
>> Yeah, but that's the purpose - I want to catch such egregious drivers
>> who try to crash the kernel.
>
> That can be better - because you get a backtrace, and it causes people
> to report the problem rather than just ignore it. It may also prevent
> the driver author from releasing their code (as it won't work in their
> initial testing).
>
...
>
> If it's "extremely buggy behaviour" then the drivers deserve to crash.
> Such stuff should cause them not to get out the door. A simple printk
> with an error return can just be ignored.
I like this approach too, but recently we had a few privilege
escalation exploits which involved NULL dereference kernel bugs
(process context mapped address 0 despite a positive mmap_min_addr).
Since we can't rely on the oops to always happen, I decided not to
omit the NULL checks.
>
On Fri, Nov 26, 2010 at 12:16:39PM +0200, Ohad Ben-Cohen wrote:
> On Fri, Nov 26, 2010 at 11:18 AM, Russell King - ARM Linux
> <[email protected]> wrote:
> > On Fri, Nov 26, 2010 at 10:53:10AM +0200, Ohad Ben-Cohen wrote:
> >> >> +int __hwspin_trylock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
> >> >> +{
> >> >> +        int ret;
> >> >> +
> >> >> +        if (unlikely(!hwlock)) {
> >> >> +                pr_err("invalid hwlock\n");
> >> >
> >> > These kinds of errors can get very spammy for buggy drivers.
> >>
> >> Yeah, but that's the purpose - I want to catch such egregious drivers
> >> who try to crash the kernel.
> >
> > That can be better - because you get a backtrace, and it causes people
> > to report the problem rather than just ignore it. It may also prevent
> > the driver author from releasing their code (as it won't work in their
> > initial testing).
> >
> ...
> >
> > If it's "extremely buggy behaviour" then the drivers deserve to crash.
> > Such stuff should cause them not to get out the door. A simple printk
> > with an error return can just be ignored.
>
> I like this approach too, but recently we had a few privilege
> escalation exploits which involved NULL dereference kernel bugs
> (process context mapped address 0 despite a positive mmap_min_addr).
That's not a concern on ARM. The prevention of having a user page mapped
at virtual address 0 does not rely on mmap_min_addr - in fact, we can't
use this as it's tuneable to enforce this requirement.
It's highly illegal on ARM - as ARM CPUs without vector remap place the
hardware vectors at virtual address 0, and as such allowing the user to
map a page there will take the system down. So we have this code in the
mmap path:
#define arch_mmap_check(addr, len, flags) \
(((flags) & MAP_FIXED && (addr) < FIRST_USER_ADDRESS) ? -EINVAL : 0)
which prevents any attempt what so ever, irrespective of the mmap_min_addr
setting, to create a userspace induced mapping at address 0.
On Fri, Nov 26, 2010 at 12:18:49AM -0700, Grant Likely wrote:
> On Thu, Nov 25, 2010 at 9:59 PM, Olof Johansson <[email protected]> wrote:
> > On Tue, Nov 23, 2010 at 05:38:57PM +0200, Ohad Ben-Cohen wrote:
> >> +#define pr_fmt(fmt)    "%s: " fmt, __func__
> >
> > Not used.
>
> pr_fmt() is a magic #define that changes the behaviour of the pr_*()
> macros. See include/linux/kernel.h
D'oh, thanks.
-Olof
On Fri, Nov 26, 2010 at 12:45 PM, Russell King - ARM Linux
<[email protected]> wrote:
> On Fri, Nov 26, 2010 at 12:16:39PM +0200, Ohad Ben-Cohen wrote:
>> On Fri, Nov 26, 2010 at 11:18 AM, Russell King - ARM Linux
>> <[email protected]> wrote:
>> > On Fri, Nov 26, 2010 at 10:53:10AM +0200, Ohad Ben-Cohen wrote:
>> >> >> +int __hwspin_trylock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
>> >> >> +{
>> >> >> +        int ret;
>> >> >> +
>> >> >> +        if (unlikely(!hwlock)) {
>> >> >> +                pr_err("invalid hwlock\n");
>> >> >
>> >> > These kinds of errors can get very spammy for buggy drivers.
>> >>
>> >> Yeah, but that's the purpose - I want to catch such egregious drivers
>> >> who try to crash the kernel.
>> >
>> > That can be better - because you get a backtrace, and it causes people
>> > to report the problem rather than just ignore it. It may also prevent
>> > the driver author from releasing their code (as it won't work in their
>> > initial testing).
>> >
>> ...
>> >
>> > If it's "extremely buggy behaviour" then the drivers deserve to crash.
>> > Such stuff should cause them not to get out the door. A simple printk
>> > with an error return can just be ignored.
>>
>> I like this approach too, but recently we had a few privilege
>> escalation exploits which involved NULL dereference kernel bugs
>> (process context mapped address 0 despite a positive mmap_min_addr).
>
> That's not a concern on ARM.
Good point, thanks. The problem is that we can't rely on that in a
generic interface.
I remember a recent discussion where you suggested having a
conditional check that would be compiled out on architectures where we
can rely on a NULL dereference to produce an oops; something like that
could be handy here.
But then there's the other (quite reasonable) claim that says we
shouldn't crash the machine because of a non-fatal bug: if a crappy
driver messes up, the user (not the developer) will most probably
prefer the machine to keep running with degraded functionality rather
than reboot.
... which gets us back to pr_err.
> at virtual address 0 does not rely on mmap_min_addr - in fact, we can't
> use this as it's tuneable to enforce this requirement.
>
> It's highly illegal on ARM - as ARM CPUs without vector remap place the
> hardware vectors at virtual address 0, and as such allowing the user to
> map a page there will take the system down. So we have this code in the
> mmap path:
>
> #define arch_mmap_check(addr, len, flags) \
>        (((flags) & MAP_FIXED && (addr) < FIRST_USER_ADDRESS) ? -EINVAL : 0)
>
> which prevents any attempt what so ever, irrespective of the mmap_min_addr
> setting, to create a userspace induced mapping at address 0.
>
Hi,
On Fri, Nov 26, 2010 at 10:53:10AM +0200, Ohad Ben-Cohen wrote:
> On Fri, Nov 26, 2010 at 6:59 AM, Olof Johansson <[email protected]> wrote:
> >> +#define pr_fmt(fmt)    "%s: " fmt, __func__
> >
> > Not used.
>
> Yes, it is, check out how the pr_* macros are implemented:
Yep, my bad.
> >> +int __hwspin_trylock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
> >> +{
> >> +        int ret;
> >> +
> >> +        if (unlikely(!hwlock)) {
> >> +                pr_err("invalid hwlock\n");
> >
> > These kinds of errors can get very spammy for buggy drivers.
>
> Yeah, but that's the purpose - I want to catch such egregious drivers
> who try to crash the kernel.
You're not really catching the driver, since nothing points at what
caused this to be printed in syslog. That's why WARN_ON/__WARN has a benefit.
> > It's likely
> > more useful to do a WARN_ON(), and/or move them under a debug
> > config option.
>
> Why would you prefer to compile out reporting of such extremely buggy behavior ?
It's not about compiling out. WARN_ON gives a stack trace of the caller
that broke the calling conventions, which makes it much easier to
pinpoint in a bug report than "I have 300 entries of "invalid hwlock"
in my syslog".
Most of the cases this will happen in are during development though,
where it's more helpful to get the stack backtrace.
Anyway, above plus a __WARN() would be OK with me. It should probably be
rate limited though, to avoid completely wedging a system with debug
printouts if it happens frequently, which locking can do.
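(A sketch of that combination -- using printk_ratelimit() for the
rate limiting is just one option:)

        if (unlikely(!hwlock)) {
                /* backtrace for the offender, but not a flood of them */
                if (printk_ratelimit())
                        __WARN();
                return -EINVAL;
        }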
> >> +        /*
> >> +         * We can be sure the other core's memory operations
> >> +         * are observable to us only _after_ we successfully take
> >> +         * the hwspinlock, so we must make sure that subsequent memory
> >> +         * operations will not be reordered before we actually took the
> >> +         * hwspinlock.
> >> +         *
> >> +         * Note: the implicit memory barrier of the spinlock above is too
> >> +         * early, so we need this additional explicit memory barrier.
> >> +         */
> >> +        mb();
> >
> > rmb() should be sufficient here.
>
> It's not.
>
> We need to make sure that our writes, too, will not be reordered
> before that barrier. Otherwise, we might end up changing protected
> shared memory during the critical section of the remote processor
> (before we acquire the lock we must not read from, or write to, the
> protected shared memory).
>
> I guess I need to add a comment about this.
How is the shared memory mapped? It should be mapped such that speculative
writes shouldn't be happening, and presumably such that hardware-triggered
prefetches won't happen either. For those cases a DMB should be sufficient
on ARM; a DSB shouldn't be needed?
Maybe it would make more sense to move the barriers down into the
implementation-specific layer, where the appropriate low-level barriers
can be used instead of the generic rmb()/wmb()/mb() ones that tend to
be heavy-handed where implementations don't match generic requirements.
> > wmb() should be sufficient here.
>
> No - here, too, we need to make sure that also our read operations
> will not be reordered after the barrier. Otherwise, we might end up
> reading memory that has already been changed by a remote processor
> that is just about to acquire the lock.
You're right, I guess I haven't worked enough with completely
out-of-order systems, since I'm used to wmb() ordering reads as well.
Still, could be useful to push down to hardware-specific layers and use
just the specific barrier needed.
> >> +        /* this implies an unrecoverable bug. at least rant */
> >> +        WARN_ON(tmp != hwlock);
> >
> > I don't see how this could ever happen?
>
> It can't. Unless there's a bug somewhere.
If it's an unrecoverable bug, then you should panic(). Or BUG_ON(),
which can also be compiled out.
> That's why:
> 1. It is a WARN_ON - I don't care if someone compiles this out
> 2. There is no error handler for this (simply because there can never
> be an error handler for buggy code). It's just a rant, meant to catch
> an unlikely buggy event.
>
> The reason I did this is because I like to check return values. Even
> if it's only for sanity's sake.
>
> Since it's not a hot path this shouldn't matter to anyone, but if
> people are still bothered by it, it can be removed.
In general, the kernel is low on asserts for catching bugs in used services.
Anyway, I'm not going to argue more over this one (and the others like
it). It just seemed redundant and other subsystems tend to rely on
low-level services working correctly. :)
> >> +/**
> >> + * hwspinlock_unregister() - unregister an hw spinlock
> >> + * @id: index of the specific hwspinlock to unregister
> >> + *
> >> + * This function should be called from the underlying platform-specific
> >> + * implementation, to unregister an existing (and unused) hwspinlock.
> >> + *
> >> + * Can be called from an atomic context (will not sleep) but not from
> >> + * within interrupt context.
> >> + *
> >> + * Returns the address of hwspinlock @id on success, or NULL on failure
> >
> > Why not just return int for success / fail and have the driver keep track
> > of the lock pointers too?
>
> Because it's elegant. This way drivers don't need to keep track of the pointers.
>
> It can be changed, with an extra cost of code (duplicated for each
> implementation) and memory, but why would we want to do that ?
"Functions should be short and sweet, and do just one thing"
It is now a hwspinlock_unregister_and_return(). Making a function to
look up id->lockptr (and possibly passing that to the unregister function)
would take care of that, if you don't want to track the lock pointers in the
low-level driver.
> > Or, if you for some reason are attached to the ptr-return case, please make it
> > return a standard ERR_PTR instead.
>
> ERR_PTR patterns have been reasonably argued against in the previous
> discussion, and I heartily agreed (see
> http://www.mail-archive.com/[email protected]/msg37453.html).
Fair enough.
> >> +        /* look for an unused lock */
> >> +        ret = radix_tree_gang_lookup_tag(&hwspinlock_tree, (void **)&hwlock,
> >> +                                         0, 1, HWSPINLOCK_UNUSED);
> >> +        if (ret == 0) {
> >> +                pr_warn("a free hwspinlock is not available\n");
> >> +                hwlock = NULL;
> >> +                goto out;
> >> +        }
> >> +
> >> +        /* sanity check: we shouldn't get more than we requested for */
> >> +        WARN_ON(ret > 1);
> >
> > No need to check, unless you're debugging the radix code.
>
> Again, this is me preferring to check return values for sanity's sake.
>
> The one time that this may yield true (e.g. after an erroneous code
> change) makes it well worth it.
>
> It shouldn't bother anyone since it's not a hot path, but as I said
> earlier, if people don't like it that much, it can be removed.
>
> >
> >> +        /* mark as used and power up */
> >> +        ret = __hwspinlock_request(hwlock);
> >> +        if (ret < 0)
> >> +                hwlock = NULL;
> >> +
> >> +out:
> >> +        spin_unlock(&hwspinlock_tree_lock);
> >> +        return hwlock;
> >> +}
> >> +EXPORT_SYMBOL_GPL(hwspinlock_request);
> >> +
> >
> > [...]
> >
> >> +/**
> >> + * hwspinlock_free() - free a specific hwspinlock
> >> + * @hwlock: the specific hwspinlock to free
> >> + *
> >> + * This function marks @hwlock as free again.
> >> + * Should only be called with an @hwlock that was retrieved from
> >> + * an earlier call to omap_hwspinlock_request{_specific}.
> >> + *
> >> + * Can be called from an atomic context (will not sleep) but not from
> >> + * within interrupt context (simply because there is no use case for
> >> + * that yet).
> >> + *
> >> + * Returns 0 on success, or an appropriate error code on failure
> >> + */
> >> +int hwspinlock_free(struct hwspinlock *hwlock)
> >> +{
> >> +        struct hwspinlock *tmp;
> >> +        int ret;
> >> +
> >> +        if (!hwlock) {
> >
> > Some alloc/free APIs will just silently return for these cases.
>
> True.
>
> Some other free API will simply crash.
>
> Does it bother so much if we stick to the safe side here ?
The main benefit of doing a silent return and allowing a free of a NULL
pointer is that it makes cleanup of failed init/setup a bit simpler:
you can just iterate over the potentially created hwlocks and free them
even if they haven't been allocated (as long as the pointers initialize
as NULL).
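(I.e. -- a sketch of the error-path pattern a NULL-tolerant free would
enable; NUM_LOCKS and the locks[] array are hypothetical:)

        struct hwspinlock *locks[NUM_LOCKS] = { NULL };
        int i;

        /* error path: safe even for slots that were never allocated,
         * provided hwspinlock_free() silently ignored NULL */
        for (i = 0; i < NUM_LOCKS; i++)
                hwspinlock_free(locks[i]);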
> >> +                pr_err("invalid hwlock\n");
> >> +                return -EINVAL;
> >> +        }
> >> +
> >> +        spin_lock(&hwspinlock_tree_lock);
> >> +
> >> +        /* make sure the hwspinlock is used */
> >> +        ret = radix_tree_tag_get(&hwspinlock_tree, hwlock->id,
> >> +                                 HWSPINLOCK_UNUSED);
> >> +        if (ret == 1) {
> >> +                dev_err(hwlock->dev, "%s: hwlock is already free\n", __func__);
> >
> > Double free. WARN_ON() seems appropriate?
>
> I don't want this message to ever get compiled out.
Again adding a __WARN() would make both of us happy. :)
> >> +                ret = -EINVAL;
> >> +                goto out;
> >> +        }
> >> +
> >> +        /* notify the underlying device that power is not needed */
> >> +        ret = pm_runtime_put(hwlock->dev);
> >> +        if (ret < 0)
> >> +                goto out;
> >> +
> >> +        /* mark this hwspinlock as available */
> >> +        tmp = radix_tree_tag_set(&hwspinlock_tree, hwlock->id,
> >> +                                 HWSPINLOCK_UNUSED);
> >> +
> >> +        /* sanity check (this shouldn't happen) */
> >> +        WARN_ON(tmp != hwlock);
> >
> > No need for this.
>
> Please see explanations above why I added those sanity checks (despite
> my explicit comment saying this shouldn't happen).
See above replies.
-Olof
On Sat, Nov 27, 2010 at 12:18:55AM +0200, Ohad Ben-Cohen wrote:
> But then there's the other (quite reasonable) claim that says we
> shouldn't crash the machine because of a non-fatal bug: if a crappy
> driver messes up, the user (not the developer) will most probably
> prefer the machine to keep running with degraded functionality rather
> than reboot.
There's also the quite reasonable expectation that we shouldn't corrupt
user data. With locking interfaces, if someone abuses them and they
fail to work, then the risk is data corruption due to races. The safe
thing in that case is to panic - terminate that thread before it does
anything unsafe, thereby preventing data corruption.
Yes, it may mean that something becomes unavailable, but that's better
than corrupting data.
Take a look at the kernel's own spinlock implementation. Do we do lots
of checks in there for things like someone passing a NULL pointer to
the spinlock, or do we get an oops instead?
Also look at the list implementation. Do we check for NULL pointers
there, or do we get an oops instead?
Same for mutex. The same goes for lots of other infrastructure interfaces.
On Fri, 2010-11-26 at 09:34 +0200, Ohad Ben-Cohen wrote:
> On Thu, Nov 25, 2010 at 10:22 PM, David Brownell <[email protected]> wrote:
> > So there's no strong reason to think this is
> > actually "generic". Function names no longer
> > specify OMAP, but without other hardware under
> > the interface, calling it "generic" reflects
> > more optimism than reality. (That was the
> > implication of my observations...)
>
> Well, it's not omap-specific anymore.
You haven't (and evidently can't) show non-OMAP hardware under your
calls, though ... so in a practical sense, it's still OMAP-specific code
(since nothing else is working). (And for that matter, what non-OMAP
code should try using these locks??)
Your intent "generic" is fine, but you've not achieved it and thus I
think you shouldn't imply that you have. Dropping the word "generic"
should suffice; it _is_ a framework, and maybe the next person working
with hardware spinlocks can finish generalizing (and add use cases).
> > To find other hardware spinlocks, you might be
> > able to look at fault tolerant multiprocessors.
(For much the same reasons as the various ASMP chips care
about HW spinlocks: SMP can use pure software spinlocks, but when
there are special hardware (or system) circumstances, they may not
be sufficiently robust or reliable. Consider just the impact of
different memory and caching models, ARM vs DSP in the OMAP case.)
Non-Academic specs on fault tolerant computers may be hard to
come by, unfortunately ... They're very specialized and often
have lots of funky proprietary logic that vendors don't want
reverse-engineered. Hardware voting is just the start. The
software to make the fault tolerance robust/reliable gets to
be very tricky ... and without it, why bother with expensive
hardware mechanisms.
The same issues come up with aerospace and some industrial
systems, where reliability affects mission-readiness and, for
industrial apps, safety.
> > Ages ago I worked with one of those, where any
> > spinlock failures integrated with CPU/OS fault
> > detection; HW would yank (checkpointed) CPU boards
> > off the bus so they could be recovered (some
> > might continue later from checkpoints, etc.)...
>
> Is that HW supported by Linux today ?
Not mainline, and unlikely in any branch. Fault tolerant
operating systems can't be as simple as Linux, and I think
that trying to add fault tolerance to it would not only turn it
into a very different animal, but would lose most developers.
(The mantra I recall was "No Single Point Failures". Linux
has lots of such failure modes, and removing them would be a
multi-year effort, even just inside the kernel. How would you
recover from a bus failure? Fan failure? Power supply death?
All such hardware must be duplicated, with recovery supported
by both hardware and software ... where "recover" includes
"keep running without losing data or other computations".)
(Last I knew, Linux didn't even have much support for checkpoint
and restore of kernel state ... hibernation is related, but
seems to be constantly in flux. Don't think it's aiming to
tolerate CPU failures after a hibernation checkpoint either.)
(Heck ... on my Ubuntu, "Network Manager" isn't even competent to
switch over cleanly from Ethernet to WLAN, and don't get me started
about other ways it's broken. LOTS of stupid fault handling, and
that's arguably mission-critical for the whole system ... multiple
single point failure modes. That's FAR from fault-tolerant.)
> Any chance you can share a link or any other info about it ?
I googled for "sequoia systems fault tolerance" and found some stuff
that looked like it summarized some of the original hardware.
I can't think, off the top of my head, of other kinds of systems that
need and use hardware spinlocks, but likely they do exist. Mainframe
tech might use them too, as part of the subset of fault-tolerant HW
tech they build on. If you want to provide a "generic" framework you
should find and support some (or Tom-Sawyer such support... :)
>
> >
> > I seem to recall some iterations of the real-time patches doing a lot of
> > work to generalize spinlocks, since they needed multiple variants. It
> > might be worth following in those footsteps. (Though I'm not sure they
> > were thinking much about hardware support.)
>
> Any chance you can point me at a specific discussion or patchset that
> you feel may be relevant ?
Haven't looked at RT in a long time. Just look at the current RT
patchset to see if it still has several spinlock variants. ISTR the
"raw" spinlock stuff came from there not long ago. Much compile-time
magic was involved in switching between variants.
- Dave
On Sat, Nov 27, 2010 at 12:53 AM, Russell King - ARM Linux
<[email protected]> wrote:
> On Sat, Nov 27, 2010 at 12:18:55AM +0200, Ohad Ben-Cohen wrote:
>> But then there's the other (quite reasonable) claim that says we
>> shouldn't crash the machine because of a non-fatal bug: if a crappy
>> driver messes up, the user (not the developer) will most probably
>> prefer the machine to keep running with degraded functionality rather
>> than reboot.
>
> There's also the quite reasonable expectation that we shouldn't corrupt
> user data. With locking interfaces, if someone abuses them and they
> fail to work, then the risk is data corruption due to races. The safe
> thing in that case is to panic - terminate that thread before it does
> anything unsafe, thereby preventing data corruption.
Makes sense.
I considered hwspinlock as a peripheral which doesn't qualify to take
down the system on failure, but in general I agree that there indeed
might be critical user data involved. Especially if this is a
framework and not just another driver.
But as a framework that can be used on non-ARM architectures as well,
I do prefer to check for NULL pointers and not rely on the Oops.
If we had a macro that would be compiled out on architectures that
reliably produce an Oops on NULL dereference, but would otherwise
BUG_ON a NULL pointer, that should satisfy everyone.
The BUG_ON_MAPPABLE_NULL() macro below should achieve exactly that;
please tell me what you think, thanks!
arch/arm/include/asm/mman.h | 1 +
include/asm-generic/bug.h | 40 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 41 insertions(+), 0 deletions(-)
diff --git a/arch/arm/include/asm/mman.h b/arch/arm/include/asm/mman.h
index 41f99c5..d4ee003 100644
--- a/arch/arm/include/asm/mman.h
+++ b/arch/arm/include/asm/mman.h
@@ -1,4 +1,5 @@
#include <asm-generic/mman.h>
+#include <asm/pgtable.h>
#define arch_mmap_check(addr, len, flags) \
(((flags) & MAP_FIXED && (addr) < FIRST_USER_ADDRESS) ? -EINVAL : 0)
diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index c2c9ba0..d211d9c 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -2,6 +2,11 @@
#define _ASM_GENERIC_BUG_H
#include <linux/compiler.h>
+#include <asm/mman.h>
+
+#ifndef arch_mmap_check
+#define arch_mmap_check(addr, len, flags) (0)
+#endif
#ifdef CONFIG_BUG
@@ -53,6 +58,41 @@ struct bug_entry {
#define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while(0)
#endif
+/**
+ * BUG_ON_MAPPABLE_NULL() - panic on NULL only if address 0 is mappable
+ * @addr: address to check
+ *
+ * In general, NULL dereference Oopses are not desirable, since they take down
+ * the system with them and make the user extremely unhappy. So as a general
+ * rule kernel code should avoid dereferencing NULL pointers by doing a
+ * simple check (when appropriate), and if needed, continue operating
+ * with reduced functionality rather than crash.
+ *
+ * _Critical_ kernel code, OTOH, that should not (/cannot) keep running when
+ * given an unexpected NULL pointer, should just crash. On some architectures,
+ * a NULL dereference will always reliably produce an Oops. On others, where
+ * the zero address can be mmapped, an Oops is not guaranteed. Relying on
+ * NULL dereference Oopses to happen on these architectures might lead to
+ * data corruptions (system will keep running despite a critical bug and
+ * the results will be horribly undefined). In addition, these situations
+ * can also have security implications - we have seen several privilege
+ * escalation exploits with which an attacker gained full control over the
+ * system due to NULL dereference bugs.
+ *
+ * This macro will BUG_ON if @addr is NULL on architectures where the zero
+ * addresses can be mapped. On other architectures, where the zero address
+ * can never be mapped, this macro is compiled out, so the system will just
+ * Oops when the @addr is dereferenced.
+ *
+ * As with BUG_ON, use this macro only if a NULL @addr cannot be tolerated.
+ * If proceeding with degraded functionality is an option, it's much
+ * better to simply check for the NULL and return, instead of crashing
+ * the system.
+ */
+#define BUG_ON_MAPPABLE_NULL(addr) do { \
+ if (arch_mmap_check(0, 1, MAP_FIXED) == 0) BUG_ON(!addr); \
+} while(0)
+
/*
* WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
* significant issues that need prompt attention if they should ever
--
1.7.0.4
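(Usage in the hwspinlock entry points would then look something like
this -- a sketch:)

        /* crash early only where a NULL oops isn't guaranteed anyway */
        BUG_ON_MAPPABLE_NULL(hwlock);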
On Sat, Nov 27, 2010 at 3:24 AM, David Brownell <[email protected]> wrote:
> Your intent "generic" is fine, but you've not achieved it and thus I
> think you shouldn't imply that you have. ? Dropping the word "generic"
> should ?suffice; it _is_ a framework, and maybe the next person working
> with hardware spinlocks can finish generalizing (and add use cases).
Sure, I can drop the word "generic".
> Haven't looked at RT in a long time. Just look at the current RT
> patchset to see if it still has several spinlock variants. ISTR the
> "raw" spinlock stuff came from there not long ago. Much compile-time
> magic was involved in switching between variants.
I'll take a look there and see if I catch anything relevant.
Thanks,
Ohad.
>
> - Dave
Hi Olof,
On Sat, Nov 27, 2010 at 12:51 AM, Olof Johansson <[email protected]> wrote:
>> >> +int __hwspin_trylock(struct hwspinlock *hwlock, int mode, unsigned long *flags)
>> >> +{
>> >> +        int ret;
>> >> +
>> >> +        if (unlikely(!hwlock)) {
>> >> +                pr_err("invalid hwlock\n");
>> >
>> > These kinds of errors can get very spammy for buggy drivers.
>>
>> Yeah, but that's the purpose - I want to catch such egregious drivers
>> who try to crash the kernel.
...
> Anyway, above plus a __WARN() would be OK with me.
Np, but let's wait a bit to see if something like the
BUG_ON_MAPPABLE_NULL macro materializes or not. That should satisfy
everyone in a generic way.
>> >> +        /*
>> >> +         * We can be sure the other core's memory operations
>> >> +         * are observable to us only _after_ we successfully take
>> >> +         * the hwspinlock, so we must make sure that subsequent memory
>> >> +         * operations will not be reordered before we actually took the
>> >> +         * hwspinlock.
>> >> +         *
>> >> +         * Note: the implicit memory barrier of the spinlock above is too
>> >> +         * early, so we need this additional explicit memory barrier.
>> >> +         */
>> >> +        mb();
>> >
>> > rmb() should be sufficient here.
>>
>> It's not.
>>
>> We need to make sure that our writes, too, will not be reordered
>> before that barrier. Otherwise, we might end up changing protected
>> shared memory during the critical section of the remote processor
>> (before we acquire the lock we must not read from, or write to, the
>> protected shared memory).
>>
>> I guess I need to add a comment about this.
>
> How is the shared memory mapped? It should be mapped such that speculative
> writes shouldn't be happening, and presumably such that hardware-triggered
> prefetches won't happen either.
That's up to the users of hwspinlock to take care of; we shouldn't
worry about it here.
> For those cases a DMB should be sufficient
> on ARM, a DSB shouldn't be needed?
>
> Maybe it would make more sense to move the barriers down into the
> implementation-specific layer, where the appropriate low-level barriers
> can be used instead of the generic rmb()/wmb()/mb() ones that tend to
> be heavy-handed where implementations don't match generic requirements.
...
> Still, could be useful to push down to hardware-specific layers and use
> just the specific barrier needed.
But that's what mb/wmb/rmb are for - we use mb because we care about both
read and write reordering - and then it's up to platform-specific code
to implement it correctly. We shouldn't worry about the
platform-specific implementation of mb/wmb/rmb in drivers.
> >> >> +        /* this implies an unrecoverable bug. at least rant */
> >> >> +        WARN_ON(tmp != hwlock);
>> >
>> > I don't see how this could ever happen?
>>
>> It can't. Unless there's a bug somewhere.
>
> If it's an unrecoverable bug, then you should panic(). Or BUG_ON(),
> which can also be compiled out.
Sure.
> Anyway, I'm not going to argue more over this one (and the others like
> it). It just seemed redundant and other subsystems tend to rely on
> low-level services working correctly. :)
It's not entirely about checking other subsystems: in a way, one can
also look at it as sanity checks of the hwspinlock framework itself.
For example, when we fetch an hwlock object, and then make sure the id
of the hwlock really matches the id we looked for, we get a sanity-level
confidence that our data isn't messed up, and that the radix tree was
pretty much used correctly so far.
If, for some reason, that check fails, it doesn't necessarily
mean the radix code is faulty: it just means something went really
bad, and for some reason the incorrect object is stored with the id we
looked for. It can also be hwspinlock's fault.
It's important to detect such an anomaly as early as possible,
otherwise the wrong lock is going to be used, user data might be
corrupted, and in general, it's going to be veeeery hard to debug
this.
So I consider this a low-cost check with very high value, and if it
would be triggered even once in the lifetime of hwspinlock, it's well
worth it.
>
>> >> +/**
>> >> + * hwspinlock_unregister() - unregister an hw spinlock
>> >> + * @id: index of the specific hwspinlock to unregister
>> >> + *
>> >> + * This function should be called from the underlying platform-specific
>> >> + * implementation, to unregister an existing (and unused) hwspinlock.
>> >> + *
>> >> + * Can be called from an atomic context (will not sleep) but not from
>> >> + * within interrupt context.
>> >> + *
>> >> + * Returns the address of hwspinlock @id on success, or NULL on failure
>> >
>> > Why not just return int for success / fail and have the driver keep track
>> > of the lock pointers too?
>>
>> Because it's elegant. This way drivers don't need to keep track of the pointers.
>>
>> It can be changed, with an extra cost of code (duplicated for each
>> implementation) and memory, but why would we want to do that ?
>
> "Functions should be short and sweet, and do just one thing"
Do you consider radix_tree_delete() inappropriate then ?
>
> It is now a hwspinlock_unregister_and_return(). Making a function to
> lookup id->lockptr (and possibly passing that to the unregister function)
> would take care of that
Such a function is redundant - this functionality is already
provided by radix_tree_delete, and it'd be a pity not to use it. One
doesn't have to use the return value, but it really helps
simplify the users at no additional cost.
>, if you don't want to track the lock pointers in the
> lowlevel driver.
>
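(Concretely -- a sketch of what the pointer-returning unregister
amounts to; the real function would also verify the lock is unused:)

        struct hwspinlock *hwspinlock_unregister(unsigned int id)
        {
                struct hwspinlock *hwlock;

                spin_lock(&hwspinlock_tree_lock);
                /* radix_tree_delete() already hands back the item that
                 * was stored under id (or NULL), so returning it to
                 * the caller costs nothing */
                hwlock = radix_tree_delete(&hwspinlock_tree, id);
                spin_unlock(&hwspinlock_tree_lock);

                return hwlock;
        }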
>> >> +int hwspinlock_free(struct hwspinlock *hwlock)
>> >> +{
>> >> +        struct hwspinlock *tmp;
>> >> +        int ret;
>> >> +
>> >> +        if (!hwlock) {
>> >
>> > Some alloc/free APIs will just silently return for these cases.
>>
>> True.
>>
>> Some other free API will simply crash.
>>
>> Does it bother so much if we stick to the safe side here ?
>
> The main benefit of doing a silent return and allowing a free of a NULL
> pointer is that it makes cleanup of failed init/setup a bit simpler:
> you can just iterate over the potentially created hwlocks and free them
> even if they haven't been allocated (as long as the pointers initialize
> as NULL).
It feels to me like encouraging sloppy coding, especially since we're
talking about hardware resources here, but maybe I'm overlooking a
real use case...
>> >> +                pr_err("invalid hwlock\n");
>> >> +                return -EINVAL;
>> >> +        }
>> >> +
>> >> +        spin_lock(&hwspinlock_tree_lock);
>> >> +
>> >> +        /* make sure the hwspinlock is used */
>> >> +        ret = radix_tree_tag_get(&hwspinlock_tree, hwlock->id,
>> >> +                                 HWSPINLOCK_UNUSED);
>> >> +        if (ret == 1) {
>> >> +                dev_err(hwlock->dev, "%s: hwlock is already free\n", __func__);
>> >
>> > Double free. WARN_ON() seems appropriate?
>>
>> I don't want this message to ever get compiled out.
>
> Again adding a __WARN() would make both of us happy. :)
Sure.
Thanks,
Ohad.
* Ohad Ben-Cohen <[email protected]> [101123 07:27]:
> Add a common, platform-independent, hwspinlock framework.
>
> Hardware spinlock devices are needed, e.g., in order to access data
> that is shared between remote processors, that otherwise have no
> alternative mechanism to accomplish synchronization and mutual exclusion
> operations.
<snip>
> + int hwspin_lock(struct hwspinlock *hwlock);
> + - lock a previously assigned hwspinlock. If the hwspinlock is already
> + taken, the function will busy loop waiting for it to be released.
> + Note: if a faulty remote core never releases this lock, this function
> + will deadlock.
> + This function will fail only if hwlock is invalid. Otherwise, it will
> + always succeed (or deadlock; see above) and it will never sleep.
> + Upon a successful return from this function, preemption is disabled so
> + the caller must not sleep, and is advised to release the hwspinlock as
> + soon as possible, in order to minimize remote cores polling on the
> + hardware interconnect.
...
> + int hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned long timeout);
> + - lock a previously-assigned hwspinlock with a timeout limit (specified in
> + jiffies). If the hwspinlock is already taken, the function will busy loop
> + waiting for it to be released, but give up when the timeout meets jiffies.
> + If timeout is 0, the function will never give up (therefore if a faulty
> + remote core never releases the hwspinlock, it will deadlock).
> + Upon a successful return from this function, preemption is disabled so
> + the caller must not sleep, and is advised to release the hwspinlock as
> + soon as possible, in order to minimize remote cores polling on the
> + hardware interconnect.
> + Returns 0 when successful and an appropriate error code otherwise (most
> + notably -ETIMEDOUT if the hwspinlock is still busy after timeout meets
> + jiffies). The function will never sleep.
Do we even need the hwspin_lock variants? Why can't we always use the
hwspin_lock_timeout variants?
To me, busy-looping indefinitely, waiting for some external system to
release a lock, does not seem like a good idea...
Regards,
Tony
On Tue, Nov 30, 2010 at 11:00 AM, Tony Lindgren <[email protected]> wrote:
> Do we even need the hwspin_lock variants?
I personally don't have any specific use case in mind.
It's just a simple wrapper over the _timeout variants, provided for
API completeness, but -
> Why can't we always use the hwspin_lock_timeout variants?
We can. I can just remove the _lock variants.
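(Callers would then always write something like the sketch below,
assuming the timeout is a relative value in jiffies, as the
documentation above suggests:)

        /* give up after ~100ms instead of spinning forever */
        ret = hwspin_lock_timeout(hwlock, msecs_to_jiffies(100));
        if (ret == -ETIMEDOUT) {
                /* remote core seems stuck; report and recover */
                return ret;
        }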
* Ohad Ben-Cohen <[email protected]> [101130 14:10]:
> On Tue, Nov 30, 2010 at 11:00 AM, Tony Lindgren <[email protected]> wrote:
> > Do we even need the hwspin_lock variants?
>
> I personally don't have any specific use case in mind.
>
> It's just a simple wrapper over the _timeout variants, provided for
> API completeness, but -
>
> > Why can't we always use the hwspin_lock_timeout variants?
>
> We can. I can just remove the _lock variants.
OK sounds good to me.
Tony