Changes in v8:
- Bounds check for cmd_cache
- Describe interrupts to other DRVs in DT bindings
- Rebase on top of 4.17-rc3
Changes in v7:
- Rename 'm' and 'n' and use tcs_id and cmd_id instead
- Bug fix in find_match() and other review comments from Matthias
- Spinlock around get_rpmh_ctrlr()
- DT documentation example fixes
- Rebase on top of 4.16-rc2
Changes in v6:
- Remove tasklet in rpmh-rsc.c
- Remove rpmh_client and use struct device * instead
- Variable changes and bug fixes
- DT binding changes to describe all DRVs in the RSC
- Documentation and comment fixes
Changes in v5:
- Add Reviewed-by tags
- Rebase on top of 4.16
Changes in v4:
- Rename variables as suggested by Stephen and Evan
- Lot of minor syntax and style fixes
- Fix FTRACE compilation error
- Improve doc comments and DT description
Changes in v3:
- Address Steven's comments in FTRACE
- Fix DT documentation as suggested by Rob H
- Fix error handling in IRQ handler as suggested by Evan
- Remove locks in rpmh_flush()
- Improve comments
Changes in v2:
- Added sleep/wake, async and batch requests support
- Addressed Bjorn's comments
- Private FTRACE for drivers/soc/qcom as suggested by Steven
- Sparse checked on these patches
- Use SPDX license commenting sytle
This set of patches add the ability for platform drivers to make use of shared
resources in newer Qualcomm SoCs like SDM845. Resources that are shared between
multiple processors in a SoC are generally controlled by a dedicated remote
processor. The remote processor (Resource Power Manager or RPM in previous QCOM
SoCs) receives requests for resource state from other processors using the
shared resource, aggregates the request and applies the result on the shared
resource. SDM845 advances this concept and uses h/w (hardened I/P) blocks for
aggregating requests and applying the result on the resource. The resources
could be clocks, regulators or bandwidth requests for buses. This new
architecture is called RPM-hardened or RPMH in short.
Since this communication mechanism is completely hardware driven without a
processor intervention on the remote end, existing mechanisms like RPM-SMD are
no longer useful. Also, there is no serialization of data or is data is written
to a shared memory in this new format. The data used is different, unsigned 32
bits are used for representing an address, data and header. Each resource's
property is a unique u32 address and have pre-defined set of property specific
valid values. A request that comprises of <header, addr, data> is sent by
writing to a set of registers from Linux and transmitted to the remote slave
through an internal bus. The remote end aggregates this request along with
requests from other processors for the <addr> and applies the result.
The hardware block that houses this functionality is called Resource State
Coordinator or RSC. Inside the RSC are set of slots for sending RPMH requests
called Trigger Commands Sets (TCS). The set of patches are for writing the
requests into these TCSes and sending them to hardened IP blocks.
The driver design is split into two components. The RSC driver housed in
rpmh-rsc.c and the set of library functions in rpmh.c that frame the request and
transmit it using the controller. This first set of patches allow a simple
synchronous request to be made by the platform drivers. Future patches will add
more functionality that cater to complex drivers and use cases.
Please consider reviewing this patchset.
v1: https://www.spinics.net/lists/devicetree/msg210980.html
v2: https://lkml.org/lkml/2018/2/15/852
v3: https://lkml.org/lkml/2018/3/2/801
v4: https://lkml.org/lkml/2018/3/9/979
v5: https://lkml.org/lkml/2018/4/5/480
v6: https://lkml.org/lkml/2018/4/19/914
v7: https://lkml.org/lkml/2018/5/2/779
Lina Iyer (10):
drivers: qcom: rpmh-rsc: add RPMH controller for QCOM SoCs
dt-bindings: introduce RPMH RSC bindings for Qualcomm SoCs
drivers: qcom: rpmh-rsc: log RPMH requests in FTRACE
drivers: qcom: rpmh: add RPMH helper functions
drivers: qcom: rpmh-rsc: write sleep/wake requests to TCS
drivers: qcom: rpmh-rsc: allow invalidation of sleep/wake TCS
drivers: qcom: rpmh: cache sleep/wake state requests
drivers: qcom: rpmh: allow requests to be sent asynchronously
drivers: qcom: rpmh: add support for batch RPMH request
drivers: qcom: rpmh-rsc: allow active requests from wake TCS
.../devicetree/bindings/soc/qcom/rpmh-rsc.txt | 137 ++++
drivers/soc/qcom/Kconfig | 10 +
drivers/soc/qcom/Makefile | 4 +
drivers/soc/qcom/rpmh-internal.h | 83 +++
drivers/soc/qcom/rpmh-rsc.c | 683 ++++++++++++++++++
drivers/soc/qcom/rpmh.c | 585 +++++++++++++++
drivers/soc/qcom/trace-rpmh.h | 82 +++
include/dt-bindings/soc/qcom,rpmh-rsc.h | 14 +
include/soc/qcom/rpmh.h | 51 ++
include/soc/qcom/tcs.h | 56 ++
10 files changed, 1705 insertions(+)
create mode 100644 Documentation/devicetree/bindings/soc/qcom/rpmh-rsc.txt
create mode 100644 drivers/soc/qcom/rpmh-internal.h
create mode 100644 drivers/soc/qcom/rpmh-rsc.c
create mode 100644 drivers/soc/qcom/rpmh.c
create mode 100644 drivers/soc/qcom/trace-rpmh.h
create mode 100644 include/dt-bindings/soc/qcom,rpmh-rsc.h
create mode 100644 include/soc/qcom/rpmh.h
create mode 100644 include/soc/qcom/tcs.h
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Platform drivers need make a lot of resource state requests at the same
time, say, at the start or end of an usecase. It can be quite
inefficient to send each request separately. Instead they can give the
RPMH library a batch of requests to be sent and wait on the whole
transaction to be complete.
rpmh_write_batch() is a blocking call that can be used to send multiple
RPMH command sets. Each RPMH command set is set asynchronously and the
API blocks until all the command sets are complete and receive their
tx_done callbacks.
Signed-off-by: Lina Iyer <[email protected]>
---
Changes in v7:
- Check for loop out of bounds
Changes in v6:
- replace rpmh_client with device *
Changes in v4:
- reorganize rpmh_write_batch()
- introduce wait_count here, instead of patch#4
---
drivers/soc/qcom/rpmh.c | 154 +++++++++++++++++++++++++++++++++++++++-
include/soc/qcom/rpmh.h | 8 +++
2 files changed, 160 insertions(+), 2 deletions(-)
diff --git a/drivers/soc/qcom/rpmh.c b/drivers/soc/qcom/rpmh.c
index 1bb62876795c..a0e277b4b846 100644
--- a/drivers/soc/qcom/rpmh.c
+++ b/drivers/soc/qcom/rpmh.c
@@ -21,6 +21,8 @@
#include "rpmh-internal.h"
#define RPMH_TIMEOUT_MS msecs_to_jiffies(10000)
+#define RPMH_MAX_REQ_IN_BATCH 10
+#define RPMH_MAX_BATCH_CACHE (2 * RPMH_MAX_REQ_IN_BATCH)
#define DEFINE_RPMH_MSG_ONSTACK(dev, s, q, name) \
struct rpmh_request name = { \
@@ -34,6 +36,7 @@
.completion = q, \
.dev = dev, \
.free = NULL, \
+ .wait_count = NULL, \
}
/**
@@ -60,6 +63,7 @@ struct cache_req {
* @dev: the device making the request
* @err: err return from the controller
* @free: the request object to be freed at tx_done
+ * @wait_count: count of waiters for this completion
*/
struct rpmh_request {
struct tcs_request msg;
@@ -68,6 +72,7 @@ struct rpmh_request {
const struct device *dev;
int err;
struct rpmh_request *free;
+ atomic_t *wait_count;
};
/**
@@ -77,12 +82,14 @@ struct rpmh_request {
* @cache: the list of cached requests
* @lock: synchronize access to the controller data
* @dirty: was the cache updated since flush
+ * @batch_cache: Cache sleep and wake requests sent as batch
*/
struct rpmh_ctrlr {
struct rsc_drv *drv;
struct list_head cache;
spinlock_t lock;
bool dirty;
+ const struct rpmh_request *batch_cache[RPMH_MAX_BATCH_CACHE];
};
static struct rpmh_ctrlr rpmh_rsc[RPMH_MAX_CTRLR];
@@ -133,6 +140,7 @@ void rpmh_tx_done(const struct tcs_request *msg, int r)
struct rpmh_request *rpm_msg = container_of(msg, struct rpmh_request,
msg);
struct completion *compl = rpm_msg->completion;
+ atomic_t *wc = rpm_msg->wait_count;
rpm_msg->err = r;
@@ -143,8 +151,13 @@ void rpmh_tx_done(const struct tcs_request *msg, int r)
kfree(rpm_msg->free);
/* Signal the blocking thread we are done */
- if (compl)
- complete(compl);
+ if (!compl)
+ return;
+
+ if (wc && !atomic_dec_and_test(wc))
+ return;
+
+ complete(compl);
}
EXPORT_SYMBOL(rpmh_tx_done);
@@ -332,6 +345,137 @@ int rpmh_write(const struct device *dev, enum rpmh_state state,
}
EXPORT_SYMBOL(rpmh_write);
+static int cache_batch(struct rpmh_ctrlr *ctrlr,
+ struct rpmh_request **rpm_msg, int count)
+{
+ unsigned long flags;
+ int ret = 0;
+ int index = 0;
+ int i;
+
+ spin_lock_irqsave(&ctrlr->lock, flags);
+ while (index < RPMH_MAX_BATCH_CACHE && ctrlr->batch_cache[index])
+ index++;
+ if (index + count >= RPMH_MAX_BATCH_CACHE) {
+ ret = -ENOMEM;
+ goto fail;
+ }
+
+ for (i = 0; i < count; i++)
+ ctrlr->batch_cache[index + i] = rpm_msg[i];
+fail:
+ spin_unlock_irqrestore(&ctrlr->lock, flags);
+
+ return ret;
+}
+
+static int flush_batch(struct rpmh_ctrlr *ctrlr)
+{
+ const struct rpmh_request *rpm_msg;
+ unsigned long flags;
+ int ret = 0;
+ int i;
+
+ /* Send Sleep/Wake requests to the controller, expect no response */
+ spin_lock_irqsave(&ctrlr->lock, flags);
+ for (i = 0; ctrlr->batch_cache[i]; i++) {
+ rpm_msg = ctrlr->batch_cache[i];
+ ret = rpmh_rsc_write_ctrl_data(ctrlr->drv, &rpm_msg->msg);
+ if (ret)
+ break;
+ }
+ spin_unlock_irqrestore(&ctrlr->lock, flags);
+
+ return ret;
+}
+
+static void invalidate_batch(struct rpmh_ctrlr *ctrlr)
+{
+ unsigned long flags;
+ int index = 0;
+
+ spin_lock_irqsave(&ctrlr->lock, flags);
+ while (ctrlr->batch_cache[index]) {
+ kfree(ctrlr->batch_cache[index]->free);
+ ctrlr->batch_cache[index] = NULL;
+ index++;
+ }
+ spin_unlock_irqrestore(&ctrlr->lock, flags);
+}
+
+/**
+ * rpmh_write_batch: Write multiple sets of RPMH commands and wait for the
+ * batch to finish.
+ *
+ * @dev: the device making the request
+ * @state: Active/sleep set
+ * @cmd: The payload data
+ * @n: The array of count of elements in each batch, 0 terminated.
+ *
+ * Write a request to the RSC controller without caching. If the request
+ * state is ACTIVE, then the requests are treated as completion request
+ * and sent to the controller immediately. The function waits until all the
+ * commands are complete. If the request was to SLEEP or WAKE_ONLY, then the
+ * request is sent as fire-n-forget and no ack is expected.
+ *
+ * May sleep. Do not call from atomic contexts for ACTIVE_ONLY requests.
+ */
+int rpmh_write_batch(const struct device *dev, enum rpmh_state state,
+ const struct tcs_cmd *cmd, u32 *n)
+{
+ struct rpmh_request *rpm_msg[RPMH_MAX_REQ_IN_BATCH] = { NULL };
+ DECLARE_COMPLETION_ONSTACK(compl);
+ atomic_t wait_count = ATOMIC_INIT(0);
+ struct rpmh_ctrlr *ctrlr = get_rpmh_ctrlr(dev);
+ int count = 0;
+ int ret, i;
+
+ if (IS_ERR(ctrlr) || !cmd || !n)
+ return -EINVAL;
+
+ while (n[count++] > 0)
+ ;
+ count--;
+ if (!count || count > RPMH_MAX_REQ_IN_BATCH)
+ return -EINVAL;
+
+ for (i = 0; i < count; i++) {
+ rpm_msg[i] = __get_rpmh_msg_async(state, cmd, n[i]);
+ if (IS_ERR_OR_NULL(rpm_msg[i])) {
+ ret = PTR_ERR(rpm_msg[i]);
+ for (; i >= 0; i--)
+ kfree(rpm_msg[i]->free);
+ return ret;
+ }
+ cmd += n[i];
+ }
+
+ if (state != RPMH_ACTIVE_ONLY_STATE)
+ return cache_batch(ctrlr, rpm_msg, count);
+
+ atomic_set(&wait_count, count);
+
+ for (i = 0; i < count; i++) {
+ rpm_msg[i]->completion = &compl;
+ rpm_msg[i]->wait_count = &wait_count;
+ ret = rpmh_rsc_send_data(ctrlr->drv, &rpm_msg[i]->msg);
+ if (ret) {
+ int j;
+
+ pr_err("Error(%d) sending RPMH message addr=%#x\n",
+ ret, rpm_msg[i]->msg.cmds[0].addr);
+ for (j = i; j < count; j++)
+ rpmh_tx_done(&rpm_msg[j]->msg, ret);
+ break;
+ }
+ }
+
+ ret = wait_for_completion_timeout(&compl, RPMH_TIMEOUT_MS);
+ return (ret > 0) ? 0 : -ETIMEDOUT;
+
+}
+EXPORT_SYMBOL(rpmh_write_batch);
+
static int is_req_valid(struct cache_req *req)
{
return (req->sleep_val != UINT_MAX &&
@@ -383,6 +527,11 @@ int rpmh_flush(const struct device *dev)
return 0;
}
+ /* First flush the cached batch requests */
+ ret = flush_batch(ctrlr);
+ if (ret)
+ return ret;
+
/*
* Nobody else should be calling this function other than system PM,
* hence we can run without locks.
@@ -424,6 +573,7 @@ int rpmh_invalidate(const struct device *dev)
if (IS_ERR(ctrlr))
return PTR_ERR(ctrlr);
+ invalidate_batch(ctrlr);
ctrlr->dirty = true;
do {
diff --git a/include/soc/qcom/rpmh.h b/include/soc/qcom/rpmh.h
index 1161a5c77e75..619e07c75da9 100644
--- a/include/soc/qcom/rpmh.h
+++ b/include/soc/qcom/rpmh.h
@@ -17,6 +17,9 @@ int rpmh_write(const struct device *dev, enum rpmh_state state,
int rpmh_write_async(const struct device *dev, enum rpmh_state state,
const struct tcs_cmd *cmd, u32 n);
+int rpmh_write_batch(const struct device *dev, enum rpmh_state state,
+ const struct tcs_cmd *cmd, u32 *n);
+
int rpmh_flush(const struct device *dev);
int rpmh_invalidate(const struct device *dev);
@@ -32,6 +35,11 @@ static inline int rpmh_write_async(const struct device *dev,
const struct tcs_cmd *cmd, u32 n)
{ return -ENODEV; }
+static inline int rpmh_write_batch(const struct device *dev,
+ enum rpmh_state state,
+ const struct tcs_cmd *cmd, u32 *n)
+{ return -ENODEV; }
+
static inline int rpmh_flush(const struct device *dev)
{ return -ENODEV; }
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Sleep and wake requests are sent when the application processor
subsystem of the SoC is entering deep sleep states like in suspend.
These requests help lower the system power requirements when the
resources are not in use.
Sleep and wake requests are written to the TCS slots but are not
triggered at the time of writing. The TCS are triggered by the firmware
after the last of the CPUs has executed its WFI. Since these requests
may come in different batches of requests, it is the job of this
controller driver to find and arrange the requests into the available
TCSes.
Signed-off-by: Lina Iyer <[email protected]>
Reviewed-by: Evan Green <[email protected]>
---
Changes in v8:
- Bounds check in find_match()
Changes in v7:
- Bug fix in find_match()
---
drivers/soc/qcom/rpmh-internal.h | 8 ++
drivers/soc/qcom/rpmh-rsc.c | 122 +++++++++++++++++++++++++++++++
2 files changed, 130 insertions(+)
diff --git a/drivers/soc/qcom/rpmh-internal.h b/drivers/soc/qcom/rpmh-internal.h
index d9a21726e568..6e19fe458c31 100644
--- a/drivers/soc/qcom/rpmh-internal.h
+++ b/drivers/soc/qcom/rpmh-internal.h
@@ -14,6 +14,7 @@
#define MAX_CMDS_PER_TCS 16
#define MAX_TCS_PER_TYPE 3
#define MAX_TCS_NR (MAX_TCS_PER_TYPE * TCS_TYPE_NR)
+#define MAX_TCS_SLOTS (MAX_CMDS_PER_TCS * MAX_TCS_PER_TYPE)
#define RPMH_MAX_CTRLR 2
struct rsc_drv;
@@ -30,6 +31,8 @@ struct rsc_drv;
* @ncpt: number of commands in each TCS
* @lock: lock for synchronizing this TCS writes
* @req: requests that are sent from the TCS
+ * @cmd_cache: flattened cache of cmds in sleep/wake TCS
+ * @slots: indicates which of @cmd_addr are occupied
*/
struct tcs_group {
struct rsc_drv *drv;
@@ -40,6 +43,8 @@ struct tcs_group {
int ncpt;
spinlock_t lock;
const struct tcs_request *req[MAX_TCS_PER_TYPE];
+ u32 *cmd_cache;
+ DECLARE_BITMAP(slots, MAX_TCS_SLOTS);
};
/**
@@ -69,6 +74,9 @@ struct rsc_drv {
extern struct list_head rsc_drv_list;
int rpmh_rsc_send_data(struct rsc_drv *drv, const struct tcs_request *msg);
+int rpmh_rsc_write_ctrl_data(struct rsc_drv *drv,
+ const struct tcs_request *msg);
+int rpmh_rsc_invalidate(struct rsc_drv *drv);
void rpmh_tx_done(const struct tcs_request *msg, int r);
diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
index c0edf3850147..b5894b001ae1 100644
--- a/drivers/soc/qcom/rpmh-rsc.c
+++ b/drivers/soc/qcom/rpmh-rsc.c
@@ -113,6 +113,12 @@ static struct tcs_group *get_tcs_for_msg(struct rsc_drv *drv,
case RPMH_ACTIVE_ONLY_STATE:
type = ACTIVE_TCS;
break;
+ case RPMH_WAKE_ONLY_STATE:
+ type = WAKE_TCS;
+ break;
+ case RPMH_SLEEP_STATE:
+ type = SLEEP_TCS;
+ break;
default:
return ERR_PTR(-EINVAL);
}
@@ -353,6 +359,109 @@ int rpmh_rsc_send_data(struct rsc_drv *drv, const struct tcs_request *msg)
}
EXPORT_SYMBOL(rpmh_rsc_send_data);
+static int find_match(const struct tcs_group *tcs, const struct tcs_cmd *cmd,
+ int len)
+{
+ int i, j;
+
+ /* Check for already cached commands */
+ for_each_set_bit(i, tcs->slots, MAX_TCS_SLOTS) {
+ if (tcs->cmd_cache[i] != cmd[0].addr)
+ continue;
+ if (i + len >= MAX_TCS_SLOTS)
+ goto seq_err;
+ for (j = 0; j < len; j++) {
+ if (tcs->cmd_cache[i + j] != cmd[j].addr)
+ goto seq_err;
+ }
+ return i;
+ }
+
+ return -ENODATA;
+
+seq_err:
+ WARN(1, "Message does not match previous sequence.\n");
+ return -EINVAL;
+}
+
+static int find_slots(struct tcs_group *tcs, const struct tcs_request *msg,
+ int *tcs_id, int *cmd_id)
+{
+ int slot, offset;
+ int i = 0;
+
+ /* Find if we already have the msg in our TCS */
+ slot = find_match(tcs, msg->cmds, msg->num_cmds);
+ if (slot >= 0)
+ goto copy_data;
+
+ /* Do over, until we can fit the full payload in a TCS */
+ do {
+ slot = bitmap_find_next_zero_area(tcs->slots, MAX_TCS_SLOTS,
+ i, msg->num_cmds, 0);
+ if (slot == MAX_TCS_SLOTS)
+ return -ENOMEM;
+ i += tcs->ncpt;
+ } while (slot + msg->num_cmds - 1 >= i);
+
+copy_data:
+ bitmap_set(tcs->slots, slot, msg->num_cmds);
+ /* Copy the addresses of the resources over to the slots */
+ for (i = 0; i < msg->num_cmds; i++)
+ tcs->cmd_cache[slot + i] = msg->cmds[i].addr;
+
+ offset = slot / tcs->ncpt;
+ *tcs_id = offset + tcs->offset;
+ *cmd_id = slot % tcs->ncpt;
+
+ return 0;
+}
+
+static int tcs_ctrl_write(struct rsc_drv *drv, const struct tcs_request *msg)
+{
+ struct tcs_group *tcs;
+ int tcs_id = 0, cmd_id = 0;
+ unsigned long flags;
+ int ret;
+
+ tcs = get_tcs_for_msg(drv, msg);
+ if (IS_ERR(tcs))
+ return PTR_ERR(tcs);
+
+ spin_lock_irqsave(&tcs->lock, flags);
+ /* find the m-th TCS and the n-th position in the TCS to write to */
+ ret = find_slots(tcs, msg, &tcs_id, &cmd_id);
+ if (!ret)
+ __tcs_buffer_write(drv, tcs_id, cmd_id, msg);
+ spin_unlock_irqrestore(&tcs->lock, flags);
+
+ return ret;
+}
+
+/**
+ * rpmh_rsc_write_ctrl_data: Write request to the controller
+ *
+ * @drv: the controller
+ * @msg: the data to be written to the controller
+ *
+ * There is no response returned for writing the request to the controller.
+ */
+int rpmh_rsc_write_ctrl_data(struct rsc_drv *drv, const struct tcs_request *msg)
+{
+ if (!msg || !msg->cmds || !msg->num_cmds ||
+ msg->num_cmds > MAX_RPMH_PAYLOAD) {
+ pr_err("Payload error\n");
+ return -EINVAL;
+ }
+
+ /* Data sent to this API will not be sent immediately */
+ if (msg->state == RPMH_ACTIVE_ONLY_STATE)
+ return -EINVAL;
+
+ return tcs_ctrl_write(drv, msg);
+}
+EXPORT_SYMBOL(rpmh_rsc_write_ctrl_data);
+
static int rpmh_probe_tcs_config(struct platform_device *pdev,
struct rsc_drv *drv)
{
@@ -428,6 +537,19 @@ static int rpmh_probe_tcs_config(struct platform_device *pdev,
tcs->mask = ((1 << tcs->num_tcs) - 1) << st;
tcs->offset = st;
st += tcs->num_tcs;
+
+ /*
+ * Allocate memory to cache sleep and wake requests to
+ * avoid reading TCS register memory.
+ */
+ if (tcs->type == ACTIVE_TCS)
+ continue;
+
+ tcs->cmd_cache = devm_kcalloc(&pdev->dev,
+ tcs->num_tcs * ncpt, sizeof(u32),
+ GFP_KERNEL);
+ if (!tcs->cmd_cache)
+ return -ENOMEM;
}
drv->num_tcs = st;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Some RSCs may only have sleep and wake TCS, i.e, there is no dedicated
TCS for active mode request, but drivers may still want to make active
requests from these RSCs. In such cases re-purpose the wake TCS to send
active state requests.
The requirement for this is that the driver is aware that the wake TCS
is being repurposed to send active request, hence the sleep and wake
TCSes be invalidated before the active request is sent.
Signed-off-by: Lina Iyer <[email protected]>
Reviewed-by: Matthias Kaehlcke <[email protected]>
---
drivers/soc/qcom/rpmh-rsc.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
index 68c25ebbbe09..369b9b3eedc5 100644
--- a/drivers/soc/qcom/rpmh-rsc.c
+++ b/drivers/soc/qcom/rpmh-rsc.c
@@ -153,6 +153,7 @@ static struct tcs_group *get_tcs_for_msg(struct rsc_drv *drv,
const struct tcs_request *msg)
{
int type;
+ struct tcs_group *tcs;
switch (msg->state) {
case RPMH_ACTIVE_ONLY_STATE:
@@ -168,7 +169,22 @@ static struct tcs_group *get_tcs_for_msg(struct rsc_drv *drv,
return ERR_PTR(-EINVAL);
}
- return get_tcs_of_type(drv, type);
+ /*
+ * If we are making an active request on a RSC that does not have a
+ * dedicated TCS for active state use, then re-purpose a wake TCS to
+ * send active votes.
+ * NOTE: The driver must be aware that this RSC does not have a
+ * dedicated AMC, and therefore would invalidate the sleep and wake
+ * TCSes before making an active state request.
+ */
+ tcs = get_tcs_of_type(drv, type);
+ if (msg->state == RPMH_ACTIVE_ONLY_STATE && IS_ERR(tcs)) {
+ tcs = get_tcs_of_type(drv, WAKE_TCS);
+ if (!IS_ERR(tcs))
+ rpmh_rsc_invalidate(drv);
+ }
+
+ return tcs;
}
static const struct tcs_request *get_req_from_tcs(struct rsc_drv *drv,
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Platform drivers that want to send a request but do not want to block
until the RPMH request completes have now a new API -
rpmh_write_async().
The API allocates memory and send the requests and returns the control
back to the platform driver. The tx_done callback from the controller is
handled in the context of the controller's thread and frees the
allocated memory. This API allows RPMH requests from atomic contexts as
well.
Signed-off-by: Lina Iyer <[email protected]>
---
Changes in v6:
- replace rpmh_client with device *
---
drivers/soc/qcom/rpmh.c | 52 +++++++++++++++++++++++++++++++++++++++++
include/soc/qcom/rpmh.h | 7 ++++++
2 files changed, 59 insertions(+)
diff --git a/drivers/soc/qcom/rpmh.c b/drivers/soc/qcom/rpmh.c
index 23fcbd9487cd..1bb62876795c 100644
--- a/drivers/soc/qcom/rpmh.c
+++ b/drivers/soc/qcom/rpmh.c
@@ -33,6 +33,7 @@
.cmd = { { 0 } }, \
.completion = q, \
.dev = dev, \
+ .free = NULL, \
}
/**
@@ -58,6 +59,7 @@ struct cache_req {
* @completion: triggered when request is done
* @dev: the device making the request
* @err: err return from the controller
+ * @free: the request object to be freed at tx_done
*/
struct rpmh_request {
struct tcs_request msg;
@@ -65,6 +67,7 @@ struct rpmh_request {
struct completion *completion;
const struct device *dev;
int err;
+ struct rpmh_request *free;
};
/**
@@ -137,6 +140,8 @@ void rpmh_tx_done(const struct tcs_request *msg, int r)
dev_err(rpm_msg->dev, "RPMH TX fail in msg addr=%#x, err=%d\n",
rpm_msg->msg.cmds[0].addr, r);
+ kfree(rpm_msg->free);
+
/* Signal the blocking thread we are done */
if (compl)
complete(compl);
@@ -248,6 +253,53 @@ static int __rpmh_write(const struct device *dev, enum rpmh_state state,
return ret;
}
+static struct rpmh_request *__get_rpmh_msg_async(enum rpmh_state state,
+ const struct tcs_cmd *cmd,
+ u32 n)
+{
+ struct rpmh_request *req;
+
+ if (!cmd || !n || n > MAX_RPMH_PAYLOAD)
+ return ERR_PTR(-EINVAL);
+
+ req = kzalloc(sizeof(*req), GFP_ATOMIC);
+ if (!req)
+ return ERR_PTR(-ENOMEM);
+
+ memcpy(req->cmd, cmd, n * sizeof(*cmd));
+
+ req->msg.state = state;
+ req->msg.cmds = req->cmd;
+ req->msg.num_cmds = n;
+ req->free = req;
+
+ return req;
+}
+
+/**
+ * rpmh_write_async: Write a set of RPMH commands
+ *
+ * @dev: The device making the request
+ * @state: Active/sleep set
+ * @cmd: The payload data
+ * @n: The number of elements in payload
+ *
+ * Write a set of RPMH commands, the order of commands is maintained
+ * and will be sent as a single shot.
+ */
+int rpmh_write_async(const struct device *dev, enum rpmh_state state,
+ const struct tcs_cmd *cmd, u32 n)
+{
+ struct rpmh_request *rpm_msg;
+
+ rpm_msg = __get_rpmh_msg_async(state, cmd, n);
+ if (IS_ERR(rpm_msg))
+ return PTR_ERR(rpm_msg);
+
+ return __rpmh_write(dev, state, rpm_msg);
+}
+EXPORT_SYMBOL(rpmh_write_async);
+
/**
* rpmh_write: Write a set of RPMH commands and block until response
*
diff --git a/include/soc/qcom/rpmh.h b/include/soc/qcom/rpmh.h
index 42e62a0d26d8..1161a5c77e75 100644
--- a/include/soc/qcom/rpmh.h
+++ b/include/soc/qcom/rpmh.h
@@ -14,6 +14,9 @@
int rpmh_write(const struct device *dev, enum rpmh_state state,
const struct tcs_cmd *cmd, u32 n);
+int rpmh_write_async(const struct device *dev, enum rpmh_state state,
+ const struct tcs_cmd *cmd, u32 n);
+
int rpmh_flush(const struct device *dev);
int rpmh_invalidate(const struct device *dev);
@@ -24,6 +27,10 @@ static inline int rpmh_write(const struct device *dev, enum rpmh_state state,
const struct tcs_cmd *cmd, u32 n)
{ return -ENODEV; }
+static inline int rpmh_write_async(const struct device *dev,
+ enum rpmh_state state,
+ const struct tcs_cmd *cmd, u32 n)
+{ return -ENODEV; }
static inline int rpmh_flush(const struct device *dev)
{ return -ENODEV; }
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Add device binding documentation for Qualcomm Technology Inc's RPMH RSC
driver. The driver is used for communicating resource state requests for
shared resources.
Cc: [email protected]
Signed-off-by: Lina Iyer <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
---
Changes in v8:
- Describe IRQ for all DRVs
Changes in v7:
- Fix example
Changes in v6:
- Address comments from Stephen Boyd
Changes in v3:
- Move to soc/qcom
- Amend text per Stephen's suggestions
Changes in v2:
- Amend text to describe the registers in reg property
- Add reg-names for the registers
- Update examples to use GIC_SPI in interrupts instead of 0
- Rephrase incorrect description
Changes in v3:
- Fix unwanted capitalization
- Remove clients from the examples, this doc does not describe
them
- Rephrase introductory paragraph
- Remove hardware specifics from DT bindings
---
.../devicetree/bindings/soc/qcom/rpmh-rsc.txt | 137 ++++++++++++++++++
1 file changed, 137 insertions(+)
create mode 100644 Documentation/devicetree/bindings/soc/qcom/rpmh-rsc.txt
diff --git a/Documentation/devicetree/bindings/soc/qcom/rpmh-rsc.txt b/Documentation/devicetree/bindings/soc/qcom/rpmh-rsc.txt
new file mode 100644
index 000000000000..e15c100f5c92
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/qcom/rpmh-rsc.txt
@@ -0,0 +1,137 @@
+RPMH RSC:
+------------
+
+Resource Power Manager Hardened (RPMH) is the mechanism for communicating with
+the hardened resource accelerators on Qualcomm SoCs. Requests to the resources
+can be written to the Trigger Command Set (TCS) registers and using a (addr,
+val) pair and triggered. Messages in the TCS are then sent in sequence over an
+internal bus.
+
+The hardware block (Direct Resource Voter or DRV) is a part of the h/w entity
+(Resource State Coordinator a.k.a RSC) that can handle multiple sleep and
+active/wake resource requests. Multiple such DRVs can exist in a SoC and can
+be written to from Linux. The structure of each DRV follows the same template
+with a few variations that are captured by the properties here.
+
+A TCS may be triggered from Linux or triggered by the F/W after all the CPUs
+have powered off to facilitate idle power saving. TCS could be classified as -
+
+ SLEEP /* Triggered by F/W */
+ WAKE /* Triggered by F/W */
+ ACTIVE /* Triggered by Linux */
+ CONTROL /* Triggered by F/W */
+
+The order in which they are described in the DT, should match the hardware
+configuration.
+
+Requests can be made for the state of a resource, when the subsystem is active
+or idle. When all subsystems like Modem, GPU, CPU are idle, the resource state
+will be an aggregate of the sleep votes from each of those subsystems. Clients
+may request a sleep value for their shared resources in addition to the active
+mode requests.
+
+Properties:
+
+- compatible:
+ Usage: required
+ Value type: <string>
+ Definition: Should be "qcom,rpmh-rsc".
+
+- reg:
+ Usage: required
+ Value type: <prop-encoded-array>
+ Definition: The first register specifies the base address of the
+ DRV(s). The number of DRVs in the dependent on the RSC.
+ The tcs-offset specifies the start address of the
+ TCS in the DRVs.
+
+- reg-names:
+ Usage: required
+ Value type: <string>
+ Definition: Maps the register specified in the reg property. Must be
+ "drv-0", "drv-1", "drv-2" etc and "tcs-offset". The
+
+- interrupts:
+ Usage: required
+ Value type: <prop-encoded-interrupt>
+ Definition: The interrupt that trips when a message complete/response
+ is received for this DRV from the accelerators.
+
+- qcom,drv-id:
+ Usage: required
+ Value type: <u32>
+ Definition: The id of the DRV in the RSC block that will be used by
+ this controller.
+
+- qcom,tcs-config:
+ Usage: required
+ Value type: <prop-encoded-array>
+ Definition: The tuple defining the configuration of TCS.
+ Must have 2 cells which describe each TCS type.
+ <type number_of_tcs>.
+ The order of the TCS must match the hardware
+ configuration.
+ - Cell #1 (TCS Type): TCS types to be specified -
+ SLEEP_TCS
+ WAKE_TCS
+ ACTIVE_TCS
+ CONTROL_TCS
+ - Cell #2 (Number of TCS): <u32>
+
+- label:
+ Usage: optional
+ Value type: <string>
+ Definition: Name for the RSC. The name would be used in trace logs.
+
+Drivers that want to use the RSC to communicate with RPMH must specify their
+bindings as child nodes of the RSC controllers they wish to communicate with.
+
+Example 1:
+
+For a TCS whose RSC base address is is 0x179C0000 and is at a DRV id of 2, the
+register offsets for DRV2 start at 0D00, the register calculations are like
+this -
+DRV0: 0x179C0000
+DRV2: 0x179C0000 + 0x10000 = 0x179D0000
+DRV2: 0x179C0000 + 0x10000 * 2 = 0x179E0000
+TCS-OFFSET: 0xD00
+
+ apps_rsc: rsc@179c0000 {
+ label = "apps_rsc";
+ compatible = "qcom,rpmh-rsc";
+ reg = <0x179c0000 0x10000>,
+ <0x179d0000 0x10000>,
+ <0x179e0000 0x10000>;
+ reg-names = "drv-0", "drv-1", "drv-2";
+ interrupts = <GIC_SPI 3 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 4 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 5 IRQ_TYPE_LEVEL_HIGH>;
+ qcom,tcs-offset = <0xd00>;
+ qcom,drv-id = <2>;
+ qcom,tcs-config = <SLEEP_TCS 3>,
+ <WAKE_TCS 3>,
+ <ACTIVE_TCS 2>,
+ <CONTROL_TCS 1>;
+ };
+
+Example 2:
+
+For a TCS whose RSC base address is 0xAF20000 and is at DRV id of 0, the
+register offsets for DRV0 start at 01C00, the register calculations are like
+this -
+DRV0: 0xAF20000
+TCS-OFFSET: 0x1C00
+
+ disp_rsc: rsc@af20000 {
+ label = "disp_rsc";
+ compatible = "qcom,rpmh-rsc";
+ reg = <0xaf20000 0x10000>;
+ reg-names = "drv-0";
+ interrupts = <GIC_SPI 129 IRQ_TYPE_LEVEL_HIGH>;
+ qcom,tcs-offset = <0x1c00>;
+ qcom,drv-id = <0>;
+ qcom,tcs-config = <SLEEP_TCS 1>,
+ <WAKE_TCS 1>,
+ <ACTIVE_TCS 0>,
+ <CONTROL_TCS 0>;
+ };
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Sending RPMH requests and waiting for response from the controller
through a callback is common functionality across all platform drivers.
To simplify drivers, add a library functions to create RPMH client and
send resource state requests.
rpmh_write() is a synchronous blocking call that can be used to send
active state requests.
Signed-off-by: Lina Iyer <[email protected]>
---
Changes in v7:
- Optimization and locking fixes
Changes in v6:
- replace rpmh_client with device
- inline wait_for_tx_done()
Changes in v4:
- use const struct tcs_cmd in API
- remove wait count from this patch
- changed -EFAULT to -EINVAL
---
drivers/soc/qcom/Makefile | 4 +-
drivers/soc/qcom/rpmh-internal.h | 6 ++
drivers/soc/qcom/rpmh-rsc.c | 8 ++
drivers/soc/qcom/rpmh.c | 176 +++++++++++++++++++++++++++++++
include/soc/qcom/rpmh.h | 25 +++++
5 files changed, 218 insertions(+), 1 deletion(-)
create mode 100644 drivers/soc/qcom/rpmh.c
create mode 100644 include/soc/qcom/rpmh.h
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index cb6300f6a8e9..bb395c3202ca 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -7,7 +7,9 @@ obj-$(CONFIG_QCOM_PM) += spm.o
obj-$(CONFIG_QCOM_QMI_HELPERS) += qmi_helpers.o
qmi_helpers-y += qmi_encdec.o qmi_interface.o
obj-$(CONFIG_QCOM_RMTFS_MEM) += rmtfs_mem.o
-obj-$(CONFIG_QCOM_RPMH) += rpmh-rsc.o
+obj-$(CONFIG_QCOM_RPMH) += qcom_rpmh.o
+qcom_rpmh-y += rpmh-rsc.o
+qcom_rpmh-y += rpmh.o
obj-$(CONFIG_QCOM_SMD_RPM) += smd-rpm.o
obj-$(CONFIG_QCOM_SMEM) += smem.o
obj-$(CONFIG_QCOM_SMEM_STATE) += smem_state.o
diff --git a/drivers/soc/qcom/rpmh-internal.h b/drivers/soc/qcom/rpmh-internal.h
index cc29176f1303..d9a21726e568 100644
--- a/drivers/soc/qcom/rpmh-internal.h
+++ b/drivers/soc/qcom/rpmh-internal.h
@@ -14,6 +14,7 @@
#define MAX_CMDS_PER_TCS 16
#define MAX_TCS_PER_TYPE 3
#define MAX_TCS_NR (MAX_TCS_PER_TYPE * TCS_TYPE_NR)
+#define RPMH_MAX_CTRLR 2
struct rsc_drv;
@@ -52,6 +53,7 @@ struct tcs_group {
* @tcs: TCS groups
* @tcs_in_use: s/w state of the TCS
* @lock: synchronize state of the controller
+ * @list: element in list of drv
*/
struct rsc_drv {
const char *name;
@@ -61,9 +63,13 @@ struct rsc_drv {
struct tcs_group tcs[TCS_TYPE_NR];
DECLARE_BITMAP(tcs_in_use, MAX_TCS_NR);
spinlock_t lock;
+ struct list_head list;
};
+extern struct list_head rsc_drv_list;
int rpmh_rsc_send_data(struct rsc_drv *drv, const struct tcs_request *msg);
+void rpmh_tx_done(const struct tcs_request *msg, int r);
+
#endif /* __RPM_INTERNAL_H__ */
diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
index 0a8cec9d1651..c0edf3850147 100644
--- a/drivers/soc/qcom/rpmh-rsc.c
+++ b/drivers/soc/qcom/rpmh-rsc.c
@@ -61,6 +61,8 @@
#define CMD_STATUS_ISSUED BIT(8)
#define CMD_STATUS_COMPL BIT(16)
+LIST_HEAD(rsc_drv_list);
+
static u32 read_tcs_reg(struct rsc_drv *drv, int reg, int tcs_id, int cmd_id)
{
return readl_relaxed(drv->tcs_base + reg + RSC_DRV_TCS_OFFSET * tcs_id +
@@ -176,6 +178,8 @@ static irqreturn_t tcs_tx_done(int irq, void *p)
spin_lock(&drv->lock);
clear_bit(i, drv->tcs_in_use);
spin_unlock(&drv->lock);
+ if (req)
+ rpmh_tx_done(req, err);
}
return IRQ_HANDLED;
@@ -469,6 +473,10 @@ static int rpmh_rsc_probe(struct platform_device *pdev)
/* Enable the active TCS to send requests immediately */
write_tcs_reg(drv, RSC_DRV_IRQ_ENABLE, 0, drv->tcs[ACTIVE_TCS].mask);
+ INIT_LIST_HEAD(&drv->list);
+ list_add(&drv->list, &rsc_drv_list);
+ dev_set_drvdata(&pdev->dev, drv);
+
return devm_of_platform_populate(&pdev->dev);
}
diff --git a/drivers/soc/qcom/rpmh.c b/drivers/soc/qcom/rpmh.c
new file mode 100644
index 000000000000..74bb82339b01
--- /dev/null
+++ b/drivers/soc/qcom/rpmh.c
@@ -0,0 +1,176 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
+ */
+
+#include <linux/atomic.h>
+#include <linux/interrupt.h>
+#include <linux/jiffies.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/wait.h>
+
+#include <soc/qcom/rpmh.h>
+
+#include "rpmh-internal.h"
+
+#define RPMH_TIMEOUT_MS msecs_to_jiffies(10000)
+
+#define DEFINE_RPMH_MSG_ONSTACK(dev, s, q, name) \
+ struct rpmh_request name = { \
+ .msg = { \
+ .state = s, \
+ .cmds = name.cmd, \
+ .num_cmds = 0, \
+ .wait_for_compl = true, \
+ }, \
+ .cmd = { { 0 } }, \
+ .completion = q, \
+ .dev = dev, \
+ }
+
+/**
+ * struct rpmh_request: the message to be sent to rpmh-rsc
+ *
+ * @msg: the request
+ * @cmd: the payload that will be part of the @msg
+ * @completion: triggered when request is done
+ * @dev: the device making the request
+ * @err: err return from the controller
+ */
+struct rpmh_request {
+ struct tcs_request msg;
+ struct tcs_cmd cmd[MAX_RPMH_PAYLOAD];
+ struct completion *completion;
+ const struct device *dev;
+ int err;
+};
+
+/**
+ * struct rpmh_ctrlr: our representation of the controller
+ *
+ * @drv: the controller instance
+ */
+struct rpmh_ctrlr {
+ struct rsc_drv *drv;
+};
+
+static struct rpmh_ctrlr rpmh_rsc[RPMH_MAX_CTRLR];
+static DEFINE_SPINLOCK(rpmh_rsc_lock);
+
+static struct rpmh_ctrlr *get_rpmh_ctrlr(const struct device *dev)
+{
+ int i;
+ struct rsc_drv *p, *drv = dev_get_drvdata(dev->parent);
+ struct rpmh_ctrlr *ctrlr = ERR_PTR(-EINVAL);
+ unsigned long flags;
+
+ if (!drv)
+ return ctrlr;
+
+ for (i = 0; i < RPMH_MAX_CTRLR; i++) {
+ if (rpmh_rsc[i].drv == drv) {
+ ctrlr = &rpmh_rsc[i];
+ return ctrlr;
+ }
+ }
+
+ spin_lock_irqsave(&rpmh_rsc_lock, flags);
+ list_for_each_entry(p, &rsc_drv_list, list) {
+ if (drv == p) {
+ for (i = 0; i < RPMH_MAX_CTRLR; i++) {
+ if (!rpmh_rsc[i].drv)
+ break;
+ }
+ if (i == RPMH_MAX_CTRLR) {
+ ctrlr = ERR_PTR(-ENOMEM);
+ break;
+ }
+ rpmh_rsc[i].drv = drv;
+ ctrlr = &rpmh_rsc[i];
+ break;
+ }
+ }
+ spin_unlock_irqrestore(&rpmh_rsc_lock, flags);
+
+ return ctrlr;
+}
+
+void rpmh_tx_done(const struct tcs_request *msg, int r)
+{
+ struct rpmh_request *rpm_msg = container_of(msg, struct rpmh_request,
+ msg);
+ struct completion *compl = rpm_msg->completion;
+
+ rpm_msg->err = r;
+
+ if (r)
+ dev_err(rpm_msg->dev, "RPMH TX fail in msg addr=%#x, err=%d\n",
+ rpm_msg->msg.cmds[0].addr, r);
+
+ /* Signal the blocking thread we are done */
+ if (compl)
+ complete(compl);
+}
+EXPORT_SYMBOL(rpmh_tx_done);
+
+/**
+ * __rpmh_write: send the RPMH request
+ *
+ * @dev: The device making the request
+ * @state: Active/Sleep request type
+ * @rpm_msg: The data that needs to be sent (cmds).
+ */
+static int __rpmh_write(const struct device *dev, enum rpmh_state state,
+ struct rpmh_request *rpm_msg)
+{
+ struct rpmh_ctrlr *ctrlr = get_rpmh_ctrlr(dev);
+
+ if (IS_ERR(ctrlr))
+ return PTR_ERR(ctrlr);
+
+ rpm_msg->msg.state = state;
+
+ if (state != RPMH_ACTIVE_ONLY_STATE)
+ return -EINVAL;
+
+ WARN_ON(irqs_disabled());
+
+ return rpmh_rsc_send_data(ctrlr->drv, &rpm_msg->msg);
+}
+
+/**
+ * rpmh_write: Write a set of RPMH commands and block until response
+ *
+ * @rc: The RPMH handle got from rpmh_get_client
+ * @state: Active/sleep set
+ * @cmd: The payload data
+ * @n: The number of elements in @cmd
+ *
+ * May sleep. Do not call from atomic contexts.
+ */
+int rpmh_write(const struct device *dev, enum rpmh_state state,
+ const struct tcs_cmd *cmd, u32 n)
+{
+ DECLARE_COMPLETION_ONSTACK(compl);
+ DEFINE_RPMH_MSG_ONSTACK(dev, state, &compl, rpm_msg);
+ int ret;
+
+ if (!cmd || !n || n > MAX_RPMH_PAYLOAD)
+ return -EINVAL;
+
+ memcpy(rpm_msg.cmd, cmd, n * sizeof(*cmd));
+ rpm_msg.msg.num_cmds = n;
+
+ ret = __rpmh_write(dev, state, &rpm_msg);
+ if (ret)
+ return ret;
+
+ ret = wait_for_completion_timeout(&compl, RPMH_TIMEOUT_MS);
+ return (ret > 0) ? 0 : -ETIMEDOUT;
+}
+EXPORT_SYMBOL(rpmh_write);
diff --git a/include/soc/qcom/rpmh.h b/include/soc/qcom/rpmh.h
new file mode 100644
index 000000000000..c1d0f902bd71
--- /dev/null
+++ b/include/soc/qcom/rpmh.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
+ */
+
+#ifndef __SOC_QCOM_RPMH_H__
+#define __SOC_QCOM_RPMH_H__
+
+#include <soc/qcom/tcs.h>
+#include <linux/platform_device.h>
+
+
+#if IS_ENABLED(CONFIG_QCOM_RPMH)
+int rpmh_write(const struct device *dev, enum rpmh_state state,
+ const struct tcs_cmd *cmd, u32 n);
+
+#else
+
+static inline int rpmh_write(const struct device *dev, enum rpmh_state state,
+ const struct tcs_cmd *cmd, u32 n)
+{ return -ENODEV; }
+
+#endif /* CONFIG_QCOM_RPMH */
+
+#endif /* __SOC_QCOM_RPMH_H__ */
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Log sent RPMH requests and interrupt responses in FTRACE.
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Lina Iyer <[email protected]>
---
Changes in v7:
- varible name changes and white space
Changes in v6:
- struct tcs_response was removed. Fix in trace as well.
Changes in v4:
- fix compilation issues, use __assign_str
- use %#x instead of 0x%08x
Changes in v3:
- Use __string() instead of char *
- fix TRACE_INCLUDE_PATH
---
drivers/soc/qcom/Makefile | 1 +
drivers/soc/qcom/rpmh-rsc.c | 11 ++++-
drivers/soc/qcom/trace-rpmh.h | 82 +++++++++++++++++++++++++++++++++++
3 files changed, 93 insertions(+), 1 deletion(-)
create mode 100644 drivers/soc/qcom/trace-rpmh.h
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index 39d3a059ee50..cb6300f6a8e9 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -1,4 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
+CFLAGS_rpmh-rsc.o := -I$(src)
obj-$(CONFIG_QCOM_GLINK_SSR) += glink_ssr.o
obj-$(CONFIG_QCOM_GSBI) += qcom_gsbi.o
obj-$(CONFIG_QCOM_MDT_LOADER) += mdt_loader.o
diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
index 4eeccf75cc15..0a8cec9d1651 100644
--- a/drivers/soc/qcom/rpmh-rsc.c
+++ b/drivers/soc/qcom/rpmh-rsc.c
@@ -24,6 +24,9 @@
#include "rpmh-internal.h"
+#define CREATE_TRACE_POINTS
+#include "trace-rpmh.h"
+
#define RSC_DRV_TCS_OFFSET 672
#define RSC_DRV_CMD_OFFSET 20
@@ -136,7 +139,7 @@ static const struct tcs_request *get_req_from_tcs(struct rsc_drv *drv,
static irqreturn_t tcs_tx_done(int irq, void *p)
{
struct rsc_drv *drv = p;
- int i, j;
+ int i, j, err;
unsigned long irq_status;
const struct tcs_request *req;
struct tcs_cmd *cmd;
@@ -150,6 +153,7 @@ static irqreturn_t tcs_tx_done(int irq, void *p)
goto skip;
}
+ err = 0;
for (j = 0; j < req->num_cmds; j++) {
u32 sts;
@@ -160,8 +164,11 @@ static irqreturn_t tcs_tx_done(int irq, void *p)
!(sts & CMD_STATUS_COMPL))) {
pr_err("Incomplete request: %s: addr=%#x data=%#x",
drv->name, cmd->addr, cmd->data);
+ err = -EIO;
}
}
+
+ trace_rpmh_tx_done(drv, i, req, err);
skip:
/* Reclaim the TCS */
write_tcs_reg(drv, RSC_DRV_CMD_ENABLE, i, 0);
@@ -195,9 +202,11 @@ static void __tcs_buffer_write(struct rsc_drv *drv, int tcs_id, int cmd_id,
cmd_complete |= cmd->wait << j;
msgid = cmd_msgid;
msgid |= cmd->wait ? CMD_MSGID_RESP_REQ : 0;
+
write_tcs_cmd(drv, RSC_DRV_CMD_MSGID, tcs_id, j, msgid);
write_tcs_cmd(drv, RSC_DRV_CMD_ADDR, tcs_id, j, cmd->addr);
write_tcs_cmd(drv, RSC_DRV_CMD_DATA, tcs_id, j, cmd->data);
+ trace_rpmh_send_msg(drv, tcs_id, j, msgid, cmd);
}
write_tcs_reg(drv, RSC_DRV_CMD_WAIT_FOR_CMPL, tcs_id, cmd_complete);
diff --git a/drivers/soc/qcom/trace-rpmh.h b/drivers/soc/qcom/trace-rpmh.h
new file mode 100644
index 000000000000..feb0cb455e37
--- /dev/null
+++ b/drivers/soc/qcom/trace-rpmh.h
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
+ */
+
+#if !defined(_TRACE_RPMH_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_RPMH_H
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM rpmh
+
+#include <linux/tracepoint.h>
+#include "rpmh-internal.h"
+
+TRACE_EVENT(rpmh_tx_done,
+
+ TP_PROTO(struct rsc_drv *d, int m, const struct tcs_request *r, int e),
+
+ TP_ARGS(d, m, r, e),
+
+ TP_STRUCT__entry(
+ __string(name, d->name)
+ __field(int, m)
+ __field(u32, addr)
+ __field(u32, data)
+ __field(int, err)
+ ),
+
+ TP_fast_assign(
+ __assign_str(name, d->name);
+ __entry->m = m;
+ __entry->addr = r->cmds[0].addr;
+ __entry->data = r->cmds[0].data;
+ __entry->err = e;
+ ),
+
+ TP_printk("%s: ack: tcs-m: %d addr: %#x data: %#x errno: %d",
+ __get_str(name), __entry->m, __entry->addr, __entry->data,
+ __entry->err)
+);
+
+TRACE_EVENT(rpmh_send_msg,
+
+ TP_PROTO(struct rsc_drv *d, int m, int n, u32 h,
+ const struct tcs_cmd *c),
+
+ TP_ARGS(d, m, n, h, c),
+
+ TP_STRUCT__entry(
+ __string(name, d->name)
+ __field(int, m)
+ __field(int, n)
+ __field(u32, hdr)
+ __field(u32, addr)
+ __field(u32, data)
+ __field(bool, wait)
+ ),
+
+ TP_fast_assign(
+ __assign_str(name, d->name);
+ __entry->m = m;
+ __entry->n = n;
+ __entry->hdr = h;
+ __entry->addr = c->addr;
+ __entry->data = c->data;
+ __entry->wait = c->wait;
+ ),
+
+ TP_printk("%s: send-msg: tcs(m): %d cmd(n): %d msgid: %#x addr: %#x data: %#x complete: %d",
+ __get_str(name), __entry->m, __entry->n, __entry->hdr,
+ __entry->addr, __entry->data, __entry->wait)
+);
+
+#endif /* _TRACE_RPMH_H */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE trace-rpmh
+
+#include <trace/define_trace.h>
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Add controller driver for QCOM SoCs that have hardware based shared
resource management. The hardware IP known as RSC (Resource State
Coordinator) houses multiple Direct Resource Voter (DRV) for different
execution levels. A DRV is a unique voter on the state of a shared
resource. A Trigger Control Set (TCS) is a bunch of slots that can house
multiple resource state requests, that when triggered will issue those
requests through an internal bus to the Resource Power Manager Hardened
(RPMH) blocks. These hardware blocks are capable of adjusting clocks,
voltages, etc. The resource state request from a DRV are aggregated
along with state requests from other processors in the SoC and the
aggregate value is applied on the resource.
Some important aspects of the RPMH communication -
- Requests are <addr, value> with some header information
- Multiple requests (upto 16) may be sent through a TCS, at a time
- Requests in a TCS are sent in sequence
- Requests may be fire-n-forget or completion (response expected)
- Multiple TCS from the same DRV may be triggered simultaneously
- Cannot send a request if another request for the same addr is in
progress from the same DRV
- When all the requests from a TCS are complete, an IRQ is raised
- The IRQ handler needs to clear the TCS before it is available for
reuse
- TCS configuration is specific to a DRV
- Platform drivers may use DRV from different RSCs to make requests
Resource state requests made when CPUs are active are called 'active'
state requests. Requests made when all the CPUs are powered down (idle
state) are called 'sleep' state requests. They are matched by a
corresponding 'wake' state requests which puts the resources back in to
previously requested active state before resuming any CPU. TCSes are
dedicated for each type of requests. Active mode TCSes (AMC) are used to
send requests immediately to the resource, while control TCS are used to
provide specific information to the controller. Sleep and Wake TCS send
sleep and wake requests, after and before the system halt respectively.
Signed-off-by: Lina Iyer <[email protected]>
---
Changes in v8:
- introduce interrupt description in DT for all DRvs
Changes in v7:
- rename 'm' and 'n' variable names
Changes in v6:
- Remove tasklet and response object
- introduce rpm_write_tcs_cmd() function
- rename tcs_irq_handler()
- avoid bool in structures. Fix tcs.h.
Changes in v4:
- lots of variable name changes as suggested by Stephen B
- use of const for data pointers
- fix comments and other code syntax
- use of bitmap for tcs_in_use instead of atomic
---
drivers/soc/qcom/Kconfig | 10 +
drivers/soc/qcom/Makefile | 1 +
drivers/soc/qcom/rpmh-internal.h | 69 ++++
drivers/soc/qcom/rpmh-rsc.c | 483 ++++++++++++++++++++++++
include/dt-bindings/soc/qcom,rpmh-rsc.h | 14 +
include/soc/qcom/tcs.h | 56 +++
6 files changed, 633 insertions(+)
create mode 100644 drivers/soc/qcom/rpmh-internal.h
create mode 100644 drivers/soc/qcom/rpmh-rsc.c
create mode 100644 include/dt-bindings/soc/qcom,rpmh-rsc.h
create mode 100644 include/soc/qcom/tcs.h
diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index 5c4535b545cc..1887e1b1f43b 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -56,6 +56,16 @@ config QCOM_RMTFS_MEM
Say y here if you intend to boot the modem remoteproc.
+config QCOM_RPMH
+ bool "Qualcomm RPM-Hardened (RPMH) Communication"
+ depends on ARCH_QCOM && ARM64 && OF || COMPILE_TEST
+ help
+ Support for communication with the hardened-RPM blocks in
+ Qualcomm Technologies Inc (QTI) SoCs. RPMH communication uses an
+ internal bus to transmit state requests for shared resources. A set
+ of hardware components aggregate requests for these resources and
+ help apply the aggregated state on the resource.
+
config QCOM_SMEM
tristate "Qualcomm Shared Memory Manager (SMEM)"
depends on ARCH_QCOM
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index dcebf2814e6d..39d3a059ee50 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_QCOM_PM) += spm.o
obj-$(CONFIG_QCOM_QMI_HELPERS) += qmi_helpers.o
qmi_helpers-y += qmi_encdec.o qmi_interface.o
obj-$(CONFIG_QCOM_RMTFS_MEM) += rmtfs_mem.o
+obj-$(CONFIG_QCOM_RPMH) += rpmh-rsc.o
obj-$(CONFIG_QCOM_SMD_RPM) += smd-rpm.o
obj-$(CONFIG_QCOM_SMEM) += smem.o
obj-$(CONFIG_QCOM_SMEM_STATE) += smem_state.o
diff --git a/drivers/soc/qcom/rpmh-internal.h b/drivers/soc/qcom/rpmh-internal.h
new file mode 100644
index 000000000000..cc29176f1303
--- /dev/null
+++ b/drivers/soc/qcom/rpmh-internal.h
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
+ */
+
+
+#ifndef __RPM_INTERNAL_H__
+#define __RPM_INTERNAL_H__
+
+#include <linux/bitmap.h>
+#include <soc/qcom/tcs.h>
+
+#define TCS_TYPE_NR 4
+#define MAX_CMDS_PER_TCS 16
+#define MAX_TCS_PER_TYPE 3
+#define MAX_TCS_NR (MAX_TCS_PER_TYPE * TCS_TYPE_NR)
+
+struct rsc_drv;
+
+/**
+ * struct tcs_group: group of Trigger Command Sets (TCS) to send state requests
+ * to the controller
+ *
+ * @drv: the controller
+ * @type: type of the TCS in this group - active, sleep, wake
+ * @mask: mask of the TCSes relative to all the TCSes in the RSC
+ * @offset: start of the TCS group relative to the TCSes in the RSC
+ * @num_tcs: number of TCSes in this type
+ * @ncpt: number of commands in each TCS
+ * @lock: lock for synchronizing this TCS writes
+ * @req: requests that are sent from the TCS
+ */
+struct tcs_group {
+ struct rsc_drv *drv;
+ int type;
+ u32 mask;
+ u32 offset;
+ int num_tcs;
+ int ncpt;
+ spinlock_t lock;
+ const struct tcs_request *req[MAX_TCS_PER_TYPE];
+};
+
+/**
+ * struct rsc_drv: the Direct Resource Voter (DRV) of the
+ * Resource State Coordinator controller (RSC)
+ *
+ * @name: controller identifier
+ * @tcs_base: start address of the TCS registers in this controller
+ * @id: instance id in the controller (Direct Resource Voter)
+ * @num_tcs: number of TCSes in this DRV
+ * @tcs: TCS groups
+ * @tcs_in_use: s/w state of the TCS
+ * @lock: synchronize state of the controller
+ */
+struct rsc_drv {
+ const char *name;
+ void __iomem *tcs_base;
+ int id;
+ int num_tcs;
+ struct tcs_group tcs[TCS_TYPE_NR];
+ DECLARE_BITMAP(tcs_in_use, MAX_TCS_NR);
+ spinlock_t lock;
+};
+
+
+int rpmh_rsc_send_data(struct rsc_drv *drv, const struct tcs_request *msg);
+
+#endif /* __RPM_INTERNAL_H__ */
diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
new file mode 100644
index 000000000000..4eeccf75cc15
--- /dev/null
+++ b/drivers/soc/qcom/rpmh-rsc.c
@@ -0,0 +1,483 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
+ */
+
+#define pr_fmt(fmt) "%s " fmt, KBUILD_MODNAME
+
+#include <linux/atomic.h>
+#include <linux/delay.h>
+#include <linux/export.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/of.h>
+#include <linux/of_irq.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include <soc/qcom/tcs.h>
+#include <dt-bindings/soc/qcom,rpmh-rsc.h>
+
+#include "rpmh-internal.h"
+
+#define RSC_DRV_TCS_OFFSET 672
+#define RSC_DRV_CMD_OFFSET 20
+
+/* DRV Configuration Information Register */
+#define DRV_PRNT_CHLD_CONFIG 0x0C
+#define DRV_NUM_TCS_MASK 0x3F
+#define DRV_NUM_TCS_SHIFT 6
+#define DRV_NCPT_MASK 0x1F
+#define DRV_NCPT_SHIFT 27
+
+/* Register offsets */
+#define RSC_DRV_IRQ_ENABLE 0x00
+#define RSC_DRV_IRQ_STATUS 0x04
+#define RSC_DRV_IRQ_CLEAR 0x08
+#define RSC_DRV_CMD_WAIT_FOR_CMPL 0x10
+#define RSC_DRV_CONTROL 0x14
+#define RSC_DRV_STATUS 0x18
+#define RSC_DRV_CMD_ENABLE 0x1C
+#define RSC_DRV_CMD_MSGID 0x30
+#define RSC_DRV_CMD_ADDR 0x34
+#define RSC_DRV_CMD_DATA 0x38
+#define RSC_DRV_CMD_STATUS 0x3C
+#define RSC_DRV_CMD_RESP_DATA 0x40
+
+#define TCS_AMC_MODE_ENABLE BIT(16)
+#define TCS_AMC_MODE_TRIGGER BIT(24)
+
+/* TCS CMD register bit mask */
+#define CMD_MSGID_LEN 8
+#define CMD_MSGID_RESP_REQ BIT(8)
+#define CMD_MSGID_WRITE BIT(16)
+#define CMD_STATUS_ISSUED BIT(8)
+#define CMD_STATUS_COMPL BIT(16)
+
+static u32 read_tcs_reg(struct rsc_drv *drv, int reg, int tcs_id, int cmd_id)
+{
+ return readl_relaxed(drv->tcs_base + reg + RSC_DRV_TCS_OFFSET * tcs_id +
+ RSC_DRV_CMD_OFFSET * cmd_id);
+}
+
+static void write_tcs_cmd(struct rsc_drv *drv, int reg, int tcs_id, int cmd_id,
+ u32 data)
+{
+ writel_relaxed(data, drv->tcs_base + reg + RSC_DRV_TCS_OFFSET * tcs_id +
+ RSC_DRV_CMD_OFFSET * cmd_id);
+}
+
+static void write_tcs_reg(struct rsc_drv *drv, int reg, int tcs_id, u32 data)
+{
+ writel_relaxed(data, drv->tcs_base + reg + RSC_DRV_TCS_OFFSET * tcs_id);
+}
+
+static void write_tcs_reg_sync(struct rsc_drv *drv, int reg, int tcs_id,
+ u32 data)
+{
+ writel(data, drv->tcs_base + reg + RSC_DRV_TCS_OFFSET * tcs_id);
+ for (;;) {
+ if (data == readl(drv->tcs_base + reg +
+ RSC_DRV_TCS_OFFSET * tcs_id))
+ break;
+ udelay(1);
+ }
+}
+
+static bool tcs_is_free(struct rsc_drv *drv, int tcs_id)
+{
+ return !test_bit(tcs_id, drv->tcs_in_use) &&
+ read_tcs_reg(drv, RSC_DRV_STATUS, tcs_id, 0);
+}
+
+static struct tcs_group *get_tcs_of_type(struct rsc_drv *drv, int type)
+{
+ return &drv->tcs[type];
+}
+
+static struct tcs_group *get_tcs_for_msg(struct rsc_drv *drv,
+ const struct tcs_request *msg)
+{
+ int type;
+
+ switch (msg->state) {
+ case RPMH_ACTIVE_ONLY_STATE:
+ type = ACTIVE_TCS;
+ break;
+ default:
+ return ERR_PTR(-EINVAL);
+ }
+
+ return get_tcs_of_type(drv, type);
+}
+
+static const struct tcs_request *get_req_from_tcs(struct rsc_drv *drv,
+ int tcs_id)
+{
+ struct tcs_group *tcs;
+ int i;
+
+ for (i = 0; i < drv->num_tcs; i++) {
+ tcs = &drv->tcs[i];
+ if (tcs->mask & BIT(tcs_id))
+ return tcs->req[tcs_id - tcs->offset];
+ }
+
+ return NULL;
+}
+
+/**
+ * tcs_tx_done: TX Done interrupt handler
+ */
+static irqreturn_t tcs_tx_done(int irq, void *p)
+{
+ struct rsc_drv *drv = p;
+ int i, j;
+ unsigned long irq_status;
+ const struct tcs_request *req;
+ struct tcs_cmd *cmd;
+
+ irq_status = read_tcs_reg(drv, RSC_DRV_IRQ_STATUS, 0, 0);
+
+ for_each_set_bit(i, &irq_status, BITS_PER_LONG) {
+ req = get_req_from_tcs(drv, i);
+ if (!req) {
+ WARN_ON(1);
+ goto skip;
+ }
+
+ for (j = 0; j < req->num_cmds; j++) {
+ u32 sts;
+
+ cmd = &req->cmds[j];
+ sts = read_tcs_reg(drv, RSC_DRV_CMD_STATUS, i, j);
+ if (!(sts & CMD_STATUS_ISSUED) ||
+ ((req->wait_for_compl || cmd->wait) &&
+ !(sts & CMD_STATUS_COMPL))) {
+ pr_err("Incomplete request: %s: addr=%#x data=%#x",
+ drv->name, cmd->addr, cmd->data);
+ }
+ }
+skip:
+ /* Reclaim the TCS */
+ write_tcs_reg(drv, RSC_DRV_CMD_ENABLE, i, 0);
+ write_tcs_reg(drv, RSC_DRV_IRQ_CLEAR, 0, BIT(i));
+ spin_lock(&drv->lock);
+ clear_bit(i, drv->tcs_in_use);
+ spin_unlock(&drv->lock);
+ }
+
+ return IRQ_HANDLED;
+}
+
+static void __tcs_buffer_write(struct rsc_drv *drv, int tcs_id, int cmd_id,
+ const struct tcs_request *msg)
+{
+ u32 msgid, cmd_msgid;
+ u32 cmd_enable = 0;
+ u32 cmd_complete;
+ struct tcs_cmd *cmd;
+ int i, j;
+
+ cmd_msgid = CMD_MSGID_LEN;
+ cmd_msgid |= msg->wait_for_compl ? CMD_MSGID_RESP_REQ : 0;
+ cmd_msgid |= CMD_MSGID_WRITE;
+
+ cmd_complete = read_tcs_reg(drv, RSC_DRV_CMD_WAIT_FOR_CMPL, tcs_id, 0);
+
+ for (i = 0, j = cmd_id; i < msg->num_cmds; i++, j++) {
+ cmd = &msg->cmds[i];
+ cmd_enable |= BIT(j);
+ cmd_complete |= cmd->wait << j;
+ msgid = cmd_msgid;
+ msgid |= cmd->wait ? CMD_MSGID_RESP_REQ : 0;
+ write_tcs_cmd(drv, RSC_DRV_CMD_MSGID, tcs_id, j, msgid);
+ write_tcs_cmd(drv, RSC_DRV_CMD_ADDR, tcs_id, j, cmd->addr);
+ write_tcs_cmd(drv, RSC_DRV_CMD_DATA, tcs_id, j, cmd->data);
+ }
+
+ write_tcs_reg(drv, RSC_DRV_CMD_WAIT_FOR_CMPL, tcs_id, cmd_complete);
+ cmd_enable |= read_tcs_reg(drv, RSC_DRV_CMD_ENABLE, tcs_id, 0);
+ write_tcs_reg(drv, RSC_DRV_CMD_ENABLE, tcs_id, cmd_enable);
+}
+
+static void __tcs_trigger(struct rsc_drv *drv, int tcs_id)
+{
+ u32 enable;
+
+ /*
+ * HW req: Clear the DRV_CONTROL and enable TCS again
+ * While clearing ensure that the AMC mode trigger is cleared
+ * and then the mode enable is cleared.
+ */
+ enable = read_tcs_reg(drv, RSC_DRV_CONTROL, tcs_id, 0);
+ enable &= ~TCS_AMC_MODE_TRIGGER;
+ write_tcs_reg_sync(drv, RSC_DRV_CONTROL, tcs_id, enable);
+ enable &= ~TCS_AMC_MODE_ENABLE;
+ write_tcs_reg_sync(drv, RSC_DRV_CONTROL, tcs_id, enable);
+
+ /* Enable the AMC mode on the TCS and then trigger the TCS */
+ enable = TCS_AMC_MODE_ENABLE;
+ write_tcs_reg_sync(drv, RSC_DRV_CONTROL, tcs_id, enable);
+ enable |= TCS_AMC_MODE_TRIGGER;
+ write_tcs_reg_sync(drv, RSC_DRV_CONTROL, tcs_id, enable);
+}
+
+static int check_for_req_inflight(struct rsc_drv *drv, struct tcs_group *tcs,
+ const struct tcs_request *msg)
+{
+ unsigned long curr_enabled;
+ u32 addr;
+ int i, j, k;
+ int tcs_id = tcs->offset;
+
+ for (i = 0; i < tcs->num_tcs; i++, tcs_id++) {
+ if (tcs_is_free(drv, tcs_id))
+ continue;
+
+ curr_enabled = read_tcs_reg(drv, RSC_DRV_CMD_ENABLE, tcs_id, 0);
+
+ for_each_set_bit(j, &curr_enabled, MAX_CMDS_PER_TCS) {
+ addr = read_tcs_reg(drv, RSC_DRV_CMD_ADDR, tcs_id, j);
+ for (k = 0; k < msg->num_cmds; k++) {
+ if (addr == msg->cmds[k].addr)
+ return -EBUSY;
+ }
+ }
+ }
+
+ return 0;
+}
+
+static int find_free_tcs(struct tcs_group *tcs)
+{
+ int i;
+
+ for (i = 0; i < tcs->num_tcs; i++) {
+ if (tcs_is_free(tcs->drv, tcs->offset + i))
+ return tcs->offset + i;
+ }
+
+ return -EBUSY;
+}
+
+static int tcs_write(struct rsc_drv *drv, const struct tcs_request *msg)
+{
+ struct tcs_group *tcs;
+ int tcs_id;
+ unsigned long flags;
+ int ret;
+
+ tcs = get_tcs_for_msg(drv, msg);
+ if (IS_ERR(tcs))
+ return PTR_ERR(tcs);
+
+ spin_lock_irqsave(&tcs->lock, flags);
+ spin_lock(&drv->lock);
+ /*
+ * The h/w does not like if we send a request to the same address,
+ * when one is already in-flight or being processed.
+ */
+ ret = check_for_req_inflight(drv, tcs, msg);
+ if (ret) {
+ spin_unlock(&drv->lock);
+ goto done_write;
+ }
+
+ tcs_id = find_free_tcs(tcs);
+ if (tcs_id < 0) {
+ ret = tcs_id;
+ spin_unlock(&drv->lock);
+ goto done_write;
+ }
+
+ tcs->req[tcs_id - tcs->offset] = msg;
+ set_bit(tcs_id, drv->tcs_in_use);
+ spin_unlock(&drv->lock);
+
+ __tcs_buffer_write(drv, tcs_id, 0, msg);
+ __tcs_trigger(drv, tcs_id);
+
+done_write:
+ spin_unlock_irqrestore(&tcs->lock, flags);
+ return ret;
+}
+
+/**
+ * rpmh_rsc_send_data: Validate the incoming message and write to the
+ * appropriate TCS block.
+ *
+ * @drv: the controller
+ * @msg: the data to be sent
+ *
+ * Return: 0 on success, -EINVAL on error.
+ * Note: This call blocks until a valid data is written to the TCS.
+ */
+int rpmh_rsc_send_data(struct rsc_drv *drv, const struct tcs_request *msg)
+{
+ int ret;
+
+ if (!msg || !msg->cmds || !msg->num_cmds ||
+ msg->num_cmds > MAX_RPMH_PAYLOAD) {
+ WARN_ON(1);
+ return -EINVAL;
+ }
+
+ do {
+ ret = tcs_write(drv, msg);
+ if (ret == -EBUSY) {
+ pr_info_ratelimited("TCS Busy, retrying RPMH message send: addr=%#x\n",
+ msg->cmds[0].addr);
+ udelay(10);
+ }
+ } while (ret == -EBUSY);
+
+ return ret;
+}
+EXPORT_SYMBOL(rpmh_rsc_send_data);
+
+static int rpmh_probe_tcs_config(struct platform_device *pdev,
+ struct rsc_drv *drv)
+{
+ struct tcs_type_config {
+ u32 type;
+ u32 n;
+ } tcs_cfg[TCS_TYPE_NR] = { { 0 } };
+ struct device_node *dn = pdev->dev.of_node;
+ u32 config, max_tcs, ncpt, offset;
+ int i, ret, n, st = 0;
+ struct tcs_group *tcs;
+ struct resource *res;
+ void __iomem *base;
+ char drv_id[10] = {0};
+
+ snprintf(drv_id, ARRAY_SIZE(drv_id), "drv-%d", drv->id);
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM, drv_id);
+ base = devm_ioremap_resource(&pdev->dev, res);
+ if (IS_ERR(base))
+ return PTR_ERR(base);
+
+ ret = of_property_read_u32(dn, "qcom,tcs-offset", &offset);
+ if (ret)
+ return ret;
+ drv->tcs_base = base + offset;
+
+ config = readl_relaxed(base + DRV_PRNT_CHLD_CONFIG);
+
+ max_tcs = config;
+ max_tcs &= DRV_NUM_TCS_MASK << (DRV_NUM_TCS_SHIFT * drv->id);
+ max_tcs = max_tcs >> (DRV_NUM_TCS_SHIFT * drv->id);
+
+ ncpt = config & (DRV_NCPT_MASK << DRV_NCPT_SHIFT);
+ ncpt = ncpt >> DRV_NCPT_SHIFT;
+
+ n = of_property_count_u32_elems(dn, "qcom,tcs-config");
+ if (n != 2 * TCS_TYPE_NR)
+ return -EINVAL;
+
+ for (i = 0; i < TCS_TYPE_NR; i++) {
+ ret = of_property_read_u32_index(dn, "qcom,tcs-config",
+ i * 2, &tcs_cfg[i].type);
+ if (ret)
+ return ret;
+ if (tcs_cfg[i].type >= TCS_TYPE_NR)
+ return -EINVAL;
+
+ ret = of_property_read_u32_index(dn, "qcom,tcs-config",
+ i * 2 + 1, &tcs_cfg[i].n);
+ if (ret)
+ return ret;
+ if (tcs_cfg[i].n > MAX_TCS_PER_TYPE)
+ return -EINVAL;
+ }
+
+ for (i = 0; i < TCS_TYPE_NR; i++) {
+ tcs = &drv->tcs[tcs_cfg[i].type];
+ if (tcs->drv)
+ return -EINVAL;
+ tcs->drv = drv;
+ tcs->type = tcs_cfg[i].type;
+ tcs->num_tcs = tcs_cfg[i].n;
+ tcs->ncpt = ncpt;
+ spin_lock_init(&tcs->lock);
+
+ if (!tcs->num_tcs || tcs->type == CONTROL_TCS)
+ continue;
+
+ if (st + tcs->num_tcs > max_tcs ||
+ st + tcs->num_tcs >= BITS_PER_BYTE * sizeof(tcs->mask))
+ return -EINVAL;
+
+ tcs->mask = ((1 << tcs->num_tcs) - 1) << st;
+ tcs->offset = st;
+ st += tcs->num_tcs;
+ }
+
+ drv->num_tcs = st;
+
+ return 0;
+}
+
+static int rpmh_rsc_probe(struct platform_device *pdev)
+{
+ struct device_node *dn = pdev->dev.of_node;
+ struct rsc_drv *drv;
+ int ret, irq;
+
+ drv = devm_kzalloc(&pdev->dev, sizeof(*drv), GFP_KERNEL);
+ if (!drv)
+ return -ENOMEM;
+
+ ret = of_property_read_u32(dn, "qcom,drv-id", &drv->id);
+ if (ret)
+ return ret;
+
+ drv->name = of_get_property(dn, "label", NULL);
+ if (!drv->name)
+ drv->name = dev_name(&pdev->dev);
+
+ ret = rpmh_probe_tcs_config(pdev, drv);
+ if (ret)
+ return ret;
+
+ spin_lock_init(&drv->lock);
+ bitmap_zero(drv->tcs_in_use, MAX_TCS_NR);
+
+ irq = platform_get_irq(pdev, drv->id);
+ if (irq < 0)
+ return irq;
+
+ ret = devm_request_irq(&pdev->dev, irq, tcs_tx_done,
+ IRQF_TRIGGER_HIGH | IRQF_NO_SUSPEND,
+ drv->name, drv);
+ if (ret)
+ return ret;
+
+ /* Enable the active TCS to send requests immediately */
+ write_tcs_reg(drv, RSC_DRV_IRQ_ENABLE, 0, drv->tcs[ACTIVE_TCS].mask);
+
+ return devm_of_platform_populate(&pdev->dev);
+}
+
+static const struct of_device_id rpmh_drv_match[] = {
+ { .compatible = "qcom,rpmh-rsc", },
+ { }
+};
+
+static struct platform_driver rpmh_driver = {
+ .probe = rpmh_rsc_probe,
+ .driver = {
+ .name = "rpmh",
+ .of_match_table = rpmh_drv_match,
+ },
+};
+
+static int __init rpmh_driver_init(void)
+{
+ return platform_driver_register(&rpmh_driver);
+}
+arch_initcall(rpmh_driver_init);
diff --git a/include/dt-bindings/soc/qcom,rpmh-rsc.h b/include/dt-bindings/soc/qcom,rpmh-rsc.h
new file mode 100644
index 000000000000..868f998ea998
--- /dev/null
+++ b/include/dt-bindings/soc/qcom,rpmh-rsc.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
+ */
+
+#ifndef __DT_QCOM_RPMH_RSC_H__
+#define __DT_QCOM_RPMH_RSC_H__
+
+#define SLEEP_TCS 0
+#define WAKE_TCS 1
+#define ACTIVE_TCS 2
+#define CONTROL_TCS 3
+
+#endif /* __DT_QCOM_RPMH_RSC_H__ */
diff --git a/include/soc/qcom/tcs.h b/include/soc/qcom/tcs.h
new file mode 100644
index 000000000000..262876a59e86
--- /dev/null
+++ b/include/soc/qcom/tcs.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
+ */
+
+#ifndef __SOC_QCOM_TCS_H__
+#define __SOC_QCOM_TCS_H__
+
+#define MAX_RPMH_PAYLOAD 16
+
+/**
+ * rpmh_state: state for the request
+ *
+ * RPMH_SLEEP_STATE: State of the resource when the processor subsystem
+ * is powered down. There is no client using the
+ * resource actively.
+ * RPMH_WAKE_ONLY_STATE: Resume resource state to the value previously
+ * requested before the processor was powered down.
+ * RPMH_ACTIVE_ONLY_STATE: Active or AMC mode requests. Resource state
+ * is aggregated immediately.
+ */
+enum rpmh_state {
+ RPMH_SLEEP_STATE,
+ RPMH_WAKE_ONLY_STATE,
+ RPMH_ACTIVE_ONLY_STATE,
+};
+
+/**
+ * struct tcs_cmd: an individual request to RPMH.
+ *
+ * @addr: the address of the resource slv_id:18:16 | offset:0:15
+ * @data: the resource state request
+ * @wait: wait for this request to be complete before sending the next
+ */
+struct tcs_cmd {
+ u32 addr;
+ u32 data;
+ u32 wait;
+};
+
+/**
+ * struct tcs_request: A set of tcs_cmds sent together in a TCS
+ *
+ * @state: state for the request.
+ * @wait_for_compl: wait until we get a response from the h/w accelerator
+ * @num_cmds: the number of @cmds in this request
+ * @cmds: an array of tcs_cmds
+ */
+struct tcs_request {
+ enum rpmh_state state;
+ u32 wait_for_compl;
+ u32 num_cmds;
+ struct tcs_cmd *cmds;
+};
+
+#endif /* __SOC_QCOM_TCS_H__ */
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Allow sleep and wake commands to be cleared from the respective TCSes,
so that they can be re-populated.
Signed-off-by: Lina Iyer <[email protected]>
---
Changes in v7:
- Move bitmap_zero() outside the loop
Changes in v6:
- remove unnecessary locks around __tcs_invalidate
- rename function to tcs_invaldiate
Changes in v4:
- refactored the rphm_rsc_invalidate()
---
drivers/soc/qcom/rpmh-rsc.c | 45 +++++++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)
diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
index b5894b001ae1..68c25ebbbe09 100644
--- a/drivers/soc/qcom/rpmh-rsc.c
+++ b/drivers/soc/qcom/rpmh-rsc.c
@@ -104,6 +104,51 @@ static struct tcs_group *get_tcs_of_type(struct rsc_drv *drv, int type)
return &drv->tcs[type];
}
+static int tcs_invalidate(struct rsc_drv *drv, int type)
+{
+ int m;
+ struct tcs_group *tcs;
+
+ tcs = get_tcs_of_type(drv, type);
+ if (IS_ERR(tcs))
+ return PTR_ERR(tcs);
+
+ spin_lock(&tcs->lock);
+ if (bitmap_empty(tcs->slots, MAX_TCS_SLOTS)) {
+ spin_unlock(&tcs->lock);
+ return 0;
+ }
+
+ for (m = tcs->offset; m < tcs->offset + tcs->num_tcs; m++) {
+ if (!tcs_is_free(drv, m)) {
+ spin_unlock(&tcs->lock);
+ return -EAGAIN;
+ }
+ write_tcs_reg_sync(drv, RSC_DRV_CMD_ENABLE, m, 0);
+ }
+ bitmap_zero(tcs->slots, MAX_TCS_SLOTS);
+ spin_unlock(&tcs->lock);
+
+ return 0;
+}
+
+/**
+ * rpmh_rsc_invalidate - Invalidate sleep and wake TCSes
+ *
+ * @drv: the RSC controller
+ */
+int rpmh_rsc_invalidate(struct rsc_drv *drv)
+{
+ int ret;
+
+ ret = tcs_invalidate(drv, SLEEP_TCS);
+ if (!ret)
+ ret = tcs_invalidate(drv, WAKE_TCS);
+
+ return ret;
+}
+EXPORT_SYMBOL(rpmh_rsc_invalidate);
+
static struct tcs_group *get_tcs_for_msg(struct rsc_drv *drv,
const struct tcs_request *msg)
{
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Active state requests are sent immediately to the RSC controller, while
sleep and wake state requests are cached in this driver to avoid taxing
the RSC controller repeatedly. The cached values will be sent to the
controller when the rpmh_flush() is called.
Generally, flushing is a system PM activity and may be called from the
system PM drivers when the system is entering suspend or deeper sleep
modes during cpuidle.
Also allow invalidating the cached requests, so they may be re-populated
again.
Signed-off-by: Lina Iyer <[email protected]>
Reviewed-by: Evan Green <[email protected]>
Reviewed-by: Matthias Kaehlcke <[email protected]>
---
Changes in v6:
- replace rpmh_client with device *
Changes in v4:
- remove locking for ->dirty in invalidate
- fix send_single
Changes in v3:
- Remove locking for flush function
- Improve comments
---
drivers/soc/qcom/rpmh.c | 217 +++++++++++++++++++++++++++++++++++++++-
include/soc/qcom/rpmh.h | 11 ++
2 files changed, 223 insertions(+), 5 deletions(-)
diff --git a/drivers/soc/qcom/rpmh.c b/drivers/soc/qcom/rpmh.c
index 74bb82339b01..23fcbd9487cd 100644
--- a/drivers/soc/qcom/rpmh.c
+++ b/drivers/soc/qcom/rpmh.c
@@ -7,10 +7,12 @@
#include <linux/interrupt.h>
#include <linux/jiffies.h>
#include <linux/kernel.h>
+#include <linux/list.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/platform_device.h>
#include <linux/slab.h>
+#include <linux/spinlock.h>
#include <linux/types.h>
#include <linux/wait.h>
@@ -33,6 +35,21 @@
.dev = dev, \
}
+/**
+ * struct cache_req: the request object for caching
+ *
+ * @addr: the address of the resource
+ * @sleep_val: the sleep vote
+ * @wake_val: the wake vote
+ * @list: linked list obj
+ */
+struct cache_req {
+ u32 addr;
+ u32 sleep_val;
+ u32 wake_val;
+ struct list_head list;
+};
+
/**
* struct rpmh_request: the message to be sent to rpmh-rsc
*
@@ -54,9 +71,15 @@ struct rpmh_request {
* struct rpmh_ctrlr: our representation of the controller
*
* @drv: the controller instance
+ * @cache: the list of cached requests
+ * @lock: synchronize access to the controller data
+ * @dirty: was the cache updated since flush
*/
struct rpmh_ctrlr {
struct rsc_drv *drv;
+ struct list_head cache;
+ spinlock_t lock;
+ bool dirty;
};
static struct rpmh_ctrlr rpmh_rsc[RPMH_MAX_CTRLR];
@@ -91,6 +114,8 @@ static struct rpmh_ctrlr *get_rpmh_ctrlr(const struct device *dev)
break;
}
rpmh_rsc[i].drv = drv;
+ spin_lock_init(&rpmh_rsc[i].lock);
+ INIT_LIST_HEAD(&rpmh_rsc[i].cache);
ctrlr = &rpmh_rsc[i];
break;
}
@@ -118,29 +143,109 @@ void rpmh_tx_done(const struct tcs_request *msg, int r)
}
EXPORT_SYMBOL(rpmh_tx_done);
+static struct cache_req *__find_req(struct rpmh_ctrlr *ctrlr, u32 addr)
+{
+ struct cache_req *p, *req = NULL;
+
+ list_for_each_entry(p, &ctrlr->cache, list) {
+ if (p->addr == addr) {
+ req = p;
+ break;
+ }
+ }
+
+ return req;
+}
+
+static struct cache_req *cache_rpm_request(struct rpmh_ctrlr *ctrlr,
+ enum rpmh_state state,
+ struct tcs_cmd *cmd)
+{
+ struct cache_req *req;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ctrlr->lock, flags);
+ req = __find_req(ctrlr, cmd->addr);
+ if (req)
+ goto existing;
+
+ req = kzalloc(sizeof(*req), GFP_ATOMIC);
+ if (!req) {
+ req = ERR_PTR(-ENOMEM);
+ goto unlock;
+ }
+
+ req->addr = cmd->addr;
+ req->sleep_val = req->wake_val = UINT_MAX;
+ INIT_LIST_HEAD(&req->list);
+ list_add_tail(&req->list, &ctrlr->cache);
+
+existing:
+ switch (state) {
+ case RPMH_ACTIVE_ONLY_STATE:
+ if (req->sleep_val != UINT_MAX)
+ req->wake_val = cmd->data;
+ break;
+ case RPMH_WAKE_ONLY_STATE:
+ req->wake_val = cmd->data;
+ break;
+ case RPMH_SLEEP_STATE:
+ req->sleep_val = cmd->data;
+ break;
+ default:
+ break;
+ };
+
+ ctrlr->dirty = true;
+unlock:
+ spin_unlock_irqrestore(&ctrlr->lock, flags);
+
+ return req;
+}
+
/**
- * __rpmh_write: send the RPMH request
+ * __rpmh_write: Cache and send the RPMH request
*
* @dev: The device making the request
* @state: Active/Sleep request type
* @rpm_msg: The data that needs to be sent (cmds).
+ *
+ * Cache the RPMH request and send if the state is ACTIVE_ONLY.
+ * SLEEP/WAKE_ONLY requests are not sent to the controller at
+ * this time. Use rpmh_flush() to send them to the controller.
*/
static int __rpmh_write(const struct device *dev, enum rpmh_state state,
struct rpmh_request *rpm_msg)
{
struct rpmh_ctrlr *ctrlr = get_rpmh_ctrlr(dev);
+ int ret = -EINVAL;
+ struct cache_req *req;
+ int i;
if (IS_ERR(ctrlr))
return PTR_ERR(ctrlr);
rpm_msg->msg.state = state;
- if (state != RPMH_ACTIVE_ONLY_STATE)
- return -EINVAL;
+ /* Cache the request in our store and link the payload */
+ for (i = 0; i < rpm_msg->msg.num_cmds; i++) {
+ req = cache_rpm_request(ctrlr, state, &rpm_msg->msg.cmds[i]);
+ if (IS_ERR(req))
+ return PTR_ERR(req);
+ }
+
+ rpm_msg->msg.state = state;
- WARN_ON(irqs_disabled());
+ if (state == RPMH_ACTIVE_ONLY_STATE) {
+ WARN_ON(irqs_disabled());
+ ret = rpmh_rsc_send_data(ctrlr->drv, &rpm_msg->msg);
+ } else {
+ ret = rpmh_rsc_write_ctrl_data(ctrlr->drv, &rpm_msg->msg);
+ /* Clean up our call by spoofing tx_done */
+ rpmh_tx_done(&rpm_msg->msg, ret);
+ }
- return rpmh_rsc_send_data(ctrlr->drv, &rpm_msg->msg);
+ return ret;
}
/**
@@ -174,3 +279,105 @@ int rpmh_write(const struct device *dev, enum rpmh_state state,
return (ret > 0) ? 0 : -ETIMEDOUT;
}
EXPORT_SYMBOL(rpmh_write);
+
+static int is_req_valid(struct cache_req *req)
+{
+ return (req->sleep_val != UINT_MAX &&
+ req->wake_val != UINT_MAX &&
+ req->sleep_val != req->wake_val);
+}
+
+static int send_single(const struct device *dev, enum rpmh_state state,
+ u32 addr, u32 data)
+{
+ DEFINE_RPMH_MSG_ONSTACK(dev, state, NULL, rpm_msg);
+ struct rpmh_ctrlr *ctrlr = get_rpmh_ctrlr(dev);
+
+ if (IS_ERR(ctrlr))
+ return PTR_ERR(ctrlr);
+
+ /* Wake sets are always complete and sleep sets are not */
+ rpm_msg.msg.wait_for_compl = (state == RPMH_WAKE_ONLY_STATE);
+ rpm_msg.cmd[0].addr = addr;
+ rpm_msg.cmd[0].data = data;
+ rpm_msg.msg.num_cmds = 1;
+
+ return rpmh_rsc_write_ctrl_data(ctrlr->drv, &rpm_msg.msg);
+}
+
+/**
+ * rpmh_flush: Flushes the buffered active and sleep sets to TCS
+ *
+ * @dev: The device making the request
+ *
+ * Return: -EBUSY if the controller is busy, probably waiting on a response
+ * to a RPMH request sent earlier.
+ *
+ * This function is generally called from the sleep code from the last CPU
+ * that is powering down the entire system. Since no other RPMH API would be
+ * executing at this time, it is safe to run lockless.
+ */
+int rpmh_flush(const struct device *dev)
+{
+ struct cache_req *p;
+ struct rpmh_ctrlr *ctrlr = get_rpmh_ctrlr(dev);
+ int ret;
+
+ if (IS_ERR(ctrlr))
+ return PTR_ERR(ctrlr);
+
+ if (!ctrlr->dirty) {
+ pr_debug("Skipping flush, TCS has latest data.\n");
+ return 0;
+ }
+
+ /*
+ * Nobody else should be calling this function other than system PM,
+ * hence we can run without locks.
+ */
+ list_for_each_entry(p, &ctrlr->cache, list) {
+ if (!is_req_valid(p)) {
+ pr_debug("%s: skipping RPMH req: a:%#x s:%#x w:%#x",
+ __func__, p->addr, p->sleep_val, p->wake_val);
+ continue;
+ }
+ ret = send_single(dev, RPMH_SLEEP_STATE, p->addr, p->sleep_val);
+ if (ret)
+ return ret;
+ ret = send_single(dev, RPMH_WAKE_ONLY_STATE,
+ p->addr, p->wake_val);
+ if (ret)
+ return ret;
+ }
+
+ ctrlr->dirty = false;
+
+ return 0;
+}
+EXPORT_SYMBOL(rpmh_flush);
+
+/**
+ * rpmh_invalidate: Invalidate all sleep and active sets
+ * sets.
+ *
+ * @dev: The device making the request
+ *
+ * Invalidate the sleep and active values in the TCS blocks.
+ */
+int rpmh_invalidate(const struct device *dev)
+{
+ struct rpmh_ctrlr *ctrlr = get_rpmh_ctrlr(dev);
+ int ret;
+
+ if (IS_ERR(ctrlr))
+ return PTR_ERR(ctrlr);
+
+ ctrlr->dirty = true;
+
+ do {
+ ret = rpmh_rsc_invalidate(ctrlr->drv);
+ } while (ret == -EAGAIN);
+
+ return ret;
+}
+EXPORT_SYMBOL(rpmh_invalidate);
diff --git a/include/soc/qcom/rpmh.h b/include/soc/qcom/rpmh.h
index c1d0f902bd71..42e62a0d26d8 100644
--- a/include/soc/qcom/rpmh.h
+++ b/include/soc/qcom/rpmh.h
@@ -14,12 +14,23 @@
int rpmh_write(const struct device *dev, enum rpmh_state state,
const struct tcs_cmd *cmd, u32 n);
+int rpmh_flush(const struct device *dev);
+
+int rpmh_invalidate(const struct device *dev);
+
#else
static inline int rpmh_write(const struct device *dev, enum rpmh_state state,
const struct tcs_cmd *cmd, u32 n)
{ return -ENODEV; }
+
+static inline int rpmh_flush(const struct device *dev)
+{ return -ENODEV; }
+
+static inline int rpmh_invalidate(const struct device *dev)
+{ return -ENODEV; }
+
#endif /* CONFIG_QCOM_RPMH */
#endif /* __SOC_QCOM_RPMH_H__ */
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
On Wed, 9 May 2018 11:01:52 -0600
Lina Iyer <[email protected]> wrote:
> Log sent RPMH requests and interrupt responses in FTRACE.
Has this changed since the last time I reviewed it? If not, please add
the: Reviewed-by: Steven Rostedt (VMware) <[email protected]>
to you patch queue.
-- Steve
>
> Cc: Steven Rostedt <[email protected]>
> Signed-off-by: Lina Iyer <[email protected]>
> ---
>
> Changes in v7:
> - varible name changes and white space
>
> Changes in v6:
> - struct tcs_response was removed. Fix in trace as well.
> Changes in v4:
> - fix compilation issues, use __assign_str
> - use %#x instead of 0x%08x
> Changes in v3:
> - Use __string() instead of char *
> - fix TRACE_INCLUDE_PATH
> ---
On Wed, May 09, 2018 at 11:01:58AM -0600, Lina Iyer wrote:
> Platform drivers need make a lot of resource state requests at the same
> time, say, at the start or end of an usecase. It can be quite
> inefficient to send each request separately. Instead they can give the
> RPMH library a batch of requests to be sent and wait on the whole
> transaction to be complete.
>
> rpmh_write_batch() is a blocking call that can be used to send multiple
> RPMH command sets. Each RPMH command set is set asynchronously and the
> API blocks until all the command sets are complete and receive their
> tx_done callbacks.
>
> Signed-off-by: Lina Iyer <[email protected]>
> ---
>
> Changes in v7:
> - Check for loop out of bounds
>
> Changes in v6:
> - replace rpmh_client with device *
> Changes in v4:
> - reorganize rpmh_write_batch()
> - introduce wait_count here, instead of patch#4
> ---
> drivers/soc/qcom/rpmh.c | 154 +++++++++++++++++++++++++++++++++++++++-
> include/soc/qcom/rpmh.h | 8 +++
> 2 files changed, 160 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/soc/qcom/rpmh.c b/drivers/soc/qcom/rpmh.c
> index 1bb62876795c..a0e277b4b846 100644
> --- a/drivers/soc/qcom/rpmh.c
> +++ b/drivers/soc/qcom/rpmh.c
>
> ...
>
> +static int flush_batch(struct rpmh_ctrlr *ctrlr)
> +{
> + const struct rpmh_request *rpm_msg;
> + unsigned long flags;
> + int ret = 0;
> + int i;
> +
> + /* Send Sleep/Wake requests to the controller, expect no response */
> + spin_lock_irqsave(&ctrlr->lock, flags);
> + for (i = 0; ctrlr->batch_cache[i]; i++) {
I missed this earlier: the loop goes beyond ctrlr->batch_cache when
the batch cache is full.
> + rpm_msg = ctrlr->batch_cache[i];
> + ret = rpmh_rsc_write_ctrl_data(ctrlr->drv, &rpm_msg->msg);
> + if (ret)
> + break;
> + }
> + spin_unlock_irqrestore(&ctrlr->lock, flags);
> +
> + return ret;
> +}
> +
> +static void invalidate_batch(struct rpmh_ctrlr *ctrlr)
> +{
> + unsigned long flags;
> + int index = 0;
> +
> + spin_lock_irqsave(&ctrlr->lock, flags);
> + while (ctrlr->batch_cache[index]) {
This still goes beyond ctrlr->batch_cache when the batch cache is
full.
Hi Lina,
On Wed, May 09, 2018 at 11:01:54AM -0600, Lina Iyer wrote:
> Sleep and wake requests are sent when the application processor
> subsystem of the SoC is entering deep sleep states like in suspend.
> These requests help lower the system power requirements when the
> resources are not in use.
>
> Sleep and wake requests are written to the TCS slots but are not
> triggered at the time of writing. The TCS are triggered by the firmware
> after the last of the CPUs has executed its WFI. Since these requests
> may come in different batches of requests, it is the job of this
> controller driver to find and arrange the requests into the available
> TCSes.
>
> Signed-off-by: Lina Iyer <[email protected]>
> Reviewed-by: Evan Green <[email protected]>
>
> ---
>
> diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
> index c0edf3850147..b5894b001ae1 100644
> --- a/drivers/soc/qcom/rpmh-rsc.c
> +++ b/drivers/soc/qcom/rpmh-rsc.c
>
> <snip>
>
> +static int find_match(const struct tcs_group *tcs, const struct tcs_cmd *cmd,
> + int len)
> +{
> + int i, j;
> +
> + /* Check for already cached commands */
> + for_each_set_bit(i, tcs->slots, MAX_TCS_SLOTS) {
> + if (tcs->cmd_cache[i] != cmd[0].addr)
> + continue;
> + if (i + len >= MAX_TCS_SLOTS)
> + goto seq_err;
The command cache can have less than MAX_TCS_SLOTS slot:
static int rpmh_probe_tcs_config(struct platform_device *pdev,
struct rsc_drv *drv)
{
...
tcs->cmd_cache = devm_kcalloc(&pdev->dev,
tcs->num_tcs * ncpt, sizeof(u32),
GFP_KERNEL);
...
}
So the condition needs to be:
if (i + len >= tcs->num_tcs * tcs->ncpt)
> +static int find_slots(struct tcs_group *tcs, const struct tcs_request *msg,
> + int *tcs_id, int *cmd_id)
> +{
> + int slot, offset;
> + int i = 0;
> +
> + /* Find if we already have the msg in our TCS */
> + slot = find_match(tcs, msg->cmds, msg->num_cmds);
> + if (slot >= 0)
> + goto copy_data;
> +
> + /* Do over, until we can fit the full payload in a TCS */
> + do {
> + slot = bitmap_find_next_zero_area(tcs->slots, MAX_TCS_SLOTS,
> + i, msg->num_cmds, 0);
> + if (slot == MAX_TCS_SLOTS)
> + return -ENOMEM;
Like above, use 'tcs->num_tcs * tcs->ncpt' as maximum instead of
MAX_TCS_SLOTS.
> +static int tcs_ctrl_write(struct rsc_drv *drv, const struct tcs_request *msg)
> +{
> + struct tcs_group *tcs;
> + int tcs_id = 0, cmd_id = 0;
> + unsigned long flags;
> + int ret;
> +
> + tcs = get_tcs_for_msg(drv, msg);
> + if (IS_ERR(tcs))
> + return PTR_ERR(tcs);
> +
> + spin_lock_irqsave(&tcs->lock, flags);
> + /* find the m-th TCS and the n-th position in the TCS to write to */
The comment still refers to the old names 'm' and 'n'.
Thanks
Matthias
On Wed, May 09, 2018 at 11:01:57AM -0600, Lina Iyer wrote:
> Platform drivers that want to send a request but do not want to block
> until the RPMH request completes have now a new API -
> rpmh_write_async().
>
> The API allocates memory and send the requests and returns the control
> back to the platform driver. The tx_done callback from the controller is
> handled in the context of the controller's thread and frees the
> allocated memory. This API allows RPMH requests from atomic contexts as
> well.
>
> Signed-off-by: Lina Iyer <[email protected]>
Reviewed-by: Matthias Kaehlcke <[email protected]>
On Wed, May 09 2018 at 11:49 -0600, Steven Rostedt wrote:
>On Wed, 9 May 2018 11:01:52 -0600
>Lina Iyer <[email protected]> wrote:
>
>> Log sent RPMH requests and interrupt responses in FTRACE.
>
>Has this changed since the last time I reviewed it? If not, please add
>the: Reviewed-by: Steven Rostedt (VMware) <[email protected]>
>to you patch queue.
>
Sorry, no it did not change in v8.
I will add your reviewed-by tag. Thanks Steve.
-- Lina
>-- Steve
>
>>
>> Cc: Steven Rostedt <[email protected]>
>> Signed-off-by: Lina Iyer <[email protected]>
>> ---
>>
>> Changes in v7:
>> - varible name changes and white space
>>
>> Changes in v6:
>> - struct tcs_response was removed. Fix in trace as well.
>> Changes in v4:
>> - fix compilation issues, use __assign_str
>> - use %#x instead of 0x%08x
>> Changes in v3:
>> - Use __string() instead of char *
>> - fix TRACE_INCLUDE_PATH
>> ---
On Wed, May 09 2018 at 17:25 -0600, Matthias Kaehlcke wrote:
>Hi Lina,
>
>On Wed, May 09, 2018 at 11:01:54AM -0600, Lina Iyer wrote:
>> Sleep and wake requests are sent when the application processor
>> subsystem of the SoC is entering deep sleep states like in suspend.
>> These requests help lower the system power requirements when the
>> resources are not in use.
>>
>> Sleep and wake requests are written to the TCS slots but are not
>> triggered at the time of writing. The TCS are triggered by the firmware
>> after the last of the CPUs has executed its WFI. Since these requests
>> may come in different batches of requests, it is the job of this
>> controller driver to find and arrange the requests into the available
>> TCSes.
>>
>> Signed-off-by: Lina Iyer <[email protected]>
>> Reviewed-by: Evan Green <[email protected]>
>>
>> ---
>>
>> diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
>> index c0edf3850147..b5894b001ae1 100644
>> --- a/drivers/soc/qcom/rpmh-rsc.c
>> +++ b/drivers/soc/qcom/rpmh-rsc.c
>>
>> <snip>
>>
>> +static int find_match(const struct tcs_group *tcs, const struct tcs_cmd *cmd,
>> + int len)
>> +{
>> + int i, j;
>> +
>> + /* Check for already cached commands */
>> + for_each_set_bit(i, tcs->slots, MAX_TCS_SLOTS) {
>> + if (tcs->cmd_cache[i] != cmd[0].addr)
>> + continue;
>> + if (i + len >= MAX_TCS_SLOTS)
>> + goto seq_err;
>
>The command cache can have less than MAX_TCS_SLOTS slot:
>
That's true. I forgot that I had optimized the cache slots. Thanks for
pointing out.
>static int rpmh_probe_tcs_config(struct platform_device *pdev,
> struct rsc_drv *drv)
>{
> ...
> tcs->cmd_cache = devm_kcalloc(&pdev->dev,
> tcs->num_tcs * ncpt, sizeof(u32),
> GFP_KERNEL);
> ...
>}
>
>So the condition needs to be:
>
>if (i + len >= tcs->num_tcs * tcs->ncpt)
>
>> +static int find_slots(struct tcs_group *tcs, const struct tcs_request *msg,
>> + int *tcs_id, int *cmd_id)
>> +{
>> + int slot, offset;
>> + int i = 0;
>> +
>> + /* Find if we already have the msg in our TCS */
>> + slot = find_match(tcs, msg->cmds, msg->num_cmds);
>> + if (slot >= 0)
>> + goto copy_data;
>> +
>> + /* Do over, until we can fit the full payload in a TCS */
>> + do {
>> + slot = bitmap_find_next_zero_area(tcs->slots, MAX_TCS_SLOTS,
>> + i, msg->num_cmds, 0);
>> + if (slot == MAX_TCS_SLOTS)
>> + return -ENOMEM;
>
>Like above, use 'tcs->num_tcs * tcs->ncpt' as maximum instead of
>MAX_TCS_SLOTS.
>
>> +static int tcs_ctrl_write(struct rsc_drv *drv, const struct tcs_request *msg)
>> +{
>> + struct tcs_group *tcs;
>> + int tcs_id = 0, cmd_id = 0;
>> + unsigned long flags;
>> + int ret;
>> +
>> + tcs = get_tcs_for_msg(drv, msg);
>> + if (IS_ERR(tcs))
>> + return PTR_ERR(tcs);
>> +
>> + spin_lock_irqsave(&tcs->lock, flags);
>> + /* find the m-th TCS and the n-th position in the TCS to write to */
>
>The comment still refers to the old names 'm' and 'n'.
>
Really? :)
Will fix.
Thanks for your review,
Lina
On Wed, May 09 2018 at 16:03 -0600, Matthias Kaehlcke wrote:
>On Wed, May 09, 2018 at 11:01:58AM -0600, Lina Iyer wrote:
>> Platform drivers need make a lot of resource state requests at the same
>> time, say, at the start or end of an usecase. It can be quite
>> inefficient to send each request separately. Instead they can give the
>> RPMH library a batch of requests to be sent and wait on the whole
>> transaction to be complete.
>>
>> rpmh_write_batch() is a blocking call that can be used to send multiple
>> RPMH command sets. Each RPMH command set is set asynchronously and the
>> API blocks until all the command sets are complete and receive their
>> tx_done callbacks.
>>
>> Signed-off-by: Lina Iyer <[email protected]>
>> ---
>>
>> Changes in v7:
>> - Check for loop out of bounds
>>
>> Changes in v6:
>> - replace rpmh_client with device *
>> Changes in v4:
>> - reorganize rpmh_write_batch()
>> - introduce wait_count here, instead of patch#4
>> ---
>> drivers/soc/qcom/rpmh.c | 154 +++++++++++++++++++++++++++++++++++++++-
>> include/soc/qcom/rpmh.h | 8 +++
>> 2 files changed, 160 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/soc/qcom/rpmh.c b/drivers/soc/qcom/rpmh.c
>> index 1bb62876795c..a0e277b4b846 100644
>> --- a/drivers/soc/qcom/rpmh.c
>> +++ b/drivers/soc/qcom/rpmh.c
>>
>> ...
>>
>> +static int flush_batch(struct rpmh_ctrlr *ctrlr)
>> +{
>> + const struct rpmh_request *rpm_msg;
>> + unsigned long flags;
>> + int ret = 0;
>> + int i;
>> +
>> + /* Send Sleep/Wake requests to the controller, expect no response */
>> + spin_lock_irqsave(&ctrlr->lock, flags);
>> + for (i = 0; ctrlr->batch_cache[i]; i++) {
>
>I missed this earlier: the loop goes beyond ctrlr->batch_cache when
>the batch cache is full.
>
>> + rpm_msg = ctrlr->batch_cache[i];
>> + ret = rpmh_rsc_write_ctrl_data(ctrlr->drv, &rpm_msg->msg);
>> + if (ret)
>> + break;
>> + }
>> + spin_unlock_irqrestore(&ctrlr->lock, flags);
>> +
>> + return ret;
>> +}
>> +
>> +static void invalidate_batch(struct rpmh_ctrlr *ctrlr)
>> +{
>> + unsigned long flags;
>> + int index = 0;
>> +
>> + spin_lock_irqsave(&ctrlr->lock, flags);
>> + while (ctrlr->batch_cache[index]) {
>
>This still goes beyond ctrlr->batch_cache when the batch cache is
>full.
I will check through the code for all out-of-bounds. Seems like I have
not worried about it too much. Well the space here assigned is beyond
what we see on a production device. Neverthless, it needs to be checked.
Thanks for your review Matthias.
-- Lina
Hi,
On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
> +int rpmh_rsc_send_data(struct rsc_drv *drv, const struct tcs_request *msg)
> +{
> + int ret;
> +
> + if (!msg || !msg->cmds || !msg->num_cmds ||
> + msg->num_cmds > MAX_RPMH_PAYLOAD) {
> + WARN_ON(1);
> + return -EINVAL;
> + }
> +
> + do {
> + ret = tcs_write(drv, msg);
> + if (ret == -EBUSY) {
> + pr_info_ratelimited("TCS Busy, retrying RPMH message send: addr=%#x\n",
> + msg->cmds[0].addr);
> + udelay(10);
> + }
> + } while (ret == -EBUSY);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL(rpmh_rsc_send_data);
Here and elsewhere in this series: why EXPORT_SYMBOL in this case?
This is only exported to rpmh.c, right? You don't need EXPORT_SYMBOL
for that. The Makefile puts rpmh.c and rpmh-rsc.c together in the
same "qcom_rpmh.o", and then even further the KConfig lists this as
bool so both are builtin to the kernel.
-Doug
Hi,
On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
> /**
> @@ -137,6 +140,8 @@ void rpmh_tx_done(const struct tcs_request *msg, int r)
> dev_err(rpm_msg->dev, "RPMH TX fail in msg addr=%#x, err=%d\n",
> rpm_msg->msg.cmds[0].addr, r);
>
> + kfree(rpm_msg->free);
> +
The way the code is written makes it seem like you could free memory
_and_ have a completion but you can't. Specifically:
* The only plausible thing that "rpm_msg->free" could point to is "rpm_msg".
* The complete(compl) would then be accessing freed memory.
I believe the kfree() should be at the end of the function.
Personally I'd make it more obvious that this is just a boolean value
and change to:
if (rpm_msg->needs_free)
kgree(rpm_msg)
...then the reader of the code doesn't need to go figure out what
you're trying to free.
-Doug
Hi,
On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
> +int rpmh_write(const struct device *dev, enum rpmh_state state,
> + const struct tcs_cmd *cmd, u32 n)
> +{
> + DECLARE_COMPLETION_ONSTACK(compl);
> + DEFINE_RPMH_MSG_ONSTACK(dev, state, &compl, rpm_msg);
> + int ret;
> +
> + if (!cmd || !n || n > MAX_RPMH_PAYLOAD)
> + return -EINVAL;
> +
> + memcpy(rpm_msg.cmd, cmd, n * sizeof(*cmd));
> + rpm_msg.msg.num_cmds = n;
> +
> + ret = __rpmh_write(dev, state, &rpm_msg);
> + if (ret)
> + return ret;
> +
> + ret = wait_for_completion_timeout(&compl, RPMH_TIMEOUT_MS);
IMO it's almost never a good idea to use wait_for_completion_timeout()
together with a completion that's declared on the stack. If you
somehow insist that this is a good idea then I need to see incredibly
clear and obvious code/comments that say why it's impossible that the
process might somehow try to signal the completion _after_
RPMH_TIMEOUT_MS has expired.
Specifically if the timeout happens but the process could still signal
a completion later then they will access random data on the stack of a
function that has already returned. This causes ridiculously
difficult-to-debug crashes.
NOTE: You've got timeout set to 10 seconds here. Is that really even
useful? IMO just call wait_for_completion() without a timeout. It's
much better to have a nice clean hang than a random stack corruption.
-Doug
Hi,
On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
> /**
> * struct rpmh_request: the message to be sent to rpmh-rsc
> *
> @@ -54,9 +71,15 @@ struct rpmh_request {
> * struct rpmh_ctrlr: our representation of the controller
> *
> * @drv: the controller instance
> + * @cache: the list of cached requests
> + * @lock: synchronize access to the controller data
nit: this makes it sound like this lock will be grabbed for all calls
into rpmh-rsc. In fact, it only protects access to the cache.
Ideally name it cache_lock and document that it's for protecting the
cache.
> +/**
> + * rpmh_flush: Flushes the buffered active and sleep sets to TCS
> + *
> + * @dev: The device making the request
> + *
> + * Return: -EBUSY if the controller is busy, probably waiting on a response
> + * to a RPMH request sent earlier.
> + *
> + * This function is generally called from the sleep code from the last CPU
"is generally" implies that sometimes it's not called from the sleep
code. Change to "is always". If "is generally" is more correct, you
can't run lockless right?
-Doug
Hi,
On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
> Some RSCs may only have sleep and wake TCS, i.e, there is no dedicated
> TCS for active mode request, but drivers may still want to make active
> requests from these RSCs. In such cases re-purpose the wake TCS to send
> active state requests.
>
> The requirement for this is that the driver is aware that the wake TCS
> is being repurposed to send active request, hence the sleep and wake
> TCSes be invalidated before the active request is sent.
>
> Signed-off-by: Lina Iyer <[email protected]>
> Reviewed-by: Matthias Kaehlcke <[email protected]>
> ---
> drivers/soc/qcom/rpmh-rsc.c | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
> index 68c25ebbbe09..369b9b3eedc5 100644
> --- a/drivers/soc/qcom/rpmh-rsc.c
> +++ b/drivers/soc/qcom/rpmh-rsc.c
> @@ -153,6 +153,7 @@ static struct tcs_group *get_tcs_for_msg(struct rsc_drv *drv,
> const struct tcs_request *msg)
> {
> int type;
> + struct tcs_group *tcs;
>
> switch (msg->state) {
> case RPMH_ACTIVE_ONLY_STATE:
> @@ -168,7 +169,22 @@ static struct tcs_group *get_tcs_for_msg(struct rsc_drv *drv,
> return ERR_PTR(-EINVAL);
> }
>
> - return get_tcs_of_type(drv, type);
> + /*
> + * If we are making an active request on a RSC that does not have a
> + * dedicated TCS for active state use, then re-purpose a wake TCS to
> + * send active votes.
> + * NOTE: The driver must be aware that this RSC does not have a
> + * dedicated AMC, and therefore would invalidate the sleep and wake
> + * TCSes before making an active state request.
> + */
> + tcs = get_tcs_of_type(drv, type);
> + if (msg->state == RPMH_ACTIVE_ONLY_STATE && IS_ERR(tcs)) {
> + tcs = get_tcs_of_type(drv, WAKE_TCS);
> + if (!IS_ERR(tcs))
> + rpmh_rsc_invalidate(drv);
I noticed that rpmh_rsc_invalidate() can return -EAGAIN. Do you need
to deal with that here?
-Doug
Hi,
On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
> /**
> @@ -77,12 +82,14 @@ struct rpmh_request {
> * @cache: the list of cached requests
> * @lock: synchronize access to the controller data
> * @dirty: was the cache updated since flush
> + * @batch_cache: Cache sleep and wake requests sent as batch
> */
> struct rpmh_ctrlr {
> struct rsc_drv *drv;
> struct list_head cache;
> spinlock_t lock;
> bool dirty;
> + const struct rpmh_request *batch_cache[RPMH_MAX_BATCH_CACHE];
I'm pretty confused about why the "batch_cache" is separate from the
normal cache. As far as I can tell the purpose of the two is the same
but you have two totally separate code paths and data structures.
> };
>
> static struct rpmh_ctrlr rpmh_rsc[RPMH_MAX_CTRLR];
> @@ -133,6 +140,7 @@ void rpmh_tx_done(const struct tcs_request *msg, int r)
> struct rpmh_request *rpm_msg = container_of(msg, struct rpmh_request,
> msg);
> struct completion *compl = rpm_msg->completion;
> + atomic_t *wc = rpm_msg->wait_count;
>
> rpm_msg->err = r;
>
> @@ -143,8 +151,13 @@ void rpmh_tx_done(const struct tcs_request *msg, int r)
> kfree(rpm_msg->free);
>
> /* Signal the blocking thread we are done */
> - if (compl)
> - complete(compl);
> + if (!compl)
> + return;
The comment above this "if" block no longer applies to the line next
to it after your patch. ...but below I suggest you get rid of
"wait_count", so maybe this part of the patch will go away.
> +static int cache_batch(struct rpmh_ctrlr *ctrlr,
> + struct rpmh_request **rpm_msg, int count)
> +{
> + unsigned long flags;
> + int ret = 0;
> + int index = 0;
> + int i;
> +
> + spin_lock_irqsave(&ctrlr->lock, flags);
> + while (index < RPMH_MAX_BATCH_CACHE && ctrlr->batch_cache[index])
> + index++;
> + if (index + count >= RPMH_MAX_BATCH_CACHE) {
> + ret = -ENOMEM;
> + goto fail;
> + }
> +
> + for (i = 0; i < count; i++)
> + ctrlr->batch_cache[index + i] = rpm_msg[i];
> +fail:
Nit: this label is for both failure and normal exit, so call it "exit".
> + spin_unlock_irqrestore(&ctrlr->lock, flags);
> +
> + return ret;
> +}
As part of my overall confusion about why the batch cache is different
than the normal one: for the normal use case you still call
rpmh_rsc_write_ctrl_data() for things you put in your cache, but you
don't for the batch cache. I still haven't totally figured out what
rpmh_rsc_write_ctrl_data() does, but it seems strange that you don't
do it for the batch cache but you do for the other one.
> +/**
> + * rpmh_write_batch: Write multiple sets of RPMH commands and wait for the
> + * batch to finish.
> + *
> + * @dev: the device making the request
> + * @state: Active/sleep set
> + * @cmd: The payload data
> + * @n: The array of count of elements in each batch, 0 terminated.
> + *
> + * Write a request to the RSC controller without caching. If the request
> + * state is ACTIVE, then the requests are treated as completion request
> + * and sent to the controller immediately. The function waits until all the
> + * commands are complete. If the request was to SLEEP or WAKE_ONLY, then the
> + * request is sent as fire-n-forget and no ack is expected.
> + *
> + * May sleep. Do not call from atomic contexts for ACTIVE_ONLY requests.
> + */
> +int rpmh_write_batch(const struct device *dev, enum rpmh_state state,
> + const struct tcs_cmd *cmd, u32 *n)
> +{
> + struct rpmh_request *rpm_msg[RPMH_MAX_REQ_IN_BATCH] = { NULL };
> + DECLARE_COMPLETION_ONSTACK(compl);
> + atomic_t wait_count = ATOMIC_INIT(0);
> + struct rpmh_ctrlr *ctrlr = get_rpmh_ctrlr(dev);
> + int count = 0;
> + int ret, i;
> +
> + if (IS_ERR(ctrlr) || !cmd || !n)
> + return -EINVAL;
> +
> + while (n[count++] > 0)
> + ;
> + count--;
> + if (!count || count > RPMH_MAX_REQ_IN_BATCH)
> + return -EINVAL;
> +
> + for (i = 0; i < count; i++) {
> + rpm_msg[i] = __get_rpmh_msg_async(state, cmd, n[i]);
> + if (IS_ERR_OR_NULL(rpm_msg[i])) {
Just "IS_ERR". It's never NULL.
...also add a i-- somewhere in here or you're going to be kfree()ing
your error value, aren't you?
> + ret = PTR_ERR(rpm_msg[i]);
> + for (; i >= 0; i--)
> + kfree(rpm_msg[i]->free);
> + return ret;
> + }
> + cmd += n[i];
> + }
> +
> + if (state != RPMH_ACTIVE_ONLY_STATE)
> + return cache_batch(ctrlr, rpm_msg, count);
Don't you need to free rpm_msg items in this case?
> +
> + atomic_set(&wait_count, count);
> +
> + for (i = 0; i < count; i++) {
> + rpm_msg[i]->completion = &compl;
> + rpm_msg[i]->wait_count = &wait_count;
> + ret = rpmh_rsc_send_data(ctrlr->drv, &rpm_msg[i]->msg);
> + if (ret) {
> + int j;
> +
> + pr_err("Error(%d) sending RPMH message addr=%#x\n",
> + ret, rpm_msg[i]->msg.cmds[0].addr);
> + for (j = i; j < count; j++)
> + rpmh_tx_done(&rpm_msg[j]->msg, ret);
You're just using rpmh_tx_done() to free memory? Note that you'll
probably do your error handling in this function a favor if you rename
__get_rpmh_msg_async() to __fill_rpmh_msg() and remove the memory
allocation from there. Then you can do one big allocation of the
whole array in rpmh_write_batch() and then you'll only have one free
at the end...
> + break;
"break" seems wrong here. You'll end up waiting for the completion,
then I guess timing out, then returning -ETIMEDOUT?
> + }
> + }
> +
> + ret = wait_for_completion_timeout(&compl, RPMH_TIMEOUT_MS);
The "wait_count" abstraction is confusing and I believe it's not
needed. I think you can remove it and change the above to this
(untested) code:
time_left = RPMH_TIMEOUT_MS;
for (i = 0; i < count; i++) {
time_left = wait_for_completion_timeout(&compl, time_left);
if (!time_left)
return -ETIMEDOUT;
}
...specifically completions are additive, so just wait "count" times
and then the reader doesn't need to learn your new wait_count
abstraction and try to reason about it.
...and, actually, I argue in other replies that this should't use a
timeout, so even cleaner:
for (i = 0; i < count; i++)
wait_for_completion(&compl);
Once you do that, you can also get rid of the need to pre-count "n",
so all your loops turn into:
for (i = 0; n[i]; i++)
I suppose you might want to get rid of "RPMH_MAX_REQ_IN_BATCH" and
dynamically allocate your array too, but that seems sane. As per
above it seems like you should just dynamically allocate a whole array
of "struct rpmh_request" items at once anyway.
---
> + return (ret > 0) ? 0 : -ETIMEDOUT;
> +
> +}
> +EXPORT_SYMBOL(rpmh_write_batch);
Perhaps an even simpler thing than taking all my advice above: can't
you just add a optional completion to rpmh_write_async()? That would
just be stuffed into rpm_msg.
Now your batch code would just be a bunch of calls to
rpmh_write_async() with an equal number of wait_for_completion() calls
at the end. Is there a reason that wouldn't work? You'd get rid of
_a lot_ of code.
-Doug
Hi Doug,
Will explain only the key points now.
On Fri, May 11 2018 at 14:19 -0600, Doug Anderson wrote:
>Hi,
>
>On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
>> /**
>> @@ -77,12 +82,14 @@ struct rpmh_request {
>> * @cache: the list of cached requests
>> * @lock: synchronize access to the controller data
>> * @dirty: was the cache updated since flush
>> + * @batch_cache: Cache sleep and wake requests sent as batch
>> */
>> struct rpmh_ctrlr {
>> struct rsc_drv *drv;
>> struct list_head cache;
>> spinlock_t lock;
>> bool dirty;
>> + const struct rpmh_request *batch_cache[RPMH_MAX_BATCH_CACHE];
>
>I'm pretty confused about why the "batch_cache" is separate from the
>normal cache. As far as I can tell the purpose of the two is the same
>but you have two totally separate code paths and data structures.
>
Due to a hardware limitation, requests made by bus drivers must be set
up in the sleep and wake TCS first before setting up the requests from
other drivers. Bus drivers use batch mode for any and all RPMH
communication. Hence their request are the only ones in the batch_cache.
>> };
>>
>> static struct rpmh_ctrlr rpmh_rsc[RPMH_MAX_CTRLR];
>> @@ -133,6 +140,7 @@ void rpmh_tx_done(const struct tcs_request *msg, int r)
>> struct rpmh_request *rpm_msg = container_of(msg, struct rpmh_request,
>> msg);
>> struct completion *compl = rpm_msg->completion;
>> + atomic_t *wc = rpm_msg->wait_count;
>>
>> rpm_msg->err = r;
>>
>> @@ -143,8 +151,13 @@ void rpmh_tx_done(const struct tcs_request *msg, int r)
>> kfree(rpm_msg->free);
>>
>> /* Signal the blocking thread we are done */
>> - if (compl)
>> - complete(compl);
>> + if (!compl)
>> + return;
>
>The comment above this "if" block no longer applies to the line next
>to it after your patch. ...but below I suggest you get rid of
>"wait_count", so maybe this part of the patch will go away.
>
>
>> +static int cache_batch(struct rpmh_ctrlr *ctrlr,
>> + struct rpmh_request **rpm_msg, int count)
>> +{
>> + unsigned long flags;
>> + int ret = 0;
>> + int index = 0;
>> + int i;
>> +
>> + spin_lock_irqsave(&ctrlr->lock, flags);
>> + while (index < RPMH_MAX_BATCH_CACHE && ctrlr->batch_cache[index])
>> + index++;
>> + if (index + count >= RPMH_MAX_BATCH_CACHE) {
>> + ret = -ENOMEM;
>> + goto fail;
>> + }
>> +
>> + for (i = 0; i < count; i++)
>> + ctrlr->batch_cache[index + i] = rpm_msg[i];
>> +fail:
>
>Nit: this label is for both failure and normal exit, so call it "exit".
>
>
>> + spin_unlock_irqrestore(&ctrlr->lock, flags);
>> +
>> + return ret;
>> +}
>
>As part of my overall confusion about why the batch cache is different
>than the normal one: for the normal use case you still call
>rpmh_rsc_write_ctrl_data() for things you put in your cache, but you
>don't for the batch cache. I still haven't totally figured out what
>rpmh_rsc_write_ctrl_data() does, but it seems strange that you don't
>do it for the batch cache but you do for the other one.
>
>
flush_batch does write to the controller using
rpmh_rsc_write_ctrl_data()
Thanks,
Lina
>> +/**
>> + * rpmh_write_batch: Write multiple sets of RPMH commands and wait for the
>> + * batch to finish.
>> + *
>> + * @dev: the device making the request
>> + * @state: Active/sleep set
>> + * @cmd: The payload data
>> + * @n: The array of count of elements in each batch, 0 terminated.
>> + *
>> + * Write a request to the RSC controller without caching. If the request
>> + * state is ACTIVE, then the requests are treated as completion request
>> + * and sent to the controller immediately. The function waits until all the
>> + * commands are complete. If the request was to SLEEP or WAKE_ONLY, then the
>> + * request is sent as fire-n-forget and no ack is expected.
>> + *
>> + * May sleep. Do not call from atomic contexts for ACTIVE_ONLY requests.
>> + */
>> +int rpmh_write_batch(const struct device *dev, enum rpmh_state state,
>> + const struct tcs_cmd *cmd, u32 *n)
>> +{
>> + struct rpmh_request *rpm_msg[RPMH_MAX_REQ_IN_BATCH] = { NULL };
>> + DECLARE_COMPLETION_ONSTACK(compl);
>> + atomic_t wait_count = ATOMIC_INIT(0);
>> + struct rpmh_ctrlr *ctrlr = get_rpmh_ctrlr(dev);
>> + int count = 0;
>> + int ret, i;
>> +
>> + if (IS_ERR(ctrlr) || !cmd || !n)
>> + return -EINVAL;
>> +
>> + while (n[count++] > 0)
>> + ;
>> + count--;
>> + if (!count || count > RPMH_MAX_REQ_IN_BATCH)
>> + return -EINVAL;
>> +
>> + for (i = 0; i < count; i++) {
>> + rpm_msg[i] = __get_rpmh_msg_async(state, cmd, n[i]);
>> + if (IS_ERR_OR_NULL(rpm_msg[i])) {
>
>Just "IS_ERR". It's never NULL.
>
>...also add a i-- somewhere in here or you're going to be kfree()ing
>your error value, aren't you?
>
>
>> + ret = PTR_ERR(rpm_msg[i]);
>> + for (; i >= 0; i--)
>> + kfree(rpm_msg[i]->free);
>> + return ret;
>> + }
>> + cmd += n[i];
>> + }
>> +
>> + if (state != RPMH_ACTIVE_ONLY_STATE)
>> + return cache_batch(ctrlr, rpm_msg, count);
>
>Don't you need to free rpm_msg items in this case?
>
>
>> +
>> + atomic_set(&wait_count, count);
>> +
>> + for (i = 0; i < count; i++) {
>> + rpm_msg[i]->completion = &compl;
>> + rpm_msg[i]->wait_count = &wait_count;
>> + ret = rpmh_rsc_send_data(ctrlr->drv, &rpm_msg[i]->msg);
>> + if (ret) {
>> + int j;
>> +
>> + pr_err("Error(%d) sending RPMH message addr=%#x\n",
>> + ret, rpm_msg[i]->msg.cmds[0].addr);
>> + for (j = i; j < count; j++)
>> + rpmh_tx_done(&rpm_msg[j]->msg, ret);
>
>You're just using rpmh_tx_done() to free memory? Note that you'll
>probably do your error handling in this function a favor if you rename
>__get_rpmh_msg_async() to __fill_rpmh_msg() and remove the memory
>allocation from there. Then you can do one big allocation of the
>whole array in rpmh_write_batch() and then you'll only have one free
>at the end...
>
>
>
>> + break;
>
>"break" seems wrong here. You'll end up waiting for the completion,
>then I guess timing out, then returning -ETIMEDOUT?
>
>
>> + }
>> + }
>> +
>> + ret = wait_for_completion_timeout(&compl, RPMH_TIMEOUT_MS);
>
>The "wait_count" abstraction is confusing and I believe it's not
>needed. I think you can remove it and change the above to this
>(untested) code:
>
>time_left = RPMH_TIMEOUT_MS;
>for (i = 0; i < count; i++) {
> time_left = wait_for_completion_timeout(&compl, time_left);
> if (!time_left)
> return -ETIMEDOUT;
>}
>
>...specifically completions are additive, so just wait "count" times
>and then the reader doesn't need to learn your new wait_count
>abstraction and try to reason about it.
>
>...and, actually, I argue in other replies that this should't use a
>timeout, so even cleaner:
>
>for (i = 0; i < count; i++)
> wait_for_completion(&compl);
>
>
>Once you do that, you can also get rid of the need to pre-count "n",
>so all your loops turn into:
>
>for (i = 0; n[i]; i++)
>
>
>I suppose you might want to get rid of "RPMH_MAX_REQ_IN_BATCH" and
>dynamically allocate your array too, but that seems sane. As per
>above it seems like you should just dynamically allocate a whole array
>of "struct rpmh_request" items at once anyway.
>
>---
>
>> + return (ret > 0) ? 0 : -ETIMEDOUT;
>> +
>> +}
>> +EXPORT_SYMBOL(rpmh_write_batch);
>
>Perhaps an even simpler thing than taking all my advice above: can't
>you just add a optional completion to rpmh_write_async()? That would
>just be stuffed into rpm_msg.
>
>Now your batch code would just be a bunch of calls to
>rpmh_write_async() with an equal number of wait_for_completion() calls
>at the end. Is there a reason that wouldn't work? You'd get rid of
>_a lot_ of code.
>
>
>-Doug
Hi,
On Mon, May 14, 2018 at 12:59 PM, Lina Iyer <[email protected]> wrote:
>>> /**
>>> @@ -77,12 +82,14 @@ struct rpmh_request {
>>> * @cache: the list of cached requests
>>> * @lock: synchronize access to the controller data
>>> * @dirty: was the cache updated since flush
>>> + * @batch_cache: Cache sleep and wake requests sent as batch
>>> */
>>> struct rpmh_ctrlr {
>>> struct rsc_drv *drv;
>>> struct list_head cache;
>>> spinlock_t lock;
>>> bool dirty;
>>> + const struct rpmh_request *batch_cache[RPMH_MAX_BATCH_CACHE];
>>
>>
>> I'm pretty confused about why the "batch_cache" is separate from the
>> normal cache. As far as I can tell the purpose of the two is the same
>> but you have two totally separate code paths and data structures.
>>
> Due to a hardware limitation, requests made by bus drivers must be set
> up in the sleep and wake TCS first before setting up the requests from
> other drivers. Bus drivers use batch mode for any and all RPMH
> communication. Hence their request are the only ones in the batch_cache.
This is totally not obvious and not commented. Why not rename to
"priority" instead of "batch"?
If the only requirement is that these come first, that's still no
reason to use totally separate code paths and mechanisms. These
requests can still use the same data structures / functions and just
be ordered to come first, can't they? ...or given a boolean
"priority" and you can do two passes through your queue: one to do the
priority ones and one to do the secondary ones.
>>> + spin_unlock_irqrestore(&ctrlr->lock, flags);
>>> +
>>> + return ret;
>>> +}
>>
>>
>> As part of my overall confusion about why the batch cache is different
>> than the normal one: for the normal use case you still call
>> rpmh_rsc_write_ctrl_data() for things you put in your cache, but you
>> don't for the batch cache. I still haven't totally figured out what
>> rpmh_rsc_write_ctrl_data() does, but it seems strange that you don't
>> do it for the batch cache but you do for the other one.
>>
>>
> flush_batch does write to the controller using
> rpmh_rsc_write_ctrl_data()
My confusion is that they happen at different times. As I understand it:
* For the normal case, you immediately calls
rpmh_rsc_write_ctrl_data() and then later do the rest of the work.
* For the batch case, you call both later.
Is there a good reason for this, or is it just an arbitrary
difference? If there's a good reason, it should be documented.
-Doug
On Tue, May 15 2018 at 09:52 -0600, Doug Anderson wrote:
>Hi,
>
>On Mon, May 14, 2018 at 12:59 PM, Lina Iyer <[email protected]> wrote:
>
>>>> /**
>>>> @@ -77,12 +82,14 @@ struct rpmh_request {
>>>> * @cache: the list of cached requests
>>>> * @lock: synchronize access to the controller data
>>>> * @dirty: was the cache updated since flush
>>>> + * @batch_cache: Cache sleep and wake requests sent as batch
>>>> */
>>>> struct rpmh_ctrlr {
>>>> struct rsc_drv *drv;
>>>> struct list_head cache;
>>>> spinlock_t lock;
>>>> bool dirty;
>>>> + const struct rpmh_request *batch_cache[RPMH_MAX_BATCH_CACHE];
>>>
>>>
>>> I'm pretty confused about why the "batch_cache" is separate from the
>>> normal cache. As far as I can tell the purpose of the two is the same
>>> but you have two totally separate code paths and data structures.
>>>
>> Due to a hardware limitation, requests made by bus drivers must be set
>> up in the sleep and wake TCS first before setting up the requests from
>> other drivers. Bus drivers use batch mode for any and all RPMH
>> communication. Hence their request are the only ones in the batch_cache.
>
>This is totally not obvious and not commented. Why not rename to
>"priority" instead of "batch"?
>
>If the only requirement is that these come first, that's still no
>reason to use totally separate code paths and mechanisms. These
>requests can still use the same data structures / functions and just
>be ordered to come first, can't they? ...or given a boolean
>"priority" and you can do two passes through your queue: one to do the
>priority ones and one to do the secondary ones.
>
>
The bus requests have a certain order and cannot be mutated by the
RPMH driver. It has to be maintained in the TCS in the same order.
Also, the bus requests have quite a churn during the course of an
usecase. They are setup and invalidated often.
It is faster to have them separate and invalidate the whole lot of the
batch_cache instead of intertwining them with requests from other
drivers and then figuring out which all must be invalidated and rebuilt
when the next batch requests comes in. Remember, that requests may come
from any driver any time and therefore will be mangled if we use the
same data structure. So to be faster and to avoid having mangled requests
in the TCS, I have kept the data structures separate.
>>>> + spin_unlock_irqrestore(&ctrlr->lock, flags);
>>>> +
>>>> + return ret;
>>>> +}
>>>
>>>
>>> As part of my overall confusion about why the batch cache is different
>>> than the normal one: for the normal use case you still call
>>> rpmh_rsc_write_ctrl_data() for things you put in your cache, but you
>>> don't for the batch cache. I still haven't totally figured out what
>>> rpmh_rsc_write_ctrl_data() does, but it seems strange that you don't
>>> do it for the batch cache but you do for the other one.
>>>
>>>
>> flush_batch does write to the controller using
>> rpmh_rsc_write_ctrl_data()
>
>My confusion is that they happen at different times. As I understand it:
>
>* For the normal case, you immediately calls
>rpmh_rsc_write_ctrl_data() and then later do the rest of the work.
>
>* For the batch case, you call both later.
>
>Is there a good reason for this, or is it just an arbitrary
>difference? If there's a good reason, it should be documented.
>
>
In both the cases, the requests are cached in the rpmh library and are
only sent to the controller only when the flushed. I am not sure the
work is any different. The rpmh_flush() flushes out batch requests and
then the requests from other drivers.
Thanks,
Lina
Hi,
On Tue, May 15, 2018 at 9:23 AM, Lina Iyer <[email protected]> wrote:
> On Tue, May 15 2018 at 09:52 -0600, Doug Anderson wrote:
>>
>> Hi,
>>
>> On Mon, May 14, 2018 at 12:59 PM, Lina Iyer <[email protected]> wrote:
>>
>>>>> /**
>>>>> @@ -77,12 +82,14 @@ struct rpmh_request {
>>>>> * @cache: the list of cached requests
>>>>> * @lock: synchronize access to the controller data
>>>>> * @dirty: was the cache updated since flush
>>>>> + * @batch_cache: Cache sleep and wake requests sent as batch
>>>>> */
>>>>> struct rpmh_ctrlr {
>>>>> struct rsc_drv *drv;
>>>>> struct list_head cache;
>>>>> spinlock_t lock;
>>>>> bool dirty;
>>>>> + const struct rpmh_request *batch_cache[RPMH_MAX_BATCH_CACHE];
>>>>
>>>>
>>>>
>>>> I'm pretty confused about why the "batch_cache" is separate from the
>>>> normal cache. As far as I can tell the purpose of the two is the same
>>>> but you have two totally separate code paths and data structures.
>>>>
>>> Due to a hardware limitation, requests made by bus drivers must be set
>>> up in the sleep and wake TCS first before setting up the requests from
>>> other drivers. Bus drivers use batch mode for any and all RPMH
>>> communication. Hence their request are the only ones in the batch_cache.
>>
>>
>> This is totally not obvious and not commented. Why not rename to
>> "priority" instead of "batch"?
>>
>> If the only requirement is that these come first, that's still no
>> reason to use totally separate code paths and mechanisms. These
>> requests can still use the same data structures / functions and just
>> be ordered to come first, can't they? ...or given a boolean
>> "priority" and you can do two passes through your queue: one to do the
>> priority ones and one to do the secondary ones.
>>
>>
> The bus requests have a certain order and cannot be mutated by the
> RPMH driver. It has to be maintained in the TCS in the same order.
Please document this requirement in the code.
> Also, the bus requests have quite a churn during the course of an
> usecase. They are setup and invalidated often.
> It is faster to have them separate and invalidate the whole lot of the
> batch_cache instead of intertwining them with requests from other
> drivers and then figuring out which all must be invalidated and rebuilt
> when the next batch requests comes in. Remember, that requests may come
> from any driver any time and therefore will be mangled if we use the
> same data structure. So to be faster and to avoid having mangled requests
> in the TCS, I have kept the data structures separate.
If you need to find a certain group of requests then can't you just
tag them and it's easy to find them? I'm still not seeing the need
for a whole separate code path and data structure.
>>>>> + spin_unlock_irqrestore(&ctrlr->lock, flags);
>>>>> +
>>>>> + return ret;
>>>>> +}
>>>>
>>>>
>>>>
>>>> As part of my overall confusion about why the batch cache is different
>>>> than the normal one: for the normal use case you still call
>>>> rpmh_rsc_write_ctrl_data() for things you put in your cache, but you
>>>> don't for the batch cache. I still haven't totally figured out what
>>>> rpmh_rsc_write_ctrl_data() does, but it seems strange that you don't
>>>> do it for the batch cache but you do for the other one.
>>>>
>>>>
>>> flush_batch does write to the controller using
>>> rpmh_rsc_write_ctrl_data()
>>
>>
>> My confusion is that they happen at different times. As I understand it:
>>
>> * For the normal case, you immediately calls
>> rpmh_rsc_write_ctrl_data() and then later do the rest of the work.
>>
>> * For the batch case, you call both later.
>>
>> Is there a good reason for this, or is it just an arbitrary
>> difference? If there's a good reason, it should be documented.
>>
>>
> In both the cases, the requests are cached in the rpmh library and are
> only sent to the controller only when the flushed. I am not sure the
> work is any different. The rpmh_flush() flushes out batch requests and
> then the requests from other drivers.
OK then, here are the code paths I see:
rpmh_write
=> __rpmh_write
=> cache_rpm_request()
=> (state != RPMH_ACTIVE_ONLY_STATE): rpmh_rsc_send_data()
rpmh_write_batch
=> (state != RPMH_ACTIVE_ONLY_STATE): cache_batch()
=> No call to rpmh_rsc_send_data()
Said another way:
* if you call rpmh_write() for something you're going to defer you
will still call cache_rpm_request() _before_ rpmh_write() returns.
* if you call rpmh_write_batch() for something you're going to defer
then you _don't_ call cache_rpm_request() before rpmh_write_batch()
returns.
Do you find a fault in my analysis of the current code? If you see a
fault then please point it out. If you don't see a fault, please say
why the behaviors are different. I certainly understand that
eventually you will call cache_rpm_request() for both cases. It's
just that in one case the call happens right away and the other case
it is deferred.
On Fri, May 11 2018 at 14:17 -0600, Doug Anderson wrote:
>Hi,
>
>On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
>> +int rpmh_write(const struct device *dev, enum rpmh_state state,
>> + const struct tcs_cmd *cmd, u32 n)
>> +{
>> + DECLARE_COMPLETION_ONSTACK(compl);
>> + DEFINE_RPMH_MSG_ONSTACK(dev, state, &compl, rpm_msg);
>> + int ret;
>> +
>> + if (!cmd || !n || n > MAX_RPMH_PAYLOAD)
>> + return -EINVAL;
>> +
>> + memcpy(rpm_msg.cmd, cmd, n * sizeof(*cmd));
>> + rpm_msg.msg.num_cmds = n;
>> +
>> + ret = __rpmh_write(dev, state, &rpm_msg);
>> + if (ret)
>> + return ret;
>> +
>> + ret = wait_for_completion_timeout(&compl, RPMH_TIMEOUT_MS);
>
>IMO it's almost never a good idea to use wait_for_completion_timeout()
>together with a completion that's declared on the stack. If you
>somehow insist that this is a good idea then I need to see incredibly
>clear and obvious code/comments that say why it's impossible that the
>process might somehow try to signal the completion _after_
>RPMH_TIMEOUT_MS has expired.
>
>Specifically if the timeout happens but the process could still signal
>a completion later then they will access random data on the stack of a
>function that has already returned. This causes ridiculously
>difficult-to-debug crashes.
>
>
>NOTE: You've got timeout set to 10 seconds here. Is that really even
>useful? IMO just call wait_for_completion() without a timeout. It's
>much better to have a nice clean hang than a random stack corruption.
>
>
The 10 sec timeout will guarantee that we will not get a response at all
anymore for the request. Usually requests can be considered failed if
there is no response in a few tens of microseconds. 10 sec is just an
arbitarily large number.
The reason we use timeout is that once the timeout happens, we know we
have failed, we could trigger a watchdog or crash the system. This is
very important for our productization in debugging RPMH failures. A
hang would not always trigger a watchdog and the failure would be silent
and possibly fatal but hard to debug.
-- Lina
On Tue, May 15 2018 at 10:50 -0600, Doug Anderson wrote:
>Hi,
>
>On Tue, May 15, 2018 at 9:23 AM, Lina Iyer <[email protected]> wrote:
>> On Tue, May 15 2018 at 09:52 -0600, Doug Anderson wrote:
>>>
>>> Hi,
>>>
>>> On Mon, May 14, 2018 at 12:59 PM, Lina Iyer <[email protected]> wrote:
>>>
>>>>>> /**
>>>>>> @@ -77,12 +82,14 @@ struct rpmh_request {
>>>>>> * @cache: the list of cached requests
>>>>>> * @lock: synchronize access to the controller data
>>>>>> * @dirty: was the cache updated since flush
>>>>>> + * @batch_cache: Cache sleep and wake requests sent as batch
>>>>>> */
>>>>>> struct rpmh_ctrlr {
>>>>>> struct rsc_drv *drv;
>>>>>> struct list_head cache;
>>>>>> spinlock_t lock;
>>>>>> bool dirty;
>>>>>> + const struct rpmh_request *batch_cache[RPMH_MAX_BATCH_CACHE];
>>>>>
>>>>>
>>>>>
>>>>> I'm pretty confused about why the "batch_cache" is separate from the
>>>>> normal cache. As far as I can tell the purpose of the two is the same
>>>>> but you have two totally separate code paths and data structures.
>>>>>
>>>> Due to a hardware limitation, requests made by bus drivers must be set
>>>> up in the sleep and wake TCS first before setting up the requests from
>>>> other drivers. Bus drivers use batch mode for any and all RPMH
>>>> communication. Hence their request are the only ones in the batch_cache.
>>>
>>>
>>> This is totally not obvious and not commented. Why not rename to
>>> "priority" instead of "batch"?
>>>
>>> If the only requirement is that these come first, that's still no
>>> reason to use totally separate code paths and mechanisms. These
>>> requests can still use the same data structures / functions and just
>>> be ordered to come first, can't they? ...or given a boolean
>>> "priority" and you can do two passes through your queue: one to do the
>>> priority ones and one to do the secondary ones.
>>>
>>>
>> The bus requests have a certain order and cannot be mutated by the
>> RPMH driver. It has to be maintained in the TCS in the same order.
>
>Please document this requirement in the code.
>
>
OK.
>> Also, the bus requests have quite a churn during the course of an
>> usecase. They are setup and invalidated often.
>> It is faster to have them separate and invalidate the whole lot of the
>> batch_cache instead of intertwining them with requests from other
>> drivers and then figuring out which all must be invalidated and rebuilt
>> when the next batch requests comes in. Remember, that requests may come
>> from any driver any time and therefore will be mangled if we use the
>> same data structure. So to be faster and to avoid having mangled requests
>> in the TCS, I have kept the data structures separate.
>
>If you need to find a certain group of requests then can't you just
>tag them and it's easy to find them? I'm still not seeing the need
>for a whole separate code path and data structure.
>
Could be done but it will be slower and involve a lot more code than
separate data structures.
>>>>>> + spin_unlock_irqrestore(&ctrlr->lock, flags);
>>>>>> +
>>>>>> + return ret;
>>>>>> +}
>>>>>
>>>>>
>>>>>
>>>>> As part of my overall confusion about why the batch cache is different
>>>>> than the normal one: for the normal use case you still call
>>>>> rpmh_rsc_write_ctrl_data() for things you put in your cache, but you
>>>>> don't for the batch cache. I still haven't totally figured out what
>>>>> rpmh_rsc_write_ctrl_data() does, but it seems strange that you don't
>>>>> do it for the batch cache but you do for the other one.
>>>>>
>>>>>
>>>> flush_batch does write to the controller using
>>>> rpmh_rsc_write_ctrl_data()
>>>
>>>
>>> My confusion is that they happen at different times. As I understand it:
>>>
>>> * For the normal case, you immediately calls
>>> rpmh_rsc_write_ctrl_data() and then later do the rest of the work.
>>>
>>> * For the batch case, you call both later.
>>>
>>> Is there a good reason for this, or is it just an arbitrary
>>> difference? If there's a good reason, it should be documented.
>>>
>>>
>> In both the cases, the requests are cached in the rpmh library and are
>> only sent to the controller only when the flushed. I am not sure the
>> work is any different. The rpmh_flush() flushes out batch requests and
>> then the requests from other drivers.
>
>OK then, here are the code paths I see:
>
>rpmh_write
>=> __rpmh_write
> => cache_rpm_request()
> => (state != RPMH_ACTIVE_ONLY_STATE): rpmh_rsc_send_data()
>
>rpmh_write_batch
>=> (state != RPMH_ACTIVE_ONLY_STATE): cache_batch()
> => No call to rpmh_rsc_send_data()
>
>
>Said another way:
>
>* if you call rpmh_write() for something you're going to defer you
>will still call cache_rpm_request() _before_ rpmh_write() returns.
>
>* if you call rpmh_write_batch() for something you're going to defer
>then you _don't_ call cache_rpm_request() before rpmh_write_batch()
>returns.
>
>
>Do you find a fault in my analysis of the current code? If you see a
>fault then please point it out. If you don't see a fault, please say
>why the behaviors are different. I certainly understand that
>eventually you will call cache_rpm_request() for both cases. It's
>just that in one case the call happens right away and the other case
>it is deferred.
True. I see where your confusion is. It is because of an optimization and
our existential need for saving power at every opportunity.
For rpmh_write path -
The reason being for the disparity is that, we can vote for a system low
power mode (modes where we flush the caches and put the entire silicon
is a low power state) at any point when all the CPUs are idle. If a
driver has updated its vote for a system resource, it needs to be
updated in the cache and thats what cache_rpm_request() does. Its
possible that we may enter a system low power mode immediately after
that. The rpmh_flush() would be called by the idle driver and we would
flush the cached values and enter the idle state. By writing immediately
to the TCS, we avoid increasing the latency of entering an idle state.
For the rpmh_write_batch() path -
The Bus driver would invalidate the TCS and the batch_cache. The use
case fills up the batch_cache with values as needed by the use case.
While during the usecase, the CPUs can go idle and the idle drivers
would call rpmh_flush(). At that time, we would write the batch_cache
into the already invalidated TCSes and then write the rest of the cached
requests from ->cache. We then enter the low power modes.
The optimization of writing the sleep/wake votes in the same context of
the driver making the request, helps us avoid writing some extra
registers in the critical idle path.
Hope this helps.
-- Lina
Hi,
On Tue, May 15, 2018 at 10:47 AM, Lina Iyer <[email protected]> wrote:
> On Fri, May 11 2018 at 14:17 -0600, Doug Anderson wrote:
>>
>> Hi,
>>
>> On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
>>>
>>> +int rpmh_write(const struct device *dev, enum rpmh_state state,
>>> + const struct tcs_cmd *cmd, u32 n)
>>> +{
>>> + DECLARE_COMPLETION_ONSTACK(compl);
>>> + DEFINE_RPMH_MSG_ONSTACK(dev, state, &compl, rpm_msg);
>>> + int ret;
>>> +
>>> + if (!cmd || !n || n > MAX_RPMH_PAYLOAD)
>>> + return -EINVAL;
>>> +
>>> + memcpy(rpm_msg.cmd, cmd, n * sizeof(*cmd));
>>> + rpm_msg.msg.num_cmds = n;
>>> +
>>> + ret = __rpmh_write(dev, state, &rpm_msg);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + ret = wait_for_completion_timeout(&compl, RPMH_TIMEOUT_MS);
>>
>>
>> IMO it's almost never a good idea to use wait_for_completion_timeout()
>> together with a completion that's declared on the stack. If you
>> somehow insist that this is a good idea then I need to see incredibly
>> clear and obvious code/comments that say why it's impossible that the
>> process might somehow try to signal the completion _after_
>> RPMH_TIMEOUT_MS has expired.
>>
>> Specifically if the timeout happens but the process could still signal
>> a completion later then they will access random data on the stack of a
>> function that has already returned. This causes ridiculously
>> difficult-to-debug crashes.
>>
>>
>> NOTE: You've got timeout set to 10 seconds here. Is that really even
>> useful? IMO just call wait_for_completion() without a timeout. It's
>> much better to have a nice clean hang than a random stack corruption.
>>
>>
> The 10 sec timeout will guarantee that we will not get a response at all
> anymore for the request. Usually requests can be considered failed if
> there is no response in a few tens of microseconds. 10 sec is just an
> arbitarily large number.
>
> The reason we use timeout is that once the timeout happens, we know we
> have failed, we could trigger a watchdog or crash the system. This is
> very important for our productization in debugging RPMH failures. A
> hang would not always trigger a watchdog and the failure would be silent
> and possibly fatal but hard to debug.
If you intend the system to crash when this timeout happens then IMHO
add a BUG_ON. Then I won't worry about something coming around later
and clobbering the stack.
-Doug
Hi,
On Tue, May 15, 2018 at 11:03 AM, Lina Iyer <[email protected]> wrote:
>>> Also, the bus requests have quite a churn during the course of an
>>> usecase. They are setup and invalidated often.
>>> It is faster to have them separate and invalidate the whole lot of the
>>> batch_cache instead of intertwining them with requests from other
>>> drivers and then figuring out which all must be invalidated and rebuilt
>>> when the next batch requests comes in. Remember, that requests may come
>>> from any driver any time and therefore will be mangled if we use the
>>> same data structure. So to be faster and to avoid having mangled requests
>>> in the TCS, I have kept the data structures separate.
>>
>>
>> If you need to find a certain group of requests then can't you just
>> tag them and it's easy to find them? I'm still not seeing the need
>> for a whole separate code path and data structure.
>>
> Could be done but it will be slower and involve a lot more code than
> separate data structures.
When you say "a lot more code", you mean that you'll have to write a
lot more code or that each request will execute a lot more code? I
would argue that very little code would need to be added (compared to
your patch) if rpmh_write_batch() was implemented atop
rpmh_write_async(). Even with extra tagging / prioritization it would
be small. I could try to prototype it I guess.
I think you mean that it would _execute_ a lot more code and thus be
slower. Is that right? Is your main argument here that statically
pre-allocating 20 slots in an array is going to be a lot faster than
kzalloc-ing 20 linked list nodes and chaining them together? If
you're truly worried about the kzalloc calls being slow, I suppose you
could always allocate them with a kmem_cache. ...but we're talking
fewer than 20 kzalloc calls, right?
I'll also make an argument (with no data to back me up) that
maintaining separate functions for handling the batch cases will slow
down your kernel because your footprint in the icache will be bigger
and you'll be more likely to kick something else out that actually
needs to run fast.
>>>>>>> + spin_unlock_irqrestore(&ctrlr->lock, flags);
>>>>>>> +
>>>>>>> + return ret;
>>>>>>> +}
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> As part of my overall confusion about why the batch cache is different
>>>>>> than the normal one: for the normal use case you still call
>>>>>> rpmh_rsc_write_ctrl_data() for things you put in your cache, but you
>>>>>> don't for the batch cache. I still haven't totally figured out what
>>>>>> rpmh_rsc_write_ctrl_data() does, but it seems strange that you don't
>>>>>> do it for the batch cache but you do for the other one.
>>>>>>
>>>>>>
>>>>> flush_batch does write to the controller using
>>>>> rpmh_rsc_write_ctrl_data()
>>>>
>>>>
>>>>
>>>> My confusion is that they happen at different times. As I understand
>>>> it:
>>>>
>>>> * For the normal case, you immediately calls
>>>> rpmh_rsc_write_ctrl_data() and then later do the rest of the work.
>>>>
>>>> * For the batch case, you call both later.
>>>>
>>>> Is there a good reason for this, or is it just an arbitrary
>>>> difference? If there's a good reason, it should be documented.
>>>>
>>>>
>>> In both the cases, the requests are cached in the rpmh library and are
>>> only sent to the controller only when the flushed. I am not sure the
>>> work is any different. The rpmh_flush() flushes out batch requests and
>>> then the requests from other drivers.
>>
>>
>> OK then, here are the code paths I see:
>>
>> rpmh_write
>> => __rpmh_write
>> => cache_rpm_request()
>> => (state != RPMH_ACTIVE_ONLY_STATE): rpmh_rsc_send_data()
>>
>> rpmh_write_batch
>> => (state != RPMH_ACTIVE_ONLY_STATE): cache_batch()
>> => No call to rpmh_rsc_send_data()
>>
>>
>> Said another way:
>>
>> * if you call rpmh_write() for something you're going to defer you
>> will still call cache_rpm_request() _before_ rpmh_write() returns.
>>
>> * if you call rpmh_write_batch() for something you're going to defer
>> then you _don't_ call cache_rpm_request() before rpmh_write_batch()
>> returns.
>>
>>
>> Do you find a fault in my analysis of the current code? If you see a
>> fault then please point it out. If you don't see a fault, please say
>> why the behaviors are different. I certainly understand that
>> eventually you will call cache_rpm_request() for both cases. It's
>> just that in one case the call happens right away and the other case
>> it is deferred.
>
> True. I see where your confusion is. It is because of an optimization and
> our existential need for saving power at every opportunity.
>
> For rpmh_write path -
> The reason being for the disparity is that, we can vote for a system low
> power mode (modes where we flush the caches and put the entire silicon
> is a low power state) at any point when all the CPUs are idle. If a
> driver has updated its vote for a system resource, it needs to be
> updated in the cache and thats what cache_rpm_request() does. Its
> possible that we may enter a system low power mode immediately after
> that. The rpmh_flush() would be called by the idle driver and we would
> flush the cached values and enter the idle state. By writing immediately
> to the TCS, we avoid increasing the latency of entering an idle state.
>
> For the rpmh_write_batch() path -
> The Bus driver would invalidate the TCS and the batch_cache. The use
> case fills up the batch_cache with values as needed by the use case.
> While during the usecase, the CPUs can go idle and the idle drivers
> would call rpmh_flush(). At that time, we would write the batch_cache
> into the already invalidated TCSes and then write the rest of the cached
> requests from ->cache. We then enter the low power modes.
>
> The optimization of writing the sleep/wake votes in the same context of
> the driver making the request, helps us avoid writing some extra
> registers in the critical idle path.
I don't think I followed all that, but I'll try to read deeper into
RPMh. ...but one question: could the two paths be changed to be the
same? AKA would it be bad to call rpmh_rsc_send_data() synchronously
in rpmh_write_batch()?
If the two cases really need to behave differently then it should be
documented in the code.
-Doug
Hi,
On 5/12/2018 1:45 AM, Doug Anderson wrote:
> Hi,
>
> On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
>> +int rpmh_rsc_send_data(struct rsc_drv *drv, const struct tcs_request *msg)
>> +{
>> + int ret;
>> +
>> + if (!msg || !msg->cmds || !msg->num_cmds ||
>> + msg->num_cmds > MAX_RPMH_PAYLOAD) {
>> + WARN_ON(1);
>> + return -EINVAL;
>> + }
>> +
>> + do {
>> + ret = tcs_write(drv, msg);
>> + if (ret == -EBUSY) {
>> + pr_info_ratelimited("TCS Busy, retrying RPMH message send: addr=%#x\n",
>> + msg->cmds[0].addr);
>> + udelay(10);
>> + }
>> + } while (ret == -EBUSY);
>> +
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(rpmh_rsc_send_data);
>
> Here and elsewhere in this series: why EXPORT_SYMBOL in this case?
> This is only exported to rpmh.c, right? You don't need EXPORT_SYMBOL
> for that. The Makefile puts rpmh.c and rpmh-rsc.c together in the
> same "qcom_rpmh.o", and then even further the KConfig lists this as
> bool so both are builtin to the kernel.
>
> -Doug
Sure. I Will change in v9.
Thanks for your review Doug,
Raju
> --
> To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
Hi,
On 5/15/2018 11:52 PM, Doug Anderson wrote:
> Hi,
>
> On Tue, May 15, 2018 at 10:47 AM, Lina Iyer <[email protected]> wrote:
>> On Fri, May 11 2018 at 14:17 -0600, Doug Anderson wrote:
>>>
>>> Hi,
>>>
>>> On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
>>>>
>>>> +int rpmh_write(const struct device *dev, enum rpmh_state state,
>>>> + const struct tcs_cmd *cmd, u32 n)
>>>> +{
>>>> + DECLARE_COMPLETION_ONSTACK(compl);
>>>> + DEFINE_RPMH_MSG_ONSTACK(dev, state, &compl, rpm_msg);
>>>> + int ret;
>>>> +
>>>> + if (!cmd || !n || n > MAX_RPMH_PAYLOAD)
>>>> + return -EINVAL;
>>>> +
>>>> + memcpy(rpm_msg.cmd, cmd, n * sizeof(*cmd));
>>>> + rpm_msg.msg.num_cmds = n;
>>>> +
>>>> + ret = __rpmh_write(dev, state, &rpm_msg);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + ret = wait_for_completion_timeout(&compl, RPMH_TIMEOUT_MS);
>>>
>>>
>>> IMO it's almost never a good idea to use wait_for_completion_timeout()
>>> together with a completion that's declared on the stack. If you
>>> somehow insist that this is a good idea then I need to see incredibly
>>> clear and obvious code/comments that say why it's impossible that the
>>> process might somehow try to signal the completion _after_
>>> RPMH_TIMEOUT_MS has expired.
>>>
>>> Specifically if the timeout happens but the process could still signal
>>> a completion later then they will access random data on the stack of a
>>> function that has already returned. This causes ridiculously
>>> difficult-to-debug crashes.
>>>
>>>
>>> NOTE: You've got timeout set to 10 seconds here. Is that really even
>>> useful? IMO just call wait_for_completion() without a timeout. It's
>>> much better to have a nice clean hang than a random stack corruption.
>>>
>>>
>> The 10 sec timeout will guarantee that we will not get a response at all
>> anymore for the request. Usually requests can be considered failed if
>> there is no response in a few tens of microseconds. 10 sec is just an
>> arbitarily large number.
>>
>> The reason we use timeout is that once the timeout happens, we know we
>> have failed, we could trigger a watchdog or crash the system. This is
>> very important for our productization in debugging RPMH failures. A
>> hang would not always trigger a watchdog and the failure would be silent
>> and possibly fatal but hard to debug.
>
> If you intend the system to crash when this timeout happens then IMHO
> add a BUG_ON. Then I won't worry about something coming around later
> and clobbering the stack.
>
> -Doug
>
Sure. Will add BUG_ON in next patch.
Thanks,
Raju.
Hi,
On 5/12/2018 1:48 AM, Doug Anderson wrote:
> Hi,
>
> On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
>> /**
>> * struct rpmh_request: the message to be sent to rpmh-rsc
>> *
>> @@ -54,9 +71,15 @@ struct rpmh_request {
>> * struct rpmh_ctrlr: our representation of the controller
>> *
>> * @drv: the controller instance
>> + * @cache: the list of cached requests
>> + * @lock: synchronize access to the controller data
>
> nit: this makes it sound like this lock will be grabbed for all calls
> into rpmh-rsc. In fact, it only protects access to the cache.
> Ideally name it cache_lock and document that it's for protecting the
> cache.
>
>
>> +/**
>> + * rpmh_flush: Flushes the buffered active and sleep sets to TCS
>> + *
>> + * @dev: The device making the request
>> + *
>> + * Return: -EBUSY if the controller is busy, probably waiting on a response
>> + * to a RPMH request sent earlier.
>> + *
>> + * This function is generally called from the sleep code from the last CPU
>
> "is generally" implies that sometimes it's not called from the sleep
> code. Change to "is always". If "is generally" is more correct, you
> can't run lockless right?
>
>
> -Doug
>
Yes. Will change in next patch.
Thanks,
Raju
Hi,
On 5/12/2018 1:46 AM, Doug Anderson wrote:
> Hi,
>
> On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
>> /**
>> @@ -137,6 +140,8 @@ void rpmh_tx_done(const struct tcs_request *msg, int r)
>> dev_err(rpm_msg->dev, "RPMH TX fail in msg addr=%#x, err=%d\n",
>> rpm_msg->msg.cmds[0].addr, r);
>>
>> + kfree(rpm_msg->free);
>> +
>
> The way the code is written makes it seem like you could free memory
> _and_ have a completion but you can't. Specifically:
>
> * The only plausible thing that "rpm_msg->free" could point to is "rpm_msg".
>
> * The complete(compl) would then be accessing freed memory.
As the completions are declared on stack, it will not access freed memory.
>
> I believe the kfree() should be at the end of the function.
> Personally I'd make it more obvious that this is just a boolean value
> and change to:
>
> if (rpm_msg->needs_free)
> kgree(rpm_msg)
>
> ...then the reader of the code doesn't need to go figure out what
> you're trying to free.
>
>
> -Doug
>
Yes. Will change it this way in next patch set.
Thanks,
Raju
Hi,
will reply on points other than what Lina has responded.
On 5/12/2018 1:49 AM, Doug Anderson wrote:
> Hi,
>
> On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
>> /**
>> @@ -77,12 +82,14 @@ struct rpmh_request {
>> * @cache: the list of cached requests
>> * @lock: synchronize access to the controller data
>> * @dirty: was the cache updated since flush
>> + * @batch_cache: Cache sleep and wake requests sent as batch
>> */
>> struct rpmh_ctrlr {
>> struct rsc_drv *drv;
>> struct list_head cache;
>> spinlock_t lock;
>> bool dirty;
>> + const struct rpmh_request *batch_cache[RPMH_MAX_BATCH_CACHE];
>
> I'm pretty confused about why the "batch_cache" is separate from the
> normal cache. As far as I can tell the purpose of the two is the same
> but you have two totally separate code paths and data structures.
>
>
>> };
>>
>> static struct rpmh_ctrlr rpmh_rsc[RPMH_MAX_CTRLR];
>> @@ -133,6 +140,7 @@ void rpmh_tx_done(const struct tcs_request *msg, int r)
>> struct rpmh_request *rpm_msg = container_of(msg, struct rpmh_request,
>> msg);
>> struct completion *compl = rpm_msg->completion;
>> + atomic_t *wc = rpm_msg->wait_count;
>>
>> rpm_msg->err = r;
>>
>> @@ -143,8 +151,13 @@ void rpmh_tx_done(const struct tcs_request *msg, int r)
>> kfree(rpm_msg->free);
>>
>> /* Signal the blocking thread we are done */
>> - if (compl)
>> - complete(compl);
>> + if (!compl)
>> + return;
>
> The comment above this "if" block no longer applies to the line next
> to it after your patch. ...but below I suggest you get rid of
> "wait_count", so maybe this part of the patch will go away.
>
>
>> +static int cache_batch(struct rpmh_ctrlr *ctrlr,
>> + struct rpmh_request **rpm_msg, int count)
>> +{
>> + unsigned long flags;
>> + int ret = 0;
>> + int index = 0;
>> + int i;
>> +
>> + spin_lock_irqsave(&ctrlr->lock, flags);
>> + while (index < RPMH_MAX_BATCH_CACHE && ctrlr->batch_cache[index])
>> + index++;
>> + if (index + count >= RPMH_MAX_BATCH_CACHE) {
>> + ret = -ENOMEM;
>> + goto fail;
>> + }
>> +
>> + for (i = 0; i < count; i++)
>> + ctrlr->batch_cache[index + i] = rpm_msg[i];
>> +fail:
>
> Nit: this label is for both failure and normal exit, so call it "exit".
>
>
>> + spin_unlock_irqrestore(&ctrlr->lock, flags);
>> +
>> + return ret;
>> +}
>
> As part of my overall confusion about why the batch cache is different
> than the normal one: for the normal use case you still call
> rpmh_rsc_write_ctrl_data() for things you put in your cache, but you
> don't for the batch cache. I still haven't totally figured out what
> rpmh_rsc_write_ctrl_data() does, but it seems strange that you don't
> do it for the batch cache but you do for the other one.
>
>
>> +/**
>> + * rpmh_write_batch: Write multiple sets of RPMH commands and wait for the
>> + * batch to finish.
>> + *
>> + * @dev: the device making the request
>> + * @state: Active/sleep set
>> + * @cmd: The payload data
>> + * @n: The array of count of elements in each batch, 0 terminated.
>> + *
>> + * Write a request to the RSC controller without caching. If the request
>> + * state is ACTIVE, then the requests are treated as completion request
>> + * and sent to the controller immediately. The function waits until all the
>> + * commands are complete. If the request was to SLEEP or WAKE_ONLY, then the
>> + * request is sent as fire-n-forget and no ack is expected.
>> + *
>> + * May sleep. Do not call from atomic contexts for ACTIVE_ONLY requests.
>> + */
>> +int rpmh_write_batch(const struct device *dev, enum rpmh_state state,
>> + const struct tcs_cmd *cmd, u32 *n)
>> +{
>> + struct rpmh_request *rpm_msg[RPMH_MAX_REQ_IN_BATCH] = { NULL };
>> + DECLARE_COMPLETION_ONSTACK(compl);
>> + atomic_t wait_count = ATOMIC_INIT(0);
>> + struct rpmh_ctrlr *ctrlr = get_rpmh_ctrlr(dev);
>> + int count = 0;
>> + int ret, i;
>> +
>> + if (IS_ERR(ctrlr) || !cmd || !n)
>> + return -EINVAL;
>> +
>> + while (n[count++] > 0)
>> + ;
>> + count--;
>> + if (!count || count > RPMH_MAX_REQ_IN_BATCH)
>> + return -EINVAL;
>> +
>> + for (i = 0; i < count; i++) {
>> + rpm_msg[i] = __get_rpmh_msg_async(state, cmd, n[i]);
>> + if (IS_ERR_OR_NULL(rpm_msg[i])) {
>
> Just "IS_ERR". It's never NULL.
> ...also add a i-- somewhere in here or you're going to be kfree()ing
> your error value, aren't you?
Sure. Will make change in next patch.
>
>
>> + ret = PTR_ERR(rpm_msg[i]);
>> + for (; i >= 0; i--)
>> + kfree(rpm_msg[i]->free);
>> + return ret;
>> + }
>> + cmd += n[i];
>> + }
>> +
>> + if (state != RPMH_ACTIVE_ONLY_STATE)
>> + return cache_batch(ctrlr, rpm_msg, count);
>
> Don't you need to free rpm_msg items in this case?
>
>
>> +
>> + atomic_set(&wait_count, count);
>> +
>> + for (i = 0; i < count; i++) {
>> + rpm_msg[i]->completion = &compl;
>> + rpm_msg[i]->wait_count = &wait_count;
>> + ret = rpmh_rsc_send_data(ctrlr->drv, &rpm_msg[i]->msg);
>> + if (ret) {
>> + int j;
>> +
>> + pr_err("Error(%d) sending RPMH message addr=%#x\n",
>> + ret, rpm_msg[i]->msg.cmds[0].addr);
>> + for (j = i; j < count; j++)
>> + rpmh_tx_done(&rpm_msg[j]->msg, ret);
>
> You're just using rpmh_tx_done() to free memory? Note that you'll
> probably do your error handling in this function a favor if you rename
> __get_rpmh_msg_async() to __fill_rpmh_msg() and remove the memory
> allocation from there. Then you can do one big allocation of the
> whole array in rpmh_write_batch() and then you'll only have one free
> at the end...
>
>
>
>> + break;
>
> "break" seems wrong here. You'll end up waiting for the completion,
> then I guess timing out, then returning -ETIMEDOUT?
As the loop above break runs for remaining count, completion will be
notified so there will not be waiting.
Thanks,
Raju
>
>
>> + }
>> + }
>> +
>> + ret = wait_for_completion_timeout(&compl, RPMH_TIMEOUT_MS);
>
> The "wait_count" abstraction is confusing and I believe it's not
> needed. I think you can remove it and change the above to this
> (untested) code:
>
> time_left = RPMH_TIMEOUT_MS;
> for (i = 0; i < count; i++) {
> time_left = wait_for_completion_timeout(&compl, time_left);
> if (!time_left)
> return -ETIMEDOUT;
> }
>
> ...specifically completions are additive, so just wait "count" times
> and then the reader doesn't need to learn your new wait_count
> abstraction and try to reason about it.
>
> ...and, actually, I argue in other replies that this should't use a
> timeout, so even cleaner:
>
> for (i = 0; i < count; i++)
> wait_for_completion(&compl);
>
>
> Once you do that, you can also get rid of the need to pre-count "n",
> so all your loops turn into:
>
> for (i = 0; n[i]; i++)
>
>
> I suppose you might want to get rid of "RPMH_MAX_REQ_IN_BATCH" and
> dynamically allocate your array too, but that seems sane. As per
> above it seems like you should just dynamically allocate a whole array
> of "struct rpmh_request" items at once anyway.
>
> ---
>
>> + return (ret > 0) ? 0 : -ETIMEDOUT;
>> +
>> +}
>> +EXPORT_SYMBOL(rpmh_write_batch);
>
> Perhaps an even simpler thing than taking all my advice above: can't
> you just add a optional completion to rpmh_write_async()? That would
> just be stuffed into rpm_msg.
>
> Now your batch code would just be a bunch of calls to
> rpmh_write_async() with an equal number of wait_for_completion() calls
> at the end. Is there a reason that wouldn't work? You'd get rid of
> _a lot_ of code.
>
>
> -Doug
>
Hi,
On 5/12/2018 1:47 AM, Doug Anderson wrote:
> Hi,
>
> On Wed, May 9, 2018 at 10:01 AM, Lina Iyer <[email protected]> wrote:
>> Some RSCs may only have sleep and wake TCS, i.e, there is no dedicated
>> TCS for active mode request, but drivers may still want to make active
>> requests from these RSCs. In such cases re-purpose the wake TCS to send
>> active state requests.
>>
>> The requirement for this is that the driver is aware that the wake TCS
>> is being repurposed to send active request, hence the sleep and wake
>> TCSes be invalidated before the active request is sent.
>>
>> Signed-off-by: Lina Iyer <[email protected]>
>> Reviewed-by: Matthias Kaehlcke <[email protected]>
>> ---
>> drivers/soc/qcom/rpmh-rsc.c | 18 +++++++++++++++++-
>> 1 file changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
>> index 68c25ebbbe09..369b9b3eedc5 100644
>> --- a/drivers/soc/qcom/rpmh-rsc.c
>> +++ b/drivers/soc/qcom/rpmh-rsc.c
>> @@ -153,6 +153,7 @@ static struct tcs_group *get_tcs_for_msg(struct rsc_drv *drv,
>> const struct tcs_request *msg)
>> {
>> int type;
>> + struct tcs_group *tcs;
>>
>> switch (msg->state) {
>> case RPMH_ACTIVE_ONLY_STATE:
>> @@ -168,7 +169,22 @@ static struct tcs_group *get_tcs_for_msg(struct rsc_drv *drv,
>> return ERR_PTR(-EINVAL);
>> }
>>
>> - return get_tcs_of_type(drv, type);
>> + /*
>> + * If we are making an active request on a RSC that does not have a
>> + * dedicated TCS for active state use, then re-purpose a wake TCS to
>> + * send active votes.
>> + * NOTE: The driver must be aware that this RSC does not have a
>> + * dedicated AMC, and therefore would invalidate the sleep and wake
>> + * TCSes before making an active state request.
>> + */
>> + tcs = get_tcs_of_type(drv, type);
>> + if (msg->state == RPMH_ACTIVE_ONLY_STATE && IS_ERR(tcs)) {
>> + tcs = get_tcs_of_type(drv, WAKE_TCS);
>> + if (!IS_ERR(tcs))
>> + rpmh_rsc_invalidate(drv);
>
> I noticed that rpmh_rsc_invalidate() can return -EAGAIN. Do you need
> to deal with that here?
Yes. will add the check here.
Thanks,
Raju.
>
>
> -Doug
>
Hi,
On Wed, May 23, 2018 at 6:27 AM, Raju P L S S S N
<[email protected]> wrote:
>>> + break;
>>
>>
>> "break" seems wrong here. You'll end up waiting for the completion,
>> then I guess timing out, then returning -ETIMEDOUT?
>
>
> As the loop above break runs for remaining count, completion will be
> notified so there will not be waiting.
Ah, I see. I missed that.
...oh, I guess you need to wait because any requests you might have
made will try to signal your completion which is on the stack...