2021-11-29 19:14:32

by Cristian Marussi

Subject: [PATCH v7 00/16] Introduce atomic support for SCMI transports

Hi all,

This series mainly aims to introduce atomic mode support for those SCMI
transports that can support it.

After a bit of refactoring in the first 5 patches of the series, patch
[06/16] introduces, as a closely related addition, a common way for a
transport to signal to the SCMI core that it does not offer completion
interrupts, so that the usual polling behaviour will be required: this can
be done either by statically enabling a global polling behaviour for the
whole transport with the flag scmi_desc.force_polling, OR by dynamically
enabling such polling behaviour at runtime on a per-channel basis with the
flag scmi_chan_info.no_completion_irq, typically during .chan_setup().
The usual per-command polling selection behaviour based on
hdr.poll_completion is preserved as before.
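
As a rough sketch (the transport name, ops and IRQ lookup below are made
up purely for illustration, and the rest of the usual channel setup is
omitted), a transport could opt in either way like this:

/* Statically: force polling for every channel of this transport */
const struct scmi_desc scmi_foo_desc = {
	.ops = &scmi_foo_ops,
	.max_rx_timeout_ms = 30,
	.max_msg = 20,
	.max_msg_size = 128,
	.force_polling = true,
};

/* Dynamically: flag only the channels found without a completion IRQ */
static int scmi_foo_chan_setup(struct scmi_chan_info *cinfo,
			       struct device *dev, bool tx)
{
	int irq = of_irq_get_byname(dev->of_node, "foo");

	if (irq <= 0)
		cinfo->no_completion_irq = true;

	return 0;
}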

Patch [07/16] ports the SMC transport to use the common core completions
when a completion interrupt is available, or otherwise to fall back to the
common core polling mechanism introduced above: this avoids the improper
direct call of scmi_rx_callback from smc_send_message.

With [08/16] I introduce a flag to allow a transport to signal to the core
that, upon return of a .send_message(), the requested command execution can
be assumed by the core to have been fully completed by the platform, so
that the response payload (if any) can be fetched immediately without the
need to actually poll the channel.
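
In a simplified form (locking, state checks and timeout handling omitted),
the core waiting logic from [08/16] then boils down to:

/*
 * Condensed from scmi_wait_for_message_response() in [08/16]:
 * when polling was selected, real busy-waiting happens only if the
 * transport did not declare sync_cmds_atomic_replies; in both cases
 * the response is then fetched straight away from the channel.
 */
if (xfer->hdr.poll_completion) {
	if (!info->desc->sync_cmds_atomic_replies)
		spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));

	info->desc->ops->fetch_response(cinfo, xfer);
	xfer->state = SCMI_XFER_RESP_OK;
}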

In [09/16] and [10/16] I enable such a flag for the SMC and OPTEE transports.

With [11/16] a transport that supports atomic operations on its TX path
can now declare itself as .atomic_enabled and, if the platform has also
been configured to use such an atomic operation mode via Kconfig, the SCMI
core will likewise refrain from sleeping on the RX path and will instead
rely on polling on the corresponding channel whenever such atomic behaviour
is requested by the upper layers (like the Clock framework).
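
The per-command request itself remains the usual hdr.poll_completion
mechanism; as a purely illustrative sketch of what a protocol
implementation can do (FOO_CONFIG_SET and the message layout are invented
here, not part of this series):

struct scmi_msg_foo_config_set {
	__le32 id;
	__le32 attributes;
};

static int scmi_foo_config_set(const struct scmi_protocol_handle *ph,
			       u32 id, u32 attrs, bool atomic)
{
	int ret;
	struct scmi_xfer *t;
	struct scmi_msg_foo_config_set *cfg;

	ret = ph->xops->xfer_get_init(ph, FOO_CONFIG_SET, sizeof(*cfg), 0, &t);
	if (ret)
		return ret;

	/* Ask the core for a non-sleeping, polled wait on this command */
	t->hdr.poll_completion = atomic;

	cfg = t->tx.buf;
	cfg->id = cpu_to_le32(id);
	cfg->attributes = cpu_to_le32(attrs);

	ret = ph->xops->do_xfer(ph, t);

	ph->xops->xfer_put(ph, t);

	return ret;
}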

Then in [12/16] SMC is converted to be .atomic_enabled by replacing the
mutexes with busy-waiting to keep the channel 'locked', ONLY IF the
SMC transport is configured to operate in atomic mode
(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE=y).
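
Condensed from the diff in [12/16], the channel 'lock' then reduces to an
inflight token plus busy-waiting:

#define INFLIGHT_NONE	MSG_TOKEN_MAX

static bool smc_xfer_inflight(struct scmi_xfer *xfer, atomic_t *inflight)
{
	return atomic_cmpxchg(inflight, INFLIGHT_NONE,
			      xfer->hdr.seq) == INFLIGHT_NONE;
}

/* acquire: spin until we own the shared memory area ... */
spin_until_cond(smc_xfer_inflight(xfer, &scmi_info->inflight));
/* ... and release it in .mark_txdone once the reply has been fetched */
atomic_set(&scmi_info->inflight, INFLIGHT_NONE);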

In [13/16] a new parameter is added to the mark_txdone transport operation;
this will be needed to support polling/atomic mode in SCMI virtio.

[14/16] adds polling and configurable atomic mode support to SCMI virtio.
(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE=y)

Finally [15/16] adds support for atomic clock enable/disable requests to
the SCMI Clock protocol layer, while in [16/16] the SCMI Clock driver is
modified to gain optional support for atomic operations when operating on
an atomically capable and configured SCMI transport.
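
[16/16] itself is not reproduced in full below, but the idea on the
clk-scmi side is roughly the following (the ops names here are
illustrative, not necessarily those used in the patch):

static const struct clk_ops *scmi_clk_select_ops(const struct scmi_handle *h)
{
	/*
	 * Register clk_ops providing atomic .enable/.disable only when
	 * the underlying transport is atomic; otherwise stick to the
	 * sleeping .prepare/.unprepare variants.
	 */
	return h->is_transport_atomic(h) ? &scmi_atomic_clk_ops
					 : &scmi_sleeping_clk_ops;
}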

Atomic support has been tested against the virtio transport with regard to
the above-mentioned Clock enable/disable operations.
(echo 1 > /sys/kernel/debug/clk/myclock/clk_prepare_enable)
Pure polling has been tested on the mailbox transport.

The series is based on sudeep/for-next/scmi [1] on top of:

commit 61bc76be367e ("firmware: arm_scmi: optee: Fix missing mutex_init()")

Any feedback welcome.

Thanks,

Cristian

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux.git/log/?h=for-next/scmi

---
V6 --> V7
- rebased on latest sudeep/for-next/scmi
- fixed missing list_del() in vio_complete while processing pre-fetched buffers
- added proper spinlocking while accessing pending_cmds_list
- reduced unneeded spinlocked region in virtio_poll_done
- using __unused macro in optional mark_txdone param

V5 --> V6
- changed default polling timeout
- refactored SCMI RX path
- added timeout_ms to traces in RX-path
- added OPTEE support for sync_cmds_atomic_replies
- reworked the whole atomic support in SCMI core removing any reliance
on IRQs. (using pure polling)
- added new parameter to mark_txdone
- added new SCMI VirtIO polling/atomic support
- reviewed Clock SCMI Driver atomic support: now supporting both
clk_prepare and clk_enable when atomic transport is detected
- added CCs

V4 --> V5
- removed RFCs tags
- added scmi_desc.atomic_enabled flags and a few Kconfig options to set
atomic mode for SMC and VirtIO transports. Default disabled.
- added Kconfig option to enable forced polling as a whole on the Mailbox
transport
- removed .poll_done callback from SMC transport since no real polling is
needed once sync_cmds_atomic_replies is set
- made atomic_capable changes on SMC transport dependent on Kconfig
CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE: so no change and no busy waiting
if atomic mode is NOT enabled in Kconfig.
- made const: force_polling/atomic_capable/atomic_enabled/sync_cmds_atomic_replies

V3 --> V4
- rebased on linux-next/master next-20210824
- renamed .needs_polling to .no_completion_irq
- added .sync_cmds_atomic_replies
- make SMC use .sync_cmds_atomic_replies

V2 --> v3
- rebased on SCMI VirtIO V6 which in turn is based on v5.14-rc1


Cristian Marussi (16):
firmware: arm_scmi: Perform earlier cinfo lookup call in do_xfer
firmware: arm_scmi: Set polling timeout to max_rx_timeout_ms
firmware: arm_scmi: Refactor message response path
include: trace: Add new scmi_xfer_response_wait event
firmware: arm_scmi: Use new trace event scmi_xfer_response_wait
firmware: arm_scmi: Add configurable polling mode for transports
firmware: arm_scmi: Make smc transport use common completions
firmware: arm_scmi: Add sync_cmds_atomic_replies transport flag
firmware: arm_scmi: Make smc support atomic sync commands replies
firmware: arm_scmi: Make optee support atomic sync commands replies
firmware: arm_scmi: Add support for atomic transports
firmware: arm_scmi: Add atomic mode support to smc transport
firmware: arm_scmi: Add new parameter to mark_txdone
firmware: arm_scmi: Add atomic mode support to virtio transport
firmware: arm_scmi: Add atomic support to clock protocol
clk: scmi: Support atomic clock enable/disable API

drivers/clk/clk-scmi.c | 56 +++++--
drivers/firmware/arm_scmi/Kconfig | 29 ++++
drivers/firmware/arm_scmi/clock.c | 22 ++-
drivers/firmware/arm_scmi/common.h | 26 ++-
drivers/firmware/arm_scmi/driver.c | 196 +++++++++++++++++-----
drivers/firmware/arm_scmi/mailbox.c | 3 +-
drivers/firmware/arm_scmi/optee.c | 18 ++-
drivers/firmware/arm_scmi/smc.c | 108 ++++++++++---
drivers/firmware/arm_scmi/virtio.c | 241 ++++++++++++++++++++++++++--
include/linux/scmi_protocol.h | 11 ++
include/trace/events/scmi.h | 28 ++++
11 files changed, 636 insertions(+), 102 deletions(-)

--
2.17.1



2021-11-29 19:14:33

by Cristian Marussi

Subject: [PATCH v7 01/16] firmware: arm_scmi: Perform earlier cinfo lookup call in do_xfer

Look up cinfo data early in do_xfer so as to avoid any further init work
on the xfer structure in case of error.

No functional change.

Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Cristian Marussi <[email protected]>
---
drivers/firmware/arm_scmi/driver.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 768926a77f5d..3cf161f3bcc7 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -766,6 +766,10 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
return -EINVAL;
}

+ cinfo = idr_find(&info->tx_idr, pi->proto->id);
+ if (unlikely(!cinfo))
+ return -EINVAL;
+
/*
* Initialise protocol id now from protocol handle to avoid it being
* overridden by mistake (or malice) by the protocol code mangling with
@@ -774,10 +778,6 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
xfer->hdr.protocol_id = pi->proto->id;
reinit_completion(&xfer->done);

- cinfo = idr_find(&info->tx_idr, xfer->hdr.protocol_id);
- if (unlikely(!cinfo))
- return -EINVAL;
-
trace_scmi_xfer_begin(xfer->transfer_id, xfer->hdr.id,
xfer->hdr.protocol_id, xfer->hdr.seq,
xfer->hdr.poll_completion);
--
2.17.1


2021-11-29 19:14:41

by Cristian Marussi

Subject: [PATCH v7 02/16] firmware: arm_scmi: Set polling timeout to max_rx_timeout_ms

Use the transport-specific transmission timeout (max_rx_timeout_ms) also
for polling transactions.

Signed-off-by: Cristian Marussi <[email protected]>
---
drivers/firmware/arm_scmi/driver.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 3cf161f3bcc7..568562121f64 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -724,8 +724,6 @@ static void xfer_put(const struct scmi_protocol_handle *ph,
__scmi_xfer_put(&info->tx_minfo, xfer);
}

-#define SCMI_MAX_POLL_TO_NS (100 * NSEC_PER_USEC)
-
static bool scmi_xfer_done_no_timeout(struct scmi_chan_info *cinfo,
struct scmi_xfer *xfer, ktime_t stop)
{
@@ -799,7 +797,8 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
}

if (xfer->hdr.poll_completion) {
- ktime_t stop = ktime_add_ns(ktime_get(), SCMI_MAX_POLL_TO_NS);
+ ktime_t stop = ktime_add_ms(ktime_get(),
+ info->desc->max_rx_timeout_ms);

spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
if (ktime_before(ktime_get(), stop)) {
--
2.17.1


2021-11-29 19:14:46

by Cristian Marussi

Subject: [PATCH v7 04/16] include: trace: Add new scmi_xfer_response_wait event

Having a new step to trace the SCMI stack while it waits for synchronous
responses is useful to analyze system performance when switching the
waiting mode between polling and interrupt completion.

Signed-off-by: Cristian Marussi <[email protected]>
---
v5 --> v6
- removed atomic flag / poll is enough
- added timeout field
---
include/trace/events/scmi.h | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/include/trace/events/scmi.h b/include/trace/events/scmi.h
index f3a4b4d60714..cee4b2b64ae4 100644
--- a/include/trace/events/scmi.h
+++ b/include/trace/events/scmi.h
@@ -33,6 +33,34 @@ TRACE_EVENT(scmi_xfer_begin,
__entry->seq, __entry->poll)
);

+TRACE_EVENT(scmi_xfer_response_wait,
+ TP_PROTO(int transfer_id, u8 msg_id, u8 protocol_id, u16 seq,
+ u32 timeout, bool poll),
+ TP_ARGS(transfer_id, msg_id, protocol_id, seq, timeout, poll),
+
+ TP_STRUCT__entry(
+ __field(int, transfer_id)
+ __field(u8, msg_id)
+ __field(u8, protocol_id)
+ __field(u16, seq)
+ __field(u32, timeout)
+ __field(bool, poll)
+ ),
+
+ TP_fast_assign(
+ __entry->transfer_id = transfer_id;
+ __entry->msg_id = msg_id;
+ __entry->protocol_id = protocol_id;
+ __entry->seq = seq;
+ __entry->timeout = timeout;
+ __entry->poll = poll;
+ ),
+
+ TP_printk("transfer_id=%d msg_id=%u protocol_id=%u seq=%u tmo_ms=%u poll=%u",
+ __entry->transfer_id, __entry->msg_id, __entry->protocol_id,
+ __entry->seq, __entry->timeout, __entry->poll)
+);
+
TRACE_EVENT(scmi_xfer_end,
TP_PROTO(int transfer_id, u8 msg_id, u8 protocol_id, u16 seq,
int status),
--
2.17.1


2021-11-29 19:14:47

by Cristian Marussi

Subject: [PATCH v7 03/16] firmware: arm_scmi: Refactor message response path

Refactor the code path that waits for message responses into a dedicated
helper function.

No functional change.

Signed-off-by: Cristian Marussi <[email protected]>
---
drivers/firmware/arm_scmi/driver.c | 88 +++++++++++++++++++-----------
1 file changed, 56 insertions(+), 32 deletions(-)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 568562121f64..9a8d6bfd4ebb 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -738,6 +738,61 @@ static bool scmi_xfer_done_no_timeout(struct scmi_chan_info *cinfo,
ktime_after(ktime_get(), stop);
}

+/**
+ * scmi_wait_for_message_response - A helper to group all the possible ways of
+ * waiting for a synchronous message response.
+ *
+ * @cinfo: SCMI channel info
+ * @xfer: Reference to the transfer being waited for.
+ *
+ * Chooses waiting strategy (sleep-waiting vs busy-waiting) depending on
+ * configuration flags like xfer->hdr.poll_completion.
+ *
+ * Return: 0 on Success, error otherwise.
+ */
+static int scmi_wait_for_message_response(struct scmi_chan_info *cinfo,
+ struct scmi_xfer *xfer)
+{
+ struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
+ struct device *dev = info->dev;
+ int ret = 0, timeout_ms = info->desc->max_rx_timeout_ms;
+
+ if (xfer->hdr.poll_completion) {
+ ktime_t stop = ktime_add_ms(ktime_get(), timeout_ms);
+
+ spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
+ if (ktime_before(ktime_get(), stop)) {
+ unsigned long flags;
+
+ /*
+ * Do not fetch_response if an out-of-order delayed
+ * response is being processed.
+ */
+ spin_lock_irqsave(&xfer->lock, flags);
+ if (xfer->state == SCMI_XFER_SENT_OK) {
+ info->desc->ops->fetch_response(cinfo, xfer);
+ xfer->state = SCMI_XFER_RESP_OK;
+ }
+ spin_unlock_irqrestore(&xfer->lock, flags);
+ } else {
+ dev_err(dev,
+ "timed out in resp(caller: %pS) - polling\n",
+ (void *)_RET_IP_);
+ ret = -ETIMEDOUT;
+ }
+ } else {
+ /* And we wait for the response. */
+ if (!wait_for_completion_timeout(&xfer->done,
+ msecs_to_jiffies(timeout_ms))) {
+ dev_err(dev, "timed out in resp(caller: %pS)\n",
+ (void *)_RET_IP_);
+ ret = -ETIMEDOUT;
+ }
+ }
+
+ return ret;
+}
+
/**
* do_xfer() - Do one transfer
*
@@ -752,7 +807,6 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
struct scmi_xfer *xfer)
{
int ret;
- int timeout;
const struct scmi_protocol_instance *pi = ph_to_pi(ph);
struct scmi_info *info = handle_to_scmi_info(pi->handle);
struct device *dev = info->dev;
@@ -796,37 +850,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
return ret;
}

- if (xfer->hdr.poll_completion) {
- ktime_t stop = ktime_add_ms(ktime_get(),
- info->desc->max_rx_timeout_ms);
-
- spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
- if (ktime_before(ktime_get(), stop)) {
- unsigned long flags;
-
- /*
- * Do not fetch_response if an out-of-order delayed
- * response is being processed.
- */
- spin_lock_irqsave(&xfer->lock, flags);
- if (xfer->state == SCMI_XFER_SENT_OK) {
- info->desc->ops->fetch_response(cinfo, xfer);
- xfer->state = SCMI_XFER_RESP_OK;
- }
- spin_unlock_irqrestore(&xfer->lock, flags);
- } else {
- ret = -ETIMEDOUT;
- }
- } else {
- /* And we wait for the response. */
- timeout = msecs_to_jiffies(info->desc->max_rx_timeout_ms);
- if (!wait_for_completion_timeout(&xfer->done, timeout)) {
- dev_err(dev, "timed out in resp(caller: %pS)\n",
- (void *)_RET_IP_);
- ret = -ETIMEDOUT;
- }
- }
-
+ ret = scmi_wait_for_message_response(cinfo, xfer);
if (!ret && xfer->hdr.status)
ret = scmi_to_linux_errno(xfer->hdr.status);

--
2.17.1


2021-11-29 19:16:34

by Cristian Marussi

Subject: [PATCH v7 06/16] firmware: arm_scmi: Add configurable polling mode for transports

SCMI communications along TX channels can optionally be provided with a
completion interrupt; when such an interrupt is not available, command
transactions should rely on polling, where the SCMI core takes care to
repeatedly evaluate the transport-specific .poll_done() function, if
available, to determine if and when a request was fully completed or
timed out.

Such a mechanism is already present and working on a per-transfer basis:
SCMI protocols can indeed enable hdr.poll_completion on specific commands
ahead of each transfer and cause that transaction to be handled with
polling.

Introduce a couple of flags to be able to enforce such polling behaviour
globally at will:

- scmi_desc.force_polling: to statically switch the whole transport to
polling mode.

- scmi_chan_info.no_completion_irq: to switch a single channel dynamically
to polling mode if, at runtime, it is determined that no completion
interrupt is available for such a channel.
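
For reference, the transport-specific .poll_done() that the core
busy-waits on is typically a thin check on the shared memory channel
status; e.g. the current SMC one (later dropped by [09/16] once real
polling gets bypassed) simply reads:

static bool
smc_poll_done(struct scmi_chan_info *cinfo, struct scmi_xfer *xfer)
{
	struct scmi_smc *scmi_info = cinfo->transport_info;

	return shmem_poll_done(scmi_info->shmem, xfer);
}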

Signed-off-by: Cristian Marussi <[email protected]>
---
v5 --> v6
- removed check on replies received by IRQs when xfer was requested
as poll_completion (not all transports can suppress IRQs on an xfer basis)
v4 --> v5
- make force_polling const
- introduce polling_enabled flag to simplify checks on do_xfer
v3 --> v4:
- renamed .needs_polling flag to .no_completion_irq
- refactored error path when polling needed but not supported
---
drivers/firmware/arm_scmi/common.h | 11 +++++++++++
drivers/firmware/arm_scmi/driver.c | 17 +++++++++++++++++
2 files changed, 28 insertions(+)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 6438b5248c24..99b74f4d39b6 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -339,11 +339,19 @@ void scmi_protocol_release(const struct scmi_handle *handle, u8 protocol_id);
* @dev: Reference to device in the SCMI hierarchy corresponding to this
* channel
* @handle: Pointer to SCMI entity handle
+ * @no_completion_irq: Flag to indicate that this channel has no completion
+ * interrupt mechanism for synchronous commands.
+ * This can be dynamically set by transports at run-time
+ * inside their provided .chan_setup().
+ * @polling_enabled: Flag used to annotate if polling mode is currently enabled
+ * on this channel.
* @transport_info: Transport layer related information
*/
struct scmi_chan_info {
struct device *dev;
struct scmi_handle *handle;
+ bool no_completion_irq;
+ bool polling_enabled;
void *transport_info;
};

@@ -402,6 +410,8 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
* be pending simultaneously in the system. May be overridden by the
* get_max_msg op.
* @max_msg_size: Maximum size of data per message that can be handled.
+ * @force_polling: Flag to force this whole transport to use SCMI core polling
+ * mechanism instead of completion interrupts even if available.
*/
struct scmi_desc {
int (*transport_init)(void);
@@ -410,6 +420,7 @@ struct scmi_desc {
int max_rx_timeout_ms;
int max_msg;
int max_msg_size;
+ const bool force_polling;
};

#ifdef CONFIG_ARM_SCMI_TRANSPORT_MAILBOX
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 476b91845e40..8a30b832899c 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -817,6 +817,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
struct device *dev = info->dev;
struct scmi_chan_info *cinfo;

+ /* Check for polling request on custom command xfers at first */
if (xfer->hdr.poll_completion && !info->desc->ops->poll_done) {
dev_warn_once(dev,
"Polling mode is not supported by transport.\n");
@@ -827,6 +828,10 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
if (unlikely(!cinfo))
return -EINVAL;

+ /* Initialized to true ONLY if also supported by transport. */
+ if (cinfo->polling_enabled)
+ xfer->hdr.poll_completion = true;
+
/*
* Initialise protocol id now from protocol handle to avoid it being
* overridden by mistake (or malice) by the protocol code mangling with
@@ -1527,6 +1532,18 @@ static int scmi_chan_setup(struct scmi_info *info, struct device *dev,
if (ret)
return ret;

+ if (tx && (cinfo->no_completion_irq || info->desc->force_polling)) {
+ if (info->desc->ops->poll_done) {
+ dev_info(dev,
+ "Enabled polling mode TX channel - prot_id:%d\n",
+ prot_id);
+ cinfo->polling_enabled = true;
+ } else {
+ dev_warn(dev,
+ "Polling mode NOT supported by transport.\n");
+ }
+ }
+
idr_alloc:
ret = idr_alloc(idr, cinfo, prot_id, prot_id + 1, GFP_KERNEL);
if (ret != prot_id) {
--
2.17.1


2021-11-29 19:16:35

by Cristian Marussi

Subject: [PATCH v7 05/16] firmware: arm_scmi: Use new trace event scmi_xfer_response_wait

Use new trace event to mark start of waiting for response section.

Signed-off-by: Cristian Marussi <[email protected]>
---
v5 --> v6
- removed atomic flag / poll is enough
- added timeout field
v4 --> v5
- consider atomic_enable flag too
---
drivers/firmware/arm_scmi/driver.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 9a8d6bfd4ebb..476b91845e40 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -757,6 +757,11 @@ static int scmi_wait_for_message_response(struct scmi_chan_info *cinfo,
struct device *dev = info->dev;
int ret = 0, timeout_ms = info->desc->max_rx_timeout_ms;

+ trace_scmi_xfer_response_wait(xfer->transfer_id, xfer->hdr.id,
+ xfer->hdr.protocol_id, xfer->hdr.seq,
+ timeout_ms,
+ xfer->hdr.poll_completion);
+
if (xfer->hdr.poll_completion) {
ktime_t stop = ktime_add_ms(ktime_get(), timeout_ms);

--
2.17.1


2021-11-29 19:16:37

by Cristian Marussi

Subject: [PATCH v7 07/16] firmware: arm_scmi: Make smc transport use common completions

When a completion IRQ is available, use it and delegate command completion
handling to the core SCMI completion mechanism.

If no completion IRQ is available, revert to polling, using the common core
polling machinery.

Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Cristian Marussi <[email protected]>
---
v6 --> v7
- removed spurious blank line removal
v4 --> v5
- removed RFC tag
v3 --> v4
- renamed usage of .needs_polling to .no_completion_irq
---
drivers/firmware/arm_scmi/smc.c | 39 +++++++++++++++++----------------
1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index 4effecc3bb46..d6c6ad9f6bab 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -25,8 +25,6 @@
* @shmem: Transmit/Receive shared memory area
* @shmem_lock: Lock to protect access to Tx/Rx shared memory area
* @func_id: smc/hvc call function id
- * @irq: Optional; employed when platforms indicates msg completion by intr.
- * @tx_complete: Optional, employed only when irq is valid.
*/

struct scmi_smc {
@@ -34,15 +32,14 @@ struct scmi_smc {
struct scmi_shared_mem __iomem *shmem;
struct mutex shmem_lock;
u32 func_id;
- int irq;
- struct completion tx_complete;
};

static irqreturn_t smc_msg_done_isr(int irq, void *data)
{
struct scmi_smc *scmi_info = data;

- complete(&scmi_info->tx_complete);
+ scmi_rx_callback(scmi_info->cinfo,
+ shmem_read_header(scmi_info->shmem), NULL);

return IRQ_HANDLED;
}
@@ -111,8 +108,8 @@ static int smc_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
dev_err(dev, "failed to setup SCMI smc irq\n");
return ret;
}
- init_completion(&scmi_info->tx_complete);
- scmi_info->irq = irq;
+ } else {
+ cinfo->no_completion_irq = true;
}

scmi_info->func_id = func_id;
@@ -142,26 +139,22 @@ static int smc_send_message(struct scmi_chan_info *cinfo,
struct scmi_smc *scmi_info = cinfo->transport_info;
struct arm_smccc_res res;

+ /*
+ * Channel lock will be released only once response has been
+ * surely fully retrieved, so after .mark_txdone()
+ */
mutex_lock(&scmi_info->shmem_lock);

shmem_tx_prepare(scmi_info->shmem, xfer);

- if (scmi_info->irq)
- reinit_completion(&scmi_info->tx_complete);
-
arm_smccc_1_1_invoke(scmi_info->func_id, 0, 0, 0, 0, 0, 0, 0, &res);

- if (scmi_info->irq)
- wait_for_completion(&scmi_info->tx_complete);
-
- scmi_rx_callback(scmi_info->cinfo,
- shmem_read_header(scmi_info->shmem), NULL);
-
- mutex_unlock(&scmi_info->shmem_lock);
-
/* Only SMCCC_RET_NOT_SUPPORTED is valid error code */
- if (res.a0)
+ if (res.a0) {
+ mutex_unlock(&scmi_info->shmem_lock);
return -EOPNOTSUPP;
+ }
+
return 0;
}

@@ -173,6 +166,13 @@ static void smc_fetch_response(struct scmi_chan_info *cinfo,
shmem_fetch_response(scmi_info->shmem, xfer);
}

+static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret)
+{
+ struct scmi_smc *scmi_info = cinfo->transport_info;
+
+ mutex_unlock(&scmi_info->shmem_lock);
+}
+
static bool
smc_poll_done(struct scmi_chan_info *cinfo, struct scmi_xfer *xfer)
{
@@ -186,6 +186,7 @@ static const struct scmi_transport_ops scmi_smc_ops = {
.chan_setup = smc_chan_setup,
.chan_free = smc_chan_free,
.send_message = smc_send_message,
+ .mark_txdone = smc_mark_txdone,
.fetch_response = smc_fetch_response,
.poll_done = smc_poll_done,
};
--
2.17.1


2021-11-29 19:16:38

by Cristian Marussi

Subject: [PATCH v7 08/16] firmware: arm_scmi: Add sync_cmds_atomic_replies transport flag

Add a flag to let a transport signal to the core that its handling of sync
commands implies that, after .send_message has returned successfully, the
requested command can be assumed to have been fully and completely executed
on the SCMI platform side, so that any possible response value is already
immediately available to be retrieved by a .fetch_response: in other words,
the polling phase can be skipped in such a case and the response values
accessed straight away.

Note that all of the above applies only when the polling mode of operation
has been selected by the core: if instead a completion IRQ was found to be
available, the normal completion-based response processing path will still
be followed.
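
A transport opts in simply by setting the new descriptor flag; e.g. what
[09/16] will do for SMC:

const struct scmi_desc scmi_smc_desc = {
	.ops = &scmi_smc_ops,
	.max_rx_timeout_ms = 30,
	.max_msg = 20,
	.max_msg_size = 128,
	.sync_cmds_atomic_replies = true,
};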

Signed-off-by: Cristian Marussi <[email protected]>
---
v5 --> v6
- added polling_capable helper flag
v4 --> v5
- removed RFC tag
- consider sync_cmds_atomic_replies flag when deciding if polling is to be
supported and .poll_done() is not provided.
- reviewed commit message
---
drivers/firmware/arm_scmi/common.h | 8 ++++++
drivers/firmware/arm_scmi/driver.c | 43 +++++++++++++++++++++++-------
2 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 99b74f4d39b6..bf25f0e89c78 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -412,6 +412,13 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
* @max_msg_size: Maximum size of data per message that can be handled.
* @force_polling: Flag to force this whole transport to use SCMI core polling
* mechanism instead of completion interrupts even if available.
+ * @sync_cmds_atomic_replies: Flag to indicate that the transport assures
+ * synchronous-command messages are atomically
+ * completed on .send_message: no need to poll
+ * actively waiting for a response.
+ * Used by core internally only when polling is
+ * selected as a waiting for reply method: i.e.
+ * if a completion irq was found use that anyway.
*/
struct scmi_desc {
int (*transport_init)(void);
@@ -421,6 +428,7 @@ struct scmi_desc {
int max_msg;
int max_msg_size;
const bool force_polling;
+ const bool sync_cmds_atomic_replies;
};

#ifdef CONFIG_ARM_SCMI_TRANSPORT_MAILBOX
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 8a30b832899c..8c04632f3ba3 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -134,6 +134,8 @@ struct scmi_protocol_instance {
* @notify_priv: Pointer to private data structure specific to notifications.
* @node: List head
* @users: Number of users of this instance
+ * @polling_capable: A flag to annotate if the underlying transport can operate
+ * in polling mode.
*/
struct scmi_info {
struct device *dev;
@@ -152,6 +154,7 @@ struct scmi_info {
void *notify_priv;
struct list_head node;
int users;
+ bool polling_capable;
};

#define handle_to_scmi_info(h) container_of(h, struct scmi_info, handle)
@@ -763,10 +766,28 @@ static int scmi_wait_for_message_response(struct scmi_chan_info *cinfo,
xfer->hdr.poll_completion);

if (xfer->hdr.poll_completion) {
- ktime_t stop = ktime_add_ms(ktime_get(), timeout_ms);
+ /*
+ * Real polling is needed only if transport has NOT declared
+ * itself to support synchronous commands replies.
+ */
+ if (!info->desc->sync_cmds_atomic_replies) {
+ /*
+ * Poll on xfer using transport provided .poll_done();
+ * assumes no completion interrupt was available.
+ */
+ ktime_t stop = ktime_add_ms(ktime_get(), timeout_ms);
+
+ spin_until_cond(scmi_xfer_done_no_timeout(cinfo,
+ xfer, stop));
+ if (ktime_after(ktime_get(), stop)) {
+ dev_err(dev,
+ "timed out in resp(caller: %pS) - polling\n",
+ (void *)_RET_IP_);
+ ret = -ETIMEDOUT;
+ }
+ }

- spin_until_cond(scmi_xfer_done_no_timeout(cinfo, xfer, stop));
- if (ktime_before(ktime_get(), stop)) {
+ if (!ret) {
unsigned long flags;

/*
@@ -779,11 +800,6 @@ static int scmi_wait_for_message_response(struct scmi_chan_info *cinfo,
xfer->state = SCMI_XFER_RESP_OK;
}
spin_unlock_irqrestore(&xfer->lock, flags);
- } else {
- dev_err(dev,
- "timed out in resp(caller: %pS) - polling\n",
- (void *)_RET_IP_);
- ret = -ETIMEDOUT;
}
} else {
/* And we wait for the response. */
@@ -818,7 +834,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
struct scmi_chan_info *cinfo;

/* Check for polling request on custom command xfers at first */
- if (xfer->hdr.poll_completion && !info->desc->ops->poll_done) {
+ if (xfer->hdr.poll_completion && !info->polling_capable) {
dev_warn_once(dev,
"Polling mode is not supported by transport.\n");
return -EINVAL;
@@ -1533,7 +1549,7 @@ static int scmi_chan_setup(struct scmi_info *info, struct device *dev,
return ret;

if (tx && (cinfo->no_completion_irq || info->desc->force_polling)) {
- if (info->desc->ops->poll_done) {
+ if (info->polling_capable) {
dev_info(dev,
"Enabled polling mode TX channel - prot_id:%d\n",
prot_id);
@@ -1866,6 +1882,13 @@ static int scmi_probe(struct platform_device *pdev)

info->dev = dev;
info->desc = desc;
+ /*
+ * Annotate if the underlying transport for this instance has polling
+ * capabilities by providing a .poll_done callback OR not requiring
+ * real polling at all.
+ */
+ info->polling_capable = info->desc->ops->poll_done ||
+ info->desc->sync_cmds_atomic_replies;
INIT_LIST_HEAD(&info->node);
idr_init(&info->protocols);
mutex_init(&info->protocols_mtx);
--
2.17.1


2021-11-29 19:16:41

by Cristian Marussi

Subject: [PATCH v7 09/16] firmware: arm_scmi: Make smc support atomic sync commands replies

Enable sync_cmds_atomic_replies in the SMC transport descriptor and remove
the SMC-specific .poll_done callback support, since polling is bypassed
when sync_cmds_atomic_replies is set.

Signed-off-by: Cristian Marussi <[email protected]>
---
v4 --> v5
- removed RFC tag
- added comment on setting flag
- remove smc_poll_done
---
drivers/firmware/arm_scmi/smc.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index d6c6ad9f6bab..b2f31d3feb10 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -173,14 +173,6 @@ static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret)
mutex_unlock(&scmi_info->shmem_lock);
}

-static bool
-smc_poll_done(struct scmi_chan_info *cinfo, struct scmi_xfer *xfer)
-{
- struct scmi_smc *scmi_info = cinfo->transport_info;
-
- return shmem_poll_done(scmi_info->shmem, xfer);
-}
-
static const struct scmi_transport_ops scmi_smc_ops = {
.chan_available = smc_chan_available,
.chan_setup = smc_chan_setup,
@@ -188,7 +180,6 @@ static const struct scmi_transport_ops scmi_smc_ops = {
.send_message = smc_send_message,
.mark_txdone = smc_mark_txdone,
.fetch_response = smc_fetch_response,
- .poll_done = smc_poll_done,
};

const struct scmi_desc scmi_smc_desc = {
@@ -196,4 +187,13 @@ const struct scmi_desc scmi_smc_desc = {
.max_rx_timeout_ms = 30,
.max_msg = 20,
.max_msg_size = 128,
+ /*
+ * Setting .sync_cmds_atomic_replies to true for SMC assumes that,
+ * once the SMC instruction has completed successfully, the issued
+ * SCMI command would have been already fully processed by the SCMI
+ * platform firmware and so any possible response value expected
+ * for the issued command will be immediately ready to be fetched
+ * from the shared memory area.
+ */
+ .sync_cmds_atomic_replies = true,
};
--
2.17.1


2021-11-29 19:16:45

by Cristian Marussi

Subject: [PATCH v7 10/16] firmware: arm_scmi: Make optee support atomic sync commands replies

Declare each OPTEE SCMI channel as not having a completion IRQ, so as to
enable polling mode, and then also enable the .sync_cmds_atomic_replies
flag in the OPTEE transport descriptor so that real polling is itself
effectively bypassed on the RX path: once the OPTEE command invocation has
successfully returned, the core will directly fetch the response from the
shared memory area.

Remove the OPTEE SCMI transport-specific .poll_done callback support, since
real polling is effectively bypassed when .sync_cmds_atomic_replies is set.

Add OPTEE SCMI transport-specific .mark_txdone callback support in order to
properly handle channel locking along the TX path.

Cc: Etienne Carriere <[email protected]>
Signed-off-by: Cristian Marussi <[email protected]>
---
v6 --> v7
- reviewed commit message
---
drivers/firmware/arm_scmi/optee.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/firmware/arm_scmi/optee.c b/drivers/firmware/arm_scmi/optee.c
index 901737c9f5f8..2428032b61ca 100644
--- a/drivers/firmware/arm_scmi/optee.c
+++ b/drivers/firmware/arm_scmi/optee.c
@@ -380,6 +380,9 @@ static int scmi_optee_chan_setup(struct scmi_chan_info *cinfo, struct device *de
if (ret)
goto err_close_sess;

+ /* Enable polling */
+ cinfo->no_completion_irq = true;
+
mutex_lock(&scmi_optee_private->mu);
list_add(&channel->link, &scmi_optee_private->channel_list);
mutex_unlock(&scmi_optee_private->mu);
@@ -440,9 +443,8 @@ static int scmi_optee_send_message(struct scmi_chan_info *cinfo,
shmem_tx_prepare(shmem, xfer);

ret = invoke_process_smt_channel(channel);
-
- scmi_rx_callback(cinfo, shmem_read_header(shmem), NULL);
- mutex_unlock(&channel->mu);
+ if (ret)
+ mutex_unlock(&channel->mu);

return ret;
}
@@ -456,13 +458,11 @@ static void scmi_optee_fetch_response(struct scmi_chan_info *cinfo,
shmem_fetch_response(shmem, xfer);
}

-static bool scmi_optee_poll_done(struct scmi_chan_info *cinfo,
- struct scmi_xfer *xfer)
+static void scmi_optee_mark_txdone(struct scmi_chan_info *cinfo, int ret)
{
struct scmi_optee_channel *channel = cinfo->transport_info;
- struct scmi_shared_mem *shmem = get_channel_shm(channel, xfer);

- return shmem_poll_done(shmem, xfer);
+ mutex_unlock(&channel->mu);
}

static struct scmi_transport_ops scmi_optee_ops = {
@@ -471,9 +471,9 @@ static struct scmi_transport_ops scmi_optee_ops = {
.chan_setup = scmi_optee_chan_setup,
.chan_free = scmi_optee_chan_free,
.send_message = scmi_optee_send_message,
+ .mark_txdone = scmi_optee_mark_txdone,
.fetch_response = scmi_optee_fetch_response,
.clear_channel = scmi_optee_clear_channel,
- .poll_done = scmi_optee_poll_done,
};

static int scmi_optee_ctx_match(struct tee_ioctl_version_data *ver, const void *data)
@@ -579,4 +579,5 @@ const struct scmi_desc scmi_optee_desc = {
.max_rx_timeout_ms = 30,
.max_msg = 20,
.max_msg_size = SCMI_OPTEE_MAX_MSG_SIZE,
+ .sync_cmds_atomic_replies = true,
};
--
2.17.1


2021-11-29 19:16:47

by Cristian Marussi

Subject: [PATCH v7 11/16] firmware: arm_scmi: Add support for atomic transports

An SCMI transport can be configured as .atomic_enabled in order to signal
to the SCMI core that its whole TX path is executed in atomic context and
that, when requested, polling mode should be used while waiting for command
responses.

When a specific platform configuration has properly configured such a
transport as .atomic_enabled, the SCMI core will also take care not to
sleep in the corresponding RX path while waiting for a response, if that
specific command transaction was requested as atomic using polling mode.

Asynchronous commands should not be used in an atomic context, so a
warning is emitted if polling was requested for an asynchronous command.

Also add a method to check, from SCMI drivers, whether the underlying SCMI
transport is currently configured to support atomic transactions: this will
be used by upper layers to determine if atomic requests can be supported at
all on this SCMI instance.
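
As a minimal, purely illustrative usage sketch (driver name and probe body
are invented here), an SCMI driver can then query the handle at probe time:

static int scmi_foo_probe(struct scmi_device *sdev)
{
	const struct scmi_handle *handle = sdev->handle;

	if (handle->is_transport_atomic(handle))
		dev_info(&sdev->dev,
			 "atomic SCMI transactions are supported\n");

	return 0;
}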

Signed-off-by: Cristian Marussi <[email protected]>
---
v6 --> v7
- reviewed commit message
- converted async WARN_ON to WARN_ON_ONCE
v5 --> v6
- removed atomic_capable
- fully relying on transport polling capabilities
- removed polling/atomic support for delayed_response and WARN
- merged with is_transport_atomic() patch
- is_transport_atomic() now considers polling_capable and
atomic_enabled flags
v4 --> v5
- added .atomic_enabled flag to decide whether to enable atomic mode or not
for atomic_capable transports
- reviewed commit message
---
drivers/firmware/arm_scmi/common.h | 4 +++
drivers/firmware/arm_scmi/driver.c | 50 ++++++++++++++++++++++++++++--
include/linux/scmi_protocol.h | 8 +++++
3 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index bf25f0e89c78..97a65d5fbb1d 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -419,6 +419,9 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
* Used by core internally only when polling is
* selected as a waiting for reply method: i.e.
* if a completion irq was found use that anyway.
+ * @atomic_enabled: Flag to indicate that this transport, which is assured not
+ * to sleep anywhere on the TX path, can be used in atomic mode
+ * when requested.
*/
struct scmi_desc {
int (*transport_init)(void);
@@ -429,6 +432,7 @@ struct scmi_desc {
int max_msg_size;
const bool force_polling;
const bool sync_cmds_atomic_replies;
+ const bool atomic_enabled;
};

#ifdef CONFIG_ARM_SCMI_TRANSPORT_MAILBOX
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 8c04632f3ba3..fd8ca72c67a1 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -907,6 +907,20 @@ static void reset_rx_to_maxsz(const struct scmi_protocol_handle *ph,
* @ph: Pointer to SCMI protocol handle
* @xfer: Transfer to initiate and wait for response
*
+ * Using asynchronous commands in atomic/polling mode should be avoided since
+ * it could cause long busy-waiting here, so ignore polling for the delayed
+ * response and WARN if it was requested for this command transaction since
+ * upper layers should refrain from issuing such kind of requests.
+ *
+ * The only other option would have been to refrain from using any asynchronous
+ * command even if made available, when an atomic transport is detected, and
+ * instead forcibly use the synchronous version (thing that can be easily
+ * attained at the protocol layer), but this would also have led to longer
+ * stalls of the channel for synchronous commands and possibly timeouts.
+ * (in other words there is usually a good reason if a platform provides an
+ * asynchronous version of a command and we should prefer to use it...just not
+ * when using atomic/polling mode)
+ *
* Return: -ETIMEDOUT in case of no delayed response, if transmit error,
* return corresponding error, else if all goes well, return 0.
*/
@@ -918,12 +932,24 @@ static int do_xfer_with_response(const struct scmi_protocol_handle *ph,

xfer->async_done = &async_response;

+ /*
+ * Delayed responses should not be polled, so an async command should
+ * not have been used when requiring an atomic/poll context; WARN and
+ * perform instead a sleeping wait.
+ * (Note Async + IgnoreDelayedResponses are sent via do_xfer)
+ */
+ WARN_ON_ONCE(xfer->hdr.poll_completion);
+
ret = do_xfer(ph, xfer);
if (!ret) {
- if (!wait_for_completion_timeout(xfer->async_done, timeout))
+ if (!wait_for_completion_timeout(xfer->async_done, timeout)) {
+ dev_err(ph->dev,
+ "timed out in delayed resp(caller: %pS)\n",
+ (void *)_RET_IP_);
ret = -ETIMEDOUT;
- else if (xfer->hdr.status)
+ } else if (xfer->hdr.status) {
ret = scmi_to_linux_errno(xfer->hdr.status);
+ }
}

xfer->async_done = NULL;
@@ -1357,6 +1383,21 @@ static void scmi_devm_protocol_put(struct scmi_device *sdev, u8 protocol_id)
WARN_ON(ret);
}

+/**
+ * scmi_is_transport_atomic - Method to check if underlying transport for an
+ * SCMI instance is configured as atomic.
+ *
+ * @handle: A reference to the SCMI platform instance.
+ *
+ * Return: True if transport is configured as atomic
+ */
+static bool scmi_is_transport_atomic(const struct scmi_handle *handle)
+{
+ struct scmi_info *info = handle_to_scmi_info(handle);
+
+ return info->desc->atomic_enabled && info->polling_capable;
+}
+
static inline
struct scmi_handle *scmi_handle_get_from_info_unlocked(struct scmi_info *info)
{
@@ -1903,6 +1944,7 @@ static int scmi_probe(struct platform_device *pdev)
handle->version = &info->version;
handle->devm_protocol_get = scmi_devm_protocol_get;
handle->devm_protocol_put = scmi_devm_protocol_put;
+ handle->is_transport_atomic = scmi_is_transport_atomic;

if (desc->ops->link_supplier) {
ret = desc->ops->link_supplier(dev);
@@ -1921,6 +1963,10 @@ static int scmi_probe(struct platform_device *pdev)
if (scmi_notification_init(handle))
dev_err(dev, "SCMI Notifications NOT available.\n");

+ if (info->desc->atomic_enabled && !info->polling_capable)
+ dev_err(dev,
+ "Transport is not polling capable. Atomic mode not supported.\n");
+
/*
* Trigger SCMI Base protocol initialization.
* It's mandatory and won't be ever released/deinit until the
diff --git a/include/linux/scmi_protocol.h b/include/linux/scmi_protocol.h
index 80e781c51ddc..9f895cb81818 100644
--- a/include/linux/scmi_protocol.h
+++ b/include/linux/scmi_protocol.h
@@ -612,6 +612,13 @@ struct scmi_notify_ops {
* @devm_protocol_get: devres managed method to acquire a protocol and get specific
* operations and a dedicated protocol handler
* @devm_protocol_put: devres managed method to release a protocol
+ * @is_transport_atomic: method to check if the underlying transport for this
+ * instance handle is configured to support atomic
+ * transactions for commands.
+ * Some users of the SCMI stack in the upper layers could
+ * be interested to know if they can assume SCMI
+ * command transactions associated to this handle will
+ * never sleep and act accordingly.
* @notify_ops: pointer to set of notifications related operations
*/
struct scmi_handle {
@@ -622,6 +629,7 @@ struct scmi_handle {
(*devm_protocol_get)(struct scmi_device *sdev, u8 proto,
struct scmi_protocol_handle **ph);
void (*devm_protocol_put)(struct scmi_device *sdev, u8 proto);
+ bool (*is_transport_atomic)(const struct scmi_handle *handle);

const struct scmi_notify_ops *notify_ops;
};
--
2.17.1


2021-11-29 19:16:50

by Cristian Marussi

Subject: [PATCH v7 12/16] firmware: arm_scmi: Add atomic mode support to smc transport

Add a Kernel configuration option to enable SCMI SMC transport atomic
mode operation for selected SCMI transactions, leaving it disabled by
default.

Replace mutex usage with busy-waiting and declare the smc transport as
.atomic_enabled if such a Kernel configuration option is enabled.

Signed-off-by: Cristian Marussi <[email protected]>
---
v5 --> v6
- remove usage of atomic_capable
- removed needless union
- reviewed Kconfig help
v4 --> v5
- removed RFC tag
- add CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE option
- add .atomic_enable support
- make atomic_capable dependent on
CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
- make also usage of mutexes vs busy-waiting dependent on
CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
---
drivers/firmware/arm_scmi/Kconfig | 14 +++++++
drivers/firmware/arm_scmi/smc.c | 66 ++++++++++++++++++++++++++++---
2 files changed, 74 insertions(+), 6 deletions(-)

diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index 638ecec89ff1..d429326433d1 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -78,6 +78,20 @@ config ARM_SCMI_TRANSPORT_SMC
If you want the ARM SCMI PROTOCOL stack to include support for a
transport based on SMC, answer Y.

+config ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
+ bool "Enable atomic mode support for SCMI SMC transport"
+ depends on ARM_SCMI_TRANSPORT_SMC
+ help
+ Enable support of atomic operation for SCMI SMC based transport.
+
+ If you want the SCMI SMC based transport to operate in atomic
+ mode, avoiding any kind of sleeping behaviour for selected
+ transactions on the TX path, answer Y.
+ Enabling atomic mode operations allows any SCMI driver using this
+ transport to optionally ask for atomic SCMI transactions and operate
+ in atomic context too, at the price of using a number of busy-waiting
+ primitives all over instead. If unsure say N.
+
config ARM_SCMI_TRANSPORT_VIRTIO
bool "SCMI transport based on VirtIO"
depends on VIRTIO=y || VIRTIO=ARM_SCMI_PROTOCOL
diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index b2f31d3feb10..0fc49cb49185 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -7,6 +7,7 @@
*/

#include <linux/arm-smccc.h>
+#include <linux/atomic.h>
#include <linux/device.h>
#include <linux/err.h>
#include <linux/interrupt.h>
@@ -14,6 +15,9 @@
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/of_irq.h>
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
+#include <linux/processor.h>
+#endif
#include <linux/slab.h>

#include "common.h"
@@ -23,14 +27,23 @@
*
* @cinfo: SCMI channel info
* @shmem: Transmit/Receive shared memory area
- * @shmem_lock: Lock to protect access to Tx/Rx shared memory area
+ * @shmem_lock: Lock to protect access to Tx/Rx shared memory area.
+ * Used when NOT operating in atomic mode.
+ * @inflight: Atomic flag to protect access to Tx/Rx shared memory area.
+ * Used when operating in atomic mode.
* @func_id: smc/hvc call function id
*/

struct scmi_smc {
struct scmi_chan_info *cinfo;
struct scmi_shared_mem __iomem *shmem;
+#ifndef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
+ /* Protect access to shmem area */
struct mutex shmem_lock;
+#else
+#define INFLIGHT_NONE MSG_TOKEN_MAX
+ atomic_t inflight;
+#endif
u32 func_id;
};

@@ -54,6 +67,46 @@ static bool smc_chan_available(struct device *dev, int idx)
return true;
}

+static inline void smc_channel_lock_init(struct scmi_smc *scmi_info)
+{
+#ifndef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
+ mutex_init(&scmi_info->shmem_lock);
+#else
+ atomic_set(&scmi_info->inflight, INFLIGHT_NONE);
+#endif
+}
+
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
+static bool smc_xfer_inflight(struct scmi_xfer *xfer, atomic_t *inflight)
+{
+ int ret;
+
+ ret = atomic_cmpxchg(inflight, INFLIGHT_NONE, xfer->hdr.seq);
+
+ return ret == INFLIGHT_NONE;
+}
+#endif
+
+static inline void
+smc_channel_lock_acquire(struct scmi_smc *scmi_info,
+ struct scmi_xfer *xfer __maybe_unused)
+{
+#ifndef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
+ mutex_lock(&scmi_info->shmem_lock);
+#else
+ spin_until_cond(smc_xfer_inflight(xfer, &scmi_info->inflight));
+#endif
+}
+
+static inline void smc_channel_lock_release(struct scmi_smc *scmi_info)
+{
+#ifndef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
+ mutex_unlock(&scmi_info->shmem_lock);
+#else
+ atomic_set(&scmi_info->inflight, INFLIGHT_NONE);
+#endif
+}
+
static int smc_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
bool tx)
{
@@ -114,7 +167,7 @@ static int smc_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,

scmi_info->func_id = func_id;
scmi_info->cinfo = cinfo;
- mutex_init(&scmi_info->shmem_lock);
+ smc_channel_lock_init(scmi_info);
cinfo->transport_info = scmi_info;

return 0;
@@ -140,10 +193,10 @@ static int smc_send_message(struct scmi_chan_info *cinfo,
struct arm_smccc_res res;

/*
- * Channel lock will be released only once response has been
+ * Channel will be released only once response has been
* surely fully retrieved, so after .mark_txdone()
*/
- mutex_lock(&scmi_info->shmem_lock);
+ smc_channel_lock_acquire(scmi_info, xfer);

shmem_tx_prepare(scmi_info->shmem, xfer);

@@ -151,7 +204,7 @@ static int smc_send_message(struct scmi_chan_info *cinfo,

/* Only SMCCC_RET_NOT_SUPPORTED is valid error code */
if (res.a0) {
- mutex_unlock(&scmi_info->shmem_lock);
+ smc_channel_lock_release(scmi_info);
return -EOPNOTSUPP;
}

@@ -170,7 +223,7 @@ static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret)
{
struct scmi_smc *scmi_info = cinfo->transport_info;

- mutex_unlock(&scmi_info->shmem_lock);
+ smc_channel_lock_release(scmi_info);
}

static const struct scmi_transport_ops scmi_smc_ops = {
@@ -196,4 +249,5 @@ const struct scmi_desc scmi_smc_desc = {
* from the shared memory area.
*/
.sync_cmds_atomic_replies = true,
+ .atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE),
};
--
2.17.1


2021-11-29 19:16:54

by Cristian Marussi

Subject: [PATCH v7 13/16] firmware: arm_scmi: Add new parameter to mark_txdone

Add a new xfer parameter to the mark_txdone transport operation, which
enables the SCMI core to optionally pass back into the transport layer a
reference to the xfer descriptor that is being handled.

Signed-off-by: Cristian Marussi <[email protected]>
---
v6 --> v7
- use __unused macro for new param in existing transports
---
drivers/firmware/arm_scmi/common.h | 3 ++-
drivers/firmware/arm_scmi/driver.c | 2 +-
drivers/firmware/arm_scmi/mailbox.c | 3 ++-
drivers/firmware/arm_scmi/optee.c | 3 ++-
drivers/firmware/arm_scmi/smc.c | 3 ++-
5 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 97a65d5fbb1d..8ee12d6e1abe 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -381,7 +381,8 @@ struct scmi_transport_ops {
unsigned int (*get_max_msg)(struct scmi_chan_info *base_cinfo);
int (*send_message)(struct scmi_chan_info *cinfo,
struct scmi_xfer *xfer);
- void (*mark_txdone)(struct scmi_chan_info *cinfo, int ret);
+ void (*mark_txdone)(struct scmi_chan_info *cinfo, int ret,
+ struct scmi_xfer *xfer);
void (*fetch_response)(struct scmi_chan_info *cinfo,
struct scmi_xfer *xfer);
void (*fetch_notification)(struct scmi_chan_info *cinfo,
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index fd8ca72c67a1..35098555be3c 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -881,7 +881,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
ret = scmi_to_linux_errno(xfer->hdr.status);

if (info->desc->ops->mark_txdone)
- info->desc->ops->mark_txdone(cinfo, ret);
+ info->desc->ops->mark_txdone(cinfo, ret, xfer);

trace_scmi_xfer_end(xfer->transfer_id, xfer->hdr.id,
xfer->hdr.protocol_id, xfer->hdr.seq, ret);
diff --git a/drivers/firmware/arm_scmi/mailbox.c b/drivers/firmware/arm_scmi/mailbox.c
index e09eb12bf421..08ff4d110beb 100644
--- a/drivers/firmware/arm_scmi/mailbox.c
+++ b/drivers/firmware/arm_scmi/mailbox.c
@@ -140,7 +140,8 @@ static int mailbox_send_message(struct scmi_chan_info *cinfo,
return ret;
}

-static void mailbox_mark_txdone(struct scmi_chan_info *cinfo, int ret)
+static void mailbox_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+ struct scmi_xfer *__unused)
{
struct scmi_mailbox *smbox = cinfo->transport_info;

diff --git a/drivers/firmware/arm_scmi/optee.c b/drivers/firmware/arm_scmi/optee.c
index 2428032b61ca..1282b935df49 100644
--- a/drivers/firmware/arm_scmi/optee.c
+++ b/drivers/firmware/arm_scmi/optee.c
@@ -458,7 +458,8 @@ static void scmi_optee_fetch_response(struct scmi_chan_info *cinfo,
shmem_fetch_response(shmem, xfer);
}

-static void scmi_optee_mark_txdone(struct scmi_chan_info *cinfo, int ret)
+static void scmi_optee_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+ struct scmi_xfer *__unused)
{
struct scmi_optee_channel *channel = cinfo->transport_info;

diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
index 0fc49cb49185..9920b4639bfd 100644
--- a/drivers/firmware/arm_scmi/smc.c
+++ b/drivers/firmware/arm_scmi/smc.c
@@ -219,7 +219,8 @@ static void smc_fetch_response(struct scmi_chan_info *cinfo,
shmem_fetch_response(scmi_info->shmem, xfer);
}

-static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret)
+static void smc_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+ struct scmi_xfer *__unused)
{
struct scmi_smc *scmi_info = cinfo->transport_info;

--
2.17.1


2021-11-29 19:16:56

by Cristian Marussi

Subject: [PATCH v7 14/16] firmware: arm_scmi: Add atomic mode support to virtio transport

Add support for the .mark_txdone and .poll_done transport operations to the
SCMI VirtIO transport as prerequisites to enable atomic operations.

Add a Kernel configuration option to enable SCMI VirtIO transport polling
and atomic mode for selected SCMI transactions, leaving it disabled by
default.

Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Igor Skalkin <[email protected]>
Cc: Peter Hilber <[email protected]>
Cc: [email protected]
Signed-off-by: Cristian Marussi <[email protected]>
---
V6 --> V7
- added a few comments about virtio polling internals
- fixed missing list_del on pending_cmds_list processing
- shrank spinlocked areas in virtio_poll_done
- added proper spinlocking to scmi_vio_complete_cb while scanning list
of pending cmds
---
drivers/firmware/arm_scmi/Kconfig | 15 ++
drivers/firmware/arm_scmi/virtio.c | 241 +++++++++++++++++++++++++++--
2 files changed, 243 insertions(+), 13 deletions(-)

diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index d429326433d1..7794bd41eaa0 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
the ones implemented by kvmtool) and let the core Kernel VirtIO layer
take care of the needed conversions, say N.

+config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+ bool "Enable atomic mode for SCMI VirtIO transport"
+ depends on ARM_SCMI_TRANSPORT_VIRTIO
+ help
+ Enable support of atomic operation for SCMI VirtIO based transport.
+
+ If you want the SCMI VirtIO based transport to operate in atomic
+ mode, avoiding any kind of sleeping behaviour for selected
+ transactions on the TX path, answer Y.
+
+ Enabling atomic mode operations allows any SCMI driver using this
+ transport to optionally ask for atomic SCMI transactions and operate
+ in atomic context too, at the price of using a number of busy-waiting
+ primitives all over instead. If unsure say N.
+
endif #ARM_SCMI_PROTOCOL

config ARM_SCMI_POWER_DOMAIN
diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
index fd0f6f91fc0b..0598e185a786 100644
--- a/drivers/firmware/arm_scmi/virtio.c
+++ b/drivers/firmware/arm_scmi/virtio.c
@@ -38,6 +38,7 @@
* @vqueue: Associated virtqueue
* @cinfo: SCMI Tx or Rx channel
* @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
+ * @pending_cmds_list: List of pre-fetched commands queued for later processing
* @is_rx: Whether channel is an Rx channel
* @ready: Whether transport user is ready to hear about channel
* @max_msg: Maximum number of pending messages for this channel.
@@ -49,6 +50,9 @@ struct scmi_vio_channel {
struct virtqueue *vqueue;
struct scmi_chan_info *cinfo;
struct list_head free_list;
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+ struct list_head pending_cmds_list;
+#endif
bool is_rx;
bool ready;
unsigned int max_msg;
@@ -65,12 +69,22 @@ struct scmi_vio_channel {
* @input: SDU used for (delayed) responses and notifications
* @list: List which scmi_vio_msg may be part of
* @rx_len: Input SDU size in bytes, once input has been received
+ * @poll_idx: Last used index registered for polling purposes if this message
+ * transaction reply was configured for polling.
+ * Note that virtqueue used index is an unsigned 16-bit.
+ * @poll_lock: Protect access to @poll_idx.
*/
struct scmi_vio_msg {
struct scmi_msg_payld *request;
struct scmi_msg_payld *input;
struct list_head list;
unsigned int rx_len;
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+#define VIO_MSG_POLL_DONE 0xffffffffUL
+ unsigned int poll_idx;
+ /* lock to protect access to poll_idx. */
+ spinlock_t poll_lock;
+#endif
};

/* Only one SCMI VirtIO device can possibly exist */
@@ -104,17 +118,22 @@ static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
return rc;
}

+static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
+ struct scmi_vio_msg *msg)
+{
+ /* Here IRQs are assumed to be already disabled by the caller */
+ spin_lock(&vioch->lock);
+ list_add(&msg->list, &vioch->free_list);
+ spin_unlock(&vioch->lock);
+}
+
static void scmi_finalize_message(struct scmi_vio_channel *vioch,
struct scmi_vio_msg *msg)
{
- if (vioch->is_rx) {
+ if (vioch->is_rx)
scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
- } else {
- /* Here IRQs are assumed to be already disabled by the caller */
- spin_lock(&vioch->lock);
- list_add(&msg->list, &vioch->free_list);
- spin_unlock(&vioch->lock);
- }
+ else
+ scmi_vio_feed_vq_tx(vioch, msg);
}

static void scmi_vio_complete_cb(struct virtqueue *vqueue)
@@ -140,6 +159,26 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)

/* IRQs already disabled here no need to irqsave */
spin_lock(&vioch->lock);
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+ /* At first scan the list of possibly pre-fetched messages */
+ if (!vioch->is_rx) {
+ struct scmi_vio_msg *tmp;
+
+ list_for_each_entry_safe(msg, tmp,
+ &vioch->pending_cmds_list,
+ list) {
+ list_del(&msg->list);
+ spin_unlock(&vioch->lock);
+
+ scmi_rx_callback(vioch->cinfo,
+ msg_read_header(msg->input),
+ msg);
+ /* Free the processed message once done */
+ spin_lock(&vioch->lock);
+ list_add(&msg->list, &vioch->free_list);
+ }
+ }
+#endif
if (cb_enabled) {
virtqueue_disable_cb(vqueue);
cb_enabled = false;
@@ -257,6 +296,9 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
GFP_KERNEL);
if (!msg->request)
return -ENOMEM;
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+ spin_lock_init(&msg->poll_lock);
+#endif
}

msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
@@ -324,7 +366,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
}

msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
- list_del(&msg->list);
+ /* Re-init element so we can discern anytime if it is still in-flight */
+ list_del_init(&msg->list);

msg_tx_prepare(msg->request, xfer);

@@ -337,6 +380,20 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
dev_err(vioch->cinfo->dev,
"failed to add to TX virtqueue (%d)\n", rc);
} else {
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+ /*
+ * If polling was requested for this transaction:
+ * - retrieve last used index (will be used as polling reference)
+ * - bind the polled message to the xfer via .priv
+ */
+ if (xfer->hdr.poll_completion) {
+ spin_lock(&msg->poll_lock);
+ msg->poll_idx =
+ virtqueue_enable_cb_prepare(vioch->vqueue);
+ spin_unlock(&msg->poll_lock);
+ xfer->priv = msg;
+ }
+#endif
virtqueue_kick(vioch->vqueue);
}

@@ -350,10 +407,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
{
struct scmi_vio_msg *msg = xfer->priv;

- if (msg) {
+ if (msg)
msg_fetch_response(msg->input, msg->rx_len, xfer);
- xfer->priv = NULL;
- }
}

static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
@@ -361,11 +416,163 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
{
struct scmi_vio_msg *msg = xfer->priv;

- if (msg) {
+ if (msg)
msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
- xfer->priv = NULL;
+}
+
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+/**
+ * virtio_mark_txdone - Mark transmission done
+ *
+ * Free only successfully completed polling transfer messages.
+ *
+ * Note that in the SCMI VirtIO transport we never explicitly release timed-out
+ * messages by forcibly re-adding them to the free-list on timeout inside the TX
+ * code path; we instead let IRQ/RX callbacks eventually clean up such messages
+ * once, finally, a late reply is received and discarded (if ever).
+ *
+ * This approach was deemed preferable since those pending timed-out buffers are
+ * still effectively owned by the SCMI platform VirtIO device even after timeout
+ * expiration: forcibly freeing and reusing them before they had been returned
+ * by the SCMI platform could lead to subtle bugs due to message corruption.
+ * An SCMI platform VirtIO device which never returns message buffers is
+ * anyway broken and it will quickly lead to message exhaustion.
+ *
+ * For this same reason, here, we take care to free only the successfully
+ * completed polled messages, since they won't be freed elsewhere; late replies
+ * to timed-out polled messages would be anyway freed by RX callbacks instead.
+ *
+ * @cinfo: SCMI channel info
+ * @ret: Transmission return code
+ * @xfer: Transfer descriptor
+ */
+static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
+ struct scmi_xfer *xfer)
+{
+ unsigned long flags;
+ struct scmi_vio_channel *vioch = cinfo->transport_info;
+ struct scmi_vio_msg *msg = xfer->priv;
+
+ if (!msg)
+ return;
+
+ /* Is a successfully completed polled message still to be finalized ? */
+ spin_lock_irqsave(&msg->poll_lock, flags);
+ if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
+ scmi_vio_feed_vq_tx(vioch, msg);
+ spin_unlock_irqrestore(&msg->poll_lock, flags);
+
+ xfer->priv = NULL;
+}
+
+/**
+ * virtio_poll_done - Provide polling support for VirtIO transport
+ *
+ * @cinfo: SCMI channel info
+ * @xfer: Reference to the transfer being polled for.
+ *
+ * VirtIO core provides a polling mechanism based only on last used indexes:
+ * this means that it is possible to poll the virtqueues waiting for something
+ * new to arrive from the host side but the only way to check if the freshly
+ * arrived buffer was what we were waiting for is to compare the newly arrived
+ * message descriptors with the one we are polling on.
+ *
+ * As a consequence we may dequeue something different from the buffer we
+ * were poll-waiting for: if that is the case, such early-fetched buffers are
+ * then added to the @pending_cmds_list for later processing within the usual
+ * VirtIO callbacks; so, basically, once something new is spotted we proceed
+ * to de-queue all the freshly received used buffers until we find the one we
+ * were polling on, or we empty the virtqueue.
+ *
+ * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
+ * when polling since such flag is per-virtqueues and we do not want to
+ * suppress notifications as a whole: so, if the message we are polling for is
+ * delivered via usual IRQs callbacks, it will be handled as such and the
+ * polling loop in the SCMI Core TX path will be transparently terminated
+ * anyway.
+ *
+ * Return: True once polling has successfully completed.
+ */
+static bool virtio_poll_done(struct scmi_chan_info *cinfo,
+ struct scmi_xfer *xfer)
+{
+ bool ret;
+ unsigned int poll_idx;
+ unsigned long flags;
+ struct scmi_vio_msg *msg = xfer->priv;
+ struct scmi_vio_channel *vioch = cinfo->transport_info;
+
+ if (!msg)
+ return true;
+
+ /*
+ * Keep the spinlocked region as small as possible: we don't care if we miss
+ * something this time, it will be polled again next time.
+ */
+ spin_lock_irqsave(&msg->poll_lock, flags);
+ poll_idx = msg->poll_idx;
+ spin_unlock_irqrestore(&msg->poll_lock, flags);
+
+ /* Processed already by other polling loop on another CPU ? */
+ if (poll_idx == VIO_MSG_POLL_DONE)
+ return true;
+
+ /* Has cmdq index moved at all ? */
+ ret = virtqueue_poll(vioch->vqueue, poll_idx);
+ if (ret) {
+ struct scmi_vio_msg *next_msg;
+
+ spin_lock_irqsave(&vioch->lock, flags);
+ virtqueue_disable_cb(vioch->vqueue);
+ /*
+ * If something arrived we cannot be sure if it was the reply to
+ * the xfer we are polling for, or some replies to other, even
+ * possibly non-polling, pending xfers: process all new messages
+ * till the polled-for message is found OR the vqueue is empty.
+ */
+ do {
+ unsigned int length;
+
+ next_msg = virtqueue_get_buf(vioch->vqueue, &length);
+ if (next_msg) {
+ next_msg->rx_len = length;
+ if (next_msg == msg) {
+ ret = true;
+ break;
+ }
+
+ list_add_tail(&next_msg->list,
+ &vioch->pending_cmds_list);
+ spin_lock(&next_msg->poll_lock);
+ next_msg->poll_idx = VIO_MSG_POLL_DONE;
+ spin_unlock(&next_msg->poll_lock);
+ ret = false;
+ }
+ } while (next_msg);
+
+ /*
+ * When the polling loop has successfully terminated simply
+ * restart the vqueue, no matter if something else was queued
+ * in the meantime, it will be served by normal IRQ/callback
+ * or by the next poll loop.
+ *
+ * Update the polling index to the current vqueue last used
+ * index, if still looking for a reply.
+ */
+ if (ret) {
+ virtqueue_enable_cb(vioch->vqueue);
+ } else {
+ spin_lock(&msg->poll_lock);
+ msg->poll_idx =
+ virtqueue_enable_cb_prepare(vioch->vqueue);
+ spin_unlock(&msg->poll_lock);
+ }
+ spin_unlock_irqrestore(&vioch->lock, flags);
}
+
+ return ret;
}
+#endif

static const struct scmi_transport_ops scmi_virtio_ops = {
.link_supplier = virtio_link_supplier,
@@ -376,6 +583,10 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
.send_message = virtio_send_message,
.fetch_response = virtio_fetch_response,
.fetch_notification = virtio_fetch_notification,
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+ .mark_txdone = virtio_mark_txdone,
+ .poll_done = virtio_poll_done,
+#endif
};

static int scmi_vio_probe(struct virtio_device *vdev)
@@ -418,6 +629,9 @@ static int scmi_vio_probe(struct virtio_device *vdev)
spin_lock_init(&channels[i].lock);
spin_lock_init(&channels[i].ready_lock);
INIT_LIST_HEAD(&channels[i].free_list);
+#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
+ INIT_LIST_HEAD(&channels[i].pending_cmds_list);
+#endif
channels[i].vqueue = vqs[i];

sz = virtqueue_get_vring_size(channels[i].vqueue);
@@ -506,4 +720,5 @@ const struct scmi_desc scmi_virtio_desc = {
.max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
.max_msg = 0, /* overridden by virtio_get_max_msg() */
.max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
+ .atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
};
--
2.17.1
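
For reference, the .poll_done/.mark_txdone pair added above is meant to be
driven by a busy-wait loop in the SCMI core TX path whenever
xfer->hdr.poll_completion is set. A minimal sketch of such a caller (not the
actual core code; the helper name and the timeout handling here are only
assumptions) could look like:

static int scmi_poll_for_reply(struct scmi_chan_info *cinfo,
			       struct scmi_xfer *xfer,
			       const struct scmi_transport_ops *ops,
			       unsigned int timeout_ms)
{
	int ret = 0;
	ktime_t stop = ktime_add_ms(ktime_get(), timeout_ms);

	/* Spin, never sleep, until the transport reports completion */
	while (!ops->poll_done(cinfo, xfer)) {
		if (ktime_after(ktime_get(), stop)) {
			ret = -ETIMEDOUT;
			break;
		}
		cpu_relax();
	}

	if (!ret)
		ops->fetch_response(cinfo, xfer);

	/* Let the transport finalize/free the polled message */
	if (ops->mark_txdone)
		ops->mark_txdone(cinfo, ret, xfer);

	return ret;
}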


2021-11-29 19:18:33

by Cristian Marussi

[permalink] [raw]
Subject: [PATCH v7 15/16] firmware: arm_scmi: Add atomic support to clock protocol

Introduce new _atomic variants for the SCMI clock protocol enable/disable
operations: when an atomic operation is required, the xfer poll_completion
flag is set for that transaction.

Signed-off-by: Cristian Marussi <[email protected]>
---
drivers/firmware/arm_scmi/clock.c | 22 +++++++++++++++++++---
include/linux/scmi_protocol.h | 3 +++
2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/firmware/arm_scmi/clock.c b/drivers/firmware/arm_scmi/clock.c
index 35b56c8ba0c0..72f930c0e3e2 100644
--- a/drivers/firmware/arm_scmi/clock.c
+++ b/drivers/firmware/arm_scmi/clock.c
@@ -273,7 +273,7 @@ static int scmi_clock_rate_set(const struct scmi_protocol_handle *ph,

static int
scmi_clock_config_set(const struct scmi_protocol_handle *ph, u32 clk_id,
- u32 config)
+ u32 config, bool atomic)
{
int ret;
struct scmi_xfer *t;
@@ -284,6 +284,8 @@ scmi_clock_config_set(const struct scmi_protocol_handle *ph, u32 clk_id,
if (ret)
return ret;

+ t->hdr.poll_completion = atomic;
+
cfg = t->tx.buf;
cfg->id = cpu_to_le32(clk_id);
cfg->attributes = cpu_to_le32(config);
@@ -296,12 +298,24 @@ scmi_clock_config_set(const struct scmi_protocol_handle *ph, u32 clk_id,

static int scmi_clock_enable(const struct scmi_protocol_handle *ph, u32 clk_id)
{
- return scmi_clock_config_set(ph, clk_id, CLOCK_ENABLE);
+ return scmi_clock_config_set(ph, clk_id, CLOCK_ENABLE, false);
}

static int scmi_clock_disable(const struct scmi_protocol_handle *ph, u32 clk_id)
{
- return scmi_clock_config_set(ph, clk_id, 0);
+ return scmi_clock_config_set(ph, clk_id, 0, false);
+}
+
+static int scmi_clock_enable_atomic(const struct scmi_protocol_handle *ph,
+ u32 clk_id)
+{
+ return scmi_clock_config_set(ph, clk_id, CLOCK_ENABLE, true);
+}
+
+static int scmi_clock_disable_atomic(const struct scmi_protocol_handle *ph,
+ u32 clk_id)
+{
+ return scmi_clock_config_set(ph, clk_id, 0, true);
}

static int scmi_clock_count_get(const struct scmi_protocol_handle *ph)
@@ -330,6 +344,8 @@ static const struct scmi_clk_proto_ops clk_proto_ops = {
.rate_set = scmi_clock_rate_set,
.enable = scmi_clock_enable,
.disable = scmi_clock_disable,
+ .enable_atomic = scmi_clock_enable_atomic,
+ .disable_atomic = scmi_clock_disable_atomic,
};

static int scmi_clock_protocol_init(const struct scmi_protocol_handle *ph)
diff --git a/include/linux/scmi_protocol.h b/include/linux/scmi_protocol.h
index 9f895cb81818..d4971c991963 100644
--- a/include/linux/scmi_protocol.h
+++ b/include/linux/scmi_protocol.h
@@ -82,6 +82,9 @@ struct scmi_clk_proto_ops {
u64 rate);
int (*enable)(const struct scmi_protocol_handle *ph, u32 clk_id);
int (*disable)(const struct scmi_protocol_handle *ph, u32 clk_id);
+ int (*enable_atomic)(const struct scmi_protocol_handle *ph, u32 clk_id);
+ int (*disable_atomic)(const struct scmi_protocol_handle *ph,
+ u32 clk_id);
};

/**
--
2.17.1
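
As a usage illustration only (a sketch, not part of the series: the wrapper
name and the in_atomic flag are assumptions), an SCMI user holding the clock
protocol ops could pick the variant matching its calling context:

static int example_clk_gate(const struct scmi_clk_proto_ops *ops,
			    const struct scmi_protocol_handle *ph,
			    u32 clk_id, bool in_atomic)
{
	if (in_atomic)
		/* polled transfer, never sleeps */
		return ops->enable_atomic(ph, clk_id);

	/* regular transfer, may sleep waiting for the reply */
	return ops->enable(ph, clk_id);
}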


2021-11-29 19:18:35

by Cristian Marussi

[permalink] [raw]
Subject: [PATCH v7 16/16] clk: scmi: Support atomic clock enable/disable API

Support also atomic enable/disable clk_ops beside the bare non-atomic one
(prepare/unprepare) when the underlying SCMI transport is configured to
support atomic transactions for synchronous commands.

Cc: Michael Turquette <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: [email protected]
Signed-off-by: Cristian Marussi <[email protected]>
---
V5 --> V6
- add concurrent availability of atomic and non atomic reqs
---
drivers/clk/clk-scmi.c | 56 +++++++++++++++++++++++++++++++++++-------
1 file changed, 47 insertions(+), 9 deletions(-)

diff --git a/drivers/clk/clk-scmi.c b/drivers/clk/clk-scmi.c
index 1e357d364ca2..50033d873dde 100644
--- a/drivers/clk/clk-scmi.c
+++ b/drivers/clk/clk-scmi.c
@@ -88,21 +88,53 @@ static void scmi_clk_disable(struct clk_hw *hw)
scmi_proto_clk_ops->disable(clk->ph, clk->id);
}

+static int scmi_clk_atomic_enable(struct clk_hw *hw)
+{
+ struct scmi_clk *clk = to_scmi_clk(hw);
+
+ return scmi_proto_clk_ops->enable_atomic(clk->ph, clk->id);
+}
+
+static void scmi_clk_atomic_disable(struct clk_hw *hw)
+{
+ struct scmi_clk *clk = to_scmi_clk(hw);
+
+ scmi_proto_clk_ops->disable_atomic(clk->ph, clk->id);
+}
+
+/*
+ * We can provide enable/disable atomic callbacks only if the underlying SCMI
+ * transport for an SCMI instance is configured to handle SCMI commands in an
+ * atomic manner.
+ *
+ * When no SCMI atomic transport support is available we instead provide only
+ * the prepare/unprepare API, as allowed by the clock framework when atomic
+ * calls are not available.
+ *
+ * Two distinct sets of clk_ops are provided since we could have multiple SCMI
+ * instances with different underlying transport quality, so they cannot be
+ * shared.
+ */
static const struct clk_ops scmi_clk_ops = {
.recalc_rate = scmi_clk_recalc_rate,
.round_rate = scmi_clk_round_rate,
.set_rate = scmi_clk_set_rate,
- /*
- * We can't provide enable/disable callback as we can't perform the same
- * in atomic context. Since the clock framework provides standard API
- * clk_prepare_enable that helps cases using clk_enable in non-atomic
- * context, it should be fine providing prepare/unprepare.
- */
.prepare = scmi_clk_enable,
.unprepare = scmi_clk_disable,
};

-static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
+static const struct clk_ops scmi_atomic_clk_ops = {
+ .recalc_rate = scmi_clk_recalc_rate,
+ .round_rate = scmi_clk_round_rate,
+ .set_rate = scmi_clk_set_rate,
+ .prepare = scmi_clk_enable,
+ .unprepare = scmi_clk_disable,
+ .enable = scmi_clk_atomic_enable,
+ .disable = scmi_clk_atomic_disable,
+};
+
+static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk,
+ const struct clk_ops *scmi_ops)
{
int ret;
unsigned long min_rate, max_rate;
@@ -110,7 +142,7 @@ static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
struct clk_init_data init = {
.flags = CLK_GET_RATE_NOCACHE,
.num_parents = 0,
- .ops = &scmi_clk_ops,
+ .ops = scmi_ops,
.name = sclk->info->name,
};

@@ -145,6 +177,7 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
struct device_node *np = dev->of_node;
const struct scmi_handle *handle = sdev->handle;
struct scmi_protocol_handle *ph;
+ const struct clk_ops *scmi_ops;

if (!handle)
return -ENODEV;
@@ -168,6 +201,11 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
clk_data->num = count;
hws = clk_data->hws;

+ if (handle->is_transport_atomic(handle))
+ scmi_ops = &scmi_atomic_clk_ops;
+ else
+ scmi_ops = &scmi_clk_ops;
+
for (idx = 0; idx < count; idx++) {
struct scmi_clk *sclk;

@@ -184,7 +222,7 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
sclk->id = idx;
sclk->ph = ph;

- err = scmi_clk_ops_init(dev, sclk);
+ err = scmi_clk_ops_init(dev, sclk, scmi_ops);
if (err) {
dev_err(dev, "failed to register clock %d\n", idx);
devm_kfree(dev, sclk);
--
2.17.1


2021-12-03 02:06:50

by Stephen Boyd

[permalink] [raw]
Subject: Re: [PATCH v7 16/16] clk: scmi: Support atomic clock enable/disable API

Quoting Cristian Marussi (2021-11-29 11:11:56)
> Support also atomic enable/disable clk_ops beside the bare non-atomic one
> (prepare/unprepare) when the underlying SCMI transport is configured to
> support atomic transactions for synchronous commands.
>
> Cc: Michael Turquette <[email protected]>
> Cc: Stephen Boyd <[email protected]>
> Cc: [email protected]
> Signed-off-by: Cristian Marussi <[email protected]>
> ---

Acked-by: Stephen Boyd <[email protected]>

2021-12-03 20:13:56

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH v7 02/16] firmware: arm_scmi: Set polling timeout to max_rx_timeout_ms

On 11/29/21 11:11 AM, Cristian Marussi wrote:
> Use transport specific transmission timeout (max_rx_timeout_ms) also for
> polling transactions.
>
> Signed-off-by: Cristian Marussi <[email protected]>

Reviewed-by: Florian Fainelli <[email protected]>
--
Florian

2021-12-03 20:16:07

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH v7 04/16] include: trace: Add new scmi_xfer_response_wait event

On 11/29/21 11:11 AM, Cristian Marussi wrote:
> Having a new step to trace SCMI stack while it waits for synchronous
> responses is useful to analyze system performance when changing waiting
> mode between polling and interrupt completion.
>
> Signed-off-by: Cristian Marussi <[email protected]>

Reviewed-by: Florian Fainelli <[email protected]>
--
Florian

2021-12-03 20:16:19

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH v7 05/16] firmware: arm_scmi: Use new trace event scmi_xfer_response_wait

On 11/29/21 11:11 AM, Cristian Marussi wrote:
> Use new trace event to mark start of waiting for response section.
>
> Signed-off-by: Cristian Marussi <[email protected]>

Reviewed-by: Florian Fainelli <[email protected]>
--
Florian

2021-12-03 20:16:49

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH v7 15/16] firmware: arm_scmi: Add atomic support to clock protocol

On 11/29/21 11:11 AM, Cristian Marussi wrote:
> Introduce new _atomic variant for SCMI clock protocol operations related
> to enable disable operations: when an atomic operation is required the xfer
> poll_completion flag is set for that transaction.
>
> Signed-off-by: Cristian Marussi <[email protected]>

Reviewed-by: Florian Fainelli <[email protected]>
--
Florian

2021-12-03 20:17:17

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH v7 16/16] clk: scmi: Support atomic clock enable/disable API

On 11/29/21 11:11 AM, Cristian Marussi wrote:
> Support also atomic enable/disable clk_ops beside the bare non-atomic one
> (prepare/unprepare) when the underlying SCMI transport is configured to
> support atomic transactions for synchronous commands.
>
> Cc: Michael Turquette <[email protected]>
> Cc: Stephen Boyd <[email protected]>
> Cc: [email protected]
> Signed-off-by: Cristian Marussi <[email protected]>

Reviewed-by: Florian Fainelli <[email protected]>
--
Florian

2021-12-03 20:17:42

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH v7 13/16] firmware: arm_scmi: Add new parameter to mark_txdone

On 11/29/21 11:11 AM, Cristian Marussi wrote:
> Add a new xfer parameter to mark_txdone transport operation which enables
> the SCMI core to optionally pass back into the transport layer a reference
> to the xfer descriptor that is being handled.
>
> Signed-off-by: Cristian Marussi <[email protected]>

Reviewed-by: Florian Fainelli <[email protected]>
--
Florian

2021-12-10 10:52:54

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v7 16/16] clk: scmi: Support atomic clock enable/disable API

Hi Cristian,

On Mon, 29 Nov 2021 at 20:13, Cristian Marussi <[email protected]> wrote:
>
> Support also atomic enable/disable clk_ops beside the bare non-atomic one
> (prepare/unprepare) when the underlying SCMI transport is configured to
> support atomic transactions for synchronous commands.
>
> Cc: Michael Turquette <[email protected]>
> Cc: Stephen Boyd <[email protected]>
> Cc: [email protected]
> Signed-off-by: Cristian Marussi <[email protected]>
> ---
> V5 --> V6
> - add concurrent availability of atomic and non atomic reqs
> ---
> drivers/clk/clk-scmi.c | 56 +++++++++++++++++++++++++++++++++++-------
> 1 file changed, 47 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/clk/clk-scmi.c b/drivers/clk/clk-scmi.c
> index 1e357d364ca2..50033d873dde 100644
> --- a/drivers/clk/clk-scmi.c
> +++ b/drivers/clk/clk-scmi.c
> @@ -88,21 +88,53 @@ static void scmi_clk_disable(struct clk_hw *hw)
> scmi_proto_clk_ops->disable(clk->ph, clk->id);
> }
>
> +static int scmi_clk_atomic_enable(struct clk_hw *hw)
> +{
> + struct scmi_clk *clk = to_scmi_clk(hw);
> +
> + return scmi_proto_clk_ops->enable_atomic(clk->ph, clk->id);
> +}
> +
> +static void scmi_clk_atomic_disable(struct clk_hw *hw)
> +{
> + struct scmi_clk *clk = to_scmi_clk(hw);
> +
> + scmi_proto_clk_ops->disable_atomic(clk->ph, clk->id);
> +}
> +
> +/*
> + * We can provide enable/disable atomic callbacks only if the underlying SCMI
> + * transport for an SCMI instance is configured to handle SCMI commands in an
> + * atomic manner.
> + *
> + * When no SCMI atomic transport support is available we instead provide only
> + * the prepare/unprepare API, as allowed by the clock framework when atomic
> + * calls are not available.
> + *
> + * Two distinct sets of clk_ops are provided since we could have multiple SCMI
> + * instances with different underlying transport quality, so they cannot be
> + * shared.
> + */
> static const struct clk_ops scmi_clk_ops = {
> .recalc_rate = scmi_clk_recalc_rate,
> .round_rate = scmi_clk_round_rate,
> .set_rate = scmi_clk_set_rate,
> - /*
> - * We can't provide enable/disable callback as we can't perform the same
> - * in atomic context. Since the clock framework provides standard API
> - * clk_prepare_enable that helps cases using clk_enable in non-atomic
> - * context, it should be fine providing prepare/unprepare.
> - */
> .prepare = scmi_clk_enable,
> .unprepare = scmi_clk_disable,
> };
>
> -static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
> +static const struct clk_ops scmi_atomic_clk_ops = {
> + .recalc_rate = scmi_clk_recalc_rate,
> + .round_rate = scmi_clk_round_rate,
> + .set_rate = scmi_clk_set_rate,
> + .prepare = scmi_clk_enable,
> + .unprepare = scmi_clk_disable,
> + .enable = scmi_clk_atomic_enable,

For each clock, we have to start with clk_prepare and then clk_enable;
this means that for an scmi clk we will do
scmi_clk_enable
then
scmi_clk_atomic_enable

scmi_clk_enable and scmi_clk_atomic_enable end up doing the same
thing: scmi_clock_config_set, except that the atomic version doesn't sleep.

So you will enable the clock twice.

This is confirmed when testing your series with the virtio scmi backend,
as I can see two consecutive scmi clk set enable requests for the same
clock.

In case of atomic mode, the clk_prepare should be a nop
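
For illustration only (assuming the scmi_atomic_clk_ops of this patch are
installed), the canonical consumer sequence then ends up as:

static int consumer_enable(struct clk *clk)
{
	int ret;

	/* .prepare -> scmi_clk_enable(): first CLOCK_CONFIG_SET, sleeping */
	ret = clk_prepare(clk);
	if (ret)
		return ret;

	/* .enable -> scmi_clk_atomic_enable(): second CLOCK_CONFIG_SET, polled */
	return clk_enable(clk);
}

i.e. the very same enable request is issued to the platform twice.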

> + .disable = scmi_clk_atomic_disable,
> +};
> +
> +static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk,
> + const struct clk_ops *scmi_ops)
> {
> int ret;
> unsigned long min_rate, max_rate;
> @@ -110,7 +142,7 @@ static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
> struct clk_init_data init = {
> .flags = CLK_GET_RATE_NOCACHE,
> .num_parents = 0,
> - .ops = &scmi_clk_ops,
> + .ops = scmi_ops,
> .name = sclk->info->name,
> };
>
> @@ -145,6 +177,7 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
> struct device_node *np = dev->of_node;
> const struct scmi_handle *handle = sdev->handle;
> struct scmi_protocol_handle *ph;
> + const struct clk_ops *scmi_ops;
>
> if (!handle)
> return -ENODEV;
> @@ -168,6 +201,11 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
> clk_data->num = count;
> hws = clk_data->hws;
>
> + if (handle->is_transport_atomic(handle))
> + scmi_ops = &scmi_atomic_clk_ops;
> + else
> + scmi_ops = &scmi_clk_ops;
> +
> for (idx = 0; idx < count; idx++) {
> struct scmi_clk *sclk;
>
> @@ -184,7 +222,7 @@ static int scmi_clocks_probe(struct scmi_device *sdev)
> sclk->id = idx;
> sclk->ph = ph;
>
> - err = scmi_clk_ops_init(dev, sclk);
> + err = scmi_clk_ops_init(dev, sclk, scmi_ops);
> if (err) {
> dev_err(dev, "failed to register clock %d\n", idx);
> devm_kfree(dev, sclk);
> --
> 2.17.1
>

2021-12-10 12:26:10

by Peter Hilber

[permalink] [raw]
Subject: Re: [PATCH v7 14/16] firmware: arm_scmi: Add atomic mode support to virtio transport

On 29.11.21 20:11, Cristian Marussi wrote:
> Add support for .mark_txdone and .poll_done transport operations to SCMI
> VirtIO transport as pre-requisites to enable atomic operations.
>

Hi Cristian,

I see no conceptual problems with this patch. Please find some inline
remarks below.

Best regards,

Peter

> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> and atomic mode for selected SCMI transactions while leaving it default
> disabled.
>
> Cc: "Michael S. Tsirkin" <[email protected]>
> Cc: Igor Skalkin <[email protected]>
> Cc: Peter Hilber <[email protected]>
> Cc: [email protected]
> Signed-off-by: Cristian Marussi <[email protected]>
> ---
> V6 --> V7
> - added a few comments about virtio polling internals
> - fixed missing list_del on pending_cmds_list processing
> - shrinked spinlocked areas in virtio_poll_done
> - added proper spinlocking to scmi_vio_complete_cb while scanning list
> of pending cmds
> ---
> drivers/firmware/arm_scmi/Kconfig | 15 ++
> drivers/firmware/arm_scmi/virtio.c | 241 +++++++++++++++++++++++++++--
> 2 files changed, 243 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> index d429326433d1..7794bd41eaa0 100644
> --- a/drivers/firmware/arm_scmi/Kconfig
> +++ b/drivers/firmware/arm_scmi/Kconfig
> @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
> the ones implemented by kvmtool) and let the core Kernel VirtIO layer
> take care of the needed conversions, say N.
>
> +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> + bool "Enable atomic mode for SCMI VirtIO transport"
> + depends on ARM_SCMI_TRANSPORT_VIRTIO
> + help
> + Enable support of atomic operation for SCMI VirtIO based transport.
> +
> + If you want the SCMI VirtIO based transport to operate in atomic
> + mode, avoiding any kind of sleeping behaviour for selected
> + transactions on the TX path, answer Y.
> +
> + Enabling atomic mode operations allows any SCMI driver using this
> + transport to optionally ask for atomic SCMI transactions and operate
> + in atomic context too, at the price of using a number of busy-waiting
> + primitives all over instead. If unsure say N.
> +
> endif #ARM_SCMI_PROTOCOL
>
> config ARM_SCMI_POWER_DOMAIN
> diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> index fd0f6f91fc0b..0598e185a786 100644
> --- a/drivers/firmware/arm_scmi/virtio.c
> +++ b/drivers/firmware/arm_scmi/virtio.c
> @@ -38,6 +38,7 @@
> * @vqueue: Associated virtqueue
> * @cinfo: SCMI Tx or Rx channel
> * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
> * @is_rx: Whether channel is an Rx channel
> * @ready: Whether transport user is ready to hear about channel
> * @max_msg: Maximum number of pending messages for this channel.
> @@ -49,6 +50,9 @@ struct scmi_vio_channel {
> struct virtqueue *vqueue;
> struct scmi_chan_info *cinfo;
> struct list_head free_list;
> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> + struct list_head pending_cmds_list;
> +#endif
> bool is_rx;
> bool ready;
> unsigned int max_msg;
> @@ -65,12 +69,22 @@ struct scmi_vio_channel {
> * @input: SDU used for (delayed) responses and notifications
> * @list: List which scmi_vio_msg may be part of
> * @rx_len: Input SDU size in bytes, once input has been received
> + * @poll_idx: Last used index registered for polling purposes if this message
> + * transaction reply was configured for polling.
> + * Note that virtqueue used index is an unsigned 16-bit.
> + * @poll_lock: Protect access to @poll_idx.
> */
> struct scmi_vio_msg {
> struct scmi_msg_payld *request;
> struct scmi_msg_payld *input;
> struct list_head list;
> unsigned int rx_len;
> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> +#define VIO_MSG_POLL_DONE 0xffffffffUL

virtqueue_enable_cb_prepare() returns an "opaque unsigned value", so
this special value should not be used for .poll_idx.

> + unsigned int poll_idx;
> + /* lock to protect access to poll_idx. */
> + spinlock_t poll_lock;
> +#endif
> };
>
> /* Only one SCMI VirtIO device can possibly exist */
> @@ -104,17 +118,22 @@ static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
> return rc;
> }
>
> +static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
> + struct scmi_vio_msg *msg)
> +{
> + /* Here IRQs are assumed to be already disabled by the caller */
> + spin_lock(&vioch->lock);
> + list_add(&msg->list, &vioch->free_list);
> + spin_unlock(&vioch->lock);
> +}
> +
> static void scmi_finalize_message(struct scmi_vio_channel *vioch,
> struct scmi_vio_msg *msg)
> {
> - if (vioch->is_rx) {
> + if (vioch->is_rx)
> scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
> - } else {
> - /* Here IRQs are assumed to be already disabled by the caller */
> - spin_lock(&vioch->lock);
> - list_add(&msg->list, &vioch->free_list);
> - spin_unlock(&vioch->lock);
> - }
> + else
> + scmi_vio_feed_vq_tx(vioch, msg);
> }
>
> static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> @@ -140,6 +159,26 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)

Looks like this function could also call scmi_rx_callback() for polled
messages, which should probably not happen.

>
> /* IRQs already disabled here no need to irqsave */
> spin_lock(&vioch->lock);
> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> + /* At first scan the list of possibly pre-fetched messages */
> + if (!vioch->is_rx) {
> + struct scmi_vio_msg *tmp;
> +
> + list_for_each_entry_safe(msg, tmp,
> + &vioch->pending_cmds_list,
> + list) {
> + list_del(&msg->list);
> + spin_unlock(&vioch->lock);
> +
> + scmi_rx_callback(vioch->cinfo,
> + msg_read_header(msg->input),
> + msg);
> + /* Free the processed message once done */
> + spin_lock(&vioch->lock);
> + list_add(&msg->list, &vioch->free_list);
> + }
> + }
> +#endif
> if (cb_enabled) {
> virtqueue_disable_cb(vqueue);
> cb_enabled = false;
> @@ -257,6 +296,9 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> GFP_KERNEL);
> if (!msg->request)
> return -ENOMEM;
> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> + spin_lock_init(&msg->poll_lock);
> +#endif
> }
>
> msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
> @@ -324,7 +366,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
> }
>
> msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
> - list_del(&msg->list);
> + /* Re-init element so we can discern anytime if it is still in-flight */
> + list_del_init(&msg->list);
>
> msg_tx_prepare(msg->request, xfer);
>
> @@ -337,6 +380,20 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
> dev_err(vioch->cinfo->dev,
> "failed to add to TX virtqueue (%d)\n", rc);
> } else {
> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> + /*
> + * If polling was requested for this transaction:
> + * - retrieve last used index (will be used as polling reference)
> + * - bind the polled message to the xfer via .priv
> + */
> + if (xfer->hdr.poll_completion) {
> + spin_lock(&msg->poll_lock);
> + msg->poll_idx =
> + virtqueue_enable_cb_prepare(vioch->vqueue);
> + spin_unlock(&msg->poll_lock);
> + xfer->priv = msg;
> + }
> +#endif
> virtqueue_kick(vioch->vqueue);
> }
>
> @@ -350,10 +407,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
> {
> struct scmi_vio_msg *msg = xfer->priv;
>
> - if (msg) {
> + if (msg)
> msg_fetch_response(msg->input, msg->rx_len, xfer);
> - xfer->priv = NULL;
> - }
> }
>
> static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> @@ -361,11 +416,163 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> {
> struct scmi_vio_msg *msg = xfer->priv;
>
> - if (msg) {
> + if (msg)
> msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
> - xfer->priv = NULL;
> +}
> +
> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> +/**
> + * virtio_mark_txdone - Mark transmission done
> + *
> + * Free only successfully completed polling transfer messages.
> + *
> + * Note that in the SCMI VirtIO transport we never explicitly release timed-out
> + * messages by forcibly re-adding them to the free-list on timeout inside the TX
> + * code path; we instead let IRQ/RX callbacks eventually clean up such messages
> + * once, finally, a late reply is received and discarded (if ever).
> + *
> + * This approach was deemed preferable since those pending timed-out buffers are
> + * still effectively owned by the SCMI platform VirtIO device even after timeout
> + * expiration: forcibly freeing and reusing them before they had beeen returned
> + * by the SCMI platform could lead to subtle bugs due to message corruption.
> + * An SCMI platform VirtIO device which never returns message buffers is
> + * anyway broken and it will quickly lead to message exhaustion.
> + *
> + * For this same reason, here, we take care to free only the successfully
> + * completed polled messages, since they won't be freed elsewhere; late replies
> + * to timed-out polled messages would be anyway freed by RX callbacks instead.
> + *
> + * @cinfo: SCMI channel info
> + * @ret: Transmission return code
> + * @xfer: Transfer descriptor
> + */
> +static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
> + struct scmi_xfer *xfer)
> +{
> + unsigned long flags;
> + struct scmi_vio_channel *vioch = cinfo->transport_info;
> + struct scmi_vio_msg *msg = xfer->priv;
> +
> + if (!msg)
> + return;
> +
> + /* Is a successfully completed polled message still to be finalized ? */
> + spin_lock_irqsave(&msg->poll_lock, flags);

.poll_lock is acquired before vioch->lock (in scmi_vio_feed_vq_tx()),
but e. g. virtio_poll_done() uses the reverse locking order.

> + if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
> + scmi_vio_feed_vq_tx(vioch, msg);
> + spin_unlock_irqrestore(&msg->poll_lock, flags);
> +
> + xfer->priv = NULL;
> +}
> +
> +/**
> + * virtio_poll_done - Provide polling support for VirtIO transport
> + *
> + * @cinfo: SCMI channel info
> + * @xfer: Reference to the transfer being poll for.
> + *
> + * VirtIO core provides a polling mechanism based only on last used indexes:
> + * this means that it is possible to poll the virtqueues waiting for something
> + * new to arrive from the host side but the only way to check if the freshly
> + * arrived buffer was what we were waiting for is to compare the newly arrived
> + * message descriptors with the one we are polling on.
> + *
> + * As a consequence it can happen to dequeue something different from the buffer
> + * we were poll-waiting for: if that is the case such early fetched buffers are
> + * then added to a the @pending_cmds_list list for later processing within the
> + * usual VirtIO callbacks; so, basically, once something new is spotted we
> + * proceed to de-queue all the freshly received used buffers until we found the
> + * one we were polling on, or we empty the virtqueue.
> + *
> + * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
> + * when polling since such flag is per-virtqueues and we do not want to
> + * suppress notifications as a whole: so, if the message we are polling for is
> + * delivered via usual IRQs callbacks, it will be handled as such and the
> + * polling loop in the SCMI Core TX path will be transparently terminated
> + * anyway.
> + *
> + * Return: True once polling has successfully completed.
> + */
> +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> + struct scmi_xfer *xfer)
> +{
> + bool ret;
> + unsigned int poll_idx;
> + unsigned long flags;
> + struct scmi_vio_msg *msg = xfer->priv;
> + struct scmi_vio_channel *vioch = cinfo->transport_info;
> +
> + if (!msg)
> + return true;
> +
> + /*
> + * Keep the spinlocked region as small as possible: don't care if
> + * missing something this time it will be polled again next.
> + */
> + spin_lock_irqsave(&msg->poll_lock, flags);
> + poll_idx = msg->poll_idx;
> + spin_unlock_irqrestore(&msg->poll_lock, flags);
> +
> + /* Processed already by other polling loop on another CPU ? */
> + if (poll_idx == VIO_MSG_POLL_DONE)
> + return true;
> +
> + /* Has cmdq index moved at all ? */
> + ret = virtqueue_poll(vioch->vqueue, poll_idx);

In theory, if polling gets delayed, virtqueue_poll() might run into an
ABA problem, if 2**16 descriptors have been used in the meantime (and
activity stops after this).

I think this could be avoided by clearing .poll_idx of each returned
message everywhere (also in scmi_vio_complete_cb()).

> + if (ret) {
> + struct scmi_vio_msg *next_msg;
> +
> + spin_lock_irqsave(&vioch->lock, flags);
> + virtqueue_disable_cb(vioch->vqueue);
> + /*
> + * If something arrived we cannot be sure if it was the reply to
> + * the xfer we are polling for, or some replies to other, even
> + * possibly non-polling, pending xfers: process all new messages
> + * till the polled-for message is found OR the vqueue is empty.
> + */
> + do {
> + unsigned int length;
> +
> + next_msg = virtqueue_get_buf(vioch->vqueue, &length);
> + if (next_msg) {
> + next_msg->rx_len = length;
> + if (next_msg == msg) {
> + ret = true;
> + break;
> + }
> +
> + list_add_tail(&next_msg->list,
> + &vioch->pending_cmds_list);
> + spin_lock(&next_msg->poll_lock);
> + next_msg->poll_idx = VIO_MSG_POLL_DONE;
> + spin_unlock(&next_msg->poll_lock);
> + ret = false;
> + }
> + } while (next_msg);
> +
> + /*
> + * When the polling loop has successfully terminated simply
> + * restart the vqueue, no matter if something else was queued

"restart the vqueue" should better be phrased as "re-enable the vqueue
callbacks".

> + * in the meantime, it will be served by normal IRQ/callback
> + * or by the next poll loop.
> + *
> + * Update the polling index to the current vqueue last used
> + * index, if still looking for a reply.
> + */
> + if (ret) {
> + virtqueue_enable_cb(vioch->vqueue);

If this function returns false, how is the race described in the
function documentation handled? (The same comment might also apply in
other places.)

> + } else {
> + spin_lock(&msg->poll_lock);
> + msg->poll_idx =
> + virtqueue_enable_cb_prepare(vioch->vqueue);
> + spin_unlock(&msg->poll_lock);
> + }
> + spin_unlock_irqrestore(&vioch->lock, flags);
> }
> +
> + return ret;
> }
> +#endif
>
> static const struct scmi_transport_ops scmi_virtio_ops = {
> .link_supplier = virtio_link_supplier,
> @@ -376,6 +583,10 @@ static const struct scmi_transport_ops scmi_virtio_ops = {
> .send_message = virtio_send_message,
> .fetch_response = virtio_fetch_response,
> .fetch_notification = virtio_fetch_notification,
> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> + .mark_txdone = virtio_mark_txdone,
> + .poll_done = virtio_poll_done,
> +#endif
> };
>
> static int scmi_vio_probe(struct virtio_device *vdev)
> @@ -418,6 +629,9 @@ static int scmi_vio_probe(struct virtio_device *vdev)
> spin_lock_init(&channels[i].lock);
> spin_lock_init(&channels[i].ready_lock);
> INIT_LIST_HEAD(&channels[i].free_list);
> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> + INIT_LIST_HEAD(&channels[i].pending_cmds_list);
> +#endif
> channels[i].vqueue = vqs[i];
>
> sz = virtqueue_get_vring_size(channels[i].vqueue);
> @@ -506,4 +720,5 @@ const struct scmi_desc scmi_virtio_desc = {
> .max_rx_timeout_ms = 60000, /* for non-realtime virtio devices */
> .max_msg = 0, /* overridden by virtio_get_max_msg() */
> .max_msg_size = VIRTIO_SCMI_MAX_MSG_SIZE,
> + .atomic_enabled = IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE),
> };
>

2021-12-10 13:30:27

by Cristian Marussi

[permalink] [raw]
Subject: Re: [PATCH v7 16/16] clk: scmi: Support atomic clock enable/disable API

On Fri, Dec 10, 2021 at 11:52:39AM +0100, Vincent Guittot wrote:
> Hi Cristian,
>

Hi Vincent,

thanks for the feedback, my replies below.

> On Mon, 29 Nov 2021 at 20:13, Cristian Marussi <[email protected]> wrote:
> >
> > Support also atomic enable/disable clk_ops beside the bare non-atomic one
> > (prepare/unprepare) when the underlying SCMI transport is configured to
> > support atomic transactions for synchronous commands.
> >
> > Cc: Michael Turquette <[email protected]>
> > Cc: Stephen Boyd <[email protected]>
> > Cc: [email protected]
> > Signed-off-by: Cristian Marussi <[email protected]>
> > ---
> > V5 --> V6
> > - add concurrent availability of atomic and non atomic reqs
> > ---
> > drivers/clk/clk-scmi.c | 56 +++++++++++++++++++++++++++++++++++-------
> > 1 file changed, 47 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/clk/clk-scmi.c b/drivers/clk/clk-scmi.c
> > index 1e357d364ca2..50033d873dde 100644
> > --- a/drivers/clk/clk-scmi.c
> > +++ b/drivers/clk/clk-scmi.c
> > @@ -88,21 +88,53 @@ static void scmi_clk_disable(struct clk_hw *hw)
> > scmi_proto_clk_ops->disable(clk->ph, clk->id);
> > }
> >
> > +static int scmi_clk_atomic_enable(struct clk_hw *hw)
> > +{
> > + struct scmi_clk *clk = to_scmi_clk(hw);
> > +
> > + return scmi_proto_clk_ops->enable_atomic(clk->ph, clk->id);
> > +}
> > +
> > +static void scmi_clk_atomic_disable(struct clk_hw *hw)
> > +{
> > + struct scmi_clk *clk = to_scmi_clk(hw);
> > +
> > + scmi_proto_clk_ops->disable_atomic(clk->ph, clk->id);
> > +}
> > +
> > +/*
> > + * We can provide enable/disable atomic callbacks only if the underlying SCMI
> > + * transport for an SCMI instance is configured to handle SCMI commands in an
> > + * atomic manner.
> > + *
> > + * When no SCMI atomic transport support is available we instead provide only
> > + * the prepare/unprepare API, as allowed by the clock framework when atomic
> > + * calls are not available.
> > + *
> > + * Two distinct sets of clk_ops are provided since we could have multiple SCMI
> > + * instances with different underlying transport quality, so they cannot be
> > + * shared.
> > + */
> > static const struct clk_ops scmi_clk_ops = {
> > .recalc_rate = scmi_clk_recalc_rate,
> > .round_rate = scmi_clk_round_rate,
> > .set_rate = scmi_clk_set_rate,
> > - /*
> > - * We can't provide enable/disable callback as we can't perform the same
> > - * in atomic context. Since the clock framework provides standard API
> > - * clk_prepare_enable that helps cases using clk_enable in non-atomic
> > - * context, it should be fine providing prepare/unprepare.
> > - */
> > .prepare = scmi_clk_enable,
> > .unprepare = scmi_clk_disable,
> > };
> >
> > -static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
> > +static const struct clk_ops scmi_atomic_clk_ops = {
> > + .recalc_rate = scmi_clk_recalc_rate,
> > + .round_rate = scmi_clk_round_rate,
> > + .set_rate = scmi_clk_set_rate,
> > + .prepare = scmi_clk_enable,
> > + .unprepare = scmi_clk_disable,
> > + .enable = scmi_clk_atomic_enable,
>
> For each clock, we have to start with clk_prepare and then clk_enable
> this means that for scmi clk we will do
> scmi_clk_enable
> then
> scmi_clk_atomic_enable
>
> scmi_clk_enable and scmi_clk_atomic_enable ends up doing the same
> thing: scmi_clock_config_set but the atomic version doesn't sleep
>
> So you will set enable twice the clock.
>
> This is confirmed when testing your series with virtio scmi backend as
> I can see to consecutive scmi clk set enable request for the same
> clock
>

Yes, I saw that double enable while testing with the CLK debugfs entry BUT I
thought that was due to the design of the debugfs entries (which call
prepare_enable and so prepare and enable in sequence), also because, a few
versions ago, this series WAS indeed providing (beside bugs :P) the
sleeping prepare XOR the atomic enable depending on the SCMI atomic support
state (as you are suggesting now), BUT, after a few offline chats with you,
my (probably faulty) understanding was that some partners, even on a system
supporting atomic SCMI transfers, would have liked to be able to call the
atomic .enable selectively only on some (typically quick to set up) clocks
while keeping the sleeping .prepare calls on some others (slower ones to
settle), while keeping all the other slower clk_rate setup ops non-atomic.

So in v6/v7 I changed the API to provide both sleepable and atomic clk APIs
when the underlying SCMI stack supports atomic mode: this way, though, it is
clearly up to the caller to decide what to do and if, generally, the
clock framework just calls both every time, it will result in a double
enable.

Was my understanding of the reqs about being able to selectively choose
the sleep_vs_atomic mode in this way wrong?

> In case of atomic mode, the clk_prepare should be a nop
>

I can certainly revert to using the old exclusive approach, not providing
a .prepare when atomic is supported, but then all clock enable ops on any
clock defined on the system will be operated atomically without any choice
when atomic SCMI transactions are available.

Ideally, we would like, on an SCMI system supporting atomic, to be able to
'tag' a specific clock with something like 'prefer-non-atomic' and so
selectively 'noppify' the .prepare or the .enable at the SCMI clk-driver level
depending on such a tag, but I cannot see any way to expose this from the CLK
framework API or DT, nor does it seem suitable for a per-clock DT config
option, AND it would break the current logic of clk_is_enabled_when_prepared().

Indeed, the clk_is_enabled_when_prepared() logic is ALREADY broken by this V7:
when also providing .enable/.disable on atomic transports, the core CLK
framework would return clk_is_enabled_when_prepared() --> False, which does
NOT fit reality, since our SCMI prepare/unprepare DO enable/disable clocks
even if .enable/.disable are provided too.

Probably this last observation on clk_is_enabled_when_prepared() is enough
to revert to the exclusive atomic/non-atomic approach and just ignore the
customers' wish to be able to selectively choose which clocks to operate in
atomic mode.

Any thoughts before a V8?

Thanks,
Cristian


2021-12-10 14:27:18

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v7 16/16] clk: scmi: Support atomic clock enable/disable API

On Fri, 10 Dec 2021 at 14:30, Cristian Marussi <[email protected]> wrote:
>
> On Fri, Dec 10, 2021 at 11:52:39AM +0100, Vincent Guittot wrote:
> > Hi Cristian,
> >
>
> Hi Vincent,
>
> thanks for the feedback, my replies below.
>
> > On Mon, 29 Nov 2021 at 20:13, Cristian Marussi <[email protected]> wrote:
> > >
> > > Support also atomic enable/disable clk_ops beside the bare non-atomic one
> > > (prepare/unprepare) when the underlying SCMI transport is configured to
> > > support atomic transactions for synchronous commands.
> > >
> > > Cc: Michael Turquette <[email protected]>
> > > Cc: Stephen Boyd <[email protected]>
> > > Cc: [email protected]
> > > Signed-off-by: Cristian Marussi <[email protected]>
> > > ---
> > > V5 --> V6
> > > - add concurrent availability of atomic and non atomic reqs
> > > ---
> > > drivers/clk/clk-scmi.c | 56 +++++++++++++++++++++++++++++++++++-------
> > > 1 file changed, 47 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/clk/clk-scmi.c b/drivers/clk/clk-scmi.c
> > > index 1e357d364ca2..50033d873dde 100644
> > > --- a/drivers/clk/clk-scmi.c
> > > +++ b/drivers/clk/clk-scmi.c
> > > @@ -88,21 +88,53 @@ static void scmi_clk_disable(struct clk_hw *hw)
> > > scmi_proto_clk_ops->disable(clk->ph, clk->id);
> > > }
> > >
> > > +static int scmi_clk_atomic_enable(struct clk_hw *hw)
> > > +{
> > > + struct scmi_clk *clk = to_scmi_clk(hw);
> > > +
> > > + return scmi_proto_clk_ops->enable_atomic(clk->ph, clk->id);
> > > +}
> > > +
> > > +static void scmi_clk_atomic_disable(struct clk_hw *hw)
> > > +{
> > > + struct scmi_clk *clk = to_scmi_clk(hw);
> > > +
> > > + scmi_proto_clk_ops->disable_atomic(clk->ph, clk->id);
> > > +}
> > > +
> > > +/*
> > > + * We can provide enable/disable atomic callbacks only if the underlying SCMI
> > > + * transport for an SCMI instance is configured to handle SCMI commands in an
> > > + * atomic manner.
> > > + *
> > > + * When no SCMI atomic transport support is available we instead provide only
> > > + * the prepare/unprepare API, as allowed by the clock framework when atomic
> > > + * calls are not available.
> > > + *
> > > + * Two distinct sets of clk_ops are provided since we could have multiple SCMI
> > > + * instances with different underlying transport quality, so they cannot be
> > > + * shared.
> > > + */
> > > static const struct clk_ops scmi_clk_ops = {
> > > .recalc_rate = scmi_clk_recalc_rate,
> > > .round_rate = scmi_clk_round_rate,
> > > .set_rate = scmi_clk_set_rate,
> > > - /*
> > > - * We can't provide enable/disable callback as we can't perform the same
> > > - * in atomic context. Since the clock framework provides standard API
> > > - * clk_prepare_enable that helps cases using clk_enable in non-atomic
> > > - * context, it should be fine providing prepare/unprepare.
> > > - */
> > > .prepare = scmi_clk_enable,
> > > .unprepare = scmi_clk_disable,
> > > };
> > >
> > > -static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
> > > +static const struct clk_ops scmi_atomic_clk_ops = {
> > > + .recalc_rate = scmi_clk_recalc_rate,
> > > + .round_rate = scmi_clk_round_rate,
> > > + .set_rate = scmi_clk_set_rate,
> > > + .prepare = scmi_clk_enable,
> > > + .unprepare = scmi_clk_disable,
> > > + .enable = scmi_clk_atomic_enable,
> >
> > For each clock, we have to start with clk_prepare and then clk_enable
> > this means that for scmi clk we will do
> > scmi_clk_enable
> > then
> > scmi_clk_atomic_enable
> >
> > scmi_clk_enable and scmi_clk_atomic_enable ends up doing the same
> > thing: scmi_clock_config_set but the atomic version doesn't sleep
> >
> > So you will set enable twice the clock.
> >
> > This is confirmed when testing your series with virtio scmi backend as
> > I can see to consecutive scmi clk set enable request for the same
> > clock
> >
>
> Yes, I saw that double enable while testing with CLK debugfs entry BUT I
> thought that was due to the design of the debugfs entries (that calls
> prepare_enable and so prepare and enable in sequence) also becauase, a few
> versions ago, this series WAS indeed providing (beside bugs :P) the
> sleeping prepare XOR the atomic enable depending on the SCMI atomic support
> state (as you are suggesting now), BUT, after a few offline chats with you,
> my (probably faulty) understanding was that some partners, even on a system
> supporting atomic SCMI transfers, would have liked to be able to call the
> atomic .enable selectively only on some (tipically quickly to setup) clocks
> while keep calling the sleeping .prepare on some other (slower ones to
> settle). (while keeping all the other slower clk_rate setup ops non-atomic)
>
> So in v6/v7 I changed the API to provide both sleepable and atomic clk APIs
> when the underlying SCMI stack support atomic mode: this way, though, it is
> clearly up to the caller to decide what to do and if, generally, the
> clock framework just calls everytime both, it will result in a double
> enable.
>
> Was my understanding of the reqs about being able to selectively choose
> the sleep_vs_atomic mode in this way wrong ?

At the clk framework level, a user must always call clk_prepare then
clk_enable. Usually clk_prepare is used to do actions that can take
time, like locking a PLL, and clk_enable does the short and atomic action,
like gating the clk.

Then, clk_prepare can be a nop if a clk is only doing clk gating, as an
example. The same can happen for clk_enable when a PLL can't be gated,
as an example.

What I wanted to say when we discussed offline is that scmi can
provide some PLLs and some simple IP clock gating. The PLL would want
to be enabled during the clk_prepare because it takes time and you
can even sleep, whereas IP clock gating wants to be atomic.
So scmi clock 0 might be a PLL with clk_prepare == scmi_clock_enable
and clk_enable == nop,
but scmi clock 1 could be the clock gating of a HW IP and want
clk_prepare == nop and clk_enable == scmi_clock_atomic_enable
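
A rough sketch of such a per-clock split (hypothetical, since nothing in the
series can yet tell PLL-like and gate-like SCMI clocks apart) could be:

static const struct clk_ops scmi_pll_like_ops = {
	.recalc_rate = scmi_clk_recalc_rate,
	.round_rate  = scmi_clk_round_rate,
	.set_rate    = scmi_clk_set_rate,
	.prepare     = scmi_clk_enable,		/* slow, may sleep */
	.unprepare   = scmi_clk_disable,
	/* no .enable/.disable: nothing left to do atomically */
};

static const struct clk_ops scmi_gate_like_ops = {
	.recalc_rate = scmi_clk_recalc_rate,
	.round_rate  = scmi_clk_round_rate,
	.set_rate    = scmi_clk_set_rate,
	.enable      = scmi_clk_atomic_enable,	/* fast, atomic gating */
	.disable     = scmi_clk_atomic_disable,
	/* no .prepare/.unprepare needed */
};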

>
> > In case of atomic mode, the clk_prepare should be a nop
> >
>
> I can certainly revert to use the old exclusive approach, not providing
> a .prepare when atomic is supported, but then all clock enable ops on any
> clock defined on the system will be operated atomically withot any choice
> when atomic SCMI transactions are available.
>
> Ideally, we could like, on a SCMI system supporting atomic, to be able to
> 'tag' a specific clock something like 'prefer-non-atomic' and so selectively
> 'noppify' the .prepare or the .enable at the SCMI clk-driver level depending
> on such tag, but I cannot see any way to expose this from the CLK framework

Yes that would be the right behaviour IMHO

> API or DT, nor it seems suitable for a per-clock DT config option AND it
> would break the current logic of clk_is_enabled_when_prepared().
>
> Indeed clk_is_enabled_when_prepared() logic is ALREADY broken also by this V7
> since when providing also .enable/.disable on atomic transports, the core CLK
> framework would return clk_is_enabled_when_prepared() --> False which does
> NOT fit reality since our SCMI prepare/unprepare DO enable/disable clks even
> if .disable/.enable are provided too.
>
> Probably this last observation on clk_is_enabled_when_prepared() is enough
> to revert to the exclusive atomic/non-atomic approach and just ignore the
> customer wish to be able to selectively choose which clock to operate in
> atomic mode.

Another thing to keep in mind is the use of clk_prepare and clk_enable.

clk_prepare/unprepare is usually called when you open/close a driver,
whereas clk_enable/disable is called close to the use of the HW IP for
power consumption considerations. If you only use clk_prepare, the IP
will be clocked much more than needed. But if you always use
clk_enable with atomic transfers, you end up taking a lot of time to
enable your IP because the parent PLL will be locked at that time, in
addition to missing some timing requirements.

>
> Any thoughs before a V8 ?
>
> Thanks,
> Cristian
>

2021-12-10 19:36:40

by Cristian Marussi

[permalink] [raw]
Subject: Re: [PATCH v7 16/16] clk: scmi: Support atomic clock enable/disable API

On Fri, Dec 10, 2021 at 03:27:02PM +0100, Vincent Guittot wrote:
> On Fri, 10 Dec 2021 at 14:30, Cristian Marussi <[email protected]> wrote:
> >
> > On Fri, Dec 10, 2021 at 11:52:39AM +0100, Vincent Guittot wrote:
> > > Hi Cristian,
> > >

Hi,

> >
> > Hi Vincent,
> >
> > thanks for the feedback, my replies below.
> >
> > > On Mon, 29 Nov 2021 at 20:13, Cristian Marussi <[email protected]> wrote:
> > > >
> > > > Support also atomic enable/disable clk_ops beside the bare non-atomic one
> > > > (prepare/unprepare) when the underlying SCMI transport is configured to
> > > > support atomic transactions for synchronous commands.
> > > >
> > > > Cc: Michael Turquette <[email protected]>
> > > > Cc: Stephen Boyd <[email protected]>
> > > > Cc: [email protected]
> > > > Signed-off-by: Cristian Marussi <[email protected]>
> > > > ---
> > > > V5 --> V6
> > > > - add concurrent availability of atomic and non atomic reqs
> > > > ---
> > > > drivers/clk/clk-scmi.c | 56 +++++++++++++++++++++++++++++++++++-------
> > > > 1 file changed, 47 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/drivers/clk/clk-scmi.c b/drivers/clk/clk-scmi.c
> > > > index 1e357d364ca2..50033d873dde 100644
> > > > --- a/drivers/clk/clk-scmi.c
> > > > +++ b/drivers/clk/clk-scmi.c
> > > > @@ -88,21 +88,53 @@ static void scmi_clk_disable(struct clk_hw *hw)
> > > > scmi_proto_clk_ops->disable(clk->ph, clk->id);
> > > > }
> > > >
> > > > +static int scmi_clk_atomic_enable(struct clk_hw *hw)
> > > > +{
> > > > + struct scmi_clk *clk = to_scmi_clk(hw);
> > > > +
> > > > + return scmi_proto_clk_ops->enable_atomic(clk->ph, clk->id);
> > > > +}
> > > > +
> > > > +static void scmi_clk_atomic_disable(struct clk_hw *hw)
> > > > +{
> > > > + struct scmi_clk *clk = to_scmi_clk(hw);
> > > > +
> > > > + scmi_proto_clk_ops->disable_atomic(clk->ph, clk->id);
> > > > +}
> > > > +
> > > > +/*
> > > > + * We can provide enable/disable atomic callbacks only if the underlying SCMI
> > > > + * transport for an SCMI instance is configured to handle SCMI commands in an
> > > > + * atomic manner.
> > > > + *
> > > > + * When no SCMI atomic transport support is available we instead provide only
> > > > + * the prepare/unprepare API, as allowed by the clock framework when atomic
> > > > + * calls are not available.
> > > > + *
> > > > + * Two distinct sets of clk_ops are provided since we could have multiple SCMI
> > > > + * instances with different underlying transport quality, so they cannot be
> > > > + * shared.
> > > > + */
> > > > static const struct clk_ops scmi_clk_ops = {
> > > > .recalc_rate = scmi_clk_recalc_rate,
> > > > .round_rate = scmi_clk_round_rate,
> > > > .set_rate = scmi_clk_set_rate,
> > > > - /*
> > > > - * We can't provide enable/disable callback as we can't perform the same
> > > > - * in atomic context. Since the clock framework provides standard API
> > > > - * clk_prepare_enable that helps cases using clk_enable in non-atomic
> > > > - * context, it should be fine providing prepare/unprepare.
> > > > - */
> > > > .prepare = scmi_clk_enable,
> > > > .unprepare = scmi_clk_disable,
> > > > };
> > > >
> > > > -static int scmi_clk_ops_init(struct device *dev, struct scmi_clk *sclk)
> > > > +static const struct clk_ops scmi_atomic_clk_ops = {
> > > > + .recalc_rate = scmi_clk_recalc_rate,
> > > > + .round_rate = scmi_clk_round_rate,
> > > > + .set_rate = scmi_clk_set_rate,
> > > > + .prepare = scmi_clk_enable,
> > > > + .unprepare = scmi_clk_disable,
> > > > + .enable = scmi_clk_atomic_enable,
> > >
> > > For each clock, we have to start with clk_prepare and then clk_enable
> > > this means that for scmi clk we will do
> > > scmi_clk_enable
> > > then
> > > scmi_clk_atomic_enable
> > >
> > > scmi_clk_enable and scmi_clk_atomic_enable end up doing the same
> > > thing: scmi_clock_config_set, but the atomic version doesn't sleep.
> > >
> > > So you will end up enabling the clock twice.
> > >
> > > This is confirmed when testing your series with the virtio scmi backend,
> > > as I can see two consecutive scmi clk set enable requests for the same
> > > clock.
> > >
> >
> > Yes, I saw that double enable while testing with the CLK debugfs entry, BUT I
> > thought that was due to the design of the debugfs entries (which call
> > prepare_enable and so prepare and enable in sequence), also because, a few
> > versions ago, this series WAS indeed providing (besides bugs :P) the
> > sleeping prepare XOR the atomic enable depending on the SCMI atomic support
> > state (as you are suggesting now), BUT, after a few offline chats with you,
> > my (probably faulty) understanding was that some partners, even on a system
> > supporting atomic SCMI transfers, would have liked to be able to call the
> > atomic .enable selectively only on some (typically quick to set up) clocks
> > while keeping the sleeping .prepare on some others (slower to settle),
> > while keeping all the other slower clk_rate setup ops non-atomic.
> >
> > So in v6/v7 I changed the API to provide both sleepable and atomic clk APIs
> > when the underlying SCMI stack supports atomic mode: this way, though, it is
> > clearly up to the caller to decide what to do and if, generally, the
> > clock framework just calls both every time, it will result in a double
> > enable.
> >
> > Was my understanding of the requirement about being able to selectively
> > choose the sleep_vs_atomic mode in this way wrong?
>
> At the clk framework level, a user must always call clk_prepare then
> clk_enable. Usually clk_prepare is used to do actions that can take
> time, like locking a PLL, and clk_enable does the short and atomic
> action, like gating the clk.
>

Ok, I was missing this; I supposed it was up to the caller to decide when to
use which, and whether to use both or not.

> Then, clk_prepare can be a nop if a clk is only doing clk gating, as an
> example. The same can happen for clk_enable when a PLL can't be gated,
> as an example.
>
> What I wanted to say when we discussed offline is that scmi can
> provide some PLLs and some simple IP clock gating. The PLL would want
> to be enabled during the clk_prepare because it takes time and you
> can even sleep, whereas IP clock gating wants to be atomic.
> So scmi clock 0 might be a PLL with clk_prepare == scmi_clock_enable
> and clk_enable == nop,
> but scmi clock 1 could be the clock gating of a HW IP and want
> clk_prepare == nop and clk_enable == scmi_clock_atomic_enable.
>

Yes, that difference in 'enabling time' was clear, but I was assuming that
the caller was in charge of deciding which primitive to use based on the
specific clock characteristics, so that per-clock custom ops would not have
been needed.

In this scenario instead I will certainly need some descriptor somewhere
to discriminate a 'slow' clock, having only the sleeping .prepare method,
from a 'fast' one that will be served by the atomic .enable instead...
...and I can already sense the next diatribe about matching the clock
by-clock-id vs by-clock-name ... :P
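
For the sake of discussion, a rough sketch of what I mean by such a
descriptor-driven selection (everything here is hypothetical: the
'enable_latency_us' field and the threshold are made up, and where that
information would come from is exactly the open point):

static const struct clk_ops *
scmi_clk_select_ops(const struct scmi_clock_info *ci)
{
	/* 'Slow' clocks (e.g. PLLs): sleepable .prepare/.unprepare only */
	if (ci->enable_latency_us > FOO_ATOMIC_THRESHOLD_US)
		return &scmi_clk_ops;

	/* 'Fast' gate-like clocks: also provide atomic .enable/.disable */
	return &scmi_atomic_clk_ops;
}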

I'll take this offline with Sudeep, to also explore the possibility of
instead delegating such a descriptor to the SCMI protocol itself.

Thanks for your explanation.

> >
> > > In case of atomic mode, the clk_prepare should be a nop
> > >
> >
> > I can certainly revert to using the old exclusive approach, not providing
> > a .prepare when atomic is supported, but then all clock enable ops on any
> > clock defined on the system will be operated atomically without any choice
> > when atomic SCMI transactions are available.
> >
> > Ideally, we would like, on an SCMI system supporting atomic, to be able to
> > 'tag' a specific clock with something like 'prefer-non-atomic' and so selectively
> > 'noppify' the .prepare or the .enable at the SCMI clk-driver level depending
> > on such a tag, but I cannot see any way to expose this from the CLK framework
>
> Yes that would be the right behaviour IMHO
>

As said above, the remaining thing to determine is where to get this
information about the nature of the clock at hand from...

> > API or DT, nor does it seem suitable for a per-clock DT config option, AND it
> > would break the current logic of clk_is_enabled_when_prepared().
> >
> > Indeed clk_is_enabled_when_prepared() logic is ALREADY broken also by this V7
> > since when providing also .enable/.disable on atomic transports, the core CLK
> > framework would return clk_is_enabled_when_prepared() --> False which does
> > NOT fit reality since our SCMI prepare/unprepare DO enable/disable clks even
> > if .disable/.enable are provided too.
> >
> > Probably this last observation on clk_is_enabled_when_prepared() is enough
> > to revert to the exclusive atomic/non-atomic approach and just ignore the
> > customer wish to be able to selectively choose which clock to operate in
> > atomic mode.
>
> Another thing to keep in mind is the use of clk_prepare and clk_enable
>
> clk_prepare/unprepare is usually called when you open/close a driver,
> whereas clk_enable/disable is called close to the use of the HW IP for
> power consumption considerations. If you only use clk_prepare, the IP
> will be clocked much more than needed. But if you always use
> clk_enable with atomic transfers, you end up taking a lot of time to
> enable your IP, because the parent PLL will be locked at that time, in
> addition to possibly missing some timing requirements.
>

So this is another reason to use the above scheme of a per-clock custom
enable operation based on the nature of the clock, so that at least the
polling behaviour is not used for slow PLL-like clocks that take long
to lock.

Thanks,
Cristian

2021-12-13 11:06:29

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH v7 02/16] firmware: arm_scmi: Set polling timeout to max_rx_timeout_ms

On Mon, Nov 29, 2021 at 07:11:42PM +0000, Cristian Marussi wrote:
> Use transport specific transmission timeout (max_rx_timeout_ms) also for
> polling transactions.
>
> Signed-off-by: Cristian Marussi <[email protected]>
> ---
> drivers/firmware/arm_scmi/driver.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> index 3cf161f3bcc7..568562121f64 100644
> --- a/drivers/firmware/arm_scmi/driver.c
> +++ b/drivers/firmware/arm_scmi/driver.c
> @@ -724,8 +724,6 @@ static void xfer_put(const struct scmi_protocol_handle *ph,
> __scmi_xfer_put(&info->tx_minfo, xfer);
> }
>
> -#define SCMI_MAX_POLL_TO_NS (100 * NSEC_PER_USEC)
> -

While there are no users of SCMI polling in sched context right now, it was
added for that, and the choice of 100uS was made because it was meant to be
called from sched context. I will update this and merge patches 1-5 as they
are ready to go.

--
Regards,
Sudeep

2021-12-13 11:26:03

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH v7 06/16] firmware: arm_scmi: Add configurable polling mode for transports

On Mon, Nov 29, 2021 at 07:11:46PM +0000, Cristian Marussi wrote:
> SCMI communications along TX channels can optionally be provided of a
> completion interrupt; when such interrupt is not available, command
> transactions should rely on polling, where the SCMI core takes care to
> repeatedly evaluate the transport-specific .poll_done() function, if
> available, to determine if and when a request was fully completed or
> timed out.
>
> Such mechanism is already present and working on a single transfer base:
> SCMI protocols can indeed enable hdr.poll_completion on specific commands
> ahead of each transfer and cause that transaction to be handled with
> polling.
>
> Introduce a couple of flags to be able to enforce such polling behaviour
> globally at will:
>
> - scmi_desc.force_polling: to statically switch the whole transport to
> polling mode.
>
> - scmi_chan_info.no_completion_irq: to switch a single channel dynamically
> to polling mode if, at runtime, is determined that no completion
> interrupt was available for such channel.
>
> Signed-off-by: Cristian Marussi <[email protected]>
> ---
> v5 --> v6
> - removed check on replies received by IRQs when xfer was requested
> as poll_completion (not all transport can suppress IRQs on an xfer basis)
> v4 --> v5
> - make force_polling const
> - introduce polling_enabled flag to simplify checks on do_xfer
> v3 --> v4:
> - renamed .needs_polling flag to .no_completion_irq
> - refactored error path when polling needed but not supported
> ---
> drivers/firmware/arm_scmi/common.h | 11 +++++++++++
> drivers/firmware/arm_scmi/driver.c | 17 +++++++++++++++++
> 2 files changed, 28 insertions(+)
>
> diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> index 6438b5248c24..99b74f4d39b6 100644
> --- a/drivers/firmware/arm_scmi/common.h
> +++ b/drivers/firmware/arm_scmi/common.h
> @@ -339,11 +339,19 @@ void scmi_protocol_release(const struct scmi_handle *handle, u8 protocol_id);
> * @dev: Reference to device in the SCMI hierarchy corresponding to this
> * channel
> * @handle: Pointer to SCMI entity handle
> + * @no_completion_irq: Flag to indicate that this channel has no completion
> + * interrupt mechanism for synchronous commands.
> + * This can be dynamically set by transports at run-time
> + * inside their provided .chan_setup().
> + * @polling_enabled: Flag used to annotate if polling mode is currently enabled
> + * on this channel.
> * @transport_info: Transport layer related information
> */
> struct scmi_chan_info {
> struct device *dev;
> struct scmi_handle *handle;
> + bool no_completion_irq;
> + bool polling_enabled;

Do we really need a separate flag for polling_enabled?
no_completion_irq means you need to enable polling, and so does force_polling.
Just trying to see if we can get rid of unnecessary flags.

--
Regards,
Sudeep

2021-12-13 11:35:06

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH v7 14/16] firmware: arm_scmi: Add atomic mode support to virtio transport

On Mon, Nov 29, 2021 at 07:11:54PM +0000, Cristian Marussi wrote:
> Add support for .mark_txdone and .poll_done transport operations to SCMI
> VirtIO transport as pre-requisites to enable atomic operations.
>
> Add a Kernel configuration option to enable SCMI VirtIO transport polling
> and atomic mode for selected SCMI transactions while leaving it default
> disabled.
>
> Cc: "Michael S. Tsirkin" <[email protected]>
> Cc: Igor Skalkin <[email protected]>
> Cc: Peter Hilber <[email protected]>
> Cc: [email protected]
> Signed-off-by: Cristian Marussi <[email protected]>
> ---
> V6 --> V7
> - added a few comments about virtio polling internals
> - fixed missing list_del on pending_cmds_list processing
> - shrinked spinlocked areas in virtio_poll_done
> - added proper spinlocking to scmi_vio_complete_cb while scanning list
> of pending cmds
> ---
> drivers/firmware/arm_scmi/Kconfig | 15 ++
> drivers/firmware/arm_scmi/virtio.c | 241 +++++++++++++++++++++++++++--
> 2 files changed, 243 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> index d429326433d1..7794bd41eaa0 100644
> --- a/drivers/firmware/arm_scmi/Kconfig
> +++ b/drivers/firmware/arm_scmi/Kconfig
> @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
> the ones implemented by kvmtool) and let the core Kernel VirtIO layer
> take care of the needed conversions, say N.
>
> +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> + bool "Enable atomic mode for SCMI VirtIO transport"
> + depends on ARM_SCMI_TRANSPORT_VIRTIO
> + help
> + Enable support of atomic operation for SCMI VirtIO based transport.
> +
> + If you want the SCMI VirtIO based transport to operate in atomic
> + mode, avoiding any kind of sleeping behaviour for selected
> + transactions on the TX path, answer Y.
> +
> + Enabling atomic mode operations allows any SCMI driver using this
> + transport to optionally ask for atomic SCMI transactions and operate
> + in atomic context too, at the price of using a number of busy-waiting
> + primitives all over instead. If unsure say N.
> +
> endif #ARM_SCMI_PROTOCOL
>
> config ARM_SCMI_POWER_DOMAIN
> diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> index fd0f6f91fc0b..0598e185a786 100644
> --- a/drivers/firmware/arm_scmi/virtio.c
> +++ b/drivers/firmware/arm_scmi/virtio.c
> @@ -38,6 +38,7 @@
> * @vqueue: Associated virtqueue
> * @cinfo: SCMI Tx or Rx channel
> * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
> * @is_rx: Whether channel is an Rx channel
> * @ready: Whether transport user is ready to hear about channel
> * @max_msg: Maximum number of pending messages for this channel.
> @@ -49,6 +50,9 @@ struct scmi_vio_channel {
> struct virtqueue *vqueue;
> struct scmi_chan_info *cinfo;
> struct list_head free_list;
> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> + struct list_head pending_cmds_list;
> +#endif
> bool is_rx;
> bool ready;
> unsigned int max_msg;
> @@ -65,12 +69,22 @@ struct scmi_vio_channel {
> * @input: SDU used for (delayed) responses and notifications
> * @list: List which scmi_vio_msg may be part of
> * @rx_len: Input SDU size in bytes, once input has been received
> + * @poll_idx: Last used index registered for polling purposes if this message
> + * transaction reply was configured for polling.
> + * Note that virtqueue used index is an unsigned 16-bit.
> + * @poll_lock: Protect access to @poll_idx.
> */
> struct scmi_vio_msg {
> struct scmi_msg_payld *request;
> struct scmi_msg_payld *input;
> struct list_head list;
> unsigned int rx_len;
> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE

Do we really need the #ifdefery for the struct definition? TBH I don't like
the way it is. I would avoid it as much as possible. I assume some of these
are added to avoid build warnings?

Doesn't __maybe_unused help to remove some of them, like for the functions
mark_txdone and poll_done? I haven't tried, but thought of checking.
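
Something along these lines maybe (just a sketch of the idea, with the
bodies elided; the types come from the usual SCMI common.h):

static void __maybe_unused
virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
		   struct scmi_xfer *xfer)
{
	/* ... polling-mode cleanup ... */
}

static bool __maybe_unused
virtio_poll_done(struct scmi_chan_info *cinfo, struct scmi_xfer *xfer)
{
	/* ... check the virtqueue for our reply ... */
	return true;
}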

--
Regards,
Sudeep

2021-12-13 11:42:41

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH v7 12/16] firmware: arm_scmi: Add atomic mode support to smc transport

On Mon, Nov 29, 2021 at 07:11:52PM +0000, Cristian Marussi wrote:
> Add a Kernel configuration option to enable SCMI SMC transport atomic
> mode operation for selected SCMI transactions and leave it as default
> disabled.
>
> Substitute mutex usages with busy-waiting and declare smc transport as
> .atomic_enabled if such Kernel configuration option is enabled.
>
> Signed-off-by: Cristian Marussi <[email protected]>
> ---
> v5 --> v6
> - remove usage of atomic_capable
> - removed needless union
> - reviewed Kconfig help
> v4 --> v5
> - removed RFC tag
> - add CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE option
> - add .atomic_enable support
> - make atomic_capable dependent on
> CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> - make also usage of mutexes vs busy-waiting dependent on
> CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> ---
> drivers/firmware/arm_scmi/Kconfig | 14 +++++++
> drivers/firmware/arm_scmi/smc.c | 66 ++++++++++++++++++++++++++++---
> 2 files changed, 74 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> index 638ecec89ff1..d429326433d1 100644
> --- a/drivers/firmware/arm_scmi/Kconfig
> +++ b/drivers/firmware/arm_scmi/Kconfig
> @@ -78,6 +78,20 @@ config ARM_SCMI_TRANSPORT_SMC
> If you want the ARM SCMI PROTOCOL stack to include support for a
> transport based on SMC, answer Y.
>
> +config ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> + bool "Enable atomic mode support for SCMI SMC transport"
> + depends on ARM_SCMI_TRANSPORT_SMC
> + help
> + Enable support of atomic operation for SCMI SMC based transport.
> +
> + If you want the SCMI SMC based transport to operate in atomic
> + mode, avoiding any kind of sleeping behaviour for selected
> + transactions on the TX path, answer Y.
> + Enabling atomic mode operations allows any SCMI driver using this
> + transport to optionally ask for atomic SCMI transactions and operate
> + in atomic context too, at the price of using a number of busy-waiting
> + primitives all over instead. If unsure say N.
> +
> config ARM_SCMI_TRANSPORT_VIRTIO
> bool "SCMI transport based on VirtIO"
> depends on VIRTIO=y || VIRTIO=ARM_SCMI_PROTOCOL
> diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
> index b2f31d3feb10..0fc49cb49185 100644
> --- a/drivers/firmware/arm_scmi/smc.c
> +++ b/drivers/firmware/arm_scmi/smc.c
> @@ -7,6 +7,7 @@
> */
>
> #include <linux/arm-smccc.h>
> +#include <linux/atomic.h>
> #include <linux/device.h>
> #include <linux/err.h>
> #include <linux/interrupt.h>
> @@ -14,6 +15,9 @@
> #include <linux/of.h>
> #include <linux/of_address.h>
> #include <linux/of_irq.h>
> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> +#include <linux/processor.h>
> +#endif
> #include <linux/slab.h>
>
> #include "common.h"
> @@ -23,14 +27,23 @@
> *
> * @cinfo: SCMI channel info
> * @shmem: Transmit/Receive shared memory area
> - * @shmem_lock: Lock to protect access to Tx/Rx shared memory area
> + * @shmem_lock: Lock to protect access to Tx/Rx shared memory area.
> + * Used when NOT operating in atomic mode.
> + * @inflight: Atomic flag to protect access to Tx/Rx shared memory area.
> + * Used when operating in atomic mode.
> * @func_id: smc/hvc call function id
> */
>
> struct scmi_smc {
> struct scmi_chan_info *cinfo;
> struct scmi_shared_mem __iomem *shmem;
> +#ifndef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> + /* Protect access to shmem area */
> struct mutex shmem_lock;

Ditto here, do we really need this saving? I would wait until someone
really complains about space. It unnecessarily makes the code hard to read.

> +#else
> +#define INFLIGHT_NONE MSG_TOKEN_MAX
> + atomic_t inflight;
> +#endif
> u32 func_id;
> };
>
> @@ -54,6 +67,46 @@ static bool smc_chan_available(struct device *dev, int idx)
> return true;
> }
>
> +static inline void smc_channel_lock_init(struct scmi_smc *scmi_info)
> +{
> +#ifndef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> + mutex_init(&scmi_info->shmem_lock);
> +#else
> + atomic_set(&scmi_info->inflight, INFLIGHT_NONE);

You can do both if you remove the conditional definitions in the struct.

> +#endif
> +}
> +
> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> +static bool smc_xfer_inflight(struct scmi_xfer *xfer, atomic_t *inflight)
> +{
> + int ret;
> +
> + ret = atomic_cmpxchg(inflight, INFLIGHT_NONE, xfer->hdr.seq);
> +
> + return ret == INFLIGHT_NONE;
> +}
> +#endif
> +
> +static inline void
> +smc_channel_lock_acquire(struct scmi_smc *scmi_info,
> + struct scmi_xfer *xfer __maybe_unused)
> +{
> +#ifndef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE

If possible make it based on some local variable, or you can always do:

if (IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE))
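
i.e. something like the following sketch (assuming both the mutex and the
atomic_t members become unconditional; the busy-wait on smc_xfer_inflight()
is just a guess at how the atomic path would look):

static inline void
smc_channel_lock_acquire(struct scmi_smc *scmi_info, struct scmi_xfer *xfer)
{
	if (IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE))
		spin_until_cond(smc_xfer_inflight(xfer, &scmi_info->inflight));
	else
		mutex_lock(&scmi_info->shmem_lock);
}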

--
Regards,
Sudeep

2021-12-13 11:54:45

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH v7 08/16] firmware: arm_scmi: Add sync_cmds_atomic_replies transport flag

On Mon, Nov 29, 2021 at 07:11:48PM +0000, Cristian Marussi wrote:
> Add a flag to let the transport signal to the core if its handling of sync
> command implies that, after .send_message has returned successfully, the
> requested command can be assumed to be fully and completely executed on
> SCMI platform side so that any possible response value is already
> immediately available to be retrieved by a .fetch_response: in other words
> the polling phase can be skipped in such a case and the response values
> accessed straight away.
>
> Note that all of the above applies only when polling mode of operation was
> selected by the core: if instead a completion IRQ was found to be available
> the normal response processing path based on completions will still be
> followed.
>
> Signed-off-by: Cristian Marussi <[email protected]>
> ---
> v5 --> v6
> - added polling_capable helper flag
> v4 --> v5
> - removed RFC tag
> - consider sync_cmds_atomic_replies flag when deciding if polling is to be
> supported and .poll_done() is not provided.
> - reviewed commit message
> ---
> drivers/firmware/arm_scmi/common.h | 8 ++++++
> drivers/firmware/arm_scmi/driver.c | 43 +++++++++++++++++++++++-------
> 2 files changed, 41 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> index 99b74f4d39b6..bf25f0e89c78 100644
> --- a/drivers/firmware/arm_scmi/common.h
> +++ b/drivers/firmware/arm_scmi/common.h
> @@ -412,6 +412,13 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
> * @max_msg_size: Maximum size of data per message that can be handled.
> * @force_polling: Flag to force this whole transport to use SCMI core polling
> * mechanism instead of completion interrupts even if available.
> + * @sync_cmds_atomic_replies: Flag to indicate that the transport assures
> + * synchronous-command messages are atomically
> + * completed on .send_message: no need to poll
> + * actively waiting for a response.

Not sure if atomic is the right term to use. It is atomic w.r.t. the OSPM though.
Can we just say sync_cmd_complete_on_ret or something similar?

--
Regards,
Sudeep

2021-12-13 17:52:23

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH v7 00/16] (subset) Introduce atomic support for SCMI transports

On Mon, 29 Nov 2021 19:11:40 +0000, Cristian Marussi wrote:
> This series mainly aims to introduce atomic support for SCMI transports
> that can support it.
>
> After a bit of refactoring in the first 5 patches of the series, in
> [06/16], as a closely related addition, it is introduced a common way for a
> transport to signal to the SCMI core that it does not offer completion
> interrupts, so that the usual polling behaviour will be required: this can
> be done enabling statically a global polling behaviour for the whole
> transport with flag scmi_desc.force_polling OR dynamically enabling at
> runtime such polling behaviour on a per-channel basis using the flag
> scmi_chan_info.no_completion_irq, typically during .chan_setup().
> The usual per-command polling selection behaviour based on
> hdr.poll_completion is preserved as before.
>
> [...]


Applied to sudeep.holla/linux (for-next/scmi), thanks!

[01/16] firmware: arm_scmi: Perform earlier cinfo lookup call in do_xfer
https://git.kernel.org/sudeep.holla/c/d211ddeb51
[02/16] firmware: arm_scmi: Set polling timeout to max_rx_timeout_ms
https://git.kernel.org/sudeep.holla/c/582730b9cb
[03/16] firmware: arm_scmi: Refactor message response path
https://git.kernel.org/sudeep.holla/c/5a731aebd3
[04/16] include: trace: Add new scmi_xfer_response_wait event
https://git.kernel.org/sudeep.holla/c/8b276b59cc
[05/16] firmware: arm_scmi: Use new trace event scmi_xfer_response_wait
https://git.kernel.org/sudeep.holla/c/f872af0909

--
Regards,
Sudeep


2021-12-14 10:50:00

by Cristian Marussi

[permalink] [raw]
Subject: Re: [PATCH v7 06/16] firmware: arm_scmi: Add configurable polling mode for transports

On Mon, Dec 13, 2021 at 11:25:55AM +0000, Sudeep Holla wrote:
> On Mon, Nov 29, 2021 at 07:11:46PM +0000, Cristian Marussi wrote:
> > SCMI communications along TX channels can optionally be provided of a
> > completion interrupt; when such interrupt is not available, command
> > transactions should rely on polling, where the SCMI core takes care to
> > repeatedly evaluate the transport-specific .poll_done() function, if
> > available, to determine if and when a request was fully completed or
> > timed out.
> >

Hi Sudeep,

thanks for the review.

> > Such mechanism is already present and working on a single transfer base:
> > SCMI protocols can indeed enable hdr.poll_completion on specific commands
> > ahead of each transfer and cause that transaction to be handled with
> > polling.
> >
> > Introduce a couple of flags to be able to enforce such polling behaviour
> > globally at will:
> >
> > - scmi_desc.force_polling: to statically switch the whole transport to
> > polling mode.
> >
> > - scmi_chan_info.no_completion_irq: to switch a single channel dynamically
> > to polling mode if, at runtime, is determined that no completion
> > interrupt was available for such channel.
> >
> > Signed-off-by: Cristian Marussi <[email protected]>
> > ---
> > v5 --> v6
> > - removed check on replies received by IRQs when xfer was requested
> > as poll_completion (not all transport can suppress IRQs on an xfer basis)
> > v4 --> v5
> > - make force_polling const
> > - introduce polling_enabled flag to simplify checks on do_xfer
> > v3 --> v4:
> > - renamed .needs_polling flag to .no_completion_irq
> > - refactored error path when polling needed but not supported
> > ---
> > drivers/firmware/arm_scmi/common.h | 11 +++++++++++
> > drivers/firmware/arm_scmi/driver.c | 17 +++++++++++++++++
> > 2 files changed, 28 insertions(+)
> >
> > diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> > index 6438b5248c24..99b74f4d39b6 100644
> > --- a/drivers/firmware/arm_scmi/common.h
> > +++ b/drivers/firmware/arm_scmi/common.h
> > @@ -339,11 +339,19 @@ void scmi_protocol_release(const struct scmi_handle *handle, u8 protocol_id);
> > * @dev: Reference to device in the SCMI hierarchy corresponding to this
> > * channel
> > * @handle: Pointer to SCMI entity handle
> > + * @no_completion_irq: Flag to indicate that this channel has no completion
> > + * interrupt mechanism for synchronous commands.
> > + * This can be dynamically set by transports at run-time
> > + * inside their provided .chan_setup().
> > + * @polling_enabled: Flag used to annotate if polling mode is currently enabled
> > + * on this channel.
> > * @transport_info: Transport layer related information
> > */
> > struct scmi_chan_info {
> > struct device *dev;
> > struct scmi_handle *handle;
> > + bool no_completion_irq;
> > + bool polling_enabled;
>
> Do we really need a separate flag for polling_enabled?
> no_completion_irq means you need to enable polling, and so does force_polling.
> Just trying to see if we can get rid of unnecessary flags.
>

Not really needed indeed; I was just trying to avoid multiple condition
checks later on the TX path (given that no_completion_irq and force_polling
never change after channel setup, I can simply note at setup time that
polling is enabled and then just check that later) and to improve readability,
but I can just use macros for readability and ignore the multiple
unneeded checks.

The same holds later in the series for the polling_capable flag, which I
similarly set up early during probe to avoid unneeded multiple condition
checks.

I'll remove both these internal flags and resort to macros, if that's fine
for you.
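
For instance, something along these lines (purely illustrative, final
names and placement still to be decided):

#define is_polling_required(cinfo, info)				\
	((cinfo)->no_completion_irq || (info)->desc->force_polling)

#define is_transport_polling_capable(info)				\
	((info)->desc->ops->poll_done ||				\
	 (info)->desc->sync_cmds_atomic_replies)

#define is_polling_enabled(cinfo, info)					\
	(is_polling_required(cinfo, info) &&				\
	 is_transport_polling_capable(info))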

Thanks,
Cristian

2021-12-14 10:52:47

by Cristian Marussi

[permalink] [raw]
Subject: Re: [PATCH v7 08/16] firmware: arm_scmi: Add sync_cmds_atomic_replies transport flag

On Mon, Dec 13, 2021 at 11:54:37AM +0000, Sudeep Holla wrote:
> On Mon, Nov 29, 2021 at 07:11:48PM +0000, Cristian Marussi wrote:
> > Add a flag to let the transport signal to the core if its handling of sync
> > command implies that, after .send_message has returned successfully, the
> > requested command can be assumed to be fully and completely executed on
> > SCMI platform side so that any possible response value is already
> > immediately available to be retrieved by a .fetch_response: in other words
> > the polling phase can be skipped in such a case and the response values
> > accessed straight away.
> >
> > Note that all of the above applies only when polling mode of operation was
> > selected by the core: if instead a completion IRQ was found to be available
> > the normal response processing path based on completions will still be
> > followed.
> >
> > Signed-off-by: Cristian Marussi <[email protected]>
> > ---
> > v5 --> v6
> > - added polling_capable helper flag
> > v4 --> v5
> > - removed RFC tag
> > - consider sync_cmds_atomic_replies flag when deciding if polling is to be
> > supported and .poll_done() is not provided.
> > - reviewed commit message
> > ---
> > drivers/firmware/arm_scmi/common.h | 8 ++++++
> > drivers/firmware/arm_scmi/driver.c | 43 +++++++++++++++++++++++-------
> > 2 files changed, 41 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> > index 99b74f4d39b6..bf25f0e89c78 100644
> > --- a/drivers/firmware/arm_scmi/common.h
> > +++ b/drivers/firmware/arm_scmi/common.h
> > @@ -412,6 +412,13 @@ struct scmi_device *scmi_child_dev_find(struct device *parent,
> > * @max_msg_size: Maximum size of data per message that can be handled.
> > * @force_polling: Flag to force this whole transport to use SCMI core polling
> > * mechanism instead of completion interrupts even if available.
> > + * @sync_cmds_atomic_replies: Flag to indicate that the transport assures
> > + * synchronous-command messages are atomically
> > + * completed on .send_message: no need to poll
> > + * actively waiting for a response.
>
> Not sure if atomic is the right term to use. It is atomic w.r.t. the OSPM though.
> Can we just say sync_cmd_complete_on_ret or something similar?
>
Yes, I really never liked the naming, so I'll happily take your suggestion :D

Thanks,
Cristian

2021-12-14 10:55:00

by Cristian Marussi

[permalink] [raw]
Subject: Re: [PATCH v7 12/16] firmware: arm_scmi: Add atomic mode support to smc transport

On Mon, Dec 13, 2021 at 11:42:33AM +0000, Sudeep Holla wrote:
> On Mon, Nov 29, 2021 at 07:11:52PM +0000, Cristian Marussi wrote:
> > Add a Kernel configuration option to enable SCMI SMC transport atomic
> > mode operation for selected SCMI transactions and leave it as default
> > disabled.
> >
> > Substitute mutex usages with busy-waiting and declare smc transport as
> > .atomic_enabled if such Kernel configuration option is enabled.
> >
> > Signed-off-by: Cristian Marussi <[email protected]>
> > ---
> > v5 --> v6
> > - remove usage of atomic_capable
> > - removed needless union
> > - reviewed Kconfig help
> > v4 --> v5
> > - removed RFC tag
> > - add CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE option
> > - add .atomic_enable support
> > - make atomic_capable dependent on
> > CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> > - make also usage of mutexes vs busy-waiting dependent on
> > CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> > ---
> > drivers/firmware/arm_scmi/Kconfig | 14 +++++++
> > drivers/firmware/arm_scmi/smc.c | 66 ++++++++++++++++++++++++++++---
> > 2 files changed, 74 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> > index 638ecec89ff1..d429326433d1 100644
> > --- a/drivers/firmware/arm_scmi/Kconfig
> > +++ b/drivers/firmware/arm_scmi/Kconfig
> > @@ -78,6 +78,20 @@ config ARM_SCMI_TRANSPORT_SMC
> > If you want the ARM SCMI PROTOCOL stack to include support for a
> > transport based on SMC, answer Y.
> >
> > +config ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> > + bool "Enable atomic mode support for SCMI SMC transport"
> > + depends on ARM_SCMI_TRANSPORT_SMC
> > + help
> > + Enable support of atomic operation for SCMI SMC based transport.
> > +
> > + If you want the SCMI SMC based transport to operate in atomic
> > + mode, avoiding any kind of sleeping behaviour for selected
> > + transactions on the TX path, answer Y.
> > + Enabling atomic mode operations allows any SCMI driver using this
> > + transport to optionally ask for atomic SCMI transactions and operate
> > + in atomic context too, at the price of using a number of busy-waiting
> > + primitives all over instead. If unsure say N.
> > +
> > config ARM_SCMI_TRANSPORT_VIRTIO
> > bool "SCMI transport based on VirtIO"
> > depends on VIRTIO=y || VIRTIO=ARM_SCMI_PROTOCOL
> > diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
> > index b2f31d3feb10..0fc49cb49185 100644
> > --- a/drivers/firmware/arm_scmi/smc.c
> > +++ b/drivers/firmware/arm_scmi/smc.c
> > @@ -7,6 +7,7 @@
> > */
> >
> > #include <linux/arm-smccc.h>
> > +#include <linux/atomic.h>
> > #include <linux/device.h>
> > #include <linux/err.h>
> > #include <linux/interrupt.h>
> > @@ -14,6 +15,9 @@
> > #include <linux/of.h>
> > #include <linux/of_address.h>
> > #include <linux/of_irq.h>
> > +#ifdef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> > +#include <linux/processor.h>
> > +#endif
> > #include <linux/slab.h>
> >
> > #include "common.h"
> > @@ -23,14 +27,23 @@
> > *
> > * @cinfo: SCMI channel info
> > * @shmem: Transmit/Receive shared memory area
> > - * @shmem_lock: Lock to protect access to Tx/Rx shared memory area
> > + * @shmem_lock: Lock to protect access to Tx/Rx shared memory area.
> > + * Used when NOT operating in atomic mode.
> > + * @inflight: Atomic flag to protect access to Tx/Rx shared memory area.
> > + * Used when operating in atomic mode.
> > * @func_id: smc/hvc call function id
> > */
> >
> > struct scmi_smc {
> > struct scmi_chan_info *cinfo;
> > struct scmi_shared_mem __iomem *shmem;
> > +#ifndef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> > + /* Protect access to shmem area */
> > struct mutex shmem_lock;
>
> Ditto here, do we really need this saving? I would wait until someone
> really complains about space. It unnecessarily makes the code hard to read.
>

Yes, indeed I can remove the ifdeffery and go with runtime checks. I'll do that.
> > +#else
> > +#define INFLIGHT_NONE MSG_TOKEN_MAX
> > + atomic_t inflight;
> > +#endif
> > u32 func_id;
> > };
> >
> > @@ -54,6 +67,46 @@ static bool smc_chan_available(struct device *dev, int idx)
> > return true;
> > }
> >
> > +static inline void smc_channel_lock_init(struct scmi_smc *scmi_info)
> > +{
> > +#ifndef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> > + mutex_init(&scmi_info->shmem_lock);
> > +#else
> > + atomic_set(&scmi_info->inflight, INFLIGHT_NONE);
>
> You can do both if you remove the conditional definitions in the struct.
>

Yes.
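
i.e. roughly (assuming both members become unconditional in the struct):

static inline void smc_channel_lock_init(struct scmi_smc *scmi_info)
{
	mutex_init(&scmi_info->shmem_lock);
	atomic_set(&scmi_info->inflight, INFLIGHT_NONE);
}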

> > +#endif
> > +}
> > +
> > +#ifdef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
> > +static bool smc_xfer_inflight(struct scmi_xfer *xfer, atomic_t *inflight)
> > +{
> > + int ret;
> > +
> > + ret = atomic_cmpxchg(inflight, INFLIGHT_NONE, xfer->hdr.seq);
> > +
> > + return ret == INFLIGHT_NONE;
> > +}
> > +#endif
> > +
> > +static inline void
> > +smc_channel_lock_acquire(struct scmi_smc *scmi_info,
> > + struct scmi_xfer *xfer __maybe_unused)
> > +{
> > +#ifndef CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE
>
> If possible make it based on some local variable, or you can always do:
>
> if (IS_ENABLED(CONFIG_ARM_SCMI_TRANSPORT_SMC_ATOMIC_ENABLE))
>

I'll do.

Thanks,
Cristian

2021-12-14 10:57:08

by Cristian Marussi

[permalink] [raw]
Subject: Re: [PATCH v7 14/16] firmware: arm_scmi: Add atomic mode support to virtio transport

On Mon, Dec 13, 2021 at 11:34:56AM +0000, Sudeep Holla wrote:
> On Mon, Nov 29, 2021 at 07:11:54PM +0000, Cristian Marussi wrote:
> > Add support for .mark_txdone and .poll_done transport operations to SCMI
> > VirtIO transport as pre-requisites to enable atomic operations.
> >
> > Add a Kernel configuration option to enable SCMI VirtIO transport polling
> > and atomic mode for selected SCMI transactions while leaving it default
> > disabled.
> >
> > Cc: "Michael S. Tsirkin" <[email protected]>
> > Cc: Igor Skalkin <[email protected]>
> > Cc: Peter Hilber <[email protected]>
> > Cc: [email protected]
> > Signed-off-by: Cristian Marussi <[email protected]>
> > ---
> > V6 --> V7
> > - added a few comments about virtio polling internals
> > - fixed missing list_del on pending_cmds_list processing
> > - shrinked spinlocked areas in virtio_poll_done
> > - added proper spinlocking to scmi_vio_complete_cb while scanning list
> > of pending cmds
> > ---
> > drivers/firmware/arm_scmi/Kconfig | 15 ++
> > drivers/firmware/arm_scmi/virtio.c | 241 +++++++++++++++++++++++++++--
> > 2 files changed, 243 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> > index d429326433d1..7794bd41eaa0 100644
> > --- a/drivers/firmware/arm_scmi/Kconfig
> > +++ b/drivers/firmware/arm_scmi/Kconfig
> > @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
> > the ones implemented by kvmtool) and let the core Kernel VirtIO layer
> > take care of the needed conversions, say N.
> >
> > +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > + bool "Enable atomic mode for SCMI VirtIO transport"
> > + depends on ARM_SCMI_TRANSPORT_VIRTIO
> > + help
> > + Enable support of atomic operation for SCMI VirtIO based transport.
> > +
> > + If you want the SCMI VirtIO based transport to operate in atomic
> > + mode, avoiding any kind of sleeping behaviour for selected
> > + transactions on the TX path, answer Y.
> > +
> > + Enabling atomic mode operations allows any SCMI driver using this
> > + transport to optionally ask for atomic SCMI transactions and operate
> > + in atomic context too, at the price of using a number of busy-waiting
> > + primitives all over instead. If unsure say N.
> > +
> > endif #ARM_SCMI_PROTOCOL
> >
> > config ARM_SCMI_POWER_DOMAIN
> > diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> > index fd0f6f91fc0b..0598e185a786 100644
> > --- a/drivers/firmware/arm_scmi/virtio.c
> > +++ b/drivers/firmware/arm_scmi/virtio.c
> > @@ -38,6 +38,7 @@
> > * @vqueue: Associated virtqueue
> > * @cinfo: SCMI Tx or Rx channel
> > * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> > + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
> > * @is_rx: Whether channel is an Rx channel
> > * @ready: Whether transport user is ready to hear about channel
> > * @max_msg: Maximum number of pending messages for this channel.
> > @@ -49,6 +50,9 @@ struct scmi_vio_channel {
> > struct virtqueue *vqueue;
> > struct scmi_chan_info *cinfo;
> > struct list_head free_list;
> > +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > + struct list_head pending_cmds_list;
> > +#endif
> > bool is_rx;
> > bool ready;
> > unsigned int max_msg;
> > @@ -65,12 +69,22 @@ struct scmi_vio_channel {
> > * @input: SDU used for (delayed) responses and notifications
> > * @list: List which scmi_vio_msg may be part of
> > * @rx_len: Input SDU size in bytes, once input has been received
> > + * @poll_idx: Last used index registered for polling purposes if this message
> > + * transaction reply was configured for polling.
> > + * Note that virtqueue used index is an unsigned 16-bit.
> > + * @poll_lock: Protect access to @poll_idx.
> > */
> > struct scmi_vio_msg {
> > struct scmi_msg_payld *request;
> > struct scmi_msg_payld *input;
> > struct list_head list;
> > unsigned int rx_len;
> > +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
>
> Do we really need the #ifdefery for the struct definition? TBH I don't like
> the way it is. I would avoid it as much as possible. I assume some of these
> are added to avoid build warnings?
>
> Doesn't __maybe_unused help to remove some of them, like for the functions
> mark_txdone and poll_done? I haven't tried, but thought of checking.
>

Ok, I'll remove the ifdeffery while addressing Peter's concerns.

Thanks,
Cristian

2021-12-20 21:30:50

by Cristian Marussi

[permalink] [raw]
Subject: Re: [PATCH v7 14/16] firmware: arm_scmi: Add atomic mode support to virtio transport

On Fri, Dec 10, 2021 at 01:12:18PM +0100, Peter Hilber wrote:
> On 29.11.21 20:11, Cristian Marussi wrote:
> > Add support for .mark_txdone and .poll_done transport operations to SCMI
> > VirtIO transport as pre-requisites to enable atomic operations.
> >
>
> Hi Cristian,

Hi Peter,

>
> I see no conceptual problems with this patch. Please find some inline
> remarks below.

Thanks a lot for your feedback. I have (hopefully) addressed some of your
concerns in V8; see my replies inline below.

>
> Best regards,
>
> Peter
>
> > Add a Kernel configuration option to enable SCMI VirtIO transport polling
> > and atomic mode for selected SCMI transactions while leaving it default
> > disabled.
> >
> > Cc: "Michael S. Tsirkin" <[email protected]>
> > Cc: Igor Skalkin <[email protected]>
> > Cc: Peter Hilber <[email protected]>
> > Cc: [email protected]
> > Signed-off-by: Cristian Marussi <[email protected]>
> > ---
> > V6 --> V7
> > - added a few comments about virtio polling internals
> > - fixed missing list_del on pending_cmds_list processing
> > - shrinked spinlocked areas in virtio_poll_done
> > - added proper spinlocking to scmi_vio_complete_cb while scanning list
> > of pending cmds
> > ---
> > drivers/firmware/arm_scmi/Kconfig | 15 ++
> > drivers/firmware/arm_scmi/virtio.c | 241 +++++++++++++++++++++++++++--
> > 2 files changed, 243 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
> > index d429326433d1..7794bd41eaa0 100644
> > --- a/drivers/firmware/arm_scmi/Kconfig
> > +++ b/drivers/firmware/arm_scmi/Kconfig
> > @@ -118,6 +118,21 @@ config ARM_SCMI_TRANSPORT_VIRTIO_VERSION1_COMPLIANCE
> > the ones implemented by kvmtool) and let the core Kernel VirtIO layer
> > take care of the needed conversions, say N.
> >
> > +config ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > + bool "Enable atomic mode for SCMI VirtIO transport"
> > + depends on ARM_SCMI_TRANSPORT_VIRTIO
> > + help
> > + Enable support of atomic operation for SCMI VirtIO based transport.
> > +
> > + If you want the SCMI VirtIO based transport to operate in atomic
> > + mode, avoiding any kind of sleeping behaviour for selected
> > + transactions on the TX path, answer Y.
> > +
> > + Enabling atomic mode operations allows any SCMI driver using this
> > + transport to optionally ask for atomic SCMI transactions and operate
> > + in atomic context too, at the price of using a number of busy-waiting
> > + primitives all over instead. If unsure say N.
> > +
> > endif #ARM_SCMI_PROTOCOL
> >
> > config ARM_SCMI_POWER_DOMAIN
> > diff --git a/drivers/firmware/arm_scmi/virtio.c b/drivers/firmware/arm_scmi/virtio.c
> > index fd0f6f91fc0b..0598e185a786 100644
> > --- a/drivers/firmware/arm_scmi/virtio.c
> > +++ b/drivers/firmware/arm_scmi/virtio.c
> > @@ -38,6 +38,7 @@
> > * @vqueue: Associated virtqueue
> > * @cinfo: SCMI Tx or Rx channel
> > * @free_list: List of unused scmi_vio_msg, maintained for Tx channels only
> > + * @pending_cmds_list: List of pre-fetched commands queueud for later processing
> > * @is_rx: Whether channel is an Rx channel
> > * @ready: Whether transport user is ready to hear about channel
> > * @max_msg: Maximum number of pending messages for this channel.
> > @@ -49,6 +50,9 @@ struct scmi_vio_channel {
> > struct virtqueue *vqueue;
> > struct scmi_chan_info *cinfo;
> > struct list_head free_list;
> > +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > + struct list_head pending_cmds_list;
> > +#endif
> > bool is_rx;
> > bool ready;
> > unsigned int max_msg;
> > @@ -65,12 +69,22 @@ struct scmi_vio_channel {
> > * @input: SDU used for (delayed) responses and notifications
> > * @list: List which scmi_vio_msg may be part of
> > * @rx_len: Input SDU size in bytes, once input has been received
> > + * @poll_idx: Last used index registered for polling purposes if this message
> > + * transaction reply was configured for polling.
> > + * Note that virtqueue used index is an unsigned 16-bit.
> > + * @poll_lock: Protect access to @poll_idx.
> > */
> > struct scmi_vio_msg {
> > struct scmi_msg_payld *request;
> > struct scmi_msg_payld *input;
> > struct list_head list;
> > unsigned int rx_len;
> > +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > +#define VIO_MSG_POLL_DONE 0xffffffffUL
>
> virtqueue_enable_cb_prepare() returns an "opaque unsigned value", so
> this special value should not be used for .poll_idx.
>

Yes you are right, but looking at comments in virtqueue_enable_cb_prepare()
and virtqueue_poll()

/**
* virtqueue_enable_cb_prepare - restart callbacks after disable_cb
*
* This re-enables callbacks; it returns current queue state
* in an opaque unsigned value. This value should be later tested by
* virtqueue_poll, to detect a possible race between the driver
* checking for more work, and enabling callbacks.


/**
* virtqueue_poll - query pending used buffers
* @_vq: the struct virtqueue we're talking about.
* @last_used_idx: virtqueue state (from call to virtqueue_enable_cb_prepare).

... it seems to me that this is exactly how I'm using it, and moreover I
don't see any other way via the VirtIO API to grab the last_used_idx
that I need for virtqueue_poll.
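
Just to make the pattern explicit, this is roughly how those two are used
(a bare sketch, with all the SCMI-specific locking and bookkeeping stripped
out; 'vq' is the TX virtqueue):

/* #include <linux/virtio.h> */
static void *poll_for_used_buf(struct virtqueue *vq)
{
	unsigned int opaque, len;

	/* Snapshot the current queue state (the 'last_used_idx') */
	opaque = virtqueue_enable_cb_prepare(vq);

	/* Busy-wait until something new shows up in the used ring */
	while (!virtqueue_poll(vq, opaque))
		cpu_relax();

	virtqueue_disable_cb(vq);

	/* May or may not be the buffer we were actually waiting for */
	return virtqueue_get_buf(vq, &len);
}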


> > + unsigned int poll_idx;
> > + /* lock to protect access to poll_idx. */
> > + spinlock_t poll_lock;
> > +#endif
> > };
> >
> > /* Only one SCMI VirtIO device can possibly exist */
> > @@ -104,17 +118,22 @@ static int scmi_vio_feed_vq_rx(struct scmi_vio_channel *vioch,
> > return rc;
> > }
> >
> > +static inline void scmi_vio_feed_vq_tx(struct scmi_vio_channel *vioch,
> > + struct scmi_vio_msg *msg)
> > +{
> > + /* Here IRQs are assumed to be already disabled by the caller */
> > + spin_lock(&vioch->lock);
> > + list_add(&msg->list, &vioch->free_list);
> > + spin_unlock(&vioch->lock);
> > +}
> > +
> > static void scmi_finalize_message(struct scmi_vio_channel *vioch,
> > struct scmi_vio_msg *msg)
> > {
> > - if (vioch->is_rx) {
> > + if (vioch->is_rx)
> > scmi_vio_feed_vq_rx(vioch, msg, vioch->cinfo->dev);
> > - } else {
> > - /* Here IRQs are assumed to be already disabled by the caller */
> > - spin_lock(&vioch->lock);
> > - list_add(&msg->list, &vioch->free_list);
> > - spin_unlock(&vioch->lock);
> > - }
> > + else
> > + scmi_vio_feed_vq_tx(vioch, msg);
> > }
> >
> > static void scmi_vio_complete_cb(struct virtqueue *vqueue)
> > @@ -140,6 +159,26 @@ static void scmi_vio_complete_cb(struct virtqueue *vqueue)
>
> Looks like this function could also scmi_rx_callback() for polled
> messages, which should probably not happen.
>

Yes indeed, but that is by design :P ... let me explain.

Using the VirtIO API I can poll on the last used index waiting for something
new to arrive, but I cannot suppress IRQs/notifications on selected messages
(like the one I poll on...) because the only way to do that would be to
request IRQ/notification suppression for the whole virtqueue, and that is
something I don't want to do since it would forcibly move all the traffic
on that vqueue to busy waiting. (...even though certainly IRQs are
suppressed while I disable/enable CBs...)

So, as a matter of fact, even while I am busy-waiting polling for a
specific message to come back, maybe with IRQs locally disabled, it can
happen that my polled message is delivered on some other CPU where IRQs
are not disabled (during that window in which CBs are enabled...)...

...but this is fine, since all the SCMI stack accesses to the xfers
have been serialized with locks, and if scmi_rx_callback is called on
an xfer which I was polling on, due to the IRQ being delivered on another
CPU, the xfer will be properly filled with the data from the message
buffer AND the polling loop will be terminated too by scmi_rx_callback
(this was introduced since it is needed to handle out-of-order message
delivery).

In other words, even though most probably you'll complete your polling
transaction via poll_done, it is possible, and handled, that a polling
transaction is instead orderly completed by an IRQ delivered on another
core.
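
Schematically, the core-side wait ends up being something like this (a very
rough sketch with made-up helper names, NOT the actual driver.c code):

static bool scmi_xfer_finished(struct scmi_chan_info *cinfo,
			       struct scmi_xfer *xfer, ktime_t stop)
{
	/* transport_poll_done()/xfer_completed_by_irq() are made-up names */
	return transport_poll_done(cinfo, xfer) ||
	       xfer_completed_by_irq(xfer) ||
	       ktime_after(ktime_get(), stop);
}

/* ... on the TX polling path ... */
spin_until_cond(scmi_xfer_finished(cinfo, xfer, stop));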

> >
> > /* IRQs already disabled here no need to irqsave */
> > spin_lock(&vioch->lock);
> > +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > + /* At first scan the list of possibly pre-fetched messages */
> > + if (!vioch->is_rx) {
> > + struct scmi_vio_msg *tmp;
> > +
> > + list_for_each_entry_safe(msg, tmp,
> > + &vioch->pending_cmds_list,
> > + list) {
> > + list_del(&msg->list);
> > + spin_unlock(&vioch->lock);
> > +
> > + scmi_rx_callback(vioch->cinfo,
> > + msg_read_header(msg->input),
> > + msg);
> > + /* Free the processed message once done */
> > + spin_lock(&vioch->lock);
> > + list_add(&msg->list, &vioch->free_list);
> > + }
> > + }
> > +#endif
> > if (cb_enabled) {
> > virtqueue_disable_cb(vqueue);
> > cb_enabled = false;
> > @@ -257,6 +296,9 @@ static int virtio_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
> > GFP_KERNEL);
> > if (!msg->request)
> > return -ENOMEM;
> > +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > + spin_lock_init(&msg->poll_lock);
> > +#endif
> > }
> >
> > msg->input = devm_kzalloc(cinfo->dev, VIRTIO_SCMI_MAX_PDU_SIZE,
> > @@ -324,7 +366,8 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
> > }
> >
> > msg = list_first_entry(&vioch->free_list, typeof(*msg), list);
> > - list_del(&msg->list);
> > + /* Re-init element so we can discern anytime if it is still in-flight */
> > + list_del_init(&msg->list);
> >
> > msg_tx_prepare(msg->request, xfer);
> >
> > @@ -337,6 +380,20 @@ static int virtio_send_message(struct scmi_chan_info *cinfo,
> > dev_err(vioch->cinfo->dev,
> > "failed to add to TX virtqueue (%d)\n", rc);
> > } else {
> > +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > + /*
> > + * If polling was requested for this transaction:
> > + * - retrieve last used index (will be used as polling reference)
> > + * - bind the polled message to the xfer via .priv
> > + */
> > + if (xfer->hdr.poll_completion) {
> > + spin_lock(&msg->poll_lock);
> > + msg->poll_idx =
> > + virtqueue_enable_cb_prepare(vioch->vqueue);
> > + spin_unlock(&msg->poll_lock);
> > + xfer->priv = msg;
> > + }
> > +#endif
> > virtqueue_kick(vioch->vqueue);
> > }
> >
> > @@ -350,10 +407,8 @@ static void virtio_fetch_response(struct scmi_chan_info *cinfo,
> > {
> > struct scmi_vio_msg *msg = xfer->priv;
> >
> > - if (msg) {
> > + if (msg)
> > msg_fetch_response(msg->input, msg->rx_len, xfer);
> > - xfer->priv = NULL;
> > - }
> > }
> >
> > static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> > @@ -361,11 +416,163 @@ static void virtio_fetch_notification(struct scmi_chan_info *cinfo,
> > {
> > struct scmi_vio_msg *msg = xfer->priv;
> >
> > - if (msg) {
> > + if (msg)
> > msg_fetch_notification(msg->input, msg->rx_len, max_len, xfer);
> > - xfer->priv = NULL;
> > +}
> > +
> > +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
> > +/**
> > + * virtio_mark_txdone - Mark transmission done
> > + *
> > + * Free only successfully completed polling transfer messages.
> > + *
> > + * Note that in the SCMI VirtIO transport we never explicitly release timed-out
> > + * messages by forcibly re-adding them to the free-list on timeout inside the TX
> > + * code path; we instead let IRQ/RX callbacks eventually clean up such messages
> > + * once, finally, a late reply is received and discarded (if ever).
> > + *
> > + * This approach was deemed preferable since those pending timed-out buffers are
> > + * still effectively owned by the SCMI platform VirtIO device even after timeout
> > + * expiration: forcibly freeing and reusing them before they had beeen returned
> > + * by the SCMI platform could lead to subtle bugs due to message corruption.
> > + * An SCMI platform VirtIO device which never returns message buffers is
> > + * anyway broken and it will quickly lead to message exhaustion.
> > + *
> > + * For this same reason, here, we take care to free only the successfully
> > + * completed polled messages, since they won't be freed elsewhere; late replies
> > + * to timed-out polled messages would be anyway freed by RX callbacks instead.
> > + *
> > + * @cinfo: SCMI channel info
> > + * @ret: Transmission return code
> > + * @xfer: Transfer descriptor
> > + */
> > +static void virtio_mark_txdone(struct scmi_chan_info *cinfo, int ret,
> > + struct scmi_xfer *xfer)
> > +{
> > + unsigned long flags;
> > + struct scmi_vio_channel *vioch = cinfo->transport_info;
> > + struct scmi_vio_msg *msg = xfer->priv;
> > +
> > + if (!msg)
> > + return;
> > +
> > + /* Is a successfully completed polled message still to be finalized ? */
> > + spin_lock_irqsave(&msg->poll_lock, flags);
>
> .poll_lock is acquired before vioch->lock (in scmi_vio_feed_vq_tx()),
> but e. g. virtio_poll_done() uses the reverse locking order.
>

Right, good catch! Indeed it is not even needed... I have reworked all the
spinlocking around feed_vq_tx/rx in V8.

> > + if (!ret && xfer->hdr.poll_completion && list_empty(&msg->list))
> > + scmi_vio_feed_vq_tx(vioch, msg);
> > + spin_unlock_irqrestore(&msg->poll_lock, flags);
> > +
> > + xfer->priv = NULL;
> > +}
> > +
> > +/**
> > + * virtio_poll_done - Provide polling support for VirtIO transport
> > + *
> > + * @cinfo: SCMI channel info
> > + * @xfer: Reference to the transfer being poll for.
> > + *
> > + * VirtIO core provides a polling mechanism based only on last used indexes:
> > + * this means that it is possible to poll the virtqueues waiting for something
> > + * new to arrive from the host side but the only way to check if the freshly
> > + * arrived buffer was what we were waiting for is to compare the newly arrived
> > + * message descriptors with the one we are polling on.
> > + *
> > + * As a consequence it can happen to dequeue something different from the buffer
> > + * we were poll-waiting for: if that is the case such early fetched buffers are
> > + * then added to a the @pending_cmds_list list for later processing within the
> > + * usual VirtIO callbacks; so, basically, once something new is spotted we
> > + * proceed to de-queue all the freshly received used buffers until we found the
> > + * one we were polling on, or we empty the virtqueue.
> > + *
> > + * Note that we do NOT suppress notification with VIRTQ_USED_F_NO_NOTIFY even
> > + * when polling since such flag is per-virtqueues and we do not want to
> > + * suppress notifications as a whole: so, if the message we are polling for is
> > + * delivered via usual IRQs callbacks, it will be handled as such and the
> > + * polling loop in the SCMI Core TX path will be transparently terminated
> > + * anyway.
> > + *
> > + * Return: True once polling has successfully completed.
> > + */
> > +static bool virtio_poll_done(struct scmi_chan_info *cinfo,
> > + struct scmi_xfer *xfer)
> > +{
> > + bool ret;
> > + unsigned int poll_idx;
> > + unsigned long flags;
> > + struct scmi_vio_msg *msg = xfer->priv;
> > + struct scmi_vio_channel *vioch = cinfo->transport_info;
> > +
> > + if (!msg)
> > + return true;
> > +
> > + /*
> > + * Keep the spinlocked region as small as possible: don't care if
> > + * missing something this time it will be polled again next.
> > + */
> > + spin_lock_irqsave(&msg->poll_lock, flags);
> > + poll_idx = msg->poll_idx;
> > + spin_unlock_irqrestore(&msg->poll_lock, flags);
> > +
> > + /* Processed already by other polling loop on another CPU ? */
> > + if (poll_idx == VIO_MSG_POLL_DONE)
> > + return true;
> > +
> > + /* Has cmdq index moved at all ? */
> > + ret = virtqueue_poll(vioch->vqueue, poll_idx);
>
> In theory, if polling gets delayed, virtqueue_poll() might run into an
> ABA problem, if 2**16 descriptors have been used in the meantime (and
> activity stops after this).
>
> I think this could be avoided by clearing .poll_idx of each returned
> message everywhere (also in scmi_vio_complete_cb()).
>

Indeed I'm doing just that in V8... if I got right what you meant.

> > + if (ret) {
> > + struct scmi_vio_msg *next_msg;
> > +
> > + spin_lock_irqsave(&vioch->lock, flags);
> > + virtqueue_disable_cb(vioch->vqueue);
> > + /*
> > + * If something arrived we cannot be sure if it was the reply to
> > + * the xfer we are polling for, or some replies to other, even
> > + * possibly non-polling, pending xfers: process all new messages
> > + * till the polled-for message is found OR the vqueue is empty.
> > + */
> > + do {
> > + unsigned int length;
> > +
> > + next_msg = virtqueue_get_buf(vioch->vqueue, &length);
> > + if (next_msg) {
> > + next_msg->rx_len = length;
> > + if (next_msg == msg) {
> > + ret = true;
> > + break;
> > + }
> > +
> > + list_add_tail(&next_msg->list,
> > + &vioch->pending_cmds_list);
> > + spin_lock(&next_msg->poll_lock);
> > + next_msg->poll_idx = VIO_MSG_POLL_DONE;
> > + spin_unlock(&next_msg->poll_lock);
> > + ret = false;
> > + }
> > + } while (next_msg);
> > +
> > + /*
> > + * When the polling loop has successfully terminated simply
> > + * restart the vqueue, no matter if something else was queued
>
> "restart the vqueue" should better be phrased as "re-enable the vqueue
> callbacks".
>

Missed this in V8... I'll fix it in v9 (which will be due anyway for other
reasons related to the Clock framework :D)

> > + * in the meantime, it will be served by normal IRQ/callback
> > + * or by the next poll loop.
> > + *
> > + * Update the polling index to the current vqueue last used
> > + * index, if still looking for a reply.
> > + */
> > + if (ret) {
> > + virtqueue_enable_cb(vioch->vqueue);
>
> If this function returns false, how is the race described in the
> function documentation handled? (The same comment might also apply in
> other places.)
>

Good point.
I had wrong assumptions about this behaviour indeed: what I was missing
is the fact that any pending message in the vqueue at this point won't
be signalled anymore until the next message possibly arrives; in v8 I
check for pending buffers when re-enabling and then have them processed
by a deferred workqueue together with the pre-fetched non-polled ones; this
way I also avoid looping too long de-queuing non-polled messages within
the poll loop.
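
Something along these lines, just to illustrate the deferred processing
described above; the worker and the deferred_tx_work member are assumptions
for this sketch, not the actual V8 code:

/*
 * Assumed additional member on scmi_vio_channel for this sketch:
 *	struct work_struct deferred_tx_work;
 * queued (e.g. via queue_work()) right after callbacks are re-enabled
 * whenever pre-fetched buffers were left on pending_cmds_list.
 */
static void scmi_vio_deferred_tx_worker(struct work_struct *work)
{
	unsigned long flags;
	LIST_HEAD(pending);
	struct scmi_vio_msg *msg, *tmp;
	struct scmi_vio_channel *vioch =
		container_of(work, struct scmi_vio_channel, deferred_tx_work);

	/* Grab all the pre-fetched buffers at once, then process unlocked. */
	spin_lock_irqsave(&vioch->lock, flags);
	list_splice_init(&vioch->pending_cmds_list, &pending);
	spin_unlock_irqrestore(&vioch->lock, flags);

	list_for_each_entry_safe(msg, tmp, &pending, list) {
		list_del(&msg->list);
		/* Let the SCMI core dispatch each reply as usual. */
		scmi_rx_callback(vioch->cinfo, msg_read_header(msg->input), msg);
	}
}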

Thanks,
Cristian

2022-01-20 12:47:45

by Peter Hilber

[permalink] [raw]
Subject: Re: [PATCH v7 14/16] firmware: arm_scmi: Add atomic mode support to virtio transport

On 20.12.21 22:30, Cristian Marussi wrote:
> On Fri, Dec 10, 2021 at 01:12:18PM +0100, Peter Hilber wrote:
>> On 29.11.21 20:11, Cristian Marussi wrote:
<snip>
>>> @@ -65,12 +69,22 @@ struct scmi_vio_channel {
>>> * @input: SDU used for (delayed) responses and notifications
>>> * @list: List which scmi_vio_msg may be part of
>>> * @rx_len: Input SDU size in bytes, once input has been received
>>> + * @poll_idx: Last used index registered for polling purposes if this message
>>> + * transaction reply was configured for polling.
>>> + * Note that the virtqueue used index is an unsigned 16-bit value.
>>> + * @poll_lock: Protect access to @poll_idx.
>>> */
>>> struct scmi_vio_msg {
>>> struct scmi_msg_payld *request;
>>> struct scmi_msg_payld *input;
>>> struct list_head list;
>>> unsigned int rx_len;
>>> +#ifdef CONFIG_ARM_SCMI_TRANSPORT_VIRTIO_ATOMIC_ENABLE
>>> +#define VIO_MSG_POLL_DONE 0xffffffffUL
>>
>> virtqueue_enable_cb_prepare() returns an "opaque unsigned value", so
>> this special value should not be used for .poll_idx.
>>
>
> Yes you are right, but looking at comments in virtqueue_enable_cb_prepare()
> and virtqueue_poll()
>
> /**
> * virtqueue_enable_cb_prepare - restart callbacks after disable_cb
> *
> * This re-enables callbacks; it returns current queue state
> * in an opaque unsigned value. This value should be later tested by
> * virtqueue_poll, to detect a possible race between the driver
> * checking for more work, and enabling callbacks.
>
>
> /**
> * virtqueue_poll - query pending used buffers
> * @_vq: the struct virtqueue we're talking about.
> * @last_used_idx: virtqueue state (from call to virtqueue_enable_cb_prepare).
>
> ... it seems to me that this is exactly how I'm using it, and moreover I
> don't see any other way via the VirtIO API to grab that last_used_idx
> that I need for virtqueue_poll().
>

I meant to say that the VIO_MSG_POLL_DONE special value should best not be put
into the .poll_idx (since the special value might in theory overlap with an
opaque value). Another variable could hold the special states VIO_MSG_POLL_DONE
and VIO_MSG_NOT_POLLED.
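
Just to make that concrete, one possible layout, purely as a sketch (the enum
and its placement are made up here, not taken from any posted revision): a
dedicated status field alongside the opaque index, so the two can never
collide.

/* Sketch: keep the poll lifecycle out of the opaque index entirely. */
enum scmi_vio_poll_status {
	VIO_MSG_NOT_POLLED,
	VIO_MSG_POLL_DONE,
};

struct scmi_vio_msg {
	struct scmi_msg_payld *request;
	struct scmi_msg_payld *input;
	struct list_head list;
	unsigned int rx_len;
	/* Opaque value from virtqueue_enable_cb_prepare(), only ever
	 * handed back to virtqueue_poll(). */
	unsigned int poll_idx;
	enum scmi_vio_poll_status poll_status;
	/* Protects access to poll_idx and poll_status. */
	spinlock_t poll_lock;
};

virtio_poll_done() would then test poll_status against VIO_MSG_POLL_DONE and
only pass poll_idx to virtqueue_poll(), so no opaque index value can ever be
mistaken for a sentinel.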