2022-10-19 21:22:58

by Cristian Marussi

Subject: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

Hi all,

This series aims to introduce a new SCMI unified userspace interface meant
to ease testing an SCMI Server implementation for compliance, fuzzing etc.,
from the perspective of the OSPM agent (non-secure world only ...)

It is proposed as a testing/development facility: it is NOT meant to be
used in production and should only be enabled in Kconfig for test
deployments.

Currently an SCMI Compliance Suite like the one at [1] can only work by
injecting SCMI messages at the SCMI transport layer using the mailbox test
driver (CONFIG_MAILBOX_TEST) via its few debugfs entries and looking at
the related replies from the SCMI backend Server.

This approach has a few drawbacks:

- the SCMI Server under test MUST be reachable through a mailbox based
SCMI transport: no other SCMI Server placement can be tested (e.g. a
Server living in a VM reachable via SCMI Virtio). To cover other
placements in the current scenario we would have to write some sort of
test driver for each and every existing SCMI transport and for any future
additional transport ...this clearly does not scale.

- even in the mailbox case the userspace Compliance suite cannot simply
send and receive bare SCMI messages BUT it has to properly lay them out
into the shared memory exposed by the mailbox test driver as expected by
the transport definitions. In other words such a userspace test
application not only has to use a proper transport driver for the system
at hand, but it also needs comprehensive knowledge of the internals of
the underlying transport in order to operate.

- last but not least, the system under test has to be specifically
configured and built, in terms of Kconfig and DT, to perform this kind of
testing and cannot be used for anything else, which is unfortunate for
CI/CD deployments.

This series introduces a new SCMI Raw mode support feature that, when
configured and enabled, exposes a new interface in debugfs through which:

- a userspace application can inject bare SCMI binary messages into the
SCMI core stack; such messages will be routed by the SCMI regular kernel
stack to the backend Server using the currently configured transport
transparently: in other words you can test the SCMI server, no matter
where it is placed, as long as it is reachable from the currently
configured SCMI stack.
Same goes the other way around on the reading path: any SCMI server reply
can be read as a bare SCMI binary message from the same debugfs path.

- as a direct consequence of this way of injecting bare messages in the
middle of the SCMI stack (instead of beneath it at the transport layer)
the user application has to handle only bare SCMI messages without having
to worry about the specific underlying transport internals that will be
taken care of by the SCMI core stack itself using its own machinery,
without duplicating such logic.

- a system under test, once configured with SCMI Raw support enabled in
Kconfig, can be booted without any particular DT change.

In V2 the runtime enable/disable switching capability has been removed
(for now) since it is still not deemed to be stable/reliable enough: as a
consequence, when SCMI Raw support is compiled in, the regular SCMI stack
drivers are now inhibited permanently for that Kernel.

In V4, support has been added for transports lacking a completion_irq or
forcibly configured in polled mode.

A quick and trivial example from the shell...reading from a sensor by
injecting a properly crafted packet in raw mode:

# INJECT THE SENSOR_READING MESSAGE FOR SENSOR ID=1 (binary little endian)
root@deb-buster-arm64:~# echo -e -n \\x06\\x54\\x00\\x00\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00 > /sys/kernel/debug/scmi_raw/message

# READING BACK THE REPLY...
root@deb-buster-arm64:~# cat /sys/kernel/debug/scmi_raw/message | od --endian=little -t x4
0000000 00005406 00000000 00000335 00000000
0000020

while doing that, since Raw mode makes (partial) use of the regular SCMI
stack, you can observe the messages going through the SCMI stack with the
usual traces:

bash-329 [000] ..... 14183.446808: scmi_msg_dump: pt=15 t=CMND msg_id=06 seq=0000 s=0 pyld=0100000000000000
irq/35-mhu_db_l-81 [000] ..... 14183.447809: scmi_msg_dump: pt=15 t=RESP msg_id=06 seq=0000 s=0 pyld=3503000000000000
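
The 12 bytes written above are just a 32-bit SCMI message header followed
by the command payload, all little-endian. As a rough sketch of how a test
application might build such a packet (the helper names are made up for
illustration and are not part of this series), assuming the usual SCMI
header layout with the message id in bits 7:0, the message type in bits
9:8, the protocol id in bits 17:10 and the token in bits 27:18:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack the fields of an SCMI message header word. */
static uint32_t scmi_raw_pack_hdr(uint8_t protocol_id, uint8_t msg_id,
				  uint8_t msg_type, uint16_t token)
{
	return ((uint32_t)msg_id) |
	       (((uint32_t)msg_type & 0x3) << 8) |
	       (((uint32_t)protocol_id) << 10) |
	       (((uint32_t)token & 0x3ff) << 18);
}

/* Hypothetical helper: store a 32-bit value in little-endian byte order. */
static void scmi_raw_put_le32(uint8_t *dst, uint32_t val)
{
	dst[0] = val & 0xff;
	dst[1] = (val >> 8) & 0xff;
	dst[2] = (val >> 16) & 0xff;
	dst[3] = (val >> 24) & 0xff;
}

/*
 * Build a sensor reading command (protocol 0x15, message 0x06) made of
 * header, sensor id and flags; bit 0 of flags asks for an async reading.
 * buf must have room for 12 bytes; returns the number of bytes written.
 */
static size_t scmi_raw_build_sensor_read(uint8_t *buf, uint32_t sensor_id,
					 uint32_t flags, uint16_t token)
{
	scmi_raw_put_le32(buf, scmi_raw_pack_hdr(0x15, 0x06, 0, token));
	scmi_raw_put_le32(buf + 4, sensor_id);
	scmi_raw_put_le32(buf + 8, flags);
	return 12;
}
```

With sensor_id=1, flags=0 and token=0 this yields the same byte sequence
that is passed to echo in the shell example.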


..trying to read in async when the backend server does NOT support async commands:

# AN ASYNC SENSOR READING REQUEST...
root@deb-buster-arm64:~# echo -e -n \\x06\\x54\\x00\\x00\\x01\\x00\\x00\\x00\\x01\\x00\\x00\\x00 > /sys/kernel/debug/scmi_raw/message_async

bash-329 [000] ..... 16415.938739: scmi_msg_dump: pt=15 t=CMND msg_id=06 seq=0000 s=0 pyld=0100000001000000
irq/35-mhu_db_l-81 [000] ..... 16415.944129: scmi_msg_dump: pt=15 t=RESP msg_id=06 seq=0000 s=-1 pyld=

# RETURNS A STATUS -1 FROM THE SERVER NOT SUPPORTING IT
root@deb-buster-arm64:~# cat /sys/kernel/debug/scmi_raw/message | od --endian=little -t x4
0000000 00005406 ffffffff
0000010
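
On the reading side, a command reply like the two shown above is the
32-bit header followed by a signed 32-bit status word and then any return
values. A minimal sketch of how a test application could split it up
(struct and function names are hypothetical, and the header layout is the
same assumption as for the request; notifications carry no status word and
are not handled here):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for a decoded command reply. */
struct scmi_raw_reply {
	uint8_t msg_id;
	uint8_t msg_type;
	uint8_t protocol_id;
	uint16_t token;
	int32_t status;
	const uint8_t *payload;	/* points into the caller's buffer */
	size_t payload_len;
};

/* Hypothetical helper: load a little-endian 32-bit value. */
static uint32_t scmi_raw_get_le32(const uint8_t *src)
{
	return (uint32_t)src[0] | ((uint32_t)src[1] << 8) |
	       ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
}

/* Returns 0 on success, -1 if the buffer cannot hold header + status. */
static int scmi_raw_parse_reply(const uint8_t *buf, size_t len,
				struct scmi_raw_reply *r)
{
	uint32_t hdr;

	if (len < 8)
		return -1;

	hdr = scmi_raw_get_le32(buf);
	r->msg_id = hdr & 0xff;
	r->msg_type = (hdr >> 8) & 0x3;
	r->protocol_id = (hdr >> 10) & 0xff;
	r->token = (hdr >> 18) & 0x3ff;
	r->status = (int32_t)scmi_raw_get_le32(buf + 4);
	r->payload = buf + 8;
	r->payload_len = len - 8;

	return 0;
}
```

Fed with the two replies above, this gives status 0 with an 8-byte payload
holding the reading 0x335 in the first case, and status -1 with an empty
payload in the second.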

Note that this was on a JUNO, BUT exactly the same steps can be used to
reach an SCMI Server living on a VM reachable via virtio as long as the
system under test is properly configured to work with a virtio transport.

In a nutshell the exposed API is as follows:

/sys/kernel/debug/scmi_raw/
├── errors
├── message
├── message_async
├── notification
├── reset
├── transport_max_msg_size
├── transport_rx_timeout_ms
└── transport_tx_max_msg

where:

- message*: used to send sync/async commands and read back immediate and
delayed responses (if any)
- errors: used to report timeout and unexpected replies
- reset: used to reset the SCMI Raw stack, flushing all queues from
received messages still pending to be read out (useful to be sure to
cleanup between test suite runs...)
- notification: used to read any notification emitted by the system
(if previously enabled by the user app)
- transport*: a set of configuration values useful to set up the user
application expectations in terms of timeouts and message
characteristics.
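
As a sketch of how a test application might consume the transport* values,
the helper below parses the text read from one of those entries. It
assumes, which the text above does not spell out, that each entry renders
as a plain decimal number optionally followed by a newline; the function
name is hypothetical:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the content of a transport_* entry.
 * Assumes a decimal number optionally terminated by a newline.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int scmi_raw_parse_u32(const char *text, uint32_t *out)
{
	char *end;
	unsigned long val;

	/* Reject empty input, signs and leading blanks upfront */
	if (text[0] < '0' || text[0] > '9')
		return -1;

	errno = 0;
	val = strtoul(text, &end, 10);
	if (errno || end == text || val > UINT32_MAX)
		return -1;

	if (*end != '\0' && *end != '\n')
		return -1;

	*out = (uint32_t)val;
	return 0;
}
```

The value obtained from transport_rx_timeout_ms can then be used as the
timeout the application applies when waiting for a reply on message*.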

Each write corresponds to one command request, and the replies or delayed
responses are read back one message at a time (receiving an EOF at each
message boundary).
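
Since an EOF delimits each message, a reader can simply accumulate bytes
until read() returns 0. A minimal sketch of such a loop follows (the
function name is hypothetical; it works on any file descriptor, so the
caller is expected to have opened one of the entries above, and timeout
handling is deliberately left out):

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read one message from fd into buf.
 * Stops at EOF, taken here as the message boundary, or when buf is full.
 * Use a buffer larger than the biggest expected message so that the loop
 * always terminates on EOF. Returns the message length or -1 on error.
 */
static ssize_t scmi_raw_read_msg(int fd, void *buf, size_t len)
{
	size_t total = 0;

	while (total < len) {
		ssize_t n = read(fd, (char *)buf + total, len - total);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)	/* EOF: end of the current message */
			break;
		total += (size_t)n;
	}

	return (ssize_t)total;
}
```

Calling it again on the same descriptor would then return the next queued
message, if any.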

The user application running the test is in charge of handling timeouts
and properly choosing SCMI sequence numbers for the outgoing requests: note
that the same fixed number can be re-used (...though discouraged...) as
long as the suite does NOT expect to send multiple in-flight commands
concurrently.
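
One simple way for a suite to pick sequence numbers is a wrapping counter.
The sketch below assumes the sequence number is the 10-bit token field
found in bits 27:18 of the SCMI header; the names are hypothetical and
nothing here tracks which tokens are still in flight:

```c
#include <stdint.h>

#define SCMI_RAW_TOKEN_BITS	10
#define SCMI_RAW_TOKEN_MASK	((1u << SCMI_RAW_TOKEN_BITS) - 1)
#define SCMI_RAW_TOKEN_SHIFT	18

/* Hypothetical helper: return the current token and advance the counter. */
static uint16_t scmi_raw_next_token(uint16_t *counter)
{
	uint16_t token = *counter & SCMI_RAW_TOKEN_MASK;

	*counter = (uint16_t)((token + 1) & SCMI_RAW_TOKEN_MASK);
	return token;
}

/* Hypothetical helper: replace the token field of a header word. */
static uint32_t scmi_raw_set_token(uint32_t hdr, uint16_t token)
{
	hdr &= ~(SCMI_RAW_TOKEN_MASK << SCMI_RAW_TOKEN_SHIFT);
	hdr |= ((uint32_t)token & SCMI_RAW_TOKEN_MASK) << SCMI_RAW_TOKEN_SHIFT;
	return hdr;
}
```

A suite sending one command at a time could skip the counter entirely and
keep the token at 0, as done in the shell examples (seq=0000 in the
traces).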

Since the SCMI core regular stack is partially used to deliver and collect
the messages, late replies after timeouts and any other sort of unexpected
message sent back by the SCMI server platform can be identified by the SCMI
core as usual and will be reported under /errors for later analysis.
(a userspace test-app will have properly detected the timeout on
/message* anyway ...)

All of the above has been roughly tested against a standard JUNO SCP SCMI
Server (mailbox trans) and an emulated SCMI Server living in a VM (virtio
trans) using a custom experimental version of the scmi-tests Compliance
suite patched to support Raw mode and posted at [2]. (still in development
...merge requests are in progress...for now it is just a means for me to
test the testing API ... O_o)

The series is based on v6.1-rc1.

Having said that (in such a concise and brief way :P) ...

...any feedback/comments are welcome !

Thanks,
Cristian

---
V3 --> v4
- rebased on v6.1-rc1
- added missing support for 'polled' transports and transports lacking a
completion_irq (like smc/optee)
- removed a few inlines
- refactored SCMI Raw RX patch to make more extensive use of the regular
non-Raw RX path
- fix handling of O_NONBLOCK raw_mode read requests

v2 --> v3
- fixed some sparse warnings on LE and __poll_t
- reworked and simplified deferred worker in charge of xfer delayed waiting
- allow for injection of messages for protocols unknown to the DT when in
Raw mode (needed for any kind of fuzzing...)

v1 --> v2
- added comments and debugfs docs
- added dedicated transport devices for channels initialization
- better channels handling in Raw mode
- removed runtime enable, moved to static compile time exclusion
of SCMI regular stack

[1]: https://gitlab.arm.com/tests/scmi-tests
[2]: https://gitlab.arm.com/tests/scmi-tests/-/commits/raw_mode_support_devel/


Cristian Marussi (11):
firmware: arm_scmi: Refactor xfer in-flight registration routines
firmware: arm_scmi: Simplify chan_available transport operation
firmware: arm_scmi: Use dedicated devices to initialize channels
firmware: arm_scmi: Refactor polling helpers
firmware: arm_scmi: Refactor scmi_wait_for_message_response
firmware: arm_scmi: Add xfer raw helpers
firmware: arm_scmi: Move errors defs and code to common.h
firmware: arm_scmi: Add raw transmission support
firmware: arm_scmi: Add debugfs ABI documentation for Raw mode
firmware: arm_scmi: Reject SCMI drivers while in Raw mode
firmware: arm_scmi: Call Raw mode hooks from the core stack

Documentation/ABI/testing/debugfs-scmi-raw | 88 ++
drivers/firmware/arm_scmi/Kconfig | 13 +
drivers/firmware/arm_scmi/Makefile | 1 +
drivers/firmware/arm_scmi/common.h | 72 +-
drivers/firmware/arm_scmi/driver.c | 521 +++++---
drivers/firmware/arm_scmi/mailbox.c | 4 +-
drivers/firmware/arm_scmi/optee.c | 4 +-
drivers/firmware/arm_scmi/raw_mode.c | 1244 ++++++++++++++++++++
drivers/firmware/arm_scmi/raw_mode.h | 29 +
drivers/firmware/arm_scmi/smc.c | 4 +-
drivers/firmware/arm_scmi/virtio.c | 2 +-
11 files changed, 1827 insertions(+), 155 deletions(-)
create mode 100644 Documentation/ABI/testing/debugfs-scmi-raw
create mode 100644 drivers/firmware/arm_scmi/raw_mode.c
create mode 100644 drivers/firmware/arm_scmi/raw_mode.h

--
2.34.1


2022-10-19 21:23:04

by Cristian Marussi

Subject: [PATCH v4 09/11] firmware: arm_scmi: Add debugfs ABI documentation for Raw mode

Add description of the debugfs SCMI Raw ABI.

Signed-off-by: Cristian Marussi <[email protected]>
---
Documentation/ABI/testing/debugfs-scmi-raw | 88 ++++++++++++++++++++++
1 file changed, 88 insertions(+)
create mode 100644 Documentation/ABI/testing/debugfs-scmi-raw

diff --git a/Documentation/ABI/testing/debugfs-scmi-raw b/Documentation/ABI/testing/debugfs-scmi-raw
new file mode 100644
index 000000000000..183ec678cb3e
--- /dev/null
+++ b/Documentation/ABI/testing/debugfs-scmi-raw
@@ -0,0 +1,88 @@
+What: /sys/kernel/debug/scmi_raw/transport_max_msg_size
+Date: December 2022
+KernelVersion: 6.1
+Contact: [email protected]
+Description: Max message size of allowed SCMI messages for the currently
+ configured SCMI transport.
+Users: Debugging, any userspace test suite
+
+What: /sys/kernel/debug/scmi_raw/transport_tx_max_msg
+Date: December 2022
+KernelVersion: 6.1
+Contact: [email protected]
+Description: Max number of concurrently allowed in-flight SCMI messages for
+ the currently configured SCMI transport.
+Users: Debugging, any userspace test suite
+
+What: /sys/kernel/debug/scmi_raw/transport_rx_timeout_ms
+Date: December 2022
+KernelVersion: 6.1
+Contact: [email protected]
+Description: Timeout in milliseconds allowed for SCMI synchronous replies
+ for the currently configured SCMI transport.
+Users: Debugging, any userspace test suite
+
+What: /sys/kernel/debug/scmi_raw/message
+Date: December 2022
+KernelVersion: 6.1
+Contact: [email protected]
+Description: SCMI Raw synchronous message injection/snooping facility; write
+ a complete SCMI synchronous command message (header included)
+ in little-endian binary format to have it sent to the configured
+ backend SCMI server.
+ Any subsequently received response can be read from this same
+ entry if it arrived within the configured timeout.
+ Each write to the entry causes one command request to be built
+ and sent while the replies are read back one message at a time
+ (receiving an EOF at each message boundary).
+Users: Debugging, any userspace test suite
+
+What: /sys/kernel/debug/scmi_raw/message_async
+Date: December 2022
+KernelVersion: 6.1
+Contact: [email protected]
+Description: SCMI Raw asynchronous message injection/snooping facility; write
+ a complete SCMI asynchronous command message (header included)
+ in little-endian binary format to have it sent to the configured
+ backend SCMI server.
+ Any subsequently received response can be read from this same
+ entry if it arrived within the configured timeout.
+ Any additional delayed response received afterwards can be read
+ from this same entry too if it arrived within the configured
+ timeout.
+ Each write to the entry causes one command request to be built
+ and sent while the replies are read back one message at a time
+ (receiving an EOF at each message boundary).
+Users: Debugging, any userspace test suite
+
+What: /sys/kernel/debug/scmi_raw/errors
+Date: December 2022
+KernelVersion: 6.1
+Contact: [email protected]
+Description: SCMI Raw message errors facility; any kind of timed-out or
+ generally unexpectedly received SCMI message can be read from
+ this entry.
+ Each read gives back one message at a time (receiving an EOF at
+ each message boundary).
+Users: Debugging, any userspace test suite
+
+What: /sys/kernel/debug/scmi_raw/notification
+Date: December 2022
+KernelVersion: 6.1
+Contact: [email protected]
+Description: SCMI Raw notification snooping facility; any notification
+ emitted by the backend SCMI server can be read from this entry.
+ Each read gives back one message at a time (receiving an EOF at
+ each message boundary).
+Users: Debugging, any userspace test suite
+
+What: /sys/kernel/debug/scmi_raw/reset
+Date: December 2022
+KernelVersion: 6.1
+Contact: [email protected]
+Description: SCMI Raw stack reset facility; writing a value to this entry
+ causes the internal queues of any kind of received message,
+ still pending to be read out, to be flushed.
+ Can be used to reset and clean the SCMI Raw stack between two
+ different test-runs.
+Users: Debugging, any userspace test suite
--
2.34.1

2022-10-19 21:45:34

by Cristian Marussi

Subject: [PATCH v4 01/11] firmware: arm_scmi: Refactor xfer in-flight registration routines

Move the whole xfer in-flight registration process out of scmi_xfer_get
and, while at it, split the sequence number selection steps from the
in-flight registration procedure itself.

No functional change.

Signed-off-by: Cristian Marussi <[email protected]>
---
drivers/firmware/arm_scmi/driver.c | 102 +++++++++++++++++++----------
1 file changed, 68 insertions(+), 34 deletions(-)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 609ebedee9cb..e5193da2ce09 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -311,8 +311,6 @@ static int scmi_xfer_token_set(struct scmi_xfers_info *minfo,
if (xfer_id != next_token)
atomic_add((int)(xfer_id - next_token), &transfer_last_id);

- /* Set in-flight */
- set_bit(xfer_id, minfo->xfer_alloc_table);
xfer->hdr.seq = (u16)xfer_id;

return 0;
@@ -330,33 +328,77 @@ static inline void scmi_xfer_token_clear(struct scmi_xfers_info *minfo,
clear_bit(xfer->hdr.seq, minfo->xfer_alloc_table);
}

+/**
+ * scmi_xfer_inflight_register_unlocked - Register the xfer as in-flight
+ *
+ * @xfer: The xfer to register
+ * @minfo: Pointer to Tx/Rx Message management info based on channel type
+ *
+ * Note that this helper assumes that the xfer to be registered as in-flight
+ * had been built using an xfer sequence number which still corresponds to a
+ * free slot in the xfer_alloc_table.
+ *
+ * Context: Assumes to be called with @xfer_lock already acquired.
+ */
+static inline void
+scmi_xfer_inflight_register_unlocked(struct scmi_xfer *xfer,
+ struct scmi_xfers_info *minfo)
+{
+ /* Set in-flight */
+ set_bit(xfer->hdr.seq, minfo->xfer_alloc_table);
+ hash_add(minfo->pending_xfers, &xfer->node, xfer->hdr.seq);
+ xfer->pending = true;
+}
+
+/**
+ * scmi_xfer_pending_set - Pick a proper sequence number and mark the xfer
+ * as pending in-flight
+ *
+ * @xfer: The xfer to act upon
+ * @minfo: Pointer to Tx/Rx Message management info based on channel type
+ *
+ * Return: 0 on Success or error otherwise
+ */
+static inline int scmi_xfer_pending_set(struct scmi_xfer *xfer,
+ struct scmi_xfers_info *minfo)
+{
+ int ret;
+ unsigned long flags;
+
+ spin_lock_irqsave(&minfo->xfer_lock, flags);
+ /* Set a new monotonic token as the xfer sequence number */
+ ret = scmi_xfer_token_set(minfo, xfer);
+ if (!ret)
+ scmi_xfer_inflight_register_unlocked(xfer, minfo);
+ spin_unlock_irqrestore(&minfo->xfer_lock, flags);
+
+ return ret;
+}
+
/**
* scmi_xfer_get() - Allocate one message
*
* @handle: Pointer to SCMI entity handle
* @minfo: Pointer to Tx/Rx Message management info based on channel type
- * @set_pending: If true a monotonic token is picked and the xfer is added to
- * the pending hash table.
*
* Helper function which is used by various message functions that are
* exposed to clients of this driver for allocating a message traffic event.
*
- * Picks an xfer from the free list @free_xfers (if any available) and, if
- * required, sets a monotonically increasing token and stores the inflight xfer
- * into the @pending_xfers hashtable for later retrieval.
+ * Picks an xfer from the free list @free_xfers (if any available) and performs
+ * a basic initialization.
+ *
+ * Note that, at this point, still no sequence number is assigned to the
+ * allocated xfer, nor is it registered as a pending transaction.
*
* The successfully initialized xfer is refcounted.
*
- * Context: Holds @xfer_lock while manipulating @xfer_alloc_table and
- * @free_xfers.
+ * Context: Holds @xfer_lock while manipulating @free_xfers.
*
- * Return: 0 if all went fine, else corresponding error.
+ * Return: An initialized xfer if all went fine, else pointer error.
*/
static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
- struct scmi_xfers_info *minfo,
- bool set_pending)
+ struct scmi_xfers_info *minfo)
{
- int ret;
unsigned long flags;
struct scmi_xfer *xfer;

@@ -376,25 +418,8 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
*/
xfer->transfer_id = atomic_inc_return(&transfer_last_id);

- if (set_pending) {
- /* Pick and set monotonic token */
- ret = scmi_xfer_token_set(minfo, xfer);
- if (!ret) {
- hash_add(minfo->pending_xfers, &xfer->node,
- xfer->hdr.seq);
- xfer->pending = true;
- } else {
- dev_err(handle->dev,
- "Failed to get monotonic token %d\n", ret);
- hlist_add_head(&xfer->node, &minfo->free_xfers);
- xfer = ERR_PTR(ret);
- }
- }
-
- if (!IS_ERR(xfer)) {
- refcount_set(&xfer->users, 1);
- atomic_set(&xfer->busy, SCMI_XFER_FREE);
- }
+ refcount_set(&xfer->users, 1);
+ atomic_set(&xfer->busy, SCMI_XFER_FREE);
spin_unlock_irqrestore(&minfo->xfer_lock, flags);

return xfer;
@@ -652,7 +677,7 @@ static void scmi_handle_notification(struct scmi_chan_info *cinfo,
ktime_t ts;

ts = ktime_get_boottime();
- xfer = scmi_xfer_get(cinfo->handle, minfo, false);
+ xfer = scmi_xfer_get(cinfo->handle, minfo);
if (IS_ERR(xfer)) {
dev_err(dev, "failed to get free message slot (%ld)\n",
PTR_ERR(xfer));
@@ -1041,13 +1066,22 @@ static int xfer_get_init(const struct scmi_protocol_handle *ph,
tx_size > info->desc->max_msg_size)
return -ERANGE;

- xfer = scmi_xfer_get(pi->handle, minfo, true);
+ xfer = scmi_xfer_get(pi->handle, minfo);
if (IS_ERR(xfer)) {
ret = PTR_ERR(xfer);
dev_err(dev, "failed to get free message slot(%d)\n", ret);
return ret;
}

+ /* Pick a sequence number and register this xfer as in-flight */
+ ret = scmi_xfer_pending_set(xfer, minfo);
+ if (ret) {
+ dev_err(pi->handle->dev,
+ "Failed to get monotonic token %d\n", ret);
+ __scmi_xfer_put(minfo, xfer);
+ return ret;
+ }
+
xfer->tx.len = tx_size;
xfer->rx.len = rx_size ? : info->desc->max_msg_size;
xfer->hdr.type = MSG_TYPE_COMMAND;
--
2.34.1

2022-10-19 21:46:08

by Cristian Marussi

Subject: [PATCH v4 03/11] firmware: arm_scmi: Use dedicated devices to initialize channels

Refactor channels initialization to use dedicated devices instead of using
devices borrowed from the SCMI drivers.

Initialize all channels as described in the DT upfront.

Signed-off-by: Cristian Marussi <[email protected]>
---
v3 --> v4
- fix missing devm_kfree on failpath in scmi_chan_setup
---
drivers/firmware/arm_scmi/driver.c | 96 ++++++++++++++++++++++--------
1 file changed, 72 insertions(+), 24 deletions(-)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 62e02b6475ff..032d1140d631 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -2019,23 +2019,20 @@ static int scmi_xfer_info_init(struct scmi_info *sinfo)
return ret;
}

-static int scmi_chan_setup(struct scmi_info *info, struct device *dev,
+static int scmi_chan_setup(struct scmi_info *info, struct device_node *of_node,
int prot_id, bool tx)
{
int ret, idx;
+ char name[32];
struct scmi_chan_info *cinfo;
struct idr *idr;
+ struct scmi_device *tdev = NULL;

/* Transmit channel is first entry i.e. index 0 */
idx = tx ? 0 : 1;
idr = tx ? &info->tx_idr : &info->rx_idr;

- /* check if already allocated, used for multiple device per protocol */
- cinfo = idr_find(idr, prot_id);
- if (cinfo)
- return 0;
-
- if (!info->desc->ops->chan_available(dev->of_node, idx)) {
+ if (!info->desc->ops->chan_available(of_node, idx)) {
cinfo = idr_find(idr, SCMI_PROTOCOL_BASE);
if (unlikely(!cinfo)) /* Possible only if platform has no Rx */
return -EINVAL;
@@ -2046,26 +2043,43 @@ static int scmi_chan_setup(struct scmi_info *info, struct device *dev,
if (!cinfo)
return -ENOMEM;

- cinfo->dev = dev;
+ /* Create a unique name for this transport device */
+ snprintf(name, 32, "__scmi_transport_device_%s_%02X",
+ idx ? "rx" : "tx", prot_id);
+ /* Create a uniquely named, dedicated transport device for this chan */
+ tdev = scmi_device_create(of_node, info->dev, prot_id, name);
+ if (!tdev) {
+ devm_kfree(info->dev, cinfo);
+ return -EINVAL;
+ }

+ cinfo->dev = &tdev->dev;
ret = info->desc->ops->chan_setup(cinfo, info->dev, tx);
- if (ret)
+ if (ret) {
+ scmi_device_destroy(tdev);
+ devm_kfree(info->dev, cinfo);
return ret;
+ }

if (tx && is_polling_required(cinfo, info)) {
if (is_transport_polling_capable(info))
- dev_info(dev,
+ dev_info(&tdev->dev,
"Enabled polling mode TX channel - prot_id:%d\n",
prot_id);
else
- dev_warn(dev,
+ dev_warn(&tdev->dev,
"Polling mode NOT supported by transport.\n");
}

idr_alloc:
ret = idr_alloc(idr, cinfo, prot_id, prot_id + 1, GFP_KERNEL);
if (ret != prot_id) {
- dev_err(dev, "unable to allocate SCMI idr slot err %d\n", ret);
+ dev_err(info->dev,
+ "unable to allocate SCMI idr slot err %d\n", ret);
+ if (tdev) {
+ scmi_device_destroy(tdev);
+ devm_kfree(info->dev, cinfo);
+ }
return ret;
}

@@ -2074,16 +2088,57 @@ static int scmi_chan_setup(struct scmi_info *info, struct device *dev,
}

static inline int
-scmi_txrx_setup(struct scmi_info *info, struct device *dev, int prot_id)
+scmi_txrx_setup(struct scmi_info *info, struct device_node *of_node,
+ int prot_id)
{
- int ret = scmi_chan_setup(info, dev, prot_id, true);
+ int ret = scmi_chan_setup(info, of_node, prot_id, true);

if (!ret) /* Rx is optional, hence no error check */
- scmi_chan_setup(info, dev, prot_id, false);
+ scmi_chan_setup(info, of_node, prot_id, false);

return ret;
}

+/**
+ * scmi_channels_setup - Helper to initialize all required channels
+ *
+ * @info: The SCMI instance descriptor.
+ *
+ * Initialize all the channels found described in the DT against the underlying
+ * configured transport using custom defined dedicated devices instead of
+ * borrowing devices from the SCMI drivers; this way channels are initialized
+ * upfront during core SCMI stack probing and are operational even if then no
+ * SCMI driver is loaded. (useful to operate in Raw mode)
+ *
+ * Return: 0 on Success
+ */
+static int scmi_channels_setup(struct scmi_info *info)
+{
+ int ret;
+ struct device_node *child, *top_np = info->dev->of_node;
+
+ ret = scmi_txrx_setup(info, top_np, SCMI_PROTOCOL_BASE);
+ if (ret)
+ return ret;
+
+ for_each_available_child_of_node(top_np, child) {
+ u32 prot_id;
+
+ if (of_property_read_u32(child, "reg", &prot_id))
+ continue;
+
+ if (!FIELD_FIT(MSG_PROTOCOL_ID_MASK, prot_id))
+ dev_err(info->dev,
+ "Out of range protocol %d\n", prot_id);
+
+ ret = scmi_txrx_setup(info, child, prot_id);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
/**
* scmi_get_protocol_device - Helper to get/create an SCMI device.
*
@@ -2133,14 +2188,6 @@ scmi_get_protocol_device(struct device_node *np, struct scmi_info *info,
return NULL;
}

- if (scmi_txrx_setup(info, &sdev->dev, prot_id)) {
- dev_err(&sdev->dev, "failed to setup transport\n");
- scmi_device_destroy(sdev);
- mutex_unlock(&scmi_syspower_mtx);
-
- return NULL;
- }
-
if (prot_id == SCMI_PROTOCOL_SYSTEM)
scmi_syspower_registered = true;

@@ -2432,7 +2479,8 @@ static int scmi_probe(struct platform_device *pdev)
return ret;
}

- ret = scmi_txrx_setup(info, dev, SCMI_PROTOCOL_BASE);
+ /* Setup all channels described in the DT at first */
+ ret = scmi_channels_setup(info);
if (ret)
return ret;

--
2.34.1
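
A quick way to convince oneself that the 32-byte name buffer used in
scmi_chan_setup() above is large enough is to reproduce the same format
string in userspace. This is only a standalone sketch: the wrapper name is
hypothetical and the check holds for protocol ids that fit in two hex
digits, in which case the name takes 29 characters plus the terminating
NUL:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace wrapper around the format string used by the
 * patch to name the dedicated transport devices.
 * Returns the length the full name needs, excluding the trailing NUL,
 * as reported by snprintf().
 */
static int scmi_tdev_name(char *name, size_t len, bool tx,
			  unsigned int prot_id)
{
	return snprintf(name, len, "__scmi_transport_device_%s_%02X",
			tx ? "tx" : "rx", prot_id);
}
```

Comparing the returned length against the buffer size tells whether the
name would have been truncated.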

2022-10-19 21:47:13

by Cristian Marussi

Subject: [PATCH v4 05/11] firmware: arm_scmi: Refactor scmi_wait_for_message_response

Refactor scmi_wait_for_message_response() to use an internal helper to
carry out its main duties; while doing that, make the helper directly
accept an scmi_desc parameter to interact with the configured transport.

No functional change.

Signed-off-by: Cristian Marussi <[email protected]>
---
drivers/firmware/arm_scmi/driver.c | 57 +++++++++++++++++-------------
1 file changed, 33 insertions(+), 24 deletions(-)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 9c77c931d91b..d496dfe43618 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -791,36 +791,18 @@ static bool scmi_xfer_done_no_timeout(struct scmi_chan_info *cinfo,
ktime_after(ktime_get(), stop);
}

-/**
- * scmi_wait_for_message_response - An helper to group all the possible ways of
- * waiting for a synchronous message response.
- *
- * @cinfo: SCMI channel info
- * @xfer: Reference to the transfer being waited for.
- *
- * Chooses waiting strategy (sleep-waiting vs busy-waiting) depending on
- * configuration flags like xfer->hdr.poll_completion.
- *
- * Return: 0 on Success, error otherwise.
- */
-static int scmi_wait_for_message_response(struct scmi_chan_info *cinfo,
- struct scmi_xfer *xfer)
+static int scmi_wait_for_reply(struct device *dev, const struct scmi_desc *desc,
+ struct scmi_chan_info *cinfo,
+ struct scmi_xfer *xfer, unsigned int timeout_ms)
{
- struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
- struct device *dev = info->dev;
- int ret = 0, timeout_ms = info->desc->max_rx_timeout_ms;
-
- trace_scmi_xfer_response_wait(xfer->transfer_id, xfer->hdr.id,
- xfer->hdr.protocol_id, xfer->hdr.seq,
- timeout_ms,
- xfer->hdr.poll_completion);
+ int ret = 0;

if (xfer->hdr.poll_completion) {
/*
* Real polling is needed only if transport has NOT declared
* itself to support synchronous commands replies.
*/
- if (!info->desc->sync_cmds_completed_on_ret) {
+ if (!desc->sync_cmds_completed_on_ret) {
/*
* Poll on xfer using transport provided .poll_done();
* assumes no completion interrupt was available.
@@ -846,7 +828,7 @@ static int scmi_wait_for_message_response(struct scmi_chan_info *cinfo,
*/
spin_lock_irqsave(&xfer->lock, flags);
if (xfer->state == SCMI_XFER_SENT_OK) {
- info->desc->ops->fetch_response(cinfo, xfer);
+ desc->ops->fetch_response(cinfo, xfer);
xfer->state = SCMI_XFER_RESP_OK;
}
spin_unlock_irqrestore(&xfer->lock, flags);
@@ -870,6 +852,33 @@ static int scmi_wait_for_message_response(struct scmi_chan_info *cinfo,
return ret;
}

+/**
+ * scmi_wait_for_message_response - An helper to group all the possible ways of
+ * waiting for a synchronous message response.
+ *
+ * @cinfo: SCMI channel info
+ * @xfer: Reference to the transfer being waited for.
+ *
+ * Chooses waiting strategy (sleep-waiting vs busy-waiting) depending on
+ * configuration flags like xfer->hdr.poll_completion.
+ *
+ * Return: 0 on Success, error otherwise.
+ */
+static int scmi_wait_for_message_response(struct scmi_chan_info *cinfo,
+ struct scmi_xfer *xfer)
+{
+ struct scmi_info *info = handle_to_scmi_info(cinfo->handle);
+ struct device *dev = info->dev;
+
+ trace_scmi_xfer_response_wait(xfer->transfer_id, xfer->hdr.id,
+ xfer->hdr.protocol_id, xfer->hdr.seq,
+ info->desc->max_rx_timeout_ms,
+ xfer->hdr.poll_completion);
+
+ return scmi_wait_for_reply(dev, info->desc, cinfo, xfer,
+ info->desc->max_rx_timeout_ms);
+}
+
/**
* do_xfer() - Do one transfer
*
--
2.34.1

2022-10-19 21:48:05

by Cristian Marussi

Subject: [PATCH v4 04/11] firmware: arm_scmi: Refactor polling helpers

Refactor polling helpers to receive scmi_desc directly as a parameter and
move all of them to common.h.

No functional change.

Signed-off-by: Cristian Marussi <[email protected]>
---
drivers/firmware/arm_scmi/common.h | 18 ++++++++++++++++
drivers/firmware/arm_scmi/driver.c | 34 ++++++++----------------------
2 files changed, 27 insertions(+), 25 deletions(-)

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index 096b66442d84..30d056febef1 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -212,6 +212,24 @@ struct scmi_desc {
const bool atomic_enabled;
};

+static inline bool is_polling_required(struct scmi_chan_info *cinfo,
+ const struct scmi_desc *desc)
+{
+ return cinfo->no_completion_irq || desc->force_polling;
+}
+
+static inline bool is_transport_polling_capable(const struct scmi_desc *desc)
+{
+ return desc->ops->poll_done || desc->sync_cmds_completed_on_ret;
+}
+
+static inline bool is_polling_enabled(struct scmi_chan_info *cinfo,
+ const struct scmi_desc *desc)
+{
+ return is_polling_required(cinfo, desc) &&
+ is_transport_polling_capable(desc);
+}
+
#ifdef CONFIG_ARM_SCMI_TRANSPORT_MAILBOX
extern const struct scmi_desc scmi_mailbox_desc;
#endif
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 032d1140d631..9c77c931d91b 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -648,25 +648,6 @@ static inline void scmi_clear_channel(struct scmi_info *info,
info->desc->ops->clear_channel(cinfo);
}

-static inline bool is_polling_required(struct scmi_chan_info *cinfo,
- struct scmi_info *info)
-{
- return cinfo->no_completion_irq || info->desc->force_polling;
-}
-
-static inline bool is_transport_polling_capable(struct scmi_info *info)
-{
- return info->desc->ops->poll_done ||
- info->desc->sync_cmds_completed_on_ret;
-}
-
-static inline bool is_polling_enabled(struct scmi_chan_info *cinfo,
- struct scmi_info *info)
-{
- return is_polling_required(cinfo, info) &&
- is_transport_polling_capable(info);
-}
-
static void scmi_handle_notification(struct scmi_chan_info *cinfo,
u32 msg_hdr, void *priv)
{
@@ -909,7 +890,8 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
struct scmi_chan_info *cinfo;

/* Check for polling request on custom command xfers at first */
- if (xfer->hdr.poll_completion && !is_transport_polling_capable(info)) {
+ if (xfer->hdr.poll_completion &&
+ !is_transport_polling_capable(info->desc)) {
dev_warn_once(dev,
"Polling mode is not supported by transport.\n");
return -EINVAL;
@@ -920,7 +902,7 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
return -EINVAL;

/* True ONLY if also supported by transport. */
- if (is_polling_enabled(cinfo, info))
+ if (is_polling_enabled(cinfo, info->desc))
xfer->hdr.poll_completion = true;

/*
@@ -1854,7 +1836,8 @@ static bool scmi_is_transport_atomic(const struct scmi_handle *handle,
bool ret;
struct scmi_info *info = handle_to_scmi_info(handle);

- ret = info->desc->atomic_enabled && is_transport_polling_capable(info);
+ ret = info->desc->atomic_enabled &&
+ is_transport_polling_capable(info->desc);
if (ret && atomic_threshold)
*atomic_threshold = info->atomic_threshold;

@@ -2061,8 +2044,8 @@ static int scmi_chan_setup(struct scmi_info *info, struct device_node *of_node,
return ret;
}

- if (tx && is_polling_required(cinfo, info)) {
- if (is_transport_polling_capable(info))
+ if (tx && is_polling_required(cinfo, info->desc)) {
+ if (is_transport_polling_capable(info->desc))
dev_info(&tdev->dev,
"Enabled polling mode TX channel - prot_id:%d\n",
prot_id);
@@ -2491,7 +2474,8 @@ static int scmi_probe(struct platform_device *pdev)
if (scmi_notification_init(handle))
dev_err(dev, "SCMI Notifications NOT available.\n");

- if (info->desc->atomic_enabled && !is_transport_polling_capable(info))
+ if (info->desc->atomic_enabled &&
+ !is_transport_polling_capable(info->desc))
dev_err(dev,
"Transport is not polling capable. Atomic mode not supported.\n");

--
2.34.1

2022-10-19 21:53:10

by Cristian Marussi

Subject: [PATCH v4 10/11] firmware: arm_scmi: Reject SCMI drivers while in Raw mode

Reject SCMI driver registration when SCMI Raw mode support is configured,
so as to avoid interferences between the SCMI Raw mode transactions and the
normal SCMI stack operations.

Signed-off-by: Cristian Marussi <[email protected]>
---
drivers/firmware/arm_scmi/driver.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index b504b5cdc55f..3e5539987443 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -2365,6 +2365,12 @@ int scmi_protocol_device_request(const struct scmi_device_id *id_table)
pr_debug("Requesting SCMI device (%s) for protocol %x\n",
id_table->name, id_table->protocol_id);

+ if (IS_ENABLED(CONFIG_ARM_SCMI_RAW_MODE_SUPPORT)) {
+ pr_warn("SCMI Raw mode active. Rejecting '%s'/0x%02X\n",
+ id_table->name, id_table->protocol_id);
+ return -EINVAL;
+ }
+
/*
* Search for the matching protocol rdev list and then search
* of any existent equally named device...fails if any duplicate found.
--
2.34.1

2022-10-28 15:02:02

by Florian Fainelli

Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

Hi Cristian,

On 10/19/2022 1:46 PM, Cristian Marussi wrote:
> Hi all,
>
> This series aims to introduce a new SCMI unified userspace interface meant
> to ease testing an SCMI Server implementation for compliance, fuzzing etc.,
> from the perspective of the OSPM agent (non-secure world only ...)
>
> It is proposed as a testing/development facility, it is NOT meant to be a
> feature to use in production, but only enabled in Kconfig for test
> deployments.
>
> Currently an SCMI Compliance Suite like the one at [1] can only work by
> injecting SCMI messages at the SCMI transport layer using the mailbox test
> driver (CONFIG_MAILBOX_TEST) via its few debugfs entries and looking at
> the related replies from the SCMI backend Server.

I plan on giving this a try on our systems later today and will let you
know the outcome. This is very useful for making sure the SCMI
implementation is both correct and properly hardened.
--
Florian

2022-10-28 15:06:52

by Vincent Guittot

Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

On Wed, 19 Oct 2022 at 22:46, Cristian Marussi <[email protected]> wrote:
>
> Hi all,
>
> This series aims to introduce a new SCMI unified userspace interface meant
> to ease testing an SCMI Server implementation for compliance, fuzzing etc.,
> from the perspective of the OSPM agent (non-secure world only ...)
>
> It is proposed as a testing/development facility, it is NOT meant to be a
> feature to use in production, but only enabled in Kconfig for test
> deployments.
>
> Currently an SCMI Compliance Suite like the one at [1] can only work by
> injecting SCMI messages at the SCMI transport layer using the mailbox test
> driver (CONFIG_MAILBOX_TEST) via its few debugfs entries and looking at
> the related replies from the SCMI backend Server.
>
> This approach has a few drawbacks:
>
> - the SCMI Server under test MUST be reachable through a mailbox based
> SCMI transport: any other SCMI Server placement is not possible (like in
> a VM reachable via SCMI Virtio). In order to cover other placements in
> the current scenario we should write some sort of test driver for each
> and every existent SCMI transport and for any future additional transport
> ...this clearly does not scale.
>
> - even in the mailbox case the userspace Compliance suite cannot simply
> send and receive bare SCMI messages BUT it has to properly lay them out
> into the shared memory exposed by the mailbox test driver as expected by
> the transport definitions. In other words such a userspace test
> application has to, not only use a proper transport driver for the system
> at hand, but it also has to have a comprehensive knowledge of the
> internals of the underlying transport in order to operate.
>
> - last but not least, the system under test has to be specifically
> configured and built, in terms of Kconfig and DT, to perform such kind of
> testing, it cannot be used for anything else, which is unfortunate for
> CI/CD deployments.
>
> This series introduces a new SCMI Raw mode support feature that, when
> configured and enabled exposes a new interface in debugfs through which:
>
> - a userspace application can inject bare SCMI binary messages into the
> SCMI core stack; such messages will be routed by the SCMI regular kernel
> stack to the backend Server using the currently configured transport
> transparently: in other words you can test the SCMI server, no matter
> where it is placed, as long as it is reachable from the currently
> configured SCMI stack.
> Same goes the other way around on the reading path: any SCMI server reply
> can be read as a bare SCMI binary message from the same debugfs path.
>
> - as a direct consequence of this way of injecting bare messages in the
> middle of the SCMI stack (instead of beneath it at the transport layer)
> the user application has to handle only bare SCMI messages without having
> to worry about the specific underlying transport internals that will be
> taken care of by the SCMI core stack itself using its own machinery,
> without duplicating such logic.
>
> - a system under test, once configured with SCMI Raw support enabled in
> Kconfig, can be booted without any particular DT change.
>
> In V2 the runtime enable/disable switching capability has been removed
> (for now) since still not deemed to be stable/reliable enough: as a
> consequence when SCMI Raw support is compiled in, the regular SCMI stack
> drivers are now inhibited permanently for that Kernel.
>
> In V4 support has been added for transports lacking a completion_irq
> or forcibly configured in polled mode.
>
> A quick and trivial example from the shell...reading from a sensor
> injecting a properly crafted packet in raw mode:
>
> # INJECT THE SENSOR_READING MESSAGE FOR SENSOR ID=1 (binary little endian)
> root@deb-buster-arm64:~# echo -e -n \\x06\\x54\\x00\\x00\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00 > /sys/kernel/debug/scmi_raw/message
>
> # READING BACK THE REPLY...
> root@deb-buster-arm64:~# cat /sys/kernel/debug/scmi_raw/message | od --endian=little -t x4
> 0000000 00005406 00000000 00000335 00000000
> 0000020
>
> while doing that, since Raw mode makes (partial) use of the regular SCMI
> stack, you can observe the messages going through the SCMI stack with the
> usual traces:
>
> bash-329 [000] ..... 14183.446808: scmi_msg_dump: pt=15 t=CMND msg_id=06 seq=0000 s=0 pyld=0100000000000000
> irq/35-mhu_db_l-81 [000] ..... 14183.447809: scmi_msg_dump: pt=15 t=RESP msg_id=06 seq=0000 s=0 pyld=3503000000000000
>
>
> ..trying to read in async when the backend server does NOT support asyncs:
>
> # AN ASYNC SENSOR READING REQUEST...
> root@deb-buster-arm64:~# echo -e -n \\x06\\x54\\x00\\x00\\x01\\x00\\x00\\x00\\x01\\x00\\x00\\x00 > /sys/kernel/debug/scmi_raw/message_async
>
> bash-329 [000] ..... 16415.938739: scmi_msg_dump: pt=15 t=CMND msg_id=06 seq=0000 s=0 pyld=0100000001000000
> irq/35-mhu_db_l-81 [000] ..... 16415.944129: scmi_msg_dump: pt=15 t=RESP msg_id=06 seq=0000 s=-1 pyld=
>
> # RETURNS A STATUS -1 FROM THE SERVER NOT SUPPORTING IT
> root@deb-buster-arm64:~# cat /sys/kernel/debug/scmi_raw/message | od --endian=little -t x4
> 0000000 00005406 ffffffff
> 0000010
>
> Note that this was on a JUNO, BUT exactly the same steps can be used to
> reach an SCMI Server living on a VM reachable via virtio as long as the
> system under test is properly configured to work with a virtio transport.
>
> In a nutshell the exposed API is as follows:
>
> /sys/kernel/debug/scmi_raw/
> ├── errors
> ├── message
> ├── message_async
> ├── notification
> ├── reset
> ├── transport_max_msg_size
> ├── transport_rx_timeout_ms
> └── transport_tx_max_msg
>
> where:
>
> - message*: used to send sync/async commands and read back immediate and
> delayed responses (if any)
> - errors: used to report timeout and unexpected replies
> - reset: used to reset the SCMI Raw stack, flushing all queues from
> received messages still pending to be read out (useful to be sure to
> cleanup between test suite runs...)
> - notification: used to read any notification being spit by the system
> (if previously enabled by the user app)
> - transport*: a bunch of configuration useful to setup the user
> application expectations in terms of timeouts and message
> characteristics.
>
> Each write corresponds to one command request and the replies or delayed
> responses are read back one message at a time (receiving an EOF at each
> message boundary).
>
> The user application running the test is in charge of handling timeouts
> and properly choosing SCMI sequence numbers for the outgoing requests: note
> that the same fixed number can be re-used (...though discouraged...) as
> long as the suite does NOT expect to send multiple in-flight commands
> concurrently.
>
> Since the SCMI core regular stack is partially used to deliver and collect
> the messages, late replies after timeouts and any other sort of unexpected
> message sent by the SCMI server platform back can be identified by the SCMI
> core as usual and it will be reported under /errors for later analysis.
> (a userspace test-app will have anyway properly detected the timeout on
> /message* ...)
>
> All of the above has been roughly tested against a standard JUNO SCP SCMI
> Server (mailbox trans) and an emulated SCMI Server living in a VM (virtio
> trans) using a custom experimental version of the scmi-tests Compliance
> suite patched to support Raw mode and posted at [2]. (still in development
> ...merge requests are in progress...for now it is just a means for me to
> test the testing API ... O_o)
>
> The series is based on v6.1-rc1.
>
> Having said that (in such a concise and brief way :P) ...
>
> ...any feedback/comments are welcome !

Hi Cristian,

I have tested your series with an OP-TEE message transport layer and
was able to send raw messages to the SCMI server PTA.

FWIW

Tested-by: Vincent Guittot <[email protected]>

>
> Thanks,
> Cristian
>
> ---
> V3 --> v4
> - rebased on v6.1-rc1
> - added missing support for 'polled' transports and transports lacking a
> completion_irq (like smc/optee)
> - removed a few inlines
> - refactored SCMI Raw RX patch to make use more extensively of the regular
> non-Raw RX path
> - fix handling of O_NONBLOCK raw_mode read requests
>
> v2 --> v3
> - fixed some sparse warning on LE and __poll_t
> - reworked and simplified deferred worker in charge of xfer delayed waiting
> - allow for injection of DT-unknown protocols messages when in Raw mode
> (needed for any kind of fuzzing...)
>
> v1 --> v2
> - added comments and debugfs docs
> - added dedicated transport devices for channels initialization
> - better channels handling in Raw mode
> - removed runtime enable, moved to static compile time exclusion
> of SCMI regular stack
>
> [1]: https://gitlab.arm.com/tests/scmi-tests
> [2]: https://gitlab.arm.com/tests/scmi-tests/-/commits/raw_mode_support_devel/
>
>
> Cristian Marussi (11):
> firmware: arm_scmi: Refactor xfer in-flight registration routines
> firmware: arm_scmi: Simplify chan_available transport operation
> firmware: arm_scmi: Use dedicated devices to initialize channels
> firmware: arm_scmi: Refactor polling helpers
> firmware: arm_scmi: Refactor scmi_wait_for_message_response
> firmware: arm_scmi: Add xfer raw helpers
> firmware: arm_scmi: Move errors defs and code to common.h
> firmware: arm_scmi: Add raw transmission support
> firmware: arm_scmi: Add debugfs ABI documentation for Raw mode
> firmware: arm_scmi: Reject SCMI drivers while in Raw mode
> firmware: arm_scmi: Call Raw mode hooks from the core stack
>
> Documentation/ABI/testing/debugfs-scmi-raw | 88 ++
> drivers/firmware/arm_scmi/Kconfig | 13 +
> drivers/firmware/arm_scmi/Makefile | 1 +
> drivers/firmware/arm_scmi/common.h | 72 +-
> drivers/firmware/arm_scmi/driver.c | 521 +++++---
> drivers/firmware/arm_scmi/mailbox.c | 4 +-
> drivers/firmware/arm_scmi/optee.c | 4 +-
> drivers/firmware/arm_scmi/raw_mode.c | 1244 ++++++++++++++++++++
> drivers/firmware/arm_scmi/raw_mode.h | 29 +
> drivers/firmware/arm_scmi/smc.c | 4 +-
> drivers/firmware/arm_scmi/virtio.c | 2 +-
> 11 files changed, 1827 insertions(+), 155 deletions(-)
> create mode 100644 Documentation/ABI/testing/debugfs-scmi-raw
> create mode 100644 drivers/firmware/arm_scmi/raw_mode.c
> create mode 100644 drivers/firmware/arm_scmi/raw_mode.h
>
> --
> 2.34.1
>

2022-10-28 15:26:01

by Cristian Marussi

Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

On Fri, Oct 28, 2022 at 04:40:02PM +0200, Vincent Guittot wrote:
> On Wed, 19 Oct 2022 at 22:46, Cristian Marussi <[email protected]> wrote:
> >
> > Hi all,
> >

Hi Vincent,

> > This series aims to introduce a new SCMI unified userspace interface meant
> > to ease testing an SCMI Server implementation for compliance, fuzzing etc.,
> > from the perspective of the OSPM agent (non-secure world only ...)
> >
> > It is proposed as a testing/development facility, it is NOT meant to be a
> > feature to use in production, but only enabled in Kconfig for test
> > deployments.
> >
> > Currently an SCMI Compliance Suite like the one at [1] can only work by
> > injecting SCMI messages at the SCMI transport layer using the mailbox test
> > driver (CONFIG_MAILBOX_TEST) via its few debugfs entries and looking at
> > the related replies from the SCMI backend Server.
> >
> > This approach has a few drawbacks:
> >
> > - the SCMI Server under test MUST be reachable through a mailbox based
> > SCMI transport: any other SCMI Server placement is not possible (like in
> > a VM reachable via SCMI Virtio). In order to cover other placements in
> > the current scenario we should write some sort of test driver for each
> > and every existent SCMI transport and for any future additional transport
> > ...this clearly does not scale.
> >
> > - even in the mailbox case the userspace Compliance suite cannot simply
> > send and receive bare SCMI messages BUT it has to properly lay them out
> > into the shared memory exposed by the mailbox test driver as expected by
> > the transport definitions. In other words such a userspace test
> > application has to, not only use a proper transport driver for the system
> > at hand, but it also has to have a comprehensive knowledge of the
> > internals of the underlying transport in order to operate.
> >
> > - last but not least, the system under test has to be specifically
> > configured and built, in terms of Kconfig and DT, to perform such kind of
> > testing, it cannot be used for anything else, which is unfortunate for
> > CI/CD deployments.
> >
> > This series introduces a new SCMI Raw mode support feature that, when
> > configured and enabled exposes a new interface in debugfs through which:
> >
> > - a userspace application can inject bare SCMI binary messages into the
> > SCMI core stack; such messages will be routed by the SCMI regular kernel
> > stack to the backend Server using the currently configured transport
> > transparently: in other words you can test the SCMI server, no matter
> > where it is placed, as long as it is reachable from the currently
> > configured SCMI stack.
> > Same goes the other way around on the reading path: any SCMI server reply
> > can be read as a bare SCMI binary message from the same debugfs path.
> >
> > - as a direct consequence of this way of injecting bare messages in the
> > middle of the SCMI stack (instead of beneath it at the transport layer)
> > the user application has to handle only bare SCMI messages without having
> > to worry about the specific underlying transport internals that will be
> > taken care of by the SCMI core stack itself using its own machinery,
> > without duplicating such logic.
> >
> > - a system under test, once configured with SCMI Raw support enabled in
> > Kconfig, can be booted without any particular DT change.
> >
> > In V2 the runtime enable/disable switching capability has been removed
> > (for now) since still not deemed to be stable/reliable enough: as a
> > consequence when SCMI Raw support is compiled in, the regular SCMI stack
> > drivers are now inhibited permanently for that Kernel.
> >
> > In V4 support has been added for transports lacking a completion_irq
> > or forcibly configured in polled mode.
> >
> > A quick and trivial example from the shell...reading from a sensor
> > injecting a properly crafted packet in raw mode:
> >
> > # INJECT THE SENSOR_READING MESSAGE FOR SENSOR ID=1 (binary little endian)
> > root@deb-buster-arm64:~# echo -e -n \\x06\\x54\\x00\\x00\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00 > /sys/kernel/debug/scmi_raw/message
> >
> > # READING BACK THE REPLY...
> > root@deb-buster-arm64:~# cat /sys/kernel/debug/scmi_raw/message | od --endian=little -t x4
> > 0000000 00005406 00000000 00000335 00000000
> > 0000020
> >
> > while doing that, since Raw mode makes (partial) use of the regular SCMI
> > stack, you can observe the messages going through the SCMI stack with the
> > usual traces:
> >
> > bash-329 [000] ..... 14183.446808: scmi_msg_dump: pt=15 t=CMND msg_id=06 seq=0000 s=0 pyld=0100000000000000
> > irq/35-mhu_db_l-81 [000] ..... 14183.447809: scmi_msg_dump: pt=15 t=RESP msg_id=06 seq=0000 s=0 pyld=3503000000000000
> >
> >
> > ..trying to read in async when the backend server does NOT support asyncs:
> >
> > # AN ASYNC SENSOR READING REQUEST...
> > root@deb-buster-arm64:~# echo -e -n \\x06\\x54\\x00\\x00\\x01\\x00\\x00\\x00\\x01\\x00\\x00\\x00 > /sys/kernel/debug/scmi_raw/message_async
> >
> > bash-329 [000] ..... 16415.938739: scmi_msg_dump: pt=15 t=CMND msg_id=06 seq=0000 s=0 pyld=0100000001000000
> > irq/35-mhu_db_l-81 [000] ..... 16415.944129: scmi_msg_dump: pt=15 t=RESP msg_id=06 seq=0000 s=-1 pyld=
> >
> > # RETURNS A STATUS -1 FROM THE SERVER NOT SUPPORTING IT
> > root@deb-buster-arm64:~# cat /sys/kernel/debug/scmi_raw/message | od --endian=little -t x4
> > 0000000 00005406 ffffffff
> > 0000010
> >
> > Note that this was on a JUNO, BUT exactly the same steps can be used to
> > reach an SCMI Server living on a VM reachable via virtio as long as the
> > system under test is properly configured to work with a virtio transport.
> >
> > In a nutshell the exposed API is as follows:
> >
> > /sys/kernel/debug/scmi_raw/
> > ├── errors
> > ├── message
> > ├── message_async
> > ├── notification
> > ├── reset
> > ├── transport_max_msg_size
> > ├── transport_rx_timeout_ms
> > └── transport_tx_max_msg
> >
> > where:
> >
> > - message*: used to send sync/async commands and read back immediate and
> > delayed responses (if any)
> > - errors: used to report timeout and unexpected replies
> > - reset: used to reset the SCMI Raw stack, flushing all queues from
> > received messages still pending to be read out (useful to be sure to
> > cleanup between test suite runs...)
> > - notification: used to read any notification being spit by the system
> > (if previously enabled by the user app)
> > - transport*: a bunch of configuration useful to setup the user
> > application expectations in terms of timeouts and message
> > characteristics.
> >
> > Each write corresponds to one command request and the replies or delayed
> > responses are read back one message at a time (receiving an EOF at each
> > message boundary).
> >
> > The user application running the test is in charge of handling timeouts
> > and properly choosing SCMI sequence numbers for the outgoing requests: note
> > that the same fixed number can be re-used (...though discouraged...) as
> > long as the suite does NOT expect to send multiple in-flight commands
> > concurrently.
> >
> > Since the SCMI core regular stack is partially used to deliver and collect
> > the messages, late replies after timeouts and any other sort of unexpected
> > message sent by the SCMI server platform back can be identified by the SCMI
> > core as usual and it will be reported under /errors for later analysis.
> > (a userspace test-app will have anyway properly detected the timeout on
> > /message* ...)
> >
> > All of the above has been roughly tested against a standard JUNO SCP SCMI
> > Server (mailbox trans) and an emulated SCMI Server living in a VM (virtio
> > trans) using a custom experimental version of the scmi-tests Compliance
> > suite patched to support Raw mode and posted at [2]. (still in development
> > ...merge requests are in progress...for now it is just a means for me to
> > test the testing API ... O_o)
> >
> > The series is based on v6.1-rc1.
> >
> > Having said that (in such a concise and brief way :P) ...
> >
> > ...any feedback/comments are welcome !
>
> Hi Cristian,
>
> I have tested your series with an optee message transport layer and
> been able to send raw messages to the scmi server PTA
>
> FWIW
>
> Tested-by: Vincent Guittot <[email protected]>
>

Thanks a lot for your test and feedback !

... but I was going to reply to this saying that I spotted another issue
with the current SCMI Raw implementation (NOT related to optee/smc) so
that I'll post a V5 next week :P

Anyway, the issue is related to the debugfs root used by SCMI Raw,
i.e.:

/sys/kernel/debug/scmi_raw/

..this works fine unless you run it on a system sporting multiple DT-defined
server instances ...that is not officially supported but....ehm...a little
bird told me that such systems with multiple servers do exist :D

In such a case the SCMI core stack is probed multiple times and so it
will try to register multiple debugfs Raw roots: there is clearly no chance
to root the SCMI Raw entries at the same point ... and then anyway
there is the issue of recognizing which server is rooted where ... with
the additional pain that a server CANNOT be recognized by querying...because
there is only one by the spec with agentID ZERO ... in theory :D...

So my tentative solution for V5 would be:

- change the Raw root debugfs as:

/sys/kernel/debug/scmi_raw/0/... (first server defined)

/sys/kernel/debug/scmi_raw/1/... (possible additional server(s)..)

- expose the DT scmi-server root-node name of the server somewhere under
that debugfs root like:

..../scmi_raw/0/transport_name -> 'scmi-mbx'

..../scmi_raw/1/transport_name -> 'scmi-virtio'

so that if you know HOW you have configured your own system in the DT
(maybe multiple servers with different kinds of transports ?), you can
easily tell programmatically which one is which, and so decide
which Raw debugfs root to use...

... I plan to let the SCMI ACS suite use by default the first system
rooted at /sys/kernel/debug/scmi_raw/0/...maybe adding a command-line
option to choose an alternative path for SCMI Raw.

Does all of this sound reasonable ?

Thanks,
Cristian
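
As a rough illustration of the lookup proposed above, a userspace test tool
could scan the indexed roots and compare the transport_name attribute. This
is only a sketch under assumptions: the per-instance layout and the
transport_name entry are the tentative V5 proposal from this mail, not an
existing ABI, and find_raw_instance(), name_matches() and RAW_MAX_INSTANCES
are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define RAW_MAX_INSTANCES 16	/* arbitrary scan bound for this sketch */

/* Compare an attribute value with the wanted name, ignoring a trailing newline. */
static int name_matches(const char *value, const char *wanted)
{
	size_t n = strcspn(value, "\n");

	return n == strlen(wanted) && strncmp(value, wanted, n) == 0;
}

/*
 * Scan <base>/0, <base>/1, ... and return the index of the first instance
 * whose (proposed, hypothetical) transport_name attribute equals 'wanted'.
 * Returns -1 if no instance matches.
 */
static int find_raw_instance(const char *base, const char *wanted)
{
	char path[256], value[64];

	assert(base && wanted);

	for (int i = 0; i < RAW_MAX_INSTANCES; i++) {
		FILE *f;
		int n = snprintf(path, sizeof(path), "%s/%d/transport_name",
				 base, i);

		if (n < 0 || (size_t)n >= sizeof(path))
			return -1;	/* base path too long */

		f = fopen(path, "r");
		if (!f)
			continue;	/* instance missing or not readable */

		if (fgets(value, sizeof(value), f) &&
		    name_matches(value, wanted)) {
			fclose(f);
			return i;
		}
		fclose(f);
	}

	return -1;
}
```

With the two-server example above, and assuming the proposal lands as
described, find_raw_instance("/sys/kernel/debug/scmi_raw", "scmi-virtio")
would return 1, and -1 when no instance matches.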


2022-10-28 15:50:29

by Cristian Marussi

Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

On Fri, Oct 28, 2022 at 07:44:32AM -0700, Florian Fainelli wrote:
> Hi Cristian,
>
> On 10/19/2022 1:46 PM, Cristian Marussi wrote:
> > Hi all,
> >

Hi Florian,

> > This series aims to introduce a new SCMI unified userspace interface meant
> > to ease testing an SCMI Server implementation for compliance, fuzzing etc.,
> > from the perspective of the OSPM agent (non-secure world only ...)
> >
> > It is proposed as a testing/development facility, it is NOT meant to be a
> > feature to use in production, but only enabled in Kconfig for test
> > deployments.
> >
> > Currently an SCMI Compliance Suite like the one at [1] can only work by
> > injecting SCMI messages at the SCMI transport layer using the mailbox test
> > driver (CONFIG_MAILBOX_TEST) via its few debugfs entries and looking at
> > the related replies from the SCMI backend Server.
>
> I plan on giving this a try on our systems later today and will let you know
> the outcome.

Great ! It would be much appreciated...

> This is very useful for making sure the SCMI implementation is
> both correct and properly hardened.

... that was the plan :P

Note that the upstream SCMI ACS suite that I am using for stressing/testing
this Raw thing is still WIP in terms of supporting Raw mode injection
(i.e. functional but ALL still to be merged)..but if you need I can give
you pointers on how to use it....unless of course you have your own suite or
you just want to test using the shell as in the cover-letter examples...

... on my side I tried to do some fuzzing myself with a brutal

'dd bs=128 count=1 if=/dev/random of=<scmi_raw>/message'

as a poor man's fuzzing tool :D ... so I was wondering whether it would be
meaningful to think about upstreaming some common tools for fuzzing or simply
pre-building bare payloads (in proper endianness) to be injected with this
SCMI raw thing... (I mean something useful that could live in tools/)

...any feedback/hints in this regard are welcome.

Thanks,
Cristian
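
On the idea of pre-building bare payloads: a minimal C sketch of how a tool
could pack the 32-bit SCMI message header and serialize header and payload
words in little-endian order, independently of host endianness. The bit
layout used (message id in bits [7:0], message type in [9:8], protocol id in
[17:10], token in [27:18]) is the one from the SCMI specification; the
function names are hypothetical and not part of this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack an SCMI message header; field positions per the SCMI specification. */
static uint32_t scmi_pack_hdr(uint8_t msg_id, uint8_t type,
			      uint8_t proto_id, uint16_t token)
{
	assert(type <= 0x3 && token <= 0x3ff);	/* 2-bit type, 10-bit token */

	return (uint32_t)msg_id |
	       ((uint32_t)type << 8) |
	       ((uint32_t)proto_id << 10) |
	       ((uint32_t)token << 18);
}

/* Store a 32-bit value in little-endian byte order, whatever the host is. */
static void put_le32(uint8_t *dst, uint32_t v)
{
	dst[0] = v & 0xff;
	dst[1] = (v >> 8) & 0xff;
	dst[2] = (v >> 16) & 0xff;
	dst[3] = (v >> 24) & 0xff;
}

/*
 * Serialize header + payload words into buf.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t scmi_build_msg(uint8_t *buf, size_t len, uint32_t hdr,
			     const uint32_t *payload, size_t n_words)
{
	size_t need = 4 * (n_words + 1);

	if (len < need)
		return 0;

	put_le32(buf, hdr);
	for (size_t i = 0; i < n_words; i++)
		put_le32(buf + 4 * (i + 1), payload[i]);

	return need;
}
```

For protocol 0x15, message 0x06 and a payload of {1, 0} this yields the same
12 bytes as the SENSOR_READING example in the cover letter
(06 54 00 00 01 00 00 00 00 00 00 00), which can then be written to
<scmi_raw>/message with a single write().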


2022-10-28 16:22:12

by Vincent Guittot

Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

On Fri, 28 Oct 2022 at 17:04, Cristian Marussi <[email protected]> wrote:
>
> On Fri, Oct 28, 2022 at 04:40:02PM +0200, Vincent Guittot wrote:
> > On Wed, 19 Oct 2022 at 22:46, Cristian Marussi <[email protected]> wrote:
> > >
> > > Hi all,
> > >
>
> Hi Vincent,
>
> > > This series aims to introduce a new SCMI unified userspace interface meant
> > > to ease testing an SCMI Server implementation for compliance, fuzzing etc.,
> > > from the perspective of the OSPM agent (non-secure world only ...)
> > >

[ snip]

> > Hi Cristian,
> >
> > I have tested your series with an optee message transport layer and
> > been able to send raw messages to the scmi server PTA
> >
> > FWIW
> >
> > Tested-by: Vincent Guittot <[email protected]>
> >
>
> Thanks a lot for your test and feedback !
>
> ... but I was going to reply to this saying that I spotted another issue
> with the current SCMI Raw implementation (NOT related to optee/smc) so
> that I'll post a V5 next week :P
>
> Anyway, the issue is related to the debugfs root used by SCMI Raw,
> i.e.:
>
> /sys/kernel/debug/scmi_raw/
>
> ..this works fine unless you run it on a system sporting multiple DT-defined
> server instances ...that is not officially supported but....ehm...a little
> bird told me that such systems with multiple servers do exist :D

;-)

>
> In such a case the SCMI core stack is probed multiple times and so it
> will try to register multiple debugfs Raw roots: there is clearly no chance
> to root the SCMI Raw entries at the same point ... and then anyway
> there is the issue of recognizing which server is rooted where ... with
> the additional pain that a server CANNOT be recognized by querying...because
> there is only one by the spec with agentID ZERO ... in theory :D...
>
> So my tentative solution for V5 would be:
>
> - change the Raw root debugfs as:
>
> /sys/kernel/debug/scmi_raw/0/... (first server defined)
>
> /sys/kernel/debug/scmi_raw/1/... (possible additional server(s)..)
>
> - expose the DT scmi-server root-node name of the server somewhere under
> that debugfs root like:
>
> ..../scmi_raw/0/transport_name -> 'scmi-mbx'
>
> ..../scmi_raw/1/transport_name -> 'scmi-virtio'

I was about to say that you could display the server name, but that
means that you have to send a request to the server, which probably
defeats the purpose of the raw mode.

>
> so that if you know HOW you have configured your own system in the DT
> (maybe multiple servers with different kind of transports ?), you can
> easily select programmatically which one is which, and so decide
> which Raw debugfs fs to use...
>
> ... I plan to leave the SCMI ACS suite use by default the first system
> rooted at /sys/kernel/debug/scmi_raw/0/...maybe adding a commandline
> option to choose an alternative path for SCMI Raw.
>
> Does all of this sound reasonable ?

Yes, adding an index looks good to me.

While we are at it, should we consider adding a per-channel entry in the
tree when there are several channels shared with the same server ?

Vincent

>
> Thanks,
> Cristian
>

2022-10-28 17:57:41

by Cristian Marussi

Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

On Fri, Oct 28, 2022 at 06:18:52PM +0200, Vincent Guittot wrote:
> On Fri, 28 Oct 2022 at 17:04, Cristian Marussi <[email protected]> wrote:
> >
> > On Fri, Oct 28, 2022 at 04:40:02PM +0200, Vincent Guittot wrote:
> > > On Wed, 19 Oct 2022 at 22:46, Cristian Marussi <[email protected]> wrote:
> > > >
> > > > Hi all,
> > > >
> >
> > Hi Vincent,
> >
> > > > This series aims to introduce a new SCMI unified userspace interface meant
> > > > to ease testing an SCMI Server implementation for compliance, fuzzing etc.,
> > > > from the perspective of the OSPM agent (non-secure world only ...)
> > > >
>
> [ snip]
>
> > > Hi Cristian,
> > >
> > > I have tested your series with an optee message transport layer and
> > > been able to send raw messages to the scmi server PTA
> > >
> > > FWIW
> > >
> > > Tested-by: Vincent Guittot <[email protected]>
> > >
> >
> > Thanks a lot for your test and feedback !
> >
> > ... but I was going to reply to this saying that I spotted another issue
> > with the current SCMI Raw implementation (NOT related to optee/smc) so
> > that I'll post a V5 next week :P
> >
> > Anyway, the issue is related to the debugfs root used by SCMI Raw,
> > i.e.:
> >
> > /sys/kernel/debug/scmi_raw/
> >
> > ..this works fine unless you run it on a system sporting multiple DT-defined
> > server instances ...that is not officially supported but....ehm...a little
> > bird told me that such systems with multiple servers do exist :D
>
> ;-)
>
> >
> > In such a case the SCMI core stack is probed multiple times and so it
> > will try to register multiple debugfs Raw roots: there is clearly no chance
> > to root the SCMI Raw entries at the same point ... and then anyway
> > there is the issue of recognizing which server is rooted where ... with
> > the additional pain that a server CANNOT be recognized by querying...because
> > there is only one by the spec with agentID ZERO ... in theory :D...
> >
> > So my tentative solution for V5 would be:
> >
> > - change the Raw root debugfs as:
> >
> > /sys/kernel/debug/scmi_raw/0/... (first server defined)
> >
> > /sys/kernel/debug/scmi_raw/1/... (possible additional server(s)..)
> >
> > - expose the DT scmi-server root-node name of the server somewhere under
> > that debugfs root like:
> >
> > ..../scmi_raw/0/transport_name -> 'scmi-mbx'
> >
> > ..../scmi_raw/1/transport_name -> 'scmi-virtio'
>
> I was about to say that you could display the server name, but that
> means that you have to send a request to the server, which probably
> defeats the purpose of the raw mode.
>
> >
> > so that if you know HOW you have configured your own system in the DT
> > (maybe multiple servers with different kind of transports ?), you can
> > easily select programmatically which one is which, and so decide
> > which Raw debugfs fs to use...
> >
> > ... I plan to leave the SCMI ACS suite use by default the first system
> > rooted at /sys/kernel/debug/scmi_raw/0/...maybe adding a commandline
> > option to choose an alternative path for SCMI Raw.
> >
> > Does all of this sound reasonable ?
>
> Yes, adding an index looks good to me.

Ok, I'll rework accordingly.
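
As a rough illustration of how a test suite could pick the right instance
under the indexed layout discussed above, here is a minimal userspace
sketch. It assumes the proposed <root>/<index>/transport_name entries exist
as described in this thread (they are only a tentative V5 layout); the
function name scmi_raw_find_instance and the max_instances bound are
hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan <base>/<idx>/transport_name for idx in [0, max_instances) and
 * return the first index whose content matches 'wanted', or -1 when no
 * instance matches. The layout is the one proposed in this thread, not
 * a settled ABI.
 */
static int scmi_raw_find_instance(const char *base, const char *wanted,
				  int max_instances)
{
	char path[256], name[64];

	assert(base && wanted);

	for (int idx = 0; idx < max_instances; idx++) {
		FILE *f;

		snprintf(path, sizeof(path), "%s/%d/transport_name", base, idx);
		f = fopen(path, "r");
		if (!f)
			continue;	/* instance not present, keep scanning */

		if (fgets(name, sizeof(name), f)) {
			/* strip the trailing newline, if any */
			name[strcspn(name, "\n")] = '\0';
			if (!strcmp(name, wanted)) {
				fclose(f);
				return idx;
			}
		}
		fclose(f);
	}

	return -1;
}
```

A caller would pass something like "/sys/kernel/debug/scmi_raw" as base and
"scmi-virtio" as wanted, falling back to instance 0 when -1 is returned.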

>
> As we are there, should we consider adding a per channel entry in the
> tree when there are several channels shared with the same server ?
>

So, I was thinking about this and, even though it does not seem strictly
needed for Compliance testing (as discussed offline), I do think that it
could be a sensible option to have as an additional means to stress the
server transport implementation (as you wish).

Having said that, this week I was reasoning about an alternative
interface to do this, i.e. to avoid adding even more debugfs entries
for this chosen-channel config or, possibly in the future, for requesting
transport polling mode (if supported by the underlying transport).

My idea (not thought fully through as of now eh..) would be as follows:

since the current Raw implementation enforces a minimum size of 4 bytes
for the injected message (more on this later down below in NOTE), I was
thinking about using less-than-4-bytes-sized messages to sort of
pre-configure the Raw stack.

IOW, instead of having a number of new additional entries like

../message_ch10
../message_ch13
../message_poll

we could design a sort of API (in the API :D) that defines how
3-byte message payloads are to be interpreted, so that in normal usage
everything will go on as it does now, while if a 3-byte message is
injected by a specially crafted testcase, it would be used to configure
the behaviour of the stack for the subsequent set of Raw transactions
(i.e. for the currently opened fd...) like:

- open message fd

- send a configure message:

  | proto_chan_# |   flags (polling..)   |
  ------------------------------------------
  0              7                       21

- send/receive your test messages

- repeat or close (then the config will vanish...)

This would mean adding some sort of entry under scmi_raw to expose the
channels configured and available on the system, though.

[maybe the flags above could also include an async flag, so that we could
also avoid adding the message_async entries...]
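
To make the idea a bit more concrete, the configure message sketched above
could be packed roughly as follows. This is only an illustration under
assumptions: the channel number is taken to live in the first byte (bits
0-7) and the flags in the remaining two bytes, the exact width of the flags
field is not settled in this proposal, and the flag names and helper names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RAW_CFG_MSG_SIZE	3		/* shorter than a 4-byte SCMI header */
#define RAW_CFG_FLAG_POLLING	(1u << 0)	/* hypothetical flag */
#define RAW_CFG_FLAG_ASYNC	(1u << 1)	/* hypothetical flag */

/*
 * Pack the channel (protocol) number and the flags into a 3-byte
 * configure message, least significant byte first.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
static size_t raw_cfg_pack(uint8_t chan, uint16_t flags,
			   uint8_t *buf, size_t len)
{
	assert(buf);

	if (len < RAW_CFG_MSG_SIZE)
		return 0;

	buf[0] = chan;		/* bits 0-7: proto_chan_# */
	buf[1] = flags & 0xff;	/* flags, low byte */
	buf[2] = flags >> 8;	/* flags, high byte */

	return RAW_CFG_MSG_SIZE;
}

/* Only a write of exactly 3 bytes would be treated as configuration. */
static int raw_is_cfg_msg(size_t len)
{
	return len == RAW_CFG_MSG_SIZE;
}
```

With this encoding, selecting the channel of protocol 0x13 in polling mode
would be the three bytes 0x13 0x01 0x00, while any write of 4 bytes or more
keeps being handled as a regular SCMI message.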

I discarded the idea of running the above config process via IOCTLs since
using IOCTLs on a debugfs entry seemed to me even more frowned upon
:P...but I may be wrong ah...

All of this is still to be explored anyway, any thoughts ? or evident
drawbacks ? (besides having to clearly define an API for this message
config machinery)

Anyway, whatever direction we choose (additional entries vs 3-byte
config msg), I would prefer to add these per-channel (or polling)
capabilities in a separate series, to be posted on top of this one in the
next cycle.

..too many words even this time :P

Thanks,
Cristian


P.S: NOTE min_injected_msg_size:
--------------------------------
Thinking about all of the above, at first I was wondering whether Raw mode
should instead allow the injection of messages shorter than 4 bytes (i.e.
shorter than an SCMI header) for the purpose of fuzzing. Then I realized
that, even if I allowed the injection of smaller messages, the underlying
transports, as they are defined, would anyway carry out a 4-byte
transaction on both sides (platform and agent): all the non-provided bytes
would just be zeroed in the memory layout. This is simply how the
transports themselves (shmem or msg based) are designed to work on both
sides (and, again, this would be transport layer testing more than SCMI
spec verification..)

So in the end I concluded that this kind of less-than-4-bytes transmission
gives no benefit, and I came up with the above trick of using such tiny
messages for configuration.
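
For reference, the 4-byte minimum corresponds to the 32-bit SCMI message
header. The sketch below packs the header fields as laid out in the SCMI
specification (message id in bits 7:0, message type in bits 9:8, protocol
id in bits 17:10, token in bits 27:18) and models the zero-filling
described above for a shorter write. The helper names are hypothetical and
this is a userspace illustration, not the kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build a 32-bit SCMI message header from its fields. */
static uint32_t scmi_hdr_pack(uint8_t msg_id, uint8_t type,
			      uint8_t proto, uint16_t token)
{
	return (uint32_t)msg_id |
	       ((uint32_t)(type & 0x3) << 8) |
	       ((uint32_t)proto << 10) |
	       ((uint32_t)(token & 0x3ff) << 18);
}

/*
 * Model of a short injection: the bytes not provided are left at zero,
 * so a full 4-byte little-endian header is exchanged anyway.
 */
static uint32_t scmi_hdr_from_short_write(const uint8_t *buf, size_t len)
{
	uint8_t hdr[4] = { 0 };

	assert(buf);
	memcpy(hdr, buf, len < sizeof(hdr) ? len : sizeof(hdr));

	return (uint32_t)hdr[0] | ((uint32_t)hdr[1] << 8) |
	       ((uint32_t)hdr[2] << 16) | ((uint32_t)hdr[3] << 24);
}
```

For instance, a 2-byte write of 0x03 0x4c would reach the server as the
header 0x00004c03, i.e. message 0x3 of protocol 0x13 with token zero, which
is indistinguishable from a well-formed 4-byte header.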


2022-10-29 02:50:01

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

Hi Cristian,

On 10/19/2022 1:46 PM, Cristian Marussi wrote:
[snip]

> In V2 the runtime enable/disable switching capability has been removed
> (for now) since still not deemed to be stable/reliable enough: as a
> consequence when SCMI Raw support is compiled in, the regular SCMI stack
> drivers are now inhibited permanently for that Kernel.

For our platforms (ARCH_BRCMSTB) we would need to have the ability to
start with the regular SCMI stack to satisfy, if nothing else, all clock
consumers; otherwise it is fairly challenging for us to boot to a
prompt, as we purposely turn off all unnecessary peripherals to conserve
power. We could introduce a "full on" mode to remove the clock provider
dependency, but I suspect others on "real" silicon may suffer from the
same shortcomings.

Once user-space is reached, I suppose we could find a way to unbind from
all SCMI consumers, and/or ensure that runtime PM is disabled, cpufreq
is in a governor that won't do any active frequency switching etc.

What do you think?
--
Florian

2022-10-29 11:24:36

by Cristian Marussi

[permalink] [raw]
Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

On Fri, Oct 28, 2022 at 07:38:25PM -0700, Florian Fainelli wrote:
> Hi Cristian,

Hi Florian,

>
> On 10/19/2022 1:46 PM, Cristian Marussi wrote:
> [snip]
>
> > In V2 the runtime enable/disable switching capability has been removed
> > (for now) since still not deemed to be stable/reliable enough: as a
> > consequence when SCMI Raw support is compiled in, the regular SCMI stack
> > drivers are now inhibited permanently for that Kernel.
>
> For our platforms (ARCH_BRCMSTB) we would need to have the ability to start
> with the regular SCMI stack to satisfy if nothing else, all clock consumers
> otherwise it makes it fairly challenging for us to boot to a prompt as we
> purposely turn off all unnecessary peripherals to conserve power. We could
> introduce a "full on" mode to remove the clock provider dependency, but I
> suspect others on "real" silicon may suffer from the same shortcomings.
>

Indeed in V1 of this series the Raw mode was dynamically switched on/off at
runtime, so that you could have booted your system with a full working
SCMI stack and then Raw mode could have been enabled/disabled via
scmi_raw/enable entry, so causing the SCMI drivers to be unbound after
boot when entering Raw mode.

The idea, indeed, was initially to be able to boot a regular system, perform
any kind of non-SCMI testing and then switch at will into Raw mode, perform
your SCMI testing, and then back from the grave into normal mode when needed.
(this way you could have deployed into CI one single image for all
testing scenarios...)

The valid objections/worries were around the stability/reliability of such
a solution, both when entering Raw mode and when coming back to normal
use: i.e. not being sure of being able to safely unbind everything and to
safely bind back the whole stack at the end.

The full discussion about this is in this thread if you'd want to
chime in with your point of view:

https://lore.kernel.org/all/[email protected]/

So we removed it, but the idea was not fully abandoned: we could revive
it with some variations, most probably binding this feature to a Kconfig
option (default=N).

Any feedback/idea from you in this regard is highly welcome.

> Once user-space is reached, I suppose we could find a way to unbind from all
> SCMI consumers, and/or ensure that runtime PM is disabled, cpufreq is in a
> governor that won't do any active frequency switching etc.
>
> What do you think?

Anyway, thinking about your scenario, maybe this dynamic switch is NOT
a good solution in your case, because it was an all-or-nothing switch
that caused the full SCMI stack to be unbound: you could not selectively
keep alive what you possibly need to stay on even after boot.

Since you are willing to manually fine-tune at runtime which parts of the
SCMI stack have to be inhibited to avoid interference while Raw-testing
(via unbind/unload or governor policy changes), I think that a better
option in your case could be a 'full-coexistence' Raw mode solution.

In such a COEX configuration you'll boot a normal system with all the
SCMI drivers operational as configured in the DT, BUT also with Raw mode
initialized and ready to be used.

In this scenario, basically, you'll have the normal message transactions,
coming from the regular SCMI drivers, and the Raw transactions, injected
from your test suite, happily coexisting side-by-side at the pure
transaction level: this does NOT mean that you won't suffer any interference
at the protocol level so, as said, you'll still have to properly inhibit the
SCMI drivers by hand to avoid false positives in your results.
(imagine a testcase generating a series of Raw get/set/get transactions on
a resource while the regular stack issues a set on the same
res...notification interference is even worse...)

Now, the GOOD NEWS is ... that this can already be done with an additional
slim patch to be applied on top of this series, a patch that I have not
posted yet since I was not sure of its utility, but that I am using
heavily in my setup and which works fine for me (with really rare
interference on testing even without fine-tuning/disabling anything by hand..)

I attached such a patch at the end of this mail so that you can
immediately be unblocked and experiment further with Raw mode as you
planned.

At this point I'll clean it up and post it as part of the next V5 too.

Once that COEX is enabled, you should see something like this at boot:

[ 1.824269] arm-scmi firmware:scmi: SCMI RAW Mode initialized.
[ 1.830155] arm-scmi firmware:scmi: SCMI RAW Mode COEX enabled !
[ 1.836473] arm-scmi firmware:scmi: SCMI Notifications - Core Enabled.
[ 1.847481] arm-scmi firmware:scmi: SCMI Protocol v2.0 'arm:arm' Firmware version 0x20a0000
...

... and then you can just use the scmi_raw entries as you wish.
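
As a minimal userspace sketch of what using those entries could look like:
the helper below writes one complete SCMI message (header included, in
little-endian byte order) to a Raw 'message' entry and then reads the reply
back from the same file descriptor. The helper name scmi_raw_xfer is
hypothetical, and the entry path is passed in by the caller since the
debugfs layout is still being discussed in this thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Inject 'msg' through the given Raw debugfs entry and read back the
 * reply from the same fd.
 * Returns the reply length in bytes, or -1 on any failure.
 */
static ssize_t scmi_raw_xfer(const char *entry, const void *msg,
			     size_t msg_len, void *reply, size_t reply_len)
{
	ssize_t ret;
	int fd;

	assert(entry && msg && reply);

	fd = open(entry, O_RDWR);
	if (fd < 0)
		return -1;

	ret = write(fd, msg, msg_len);
	if (ret == (ssize_t)msg_len)
		ret = read(fd, reply, reply_len);	/* reply, if any */
	else
		ret = -1;				/* error or short write */

	close(fd);

	return ret;
}
```

For example, a Base protocol PROTOCOL_VERSION query (protocol 0x10,
message 0x0) would be the four bytes 0x00 0x40 0x00 0x00 written to
something like /sys/kernel/debug/scmi_raw/message.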

Any transaction, normal or raw, will be visible as usual in the SCMI
traces (even though, currently, NOT distinguishable by type raw/normal)

So...after yet another overly long mail (:P)...let me know what you
think about all of this (including the possibility of reviving the runtime
dynamic switch too...)

Thanks,
Cristian


---->8-----

From 8613438d4171866088339e030959cb1de8e88c6a Mon Sep 17 00:00:00 2001
From: Cristian Marussi <[email protected]>
Date: Sun, 21 Aug 2022 19:09:39 +0100
Subject: [PATCH] [DEBUG] firmware: arm_scmi: Add SCMI Raw mode COEXISTENCE
support

When Raw mode support is configured in coexistence mode, normal SCMI
drivers are allowed to register and work as usual with the SCMI core.
Normal and raw SCMI message transactions will remain anyway segregated from
each other, it is just that any SCMI test suite using the Raw mode
access could report unreliable results due to possible interferences
from the regular drivers access to shared SCMI resources.

Signed-off-by: Cristian Marussi <[email protected]>
---
drivers/firmware/arm_scmi/Kconfig | 10 ++++++++++
drivers/firmware/arm_scmi/driver.c | 21 +++++++++++++++------
drivers/firmware/arm_scmi/protocols.h | 2 ++
drivers/firmware/arm_scmi/raw_mode.c | 2 +-
4 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig
index ab726a92ac2f..743f53fbe2f8 100644
--- a/drivers/firmware/arm_scmi/Kconfig
+++ b/drivers/firmware/arm_scmi/Kconfig
@@ -36,6 +36,16 @@ config ARM_SCMI_RAW_MODE_SUPPORT
order to avoid unexpected interactions with the SCMI Raw message
flow. If unsure say N.

+config ARM_SCMI_RAW_MODE_SUPPORT_COEX
+ bool "Allow SCMI Raw mode coexistence with normal SCMI stack"
+ depends on ARM_SCMI_RAW_MODE_SUPPORT
+ help
+ Allow SCMI Raw transmission mode to coexist with normal SCMI stack.
+
+ This will allow regular SCMI drivers to register with the core and
+ operate normally, thing which could make an SCMI test suite using the
+ SCMI Raw mode support unreliable. If unsure, say N.
+
config ARM_SCMI_HAVE_TRANSPORT
bool
help
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index 32374fdba997..f0b06b6e8dc2 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -449,9 +449,14 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
*/
struct scmi_xfer *scmi_xfer_raw_get(const struct scmi_handle *handle)
{
+ struct scmi_xfer *xfer;
struct scmi_info *info = handle_to_scmi_info(handle);

- return scmi_xfer_get(handle, &info->tx_minfo);
+ xfer = scmi_xfer_get(handle, &info->tx_minfo);
+ if (!IS_ERR(xfer))
+ xfer->is_raw = true;
+
+ return xfer;
}

/**
@@ -531,6 +536,7 @@ void scmi_xfer_raw_put(const struct scmi_handle *handle, struct scmi_xfer *xfer)
{
struct scmi_info *info = handle_to_scmi_info(handle);

+ xfer->is_raw = false;
return __scmi_xfer_put(&info->tx_minfo, xfer);
}

@@ -2401,7 +2407,8 @@ int scmi_protocol_device_request(const struct scmi_device_id *id_table)
pr_debug("Requesting SCMI device (%s) for protocol %x\n",
id_table->name, id_table->protocol_id);

- if (IS_ENABLED(CONFIG_ARM_SCMI_RAW_MODE_SUPPORT)) {
+ if (IS_ENABLED(CONFIG_ARM_SCMI_RAW_MODE_SUPPORT) &&
+ !IS_ENABLED(CONFIG_ARM_SCMI_RAW_MODE_SUPPORT_COEX)) {
pr_warn("SCMI Raw mode active. Rejecting '%s'/0x%02X\n",
id_table->name, id_table->protocol_id);
return -EINVAL;
@@ -2634,11 +2641,13 @@ static int scmi_probe(struct platform_device *pdev)
info->tx_minfo.max_msg);
if (!IS_ERR(info->raw)) {
dev_info(dev, "SCMI RAW Mode initialized.\n");
- return 0;
+ if (!IS_ENABLED(CONFIG_ARM_SCMI_RAW_MODE_SUPPORT_COEX))
+ return 0;
+ dev_info(dev, "SCMI RAW Mode COEX enabled !\n");
+ } else {
+ dev_err(dev, "Failed to initialize SCMI RAW Mode !\n");
+ info->raw = NULL;
}
-
- dev_err(dev, "Failed to initialize SCMI RAW Mode !\n");
- info->raw = NULL;
}

if (scmi_notification_init(handle))
diff --git a/drivers/firmware/arm_scmi/protocols.h b/drivers/firmware/arm_scmi/protocols.h
index 2f3bf691db7c..70a48adcc320 100644
--- a/drivers/firmware/arm_scmi/protocols.h
+++ b/drivers/firmware/arm_scmi/protocols.h
@@ -88,6 +88,7 @@ struct scmi_msg_hdr {
/**
* struct scmi_xfer - Structure representing a message flow
*
+ * @is_raw: Flag to state if this xfer has been generated by RAW mode
* @transfer_id: Unique ID for debug & profiling purpose
* @hdr: Transmit message header
* @tx: Transmit message
@@ -119,6 +120,7 @@ struct scmi_msg_hdr {
* @priv: A pointer for transport private usage.
*/
struct scmi_xfer {
+ bool is_raw;
int transfer_id;
struct scmi_msg_hdr hdr;
struct scmi_msg tx;
diff --git a/drivers/firmware/arm_scmi/raw_mode.c b/drivers/firmware/arm_scmi/raw_mode.c
index 3fdfc0564286..0edaeb405267 100644
--- a/drivers/firmware/arm_scmi/raw_mode.c
+++ b/drivers/firmware/arm_scmi/raw_mode.c
@@ -1154,7 +1154,7 @@ void scmi_raw_message_report(void *r, struct scmi_xfer *xfer, unsigned int idx)
struct device *dev;
struct scmi_raw_mode_info *raw = r;

- if (!raw)
+ if (!raw || (idx == SCMI_RAW_REPLY_QUEUE && !xfer->is_raw))
return;

dev = raw->handle->dev;
--
2.34.1


----8<-----

2022-11-02 09:07:39

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

On Fri, 28 Oct 2022 at 18:58, Cristian Marussi <[email protected]> wrote:
>
> [snip]
> >
> > As we are there, should we consider adding a per channel entry in the
> > tree when there are several channels shared with the same server ?
> >
>
> So, I was thinking about this and, even though, it seems not strictly
> needed for Compliance testing (as discussed offline) I do think that
> could be a sensible option to have as an additional mean to stress the
> server transport implementation (as you wish).

Thanks

>
> Having said that, this week, I was reasoning about an alternative
> interface to do this, i.e. to avoid to add even more debugfs entries
> for this chosen-channel config or possibly in the future to ask for
> transport polling mode (if supported by the underlying transport)
>
> My idea (not thought fully through as of now eh..) would be as follows:
>
> since the current Raw implementation enforces a minimum size of 4 bytes
> for the injected message (more on this later down below in NOTE), I was
> thinking about using less-than-4-bytes-sized messages to sort of
> pre-configure the Raw stack.
>
> IOW, instead of having a number of new additional entries like
>
> ../message_ch10
> ../message_ch13
> ../message_poll
>
> we could design a sort of API (in the API :D) that defines how
> 3-bytes message payload are to be interpreted, so that in normal usage
> everything will go on as it is now, while if a 3-bytes message is
> injected by a specially crafted testcase, it would be used to configure
> the behaviour stack for the subsequent set of Raw transactions
> (i.e. for the currently opened fd...) like:
>
> - open message fd
>
> - send a configure message:
>
> | proto_chan_# | flags (polling..) |
> ------------------------------------------
> 0 7 21
>
> - send/receive your test messages
>
> - repeat or close (then the config will vanish...)
>
> This would mean adding some sort entry under scmi_raw to expose the
> configured available channels on the system though.
>
> [maybe the flags above could also include an async flag and avoid
> also to add the message_async entries...]
>
> I discarded the idea to run the above config process via IOCTLs since
> it seemed to me even more frowned upon to use IOCTLs on a debugfs entry
> :P...but I maybe wrong ah...
>
> All of this is still to be explored anyway, any thoughts ? or evident
> drawbacks ? (beside having to clearly define an API for these message
> config machinery)

TBH, I'm not a fan of adding a protocol on top of the SCMI one. This
interface aims to test the SCMI servers and their channels so we
should focus on this and make it simple to use. IMHO, adding some
special bytes before the real scmi message is prone to create
complexity and error in the use of this debug interface.

>
> Anyway, whatever direction we'll choose (additional entries vs 3-bytes
> config msg), I would prefer to add this per-channel (or polling)
> capabilities with separate series to post on top of this in the next
> cycle.

Ok

>
> ..too many words even this time :P

Thanks
Vincent


2022-11-03 09:27:45

by Cristian Marussi

[permalink] [raw]
Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

On Wed, Nov 02, 2022 at 09:54:50AM +0100, Vincent Guittot wrote:
> [snip]
>
> TBH, I'm not a fan of adding a protocol on top of the SCMI one. This
> interface aims to test the SCMI servers and their channels so we
> should focus on this and make it simple to use. IMHO, adding some
> special bytes before the real scmi message is prone to create
> complexity and error in the use of this debug interface.
>

Indeed, even if only for transport-related tests, the risk is to make
the interface more complicated to use.

Agreed, I just wanted to have some feedback. I'll revert to something based
on debugfs, trying to minimize entries and improper usage...maybe something
like grouping into per-channel subdirs when different channels are
available.

E.g. on a system with a dedicated PERF channel I could have


../scmi_raw/0/message                 <<< usual auto channel selection
             /message_async
             ...

             /chan_0x10/message       <<< use default channel (base proto)
                       /message_async
                       ....

             /chan_0x13/message       <<< use PERF channel
                       /message_async
                       ....

The alternative would be to have a single common entry to configure the usage
of a single batch of message/message_async entries, BUT that seems to
me more error-prone: e.g. if you don't clear the special config at the
end of your special testcase, you could end up using the custom channel
config also with any following regular test which expects to benefit from
the automatic channel selection (which I still think should be the default
way of running these tests..)
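
With the per-channel subdirs layout, a testcase would pick its channel
simply by choosing a path, with no state left behind to clear. A small
sketch of how a test suite could build such paths follows; the helper name
scmi_raw_entry_path is hypothetical and the chan_0x<proto> naming is just
the tentative layout shown above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build the path of a Raw injection entry for a given server instance.
 * A negative chan_proto selects the top-level entries, i.e. the usual
 * automatic channel selection; otherwise the per-channel subdir is used.
 * Returns what snprintf() returns (length of the full path).
 */
static int scmi_raw_entry_path(char *out, size_t len, int instance,
			       int chan_proto, int async)
{
	const char *entry = async ? "message_async" : "message";

	assert(out && len);

	if (chan_proto < 0)
		return snprintf(out, len, "/sys/kernel/debug/scmi_raw/%d/%s",
				instance, entry);

	return snprintf(out, len,
			"/sys/kernel/debug/scmi_raw/%d/chan_0x%02x/%s",
			instance, chan_proto, entry);
}
```

For instance 0 and the PERF channel this yields
/sys/kernel/debug/scmi_raw/0/chan_0x13/message, while a regular test keeps
using /sys/kernel/debug/scmi_raw/0/message.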

Thoughts ?

Thanks for the feedback.
Cristian

2022-11-03 11:06:33

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

On Thu, 3 Nov 2022 at 10:21, Cristian Marussi <[email protected]> wrote:
>
> On Wed, Nov 02, 2022 at 09:54:50AM +0100, Vincent Guittot wrote:
> > On Fri, 28 Oct 2022 at 18:58, Cristian Marussi <[email protected]> wrote:
> > >
> > > On Fri, Oct 28, 2022 at 06:18:52PM +0200, Vincent Guittot wrote:
> > > > On Fri, 28 Oct 2022 at 17:04, Cristian Marussi <[email protected]> wrote:
> > > > >
> > > > > On Fri, Oct 28, 2022 at 04:40:02PM +0200, Vincent Guittot wrote:
> > > > > > On Wed, 19 Oct 2022 at 22:46, Cristian Marussi <[email protected]> wrote:
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > >
> > > > > Hi Vincent,
> > > > >
> > > > > > > This series aims to introduce a new SCMI unified userspace interface meant
> > > > > > > to ease testing an SCMI Server implementation for compliance, fuzzing etc.,
> > > > > > > from the perspective of the OSPM agent (non-secure world only ...)
> > > > > > >
> > > >
> > > > [ snip]
> > > >
> > > > > > Hi Cristian,
> > > > > >
> > > > > > I have tested your series with an optee message transport layer and
> > > > > > been able to send raw messages to the scmi server PTA
> > > > > >
> > > > > > FWIW
> > > > > >
> > > > > > Tested-by: Vincent Guittot <[email protected]>
> > > > > >
> > > > >
> > > > > Thanks a lot for your test and feedback !
> > > > >
> > > > > ... but I was going to reply to this saying that I spotted another issue
> > > > > with the current SCMI Raw implementation (NOT related to optee/smc) so
> > > > > that I'll post a V5 next week :P
> > > > >
> > > > > Anyway, the issue is much related to the debugfs root used by SCMI Raw,
> > > > > i.e.:
> > > > >
> > > > > /sys/kernel/debug/scmi_raw/
> > > > >
> > > > > ..this works fine unless you run it on a system sporting multiple DT-defined
> > > > > server instances ...that is not officially supported but....ehm...a little
> > > > > bird told me these systems with multiple servers do exist :D
> > > >
> > > > ;-)
> > > >
> > > > >
> > > > > In such a case the SCMI core stack is probed multiple times and so it
> > > > > will try to register multiple debugfs Raw roots: there is no chance to
> > > > > root the SCMI Raw entries at the same point clearly ... and then anyway
> > > > > there is the issue of recognizing which server is rooted where ... with
> > > > > the additional pain that a server CANNOT be recognized by querying...cause
> > > > > there is only one by the spec with agentID ZERO ... in theory :D...
> > > > >
> > > > > So my tentative solution for V5 would be:
> > > > >
> > > > > - change the Raw root debugfs as:
> > > > >
> > > > > /sys/kernel/debug/scmi_raw/0/... (first server defined)
> > > > >
> > > > > /sys/kernel/debug/scmi_raw/1/... (possible additional server(s)..)
> > > > >
> > > > > - expose the DT scmi-server root-node name of the server somewhere under
> > > > > that debugfs root like:
> > > > >
> > > > > ..../scmi_raw/0/transport_name -> 'scmi-mbx'
> > > > >
> > > > > ..../scmi_raw/1/transport_name -> 'scmi-virtio'
> > > >
> > > > I was about to say that you would display the server name but that
> > > > means that you have to send a request to the server which probably
> > > > defeats the purpose of the raw mode
> > > >
> > > > >
> > > > > so that if you know HOW you have configured your own system in the DT
> > > > > (maybe multiple servers with different kind of transports ?), you can
> > > > > easily select programmatically which one is which, and so decide
> > > > > which Raw debugfs fs to use...
> > > > >
> > > > > ... I plan to leave the SCMI ACS suite use by default the first system
> > > > > rooted at /sys/kernel/debug/scmi_raw/0/...maybe adding a commandline
> > > > > option to choose an alternative path for SCMI Raw.
> > > > >
> > > > > Does all of this sound reasonable ?
> > > >
> > > > Yes, adding an index looks good to me.
> > >
> > > Ok, I'll rework accordingly.
> > >
> > > >
> > > > As we are there, should we consider adding a per channel entry in the
> > > > tree when there are several channels shared with the same server ?
> > > >
> > >
> > > So, I was thinking about this and, even though it seems not strictly
> > > needed for Compliance testing (as discussed offline), I do think that
> > > could be a sensible option to have as an additional means to stress the
> > > server transport implementation (as you wish).
> >
> > Thanks
> >
> > >
> > > Having said that, this week, I was reasoning about an alternative
> > > interface to do this, i.e. to avoid adding even more debugfs entries
> > > for this chosen-channel config or possibly in the future to ask for
> > > transport polling mode (if supported by the underlying transport)
> > >
> > > My idea (not thought fully through as of now eh..) would be as follows:
> > >
> > > since the current Raw implementation enforces a minimum size of 4 bytes
> > > for the injected message (more on this later down below in NOTE), I was
> > > thinking about using less-than-4-bytes-sized messages to sort of
> > > pre-configure the Raw stack.
> > >
> > > IOW, instead of having a number of new additional entries like
> > >
> > > ../message_ch10
> > > ../message_ch13
> > > ../message_poll
> > >
> > > we could design a sort of API (in the API :D) that defines how
> > > 3-byte message payloads are to be interpreted, so that in normal usage
> > > everything will go on as it is now, while if a 3-byte message is
> > > injected by a specially crafted testcase, it would be used to configure
> > > the behaviour of the stack for the subsequent set of Raw transactions
> > > (i.e. for the currently opened fd...) like:
> > >
> > > - open message fd
> > >
> > > - send a configure message:
> > >
> > > | proto_chan_# | flags (polling..) |
> > > ------------------------------------
> > > 0              7                  21
> > >
> > > - send/receive your test messages
> > >
> > > - repeat or close (then the config will vanish...)
> > >
> > > This would mean adding some sort of entry under scmi_raw to expose the
> > > configured available channels on the system though.
> > >
> > > [maybe the flags above could also include an async flag and so also
> > > avoid adding the message_async entries...]
> > >
> > > I discarded the idea of running the above config process via IOCTLs since
> > > it seemed to me even more frowned upon to use IOCTLs on a debugfs entry
> > > :P...but I may be wrong ah...
> > >
> > > All of this is still to be explored anyway, any thoughts ? or evident
> > > drawbacks ? (besides having to clearly define an API for this message
> > > config machinery)
> >
> > TBH, I'm not a fan of adding a protocol on top of the SCMI one. This
> > interface aims to test the SCMI servers and their channels so we
> > should focus on this and make it simple to use. IMHO, adding some
> > special bytes before the real scmi message is prone to create
> > complexity and errors in the use of this debug interface.
> >
>
> Indeed, even if only for transport-related tests, the risk is to make
> the interface more complicated to use.
>
> Agreed, just wanted to have some feedback. I'll revert to something based on
> debugfs trying to minimize entries and improper usage...maybe something
> like grouping on per-channel subdirs when different channels are
> available.
>
> E.g. on a system with a dedicated PERF channel I could have
>
>
> ../scmi_raw/0/message            <<< usual auto channel selection
>              /message_async
>              ...
>
>              /chan_0x10/message  <<< use default channel (base proto)
>                        /message_async
>                        ....
>
>              /chan_0x13/message  <<< use PERF channel
>                        /message_async
>                        ....


The proposal above looks good to me

Thanks
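
As a rough illustration of the layout quoted above (a per-instance index
plus optional per-channel subdirectories), a userspace test helper could
locate the Raw entries along these lines. This is only a sketch: the entry
names follow what is proposed in this thread and may still change in V5,
transport_name is assumed to hold a plain string, and raw_entry_path() and
find_instance() are hypothetical helpers, not part of the series.

```python
import os

# Debugfs root named in this thread; the per-instance index is the V5 proposal.
RAW_ROOT = "/sys/kernel/debug/scmi_raw"


def raw_entry_path(instance=0, chan=None, entry="message", root=RAW_ROOT):
    """Build the path of a Raw entry under the proposed layout.

    chan=None selects the top-level entry (automatic channel selection);
    otherwise the entry under the chan_0x<NN> subdirectory is used.
    """
    parts = [root, str(instance)]
    if chan is not None:
        parts.append("chan_0x%02x" % chan)
    parts.append(entry)
    return os.path.join(*parts)


def find_instance(transport_name, root=RAW_ROOT):
    """Return the index of the first instance whose transport_name file
    matches, or None if no instance matches or the root is missing."""
    if not os.path.isdir(root):
        return None
    indexes = sorted((d for d in os.listdir(root) if d.isdigit()), key=int)
    for name in indexes:
        try:
            with open(os.path.join(root, name, "transport_name")) as f:
                if f.read().strip() == transport_name:
                    return int(name)
        except OSError:
            # Instance without a readable transport_name: skip it.
            continue
    return None
```

With these helpers, raw_entry_path(0, 0x13, "message_async") yields
/sys/kernel/debug/scmi_raw/0/chan_0x13/message_async, while calling
raw_entry_path() with no arguments keeps the default automatic channel
selection on the first instance.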

>
> The alternative would be to have a common single entry to configure the usage
> of a single batch of message/message_async entries BUT that seems to
> me more prone to error, e.g. if you don't clear the special config at the
> end of your special testcase you could end up using custom channels conf also
> with any following regular test which expects to benefit from the
> automatic channel selection (which I still think should be the default way of
> running these tests..)
>
> Thoughts ?
>
> Thanks for the feedback.
> Cristian

2022-11-03 12:19:48

by Sudeep Holla

Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

On Fri, Oct 28, 2022 at 07:38:25PM -0700, Florian Fainelli wrote:
> Hi Cristian,
>
> On 10/19/2022 1:46 PM, Cristian Marussi wrote:
> [snip]
>
> > In V2 the runtime enable/disable switching capability has been removed
> > (for now) since still not deemed to be stable/reliable enough: as a
> > consequence when SCMI Raw support is compiled in, the regular SCMI stack
> > drivers are now inhibited permanently for that Kernel.
>
> For our platforms (ARCH_BRCMSTB) we would need to have the ability to start
> with the regular SCMI stack to satisfy if nothing else, all clock consumers
> otherwise it makes it fairly challenging for us to boot to a prompt as we
> purposely turn off all unnecessary peripherals to conserve power. We could
> introduce a "full on" mode to remove the clock provider dependency, but I
> suspect others on "real" silicon may suffer from the same shortcomings.
>

Fair enough. But if we are doing SCMI firmware testing or conformance via
the $subject proposed way, can these drivers survive if the userspace does
a random or a torture test changing the clock configurations ? Not sure
how to deal with that as the intention here is to do the testing from the
user-space and anything can happen. How do we avoid bringing the entire system
down while doing this testing ? Can we unbind all the drivers using scmi on
your platform ? I guess no. Let me know.

> Once user-space is reached, I suppose we could find a way to unbind from all
> SCMI consumers, and/or ensure that runtime PM is disabled, cpufreq is in a
> governor that won't do any active frequency switching etc.
>
> What do you think?

Yes, Cristian always wanted to support that but I am the one trying to
convince him not to unless there is a strong requirement for it. You seem
to suggest that you have such a requirement, but that just opens loads of
questions on how we deal with that. A few of them are as stated above, I
need to recall all the conversations I had with Cristian around that and why
handling it may be a bit complex.

--
Regards,
Sudeep

2022-11-03 12:38:03

by Cristian Marussi

Subject: Re: [PATCH v4 0/11] Introduce a unified API for SCMI Server testing

On Thu, Nov 03, 2022 at 11:21:47AM +0000, Sudeep Holla wrote:
> On Fri, Oct 28, 2022 at 07:38:25PM -0700, Florian Fainelli wrote:
> > Hi Cristian,
> >
> > On 10/19/2022 1:46 PM, Cristian Marussi wrote:
> > [snip]
> >
> > > In V2 the runtime enable/disable switching capability has been removed
> > > (for now) since still not deemed to be stable/reliable enough: as a
> > > consequence when SCMI Raw support is compiled in, the regular SCMI stack
> > > drivers are now inhibited permanently for that Kernel.
> >
> > For our platforms (ARCH_BRCMSTB) we would need to have the ability to start
> > with the regular SCMI stack to satisfy if nothing else, all clock consumers
> > otherwise it makes it fairly challenging for us to boot to a prompt as we
> > purposely turn off all unnecessary peripherals to conserve power. We could
> > introduce a "full on" mode to remove the clock provider dependency, but I
> > suspect others on "real" silicon may suffer from the same shortcomings.
> >
>
> Fair enough. But if we are doing SCMI firmware testing or conformance via
> the $subject proposed way, can these drivers survive if the userspace does
> a random or a torture test changing the clock configurations ? Not sure
> how to deal with that as the intention here is to do the testing from the
> user-space and anything can happen. How do we avoid bringing the entire system
> down while doing this testing ? Can we unbind all the drivers using scmi on
> your platform ? I guess no. Let me know.
>
> > Once user-space is reached, I suppose we could find a way to unbind from all
> > SCMI consumers, and/or ensure that runtime PM is disabled, cpufreq is in a
> > governor that won't do any active frequency switching etc.
> >
> > What do you think?
>
> Yes, Cristian always wanted to support that but I am the one trying to
> convince him not to unless there is a strong requirement for it. You seem
> to suggest that you have such a requirement, but that just opens loads of
> questions on how we deal with that. A few of them are as stated above, I
> need to recall all the conversations I had with Cristian around that and why
> handling it may be a bit complex.

:D ... I like even more the idea of enabling full coexistence on demand
so that I completely delegate to the users to manually deal with possible
interferences at runtime and drop any liabilities if someone shoots himself
in the foot :P

... jokes apart I'll post today a V5 with a few fixes and an optional
coexistence mode so that Florian can experiment and see how feasible it is
to operate in this way by manually unbinding/re-configuring SCMI behaviour
at runtime before starting tests without killing the system on something
like ARCH_BRCMSTB platforms.
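
For reference, the manual unbinding step mentioned above could look
roughly like the sketch below. It only assumes the generic driver-model
"unbind" attribute exposed in sysfs; unbind_device() is a hypothetical
helper, and the bus and device names to pass in are platform specific and
not defined anywhere in this series.

```python
import os


def unbind_device(bus, device, sysfs_root="/sys"):
    """Detach `device` from its currently bound driver, if any.

    Writes the device name to the driver's `unbind` attribute, reached
    through the device's `driver` link. Returns True if an unbind request
    was written, False if no driver was bound to the device.
    """
    driver_dir = os.path.join(sysfs_root, "bus", bus, "devices", device, "driver")
    if not os.path.exists(driver_dir):
        return False  # nothing bound, nothing to do
    with open(os.path.join(driver_dir, "unbind"), "w") as f:
        f.write(device)
    return True
```

A test setup script would call this (as root) for each SCMI consumer
device before injecting Raw messages; what else has to be quiesced
(runtime PM, cpufreq governor, ...) is left to the user as discussed.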

Thanks,
Cristian