LinuxLists.cc - [RFC PATCH 0/1] remoteproc: Enable getter for drivers that manage 2+ remotes

2022-11-15 16:04:15

Subject: [RFC PATCH 0/1] remoteproc: Enable getter for drivers that manage 2+ remotes

This RFC shows the following:
(a) a use case for a new remoteproc API, rproc_get_by_id()
(b) a patch implementing rproc_get_by_id()

For context, multiple drivers in remoteproc manage more than one remote
processor. For these drivers, calls to rproc_get_by_phandle() are not
sufficient because the check at
https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2111
fails. That check looks at r->dev.parent->driver and expects r->dev's
parent to be the platform device that was passed to the driver's platform
probe() call. In these drivers the parent is instead the child (per-core)
device, and this child does not have its driver field set.

An example to show this issue is as follows:

If a remoteproc driver has the following DTS binding:

/ {
	remoteproc_cluster {
		compatible = "soc,remoteproc-cluster";

		core0: core0 {
			memory-region;
			sram;
		};

		core1: core1 {
			memory-region;
			sram;
		};
	};
};

And in the corresponding driver the platform-probe() is as follows:

static int cluster_platform_probe(struct platform_device *pdev)
{
	struct device *dev = &pdev->dev;
	struct device_node *np = dev_of_node(dev);
	struct device_node *child;
	struct platform_device *cpdev;
	struct device *child_dev;
	struct rproc *rp;

	for_each_available_child_of_node(np, child) {
		cpdev = of_find_device_by_node(child);
		child_dev = &cpdev->dev;

		/* The rproc's parent is the per-core child device, not dev. */
		rp = rproc_alloc(child_dev, dev_name(child_dev), dummy_ops, NULL,
				 sizeof(struct dummy_ops));
	}

	return 0;
}

After rproc_alloc() has been called, when another driver tries to access
this rproc structure via rproc_get_by_phandle(), the r->dev.parent->driver
pointer examined by the aforementioned check will be NULL.
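
As a rough user-space illustration of the situation described above, the
parent/driver relationship can be modelled as follows. The struct and
function names are hypothetical stand-ins, not kernel APIs, and only the two
fields that matter to the check are kept.

```c
#include <stddef.h>

/* Simplified stand-ins for struct device_driver and struct device. */
struct driver_model {
	const char *name;
};

struct device_model {
	struct device_model *parent;
	struct driver_model *driver;	/* NULL when no driver is bound */
};

/*
 * Models the precondition behind the r->dev.parent->driver test: the
 * lookup can only pin the owning module if the rproc's parent device
 * has a driver bound to it. Returns 1 if so, 0 otherwise.
 */
static int parent_has_driver(const struct device_model *rproc_dev)
{
	return rproc_dev->parent && rproc_dev->parent->driver;
}
```

In this model, an rproc whose parent is the per-core device (no driver
bound) yields 0, while an rproc whose parent is the cluster platform device
yields 1.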

To account for a remoteproc driver that manages multiple remote processors,
I have provided an API, rproc_get_by_id(), that returns the rproc given a
phandle to the core in question and the rproc's index. A sample DT binding
and sample usage of the API follow.

Sample binding:

/ {
	platform_driver_sample {
		compatible = "custom_platform";
		rproc = <&core1>;
	};
};

Sample usage:

static int custom_platform_probe(struct platform_device *pdev)
{
	struct rproc *rp;
	struct device_node *node;

	node = of_parse_phandle(pdev->dev.of_node, "rproc", 0);
	if (!node)
		return -ENODEV;

	/* Here get rproc 1, as its index should be 1 */
	rp = rproc_get_by_id(node->phandle, 1);
	/* Drop the reference taken by of_parse_phandle() */
	of_node_put(node);
	if (!rp)
		return -ENODEV;

	return 0;
}

If a more precise way of obtaining the correct remoteproc ID is needed,
the ID can be inferred from the name of the rproc device that is a child
of the device passed to rproc_alloc(). The name is set in rproc_alloc()
as follows:

dev_set_name(&rproc->dev, "remoteproc%d", rproc->index);

We can then parse the index out of that device's name as follows:

int index;

sscanf(dev_name(dev), "remoteproc%d", &index);
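
For illustration only, the parsing step above can be tried in user space
with a small wrapper that also checks the sscanf() return value. The helper
name rproc_index_from_name() is hypothetical and is not part of the patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper: extract N from a device name of the form
 * "remoteproc<N>". Returns 0 on success and -1 if the name does not
 * match the expected pattern.
 */
static int rproc_index_from_name(const char *name, int *index)
{
	/* sscanf() returns the number of conversions that succeeded. */
	if (sscanf(name, "remoteproc%d", index) != 1)
		return -1;

	return 0;
}
```

Checking the return value matters because a name that does not start with
"remoteproc" leaves the index untouched.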

Additionally I have provided the implementation for the API in
the subsequent patch.

Ben Levinsky (1):
remoteproc: Introduce rproc_get_by_id API

drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++++++++++-
include/linux/remoteproc.h | 1 +
2 files changed, 64 insertions(+), 1 deletion(-)

--
2.25.1

2022-11-15 16:35:15

by Ben Levinsky

Subject: [RFC PATCH 1/1] remoteproc: Introduce rproc_get_by_id API

Allow users of remoteproc to get a handle to an rproc by passing in a
phandle to a node whose parent is the rproc platform device, together
with an ID that matches the index field of the expected rproc struct.

This makes it possible to get the rproc structure for remoteproc drivers
that manage more than one remote processor (e.g. the TI and Xilinx R5
drivers).

Signed-off-by: Ben Levinsky <[email protected]>
---
drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++++++++++-
include/linux/remoteproc.h | 1 +
2 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index 775df165eb45..6f7058bcc80c 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -40,6 +40,7 @@
#include <linux/virtio_ring.h>
#include <asm/byteorder.h>
#include <linux/platform_device.h>
+#include <linux/of_platform.h>

#include "remoteproc_internal.h"

@@ -2203,13 +2204,74 @@ struct rproc *rproc_get_by_phandle(phandle phandle)

return rproc;
}
+
+/**
+ * rproc_get_by_id() - find a remote processor by ID
+ * @phandle: phandle to the rproc
+ * @id: Index into rproc list that uniquely identifies the rproc struct
+ *
+ * Finds an rproc handle using the remote processor's index, and then
+ * return a handle to the rproc. Before returning, ensure that the
+ * parent node's driver is still loaded.
+ *
+ * This function increments the remote processor's refcount, so always
+ * use rproc_put() to decrement it back once rproc isn't needed anymore.
+ *
+ * Return: rproc handle on success, and NULL on failure
+ */
+
+struct rproc *rproc_get_by_id(phandle phandle, unsigned int id)
+{
+ struct rproc *rproc = NULL, *r;
+ struct platform_device *parent_pdev;
+ struct device_node *np;
+
+ np = of_find_node_by_phandle(phandle);
+ if (!np)
+ return NULL;
+
+ parent_pdev = of_find_device_by_node(np->parent);
+ if (!parent_pdev) {
+ pr_err("%s: no platform device for parent of %pOF\n",
+ __func__, np);
+ of_node_put(np);
+ return NULL;
+ }
+
+ /* prevent underlying implementation from being removed */
+ if (!try_module_get(parent_pdev->dev.driver->owner)) {
+ dev_err(&parent_pdev->dev, "can't get owner\n");
+ of_node_put(np);
+ return NULL;
+ }
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(r, &rproc_list, node) {
+ if (r->index == id) {
+ rproc = r;
+ get_device(&rproc->dev);
+ break;
+ }
+ }
+ rcu_read_unlock();
+
+ of_node_put(np);
+
+ return rproc;
+}
+EXPORT_SYMBOL(rproc_get_by_id);
#else
struct rproc *rproc_get_by_phandle(phandle phandle)
{
return NULL;
}
-#endif
EXPORT_SYMBOL(rproc_get_by_phandle);
+struct rproc *rproc_get_by_id(phandle phandle, unsigned int id)
+{
+ return NULL;
+}
+EXPORT_SYMBOL(rproc_get_by_id);
+#endif

/**
* rproc_set_firmware() - assign a new firmware
diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
index 3cde845ba26e..10961fae0f77 100644
--- a/include/linux/remoteproc.h
+++ b/include/linux/remoteproc.h
@@ -645,6 +645,7 @@ struct rproc_vdev {
};

struct rproc *rproc_get_by_phandle(phandle phandle);
+struct rproc *rproc_get_by_id(phandle phandle, unsigned int id);
struct rproc *rproc_get_by_child(struct device *dev);

struct rproc *rproc_alloc(struct device *dev, const char *name,
--
2.25.1

2022-11-25 18:16:47

by Mathieu Poirier

Subject: Re: [RFC PATCH 1/1] remoteproc: Introduce rproc_get_by_id API

Hi Ben,

On Tue, Nov 15, 2022 at 07:37:53AM -0800, Ben Levinsky wrote:
> Allow users of remoteproc the ability to get a handle to an rproc by
> passing in node that has parent rproc device and an ID that matches
> an expected rproc struct's index field.
>
> This enables to get rproc structure for remoteproc drivers that manage
> more than 1 remote processor (e.g. TI and Xilinx R5 drivers).
>
> Signed-off-by: Ben Levinsky <[email protected]>
> ---
> drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++++++++++-
> include/linux/remoteproc.h | 1 +
> 2 files changed, 64 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index 775df165eb45..6f7058bcc80c 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -40,6 +40,7 @@
> #include <linux/virtio_ring.h>
> #include <asm/byteorder.h>
> #include <linux/platform_device.h>
> +#include <linux/of_platform.h>
>
> #include "remoteproc_internal.h"
>
> @@ -2203,13 +2204,74 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
>
> return rproc;
> }
> +
> +/**
> + * rproc_get_by_id() - find a remote processor by ID
> + * @phandle: phandle to the rproc
> + * @id: Index into rproc list that uniquely identifies the rproc struct
> + *
> + * Finds an rproc handle using the remote processor's index, and then
> + * return a handle to the rproc. Before returning, ensure that the
> + * parent node's driver is still loaded.
> + *
> + * This function increments the remote processor's refcount, so always
> + * use rproc_put() to decrement it back once rproc isn't needed anymore.
> + *
> + * Return: rproc handle on success, and NULL on failure
> + */
> +
> +struct rproc *rproc_get_by_id(phandle phandle, unsigned int id)
> +{
> + struct rproc *rproc = NULL, *r;
> + struct platform_device *parent_pdev;
> + struct device_node *np;
> +
> + np = of_find_node_by_phandle(phandle);
> + if (!np)
> + return NULL;
> +
> + parent_pdev = of_find_device_by_node(np->parent);
> + if (!parent_pdev) {
> + dev_err(&parent_pdev->dev,
> + "no platform device for node %pOF\n", np);
> + of_node_put(np);
> + return NULL;
> + }
> +
> + /* prevent underlying implementation from being removed */
> + if (!try_module_get(parent_pdev->dev.driver->owner)) {
> + dev_err(&parent_pdev->dev, "can't get owner\n");
> + of_node_put(np);
> + return NULL;
> + }
> +
> + rcu_read_lock();
> + list_for_each_entry_rcu(r, &rproc_list, node) {
> + if (r->index == id) {
> + rproc = r;
> + get_device(&rproc->dev);
> + break;
> + }
> + }

This won't work because several remote processors can be on the list. If
another remote processor was discovered before the one @phandle is associated
with, the remote processor pertaining to that previous one will be returned.

There is also an issue with rproc_put().

I think your description of the problem is mostly correct. The intermediate
devices created by the cascading entries for individual remote processors in the
device tree are causing an issue. The "compatible" string for each remote
processor can't be handled by any platform drivers (as it should be), which
makes try_module_get() fail because r->dev.parent->driver is not bound to
anything.

Looking at the code for Xilinx's R5F support that I just queued[1], the simplest
solution may be to pass @dev, which is in fact @cluster->dev, to
zynqmp_r5_add_rproc_core() rather than the device associated with the
intermediate platform device.

That _should_ work. It is hard for me to know for sure since I don't have a
platform with a dual-core remote processor to test with.

Get back to me with how that turned out and we'll go from there.

Thanks,
Mathieu

[1]. https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923

2022-11-30 22:52:02

by Ben Levinsky

Subject: Re: [RFC PATCH 1/1] remoteproc: Introduce rproc_get_by_id API

Hi Mathieu,

Thank you for your review. Please see my reply inline.

Thanks
Ben

On 11/25/22, 10:05 AM, "Mathieu Poirier" <[email protected]> wrote:

Hi Ben,

On Tue, Nov 15, 2022 at 07:37:53AM -0800, Ben Levinsky wrote:
> Allow users of remoteproc the ability to get a handle to an rproc by
> passing in node that has parent rproc device and an ID that matches
> an expected rproc struct's index field.
>
> This enables to get rproc structure for remoteproc drivers that manage
> more than 1 remote processor (e.g. TI and Xilinx R5 drivers).
>
> Signed-off-by: Ben Levinsky <[email protected]>
> ---
> drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++++++++++-
> include/linux/remoteproc.h | 1 +
> 2 files changed, 64 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index 775df165eb45..6f7058bcc80c 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -40,6 +40,7 @@
> #include <linux/virtio_ring.h>
> #include <asm/byteorder.h>
> #include <linux/platform_device.h>
> +#include <linux/of_platform.h>
>
> #include "remoteproc_internal.h"
>
> @@ -2203,13 +2204,74 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
>
> return rproc;
> }
> +
> +/**
> + * rproc_get_by_id() - find a remote processor by ID
> + * @phandle: phandle to the rproc
> + * @id: Index into rproc list that uniquely identifies the rproc struct
> + *
> + * Finds an rproc handle using the remote processor's index, and then
> + * return a handle to the rproc. Before returning, ensure that the
> + * parent node's driver is still loaded.
> + *
> + * This function increments the remote processor's refcount, so always
> + * use rproc_put() to decrement it back once rproc isn't needed anymore.
> + *
> + * Return: rproc handle on success, and NULL on failure
> + */
> +
> +struct rproc *rproc_get_by_id(phandle phandle, unsigned int id)
> +{
> + struct rproc *rproc = NULL, *r;
> + struct platform_device *parent_pdev;
> + struct device_node *np;
> +
> + np = of_find_node_by_phandle(phandle);
> + if (!np)
> + return NULL;
> +
> + parent_pdev = of_find_device_by_node(np->parent);
> + if (!parent_pdev) {
> + dev_err(&parent_pdev->dev,
> + "no platform device for node %pOF\n", np);
> + of_node_put(np);
> + return NULL;
> + }
> +
> + /* prevent underlying implementation from being removed */
> + if (!try_module_get(parent_pdev->dev.driver->owner)) {
> + dev_err(&parent_pdev->dev, "can't get owner\n");
> + of_node_put(np);
> + return NULL;
> + }
> +
> + rcu_read_lock();
> + list_for_each_entry_rcu(r, &rproc_list, node) {
> + if (r->index == id) {
> + rproc = r;
> + get_device(&rproc->dev);
> + break;
> + }
> + }

This won't work because several remote processors can be on the list. If
another remote processor was discovered before the one @phandle is associated
with, the remote processor pertaining to that previous one will returned.

[Ben] I didn't understand this. From my point of view, passing in the
phandle of the child platform device will work here because each child
platform device has its own entry in the remoteproc list.

Regarding "If another remote processor was discovered before the one":
from what I can see this is prevented because the remoteproc list is
protected by a mutex. See
https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2288
for the mutex_lock() call.

Additionally, the calls to zynqmp_r5_add_rproc_core() are made
sequentially, so this also prevents the race condition.

I think I am missing something in your paragraph above. Can you expand on
this issue?

Do you mean that if we use the cluster platform device, one of the
existing APIs will work, for example rproc_get_by_child() or
rproc_get_by_phandle()?

At
https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923
there is the call "zynqmp_r5_add_rproc_core(&child_pdev->dev);". Are you
saying this will work if we use cluster->dev here? Digging deeper into
this for both the Xilinx and TI R5 remoteproc drivers, I think the
proposed solution will create an issue in split mode: the existing getter
APIs will not be able to return one of the corresponding rproc instances
because both cores will refer to the same platform device structure.

I can bring up the above in the community call.

There is also an issue with rproc_put().

[Ben] If passing the cluster platform device works for the above, then
rproc_put() should work too, correct? We can test this on our side as
well. That said, I can also bring this up in the community call.

I think your description of the problem is mostly correct. The intermediate
devices created by the cascading entries for individual remote processors in the
device tree are causing an issue. The "compatible" string for each remote
processor can't be handled by any platform drivers (as it should be), which
makes try_module_get() fail because r->dev.parent->driver is not bound to
anything.

Looking at the code for Xilinx's R5F support that I just queued[1], the simplest
solution may be to pass @dev, which is in fact @cluster->dev, to
zynqmp_r5_add_rproc_core() rather than the device associated with the
intermediate platform device.

That _should_ work. It is hard for me to know for sure since I don't have a
platform that has dual core remote processor to test with.

Get back to me with how that turned out and we'll go from there.

Thanks,
Mathieu

[1]. https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923

2022-12-02 17:23:32

by Mathieu Poirier

Subject: Re: [RFC PATCH 1/1] remoteproc: Introduce rproc_get_by_id API

On Wed, Nov 30, 2022 at 09:39:33PM +0000, Levinsky, Ben wrote:
> Hi Mathieu,
>
> Thank you for your review. Please see my reply inline.
>
> Thanks
> Ben
>
> On 11/25/22, 10:05 AM, "Mathieu Poirier" <[email protected]> wrote:
>
> Hi Ben,
>
> On Tue, Nov 15, 2022 at 07:37:53AM -0800, Ben Levinsky wrote:
> > Allow users of remoteproc the ability to get a handle to an rproc by
> > passing in node that has parent rproc device and an ID that matches
> > an expected rproc struct's index field.
> >
> > This enables to get rproc structure for remoteproc drivers that manage
> > more than 1 remote processor (e.g. TI and Xilinx R5 drivers).
> >
> > Signed-off-by: Ben Levinsky <[email protected]>
> > ---
> > drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++++++++++-
> > include/linux/remoteproc.h | 1 +
> > 2 files changed, 64 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > index 775df165eb45..6f7058bcc80c 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -40,6 +40,7 @@
> > #include <linux/virtio_ring.h>
> > #include <asm/byteorder.h>
> > #include <linux/platform_device.h>
> > +#include <linux/of_platform.h>
> >
> > #include "remoteproc_internal.h"
> >
> > @@ -2203,13 +2204,74 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
> >
> > return rproc;
> > }
> > +
> > +/**
> > + * rproc_get_by_id() - find a remote processor by ID
> > + * @phandle: phandle to the rproc
> > + * @id: Index into rproc list that uniquely identifies the rproc struct
> > + *
> > + * Finds an rproc handle using the remote processor's index, and then
> > + * return a handle to the rproc. Before returning, ensure that the
> > + * parent node's driver is still loaded.
> > + *
> > + * This function increments the remote processor's refcount, so always
> > + * use rproc_put() to decrement it back once rproc isn't needed anymore.
> > + *
> > + * Return: rproc handle on success, and NULL on failure
> > + */
> > +
> > +struct rproc *rproc_get_by_id(phandle phandle, unsigned int id)
> > +{
> > + struct rproc *rproc = NULL, *r;
> > + struct platform_device *parent_pdev;
> > + struct device_node *np;
> > +
> > + np = of_find_node_by_phandle(phandle);
> > + if (!np)
> > + return NULL;
> > +
> > + parent_pdev = of_find_device_by_node(np->parent);
> > + if (!parent_pdev) {
> > + dev_err(&parent_pdev->dev,
> > + "no platform device for node %pOF\n", np);
> > + of_node_put(np);
> > + return NULL;
> > + }
> > +
> > + /* prevent underlying implementation from being removed */
> > + if (!try_module_get(parent_pdev->dev.driver->owner)) {
> > + dev_err(&parent_pdev->dev, "can't get owner\n");
> > + of_node_put(np);
> > + return NULL;
> > + }
> > +
> > + rcu_read_lock();
> > + list_for_each_entry_rcu(r, &rproc_list, node) {
> > + if (r->index == id) {
> > + rproc = r;
> > + get_device(&rproc->dev);
> > + break;
> > + }
> > + }
>
> This won't work because several remote processors can be on the list. If
> another remote processor was discovered before the one @phandle is associated
> with, the remote processor pertaining to that previous one will returned.
>
> I didn't understand. From my point of view passing in the phandle of the child-platform device here will work because each child-platform will have its own entry in the remoteproc list.

You are correct, each child platform device will have its own entry in
@rproc_list. The problem is that r->index may not match @id that is passed as a
parameter.
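
To make the r->index versus @id point concrete, here is a small user-space
sketch. It assumes that indices are handed out system-wide in allocation
order (in the kernel the index comes from an IDA in rproc_alloc()); the
registry, the model_* functions and the core_name label are all hypothetical
and only serve the illustration.

```c
#include <stddef.h>

#define MAX_RPROCS 8

/* Toy stand-in for struct rproc; not the kernel structure. */
struct rproc_model {
	int index;		/* system-wide, assigned in allocation order */
	const char *core_name;	/* hypothetical label for the DT core node */
};

static struct rproc_model registry[MAX_RPROCS];
static int nr_rprocs;

/* Allocate the next entry and give it the next free index. */
static struct rproc_model *model_alloc(const char *core_name)
{
	struct rproc_model *r;

	if (nr_rprocs >= MAX_RPROCS)
		return NULL;

	r = &registry[nr_rprocs];
	r->index = nr_rprocs++;
	r->core_name = core_name;
	return r;
}

/* Lookup by index only, as the proposed loop does. */
static struct rproc_model *model_get_by_id(int id)
{
	for (int i = 0; i < nr_rprocs; i++)
		if (registry[i].index == id)
			return &registry[i];

	return NULL;
}
```

If some unrelated remote processor is allocated first, followed by core0 and
core1 of the cluster, then model_get_by_id(1) returns the entry for core0,
not core1, because the index is not relative to the cluster.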

>
> Also " If another remote processor was discovered before the one" Here this prevented from what I can see because the remoteproc_list is protected by a mutex_lock. See https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2288 for the mutex_lock.
>
> Additionally the calls to zynqmp_r5_add_rproc_core() are called sequentially so this also prevents the race condition.
>
> I think I am missing something in your paragraph above. Can you expand on this issue?

As explained above, the issue is not about race conditions but the value of
r->index and @id.

>
> Do you mean to say that if we use the cluster platform device you think using one of the existing APIs will work? For example rproc_get_by_child() or rproc_get_by_phandle()
>
> At https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923 " zynqmp_r5_add_rproc_core(&child_pdev->dev);" Here if we use cluster->dev this will work? To dig deeper into this for both the Xilinx and TI R5 remoteproc drivers, I think this proposed solution will create an issue in that for Split modes, the existing getter APIs will not be able to return one of the corresponding rproc instances because both cores will refer to the same platform-device structure.
>
> I can bring up the above in the community call.
>
> There is also an issue with rproc_put().
>
> If passing the cluster platform device works for the above then rproc_put() should work correct? We can test this on our side as well. That being said I can bring this up in the community call

Yes, using the cluster platform device will work with rproc_put().

>
>
> I think your description of the problem is mostly correct. The intermediate
> devices created by the cascading entries for individual remote processors in the
> device tree are causing an issue. The "compatible" string for each remote
> processor can't be handled by any platform drivers (as it should be), which
> makes try_module_get() fail because r->dev.parent->driver is not bound to
> anything.
>
> Looking at the code for Xilinx's R5F support that I just queued[1], the simplest
> solution may be to pass @dev, which is in fact @cluster->dev, to
> zynqmp_r5_add_rproc_core() rather than the device associated with the
> intermediate platform device.
>
> That _should_ work. It is hard for me to know for sure since I don't have a
> platform that has dual core remote processor to test with.
>
> Get back to me with how that turned out and we'll go from there.
>
> Thanks,
> Mathieu
>
>
>
>
> [1]. https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923
>

2022-12-08 19:22:46

by Mathieu Poirier

Subject: Re: [RFC PATCH 1/1] remoteproc: Introduce rproc_get_by_id API

On Tue, Dec 06, 2022 at 08:23:13AM -0800, Ben Levinsky wrote:
> Good Morning Mathieu,
>
>
> I did some testing and replied inline.
>
>
> On 12/2/22 9:00 AM, Mathieu Poirier wrote:
> > On Wed, Nov 30, 2022 at 09:39:33PM +0000, Levinsky, Ben wrote:
> > > Hi Mathieu,
> > >
> > > Thank you for your review. Please see my reply inline.
> > >
> > > Thanks
> > > Ben
> > >
> > > On 11/25/22, 10:05 AM, "Mathieu Poirier"<[email protected]> wrote:
> > >
> > > Hi Ben,
> > >
> > > On Tue, Nov 15, 2022 at 07:37:53AM -0800, Ben Levinsky wrote:
> > > > Allow users of remoteproc the ability to get a handle to an rproc by
> > > > passing in node that has parent rproc device and an ID that matches
> > > > an expected rproc struct's index field.
> > > >
> > > > This enables to get rproc structure for remoteproc drivers that manage
> > > > more than 1 remote processor (e.g. TI and Xilinx R5 drivers).
> > > >
> > > > Signed-off-by: Ben Levinsky<[email protected]>
> > > > ---
> > > > drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++++++++++-
> > > > include/linux/remoteproc.h | 1 +
> > > > 2 files changed, 64 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > > > index 775df165eb45..6f7058bcc80c 100644
> > > > --- a/drivers/remoteproc/remoteproc_core.c
> > > > +++ b/drivers/remoteproc/remoteproc_core.c
> > > > @@ -40,6 +40,7 @@
> > > > #include <linux/virtio_ring.h>
> > > > #include <asm/byteorder.h>
> > > > #include <linux/platform_device.h>
> > > > +#include <linux/of_platform.h>
> > > >
> > > > #include "remoteproc_internal.h"
> > > >
> > > > @@ -2203,13 +2204,74 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
> > > >
> > > > return rproc;
> > > > }
> > > > +
> > > > +/**
> > > > + * rproc_get_by_id() - find a remote processor by ID
> > > > + * @phandle: phandle to the rproc
> > > > + * @id: Index into rproc list that uniquely identifies the rproc struct
> > > > + *
> > > > + * Finds an rproc handle using the remote processor's index, and then
> > > > + * return a handle to the rproc. Before returning, ensure that the
> > > > + * parent node's driver is still loaded.
> > > > + *
> > > > + * This function increments the remote processor's refcount, so always
> > > > + * use rproc_put() to decrement it back once rproc isn't needed anymore.
> > > > + *
> > > > + * Return: rproc handle on success, and NULL on failure
> > > > + */
> > > > +
> > > > +struct rproc *rproc_get_by_id(phandle phandle, unsigned int id)
> > > > +{
> > > > + struct rproc *rproc = NULL, *r;
> > > > + struct platform_device *parent_pdev;
> > > > + struct device_node *np;
> > > > +
> > > > + np = of_find_node_by_phandle(phandle);
> > > > + if (!np)
> > > > + return NULL;
> > > > +
> > > > + parent_pdev = of_find_device_by_node(np->parent);
> > > > + if (!parent_pdev) {
> > > > + dev_err(&parent_pdev->dev,
> > > > + "no platform device for node %pOF\n", np);
> > > > + of_node_put(np);
> > > > + return NULL;
> > > > + }
> > > > +
> > > > + /* prevent underlying implementation from being removed */
> > > > + if (!try_module_get(parent_pdev->dev.driver->owner)) {
> > > > + dev_err(&parent_pdev->dev, "can't get owner\n");
> > > > + of_node_put(np);
> > > > + return NULL;
> > > > + }
> > > > +
> > > > + rcu_read_lock();
> > > > + list_for_each_entry_rcu(r, &rproc_list, node) {
> > > > + if (r->index == id) {
> > > > + rproc = r;
> > > > + get_device(&rproc->dev);
> > > > + break;
> > > > + }
> > > > + }
> > >
> > > This won't work because several remote processors can be on the list. If
> > > another remote processor was discovered before the one @phandle is associated
> > > with, the remote processor pertaining to that previous one will returned.
> > >
> > > I didn't understand. From my point of view passing in the phandle of the child-platform device here will work because each child-platform will have its own entry in the remoteproc list.
> > You are correct, each child platform device will have its own entry in
> > @rproc_list. The problem is that r->index may not match @id that is passed as a
> > parameter.
> >
> > > Also " If another remote processor was discovered before the one" Here this prevented from what I can see because the remoteproc_list is protected by a mutex_lock. See https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2288 for the mutex_lock.
> > >
> > > Additionally the calls to zynqmp_r5_add_rproc_core() are called sequentially so this also prevents the race condition.
> > >
> > > I think I am missing something in your paragraph above. Can you expand on this issue?
> > As explained above, the issue is not about race conditions but the value of
> > r->index and @id.
> >
> > > Do you mean to say that if we use the cluster platform device you think using one of the existing APIs will work? For example rproc_get_by_child() or rproc_get_by_phandle()
> > >
> > > At https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923 " zynqmp_r5_add_rproc_core(&child_pdev->dev);" Here if we use cluster->dev this will work? To dig deeper into this for both the Xilinx and TI R5 remoteproc drivers, I think this proposed solution will create an issue in that for Split modes, the existing getter APIs will not be able to return one of the corresponding rproc instances because both cores will refer to the same platform-device structure.
> > >
> > > I can bring up the above in the community call.
> > >
> > > There is also an issue with rproc_put().
> > >
> > > If passing the cluster platform device works for the above then rproc_put() should work correct? We can test this on our side as well. That being said I can bring this up in the community call
> > Yes, using the cluster platform device will work with rproc_put().
> >
> > >
> > > I think your description of the problem is mostly correct. The intermediate
> > > devices created by the cascading entries for individual remote processors in the
> > > device tree are causing an issue. The "compatible" string for each remote
> > > processor can't be handled by any platform drivers (as it should be), which
> > > makes try_module_get() fail because r->dev.parent->driver is not bound to
> > > anything.
> > >
> > > Looking at the code for Xilinx's R5F support that I just queued[1], the simplest
> > > solution may be to pass @dev, which is in fact @cluster->dev, to
> > > zynqmp_r5_add_rproc_core() rather than the device associated with the
> > > intermediate platform device.
> > >
> > > That _should_ work. It is hard for me to know for sure since I don't have a
> > > platform that has dual core remote processor to test with.
> > >
> > > Get back to me with how that turned out and we'll go from there.
> > >
> > > Thanks,
> > > Mathieu
> > >
> > >
> > >
> > >
> > > > [1]. https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923
>
> I have an update on this.
>
>
>
> I tested the following using the RPU-cluster platform device:
>
> test 1: RPU split with 2 core
>
> test 2: RPU split with 1 core
>
> test 3: lockstep RPU
>
>
> I tested with the zynqmp-r5-remoteproc platform probe using the (RPU)
> cluster platform device instead of the core/child platform device. When I
> used this I was unable to properly use the API rproc_get_by_phandle() and
> there was _only_ an issue for test 1. This was because each core will have
> its own call to rproc_alloc(), rproc_add() and each core's remoteproc
> structure has the same parent device.

You haven't specified if my proposal worked with test 2 and 3. I'm guessing
that it does.

>
> This results in the later call to rproc_get_by_phandle() not behaving
> properly because the function will return whichever core had its entries
> added to the list first.
>

That is a valid observation, but at least we are getting closer. The next step
is to find the right remote processor and I think we should look at np->name and
rproc->name. They should be quite close because rproc_alloc() is called with
dev_name(cdev).

I will look into this further tomorrow morning if I have time, but I encourage
you to do the same on your side.

>
> For reference I placed the logic for API rproc_get_by_phandle() that loops
> through device and the rproc_alloc() line where the dev parent is set:
>
>
> Here is the getter API where the loop checking the remoteproc dev parent is:
>
> https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2109
>
>
> if (r->dev.parent && r->dev.parent->of_node == np) {
>
>
> Here is the rproc_alloc() call where they set remoteproc dev parent:
>
> https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2448
>
>
> rproc->dev.parent = dev;
>
> Thanks,
>
> Ben
>

2022-12-09 19:10:35

by Ben Levinsky

Subject: Re: [RFC PATCH 1/1] remoteproc: Introduce rproc_get_by_id API

Hi Mathieu,

On 12/8/22 11:05 AM, Mathieu Poirier wrote:
> On Tue, Dec 06, 2022 at 08:23:13AM -0800, Ben Levinsky wrote:
>> Good Morning Mathieu,
>>
>>
>> I did some testing and replied inline.
>>
>>
>> On 12/2/22 9:00 AM, Mathieu Poirier wrote:
>>> On Wed, Nov 30, 2022 at 09:39:33PM +0000, Levinsky, Ben wrote:
>>>> Hi Mathieu,
>>>>
>>>> Thank you for your review. Please see my reply inline.
>>>>
>>>> Thanks
>>>> Ben
>>>>
>>>> On 11/25/22, 10:05 AM, "Mathieu Poirier"<[email protected]> wrote:
>>>>
>>>>
>>>> Hi Ben,
>>>>
>>>> On Tue, Nov 15, 2022 at 07:37:53AM -0800, Ben Levinsky wrote:
>>>> > Allow users of remoteproc the ability to get a handle to an rproc by
>>>> > passing in node that has parent rproc device and an ID that matches
>>>> > an expected rproc struct's index field.
>>>> >
>>>> > This enables to get rproc structure for remoteproc drivers that manage
>>>> > more than 1 remote processor (e.g. TI and Xilinx R5 drivers).
>>>> >
>>>> > Signed-off-by: Ben Levinsky<[email protected]>
>>>> > ---
>>>> > drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++++++++++-
>>>> > include/linux/remoteproc.h | 1 +
>>>> > 2 files changed, 64 insertions(+), 1 deletion(-)
>>>> >
>>>> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
>>>> > index 775df165eb45..6f7058bcc80c 100644
>>>> > --- a/drivers/remoteproc/remoteproc_core.c
>>>> > +++ b/drivers/remoteproc/remoteproc_core.c
>>>> > @@ -40,6 +40,7 @@
>>>> > #include <linux/virtio_ring.h>
>>>> > #include <asm/byteorder.h>
>>>> > #include <linux/platform_device.h>
>>>> > +#include <linux/of_platform.h>
>>>> >
>>>> > #include "remoteproc_internal.h"
>>>> >
>>>> > @@ -2203,13 +2204,74 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
>>>> >
>>>> > return rproc;
>>>> > }
>>>> > +
>>>> > +/**
>>>> > + * rproc_get_by_id() - find a remote processor by ID
>>>> > + * @phandle: phandle to the rproc
>>>> > + * @id: Index into rproc list that uniquely identifies the rproc struct
>>>> > + *
>>>> > + * Finds an rproc handle using the remote processor's index, and then
>>>> > + * return a handle to the rproc. Before returning, ensure that the
>>>> > + * parent node's driver is still loaded.
>>>> > + *
>>>> > + * This function increments the remote processor's refcount, so always
>>>> > + * use rproc_put() to decrement it back once rproc isn't needed anymore.
>>>> > + *
>>>> > + * Return: rproc handle on success, and NULL on failure
>>>> > + */
>>>> > +
>>>> > +struct rproc *rproc_get_by_id(phandle phandle, unsigned int id)
>>>> > +{
>>>> > + struct rproc *rproc = NULL, *r;
>>>> > + struct platform_device *parent_pdev;
>>>> > + struct device_node *np;
>>>> > +
>>>> > + np = of_find_node_by_phandle(phandle);
>>>> > + if (!np)
>>>> > + return NULL;
>>>> > +
>>>> > + parent_pdev = of_find_device_by_node(np->parent);
>>>> > + if (!parent_pdev) {
>>>> > + dev_err(&parent_pdev->dev,
>>>> > + "no platform device for node %pOF\n", np);
>>>> > + of_node_put(np);
>>>> > + return NULL;
>>>> > + }
>>>> > +
>>>> > + /* prevent underlying implementation from being removed */
>>>> > + if (!try_module_get(parent_pdev->dev.driver->owner)) {
>>>> > + dev_err(&parent_pdev->dev, "can't get owner\n");
>>>> > + of_node_put(np);
>>>> > + return NULL;
>>>> > + }
>>>> > +
>>>> > + rcu_read_lock();
>>>> > + list_for_each_entry_rcu(r, &rproc_list, node) {
>>>> > + if (r->index == id) {
>>>> > + rproc = r;
>>>> > + get_device(&rproc->dev);
>>>> > + break;
>>>> > + }
>>>> > + }
>>>>
>>>> This won't work because several remote processors can be on the list. If
>>>> another remote processor was discovered before the one @phandle is associated
>>>> with, the remote processor pertaining to that previous one will be returned.
>>>>
>>>> I didn't understand. From my point of view passing in the phandle of the child-platform device here will work because each child-platform will have its own entry in the remoteproc list.
>>> You are correct, each child platform device will have its own entry in
>>> @rproc_list. The problem is that r->index may not match @id that is passed as a
>>> parameter.
>>>
>>>> Also " If another remote processor was discovered before the one" Here this prevented from what I can see because the remoteproc_list is protected by a mutex_lock. See https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2288 for the mutex_lock.
>>>>
>>>> Additionally the calls to zynqmp_r5_add_rproc_core() are called sequentially so this also prevents the race condition.
>>>>
>>>> I think I am missing something in your paragraph above. Can you expand on this issue?
>>> As explained above, the issue is not about race conditions but the value of
>>> r->index and @id.
>>>
>>>> Do you mean to say that if we use the cluster platform device you think using one of the existing APIs will work? For example rproc_get_by_child() or rproc_get_by_phandle()
>>>>
>>>> At https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923 " zynqmp_r5_add_rproc_core(&child_pdev->dev);" Here if we use cluster->dev this will work? To dig deeper into this for both the Xilinx and TI R5 remoteproc drivers, I think this proposed solution will create an issue in that for Split modes, the existing getter APIs will not be able to return one of the corresponding rproc instances because both cores will refer to the same platform-device structure.
>>>>
>>>> I can bring up the above in the community call.
>>>>
>>>> There is also an issue with rproc_put().
>>>>
>>>> If passing the cluster platform device works for the above then rproc_put() should work correct? We can test this on our side as well. That being said I can bring this up in the community call
>>> Yes, using the cluster platform device will work with rproc_put().
>>>
>>>>
>>>> I think your description of the problem is mostly correct. The intermediate
>>>> devices created by the cascading entries for individual remote processors in the
>>>> device tree are causing an issue. The "compatible" string for each remote
>>>> processor can't be handled by any platform drivers (as it should be), which
>>>> makes try_module_get() fail because r->dev.parent->driver is not bound to
>>>> anything.
>>>>
>>>> Looking at the code for Xilinx's R5F support that I just queued[1], the simplest
>>>> solution may be to pass @dev, which is in fact @cluster->dev, to
>>>> zynqmp_r5_add_rproc_core() rather than the device associated with the
>>>> intermediate platform device.
>>>>
>>>> That _should_ work. It is hard for me to know for sure since I don't have a
>>>> platform that has dual core remote processor to test with.
>>>>
>>>> Get back to me with how that turned out and we'll go from there.
>>>>
>>>> Thanks,
>>>> Mathieu
>>>>
>>>>
>>>>
>>>>
>>>> [1]. https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923
>>
>> I have an update on this.
>>
>>
>>
>> I tested the following using the RPU-cluster platform device:
>>
>> test 1: RPU split with 2 core
>>
>> test 2: RPU split with 1 core
>>
>> test 3: lockstep RPU
>>
>>
>> I tested with the zynqmp-r5-remoteproc platform probe using the (RPU)
>> cluster platform device instead of the core/child platform device. When I
>> used this I was unable to properly use the API rproc_get_by_phandle() and
>> there was _only_ an issue for test 1. This was because each core will have
>> its own call to rproc_alloc(), rproc_add() and each core's remoteproc
>> structure has the same parent device.
>
> You haven't specified if my proposal worked with test 2 and 3. I'm guessing
> that it does.
>
Sorry, yes tests 2 and 3 work with your proposal.
>>
>> This results in the later call to rproc_get_by_phandle() not behaving
>> properly because the function will return whichever core had its entries
>> added to the list first.
>>
>
> That is a valid observation, but at least we are getting closer. The next step
> is to find the right remote processor and I think we should look at np->name and
> rproc->name. They should be quite close because rproc_alloc() is called with
> dev_name(cdev).
>
> I will look into this further tomorrow morning if I have time, but I encourage
> you to do the same on your side.
>

For the case where the cluster is in split mode and there are 2 child
nodes here is my update:

1. The rproc_list has 2 entries as follows:

As expected, each entry has the same r->dev.parent (i.e. the cluster node).

The entries have the rproc->dev names 'remoteproc0' and 'remoteproc1'.

2. For my use case I am trying to pass in the phandle of the core node
(child of the cluster). If I pass in the core node then
rproc_get_by_phandle() returns NULL because the r->dev.parent->of_node
does not match. This is expected because at rproc_alloc() we passed in
the cluster and not the core.

np->name in the loop is then the name of the cluster node in my sample
device tree that I booted with; that is 'r5f_0', where the cluster is 'rf5ss'.

If I am trying to get the rproc entry with name 'remoteproc0' and I pass
the cluster node's phandle (that is, of rf5ss) in to
rproc_get_by_phandle(), then the API _does_ work for getting the first
entry from the rproc list.

But if I am trying to get the second rproc entry (dev name 'remoteproc1')
and I pass the same phandle into rproc_get_by_phandle(), I will still get
the 'remoteproc0' entry because the phandle of the first entry also
matches in the loop.
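
To make the first-match behaviour concrete, here is a small userspace
model of the matching loop. It is only a sketch: the model_* types and
the sample data are hypothetical stand-ins for struct rproc and struct
device_node, and it compiles outside the kernel. It assumes, as in the
split-mode test, that both entries were allocated with the cluster
device as parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real types. */
struct model_node {
	const char *name;
};

struct model_rproc {
	const char *name;                /* e.g. "remoteproc0" */
	const struct model_node *parent; /* models r->dev.parent->of_node */
};

/*
 * Models the matching loop in rproc_get_by_phandle(): return the first
 * entry whose parent node is np, or NULL when no entry matches.
 */
const struct model_rproc *
model_get_by_node(const struct model_rproc *list, size_t n,
		  const struct model_node *np)
{
	size_t i;

	assert(list != NULL);

	for (i = 0; i < n; i++) {
		if (list[i].parent && list[i].parent == np)
			return &list[i];
	}

	return NULL;
}

/* Sample data for split mode: both cores share the cluster as parent. */
const struct model_node model_cluster = { "rf5ss" };
const struct model_node model_core0 = { "r5f_0" };

const struct model_rproc model_split_list[2] = {
	{ "remoteproc0", &model_cluster },
	{ "remoteproc1", &model_cluster },
};
```

With this data, looking up model_cluster always yields the
'remoteproc0' entry and there is no input that selects 'remoteproc1',
while looking up model_core0 yields NULL.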

Thanks
Ben

>>
>> For reference I placed the logic for API rproc_get_by_phandle() that loops
>> through device and the rproc_alloc() line where the dev parent is set:
>>
>>
>> Here is the getter API where the loop checking the remoteproc dev parent is:
>>
>> https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2109
>>
>>
>> if (r->dev.parent && r->dev.parent->of_node == np) {
>>
>>
>> Here is the rproc_alloc() call where they set remoteproc dev parent:
>>
>> https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2448
>>
>>
>> rproc->dev.parent = dev;
>>
>> Thanks,
>>
>> Ben
>>
>

2022-12-13 22:01:37

by Mathieu Poirier

Subject: Re: [RFC PATCH 1/1] remoteproc: Introduce rproc_get_by_id API

On Fri, Dec 09, 2022 at 11:01:47AM -0800, Ben Levinsky wrote:
> Hi Mathieu,
>
> On 12/8/22 11:05 AM, Mathieu Poirier wrote:
> > On Tue, Dec 06, 2022 at 08:23:13AM -0800, Ben Levinsky wrote:
> > > Good Morning Mathieu,
> > >
> > >
> > > I did some testing and replied inline.
> > >
> > >
> > > On 12/2/22 9:00 AM, Mathieu Poirier wrote:
> > > > On Wed, Nov 30, 2022 at 09:39:33PM +0000, Levinsky, Ben wrote:
> > > > > Hi Mathieu,
> > > > >
> > > > > Thank you for your review. Please see my reply inline.
> > > > >
> > > > > Thanks
> > > > > Ben
> > > > >
> > > > > On 11/25/22, 10:05 AM, "Mathieu Poirier"<[email protected]> wrote:
> > > > >
> > > > >
> > > > > Hi Ben,
> > > > >
> > > > > On Tue, Nov 15, 2022 at 07:37:53AM -0800, Ben Levinsky wrote:
> > > > > > Allow users of remoteproc the ability to get a handle to an rproc by
> > > > > > passing in node that has parent rproc device and an ID that matches
> > > > > > an expected rproc struct's index field.
> > > > > >
> > > > > > This enables to get rproc structure for remoteproc drivers that manage
> > > > > > more than 1 remote processor (e.g. TI and Xilinx R5 drivers).
> > > > > >
> > > > > > Signed-off-by: Ben Levinsky<[email protected]>
> > > > > > ---
> > > > > > drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++++++++++-
> > > > > > include/linux/remoteproc.h | 1 +
> > > > > > 2 files changed, 64 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > > > > > index 775df165eb45..6f7058bcc80c 100644
> > > > > > --- a/drivers/remoteproc/remoteproc_core.c
> > > > > > +++ b/drivers/remoteproc/remoteproc_core.c
> > > > > > @@ -40,6 +40,7 @@
> > > > > > #include <linux/virtio_ring.h>
> > > > > > #include <asm/byteorder.h>
> > > > > > #include <linux/platform_device.h>
> > > > > > +#include <linux/of_platform.h>
> > > > > >
> > > > > > #include "remoteproc_internal.h"
> > > > > >
> > > > > > @@ -2203,13 +2204,74 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
> > > > > >
> > > > > > return rproc;
> > > > > > }
> > > > > > +
> > > > > > +/**
> > > > > > + * rproc_get_by_id() - find a remote processor by ID
> > > > > > + * @phandle: phandle to the rproc
> > > > > > + * @id: Index into rproc list that uniquely identifies the rproc struct
> > > > > > + *
> > > > > > + * Finds an rproc handle using the remote processor's index, and then
> > > > > > + * return a handle to the rproc. Before returning, ensure that the
> > > > > > + * parent node's driver is still loaded.
> > > > > > + *
> > > > > > + * This function increments the remote processor's refcount, so always
> > > > > > + * use rproc_put() to decrement it back once rproc isn't needed anymore.
> > > > > > + *
> > > > > > + * Return: rproc handle on success, and NULL on failure
> > > > > > + */
> > > > > > +
> > > > > > +struct rproc *rproc_get_by_id(phandle phandle, unsigned int id)
> > > > > > +{
> > > > > > + struct rproc *rproc = NULL, *r;
> > > > > > + struct platform_device *parent_pdev;
> > > > > > + struct device_node *np;
> > > > > > +
> > > > > > + np = of_find_node_by_phandle(phandle);
> > > > > > + if (!np)
> > > > > > + return NULL;
> > > > > > +
> > > > > > + parent_pdev = of_find_device_by_node(np->parent);
> > > > > > + if (!parent_pdev) {
> > > > > > + dev_err(&parent_pdev->dev,
> > > > > > + "no platform device for node %pOF\n", np);
> > > > > > + of_node_put(np);
> > > > > > + return NULL;
> > > > > > + }
> > > > > > +
> > > > > > + /* prevent underlying implementation from being removed */
> > > > > > + if (!try_module_get(parent_pdev->dev.driver->owner)) {
> > > > > > + dev_err(&parent_pdev->dev, "can't get owner\n");
> > > > > > + of_node_put(np);
> > > > > > + return NULL;
> > > > > > + }
> > > > > > +
> > > > > > + rcu_read_lock();
> > > > > > + list_for_each_entry_rcu(r, &rproc_list, node) {
> > > > > > + if (r->index == id) {
> > > > > > + rproc = r;
> > > > > > + get_device(&rproc->dev);
> > > > > > + break;
> > > > > > + }
> > > > > > + }
> > > > >
> > > > > This won't work because several remote processors can be on the list. If
> > > > > another remote processor was discovered before the one @phandle is associated
> > > > > > with, the remote processor pertaining to that previous one will be returned.
> > > > >
> > > > > I didn't understand. From my point of view passing in the phandle of the child-platform device here will work because each child-platform will have its own entry in the remoteproc list.
> > > > You are correct, each child platform device will have its own entry in
> > > > @rproc_list. The problem is that r->index may not match @id that is passed as a
> > > > parameter.
> > > >
> > > > > > Also " If another remote processor was discovered before the one" Here this prevented from what I can see because the remoteproc_list is protected by a mutex_lock. See https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2288 for the mutex_lock.
> > > > >
> > > > > Additionally the calls to zynqmp_r5_add_rproc_core() are called sequentially so this also prevents the race condition.
> > > > >
> > > > > I think I am missing something in your paragraph above. Can you expand on this issue?
> > > > As explained above, the issue is not about race conditions but the value of
> > > > r->index and @id.
> > > >
> > > > > Do you mean to say that if we use the cluster platform device you think using one of the existing APIs will work? For example rproc_get_by_child() or rproc_get_by_phandle()
> > > > >
> > > > > > At https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923 " zynqmp_r5_add_rproc_core(&child_pdev->dev);" Here if we use cluster->dev this will work? To dig deeper into this for both the Xilinx and TI R5 remoteproc drivers, I think this proposed solution will create an issue in that for Split modes, the existing getter APIs will not be able to return one of the corresponding rproc instances because both cores will refer to the same platform-device structure.
> > > > >
> > > > > I can bring up the above in the community call.
> > > > >
> > > > > There is also an issue with rproc_put().
> > > > >
> > > > > If passing the cluster platform device works for the above then rproc_put() should work correct? We can test this on our side as well. That being said I can bring this up in the community call
> > > > Yes, using the cluster platform device will work with rproc_put().
> > > >
> > > > >
> > > > > I think your description of the problem is mostly correct. The intermediate
> > > > > devices created by the cascading entries for individual remote processors in the
> > > > > device tree are causing an issue. The "compatible" string for each remote
> > > > > processor can't be handled by any platform drivers (as it should be), which
> > > > > makes try_module_get() fail because r->dev.parent->driver is not bound to
> > > > > anything.
> > > > >
> > > > > Looking at the code for Xilinx's R5F support that I just queued[1], the simplest
> > > > > solution may be to pass @dev, which is in fact @cluster->dev, to
> > > > > zynqmp_r5_add_rproc_core() rather than the device associated with the
> > > > > intermediate platform device.
> > > > >
> > > > > That _should_ work. It is hard for me to know for sure since I don't have a
> > > > > platform that has dual core remote processor to test with.
> > > > >
> > > > > Get back to me with how that turned out and we'll go from there.
> > > > >
> > > > > Thanks,
> > > > > Mathieu
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > [1]. https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923
> > >
> > > I have an update on this.
> > >
> > >
> > >
> > > I tested the following using the RPU-cluster platform device:
> > >
> > > test 1: RPU split with 2 core
> > >
> > > test 2: RPU split with 1 core
> > >
> > > test 3: lockstep RPU
> > >
> > >
> > > I tested with the zynqmp-r5-remoteproc platform probe using the (RPU)
> > > cluster platform device instead of the core/child platform device. When I
> > > used this I was unable to properly use the API rproc_get_by_phandle() and
> > > there was _only_ an issue for test 1. This was because each core will have
> > > its own call to rproc_alloc(), rproc_add() and each core's remoteproc
> > > structure has the same parent device.
> >
> > You haven't specified if my proposal worked with test 2 and 3. I'm guessing
> > that it does.
> >
> Sorry, yes tests 2 and 3 work with your proposal.
> > >
> > > This results in the later call to rproc_get_by_phandle() not behaving
> > > properly because the function will return whichever core had its entries
> > > added to the list first.
> > >
> >
> > That is a valid observation, but at least we are getting closer. The next step
> > is to find the right remote processor and I think we should look at np->name and
> > rproc->name. They should be quite close because rproc_alloc() is called with
> > dev_name(cdev).
> >
> > I will look into this further tomorrow morning if I have time, but I encourage
> > you to do the same on your side.
> >
>
> For the case where the cluster is in split mode and there are 2 child nodes
> here is my update:
>
> 1. The rproc_list has 2 entries as follows:
>
> As expected, each entry has the same r->dev.parent (i.e. the cluster node).
>
> The entries have the rproc->dev names 'remoteproc0' and 'remoteproc1'.
>
>
>
> 2. For my use case I am trying to pass in the phandle of the core node
> (child of the cluster). If I pass in the core node then
> rproc_get_by_phandle() returns NULL because the r->dev.parent->of_node does
> not match. This is expected because at rproc_alloc() we passed in the
> cluster and not the core.
>

I had a serious look into this and trying to do something with the rproc->name and
device_node->name won't work. As such, I suggest the following (uncompiled and
untested):

struct rproc *rproc_get_by_phandle(phandle phandle)
{
	struct platform_device *cluster_pdev;
	struct rproc *rproc = NULL, *r;
	struct device_driver *driver;
	struct device_node *np;

	np = of_find_node_by_phandle(phandle);
	if (!np)
		return NULL;

	rcu_read_lock();
	list_for_each_entry_rcu(r, &rproc_list, node) {
		if (r->dev.parent && r->dev.parent->of_node == np) {
			/* prevent underlying implementation from being removed */

			/*
			 * If the remoteproc's parent has a driver, the
			 * remoteproc is not part of a cluster and we can use
			 * that driver.
			 */
			driver = r->dev.parent->driver;

			/*
			 * If the remoteproc's parent does not have a driver,
			 * look for the driver associated with the cluster.
			 */
			if (!driver) {
				cluster_pdev = of_find_device_by_node(np->parent);
				if (!cluster_pdev) {
					dev_err(&r->dev, "can't get driver\n");
					break;
				}

				driver = cluster_pdev->dev.parent->driver;
				put_device(&cluster_pdev->dev);
			}

			if (!try_module_get(driver->owner)) {
				dev_err(&r->dev, "can't get owner\n");
				break;
			}

			rproc = r;
			get_device(&rproc->dev);
			break;
		}
	}
	rcu_read_unlock();

	of_node_put(np);

	return rproc;
}
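
The driver selection in the function above can also be looked at in
isolation. What follows is a userspace sketch of that one decision, not
kernel code: the model_* types and sample devices are hypothetical, and
it walks a parent pointer where the function above calls
of_find_device_by_node(np->parent). It also reads the driver from the
cluster device itself and returns NULL when no driver is found; both
are assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device_driver and struct device. */
struct model_driver {
	const char *name;
};

struct model_device {
	const struct model_driver *driver; /* NULL when no driver is bound */
	const struct model_device *parent; /* NULL at the top of the tree */
};

/*
 * Pick the driver whose module should be pinned for a remoteproc whose
 * parent device is rproc_parent: the parent's own driver when one is
 * bound, otherwise the driver of the cluster device above it.
 * Returns NULL when neither device has a driver.
 */
const struct model_driver *
model_pick_driver(const struct model_device *rproc_parent)
{
	assert(rproc_parent != NULL);

	if (rproc_parent->driver)
		return rproc_parent->driver;

	if (rproc_parent->parent)
		return rproc_parent->parent->driver;

	return NULL;
}

/* Sample topology: a bound cluster device with an unbound core child. */
const struct model_driver model_cluster_drv = { "cluster-driver" };
const struct model_device model_cluster_dev = { &model_cluster_drv, NULL };
const struct model_device model_core_dev = { NULL, &model_cluster_dev };
const struct model_device model_orphan_dev = { NULL, NULL };
```

In this model a remoteproc parented on either the cluster device or the
unbound core device resolves to the cluster driver, and a device with
no driver anywhere above it resolves to NULL, which a caller would have
to check before dereferencing.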

Let me know if that works for you.

Thanks,
Mathieu

> np->name in the loop is then the name of the cluster node in my sample device
> tree that I booted with; that is 'r5f_0', where the cluster is 'rf5ss'.
>
> If I am trying to get the rproc entry with name 'remoteproc0' and I pass the
> cluster node's phandle (that is, of rf5ss) in to rproc_get_by_phandle(), then
> the API _does_ work for getting the first entry from the rproc list.
>
> But if I am trying to get the second rproc entry (dev name 'remoteproc1') and
> I pass the same phandle into rproc_get_by_phandle(), I will still get the
> 'remoteproc0' entry because the phandle of the first entry also matches in
> the loop.
>
> Thanks
> Ben
>
> > >
> > > For reference I placed the logic for API rproc_get_by_phandle() that loops
> > > through device and the rproc_alloc() line where the dev parent is set:
> > >
> > >
> > > Here is the getter API where the loop checking the remoteproc dev parent is:
> > >
> > > https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2109
> > >
> > >
> > > > if (r->dev.parent && r->dev.parent->of_node == np) {
> > >
> > >
> > > Here is the rproc_alloc() call where they set remoteproc dev parent:
> > >
> > > https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2448
> > >
> > >
> > > > rproc->dev.parent = dev;
> > >
> > > Thanks,
> > >
> > > Ben
> > >
> >

2022-12-13 22:34:52

by Mathieu Poirier

Subject: Re: [RFC PATCH 1/1] remoteproc: Introduce rproc_get_by_id API

On Tue, Dec 13, 2022 at 02:53:18PM -0700, Mathieu Poirier wrote:
> On Fri, Dec 09, 2022 at 11:01:47AM -0800, Ben Levinsky wrote:
> > Hi Mathieu,
> >
> > On 12/8/22 11:05 AM, Mathieu Poirier wrote:
> > > On Tue, Dec 06, 2022 at 08:23:13AM -0800, Ben Levinsky wrote:
> > > > Good Morning Mathieu,
> > > >
> > > >
> > > > I did some testing and replied inline.
> > > >
> > > >
> > > > On 12/2/22 9:00 AM, Mathieu Poirier wrote:
> > > > > On Wed, Nov 30, 2022 at 09:39:33PM +0000, Levinsky, Ben wrote:
> > > > > > Hi Mathieu,
> > > > > >
> > > > > > Thank you for your review. Please see my reply inline.
> > > > > >
> > > > > > Thanks
> > > > > > Ben
> > > > > >
> > > > > > On 11/25/22, 10:05 AM, "Mathieu Poirier"<[email protected]> wrote:
> > > > > >
> > > > > >
> > > > > > Hi Ben,
> > > > > >
> > > > > > On Tue, Nov 15, 2022 at 07:37:53AM -0800, Ben Levinsky wrote:
> > > > > > > Allow users of remoteproc the ability to get a handle to an rproc by
> > > > > > > passing in node that has parent rproc device and an ID that matches
> > > > > > > an expected rproc struct's index field.
> > > > > > >
> > > > > > > This enables to get rproc structure for remoteproc drivers that manage
> > > > > > > more than 1 remote processor (e.g. TI and Xilinx R5 drivers).
> > > > > > >
> > > > > > > Signed-off-by: Ben Levinsky<[email protected]>
> > > > > > > ---
> > > > > > > drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++++++++++-
> > > > > > > include/linux/remoteproc.h | 1 +
> > > > > > > 2 files changed, 64 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > > > > > > index 775df165eb45..6f7058bcc80c 100644
> > > > > > > --- a/drivers/remoteproc/remoteproc_core.c
> > > > > > > +++ b/drivers/remoteproc/remoteproc_core.c
> > > > > > > @@ -40,6 +40,7 @@
> > > > > > > #include <linux/virtio_ring.h>
> > > > > > > #include <asm/byteorder.h>
> > > > > > > #include <linux/platform_device.h>
> > > > > > > +#include <linux/of_platform.h>
> > > > > > >
> > > > > > > #include "remoteproc_internal.h"
> > > > > > >
> > > > > > > @@ -2203,13 +2204,74 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
> > > > > > >
> > > > > > > return rproc;
> > > > > > > }
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * rproc_get_by_id() - find a remote processor by ID
> > > > > > > + * @phandle: phandle to the rproc
> > > > > > > + * @id: Index into rproc list that uniquely identifies the rproc struct
> > > > > > > + *
> > > > > > > + * Finds an rproc handle using the remote processor's index, and then
> > > > > > > + * return a handle to the rproc. Before returning, ensure that the
> > > > > > > + * parent node's driver is still loaded.
> > > > > > > + *
> > > > > > > + * This function increments the remote processor's refcount, so always
> > > > > > > + * use rproc_put() to decrement it back once rproc isn't needed anymore.
> > > > > > > + *
> > > > > > > + * Return: rproc handle on success, and NULL on failure
> > > > > > > + */
> > > > > > > +
> > > > > > > +struct rproc *rproc_get_by_id(phandle phandle, unsigned int id)
> > > > > > > +{
> > > > > > > + struct rproc *rproc = NULL, *r;
> > > > > > > + struct platform_device *parent_pdev;
> > > > > > > + struct device_node *np;
> > > > > > > +
> > > > > > > + np = of_find_node_by_phandle(phandle);
> > > > > > > + if (!np)
> > > > > > > + return NULL;
> > > > > > > +
> > > > > > > + parent_pdev = of_find_device_by_node(np->parent);
> > > > > > > + if (!parent_pdev) {
> > > > > > > + dev_err(&parent_pdev->dev,
> > > > > > > + "no platform device for node %pOF\n", np);
> > > > > > > + of_node_put(np);
> > > > > > > + return NULL;
> > > > > > > + }
> > > > > > > +
> > > > > > > + /* prevent underlying implementation from being removed */
> > > > > > > + if (!try_module_get(parent_pdev->dev.driver->owner)) {
> > > > > > > + dev_err(&parent_pdev->dev, "can't get owner\n");
> > > > > > > + of_node_put(np);
> > > > > > > + return NULL;
> > > > > > > + }
> > > > > > > +
> > > > > > > + rcu_read_lock();
> > > > > > > + list_for_each_entry_rcu(r, &rproc_list, node) {
> > > > > > > + if (r->index == id) {
> > > > > > > + rproc = r;
> > > > > > > + get_device(&rproc->dev);
> > > > > > > + break;
> > > > > > > + }
> > > > > > > + }
> > > > > >
> > > > > > This won't work because several remote processors can be on the list. If
> > > > > > another remote processor was discovered before the one @phandle is associated
> > > > > > > with, the remote processor pertaining to that previous one will be returned.
> > > > > >
> > > > > > I didn't understand. From my point of view passing in the phandle of the child-platform device here will work because each child-platform will have its own entry in the remoteproc list.
> > > > > You are correct, each child platform device will have its own entry in
> > > > > @rproc_list. The problem is that r->index may not match @id that is passed as a
> > > > > parameter.
> > > > >
> > > > > > > Also " If another remote processor was discovered before the one" Here this prevented from what I can see because the remoteproc_list is protected by a mutex_lock. See https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2288 for the mutex_lock.
> > > > > >
> > > > > > Additionally the calls to zynqmp_r5_add_rproc_core() are called sequentially so this also prevents the race condition.
> > > > > >
> > > > > > I think I am missing something in your paragraph above. Can you expand on this issue?
> > > > > As explained above, the issue is not about race conditions but the value of
> > > > > r->index and @id.
> > > > >
> > > > > > Do you mean to say that if we use the cluster platform device you think using one of the existing APIs will work? For example rproc_get_by_child() or rproc_get_by_phandle()
> > > > > >
> > > > > > > At https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923 there is "zynqmp_r5_add_rproc_core(&child_pdev->dev);". If we use cluster->dev here, will this work? To dig deeper into this: for both the Xilinx and TI R5 remoteproc drivers, I think this proposed solution will create an issue in split mode, because the existing getter APIs will not be able to return one of the corresponding rproc instances, as both cores will refer to the same platform-device structure.
> > > > > >
> > > > > > I can bring up the above in the community call.
> > > > > >
> > > > > > There is also an issue with rproc_put().
> > > > > >
> > > > > > > If passing the cluster platform device works for the above then rproc_put() should work, correct? We can test this on our side as well. That being said, I can bring this up in the community call.
> > > > > Yes, using the cluster platform device will work with rproc_put().
> > > > >
> > > > > >
> > > > > > I think your description of the problem is mostly correct. The intermediate
> > > > > > devices created by the cascading entries for individual remote processors in the
> > > > > > device tree are causing an issue. The "compatible" string for each remote
> > > > > > processor can't be handled by any platform drivers (as it should be), which
> > > > > > makes try_module_get() fail because r->dev.parent->driver is not bound to
> > > > > > anything.
> > > > > >
> > > > > > Looking at the code for Xilinx's R5F support that I just queued[1], the simplest
> > > > > > solution may be to pass @dev, which is in fact @cluster->dev, to
> > > > > > zynqmp_r5_add_rproc_core() rather than the device associated with the
> > > > > > intermediate platform device.
> > > > > >
> > > > > > That _should_ work. It is hard for me to know for sure since I don't have a
> > > > > > platform that has dual core remote processor to test with.
> > > > > >
> > > > > > Get back to me with how that turned out and we'll go from there.
> > > > > >
> > > > > > Thanks,
> > > > > > Mathieu
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > [1].https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923
> > > >
> > > > I have an update on this.
> > > >
> > > >
> > > >
> > > > I tested the following using the RPU-cluster platform device:
> > > >
> > > > test 1: RPU split with 2 core
> > > >
> > > > test 2: RPU split with 1 core
> > > >
> > > > test 3: lockstep RPU
> > > >
> > > >
> > > > I tested with the zynqmp-r5-remoteproc platform probe using the (RPU)
> > > > cluster platform device instead of the core/child platform device. When I
> > > > used this I was unable to properly use the API rproc_get_by_phandle() and
> > > > there was _only_ an issue for test 1. This was because each core will have
> > > > its own call to rproc_alloc(), rproc_add() and each core's remoteproc
> > > > structure has the same parent device.
> > >
> > > You haven't specified if my proposal worked with test 2 and 3. I'm guessing
> > > that it does.
> > >
> > Sorry, yes tests 2 and 3 work with your proposal.
> > > >
> > > > This results in the later call to rproc_get_by_phandle() not behaving
> > > > properly because the function will return whichever core had its entries
> > > > added to the list first.
> > > >
> > >
> > > That is a valid observation, but at least we are getting closer. The next step
> > > is to find the right remote processor and I think we should look at np->name and
> > > rproc->name. They should be quite close because rproc_alloc() is called with
> > > dev_name(cdev).
> > >
> > > I will look into this further tomorrow morning if I have time, but I encourage
> > > you to do the same on your side.
> > >
> >
> > For the case where the cluster is in split mode and there are 2 child nodes
> > here is my update:
> >
> > 1. The rproc_list has 2 entries as follows:
> >
> > as expected, each entry has the same r->dev.parent (i.e. the cluster node).
> >
> > The entries have the rproc->dev names 'remoteproc0' and
> > 'remoteproc1'.
> >
> >
> >
> > 2. For my use case I am trying to pass in the phandle of the core node
> > (child of the cluster). If I pass in the core node then
> > rproc_get_by_phandle() returns NULL because the r->dev.parent->of_node does
> > not match. This is expected because at rproc_alloc() we passed in the
> > cluster and not the core.
> >
>
>
> I had a serious look into this and trying to do something with the rproc->name and
> device_node->name won't work. As such, I suggest the following (uncompiled and
> untested):
>
> struct rproc *rproc_get_by_phandle(phandle phandle)
> {
> struct platform_device *cluster_pdev;
> struct rproc *rproc = NULL, *r;
> struct device_driver *driver;
> struct device_node *np;
>
> np = of_find_node_by_phandle(phandle);
> if (!np)
> return NULL;
>
> rcu_read_lock();
> list_for_each_entry_rcu(r, &rproc_list, node) {
> if (r->dev.parent && r->dev.parent->of_node == np) {
> /* prevent underlying implementation from being removed */
>
> /*
> * If the remoteproc's parent has a driver, the
> * remoteproc is not part of a cluster and we can use
> * that driver.
> */
> driver = r->dev.parent->driver;
>
> /*
> * If the remoteproc's parent does not have a driver,
> * look for the driver associated with the cluster.
> */
> if (!driver) {
> cluster_pdev = of_find_device_by_node(np->parent);
> if (!cluster_pdev) {
> dev_err(&r->dev, "can't get driver\n");
> break;
> }
>
> driver = cluster_pdev->dev.parent->driver;

This should be:

driver = cluster_pdev->dev.driver;

> put_device(&cluster_pdev->dev);
> }
>
> if (!try_module_get(driver->owner)) {
> dev_err(&r->dev, "can't get owner\n");
> break;
> }
>
> rproc = r;
> get_device(&rproc->dev);
> break;
> }
> }
> rcu_read_unlock();
>
> of_node_put(np);
>
> return rproc;
> }
>
> Let me know if that works for you.
>
> Thanks,
> Mathieu
>
>
>
> > np->name in the loop is then name of the cluster node in my sample device
> > tree that I booted with that is 'r5f_0' where the cluster is 'rf5ss'.
> >
> > If I am trying to get the rproc entry with name 'remoteproc0' and I pass in
> > to rproc_get_by_phandle() the cluster node's phandle (that is of rf5ss) then
> > the API _does_ work for getting the first entry from the rproc list.
> >
> > But if I am trying to get the second rproc entry (dev name 'remoteproc1') and I
> > pass the same phandle into rproc_get_by_phandle(), I will still get the
> > 'remoteproc0' entry because the first entry also matches that phandle in the loop.
> >
> > Thanks
> > Ben
> >
> > > >
> > > > For reference I placed the logic for API rproc_get_by_phandle() that loops
> > > > through device and the rproc_alloc() line where the dev parent is set:
> > > >
> > > >
> > > > Here is the getter API where the loop checking the remoteproc dev parent is:
> > > >
> > > > https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2109
> > > >
> > > >
> > > > if (r->dev.parent && r->dev.parent->of_node == np) {
> > > >
> > > >
> > > > Here is the rproc_alloc() call where they set remoteproc dev parent:
> > > >
> > > > https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2448
> > > >
> > > >
> > > > rproc->dev.parent = dev;
> > > >
> > > > Thanks,
> > > >
> > > > Ben
> > > >
> > >

2022-12-14 17:33:03

by Ben Levinsky

Subject: Re: [RFC PATCH 1/1] remoteproc: Introduce rproc_get_by_id API

Can confirm this works for my use case! Thank you!

On 12/14/22 9:16 AM, Levinsky, Ben wrote:
>
>
> On 12/13/22, 2:21 PM, "Mathieu Poirier" <[email protected]> wrote:
>
> On Tue, Dec 13, 2022 at 02:53:18PM -0700, Mathieu Poirier wrote:
> > On Fri, Dec 09, 2022 at 11:01:47AM -0800, Ben Levinsky wrote:
> > > Hi Mathieu,
> > >
> > > On 12/8/22 11:05 AM, Mathieu Poirier wrote:
> > > > On Tue, Dec 06, 2022 at 08:23:13AM -0800, Ben Levinsky wrote:
> > > > > Good Morning Mathieu,
> > > > >
> > > > >
> > > > > I did some testing and replied inline.
> > > > >
> > > > >
> > > > > On 12/2/22 9:00 AM, Mathieu Poirier wrote:
> > > > > > On Wed, Nov 30, 2022 at 09:39:33PM +0000, Levinsky, Ben wrote:
> > > > > > > Hi Mathieu,
> > > > > > >
> > > > > > > Thank you for your review. Please see my reply inline.
> > > > > > >
> > > > > > > Thanks
> > > > > > > Ben
> > > > > > >
> > > > > > > On 11/25/22, 10:05 AM, "Mathieu Poirier"<[email protected]> wrote:
> > > > > > >
> > > > > > > CAUTION: This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.
> > > > > > >
> > > > > > >
> > > > > > > Hi Ben,
> > > > > > >
> > > > > > > On Tue, Nov 15, 2022 at 07:37:53AM -0800, Ben Levinsky wrote:
> > > > > > > > Allow users of remoteproc the ability to get a handle to an rproc by
> > > > > > > > passing in node that has parent rproc device and an ID that matches
> > > > > > > > an expected rproc struct's index field.
> > > > > > > >
> > > > > > > > This enables to get rproc structure for remoteproc drivers that manage
> > > > > > > > more than 1 remote processor (e.g. TI and Xilinx R5 drivers).
> > > > > > > >
> > > > > > > > Signed-off-by: Ben Levinsky<[email protected]>
> > > > > > > > ---
> > > > > > > > drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++++++++++-
> > > > > > > > include/linux/remoteproc.h | 1 +
> > > > > > > > 2 files changed, 64 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > > > > > > > index 775df165eb45..6f7058bcc80c 100644
> > > > > > > > --- a/drivers/remoteproc/remoteproc_core.c
> > > > > > > > +++ b/drivers/remoteproc/remoteproc_core.c
> > > > > > > > @@ -40,6 +40,7 @@
> > > > > > > > #include <linux/virtio_ring.h>
> > > > > > > > #include <asm/byteorder.h>
> > > > > > > > #include <linux/platform_device.h>
> > > > > > > > +#include <linux/of_platform.h>
> > > > > > > >
> > > > > > > > #include "remoteproc_internal.h"
> > > > > > > >
> > > > > > > > @@ -2203,13 +2204,74 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
> > > > > > > >
> > > > > > > > return rproc;
> > > > > > > > }
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * rproc_get_by_id() - find a remote processor by ID
> > > > > > > > + * @phandle: phandle to the rproc
> > > > > > > > + * @id: Index into rproc list that uniquely identifies the rproc struct
> > > > > > > > + *
> > > > > > > > + * Finds an rproc handle using the remote processor's index, and then
> > > > > > > > + * return a handle to the rproc. Before returning, ensure that the
> > > > > > > > + * parent node's driver is still loaded.
> > > > > > > > + *
> > > > > > > > + * This function increments the remote processor's refcount, so always
> > > > > > > > + * use rproc_put() to decrement it back once rproc isn't needed anymore.
> > > > > > > > + *
> > > > > > > > + * Return: rproc handle on success, and NULL on failure
> > > > > > > > + */
> > > > > > > > +
> > > > > > > > +struct rproc *rproc_get_by_id(phandle phandle, unsigned int id)
> > > > > > > > +{
> > > > > > > > + struct rproc *rproc = NULL, *r;
> > > > > > > > + struct platform_device *parent_pdev;
> > > > > > > > + struct device_node *np;
> > > > > > > > +
> > > > > > > > + np = of_find_node_by_phandle(phandle);
> > > > > > > > + if (!np)
> > > > > > > > + return NULL;
> > > > > > > > +
> > > > > > > > + parent_pdev = of_find_device_by_node(np->parent);
> > > > > > > > + if (!parent_pdev) {
> > > > > > > > + dev_err(&parent_pdev->dev,
> > > > > > > > + "no platform device for node %pOF\n", np);
> > > > > > > > + of_node_put(np);
> > > > > > > > + return NULL;
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > + /* prevent underlying implementation from being removed */
> > > > > > > > + if (!try_module_get(parent_pdev->dev.driver->owner)) {
> > > > > > > > + dev_err(&parent_pdev->dev, "can't get owner\n");
> > > > > > > > + of_node_put(np);
> > > > > > > > + return NULL;
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > + rcu_read_lock();
> > > > > > > > + list_for_each_entry_rcu(r, &rproc_list, node) {
> > > > > > > > + if (r->index == id) {
> > > > > > > > + rproc = r;
> > > > > > > > + get_device(&rproc->dev);
> > > > > > > > + break;
> > > > > > > > + }
> > > > > > > > + }
> > > > > > >
> > > > > > > This won't work because several remote processors can be on the list. If
> > > > > > > another remote processor was discovered before the one @phandle is associated
> > > > > > > > with, the remote processor pertaining to that previous one will be returned.
> > > > > > >
> > > > > > > I didn't understand. From my point of view passing in the phandle of the child-platform device here will work because each child-platform will have its own entry in the remoteproc list.
> > > > > > You are correct, each child platform device will have its own entry in
> > > > > > @rproc_list. The problem is that r->index may not match @id that is passed as a
> > > > > > parameter.
> > > > > >
> > > > > > > > Also, regarding "If another remote processor was discovered before the one": from what I can see this is prevented because the remoteproc_list is protected by a mutex_lock. See https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2288 for the mutex_lock.
> > > > > > >
> > > > > > > Additionally the calls to zynqmp_r5_add_rproc_core() are called sequentially so this also prevents the race condition.
> > > > > > >
> > > > > > > I think I am missing something in your paragraph above. Can you expand on this issue?
> > > > > > As explained above, the issue is not about race conditions but the value of
> > > > > > r->index and @id.
> > > > > >
> > > > > > > Do you mean to say that if we use the cluster platform device you think using one of the existing APIs will work? For example rproc_get_by_child() or rproc_get_by_phandle()
> > > > > > >
> > > > > > > > At https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923 there is "zynqmp_r5_add_rproc_core(&child_pdev->dev);". If we use cluster->dev here, will this work? To dig deeper into this: for both the Xilinx and TI R5 remoteproc drivers, I think this proposed solution will create an issue in split mode, because the existing getter APIs will not be able to return one of the corresponding rproc instances, as both cores will refer to the same platform-device structure.
> > > > > > >
> > > > > > > I can bring up the above in the community call.
> > > > > > >
> > > > > > > There is also an issue with rproc_put().
> > > > > > >
> > > > > > > > If passing the cluster platform device works for the above then rproc_put() should work, correct? We can test this on our side as well. That being said, I can bring this up in the community call.
> > > > > > Yes, using the cluster platform device will work with rproc_put().
> > > > > >
> > > > > > >
> > > > > > > I think your description of the problem is mostly correct. The intermediate
> > > > > > > devices created by the cascading entries for individual remote processors in the
> > > > > > > device tree are causing an issue. The "compatible" string for each remote
> > > > > > > processor can't be handled by any platform drivers (as it should be), which
> > > > > > > makes try_module_get() fail because r->dev.parent->driver is not bound to
> > > > > > > anything.
> > > > > > >
> > > > > > > Looking at the code for Xilinx's R5F support that I just queued[1], the simplest
> > > > > > > solution may be to pass @dev, which is in fact @cluster->dev, to
> > > > > > > zynqmp_r5_add_rproc_core() rather than the device associated with the
> > > > > > > intermediate platform device.
> > > > > > >
> > > > > > > That _should_ work. It is hard for me to know for sure since I don't have a
> > > > > > > platform that has dual core remote processor to test with.
> > > > > > >
> > > > > > > Get back to me with how that turned out and we'll go from there.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Mathieu
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > [1].https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923
> > > > >
> > > > > I have an update on this.
> > > > >
> > > > >
> > > > >
> > > > > I tested the following using the RPU-cluster platform device:
> > > > >
> > > > > test 1: RPU split with 2 core
> > > > >
> > > > > test 2: RPU split with 1 core
> > > > >
> > > > > test 3: lockstep RPU
> > > > >
> > > > >
> > > > > I tested with the zynqmp-r5-remoteproc platform probe using the (RPU)
> > > > > cluster platform device instead of the core/child platform device. When I
> > > > > used this I was unable to properly use the API rproc_get_by_phandle() and
> > > > > there was _only_ an issue for test 1. This was because each core will have
> > > > > its own call to rproc_alloc(), rproc_add() and each core's remoteproc
> > > > > structure has the same parent device.
> > > >
> > > > You haven't specified if my proposal worked with test 2 and 3. I'm guessing
> > > > that it does.
> > > >
> > > Sorry, yes tests 2 and 3 work with your proposal.
> > > > >
> > > > > This results in the later call to rproc_get_by_phandle() not behaving
> > > > > properly because the function will return whichever core had its entries
> > > > > added to the list first.
> > > > >
> > > >
> > > > That is a valid observation, but at least we are getting closer. The next step
> > > > is to find the right remote processor and I think we should look at np->name and
> > > > rproc->name. They should be quite close because rproc_alloc() is called with
> > > > dev_name(cdev).
> > > >
> > > > I will look into this further tomorrow morning if I have time, but I encourage
> > > > you to do the same on your side.
> > > >
> > >
> > > For the case where the cluster is in split mode and there are 2 child nodes
> > > here is my update:
> > >
> > > 1. The rproc_list has 2 entries as follows:
> > >
> > > as expected, each entry has the same r->dev.parent (i.e. the cluster node).
> > >
> > > The entries have the rproc->dev names 'remoteproc0' and
> > > 'remoteproc1'.
> > >
> > >
> > >
> > > 2. For my use case I am trying to pass in the phandle of the core node
> > > (child of the cluster). If I pass in the core node then
> > > rproc_get_by_phandle() returns NULL because the r->dev.parent->of_node does
> > > not match. This is expected because at rproc_alloc() we passed in the
> > > cluster and not the core.
> > >
> >
> >
> > I had a serious look into this and trying to do something with the rproc->name and
> > device_node->name won't work. As such, I suggest the following (uncompiled and
> > untested):
> >
> > struct rproc *rproc_get_by_phandle(phandle phandle)
> > {
> > struct platform_device *cluster_pdev;
> > struct rproc *rproc = NULL, *r;
> > struct device_driver *driver;
> > struct device_node *np;
> >
> > np = of_find_node_by_phandle(phandle);
> > if (!np)
> > return NULL;
> >
> > rcu_read_lock();
> > list_for_each_entry_rcu(r, &rproc_list, node) {
> > if (r->dev.parent && r->dev.parent->of_node == np) {
> > /* prevent underlying implementation from being removed */
> >
> > /*
> > * If the remoteproc's parent has a driver, the
> > * remoteproc is not part of a cluster and we can use
> > * that driver.
> > */
> > driver = r->dev.parent->driver;
> >
> > /*
> > * If the remoteproc's parent does not have a driver,
> > * look for the driver associated with the cluster.
> > */
> > if (!driver) {
> > cluster_pdev = of_find_device_by_node(np->parent);
> > if (!cluster_pdev) {
> > dev_err(&r->dev, "can't get driver\n");
> > break;
> > }
> >
> > driver = cluster_pdev->dev.parent->driver;
>
> This should be:
>
> driver = cluster_pdev->dev.driver;
>
> > put_device(&cluster_pdev->dev);
> > }
> >
> > if (!try_module_get(driver->owner)) {
> > dev_err(&r->dev, "can't get owner\n");
> > break;
> > }
> >
> > rproc = r;
> > get_device(&rproc->dev);
> > break;
> > }
> > }
> > rcu_read_unlock();
> >
> > of_node_put(np);
> >
> > return rproc;
> > }
> >
> > Let me know if that works for you.
> >
> > Thanks,
> > Mathieu
> >
> >
> >
> > > np->name in the loop is then name of the cluster node in my sample device
> > > tree that I booted with that is 'r5f_0' where the cluster is 'rf5ss'.
> > >
> > > If I am trying to get the rproc entry with name 'remoteproc0' and I pass in
> > > to rproc_get_by_phandle() the cluster node's phandle (that is of rf5ss) then
> > > the API _does_ work for getting the first entry from the rproc list.
> > >
> > > But if I am trying to get the second rproc entry (dev name 'remoteproc1') and I
> > > pass the same phandle into rproc_get_by_phandle(), I will still get the
> > > 'remoteproc0' entry because the first entry also matches that phandle in the loop.
> > >
> > > Thanks
> > > Ben
> > >
> > > > >
> > > > > For reference I placed the logic for API rproc_get_by_phandle() that loops
> > > > > through device and the rproc_alloc() line where the dev parent is set:
> > > > >
> > > > >
> > > > > Here is the getter API where the loop checking the remoteproc dev parent is:
> > > > >
> > > > > https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2109
> > > > >
> > > > >
> > > > > if (r->dev.parent && r->dev.parent->of_node == np) {
> > > > >
> > > > >
> > > > > Here is the rproc_alloc() call where they set remoteproc dev parent:
> > > > >
> > > > > https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2448
> > > > >
> > > > >
> > > > > rproc->dev.parent = dev;
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Ben
> > > > >
> > > >
>
>
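
For context on the split-mode failure described in the quoted text (both cores registered with the cluster device as parent), the first-match behaviour can be reproduced in a few lines of userspace C. This is a sketch under assumptions: the sketch_* names are hypothetical and only model the "r->dev.parent->of_node == np" comparison, nothing else from the kernel.

```c
#include <stddef.h>

/* Hypothetical stand-ins; only the parent/of_node relationship is modelled. */
struct sketch_node   { const char *name; };
struct sketch_device { struct sketch_node *of_node; };
struct sketch_rproc  { const char *name; struct sketch_device *parent; };

/* Returns the first entry whose parent device's of_node equals np, or NULL. */
static struct sketch_rproc *sketch_first_match(struct sketch_rproc *list,
					       size_t n, struct sketch_node *np)
{
	for (size_t i = 0; i < n; i++)
		if (list[i].parent && list[i].parent->of_node == np)
			return &list[i];
	return NULL;
}
```

With two entries that share the cluster device as parent, a lookup by the cluster node always yields the first entry, and a lookup by a core node yields nothing. That matches the two observations reported for test 1 and is the reason the parent passed to rproc_alloc() has to stay per-core for the phandle comparison to tell the cores apart.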

2022-12-14 21:46:10

by Mathieu Poirier

Subject: Re: [RFC PATCH 1/1] remoteproc: Introduce rproc_get_by_id API

On Wed, 14 Dec 2022 at 10:17, Ben Levinsky <[email protected]> wrote:
>
> Can confirm this works for my use case! Thank you!
>

Thanks for getting back to me. I will send an official patch in the
coming hour, please add your R-B or T-B and I will queue it when
6.2-rc2 comes out on the 26th.

> On 12/14/22 9:16 AM, Levinsky, Ben wrote:
> >
> >
> > On 12/13/22, 2:21 PM, "Mathieu Poirier" <[email protected]> wrote:
> >
> > On Tue, Dec 13, 2022 at 02:53:18PM -0700, Mathieu Poirier wrote:
> > > On Fri, Dec 09, 2022 at 11:01:47AM -0800, Ben Levinsky wrote:
> > > > Hi Mathieu,
> > > >
> > > > On 12/8/22 11:05 AM, Mathieu Poirier wrote:
> > > > > On Tue, Dec 06, 2022 at 08:23:13AM -0800, Ben Levinsky wrote:
> > > > > > Good Morning Mathieu,
> > > > > >
> > > > > >
> > > > > > I did some testing and replied inline.
> > > > > >
> > > > > >
> > > > > > On 12/2/22 9:00 AM, Mathieu Poirier wrote:
> > > > > > > On Wed, Nov 30, 2022 at 09:39:33PM +0000, Levinsky, Ben wrote:
> > > > > > > > Hi Mathieu,
> > > > > > > >
> > > > > > > > Thank you for your review. Please see my reply inline.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > > Ben
> > > > > > > >
> > > > > > > > On 11/25/22, 10:05 AM, "Mathieu Poirier"<[email protected]> wrote:
> > > > > > > >
> > > > > > > > CAUTION: This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.
> > > > > > > >
> > > > > > > >
> > > > > > > > Hi Ben,
> > > > > > > >
> > > > > > > > On Tue, Nov 15, 2022 at 07:37:53AM -0800, Ben Levinsky wrote:
> > > > > > > > > Allow users of remoteproc the ability to get a handle to an rproc by
> > > > > > > > > passing in node that has parent rproc device and an ID that matches
> > > > > > > > > an expected rproc struct's index field.
> > > > > > > > >
> > > > > > > > > This enables to get rproc structure for remoteproc drivers that manage
> > > > > > > > > more than 1 remote processor (e.g. TI and Xilinx R5 drivers).
> > > > > > > > >
> > > > > > > > > Signed-off-by: Ben Levinsky<[email protected]>
> > > > > > > > > ---
> > > > > > > > > drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++++++++++-
> > > > > > > > > include/linux/remoteproc.h | 1 +
> > > > > > > > > 2 files changed, 64 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > > > > > > > > index 775df165eb45..6f7058bcc80c 100644
> > > > > > > > > --- a/drivers/remoteproc/remoteproc_core.c
> > > > > > > > > +++ b/drivers/remoteproc/remoteproc_core.c
> > > > > > > > > @@ -40,6 +40,7 @@
> > > > > > > > > #include <linux/virtio_ring.h>
> > > > > > > > > #include <asm/byteorder.h>
> > > > > > > > > #include <linux/platform_device.h>
> > > > > > > > > +#include <linux/of_platform.h>
> > > > > > > > >
> > > > > > > > > #include "remoteproc_internal.h"
> > > > > > > > >
> > > > > > > > > @@ -2203,13 +2204,74 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
> > > > > > > > >
> > > > > > > > > return rproc;
> > > > > > > > > }
> > > > > > > > > +
> > > > > > > > > +/**
> > > > > > > > > + * rproc_get_by_id() - find a remote processor by ID
> > > > > > > > > + * @phandle: phandle to the rproc
> > > > > > > > > + * @id: Index into rproc list that uniquely identifies the rproc struct
> > > > > > > > > + *
> > > > > > > > > + * Finds an rproc handle using the remote processor's index, and then
> > > > > > > > > + * return a handle to the rproc. Before returning, ensure that the
> > > > > > > > > + * parent node's driver is still loaded.
> > > > > > > > > + *
> > > > > > > > > + * This function increments the remote processor's refcount, so always
> > > > > > > > > + * use rproc_put() to decrement it back once rproc isn't needed anymore.
> > > > > > > > > + *
> > > > > > > > > + * Return: rproc handle on success, and NULL on failure
> > > > > > > > > + */
> > > > > > > > > +
> > > > > > > > > +struct rproc *rproc_get_by_id(phandle phandle, unsigned int id)
> > > > > > > > > +{
> > > > > > > > > + struct rproc *rproc = NULL, *r;
> > > > > > > > > + struct platform_device *parent_pdev;
> > > > > > > > > + struct device_node *np;
> > > > > > > > > +
> > > > > > > > > + np = of_find_node_by_phandle(phandle);
> > > > > > > > > + if (!np)
> > > > > > > > > + return NULL;
> > > > > > > > > +
> > > > > > > > > + parent_pdev = of_find_device_by_node(np->parent);
> > > > > > > > > + if (!parent_pdev) {
> > > > > > > > > + dev_err(&parent_pdev->dev,
> > > > > > > > > + "no platform device for node %pOF\n", np);
> > > > > > > > > + of_node_put(np);
> > > > > > > > > + return NULL;
> > > > > > > > > + }
> > > > > > > > > +
> > > > > > > > > + /* prevent underlying implementation from being removed */
> > > > > > > > > + if (!try_module_get(parent_pdev->dev.driver->owner)) {
> > > > > > > > > + dev_err(&parent_pdev->dev, "can't get owner\n");
> > > > > > > > > + of_node_put(np);
> > > > > > > > > + return NULL;
> > > > > > > > > + }
> > > > > > > > > +
> > > > > > > > > + rcu_read_lock();
> > > > > > > > > + list_for_each_entry_rcu(r, &rproc_list, node) {
> > > > > > > > > + if (r->index == id) {
> > > > > > > > > + rproc = r;
> > > > > > > > > + get_device(&rproc->dev);
> > > > > > > > > + break;
> > > > > > > > > + }
> > > > > > > > > + }
> > > > > > > >
> > > > > > > > This won't work because several remote processors can be on the list. If
> > > > > > > > another remote processor was discovered before the one @phandle is associated
> > > > > > > > > with, the remote processor pertaining to that previous one will be returned.
> > > > > > > >
> > > > > > > > I didn't understand. From my point of view passing in the phandle of the child-platform device here will work because each child-platform will have its own entry in the remoteproc list.
> > > > > > > You are correct, each child platform device will have its own entry in
> > > > > > > @rproc_list. The problem is that r->index may not match @id that is passed as a
> > > > > > > parameter.
> > > > > > >
> > > > > > > > > Also, regarding "If another remote processor was discovered before the one": from what I can see this is prevented because the remoteproc_list is protected by a mutex_lock. See https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2288 for the mutex_lock.
> > > > > > > >
> > > > > > > > Additionally the calls to zynqmp_r5_add_rproc_core() are called sequentially so this also prevents the race condition.
> > > > > > > >
> > > > > > > > I think I am missing something in your paragraph above. Can you expand on this issue?
> > > > > > > As explained above, the issue is not about race conditions but the value of
> > > > > > > r->index and @id.
> > > > > > >
> > > > > > > > Do you mean to say that if we use the cluster platform device you think using one of the existing APIs will work? For example rproc_get_by_child() or rproc_get_by_phandle()
> > > > > > > >
> > > > > > > > > At https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923 there is "zynqmp_r5_add_rproc_core(&child_pdev->dev);". If we use cluster->dev here, will this work? To dig deeper into this: for both the Xilinx and TI R5 remoteproc drivers, I think this proposed solution will create an issue in split mode, because the existing getter APIs will not be able to return one of the corresponding rproc instances, as both cores will refer to the same platform-device structure.
> > > > > > > >
> > > > > > > > I can bring up the above in the community call.
> > > > > > > >
> > > > > > > > There is also an issue with rproc_put().
> > > > > > > >
> > > > > > > > > If passing the cluster platform device works for the above then rproc_put() should work, correct? We can test this on our side as well. That being said, I can bring this up in the community call.
> > > > > > > Yes, using the cluster platform device will work with rproc_put().
> > > > > > >
> > > > > > > >
> > > > > > > > I think your description of the problem is mostly correct. The intermediate
> > > > > > > > devices created by the cascading entries for individual remote processors in the
> > > > > > > > device tree are causing an issue. The "compatible" string for each remote
> > > > > > > > processor can't be handled by any platform drivers (as it should be), which
> > > > > > > > makes try_module_get() fail because r->dev.parent->driver is not bound to
> > > > > > > > anything.
> > > > > > > >
> > > > > > > > Looking at the code for Xilinx's R5F support that I just queued[1], the simplest
> > > > > > > > solution may be to pass @dev, which is in fact @cluster->dev, to
> > > > > > > > zynqmp_r5_add_rproc_core() rather than the device associated with the
> > > > > > > > intermediate platform device.
> > > > > > > >
> > > > > > > > That _should_ work. It is hard for me to know for sure since I don't have a
> > > > > > > > platform that has dual core remote processor to test with.
> > > > > > > >
> > > > > > > > Get back to me with how that turned out and we'll go from there.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Mathieu
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > [1]. https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git/tree/drivers/remoteproc/xlnx_r5_remoteproc.c?h=rproc-next#n923
> > > > > >
> > > > > > I have an update on this.
> > > > > >
> > > > > >
> > > > > >
> > > > > > I tested the following using the RPU-cluster platform device:
> > > > > >
> > > > > > test 1: RPU split with 2 core
> > > > > >
> > > > > > test 2: RPU split with 1 core
> > > > > >
> > > > > > test 3: lockstep RPU
> > > > > >
> > > > > >
> > > > > > I tested with the zynqmp-r5-remoteproc platform probe using the (RPU)
> > > > > > cluster platform device instead of the core/child platform device. When I
> > > > > > used this I was unable to properly use the API rproc_get_by_phandle() and
> > > > > > there was _only_ an issue for test 1. This was because each core will have
> > > > > > its own call to rproc_alloc(), rproc_add() and each core's remoteproc
> > > > > > structure has the same parent device.
> > > > >
> > > > > You haven't specified if my proposal worked with test 2 and 3. I'm guessing
> > > > > that it does.
> > > > >
> > > > Sorry, yes tests 2 and 3 work with your proposal.
> > > > > >
> > > > > > This results in the later call to rproc_get_by_phandle() not behaving
> > > > > > properly because the function will return whichever core had its entries
> > > > > > added to the list first.
> > > > > >
> > > > >
> > > > > That is a valid observation, but at least we are getting closer. The next step
> > > > > is to find the right remote processor and I think we should look at np->name and
> > > > > rproc->name. They should be quite close because rproc_alloc() is called with
> > > > > dev_name(cdev).
> > > > >
> > > > > I will look into this further tomorrow morning if I have time, but I encourage
> > > > > you to do the same on your side.
> > > > >
> > > >
> > > > For the case where the cluster is in split mode and there are 2 child nodes
> > > > here is my update:
> > > >
> > > > 1. The rproc_list has 2 entries as follows:
> > > >
> > > > As expected, each entry has the same r->dev.parent (i.e. the cluster node).
> > > >
> > > > The entries have the rproc->dev names 'remoteproc0' and
> > > > 'remoteproc1'.
> > > >
> > > >
> > > >
> > > > 2. For my use case I am trying to pass in the phandle of the core node
> > > > (child of the cluster). If I pass in the core node then
> > > > rproc_get_by_phandle() returns NULL because the r->dev.parent->of_node does
> > > > not match. This is expected because at rproc_alloc() we passed in the
> > > > cluster and not the core.
> > > >
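To make the two split-mode observations above concrete, here is a minimal
user-space sketch (not kernel code) of the matching rule quoted in this thread,
r->dev.parent->of_node == np. The model_* types and functions are hypothetical
stand-ins for struct device_node, struct device and struct rproc, and the node
names are only borrowed from the sample names mentioned in the thread. Under
those assumptions, with both cores registered against the cluster device, a
lookup by the cluster node returns whichever entry was added first, and a
lookup by a core node finds nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device_node, struct device, struct rproc. */
struct model_node { const char *name; };
struct model_dev { const struct model_node *of_node; };
struct model_rproc { const char *name; const struct model_dev *parent; };

/*
 * Models the match used by rproc_get_by_phandle(): return the first entry
 * whose parent device's of_node equals np, or NULL if none matches.
 */
const struct model_rproc *model_get_by_node(const struct model_rproc *list,
					    size_t n,
					    const struct model_node *np)
{
	assert(list != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (list[i].parent && list[i].parent->of_node == np)
			return &list[i];
	}
	return NULL;
}

/* Returns 1 when both split-mode observations hold in this model. */
int model_split_mode_check(void)
{
	const struct model_node cluster = { "rf5ss" };
	const struct model_node core0 = { "r5f_0" };
	const struct model_dev cluster_dev = { &cluster };
	/* Both cores were allocated with the cluster device as parent. */
	const struct model_rproc list[] = {
		{ "remoteproc0", &cluster_dev },
		{ "remoteproc1", &cluster_dev },
	};
	const struct model_rproc *by_cluster, *by_core;

	/* Cluster node: always the first entry, 'remoteproc1' is unreachable. */
	by_cluster = model_get_by_node(list, 2, &cluster);
	/* Core node: no parent carries this of_node, so nothing matches. */
	by_core = model_get_by_node(list, 2, &core0);

	return by_cluster == &list[0] && by_core == NULL;
}
```

Calling model_split_mode_check() exercises both lookups. The point of the model
is only that the parent's of_node is the sole key, so two entries that share a
parent cannot be told apart.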
> > >
> > >
> > > I had a serious look into this and trying to do something with the rproc->name and
> > > device_node->name won't work. As such, I suggest the following (uncompiled and
> > > untested):
> > >
> > > struct rproc *rproc_get_by_phandle(phandle phandle)
> > > {
> > > 	struct platform_device *cluster_pdev;
> > > 	struct rproc *rproc = NULL, *r;
> > > 	struct device_driver *driver;
> > > 	struct device_node *np;
> > >
> > > 	np = of_find_node_by_phandle(phandle);
> > > 	if (!np)
> > > 		return NULL;
> > >
> > > 	rcu_read_lock();
> > > 	list_for_each_entry_rcu(r, &rproc_list, node) {
> > > 		if (r->dev.parent && r->dev.parent->of_node == np) {
> > > 			/* prevent underlying implementation from being removed */
> > >
> > > 			/*
> > > 			 * If the remoteproc's parent has a driver, the
> > > 			 * remoteproc is not part of a cluster and we can use
> > > 			 * that driver.
> > > 			 */
> > > 			driver = r->dev.parent->driver;
> > >
> > > 			/*
> > > 			 * If the remoteproc's parent does not have a driver,
> > > 			 * look for the driver associated with the cluster.
> > > 			 */
> > > 			if (!driver) {
> > > 				cluster_pdev = of_find_device_by_node(np->parent);
> > > 				if (!cluster_pdev) {
> > > 					dev_err(&r->dev, "can't get driver\n");
> > > 					break;
> > > 				}
> > >
> > > 				driver = cluster_pdev->dev.parent->driver;
> >
> > This should be:
> >
> > driver = cluster_pdev->dev.driver;
> >
> > > 				put_device(&cluster_pdev->dev);
> > > 			}
> > >
> > > 			if (!try_module_get(driver->owner)) {
> > > 				dev_err(&r->dev, "can't get owner\n");
> > > 				break;
> > > 			}
> > >
> > > 			rproc = r;
> > > 			get_device(&rproc->dev);
> > > 			break;
> > > 		}
> > > 	}
> > > 	rcu_read_unlock();
> > >
> > > 	of_node_put(np);
> > >
> > > 	return rproc;
> > > }
> > >
> > > Let me know if that works for you.
> > >
> > > Thanks,
> > > Mathieu
> > >
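The driver fallback in the draft function above can be modelled in user space
to check the reasoning. This is only a sketch under stated assumptions: the
model_* types and model_find_dev() are hypothetical stand-ins for struct
device_node, struct device, struct device_driver and of_find_device_by_node(),
and no reference counting or module pinning is modelled. It uses the corrected
form (the cluster device's own driver) and, unlike the draft, returns NULL
instead of dereferencing a missing driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named in the thread. */
struct model_driver { const char *name; };
struct model_node { const struct model_node *parent; };
struct model_dev {
	const struct model_node *of_node;
	const struct model_driver *driver;	/* NULL when no driver is bound */
};

/* Stand-in for of_find_device_by_node(): linear search of a device table. */
const struct model_dev *model_find_dev(const struct model_dev *devs, size_t n,
				       const struct model_node *np)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].of_node == np)
			return &devs[i];
	}
	return NULL;
}

/*
 * Pick the driver to pin for a remoteproc whose parent device is 'parent':
 * the parent's own driver if bound, otherwise the driver bound to the device
 * of the parent node (the cluster). Returns NULL if neither exists.
 */
const struct model_driver *model_resolve_driver(const struct model_dev *parent,
						const struct model_dev *devs,
						size_t n)
{
	const struct model_dev *cluster_dev;

	assert(parent != NULL);

	if (parent->driver)
		return parent->driver;
	if (!parent->of_node || !parent->of_node->parent)
		return NULL;

	cluster_dev = model_find_dev(devs, n, parent->of_node->parent);
	return cluster_dev ? cluster_dev->driver : NULL;
}

/* Returns 1 when the three cases below resolve as expected. */
int model_fallback_check(void)
{
	const struct model_driver drv = { "cluster-driver" };
	const struct model_node cluster = { NULL };
	const struct model_node core = { &cluster };
	const struct model_node orphan = { NULL };
	const struct model_dev devs[] = {
		{ &cluster, &drv },	/* cluster device, driver bound */
		{ &core, NULL },	/* core device, no driver bound */
		{ &orphan, NULL },	/* no driver and no parent node */
	};

	return model_resolve_driver(&devs[0], devs, 3) == &drv &&
	       model_resolve_driver(&devs[1], devs, 3) == &drv &&
	       model_resolve_driver(&devs[2], devs, 3) == NULL;
}
```

In the draft, driver->owner is read unconditionally, so a cluster device with no
driver bound would be a NULL dereference. Whether that case can occur in
practice is not established in this thread.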
> > >
> > >
> > > > np->name in the loop is the name of the cluster node in the sample device
> > > > tree that I booted with, that is 'r5f_0' where the cluster is 'rf5ss'.
> > > >
> > > > If I am trying to get the rproc entry with name 'remoteproc0' and I pass
> > > > the cluster node's phandle (that is, of rf5ss) to rproc_get_by_phandle(),
> > > > then the API _does_ work for getting the first entry from the rproc list.
> > > >
> > > > But if I am trying to get the second rproc entry (dev name 'remoteproc1')
> > > > and I pass the cluster node's phandle into rproc_get_by_phandle(), I will
> > > > still get the 'remoteproc0' entry because the phandle of the first entry
> > > > also matches in the loop.
> > > >
> > > > Thanks
> > > > Ben
> > > >
> > > > > >
> > > > > > For reference I placed the logic for API rproc_get_by_phandle() that loops
> > > > > > through device and the rproc_alloc() line where the dev parent is set:
> > > > > >
> > > > > >
> > > > > > Here is the getter API where the loop checking the remoteproc dev parent is:
> > > > > >
> > > > > > https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2109
> > > > > >
> > > > > >
> > > > > > > if (r->dev.parent && r->dev.parent->of_node == np) {
> > > > > >
> > > > > >
> > > > > > Here is the rproc_alloc() call where they set remoteproc dev parent:
> > > > > >
> > > > > > https://github.com/torvalds/linux/blob/master/drivers/remoteproc/remoteproc_core.c#L2448
> > > > > >
> > > > > >
> > > > > > > rproc->dev.parent = dev;
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Ben
> > > > > >
> > > > >
> >
> >