2023-01-31 16:08:47

by Christian Marangi

Subject: [PATCH v2 1/2] clk: Warn and add workaround on misuse of .parent_data with .name only

A simple mistake in a .parent_names to .parent_data conversion revealed
that the clk core assumes .fw_name is always provided in the parent_data
struct for each parent and never falls back to .name to get the parent
name, even if .name is declared.

This is caused by clk_core_get(), which only checks the parent's
.fw_name and doesn't handle .name.

While it's sane to ask the developer to do the conversion correctly and
add both .fw_name and .name to a parent_data struct, it's not sane to
silently drop parents without a warning.

Fix this in two ways: add a kernel warning when a wrong implementation
is used, and copy .name into .fw_name in the parent map populate
function to avoid clk problems and malfunctions.

Signed-off-by: Christian Marangi <[email protected]>
---
drivers/clk/clk.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 57b83665e5c3..dccd4ea6f692 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -4015,10 +4015,21 @@ static int clk_core_populate_parent_map(struct clk_core *core,
 			ret = clk_cpy_name(&parent->name, parent_names[i],
 					   true);
 		} else if (parent_data) {
+			const char *parent_name;
+
 			parent->hw = parent_data[i].hw;
 			parent->index = parent_data[i].index;
+			parent_name = parent_data[i].fw_name;
+
+			if (!parent_name && parent_data[i].name) {
+				WARN(1, "Empty .fw_name with .name in %s's .parent_data. Using .name for .fw_name declaration.\n",
+				     core->name);
+				parent_name = parent_data[i].name;
+			}
+
 			ret = clk_cpy_name(&parent->fw_name,
-					   parent_data[i].fw_name, false);
+					   parent_name, false);
+
 			if (!ret)
 				ret = clk_cpy_name(&parent->name,
 						   parent_data[i].name,
--
2.38.1



2023-01-31 16:08:51

by Christian Marangi

Subject: [PATCH v2 2/2] clk: gate: Add missing fw_name for clk_gate_register_test_parent_data_legacy

Fix the warning for a missing .fw_name in name-based parent_data.
It's wrong to define only .name, since the clk core always expects
.fw_name to be defined.

Reported-by: kernel test robot <[email protected]>
Signed-off-by: Christian Marangi <[email protected]>
---
drivers/clk/clk-gate_test.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/clk/clk-gate_test.c b/drivers/clk/clk-gate_test.c
index e136aaad48bf..a0a63cd4ce0b 100644
--- a/drivers/clk/clk-gate_test.c
+++ b/drivers/clk/clk-gate_test.c
@@ -74,6 +74,7 @@ static void clk_gate_register_test_parent_data_legacy(struct kunit *test)
 					    1000000);
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, parent);
 	pdata.name = "test_parent";
+	pdata.fw_name = "test_parent";

 	ret = clk_hw_register_gate_parent_data(NULL, "test_gate", &pdata, 0,
 					       NULL, 0, 0, NULL);
--
2.38.1


2023-02-11 00:40:39

by Stephen Boyd

Subject: Re: [PATCH v2 1/2] clk: Warn and add workaround on misuse of .parent_data with .name only

Quoting Christian Marangi (2023-01-31 08:08:28)
> By a simple mistake in a .parent_names to .parent_data conversion it was
> found that clk core assume fw_name is always provided with a parent_data
> struct for each parent and never fallback to .name to get parent name even
> if declared.

It sounds like you have clk_parent_data and the .index member is 0? Can
you show an example structure? I'm guessing it is like this:

struct clk_parent_data pdata = { .name = "global_name" };

>
> This is caused by clk_core_get that only checks for parent .fw_name and
> doesn't handle .name.

clk_core_get() is not supposed to operate on the .name member. It is a
firmware based lookup with clkdev as a fallback because clkdev is a
pseudo-firmware interface to assign a name to a clk when some device
pointer is used in conjunction with it.

>
> While it's sane to request the dev to correctly do the conversion and
> add both .fw_name and .name in a parent_data struct, it's not sane to
> silently drop parents without a warning.

I suppose we can do

WARN(parent->index >= 0 && !parent_data[i].fw_name && parent_data[i].name, ...);

or maybe better would be to make the clk registration fail if there's a
.name field and the index is non-negative and the fw_name is NULL.

Can you grep the code and see if anyone is assigning a .name without a
.fw_name or .index?

>
> Fix this in 2 ways. Add a kernel warning when a wrong implementation is
> used and copy .name in .fw_name in parent map populate function to
> handle clk problems and malfunctions.

We shouldn't be copying .name to .fw_name. They're different things.
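
The condition proposed above can be tried out in user space. The sketch
below is not kernel code: struct pdata_model is a simplified stand-in
for struct clk_parent_data, and pdata_name_only_with_index() is a
hypothetical helper that only mirrors the suggested WARN() expression.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct clk_parent_data. */
struct pdata_model {
	const void *hw;		/* direct pointer to the parent, if known */
	const char *fw_name;	/* firmware-local name, e.g. a "clock-names" entry */
	const char *name;	/* globally unique clk name (legacy fallback) */
	int index;		/* position in the firmware clock list, -1 if unused */
};

/*
 * Hypothetical helper: returns 1 for the combination the suggested WARN()
 * would flag (valid index, no .fw_name, but a .name), 0 otherwise.
 */
static int pdata_name_only_with_index(const struct pdata_model *p)
{
	return p->index >= 0 && !p->fw_name && p->name != NULL;
}
```

Note that { .name = "global_name" } leaves .index at 0, which is a valid
firmware index, so the helper flags it. Setting .index = -1 or adding a
.fw_name does not trigger it.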

2023-02-11 00:52:45

by Stephen Boyd

Subject: Re: [PATCH v2 2/2] clk: gate: Add missing fw_name for clk_gate_register_test_parent_data_legacy

Quoting Christian Marangi (2023-01-31 08:08:29)
> Fix warning for missing .fw_name in parent_data based on names.
> It's wrong to define only .name since clk core expect always .fw_name to
> be defined.
>
> Reported-by: kernel test robot <[email protected]>

What was the report?

> Signed-off-by: Christian Marangi <[email protected]>
> ---
> drivers/clk/clk-gate_test.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/clk/clk-gate_test.c b/drivers/clk/clk-gate_test.c
> index e136aaad48bf..a0a63cd4ce0b 100644
> --- a/drivers/clk/clk-gate_test.c
> +++ b/drivers/clk/clk-gate_test.c
> @@ -74,6 +74,7 @@ static void clk_gate_register_test_parent_data_legacy(struct kunit *test)
> 1000000);
> KUNIT_ASSERT_NOT_ERR_OR_NULL(test, parent);
> pdata.name = "test_parent";
> + pdata.fw_name = "test_parent";
>
> ret = clk_hw_register_gate_parent_data(NULL, "test_gate", &pdata, 0,

We don't pass a 'dev' here, so the pdata.index isn't looked at. I
suppose we can assign .index to -1 to be more explicit, but because
there isn't a device used for registering, we won't try to use the
.index. Instead we'll try to use .fw_name for clkdev, for which there
won't be a lookup either. Eventually we'll fall back to the .name
lookup, and it will be fine.

We need tests that exercise the 'dev' path and also the DT path and the
clkdev path. I was thinking about working on that outside of the gate
test though, and just having a generic clk test for that with simple
clk_ops that do basically nothing.

2023-02-11 00:57:52

by Christian Marangi

Subject: Re: [PATCH v2 1/2] clk: Warn and add workaround on misuse of .parent_data with .name only

On Fri, Feb 10, 2023 at 04:40:29PM -0800, Stephen Boyd wrote:
> Quoting Christian Marangi (2023-01-31 08:08:28)
> > By a simple mistake in a .parent_names to .parent_data conversion it was
> > found that clk core assume fw_name is always provided with a parent_data
> > struct for each parent and never fallback to .name to get parent name even
> > if declared.
>
> It sounds like you have clk_parent_data and the .index member is 0? Can
> you show an example structure? I'm guessing it is like this:
>
> struct clk_parent_data pdata = { .name = "global_name" };
>

An example of this problem and the related fix is in commit
35dc8e101a8e08f69f4725839b98ec0f11a8e2d3.

Your example is also ok, and this patch aims to handle exactly a case
like that.

> >
> > This is caused by clk_core_get that only checks for parent .fw_name and
> > doesn't handle .name.
>
> clk_core_get() is not supposed to operate on the .name member. It is a
> firmware based lookup with clkdev as a fallback because clkdev is a
> pseudo-firmware interface to assign a name to a clk when some device
> pointer is used in conjunction with it.
>

And the problem is just that. We currently permit a configuration with
.name but no .fw_name. In a case like that a developer may think the
configuration is valid, but in reality the clk is silently ignored or
not found, which causes problems when selecting a parent.

It took several hours to discover this, and to me it seems a mistake
anybody could make, since nowhere is it specified that the following
parent_data configuration is illegal.

> >
> > While it's sane to request the dev to correctly do the conversion and
> > add both .fw_name and .name in a parent_data struct, it's not sane to
> > silently drop parents without a warning.
>
> I suppose we can do
>
> WARN(parent->index >= 0 && !parent_data[i].fw_name && parent_data[i].name, ...);
>
> or maybe better would be to make the clk registration fail if there's a
> .name field and the index is non-negative and the fw_name is NULL.
>
> Can you grep the code and see if anyone is assigning a .name without a
> .fw_name or .index?
>

I can check and have some fun with a good regex.

Rejecting registration may be an option, but consider that it may cause
some devices to not boot at all if the mistake is made in a core clock
driver such as a gcc driver.

What I would love is a way to cause a compilation error, but I don't
think that is doable with a C macro?

> >
> > Fix this in 2 ways. Add a kernel warning when a wrong implementation is
> > used and copy .name in .fw_name in parent map populate function to
> > handle clk problems and malfunctions.
>
> We shouldn't be copying .name to .fw_name. They're different things.

The idea here was that, in theory, the global name should not be that
different from the fw_name. But I understand this can have dramatic
side effects, so I agree that we should only WARN that something is
wrong.

I hope this explanation makes clearer what this patch is trying to
achieve. The referenced commit should make the problem clear.

--
Ansuel

2023-02-11 01:02:27

by Christian Marangi

Subject: Re: [PATCH v2 2/2] clk: gate: Add missing fw_name for clk_gate_register_test_parent_data_legacy

On Fri, Feb 10, 2023 at 04:52:36PM -0800, Stephen Boyd wrote:
> Quoting Christian Marangi (2023-01-31 08:08:29)
> > Fix warning for missing .fw_name in parent_data based on names.
> > It's wrong to define only .name since clk core expect always .fw_name to
> > be defined.
> >
> > Reported-by: kernel test robot <[email protected]>
>
> What was the report?
>

With the previous patch applied, the kernel test robot reported the WARN
for declaring a parent_data with .name but no .fw_name.

> > Signed-off-by: Christian Marangi <[email protected]>
> > ---
> > drivers/clk/clk-gate_test.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/clk/clk-gate_test.c b/drivers/clk/clk-gate_test.c
> > index e136aaad48bf..a0a63cd4ce0b 100644
> > --- a/drivers/clk/clk-gate_test.c
> > +++ b/drivers/clk/clk-gate_test.c
> > @@ -74,6 +74,7 @@ static void clk_gate_register_test_parent_data_legacy(struct kunit *test)
> > 1000000);
> > KUNIT_ASSERT_NOT_ERR_OR_NULL(test, parent);
> > pdata.name = "test_parent";
> > + pdata.fw_name = "test_parent";
> >
> > ret = clk_hw_register_gate_parent_data(NULL, "test_gate", &pdata, 0,
>
> We don't pass a 'dev' here, so the pdata.index isn't looked at. I
> suppose we can assign .index to -1 to be more explicit, but because
> there isn't a device used for registering, we won't try to use the
> .index. Instead we'll try to use .fw_name for clkdev, of which there
> won't be a clkdev lookup either. Eventually we'll fallback to the .name
> lookup, and it will be fine.

The problem is that, from what we observed, it won't fall back to .name
if .fw_name is not declared.

But it will work if .fw_name is declared but not exposed by DT (it
correctly falls back to .name since .fw_name is not found). This is the
reason the change in the other patch is needed, so I may be off-topic
here.

>
> We need tests that exercises the 'dev' path and also the DT path and the
> clkdev path. I was thinking about working on that outside of the gate
> test though, and just having a generic clk test for that with simple
> clk_ops that do basically nothing.

--
Ansuel

2023-02-15 18:55:04

by Stephen Boyd

Subject: Re: [PATCH v2 1/2] clk: Warn and add workaround on misuse of .parent_data with .name only

Quoting Christian Marangi (2023-02-10 10:34:11)
> On Fri, Feb 10, 2023 at 04:40:29PM -0800, Stephen Boyd wrote:
> > Quoting Christian Marangi (2023-01-31 08:08:28)
> > > By a simple mistake in a .parent_names to .parent_data conversion it was
> > > found that clk core assume fw_name is always provided with a parent_data
> > > struct for each parent and never fallback to .name to get parent name even
> > > if declared.
> >
> > It sounds like you have clk_parent_data and the .index member is 0? Can
> > you show an example structure? I'm guessing it is like this:
> >
> > struct clk_parent_data pdata = { .name = "global_name" };
> >
>
> An example of this problem and the relative fix is here
> 35dc8e101a8e08f69f4725839b98ec0f11a8e2d3
>
> Your example is also ok and this patch wants to handle just a case like
> that.

Ok, so you have a firmware .index of 0. The .name is a fallback. I
suppose you want the .name to be a fallback if there isn't a clocks
property in the registering device node? I thought that should already
work but maybe there is a bug somewhere. Presumably you have a gcc node
that doesn't have a clocks property:

	gcc: gcc@1800000 {
		compatible = "qcom,gcc-ipq8074";
		reg = <0x01800000 0x80000>;
		#clock-cells = <0x1>;
		#power-domain-cells = <1>;
		#reset-cells = <0x1>;
	};

Looking at clk_core_get(), we'll call of_parse_clkspec() and that should fail:

	struct clk_hw *hw = ERR_PTR(-ENOENT);

	...

	if (np && (name || index >= 0) &&
	    !of_parse_clkspec(np, index, name, &clkspec)) {
		...
	} else if (name) {
		...
	}

	if (IS_ERR(hw))
		return ERR_CAST(hw);

so we should have a -ENOENT clk_hw pointer in
clk_core_fill_parent_index(). That should land in this if condition in
clk_core_fill_parent_index()

	parent = clk_core_get(core, index);
	if (PTR_ERR(parent) == -ENOENT && entry->name)
		parent = clk_core_lookup(entry->name);

and then entry->name should be used.
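
The fallback walked through above can be modelled in user space. This
is a sketch under assumptions, not the kernel implementation:
model_clk, registry, model_lookup() and model_fill_parent() are invented
names, and the firmware lookup is reduced to the case of a node with no
clocks property, where it always yields -ENOENT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct model_clk {
	const char *name;
};

/* Hypothetical global registry, standing in for clk_core_lookup(). */
static struct model_clk registry[] = { { "xo" }, { "sleep_clk" } };

static struct model_clk *model_lookup(const char *name)
{
	for (size_t i = 0; i < sizeof(registry) / sizeof(registry[0]); i++)
		if (strcmp(registry[i].name, name) == 0)
			return &registry[i];
	return NULL;
}

/* One parent map entry, with the fields discussed in the thread. */
struct model_parent {
	const char *fw_name;
	const char *name;
	int index;
};

/*
 * Models the firmware half of the lookup for a node without a "clocks"
 * property: whatever .fw_name or .index says, there is nothing to find.
 */
static int model_fw_get_no_clocks(const struct model_parent *p,
				  struct model_clk **out)
{
	(void)p;
	*out = NULL;
	return -ENOENT;
}

/* Models the -ENOENT fallback to the global .name. */
static struct model_clk *model_fill_parent(const struct model_parent *p)
{
	struct model_clk *parent;
	int ret = model_fw_get_no_clocks(p, &parent);

	if (ret == -ENOENT && p->name)
		parent = model_lookup(p->name);
	return parent;
}
```

In this model an entry with only .name = "xo" resolves to the "xo" clock
through the fallback, while an entry with only .fw_name stays unresolved
because there is no global name to fall back to.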

>
> > >
> > > This is caused by clk_core_get that only checks for parent .fw_name and
> > > doesn't handle .name.
> >
> > clk_core_get() is not supposed to operate on the .name member. It is a
> > firmware based lookup with clkdev as a fallback because clkdev is a
> > pseudo-firmware interface to assign a name to a clk when some device
> > pointer is used in conjunction with it.
> >
>
> And the problem is just that. We currently permit to have a
> configuration with .name but no .fw_name. In a case like that a dev may
> think that this configuration is valid but in reality the clk is
> silently ignored/not found and cause clk problem with selecting a
> parent.

It is valid though.

>
> Took some good hours to discover this and to me it seems an error that
> everybody can do since nowhere is specified that the following
> parent_data configuration is illegal.
>

I'll look at adding a test. Seems to be the best way to solve this.

2023-02-15 23:42:27

by Christian Marangi

Subject: Re: [PATCH v2 1/2] clk: Warn and add workaround on misuse of .parent_data with .name only

On Wed, Feb 15, 2023 at 10:54:56AM -0800, Stephen Boyd wrote:
> Quoting Christian Marangi (2023-02-10 10:34:11)
> > On Fri, Feb 10, 2023 at 04:40:29PM -0800, Stephen Boyd wrote:
> > > Quoting Christian Marangi (2023-01-31 08:08:28)
> > > > By a simple mistake in a .parent_names to .parent_data conversion it was
> > > > found that clk core assume fw_name is always provided with a parent_data
> > > > struct for each parent and never fallback to .name to get parent name even
> > > > if declared.
> > >
> > > It sounds like you have clk_parent_data and the .index member is 0? Can
> > > you show an example structure? I'm guessing it is like this:
> > >
> > > struct clk_parent_data pdata = { .name = "global_name" };
> > >
> >
> > An example of this problem and the relative fix is here
> > 35dc8e101a8e08f69f4725839b98ec0f11a8e2d3
> >
> > Your example is also ok and this patch wants to handle just a case like
> > that.
>
> Ok, so you have a firmware .index of 0. The .name is a fallback. I
> suppose you want the .name to be a fallback if there isn't a clocks
> property in the registering device node? I thought that should already
> work but maybe there is a bug somewhere. Presumably you have a gcc node
> that doesn't have a clocks property
>
> gcc: gcc@1800000 {
> compatible = "qcom,gcc-ipq8074";
> reg = <0x01800000 0x80000>;
> #clock-cells = <0x1>;
> #power-domain-cells = <1>;
> #reset-cells = <0x1>;
> };
>
> Looking at clk_core_get() we'll call of_parse_clkspec() and that should fail
>
> struct clk_hw *hw = ERR_PTR(-ENOENT);
>
> ...
>
> if (np && (name || index >= 0) &&
> !of_parse_clkspec(np, index, name, &clkspec)) {
> ...
> } else if (name) {
> ...
> }
>
> if (IS_ERR(hw))
> return ERR_CAST(hw);
>
> so we should have a -ENOENT clk_hw pointer in
> clk_core_fill_parent_index(). That should land in this if condition in
> clk_core_fill_parent_index()
>
> parent = clk_core_get(core, index);
> if (PTR_ERR(parent) == -ENOENT && entry->name)
> parent = clk_core_lookup(entry->name);
>
> and then entry->name should be used.
>

Hi, thanks for making me give this an extra check... I think I found
the real cause.
I sent a patch that should suppress this and gives an extensive
explanation of the problem.
This is the ID: [email protected]

The hint that made me realize what was wrong was a problem with the
index and the fact that it should have returned -ENOENT... Fun to
discover that a clock was actually returned and the function never
returned an error.

> >
> > > >
> > > > This is caused by clk_core_get that only checks for parent .fw_name and
> > > > doesn't handle .name.
> > >
> > > clk_core_get() is not supposed to operate on the .name member. It is a
> > > firmware based lookup with clkdev as a fallback because clkdev is a
> > > pseudo-firmware interface to assign a name to a clk when some device
> > > pointer is used in conjunction with it.
> > >
> >
> > And the problem is just that. We currently permit to have a
> > configuration with .name but no .fw_name. In a case like that a dev may
> > think that this configuration is valid but in reality the clk is
> > silently ignored/not found and cause clk problem with selecting a
> > parent.
>
> It is valid though.
>
> >
> > Took some good hours to discover this and to me it seems an error that
> > everybody can do since nowhere is specified that the following
> > parent_data configuration is illegal.
> >
>
> I'll look at adding a test. Seems to be the best way to solve this.

Eh, a test would probably have made this clearer. The main problem here
was that the function never returned an error, but under the hood the
parent was pointing to another clock.

--
Ansuel
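
A user-space sketch of the failure mode described in this message, under
the assumption that the registering node does have a firmware clock
list. All names here (model_node, model_parent, model_resolve()) are
hypothetical and the real lookup is more involved. The point is only
that a zero-initialised .index is a valid firmware index, so the lookup
succeeds with whatever clock sits at position 0 and .name is never
consulted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct model_clk {
	const char *name;
};

static struct model_clk xo = { "xo" };
static struct model_clk sleep_clk = { "sleep_clk" };

/* Hypothetical firmware node: an ordered clock list without names. */
struct model_node {
	size_t nclocks;
	struct model_clk *clocks[4];
};

struct model_parent {
	const char *name;	/* global fallback name */
	int index;		/* firmware index; 0 if left out of the initializer */
};

/* Global-name lookup over the two clocks defined above. */
static struct model_clk *model_lookup(const char *name)
{
	struct model_clk *all[] = { &xo, &sleep_clk };

	for (size_t i = 0; i < sizeof(all) / sizeof(all[0]); i++)
		if (strcmp(all[i]->name, name) == 0)
			return all[i];
	return NULL;
}

static struct model_clk *model_resolve(const struct model_node *np,
				       const struct model_parent *p)
{
	/* A valid index into the firmware list wins and reports no error. */
	if (np && p->index >= 0 && (size_t)p->index < np->nclocks)
		return np->clocks[p->index];

	/* Only when the firmware lookup finds nothing is .name consulted. */
	return p->name ? model_lookup(p->name) : NULL;
}
```

With a node whose first clock is sleep_clk, { .name = "xo" } resolves to
sleep_clk without any error, while { .name = "xo", .index = -1 } skips
the firmware lookup and resolves to xo by name.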