2013-05-01 04:42:23

by Saravana Kannan

Subject: [PATCH] clk: Fix race condition between clk_set_parent and clk_enable()

Without this patch, the following race conditions are possible.

Race condition 1:
* clk-A has two parents - clk-X and clk-Y.
* All three are disabled and clk-X is current parent.
* Thread A: clk_set_parent(clk-A, clk-Y).
* Thread A: <snip execution flow>
* Thread A: Grabs enable lock.
* Thread A: Sees enable count of clk-A is 0, so doesn't enable clk-Y.
* Thread A: Updates clk-A SW parent to clk-Y
* Thread A: Releases enable lock.
* Thread B: clk_enable(clk-A).
* Thread B: clk_enable() enables clk-Y, then enables clk-A and returns.

clk-A is now enabled in software, but not clocking in hardware since the
hardware parent is still clk-X.
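
The interleaving above can be replayed in a small user-space model. This is only a sketch: `toy_clk`, `toy_enable()`, `toy_is_clocking()` and `toy_replay_race()` are hypothetical names invented for illustration, not kernel code, and the two "threads" are simply replayed in the order listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy clock; hypothetical names, not the kernel's struct clk. */
struct toy_clk {
	struct toy_clk *sw_parent;	/* parent in the software tree */
	struct toy_clk *hw_parent;	/* parent selected by the "hardware" mux */
	unsigned int enable_count;
	bool hw_on;			/* "hardware" gate state */
};

/* Ref-counted enable that walks the software tree, parents first. */
static void toy_enable(struct toy_clk *c)
{
	if (!c)
		return;
	if (c->enable_count++ == 0) {
		toy_enable(c->sw_parent);
		c->hw_on = true;
	}
}

/* A clock only ticks if its gate and every hardware ancestor are on. */
static bool toy_is_clocking(const struct toy_clk *c)
{
	assert(c != NULL);
	for (; c; c = c->hw_parent)
		if (!c->hw_on)
			return false;
	return true;
}

/*
 * Replays the race. Returns true if clk-A ends up enabled in software
 * while it is not actually clocking in hardware.
 */
static bool toy_replay_race(void)
{
	struct toy_clk x = { 0 }, y = { 0 };
	struct toy_clk a = { .sw_parent = &x, .hw_parent = &x };

	/* Thread A: enable count of clk-A is 0, so clk-Y is left off. */
	/* Thread A: update the software parent, then drop the lock. */
	a.sw_parent = &y;

	/* Thread B: clk_enable(clk-A) enables clk-Y, then clk-A. */
	toy_enable(&a);

	/* Thread A has not reached the hardware mux switch yet. */
	return a.enable_count == 1 && y.hw_on && !toy_is_clocking(&a);
}
```

In this model `toy_replay_race()` reports the broken state because `toy_enable()` follows the software parent while the mux still selects clk-X, whose gate was never opened.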

The only way to avoid race conditions between clk_set_parent() and
clk_enable/disable() is to ensure that clk_enable/disable() calls don't
require changes to hardware enable state between changes to software clock
topology and hardware clock topology.

There are two options to achieve the above:
1. Grab the enable lock before changing software/hardware topology and
   release it afterwards.
2. Keep the clock enabled for the duration of the software/hardware topology
   change so that any additional enable/disable calls don't try to change
   the hardware state. Once the topology change is complete, the clock can
   be put back in its original enable state.

Option (1) is not an acceptable solution since the set_parent() ops might
need to sleep.

Therefore, this patch implements option (2).
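
Option (2) can be sketched in a toy user-space model. Everything here (`toy_clk`, `toy_enable()`, `toy_disable()`, `toy_reparent_forced_on()`) is a hypothetical illustration rather than the kernel implementation, and the concurrent clk_enable() from a second thread is replayed inline at the point where the software and hardware topologies disagree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy clock; hypothetical names, not the kernel's struct clk. */
struct toy_clk {
	struct toy_clk *sw_parent;	/* parent in the software tree */
	struct toy_clk *hw_parent;	/* parent selected by the "hardware" mux */
	unsigned int enable_count;
	bool hw_on;			/* "hardware" gate state */
};

static void toy_enable(struct toy_clk *c)
{
	if (!c)
		return;
	if (c->enable_count++ == 0) {
		toy_enable(c->sw_parent);	/* parents are enabled first */
		c->hw_on = true;
	}
}

static void toy_disable(struct toy_clk *c)
{
	if (!c)
		return;
	assert(c->enable_count > 0);	/* unbalanced disable is a bug */
	if (--c->enable_count == 0) {
		c->hw_on = false;
		toy_disable(c->sw_parent);
	}
}

/* A clock only ticks if its gate and every hardware ancestor are on. */
static bool toy_is_clocking(const struct toy_clk *c)
{
	for (; c; c = c->hw_parent)
		if (!c->hw_on)
			return false;
	return true;
}

/* Returns true if clk-A keeps clocking through the whole reparent. */
static bool toy_reparent_forced_on(void)
{
	struct toy_clk x = { 0 }, y = { 0 };
	struct toy_clk a = { .sw_parent = &x, .hw_parent = &x };
	bool ok;

	/* Thread A: force the new parent and the clock itself on. */
	toy_enable(&y);
	toy_enable(&a);		/* also enables clk-X, the current parent */

	/* Thread A: update the software topology. */
	a.sw_parent = &y;

	/* Thread B: clk_enable(clk-A) only bumps a count here. */
	toy_enable(&a);
	ok = toy_is_clocking(&a);	/* still fed by clk-X in hardware */

	/* Thread A: switch the hardware mux, then undo the forced enables. */
	a.hw_parent = &y;
	toy_disable(&a);
	toy_disable(&x);

	return ok && toy_is_clocking(&a) && a.enable_count == 1 &&
	       y.enable_count == 1 && x.enable_count == 0;
}
```

Because both candidate parents are held on across the window, the second thread's enable never has to touch a gate, and the counts balance out afterwards: clk-A and clk-Y each keep the one reference taken by the other thread, and clk-X drops back to zero.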

This patch doesn't violate any API semantics. clk_disable() doesn't
guarantee that the clock is actually disabled. So, no clients of a clock
can assume that a clock is disabled after their last call to clk_disable().
So, enabling the clock during a parent change is not a violation of any API
semantics.
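
The reference-counting argument can be shown with a minimal user-space sketch. `toy_gate` and its helpers are hypothetical names used only for illustration: with two consumers holding an enable, the first consumer's final disable leaves the gate on.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy ref-counted gate; hypothetical, not the kernel's struct clk. */
struct toy_gate {
	unsigned int enable_count;
	bool hw_on;
};

static void toy_gate_enable(struct toy_gate *g)
{
	if (g->enable_count++ == 0)
		g->hw_on = true;
}

static void toy_gate_disable(struct toy_gate *g)
{
	assert(g->enable_count > 0);	/* unbalanced disable is a bug */
	if (--g->enable_count == 0)
		g->hw_on = false;
}

/* Gate state right after consumer 1 has made its last disable call. */
static bool toy_on_after_consumer1_disable(void)
{
	struct toy_gate g = { 0 };

	toy_gate_enable(&g);	/* consumer 1 */
	toy_gate_enable(&g);	/* consumer 2 */
	toy_gate_disable(&g);	/* consumer 1's last disable */

	return g.hw_on;		/* consumer 2 still holds it on */
}
```

A consumer therefore cannot infer the gate state from its own calls alone, which is the property the patch relies on when it briefly holds an extra enable.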

This also has the nice side effect of simplifying the error handling code.

Signed-off-by: Saravana Kannan <[email protected]>
---
It's been a while since I submitted a patch. So, apologies if I'm cc'ing
people who no longer care about the state of the common clock framework.

drivers/clk/clk.c | 72 +++++++++++++++++++++++-----------------------------
1 files changed, 32 insertions(+), 40 deletions(-)

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 934cfd1..fe4055f 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -1377,67 +1377,59 @@ static int __clk_set_parent(struct clk *clk, struct clk *parent, u8 p_index)
 	unsigned long flags;
 	int ret = 0;
 	struct clk *old_parent = clk->parent;
-	bool migrated_enable = false;

-	/* migrate prepare */
-	if (clk->prepare_count)
+	/*
+	 * Migrate prepare state between parents and prevent race with
+	 * clk_enable().
+	 *
+	 * If the clock is not prepared, then a race with
+	 * clk_enable/disable() is impossible since we already have the
+	 * prepare lock (future calls to clk_enable() need to be preceded by
+	 * a clk_prepare()).
+	 *
+	 * If the clock is prepared, migrate the prepared state to the new
+	 * parent and also protect against a race with clk_enable() by
+	 * forcing the clock and the new parent on. This ensures that all
+	 * future calls to clk_enable() are practically NOPs with respect to
+	 * hardware and software states.
+	 */
+	if (clk->prepare_count) {
 		__clk_prepare(parent);
-
-	flags = clk_enable_lock();
-
-	/* migrate enable */
-	if (clk->enable_count) {
-		__clk_enable(parent);
-		migrated_enable = true;
+		clk_enable(parent);
+		clk_enable(clk);
 	}

 	/* update the clk tree topology */
+	flags = clk_enable_lock();
 	clk_reparent(clk, parent);
-
 	clk_enable_unlock(flags);

 	/* change clock input source */
 	if (parent && clk->ops->set_parent)
 		ret = clk->ops->set_parent(clk->hw, p_index);
-
 	if (ret) {
-		/*
-		 * The error handling is tricky due to that we need to release
-		 * the spinlock while issuing the .set_parent callback. This
-		 * means the new parent might have been enabled/disabled in
-		 * between, which must be considered when doing rollback.
-		 */
-		flags = clk_enable_lock();

+		flags = clk_enable_lock();
 		clk_reparent(clk, old_parent);
-
-		if (migrated_enable && clk->enable_count) {
-			__clk_disable(parent);
-		} else if (migrated_enable && (clk->enable_count == 0)) {
-			__clk_disable(old_parent);
-		} else if (!migrated_enable && clk->enable_count) {
-			__clk_disable(parent);
-			__clk_enable(old_parent);
-		}
-
 		clk_enable_unlock(flags);

-		if (clk->prepare_count)
+		if (clk->prepare_count) {
+			clk_disable(clk);
+			clk_disable(parent);
 			__clk_unprepare(parent);
-
+		}
 		return ret;
 	}

-	/* clean up enable for old parent if migration was done */
-	if (migrated_enable) {
-		flags = clk_enable_lock();
-		__clk_disable(old_parent);
-		clk_enable_unlock(flags);
-	}
-
-	/* clean up prepare for old parent if migration was done */
-	if (clk->prepare_count)
+	/*
+	 * Finish the migration of prepare state and undo the changes done
+	 * for preventing a race with clk_enable().
+	 */
+	if (clk->prepare_count) {
+		clk_disable(clk);
+		clk_disable(old_parent);
 		__clk_unprepare(old_parent);
+	}

 	/* update debugfs with new clk tree topology */
 	clk_debug_reparent(clk, parent);
--
1.7.8.3

The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


2013-05-14 18:54:23

by Mike Turquette

Subject: Re: [PATCH] clk: Fix race condition between clk_set_parent and clk_enable()

Quoting Saravana Kannan (2013-04-30 21:42:08)
> Without this patch, the following race conditions are possible.
>
> Race condition 1:
> * clk-A has two parents - clk-X and clk-Y.
> * All three are disabled and clk-X is current parent.
> * Thread A: clk_set_parent(clk-A, clk-Y).
> * Thread A: <snip execution flow>
> * Thread A: Grabs enable lock.
> * Thread A: Sees enable count of clk-A is 0, so doesn't enable clk-Y.
> * Thread A: Updates clk-A SW parent to clk-Y
> * Thread A: Releases enable lock.
> * Thread B: clk_enable(clk-A).
> * Thread B: clk_enable() enables clk-Y, then enabled clk-A and returns.
>
> clk-A is now enabled in software, but not clocking in hardware since the
> hardware parent is still clk-X.
>
> The only way to avoid race conditions between clk_set_parent() and
> clk_enable/disable() is to ensure that clk_enable/disable() calls don't
> require changes to hardware enable state between changes to software clock
> topology and hardware clock topology.
>
> There are options to achieve the above:
> 1. Grab the enable lock before changing software/hardware topology and
> release it afterwards.
> 2. Keep the clock enabled for the duration of software/hardware topology
> change so that any additional enable/disable calls don't try to change
> the hardware state. Once the topology change is complete, the clock can
> be put back in its original enable state.
>
> Option (1) is not an acceptable solution since the set_parent() ops might
> need to sleep.
>
> Therefore, this patch implements option (2).
>
> This patch doesn't violate any API semantics. clk_disable() doesn't
> guarantee that the clock is actually disabled. So, no clients of a clock
> can assume that a clock is disabled after their last call to clk_disable().
> So, enabling the clock during a parent change is not a violation of any API
> semantics.
>
> This also has the nice side effect of simplifying the error handling code.
>
> Signed-off-by: Saravana Kannan <[email protected]>

I've taken this patch into clk-next for testing. The code itself looks
fine. The only thing that remains to be seen is if any platforms have a
problem with disabled clocks getting turned on during a reparent
operation.

On platforms that I have worked on this is OK, but I suppose there could
be some platform out there where a clock is prepared and disabled, and
briefly enabling the clock during the reparent operation somehow puts
the hardware in a bad state.

Anyway, that's a long shot and this looks OK until somebody screams.

Regards,
Mike

2013-05-14 21:03:08

by Saravana Kannan

Subject: Re: [PATCH] clk: Fix race condition between clk_set_parent and clk_enable()

On 05/14/2013 11:54 AM, Mike Turquette wrote:
> Quoting Saravana Kannan (2013-04-30 21:42:08)
>> Without this patch, the following race conditions are possible.
>>
>> Race condition 1:
>> * clk-A has two parents - clk-X and clk-Y.
>> * All three are disabled and clk-X is current parent.
>> * Thread A: clk_set_parent(clk-A, clk-Y).
>> * Thread A: <snip execution flow>
>> * Thread A: Grabs enable lock.
>> * Thread A: Sees enable count of clk-A is 0, so doesn't enable clk-Y.
>> * Thread A: Updates clk-A SW parent to clk-Y
>> * Thread A: Releases enable lock.
>> * Thread B: clk_enable(clk-A).
>> * Thread B: clk_enable() enables clk-Y, then enabled clk-A and returns.
>>
>> clk-A is now enabled in software, but not clocking in hardware since the
>> hardware parent is still clk-X.
>>
>> The only way to avoid race conditions between clk_set_parent() and
>> clk_enable/disable() is to ensure that clk_enable/disable() calls don't
>> require changes to hardware enable state between changes to software clock
>> topology and hardware clock topology.
>>
>> There are options to achieve the above:
>> 1. Grab the enable lock before changing software/hardware topology and
>> release it afterwards.
>> 2. Keep the clock enabled for the duration of software/hardware topology
>> change so that any additional enable/disable calls don't try to change
>> the hardware state. Once the topology change is complete, the clock can
>> be put back in its original enable state.
>>
>> Option (1) is not an acceptable solution since the set_parent() ops might
>> need to sleep.
>>
>> Therefore, this patch implements option (2).
>>
>> This patch doesn't violate any API semantics. clk_disable() doesn't
>> guarantee that the clock is actually disabled. So, no clients of a clock
>> can assume that a clock is disabled after their last call to clk_disable().
>> So, enabling the clock during a parent change is not a violation of any API
>> semantics.
>>
>> This also has the nice side effect of simplifying the error handling code.
>>
>> Signed-off-by: Saravana Kannan <[email protected]>
>
> I've taken this patch into clk-next for testing. The code itself looks
> fine.

Thanks Mike. I'll send it out again with some typo/grammar corrections.

> The only thing that remains to be seen is if any platforms have a
> problem with disabled clocks getting turned on during a reparent
> operation.

I would think that would be a general issue with the clock APIs since
disable doesn't guarantee a disable (since it's ref counted).

Also, those clocks could be marked as CLK_SET_PARENT_GATE if it's a real
issue.

> On platforms that I have worked on this is OK, but I suppose there could
> be some platform out there where a clock is prepared and disabled, and
> briefly enabling the clock during the reparent operation somehow puts
> the hardware in a bad state.

I can't think of any either, but as I mentioned, we have
CLK_SET_PARENT_GATE for that.
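
As a rough sketch of what such a gate flag buys, the toy model below refuses to reparent a clock that is still prepared. This assumes the framework rejects the request with -EBUSY while the prepare count is non-zero. `TOY_SET_PARENT_GATE`, `toy_mux` and `toy_set_parent()` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the CLK_SET_PARENT_GATE flag. */
#define TOY_SET_PARENT_GATE 0x1u

/* Toy mux clock; hypothetical, not the kernel's struct clk. */
struct toy_mux {
	unsigned int flags;
	unsigned int prepare_count;
	int parent_index;
};

/*
 * Assumed behavior: a clock carrying the gate flag may only be
 * reparented while nobody has it prepared.
 */
static int toy_set_parent(struct toy_mux *m, int new_index)
{
	assert(new_index >= 0);

	if ((m->flags & TOY_SET_PARENT_GATE) && m->prepare_count)
		return -EBUSY;

	m->parent_index = new_index;
	return 0;
}
```

With a check of this kind the forced-enable path is never reached for a flagged clock, since the reparent only proceeds when the clock is unprepared and therefore cannot be enabled.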

Thanks,
Saravana

2013-05-14 22:10:20

by Tomasz Figa

Subject: Re: [PATCH] clk: Fix race condition between clk_set_parent and clk_enable()

Hi,

On Tuesday 14 of May 2013 11:54:17 Mike Turquette wrote:
> Quoting Saravana Kannan (2013-04-30 21:42:08)
>
> > Without this patch, the following race conditions are possible.
> >
> > Race condition 1:
> > * clk-A has two parents - clk-X and clk-Y.
> > * All three are disabled and clk-X is current parent.
> > * Thread A: clk_set_parent(clk-A, clk-Y).
> > * Thread A: <snip execution flow>
> > * Thread A: Grabs enable lock.
> > * Thread A: Sees enable count of clk-A is 0, so doesn't enable clk-Y.
> > * Thread A: Updates clk-A SW parent to clk-Y
> > * Thread A: Releases enable lock.
> > * Thread B: clk_enable(clk-A).
> > * Thread B: clk_enable() enables clk-Y, then enabled clk-A and
> > returns.
> >
> > clk-A is now enabled in software, but not clocking in hardware since
> > the hardware parent is still clk-X.
> >
> > The only way to avoid race conditions between clk_set_parent() and
> > clk_enable/disable() is to ensure that clk_enable/disable() calls don't
> > require changes to hardware enable state between changes to software
> > clock topology and hardware clock topology.
> >
> > There are options to achieve the above:
> > 1. Grab the enable lock before changing software/hardware topology and
> >    release it afterwards.
> > 2. Keep the clock enabled for the duration of software/hardware topology
> >    change so that any additional enable/disable calls don't try to change
> >    the hardware state. Once the topology change is complete, the clock can
> >    be put back in its original enable state.
> >
> > Option (1) is not an acceptable solution since the set_parent() ops
> > might need to sleep.
> >
> > Therefore, this patch implements option (2).
> >
> > This patch doesn't violate any API semantics. clk_disable() doesn't
> > guarantee that the clock is actually disabled. So, no clients of a
> > clock can assume that a clock is disabled after their last call to
> > clk_disable(). So, enabling the clock during a parent change is not a
> > violation of any API semantics.
> >
> > This also has the nice side effect of simplifying the error handling
> > code.
> >
> > Signed-off-by: Saravana Kannan <[email protected]>
>
> I've taken this patch into clk-next for testing. The code itself looks
> fine. The only thing that remains to be seen is if any platforms have a
> problem with disabled clocks getting turned on during a reparent
> operation.

IMHO this behavior should be documented somewhere, with a note that the
clock must be left unprepared if it has to stay disabled during a reparent
operation, and possibly also a pointer to the CLK_SET_PARENT_GATE flag.

> On platforms that I have worked on this is OK, but I suppose there could
> be some platform out there where a clock is prepared and disabled, and
> briefly enabling the clock during the reparent operation somehow puts
> the hardware in a bad state.

Well, on any platform where the default clock settings are not completely
correct this is likely to cause problems, because some device might be fed
too high a frequency for some period of time, which might crash that
device or even the whole system.

Best regards,
Tomasz

2013-05-14 22:46:36

by Saravana Kannan

Subject: Re: [PATCH] clk: Fix race condition between clk_set_parent and clk_enable()

On 05/14/2013 03:10 PM, Tomasz Figa wrote:
> Hi,
>
> On Tuesday 14 of May 2013 11:54:17 Mike Turquette wrote:
>> Quoting Saravana Kannan (2013-04-30 21:42:08)
>>
>>> Without this patch, the following race conditions are possible.
>>>
>>> Race condition 1:
>>> * clk-A has two parents - clk-X and clk-Y.
>>> * All three are disabled and clk-X is current parent.
>>> * Thread A: clk_set_parent(clk-A, clk-Y).
>>> * Thread A: <snip execution flow>
>>> * Thread A: Grabs enable lock.
>>> * Thread A: Sees enable count of clk-A is 0, so doesn't enable clk-Y.
>>> * Thread A: Updates clk-A SW parent to clk-Y
>>> * Thread A: Releases enable lock.
>>> * Thread B: clk_enable(clk-A).
>>> * Thread B: clk_enable() enables clk-Y, then enabled clk-A and
>>> returns.
>>>
>>> clk-A is now enabled in software, but not clocking in hardware since
>>> the hardware parent is still clk-X.
>>>
>>> The only way to avoid race conditions between clk_set_parent() and
>>> clk_enable/disable() is to ensure that clk_enable/disable() calls don't
>>> require changes to hardware enable state between changes to software
>>> clock topology and hardware clock topology.
>>>
>>> There are options to achieve the above:
>>> 1. Grab the enable lock before changing software/hardware topology and
>>>    release it afterwards.
>>> 2. Keep the clock enabled for the duration of software/hardware topology
>>>    change so that any additional enable/disable calls don't try to change
>>>    the hardware state. Once the topology change is complete, the clock can
>>>    be put back in its original enable state.
>>>
>>> Option (1) is not an acceptable solution since the set_parent() ops
>>> might need to sleep.
>>>
>>> Therefore, this patch implements option (2).
>>>
>>> This patch doesn't violate any API semantics. clk_disable() doesn't
>>> guarantee that the clock is actually disabled. So, no clients of a
>>> clock can assume that a clock is disabled after their last call to
>>> clk_disable(). So, enabling the clock during a parent change is not a
>>> violation of any API semantics.
>>>
>>> This also has the nice side effect of simplifying the error handling
>>> code.
>>>
>>> Signed-off-by: Saravana Kannan <[email protected]>
>>
>> I've taken this patch into clk-next for testing. The code itself looks
>> fine. The only thing that remains to be seen is if any platforms have a
>> problem with disabled clocks getting turned on during a reparent
>> operation.
>
> IMHO this behavior should be documented somewhere, with a note that the
> clock must not be prepared to keep it disabled during reparent operation
> and possibly also pointing to the CLK_SET_PARENT_GATE flag.

Reasonable request. I can update the documentation of clk_set_parent()
to indicate that the clock might get turned on for the duration of the
call and that, if a caller needs a guarantee that it stays off, the
CLK_SET_PARENT_GATE flag should be used.

>
>> On platforms that I have worked on this is OK, but I suppose there could
>> be some platform out there where a clock is prepared and disabled, and
>> briefly enabling the clock during the reparent operation somehow puts
>> the hardware in a bad state.
>
> Well, on any platform where default clock settings are not completely
> correct this is likely to cause problems, because some device might get
> too high frequency for some period of time, which might crash it alone as
> well as the whole system.
>

I don't think this is really a problem with this patch. It's present
even without this patch.

The patch doesn't switch to some other unspecified parent. It only
switches between the new/old parent. Even without this patch, if a clock
is prepared while you reparent it, clk_enable() could be called at any
time between the parent switch and the later clock API calls that set
up the new parent correctly. I think it's just crappy driver code that
switches to a new parent before setting it up correctly. There's
absolutely no good reason to do it that way.

-Saravana

2013-05-15 00:10:21

by Tomasz Figa

Subject: Re: [PATCH] clk: Fix race condition between clk_set_parent and clk_enable()

On Tuesday 14 of May 2013 15:46:33 Saravana Kannan wrote:
> On 05/14/2013 03:10 PM, Tomasz Figa wrote:
> > Hi,
> >
> > On Tuesday 14 of May 2013 11:54:17 Mike Turquette wrote:
> >> Quoting Saravana Kannan (2013-04-30 21:42:08)
> >>
> >>> Without this patch, the following race conditions are possible.
> >>>
> >>> Race condition 1:
> >>> * clk-A has two parents - clk-X and clk-Y.
> >>> * All three are disabled and clk-X is current parent.
> >>> * Thread A: clk_set_parent(clk-A, clk-Y).
> >>> * Thread A: <snip execution flow>
> >>> * Thread A: Grabs enable lock.
> >>> * Thread A: Sees enable count of clk-A is 0, so doesn't enable
> >>> clk-Y.
> >>> * Thread A: Updates clk-A SW parent to clk-Y
> >>> * Thread A: Releases enable lock.
> >>> * Thread B: clk_enable(clk-A).
> >>> * Thread B: clk_enable() enables clk-Y, then enabled clk-A and
> >>> returns.
> >>>
> >>> clk-A is now enabled in software, but not clocking in hardware since
> >>> the hardware parent is still clk-X.
> >>>
> >>> The only way to avoid race conditions between clk_set_parent() and
> >>> clk_enable/disable() is to ensure that clk_enable/disable() calls don't
> >>> require changes to hardware enable state between changes to software
> >>> clock topology and hardware clock topology.
> >>>
> >>> There are options to achieve the above:
> >>> 1. Grab the enable lock before changing software/hardware topology and
> >>>    release it afterwards.
> >>> 2. Keep the clock enabled for the duration of software/hardware topology
> >>>    change so that any additional enable/disable calls don't try to change
> >>>    the hardware state. Once the topology change is complete, the clock
> >>>    can be put back in its original enable state.
> >>>
> >>> Option (1) is not an acceptable solution since the set_parent() ops
> >>> might need to sleep.
> >>>
> >>> Therefore, this patch implements option (2).
> >>>
> >>> This patch doesn't violate any API semantics. clk_disable() doesn't
> >>> guarantee that the clock is actually disabled. So, no clients of a
> >>> clock can assume that a clock is disabled after their last call to
> >>> clk_disable(). So, enabling the clock during a parent change is not a
> >>> violation of any API semantics.
> >>>
> >>> This also has the nice side effect of simplifying the error handling
> >>> code.
> >>>
> >>> Signed-off-by: Saravana Kannan <[email protected]>
> >>
> >> I've taken this patch into clk-next for testing. The code itself looks
> >> fine. The only thing that remains to be seen is if any platforms have
> >> a problem with disabled clocks getting turned on during a reparent
> >> operation.
> >
> > IMHO this behavior should be documented somewhere, with a note that the
> > clock must not be prepared to keep it disabled during reparent
> > operation and possibly also pointing to the CLK_SET_PARENT_GATE flag.
>
> Reasonable request. I can update the documentation of clk_set_parent()
> to indicate that the clock might get turned on for the duration of the
> call and if they need a guarantee the GATE flag should be used.
>
> >> On platforms that I have worked on this is OK, but I suppose there
> >> could be some platform out there where a clock is prepared and
> >> disabled, and briefly enabling the clock during the reparent
> >> operation somehow puts the hardware in a bad state.
> >
> > > Well, on any platform where default clock settings are not completely
> > > correct this is likely to cause problems, because some device might get
> > > too high frequency for some period of time, which might crash it alone
> > > as well as the whole system.
>
> I don't think this is really a problem with this patch. It's present
> even without this patch.
>
> The patch doesn't switch to some other unspecified parent. It only
> switches between the new/old parent. Even without this patch, if a clock
> is prepared while you reparent it, clk_enable() could be called at
> anytime between the parent switch and the future clock API calls to set
> up the new parent correctly. I think that's just crappy driver code to
> switch to a new parent before setting it up correctly. There's
> absolutely no good reason to do it that way.

This is not exactly what I meant. I was just giving an example of a problem
that turning a clock on can cause if it's not set up correctly yet.

AFAIK most (if not all) current code does the necessary reparenting and
initial rate setting either early, before clk_prepare(), or after
clk_enable() (when reparenting dynamically at runtime), so neither case
should be a problem.

Best regards,
Tomasz

2013-05-15 19:24:54

by Ulf Hansson

Subject: Re: [PATCH] clk: Fix race condition between clk_set_parent and clk_enable()

On 1 May 2013 06:42, Saravana Kannan <[email protected]> wrote:
> Without this patch, the following race conditions are possible.
>
> Race condition 1:
> * clk-A has two parents - clk-X and clk-Y.
> * All three are disabled and clk-X is current parent.
> * Thread A: clk_set_parent(clk-A, clk-Y).
> * Thread A: <snip execution flow>
> * Thread A: Grabs enable lock.
> * Thread A: Sees enable count of clk-A is 0, so doesn't enable clk-Y.
> * Thread A: Updates clk-A SW parent to clk-Y
> * Thread A: Releases enable lock.
> * Thread B: clk_enable(clk-A).
> * Thread B: clk_enable() enables clk-Y, then enabled clk-A and returns.
>
> clk-A is now enabled in software, but not clocking in hardware since the
> hardware parent is still clk-X.
>
> The only way to avoid race conditions between clk_set_parent() and
> clk_enable/disable() is to ensure that clk_enable/disable() calls don't
> require changes to hardware enable state between changes to software clock
> topology and hardware clock topology.
>
> There are options to achieve the above:
> 1. Grab the enable lock before changing software/hardware topology and
> release it afterwards.
> 2. Keep the clock enabled for the duration of software/hardware topology
> change so that any additional enable/disable calls don't try to change
> the hardware state. Once the topology change is complete, the clock can
> be put back in its original enable state.
>
> Option (1) is not an acceptable solution since the set_parent() ops might
> need to sleep.
>
> Therefore, this patch implements option (2).
>
> This patch doesn't violate any API semantics. clk_disable() doesn't
> guarantee that the clock is actually disabled. So, no clients of a clock
> can assume that a clock is disabled after their last call to clk_disable().
> So, enabling the clock during a parent change is not a violation of any API
> semantics.
>
> This also has the nice side effect of simplifying the error handling code.
>
> Signed-off-by: Saravana Kannan <[email protected]>
> ---
> It's been a while since I submitted a patch. So, apologies if I'm cc'ing
> people who no longer care about the state of the common clock framework.
>
> drivers/clk/clk.c | 72 +++++++++++++++++++++++-----------------------------
> 1 files changed, 32 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 934cfd1..fe4055f 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -1377,67 +1377,59 @@ static int __clk_set_parent(struct clk *clk, struct clk *parent, u8 p_index)
> unsigned long flags;
> int ret = 0;
> struct clk *old_parent = clk->parent;
> - bool migrated_enable = false;
>
> - /* migrate prepare */
> - if (clk->prepare_count)
> + /*
> + * Migrate prepare state between parents and prevent race with
> + * clk_enable().
> + *
> + * If the clock is not prepared, then a race with
> + * clk_enable/disable() is impossible since we already have the
> + * prepare lock (future calls to clk_enable() need to be preceded by
> + * a clk_prepare()).
> + *
> + * If the clock is prepared, migrate the prepared state to the new
> + * parent and also protect against a race with clk_enable() by
> + * forcing the clock and the new parent on. This ensures that all
> + * future calls to clk_enable() are practically NOPs with respect to
> + * hardware and software states.
> + */

Maybe add a note that, since CLK_SET_PARENT_GATE is a prerequisite for
migrating "prepare", we also interpret this flag as meaning that it is
acceptable to enable the clock(s) in this context.

> + if (clk->prepare_count) {
> __clk_prepare(parent);
> -
> - flags = clk_enable_lock();
> -
> - /* migrate enable */
> - if (clk->enable_count) {
> - __clk_enable(parent);
> - migrated_enable = true;
> + clk_enable(parent);
> + clk_enable(clk);
> }
>
> /* update the clk tree topology */
> + flags = clk_enable_lock();
> clk_reparent(clk, parent);
> -
> clk_enable_unlock(flags);
>
> /* change clock input source */
> if (parent && clk->ops->set_parent)
> ret = clk->ops->set_parent(clk->hw, p_index);
> -
> if (ret) {
> - /*
> - * The error handling is tricky due to that we need to release
> - * the spinlock while issuing the .set_parent callback. This
> - * means the new parent might have been enabled/disabled in
> - * between, which must be considered when doing rollback.
> - */
> - flags = clk_enable_lock();
>
> + flags = clk_enable_lock();
> clk_reparent(clk, old_parent);
> -
> - if (migrated_enable && clk->enable_count) {
> - __clk_disable(parent);
> - } else if (migrated_enable && (clk->enable_count == 0)) {
> - __clk_disable(old_parent);
> - } else if (!migrated_enable && clk->enable_count) {
> - __clk_disable(parent);
> - __clk_enable(old_parent);
> - }
> -

Really good that we can remove this awkward error handling!

> clk_enable_unlock(flags);
>
> - if (clk->prepare_count)
> + if (clk->prepare_count) {
> + clk_disable(clk);
> + clk_disable(parent);
> __clk_unprepare(parent);
> -
> + }
> return ret;
> }
>
> - /* clean up enable for old parent if migration was done */
> - if (migrated_enable) {
> - flags = clk_enable_lock();
> - __clk_disable(old_parent);
> - clk_enable_unlock(flags);
> - }
> -
> - /* clean up prepare for old parent if migration was done */
> - if (clk->prepare_count)
> + /*
> + * Finish the migration of prepare state and undo the changes done
> + * for preventing a race with clk_enable().
> + */
> + if (clk->prepare_count) {
> + clk_disable(clk);
> + clk_disable(old_parent);
> __clk_unprepare(old_parent);
> + }
>
> /* update debugfs with new clk tree topology */
> clk_debug_reparent(clk, parent);
> --
> 1.7.8.3
>
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation

Looks good! Thanks for taking another round to fix up this kind of
tricky code. :-)

Acked-by: Ulf Hansson <[email protected]>

2013-05-16 04:07:33

by Saravana Kannan

Subject: [PATCH v2] clk: Fix race condition between clk_set_parent and clk_enable()

Without this patch, the following race condition is possible.
* clk-A has two parents - clk-X and clk-Y.
* All three are disabled and clk-X is current parent.
* Thread A: clk_set_parent(clk-A, clk-Y).
* Thread A: <snip execution flow>
* Thread A: Grabs enable lock.
* Thread A: Sees enable count of clk-A is 0, so doesn't enable clk-Y.
* Thread A: Updates clk-A SW parent to clk-Y
* Thread A: Releases enable lock.
* Thread B: clk_enable(clk-A).
* Thread B: clk_enable() enables clk-Y, then enables clk-A and returns.

clk-A is now enabled in software, but not clocking in hardware since the
hardware parent is still clk-X.
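
The interleaving above can be replayed deterministically in user space. The following is only a minimal sketch under stated assumptions: `struct toy_clk`, `toy_enable()`, `toy_is_clocking()` and `replay_race_leaves_clk_a_dead()` are hypothetical names invented for illustration and are not part of the common clock framework. The model keeps the software parent and the hardware parent as separate fields so that the mismatch becomes visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT the kernel's struct clk: every name here is hypothetical. */
struct toy_clk {
	const char *name;
	struct toy_clk *sw_parent;	/* parent as tracked by the framework */
	struct toy_clk *hw_parent;	/* parent the mux really selects */
	int enable_count;		/* software enable refcount */
	bool hw_gate_on;		/* state of the hardware gate */
};

/* Enable walks the *software* topology, parents first. */
static void toy_enable(struct toy_clk *c)
{
	if (!c)
		return;
	if (c->enable_count++ == 0) {
		toy_enable(c->sw_parent);
		c->hw_gate_on = true;
	}
}

/* A clock only ticks if its gate and every *hardware* ancestor is on. */
static bool toy_is_clocking(const struct toy_clk *c)
{
	for (; c; c = c->hw_parent)
		if (!c->hw_gate_on)
			return false;
	return true;
}

/* Replays the two-thread interleaving on a single thread. */
bool replay_race_leaves_clk_a_dead(void)
{
	struct toy_clk x = { .name = "clk-X" };
	struct toy_clk y = { .name = "clk-Y" };
	struct toy_clk a = { .name = "clk-A", .sw_parent = &x, .hw_parent = &x };

	/*
	 * Thread A, under the enable lock: the enable count of clk-A is 0,
	 * so clk-Y is left off and only the software parent is updated.
	 */
	if (a.enable_count)
		toy_enable(&y);
	a.sw_parent = &y;

	/* Thread B runs before thread A reaches the hardware mux. */
	toy_enable(&a);

	/* Software and hardware topology now disagree. */
	assert(a.sw_parent != a.hw_parent);

	/* Enabled in software, yet the mux still selects the gated clk-X. */
	return a.enable_count == 1 && y.hw_gate_on && !toy_is_clocking(&a);
}
```

In this model the function returns true: clk-Y was turned on for nothing, and clk-A is counted as enabled while its real input, clk-X, is still gated.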

The only way to avoid race conditions between clk_set_parent() and
clk_enable/disable() is to ensure that clk_enable/disable() calls don't
require changes to hardware enable state between changes to software clock
topology and hardware clock topology.

The options to achieve the above are:
1. Grab the enable lock before changing software/hardware topology and
release it afterwards.
2. Keep the clock enabled for the duration of software/hardware topology
change so that any additional enable/disable calls don't try to change
the hardware state. Once the topology change is complete, the clock can
be put back in its original enable state.

Option (1) is not an acceptable solution since the set_parent() ops might
need to sleep.

Therefore, this patch implements option (2).
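
Option (2) can be sketched in a toy user-space model. This is an illustration under assumptions, not the patch itself: `struct toy_clk`, `toy_enable()`, `toy_disable()`, `toy_set_parent()`, `racing_enable()` and the `in_window` hook are hypothetical. The hook stands in for a clk_enable() call from another thread that lands between the software and hardware topology updates. The real code does this only when the clock is prepared and uses clk_enable()/clk_disable() under the prepare lock.

```c
#include <assert.h>

/* Toy model, NOT the kernel's struct clk: every name here is hypothetical. */
struct toy_clk {
	struct toy_clk *parent;	/* software parent */
	int enable_count;	/* software enable refcount */
	int hw_gate_writes;	/* how often the gate register was touched */
};

static void toy_enable(struct toy_clk *c)
{
	if (!c)
		return;
	if (c->enable_count++ == 0) {
		toy_enable(c->parent);
		c->hw_gate_writes++;
	}
}

static void toy_disable(struct toy_clk *c)
{
	if (!c)
		return;
	assert(c->enable_count > 0);
	if (--c->enable_count == 0) {
		c->hw_gate_writes++;
		toy_disable(c->parent);
	}
}

/* Hypothetical hook: models another thread acting inside the window. */
typedef void (*window_fn)(struct toy_clk *clk);

void toy_set_parent(struct toy_clk *clk, struct toy_clk *new_parent,
		    window_fn in_window)
{
	struct toy_clk *old_parent = clk->parent;

	/* Pin the new parent and the clock itself on. */
	toy_enable(new_parent);
	toy_enable(clk);

	clk->parent = new_parent;	/* software topology */
	if (in_window)
		in_window(clk);		/* racing enable or disable */
	/* ... the hardware mux would be switched here ... */

	/* Drop the temporary references again. */
	toy_disable(clk);
	toy_disable(old_parent);
}

/* Gate writes caused by the racing enable; expected to stay at zero. */
int gate_writes_by_racer = -1;

void racing_enable(struct toy_clk *clk)
{
	int before = clk->hw_gate_writes + clk->parent->hw_gate_writes;

	toy_enable(clk);
	gate_writes_by_racer =
		clk->hw_gate_writes + clk->parent->hw_gate_writes - before;
}
```

Because the clock is already counted as enabled when the racing call arrives, that call only bumps a reference count and never has to touch a gate while the two topologies disagree. With no racing caller, all counts return to their original values.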

This patch doesn't violate any API semantics. clk_disable() doesn't
guarantee that the clock is actually disabled. So, no clients of a clock
can assume that a clock is disabled after their last call to clk_disable().
So, enabling the clock during a parent change is not a violation of any API
semantics.
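
The reference-counting argument can be made concrete with a small user-space sketch. `struct toy_clk`, `toy_enable()` and `toy_disable()` are hypothetical names for illustration only; they merely mimic how an enable count behaves for a clock shared by several consumers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared clock; not the kernel's struct clk. */
struct toy_clk {
	int enable_count;	/* number of outstanding enables */
	bool hw_on;		/* what the gate is really doing */
};

void toy_enable(struct toy_clk *c)
{
	if (c->enable_count++ == 0)
		c->hw_on = true;	/* first user turns the gate on */
}

void toy_disable(struct toy_clk *c)
{
	assert(c->enable_count > 0);	/* unbalanced disable is a bug */
	if (--c->enable_count == 0)
		c->hw_on = false;	/* last user turns the gate off */
}
```

If two consumers enable the same clock and one of them then calls toy_disable(), `hw_on` stays true. From that consumer's point of view the clock may keep running after its last disable, which is the property the patch relies on.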

This also has the nice side effect of simplifying the error handling code.
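
Why the rollback gets simpler can be seen in a toy model of the new error path. This is a sketch under assumptions: `struct toy_clk`, `toy_enable()`, `toy_disable()` and `toy_set_parent()` are hypothetical, the clock is assumed to be prepared, and the `mux_ok` flag stands in for the return value of the hardware set_parent op.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT the kernel's struct clk: every name here is hypothetical. */
struct toy_clk {
	struct toy_clk *parent;	/* software parent */
	int enable_count;	/* software enable refcount */
};

static void toy_enable(struct toy_clk *c)
{
	if (c && c->enable_count++ == 0)
		toy_enable(c->parent);
}

static void toy_disable(struct toy_clk *c)
{
	if (!c)
		return;
	assert(c->enable_count > 0);
	if (--c->enable_count == 0)
		toy_disable(c->parent);
}

/* mux_ok is a hypothetical knob: false models a failing set_parent op. */
int toy_set_parent(struct toy_clk *clk, struct toy_clk *new_parent,
		   bool mux_ok)
{
	struct toy_clk *old_parent = clk->parent;

	/* Take temporary references on the new parent and the clock. */
	toy_enable(new_parent);
	toy_enable(clk);
	clk->parent = new_parent;

	if (!mux_ok) {
		/*
		 * Roll back: restore the topology, then drop both temporary
		 * references. No case analysis on enable_count is needed.
		 */
		clk->parent = old_parent;
		toy_disable(clk);
		toy_disable(new_parent);
		return -1;
	}

	/* Success: drop the temporary references via the old parent. */
	toy_disable(clk);
	toy_disable(old_parent);
	return 0;
}
```

Both exits undo exactly the two references taken at the top, so the enable counts end up consistent whether or not other callers enabled or disabled the clock in between.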

Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/clk/clk.c | 91 ++++++++++++++++++++++++++---------------------------
1 files changed, 45 insertions(+), 46 deletions(-)

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 934cfd1..b4dbb8c 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -1377,67 +1377,61 @@ static int __clk_set_parent(struct clk *clk, struct clk *parent, u8 p_index)
 	unsigned long flags;
 	int ret = 0;
 	struct clk *old_parent = clk->parent;
-	bool migrated_enable = false;

-	/* migrate prepare */
-	if (clk->prepare_count)
+	/*
+	 * Migrate prepare state between parents and prevent race with
+	 * clk_enable().
+	 *
+	 * If the clock is not prepared, then a race with
+	 * clk_enable/disable() is impossible since we already have the
+	 * prepare lock (future calls to clk_enable() need to be preceded by
+	 * a clk_prepare()).
+	 *
+	 * If the clock is prepared, migrate the prepared state to the new
+	 * parent and also protect against a race with clk_enable() by
+	 * forcing the clock and the new parent on. This ensures that all
+	 * future calls to clk_enable() are practically NOPs with respect to
+	 * hardware and software states.
+	 *
+	 * See also: Comment for clk_set_parent() below.
+	 */
+	if (clk->prepare_count) {
 		__clk_prepare(parent);
-
-	flags = clk_enable_lock();
-
-	/* migrate enable */
-	if (clk->enable_count) {
-		__clk_enable(parent);
-		migrated_enable = true;
+		clk_enable(parent);
+		clk_enable(clk);
 	}

 	/* update the clk tree topology */
+	flags = clk_enable_lock();
 	clk_reparent(clk, parent);
-
 	clk_enable_unlock(flags);

 	/* change clock input source */
 	if (parent && clk->ops->set_parent)
 		ret = clk->ops->set_parent(clk->hw, p_index);
-
 	if (ret) {
-		/*
-		 * The error handling is tricky due to that we need to release
-		 * the spinlock while issuing the .set_parent callback. This
-		 * means the new parent might have been enabled/disabled in
-		 * between, which must be considered when doing rollback.
-		 */
-		flags = clk_enable_lock();

+		flags = clk_enable_lock();
 		clk_reparent(clk, old_parent);
-
-		if (migrated_enable && clk->enable_count) {
-			__clk_disable(parent);
-		} else if (migrated_enable && (clk->enable_count == 0)) {
-			__clk_disable(old_parent);
-		} else if (!migrated_enable && clk->enable_count) {
-			__clk_disable(parent);
-			__clk_enable(old_parent);
-		}
-
 		clk_enable_unlock(flags);

-		if (clk->prepare_count)
+		if (clk->prepare_count) {
+			clk_disable(clk);
+			clk_disable(parent);
 			__clk_unprepare(parent);
-
+		}
 		return ret;
 	}

-	/* clean up enable for old parent if migration was done */
-	if (migrated_enable) {
-		flags = clk_enable_lock();
-		__clk_disable(old_parent);
-		clk_enable_unlock(flags);
-	}
-
-	/* clean up prepare for old parent if migration was done */
-	if (clk->prepare_count)
+	/*
+	 * Finish the migration of prepare state and undo the changes done
+	 * for preventing a race with clk_enable().
+	 */
+	if (clk->prepare_count) {
+		clk_disable(clk);
+		clk_disable(old_parent);
 		__clk_unprepare(old_parent);
+	}

 	/* update debugfs with new clk tree topology */
 	clk_debug_reparent(clk, parent);
@@ -1449,12 +1443,17 @@ static int __clk_set_parent(struct clk *clk, struct clk *parent, u8 p_index)
* @clk: the mux clk whose input we are switching
* @parent: the new input to clk
*
- * Re-parent clk to use parent as it's new input source. If clk has the
- * CLK_SET_PARENT_GATE flag set then clk must be gated for this
- * operation to succeed. After successfully changing clk's parent
- * clk_set_parent will update the clk topology, sysfs topology and
- * propagate rate recalculation via __clk_recalc_rates. Returns 0 on
- * success, -EERROR otherwise.
+ * Re-parent clk to use parent as its new input source. If clk is in
+ * prepared state, the clk will get enabled for the duration of this call. If
+ * that's not acceptable for a specific clk (Eg: the consumer can't handle
+ * that, the reparenting is glitchy in hardware, etc), use the
+ * CLK_SET_PARENT_GATE flag to allow reparenting only when clk is unprepared.
+ *
+ * After successfully changing clk's parent clk_set_parent will update the
+ * clk topology, sysfs topology and propagate rate recalculation via
+ * __clk_recalc_rates.
+ *
+ * Returns 0 on success, -EERROR otherwise.
*/
int clk_set_parent(struct clk *clk, struct clk *parent)
{
--
1.7.8.3

The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2013-05-16 04:17:08

by Saravana Kannan

Subject: Re: [PATCH] clk: Fix race condition between clk_set_parent and clk_enable()

On 05/15/2013 12:24 PM, Ulf Hansson wrote:
> On 1 May 2013 06:42, Saravana Kannan <[email protected]> wrote:

<snip>

>> - /* migrate prepare */
>> - if (clk->prepare_count)
>> + /*
>> + * Migrate prepare state between parents and prevent race with
>> + * clk_enable().
>> + *
>> + * If the clock is not prepared, then a race with
>> + * clk_enable/disable() is impossible since we already have the
>> + * prepare lock (future calls to clk_enable() need to be preceded by
>> + * a clk_prepare()).
>> + *
>> + * If the clock is prepared, migrate the prepared state to the new
>> + * parent and also protect against a race with clk_enable() by
>> + * forcing the clock and the new parent on. This ensures that all
>> + * future calls to clk_enable() are practically NOPs with respect to
>> + * hardware and software states.
>> + */
>
> Maybe add a note that, since CLK_SET_PARENT_GATE is a prerequisite for
> migrating "prepare", we also interpret this flag as meaning that it is
> acceptable to enable the clock(s) in this context.

Done. Sent v2 patch.

<snip>

>
> Looks good! Thanks for taking another round to fix up this kind of
> tricky code. :-)

Thanks :-)

> Acked-by: Ulf Hansson <[email protected]>
>

Thanks again.

-Saravana

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2013-05-16 20:45:01

by Mike Turquette

Subject: Re: [PATCH v2] clk: Fix race condition between clk_set_parent and clk_enable()

Quoting Saravana Kannan (2013-05-15 21:07:24)
> Without this patch, the following race condition is possible.
> * clk-A has two parents - clk-X and clk-Y.
> * All three are disabled and clk-X is current parent.
> * Thread A: clk_set_parent(clk-A, clk-Y).
> * Thread A: <snip execution flow>
> * Thread A: Grabs enable lock.
> * Thread A: Sees enable count of clk-A is 0, so doesn't enable clk-Y.
> * Thread A: Updates clk-A SW parent to clk-Y
> * Thread A: Releases enable lock.
> * Thread B: clk_enable(clk-A).
> * Thread B: clk_enable() enables clk-Y, then enables clk-A and returns.
>
> clk-A is now enabled in software, but not clocking in hardware since the
> hardware parent is still clk-X.
>
> The only way to avoid race conditions between clk_set_parent() and
> clk_enable/disable() is to ensure that clk_enable/disable() calls don't
> require changes to hardware enable state between changes to software clock
> topology and hardware clock topology.
>
> The options to achieve the above are:
> 1. Grab the enable lock before changing software/hardware topology and
> release it afterwards.
> 2. Keep the clock enabled for the duration of software/hardware topology
> change so that any additional enable/disable calls don't try to change
> the hardware state. Once the topology change is complete, the clock can
> be put back in its original enable state.
>
> Option (1) is not an acceptable solution since the set_parent() ops might
> need to sleep.
>
> Therefore, this patch implements option (2).
>
> This patch doesn't violate any API semantics. clk_disable() doesn't
> guarantee that the clock is actually disabled. So, no clients of a clock
> can assume that a clock is disabled after their last call to clk_disable().
> So, enabling the clock during a parent change is not a violation of any API
> semantics.
>
> This also has the nice side effect of simplifying the error handling code.
>
> Signed-off-by: Saravana Kannan <[email protected]>

Updated to this version in clk-next.

Thanks,
Mike

> ---
> drivers/clk/clk.c | 91 ++++++++++++++++++++++++++---------------------------
> 1 files changed, 45 insertions(+), 46 deletions(-)
>
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 934cfd1..b4dbb8c 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -1377,67 +1377,61 @@ static int __clk_set_parent(struct clk *clk, struct clk *parent, u8 p_index)
> unsigned long flags;
> int ret = 0;
> struct clk *old_parent = clk->parent;
> - bool migrated_enable = false;
>
> - /* migrate prepare */
> - if (clk->prepare_count)
> + /*
> + * Migrate prepare state between parents and prevent race with
> + * clk_enable().
> + *
> + * If the clock is not prepared, then a race with
> + * clk_enable/disable() is impossible since we already have the
> + * prepare lock (future calls to clk_enable() need to be preceded by
> + * a clk_prepare()).
> + *
> + * If the clock is prepared, migrate the prepared state to the new
> + * parent and also protect against a race with clk_enable() by
> + * forcing the clock and the new parent on. This ensures that all
> + * future calls to clk_enable() are practically NOPs with respect to
> + * hardware and software states.
> + *
> + * See also: Comment for clk_set_parent() below.
> + */
> + if (clk->prepare_count) {
> __clk_prepare(parent);
> -
> - flags = clk_enable_lock();
> -
> - /* migrate enable */
> - if (clk->enable_count) {
> - __clk_enable(parent);
> - migrated_enable = true;
> + clk_enable(parent);
> + clk_enable(clk);
> }
>
> /* update the clk tree topology */
> + flags = clk_enable_lock();
> clk_reparent(clk, parent);
> -
> clk_enable_unlock(flags);
>
> /* change clock input source */
> if (parent && clk->ops->set_parent)
> ret = clk->ops->set_parent(clk->hw, p_index);
> -
> if (ret) {
> - /*
> - * The error handling is tricky due to that we need to release
> - * the spinlock while issuing the .set_parent callback. This
> - * means the new parent might have been enabled/disabled in
> - * between, which must be considered when doing rollback.
> - */
> - flags = clk_enable_lock();
>
> + flags = clk_enable_lock();
> clk_reparent(clk, old_parent);
> -
> - if (migrated_enable && clk->enable_count) {
> - __clk_disable(parent);
> - } else if (migrated_enable && (clk->enable_count == 0)) {
> - __clk_disable(old_parent);
> - } else if (!migrated_enable && clk->enable_count) {
> - __clk_disable(parent);
> - __clk_enable(old_parent);
> - }
> -
> clk_enable_unlock(flags);
>
> - if (clk->prepare_count)
> + if (clk->prepare_count) {
> + clk_disable(clk);
> + clk_disable(parent);
> __clk_unprepare(parent);
> -
> + }
> return ret;
> }
>
> - /* clean up enable for old parent if migration was done */
> - if (migrated_enable) {
> - flags = clk_enable_lock();
> - __clk_disable(old_parent);
> - clk_enable_unlock(flags);
> - }
> -
> - /* clean up prepare for old parent if migration was done */
> - if (clk->prepare_count)
> + /*
> + * Finish the migration of prepare state and undo the changes done
> + * for preventing a race with clk_enable().
> + */
> + if (clk->prepare_count) {
> + clk_disable(clk);
> + clk_disable(old_parent);
> __clk_unprepare(old_parent);
> + }
>
> /* update debugfs with new clk tree topology */
> clk_debug_reparent(clk, parent);
> @@ -1449,12 +1443,17 @@ static int __clk_set_parent(struct clk *clk, struct clk *parent, u8 p_index)
> * @clk: the mux clk whose input we are switching
> * @parent: the new input to clk
> *
> - * Re-parent clk to use parent as it's new input source. If clk has the
> - * CLK_SET_PARENT_GATE flag set then clk must be gated for this
> - * operation to succeed. After successfully changing clk's parent
> - * clk_set_parent will update the clk topology, sysfs topology and
> - * propagate rate recalculation via __clk_recalc_rates. Returns 0 on
> - * success, -EERROR otherwise.
> + * Re-parent clk to use parent as its new input source. If clk is in
> + * prepared state, the clk will get enabled for the duration of this call. If
> + * that's not acceptable for a specific clk (Eg: the consumer can't handle
> + * that, the reparenting is glitchy in hardware, etc), use the
> + * CLK_SET_PARENT_GATE flag to allow reparenting only when clk is unprepared.
> + *
> + * After successfully changing clk's parent clk_set_parent will update the
> + * clk topology, sysfs topology and propagate rate recalculation via
> + * __clk_recalc_rates.
> + *
> + * Returns 0 on success, -EERROR otherwise.
> */
> int clk_set_parent(struct clk *clk, struct clk *parent)
> {
> --
> 1.7.8.3
>
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation

2013-05-16 21:31:26

by Saravana Kannan

Subject: Re: [PATCH v2] clk: Fix race condition between clk_set_parent and clk_enable()

On 05/16/2013 01:44 PM, Mike Turquette wrote:
> Quoting Saravana Kannan (2013-05-15 21:07:24)
>> Without this patch, the following race condition is possible.
>> * clk-A has two parents - clk-X and clk-Y.
>> * All three are disabled and clk-X is current parent.
>> * Thread A: clk_set_parent(clk-A, clk-Y).
>> * Thread A: <snip execution flow>
>> * Thread A: Grabs enable lock.
>> * Thread A: Sees enable count of clk-A is 0, so doesn't enable clk-Y.
>> * Thread A: Updates clk-A SW parent to clk-Y
>> * Thread A: Releases enable lock.
>> * Thread B: clk_enable(clk-A).
>> * Thread B: clk_enable() enables clk-Y, then enables clk-A and returns.
>>
>> clk-A is now enabled in software, but not clocking in hardware since the
>> hardware parent is still clk-X.
>>
>> The only way to avoid race conditions between clk_set_parent() and
>> clk_enable/disable() is to ensure that clk_enable/disable() calls don't
>> require changes to hardware enable state between changes to software clock
>> topology and hardware clock topology.
>>
>> The options to achieve the above are:
>> 1. Grab the enable lock before changing software/hardware topology and
>> release it afterwards.
>> 2. Keep the clock enabled for the duration of software/hardware topology
>> change so that any additional enable/disable calls don't try to change
>> the hardware state. Once the topology change is complete, the clock can
>> be put back in its original enable state.
>>
>> Option (1) is not an acceptable solution since the set_parent() ops might
>> need to sleep.
>>
>> Therefore, this patch implements option (2).
>>
>> This patch doesn't violate any API semantics. clk_disable() doesn't
>> guarantee that the clock is actually disabled. So, no clients of a clock
>> can assume that a clock is disabled after their last call to clk_disable().
>> So, enabling the clock during a parent change is not a violation of any API
>> semantics.
>>
>> This also has the nice side effect of simplifying the error handling code.
>>
>> Signed-off-by: Saravana Kannan <[email protected]>
>
> Updated to this version in clk-next.
>

Thanks, Mike. I forgot to add Ulf's Ack. It would be nice if you could
put that in.

Btw, I did send this email to the list, but it looks like the mail is
wedged in the series of tubes that is the arm mailing list.

-Saravana


--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation