2012-02-02 19:24:51

by Stephen Boyd

Subject: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

armv7's flush_cache_all() flushes caches via set/way. To
determine the cache attributes (line size, number of sets,
etc.) the assembly first writes the CSSELR register to select a
cache level and then reads the CCSIDR register. The CSSELR register
is banked per-cpu and is used to determine which cache level CCSIDR
reads. If the task is migrated between when the CSSELR is written and
the CCSIDR is read the CCSIDR value may be for an unexpected cache
level (for example L1 instead of L2) and incorrect cache flushing
could occur.
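
The failure mode described above can be modelled in userspace C. This is
only a sketch under stated assumptions: model_cpu, write_csselr,
read_ccsidr and select_then_read are hypothetical names, the two CCSIDR
values are made up, and the cache level is used directly as an array index
rather than in the real CSSELR encoding. It shows that a selector written
on one CPU has no effect on what a different CPU reads back.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS   2
#define NR_LEVELS 2

/* Hypothetical model: every CPU has its own (banked) CSSELR. */
struct model_cpu {
	uint32_t csselr;	/* cache level selected on this CPU */
};

static struct model_cpu cpus[NR_CPUS];	/* all CPUs start with level 0 (L1) selected */

/* Made-up CCSIDR contents; they only need to differ per level. */
static const uint32_t ccsidr_by_level[NR_LEVELS] = {
	0x1111u,	/* index 0: stands in for L1 */
	0x2222u,	/* index 1: stands in for L2 */
};

static void write_csselr(int cpu, uint32_t level)
{
	cpus[cpu].csselr = level;
}

/* The value read depends on the selector of the CPU doing the read. */
static uint32_t read_ccsidr(int cpu)
{
	return ccsidr_by_level[cpus[cpu].csselr % NR_LEVELS];
}

/*
 * Select a level on write_cpu, then read CCSIDR on read_cpu.
 * write_cpu != read_cpu models a migration between the two steps.
 */
static uint32_t select_then_read(int write_cpu, int read_cpu, uint32_t level)
{
	write_csselr(write_cpu, level);
	return read_ccsidr(read_cpu);
}
```

In this model select_then_read(0, 0, 1) returns the L2 value, while
select_then_read(0, 1, 1) returns the L1 value because CPU 1 never had its
selector changed.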

Disable preemption across the write and read so that the correct
cache attributes are read and used for the cache flushing
routine. This fixes a problem we see in scm_call() when
flush_cache_all() is called from preemptible context and
sometimes the L2 cache is not properly flushed out.

Signed-off-by: Stephen Boyd <[email protected]>
Cc: Catalin Marinas <[email protected]>
---

Should we move get_thread_info into assembler.h? It seems odd
to include entry-header.S but I saw that vfp was doing the same.

arch/arm/mm/cache-v7.S | 11 +++++++++++
1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
index 07c4bc8..a033858 100644
--- a/arch/arm/mm/cache-v7.S
+++ b/arch/arm/mm/cache-v7.S
@@ -16,6 +16,7 @@
#include <asm/unwind.h>

#include "proc-macros.S"
+#include "../kernel/entry-header.S"

/*
* v7_flush_icache_all()
@@ -54,9 +55,19 @@ loop1:
and r1, r1, #7 @ mask of the bits for current cache only
cmp r1, #2 @ see what cache we have at this level
blt skip @ skip if no cache, or just i-cache
+#ifdef CONFIG_PREEMPT
+ get_thread_info r9
+ ldr r11, [r9, #TI_PREEMPT] @ get preempt count
+ add r11, r11, #1 @ increment it
+ str r11, [r9, #TI_PREEMPT]
+#endif
mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr
isb @ isb to sych the new cssr&csidr
mrc p15, 1, r1, c0, c0, 0 @ read the new csidr
+#ifdef CONFIG_PREEMPT
+ sub r11, r11, #1 @ decrement preempt count
+ str r11, [r9, #TI_PREEMPT]
+#endif
and r2, r1, #7 @ extract the length of the cache lines
add r2, r2, #4 @ add 4 (line length offset)
ldr r4, =0x3ff
--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


2012-02-02 20:44:30

by Russell King - ARM Linux

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Thu, Feb 02, 2012 at 11:24:46AM -0800, Stephen Boyd wrote:
> armv7's flush_cache_all() flushes caches via set/way. To
> determine the cache attributes (line size, number of sets,
> etc.) the assembly first writes the CSSELR register to select a
> cache level and then reads the CCSIDR register. The CSSELR register
> is banked per-cpu and is used to determine which cache level CCSIDR
> reads. If the task is migrated between when the CSSELR is written and
> the CCSIDR is read the CCSIDR value may be for an unexpected cache
> level (for example L1 instead of L2) and incorrect cache flushing
> could occur.
>
> Disable preemption across the write and read so that the correct
> cache attributes are read and used for the cache flushing
> routine. This fixes a problem we see in scm_call() when
> flush_cache_all() is called from preemptible context and
> sometimes the L2 cache is not properly flushed out.

This isn't going to work for two reasons:

(1) (and the VFP code suffers from this) after we re-enable preemption,
we really should check for a pending preemption event in every case.

(2) v7_flush_dcache_all() is called from __v7_setup() using a very small
private stack. This doesn't have a thread info structure at the
bottom.

So, if we need to disable preemption here, we need to find a different
solution to it.

> Should we move get_thread_info into assembler.h? It seems odd
> to include entry-header.S but I saw that vfp was doing the same.

Probably yes, and probably also have preempt_disable and preempt_enable
assembler macros. That's going to get rather icky if we have to
explicitly call the scheduler though (to solve (1)).

2012-02-02 21:38:06

by Nicolas Pitre

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Thu, 2 Feb 2012, Russell King - ARM Linux wrote:

> On Thu, Feb 02, 2012 at 11:24:46AM -0800, Stephen Boyd wrote:
> > Should we move get_thread_info into assembler.h? It seems odd
> > to include entry-header.S but I saw that vfp was doing the same.
>
> Probably yes, and probably also have preempt_disable and preempt_enable
> assembler macros. That's going to get rather icky if we have to
> explicitly call the scheduler though (to solve (1)).

What about a pair of helpers written in C instead?

v7_flush_dcache_all() could be renamed, and a wrapper function called
v7_flush_dcache_all() would call the preemption disable helper, call the
former v7_flush_dcache_all code, then call the preemption enable helper.

Then __v7_setup() could still call the core cache flush code without
issues.


Nicolas

2012-02-02 23:36:52

by Stephen Boyd

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On 02/02/12 13:38, Nicolas Pitre wrote:
> On Thu, 2 Feb 2012, Russell King - ARM Linux wrote
>> On Thu, Feb 02, 2012 at 11:24:46AM -0800, Stephen Boyd wrote:
>>> Should we move get_thread_info into assembler.h? It seems odd
>>> to include entry-header.S but I saw that vfp was doing the same.
>> Probably yes, and probably also have preempt_disable and preempt_enable
>> assembler macros. That's going to get rather icky if we have to
>> explicitly call the scheduler though (to solve (1)).
> What about a pair of helpers written in C instead?
>
> v7_flush_dcache_all() could be renamed, and a wrapper function called
> v7_flush_dcache_all() would call the preemption disable helper, call the
> former v7_flush_dcache_all code, then call the preemption enable helper.
>
> Then __v7_setup() could still call the core cache flush code without
> issues.

I tried to put the preemption disable/enable right around the place
where it was needed. With this approach we would disable preemption
during the entire cache flush. I'm not sure if we want to make this
function worse for performance, do we? It certainly sounds easier than
writing all the preempt macros in assembly though.


2012-02-03 00:36:53

by Russell King - ARM Linux

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Thu, Feb 02, 2012 at 03:36:49PM -0800, Stephen Boyd wrote:
> On 02/02/12 13:38, Nicolas Pitre wrote:
> > On Thu, 2 Feb 2012, Russell King - ARM Linux wrote
> >> On Thu, Feb 02, 2012 at 11:24:46AM -0800, Stephen Boyd wrote:
> >>> Should we move get_thread_info into assembler.h? It seems odd
> >>> to include entry-header.S but I saw that vfp was doing the same.
> >> Probably yes, and probably also have preempt_disable and preempt_enable
> >> assembler macros. That's going to get rather icky if we have to
> >> explicitly call the scheduler though (to solve (1)).
> > What about a pair of helpers written in C instead?
> >
> > v7_flush_dcache_all() could be renamed, and a wrapper function called
> > v7_flush_dcache_all() would call the preemption disable helper, call the
> > former v7_flush_dcache_all code, then call the preemption enable helper.
> >
> > Then __v7_setup() could still call the core cache flush code without
> > issues.
>
> I tried to put the preemption disable/enable right around the place
> where it was needed. With this approach we would disable preemption
> during the entire cache flush. I'm not sure if we want to make this
> function worse for performance, do we? It certainly sounds easier than
> writing all the preempt macros in assembly though.

Err, why do you think it's a big task?

preempt disable is a case of incrementing the thread preempt count, while
preempt enable is a case of decrementing it, testing for zero, if zero,
then checking whether TIF_NEED_RESCHED is set and calling a function.
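
The sequence described above can be sketched in plain C. This is a
userspace model, not the kernel's implementation: model_thread_info,
model_preempt_disable, model_preempt_enable and model_preempt_schedule are
hypothetical names, and the need_resched field merely stands in for
TIF_NEED_RESCHED.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fields of struct thread_info used here. */
struct model_thread_info {
	int  preempt_count;	/* > 0 means preemption is disabled */
	bool need_resched;	/* stands in for TIF_NEED_RESCHED */
	int  schedule_calls;	/* how often the stand-in scheduler ran */
};

/* Stand-in for "calling a function" to reschedule. */
static void model_preempt_schedule(struct model_thread_info *ti)
{
	ti->need_resched = false;
	ti->schedule_calls++;
}

/* Disable: just increment the count. */
static void model_preempt_disable(struct model_thread_info *ti)
{
	ti->preempt_count++;
}

/*
 * Enable: decrement, test for zero, and only then check whether a
 * reschedule became pending while preemption was off.
 */
static void model_preempt_enable(struct model_thread_info *ti)
{
	if (--ti->preempt_count == 0 && ti->need_resched)
		model_preempt_schedule(ti);
}
```

The check on enable is what point (1) earlier in the thread asks for: a
patch that only decrements the count would miss a reschedule request that
arrived inside the protected region.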

If that's too much, then the simple method in assembly to quickly disable
preemption over a very few set of instructions is using mrs/msr and cpsid i.
That'll be far cheaper than fiddling about with preempt counters or
messing about with veneers in C code.

2012-02-03 00:49:09

by Stephen Boyd

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On 02/02/12 16:36, Russell King - ARM Linux wrote:
> On Thu, Feb 02, 2012 at 03:36:49PM -0800, Stephen Boyd wrote:
>> On 02/02/12 13:38, Nicolas Pitre wrote:
>>> On Thu, 2 Feb 2012, Russell King - ARM Linux wrote
>>>> On Thu, Feb 02, 2012 at 11:24:46AM -0800, Stephen Boyd wrote:
>>>>> Should we move get_thread_info into assembler.h? It seems odd
>>>>> to include entry-header.S but I saw that vfp was doing the same.
>>>> Probably yes, and probably also have preempt_disable and preempt_enable
>>>> assembler macros. That's going to get rather icky if we have to
>>>> explicitly call the scheduler though (to solve (1)).
>>> What about a pair of helpers written in C instead?
>>>
>>> v7_flush_dcache_all() could be renamed, and a wrapper function called
>>> v7_flush_dcache_all() would call the preemption disable helper, call the
>>> former v7_flush_dcache_all code, then call the preemption enable helper.
>>>
>>> Then __v7_setup() could still call the core cache flush code without
>>> issues.
>> I tried to put the preemption disable/enable right around the place
>> where it was needed. With this approach we would disable preemption
>> during the entire cache flush. I'm not sure if we want to make this
>> function worse for performance, do we? It certainly sounds easier than
>> writing all the preempt macros in assembly though.
> Err, why do you think it's a big task?
>
> preempt disable is a case of incrementing the thread preempt count, while
> preempt enable is a case of decrementing it, testing for zero, if zero,
> then checking whether TIF_NEED_RESCHED is set and calling a function.
>
> If that's too much, then the simple method in assembly to quickly disable
> preemption over a very few set of instructions is using mrs/msr and cpsid i.
> That'll be far cheaper than fiddling about with preempt counters or
> messing about with veneers in C code.

I'll try the macros. So far it isn't bad, just the __v7_setup to resolve.


2012-02-03 01:16:28

by Nicolas Pitre

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Fri, 3 Feb 2012, Russell King - ARM Linux wrote:

> On Thu, Feb 02, 2012 at 03:36:49PM -0800, Stephen Boyd wrote:
> > On 02/02/12 13:38, Nicolas Pitre wrote:
> > > What about a pair of helpers written in C instead?
> > >
> > > v7_flush_dcache_all() could be renamed, and a wrapper function called
> > > v7_flush_dcache_all() would call the preemption disable helper, call the
> > > former v7_flush_dcache_all code, then call the preemption enable helper.
> > >
> > > Then __v7_setup() could still call the core cache flush code without
> > > issues.
> >
> > I tried to put the preemption disable/enable right around the place
> > where it was needed. With this approach we would disable preemption
> > during the entire cache flush. I'm not sure if we want to make this
> > function worse for performance, do we? It certainly sounds easier than
> > writing all the preempt macros in assembly though.
>
> Err, why do you think it's a big task?
>
> preempt disable is a case of incrementing the thread preempt count, while
> preempt enable is a case of decrementing it, testing for zero, if zero,
> then checking whether TIF_NEED_RESCHED is set and calling a function.

Oh certainly. And we already do just that in a few places. I
re-read your previous email and realized that I had initially misread your
remark about the ickiness of explicitly calling the scheduler.

> If that's too much, then the simple method in assembly to quickly disable
> preemption over a very few set of instructions is using mrs/msr and cpsid i.
> That'll be far cheaper than fiddling about with preempt counters or
> messing about with veneers in C code.

Indeed. And I think that would be plenty sufficient here as the
protected region is really short. I don't think that warrants any
macros.


Nicolas

2012-02-03 01:19:33

by Nicolas Pitre

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Thu, 2 Feb 2012, Stephen Boyd wrote:

> On 02/02/12 16:36, Russell King - ARM Linux wrote:
> > On Thu, Feb 02, 2012 at 03:36:49PM -0800, Stephen Boyd wrote:
> >> On 02/02/12 13:38, Nicolas Pitre wrote:
> >>> On Thu, 2 Feb 2012, Russell King - ARM Linux wrote
> >>>> On Thu, Feb 02, 2012 at 11:24:46AM -0800, Stephen Boyd wrote:
> >>>>> Should we move get_thread_info into assembler.h? It seems odd
> >>>>> to include entry-header.S but I saw that vfp was doing the same.
> >>>> Probably yes, and probably also have preempt_disable and preempt_enable
> >>>> assembler macros. That's going to get rather icky if we have to
> >>>> explicitly call the scheduler though (to solve (1)).
> >>> What about a pair of helpers written in C instead?
> >>>
> >>> v7_flush_dcache_all() could be renamed, and a wrapper function called
> >>> v7_flush_dcache_all() would call the preemption disable helper, call the
> >>> former v7_flush_dcache_all code, then call the preemption enable helper.
> >>>
> >>> Then __v7_setup() could still call the core cache flush code without
> >>> issues.
> >> I tried to put the preemption disable/enable right around the place
> >> where it was needed. With this approach we would disable preemption
> >> during the entire cache flush. I'm not sure if we want to make this
> >> function worse for performance, do we? It certainly sounds easier than
> >> writing all the preempt macros in assembly though.
> > Err, why do you think it's a big task?
> >
> > preempt disable is a case of incrementing the thread preempt count, while
> > preempt enable is a case of decrementing it, testing for zero, if zero,
> > then checking whether TIF_NEED_RESCHED is set and calling a function.
> >
> > If that's too much, then the simple method in assembly to quickly disable
> > preemption over a very few set of instructions is using mrs/msr and cpsid i.
> > That'll be far cheaper than fiddling about with preempt counters or
> > messing about with veneers in C code.
>
> I'll try the macros. So far it isn't bad, just the __v7_setup to resolve.

If you simply disable/restore IRQs around the critical region then you
don't have to worry about __v7_setup. Plus this will allow for
v7_flush_dcache_all to still be callable from atomic context.
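
A small userspace model of why save/restore works from any context. This
is a sketch only: model_cpsr and the model_* functions are hypothetical
stand-ins for the CPSR and for the save_and_disable_irqs / restore_irqs
assembler macros; the only architectural fact used is that the IRQ mask is
bit 7 (0x80) of the CPSR.

```c
#include <assert.h>
#include <stdint.h>

#define PSR_I_BIT 0x80u		/* IRQ mask bit in the ARM CPSR */

static uint32_t model_cpsr;	/* stand-in for the real CPSR */

/* Remember the old state, then mask IRQs. */
static uint32_t model_save_and_disable_irqs(void)
{
	uint32_t flags = model_cpsr;

	model_cpsr |= PSR_I_BIT;
	return flags;
}

/* Put back whatever state the caller had, masked or not. */
static void model_restore_irqs(uint32_t flags)
{
	model_cpsr = flags;
}

static int model_irqs_disabled(void)
{
	return (model_cpsr & PSR_I_BIT) != 0;
}
```

Because the previous state is restored rather than IRQs being
unconditionally re-enabled, a caller that already had IRQs masked still
has them masked afterwards, and no thread_info is needed at all.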


Nicolas

2012-02-03 02:03:52

by Stephen Boyd

Subject: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

armv7's flush_cache_all() flushes caches via set/way. To
determine the cache attributes (line size, number of sets,
etc.) the assembly first writes the CSSELR register to select a
cache level and then reads the CCSIDR register. The CSSELR register
is banked per-cpu and is used to determine which cache level CCSIDR
reads. If the task is migrated between when the CSSELR is written and
the CCSIDR is read the CCSIDR value may be for an unexpected cache
level (for example L1 instead of L2) and incorrect cache flushing
could occur.

Disable interrupts across the write and read so that the correct
cache attributes are read and used for the cache flushing
routine. We disable interrupts instead of disabling preemption
because the critical section is only 3 instructions and we want
to call v7_flush_dcache_all from __v7_setup which doesn't have a
full kernel stack with a struct thread_info.

This fixes a problem we see in scm_call() when flush_cache_all()
is called from preemptible context and sometimes the L2 cache is
not properly flushed out.

Signed-off-by: Stephen Boyd <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Nicolas Pitre <[email protected]>
---

On 02/02/12 17:18, Nicolas Pitre wrote:
>>> If that's too much, then the simple method in assembly to quickly disable
>>> preemption over a very few set of instructions is using mrs/msr and cpsid i.
>>> That'll be far cheaper than fiddling about with preempt counters or
>>> messing about with veneers in C code.
>>
>> I'll try the macros. So far it isn't bad, just the __v7_setup to resolve.
>
> If you simply disable/restore IRQs around the critical region then you
> don't have to worry about __v7_setup. Plus this will allow for
> v7_flush_dcache_all to still be callable from atomic context.

Ok. Here's a patch. I still need to test it. I'll send another patch
series to cleanup the get_thread_info stuff (there's two of them?).

arch/arm/mm/cache-v7.S | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
index 07c4bc8..654a5fc 100644
--- a/arch/arm/mm/cache-v7.S
+++ b/arch/arm/mm/cache-v7.S
@@ -54,9 +54,15 @@ loop1:
and r1, r1, #7 @ mask of the bits for current cache only
cmp r1, #2 @ see what cache we have at this level
blt skip @ skip if no cache, or just i-cache
+#ifdef CONFIG_PREEMPT
+ save_and_disable_irqs r9 @ make cssr&csidr read atomic
+#endif
mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr
isb @ isb to sych the new cssr&csidr
mrc p15, 1, r1, c0, c0, 0 @ read the new csidr
+#ifdef CONFIG_PREEMPT
+ restore_irqs r9
+#endif
and r2, r1, #7 @ extract the length of the cache lines
add r2, r2, #4 @ add 4 (line length offset)
ldr r4, =0x3ff

2012-02-03 02:36:01

by Nicolas Pitre

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Thu, 2 Feb 2012, Stephen Boyd wrote:
> On 02/02/12 17:18, Nicolas Pitre wrote:
> > If you simply disable/restore IRQs around the critical region then you
> > don't have to worry about __v7_setup. Plus this will allow for
> > v7_flush_dcache_all to still be callable from atomic context.
>
> Ok. Here's a patch. I still need to test it. I'll send another patch
> series to cleanup the get_thread_info stuff (there's two of them?).
>
> arch/arm/mm/cache-v7.S | 6 ++++++
> 1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
> index 07c4bc8..654a5fc 100644
> --- a/arch/arm/mm/cache-v7.S
> +++ b/arch/arm/mm/cache-v7.S
> @@ -54,9 +54,15 @@ loop1:
> and r1, r1, #7 @ mask of the bits for current cache only
> cmp r1, #2 @ see what cache we have at this level
> blt skip @ skip if no cache, or just i-cache
> +#ifdef CONFIG_PREEMPT
> + save_and_disable_irqs r9 @ make cssr&csidr read atomic
> +#endif
> mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr
> isb @ isb to sych the new cssr&csidr
> mrc p15, 1, r1, c0, c0, 0 @ read the new csidr
> +#ifdef CONFIG_PREEMPT
> + restore_irqs r9
> +#endif

I'd suggest using restore_irqs_notrace instead. The IRQ-off period is
so small that there is no point tracing it.

With that change:

Reviewed-by: Nicolas Pitre <[email protected]>


Nicolas

2012-02-03 02:37:41

by Stephen Boyd

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On 02/02/12 18:35, Nicolas Pitre wrote:
> On Thu, 2 Feb 2012, Stephen Boyd wrote:
>> On 02/02/12 17:18, Nicolas Pitre wrote:
>>> If you simply disable/restore IRQs around the critical region then you
>>> don't have to worry about __v7_setup. Plus this will allow for
>>> v7_flush_dcache_all to still be callable from atomic context.
>> Ok. Here's a patch. I still need to test it. I'll send another patch
>> series to cleanup the get_thread_info stuff (there's two of them?).
>>
>> arch/arm/mm/cache-v7.S | 6 ++++++
>> 1 files changed, 6 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
>> index 07c4bc8..654a5fc 100644
>> --- a/arch/arm/mm/cache-v7.S
>> +++ b/arch/arm/mm/cache-v7.S
>> @@ -54,9 +54,15 @@ loop1:
>> and r1, r1, #7 @ mask of the bits for current cache only
>> cmp r1, #2 @ see what cache we have at this level
>> blt skip @ skip if no cache, or just i-cache
>> +#ifdef CONFIG_PREEMPT
>> + save_and_disable_irqs r9 @ make cssr&csidr read atomic
>> +#endif
>> mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr
>> isb @ isb to sych the new cssr&csidr
>> mrc p15, 1, r1, c0, c0, 0 @ read the new csidr
>> +#ifdef CONFIG_PREEMPT
>> + restore_irqs r9
>> +#endif
> I'd suggest using restore_irqs_notrace instead. The IRQ-off period is
> so small that there is no point tracing it.

Thanks. I'll make sure to do that before uploading to the patch tracker.


2012-02-03 03:04:07

by Nicolas Pitre

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Thu, 2 Feb 2012, Stephen Boyd wrote:

> On 02/02/12 18:35, Nicolas Pitre wrote:
> > On Thu, 2 Feb 2012, Stephen Boyd wrote:
> >> On 02/02/12 17:18, Nicolas Pitre wrote:
> >>> If you simply disable/restore IRQs around the critical region then you
> >>> don't have to worry about __v7_setup. Plus this will allow for
> >>> v7_flush_dcache_all to still be callable from atomic context.
> >> Ok. Here's a patch. I still need to test it. I'll send another patch
> >> series to cleanup the get_thread_info stuff (there's two of them?).
> >>
> >> arch/arm/mm/cache-v7.S | 6 ++++++
> >> 1 files changed, 6 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
> >> index 07c4bc8..654a5fc 100644
> >> --- a/arch/arm/mm/cache-v7.S
> >> +++ b/arch/arm/mm/cache-v7.S
> >> @@ -54,9 +54,15 @@ loop1:
> >> and r1, r1, #7 @ mask of the bits for current cache only
> >> cmp r1, #2 @ see what cache we have at this level
> >> blt skip @ skip if no cache, or just i-cache
> >> +#ifdef CONFIG_PREEMPT
> >> + save_and_disable_irqs r9 @ make cssr&csidr read atomic
> >> +#endif
> >> mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr
> >> isb @ isb to sych the new cssr&csidr
> >> mrc p15, 1, r1, c0, c0, 0 @ read the new csidr
> >> +#ifdef CONFIG_PREEMPT
> >> + restore_irqs r9
> >> +#endif
> > I'd suggest using restore_irqs_notrace instead. The IRQ-off period is
> > so small that there is no point tracing it.
>
> Thanks. I'll make sure to do that before uploading to the patch tracker.

Might be worth flagging this for the stable kernels as well
(CC: [email protected]).


Nicolas

2012-02-03 11:16:57

by Sergei Shtylyov

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

Hello.

On 03-02-2012 7:04, Nicolas Pitre wrote:

>>>>> If you simply disable/restore IRQs around the critical region then you
>>>>> don't have to worry about __v7_setup. Plus this will allow for
>>>>> v7_flush_dcache_all to still be callable from atomic context.
>>>> Ok. Here's a patch. I still need to test it. I'll send another patch
>>>> series to cleanup the get_thread_info stuff (there's two of them?).
>>>>
>>>> arch/arm/mm/cache-v7.S | 6 ++++++
>>>> 1 files changed, 6 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
>>>> index 07c4bc8..654a5fc 100644
>>>> --- a/arch/arm/mm/cache-v7.S
>>>> +++ b/arch/arm/mm/cache-v7.S
>>>> @@ -54,9 +54,15 @@ loop1:
>>>> and r1, r1, #7 @ mask of the bits for current cache only
>>>> cmp r1, #2 @ see what cache we have at this level
>>>> blt skip @ skip if no cache, or just i-cache
>>>> +#ifdef CONFIG_PREEMPT
>>>> + save_and_disable_irqs r9 @ make cssr&csidr read atomic
>>>> +#endif
>>>> mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr
>>>> isb @ isb to sych the new cssr&csidr
>>>> mrc p15, 1, r1, c0, c0, 0 @ read the new csidr
>>>> +#ifdef CONFIG_PREEMPT
>>>> + restore_irqs r9
>>>> +#endif

>>> I'd suggest using restore_irqs_notrace instead. The IRQ-off period is
>>> so small that there is no point tracing it.

>> Thanks. I'll make sure to do that before uploading to the patch tracker.

> Might be worth flagging this for the stable kernels as well
> (CC: [email protected]).

The new address is [email protected] as Greg KH wrote.

WBR, Sergei

2012-02-04 18:01:07

by Catalin Marinas

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Fri, Feb 03, 2012 at 02:03:49AM +0000, Stephen Boyd wrote:
> armv7's flush_cache_all() flushes caches via set/way. To
> determine the cache attributes (line size, number of sets,
> etc.) the assembly first writes the CSSELR register to select a
> cache level and then reads the CCSIDR register. The CSSELR register
> is banked per-cpu and is used to determine which cache level CCSIDR
> reads. If the task is migrated between when the CSSELR is written and
> the CCSIDR is read the CCSIDR value may be for an unexpected cache
> level (for example L1 instead of L2) and incorrect cache flushing
> could occur.
>
> Disable interrupts across the write and read so that the correct
> cache attributes are read and used for the cache flushing
> routine. We disable interrupts instead of disabling preemption
> because the critical section is only 3 instructions and we want
> to call v7_flush_dcache_all from __v7_setup which doesn't have a
> full kernel stack with a struct thread_info.
>
> This fixes a problem we see in scm_call() when flush_cache_all()
> is called from preemptible context and sometimes the L2 cache is
> not properly flushed out.
>
> Signed-off-by: Stephen Boyd <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Nicolas Pitre <[email protected]>

Acked-by: Catalin Marinas <[email protected]>

--
Catalin

2012-02-07 03:34:08

by Saravana Kannan

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On 02/02/2012 04:36 PM, Russell King - ARM Linux wrote:
> On Thu, Feb 02, 2012 at 03:36:49PM -0800, Stephen Boyd wrote:
>> On 02/02/12 13:38, Nicolas Pitre wrote:
>>> On Thu, 2 Feb 2012, Russell King - ARM Linux wrote
>>>> On Thu, Feb 02, 2012 at 11:24:46AM -0800, Stephen Boyd wrote:
>>>>> Should we move get_thread_info into assembler.h? It seems odd
>>>>> to include entry-header.S but I saw that vfp was doing the same.
>>>> Probably yes, and probably also have preempt_disable and preempt_enable
>>>> assembler macros. That's going to get rather icky if we have to
>>>> explicitly call the scheduler though (to solve (1)).
>>> What about a pair of helpers written in C instead?
>>>
>>> v7_flush_dcache_all() could be renamed, and a wrapper function called
>>> v7_flush_dcache_all() would call the preemption disable helper, call the
>>> former v7_flush_dcache_all code, then call the preemption enable helper.
>>>
>>> Then __v7_setup() could still call the core cache flush code without
>>> issues.
>>
>> I tried to put the preemption disable/enable right around the place
>> where it was needed. With this approach we would disable preemption
>> during the entire cache flush. I'm not sure if we want to make this
>> function worse for performance, do we? It certainly sounds easier than
>> writing all the preempt macros in assembly though.
>
> Err, why do you think it's a big task?
>
> preempt disable is a case of incrementing the thread preempt count, while
> preempt enable is a case of decrementing it, testing for zero, if zero,
> then checking whether TIF_NEED_RESCHED is set and calling a function.
>
> If that's too much, then the simple method in assembly to quickly disable
> preemption over a very few set of instructions is using mrs/msr and cpsid i.
> That'll be far cheaper than fiddling about with preempt counters or
> messing about with veneers in C code.

Russell,

I think you misunderstood Stephen's point about the performance. He
isn't referring to the performance difference between a C call to preempt
disable/enable vs. a few assembly level instructions.

I believe he is referring to the performance hit of having preemption
disabled during the entirety of the cache flush operation vs. having
preemption disabled only for the duration of writing to CSSELR and
reading back CCSIDR.

I would think a cache flush is a fairly long operation and to have
preemption disabled across it doesn't sound appealing to me.

Thoughts?

Thanks,
Saravana


2012-02-07 17:42:43

by Stephen Boyd

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On 02/06/12 19:34, Saravana Kannan wrote:
> On 02/02/2012 04:36 PM, Russell King - ARM Linux wrote:
>> On Thu, Feb 02, 2012 at 03:36:49PM -0800, Stephen Boyd wrote:
>>> On 02/02/12 13:38, Nicolas Pitre wrote:
>>>> On Thu, 2 Feb 2012, Russell King - ARM Linux wrote
>>>>> On Thu, Feb 02, 2012 at 11:24:46AM -0800, Stephen Boyd wrote:
>>>>>> Should we move get_thread_info into assembler.h? It seems odd
>>>>>> to include entry-header.S but I saw that vfp was doing the same.
>>>>> Probably yes, and probably also have preempt_disable and
>>>>> preempt_enable
>>>>> assembler macros. That's going to get rather icky if we have to
>>>>> explicitly call the scheduler though (to solve (1)).
>>>> What about a pair of helpers written in C instead?
>>>>
>>>> v7_flush_dcache_all() could be renamed, and a wrapper function called
>>>> v7_flush_dcache_all() would call the preemption disable helper,
>>>> call the
>>>> former v7_flush_dcache_all code, then call the preemption enable
>>>> helper.
>>>>
>>>> Then __v7_setup() could still call the core cache flush code without
>>>> issues.
>>>
>>> I tried to put the preemption disable/enable right around the place
>>> where it was needed. With this approach we would disable preemption
>>> during the entire cache flush. I'm not sure if we want to make this
>>> function worse for performance, do we? It certainly sounds easier than
>>> writing all the preempt macros in assembly though.
>>
>> Err, why do you think it's a big task?
>>
>> preempt disable is a case of incrementing the thread preempt count,
>> while
>> preempt enable is a case of decrementing it, testing for zero, if zero,
>> then checking whether TIF_NEED_RESCHED is set and calling a function.
>>
>> If that's too much, then the simple method in assembly to quickly
>> disable
>> preemption over a very few set of instructions is using mrs/msr and
>> cpsid i.
>> That'll be far cheaper than fiddling about with preempt counters or
>> messing about with veneers in C code.
>
> Russell,
>
> I think you misunderstood Stephen's point about the performance. He
> isn't referring to the performance difference between a C call to
> preempt disable/enable vs. a few assembly level instructions.
>
> I believe he is referring to the performance hit of having preemption
> disabled during the entirety of the cache flush operation vs. having
> preemption disabled only for the duration of writing to CSSELR and
> reading back CCSIDR.
>
> I would think a cache flush is a fairly long operation and to have
> preemption disabled across it doesn't sound appealing to me.
>
> Thoughts?
>

Sorry I messed up the headers for v2 of the patch. It didn't get sent to
the msm list.

Anyway, disabling interrupts for those few instructions sounds like the
best approach and so I sent that out in v2.


2012-02-13 17:55:36

by Rabin Vincent

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Fri, Feb 3, 2012 at 07:33, Stephen Boyd <[email protected]> wrote:
> armv7's flush_cache_all() flushes caches via set/way. To
> determine the cache attributes (line size, number of sets,
> etc.) the assembly first writes the CSSELR register to select a
> cache level and then reads the CCSIDR register. The CSSELR register
> is banked per-cpu and is used to determine which cache level CCSIDR
> reads. If the task is migrated between when the CSSELR is written and
> the CCSIDR is read the CCSIDR value may be for an unexpected cache
> level (for example L1 instead of L2) and incorrect cache flushing
> could occur.
>
> Disable interrupts across the write and read so that the correct
> cache attributes are read and used for the cache flushing
> routine. We disable interrupts instead of disabling preemption
> because the critical section is only 3 instructions and we want
> to call v7_flush_dcache_all from __v7_setup which doesn't have a
> full kernel stack with a struct thread_info.
>
> This fixes a problem we see in scm_call() when flush_cache_all()
> is called from preemptible context and sometimes the L2 cache is
> not properly flushed out.
>
> Signed-off-by: Stephen Boyd <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Nicolas Pitre <[email protected]>
> ---
> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
> index 07c4bc8..654a5fc 100644
> --- a/arch/arm/mm/cache-v7.S
> +++ b/arch/arm/mm/cache-v7.S
> @@ -54,9 +54,15 @@ loop1:
>        and     r1, r1, #7                      @ mask of the bits for current cache only
>        cmp     r1, #2                          @ see what cache we have at this level
>        blt     skip                            @ skip if no cache, or just i-cache
> +#ifdef CONFIG_PREEMPT
> +       save_and_disable_irqs r9                @ make cssr&csidr read atomic
> +#endif
>        mcr     p15, 2, r10, c0, c0, 0          @ select current cache level in cssr
>        isb                                     @ isb to sych the new cssr&csidr
>        mrc     p15, 1, r1, c0, c0, 0           @ read the new csidr
> +#ifdef CONFIG_PREEMPT
> +       restore_irqs r9
> +#endif
>        and     r2, r1, #7                      @ extract the length of the cache lines
>        add     r2, r2, #4                      @ add 4 (line length offset)
>        ldr     r4, =0x3ff

This patch breaks the kernel boot when lockdep is enabled.

v7_setup (called before the MMU is enabled) calls v7_flush_dcache_all,
and the save_and_disable_irqs added by this patch ends up calling
into lockdep C code (trace_hardirqs_off()) when we are in no position
to execute it (no stack, no MMU).

The following fixes it. Perhaps it can be folded in?

diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 62f8095..23371b1 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -137,6 +137,11 @@
disable_irq
.endm

+ .macro save_and_disable_irqs_notrace, oldcpsr
+ mrs \oldcpsr, cpsr
+ disable_irq_notrace
+ .endm
+
/*
* Restore interrupt state previously stored in a register. We don't
* guarantee that this will preserve the flags.
diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
index 7a24d396..a655d3d 100644
--- a/arch/arm/mm/cache-v7.S
+++ b/arch/arm/mm/cache-v7.S
@@ -55,7 +55,7 @@ loop1:
cmp r1, #2 @ see what cache we have at this level
blt skip @ skip if no cache, or just i-cache
#ifdef CONFIG_PREEMPT
- save_and_disable_irqs r9 @ make cssr&csidr read atomic
+ save_and_disable_irqs_notrace r9 @ make cssr&csidr read atomic
#endif
mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr
isb @ isb to sych the new cssr&csidr

2012-02-13 18:09:11

by Nicolas Pitre

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Mon, 13 Feb 2012, Rabin Vincent wrote:

> On Fri, Feb 3, 2012 at 07:33, Stephen Boyd <[email protected]> wrote:
> > armv7's flush_cache_all() flushes caches via set/way. To
> > determine the cache attributes (line size, number of sets,
> > etc.) the assembly first writes the CSSELR register to select a
> > cache level and then reads the CCSIDR register. The CSSELR register
> > is banked per-cpu and is used to determine which cache level CCSIDR
> > reads. If the task is migrated between when the CSSELR is written and
> > the CCSIDR is read the CCSIDR value may be for an unexpected cache
> > level (for example L1 instead of L2) and incorrect cache flushing
> > could occur.
> >
> > Disable interrupts across the write and read so that the correct
> > cache attributes are read and used for the cache flushing
> > routine. We disable interrupts instead of disabling preemption
> > because the critical section is only 3 instructions and we want
> > to call v7_flush_dcache_all from __v7_setup which doesn't have a
> > full kernel stack with a struct thread_info.
> >
> > This fixes a problem we see in scm_call() when flush_cache_all()
> > is called from preemptible context and sometimes the L2 cache is
> > not properly flushed out.
> >
> > Signed-off-by: Stephen Boyd <[email protected]>
> > Cc: Catalin Marinas <[email protected]>
> > Cc: Nicolas Pitre <[email protected]>
> > ---
> > diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
> > index 07c4bc8..654a5fc 100644
> > --- a/arch/arm/mm/cache-v7.S
> > +++ b/arch/arm/mm/cache-v7.S
> > @@ -54,9 +54,15 @@ loop1:
> >        and     r1, r1, #7                      @ mask of the bits for current cache only
> >        cmp     r1, #2                          @ see what cache we have at this level
> >        blt     skip                            @ skip if no cache, or just i-cache
> > +#ifdef CONFIG_PREEMPT
> > +       save_and_disable_irqs r9                @ make cssr&csidr read atomic
> > +#endif
> >        mcr     p15, 2, r10, c0, c0, 0          @ select current cache level in cssr
> >        isb                                     @ isb to sych the new cssr&csidr
> >        mrc     p15, 1, r1, c0, c0, 0           @ read the new csidr
> > +#ifdef CONFIG_PREEMPT
> > +       restore_irqs r9
> > +#endif
> >        and     r2, r1, #7                      @ extract the length of the cache lines
> >        add     r2, r2, #4                      @ add 4 (line length offset)
> >        ldr     r4, =0x3ff
>
> This patch breaks the kernel boot when lockdep is enabled.
>
> v7_setup (called before the MMU is enabled) calls v7_flush_dcache_all,
> and the save_and_disable_irqs added by this patch ends up calling
> into lockdep C code (trace_hardirqs_off()) when we are in no position
> to execute it (no stack, no MMU).
>
> The following fixes it. Perhaps it can be folded in?

Absolutely.

No tracing whatsoever should be involved here.

>
> diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
> index 62f8095..23371b1 100644
> --- a/arch/arm/include/asm/assembler.h
> +++ b/arch/arm/include/asm/assembler.h
> @@ -137,6 +137,11 @@
> disable_irq
> .endm
>
> + .macro save_and_disable_irqs_notrace, oldcpsr
> + mrs \oldcpsr, cpsr
> + disable_irq_notrace
> + .endm
> +
> /*
> * Restore interrupt state previously stored in a register. We don't
> * guarantee that this will preserve the flags.
> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
> index 7a24d396..a655d3d 100644
> --- a/arch/arm/mm/cache-v7.S
> +++ b/arch/arm/mm/cache-v7.S
> @@ -55,7 +55,7 @@ loop1:
> cmp r1, #2 @ see what cache we have at this level
> blt skip @ skip if no cache, or just i-cache
> #ifdef CONFIG_PREEMPT
> - save_and_disable_irqs r9 @ make cssr&csidr read atomic
> + save_and_disable_irqs_notrace r9 @ make cssr&csidr read atomic
> #endif
> mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr
> isb @ isb to sych the new cssr&csidr
>

2012-02-13 18:13:53

by Stephen Boyd

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On 02/13/12 10:09, Nicolas Pitre wrote:
> On Mon, 13 Feb 2012, Rabin Vincent wrote:
>
>> On Fri, Feb 3, 2012 at 07:33, Stephen Boyd <[email protected]> wrote:
>>> armv7's flush_cache_all() flushes caches via set/way. To
>>> determine the cache attributes (line size, number of sets,
>>> etc.) the assembly first writes the CSSELR register to select a
>>> cache level and then reads the CCSIDR register. The CSSELR register
>>> is banked per-cpu and is used to determine which cache level CCSIDR
>>> reads. If the task is migrated between when the CSSELR is written and
>>> the CCSIDR is read the CCSIDR value may be for an unexpected cache
>>> level (for example L1 instead of L2) and incorrect cache flushing
>>> could occur.
>>>
>>> Disable interrupts across the write and read so that the correct
>>> cache attributes are read and used for the cache flushing
>>> routine. We disable interrupts instead of disabling preemption
>>> because the critical section is only 3 instructions and we want
>>> to call v7_flush_dcache_all from __v7_setup which doesn't have a
>>> full kernel stack with a struct thread_info.
>>>
>>> This fixes a problem we see in scm_call() when flush_cache_all()
>>> is called from preemptible context and sometimes the L2 cache is
>>> not properly flushed out.
>>>
>>> Signed-off-by: Stephen Boyd <[email protected]>
>>> Cc: Catalin Marinas <[email protected]>
>>> Cc: Nicolas Pitre <[email protected]>
>>> ---
>>> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
>>> index 07c4bc8..654a5fc 100644
>>> --- a/arch/arm/mm/cache-v7.S
>>> +++ b/arch/arm/mm/cache-v7.S
>>> @@ -54,9 +54,15 @@ loop1:
>>> and r1, r1, #7 @ mask of the bits for current cache only
>>> cmp r1, #2 @ see what cache we have at this level
>>> blt skip @ skip if no cache, or just i-cache
>>> +#ifdef CONFIG_PREEMPT
>>> + save_and_disable_irqs r9 @ make cssr&csidr read atomic
>>> +#endif
>>> mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr
>>> isb @ isb to sych the new cssr&csidr
>>> mrc p15, 1, r1, c0, c0, 0 @ read the new csidr
>>> +#ifdef CONFIG_PREEMPT
>>> + restore_irqs r9
>>> +#endif
>>> and r2, r1, #7 @ extract the length of the cache lines
>>> add r2, r2, #4 @ add 4 (line length offset)
>>> ldr r4, =0x3ff
>> This patch breaks the kernel boot when lockdep is enabled.
>>
>> v7_setup (called before the MMU is enabled) calls v7_flush_dcache_all,
>> and the save_and_disable_irqs added by this patch ends up calling
>> into lockdep C code (trace_hardirqs_off()) when we are in no position
>> to execute it (no stack, no MMU).
>>
>> The following fixes it. Perhaps it can be folded in?
> Absolutely.
>
> No tracing whatsoever should be involved here.
>

Thanks. Russell has already merged the original patch to the fixes
branch. Hopefully he can fold this one in.


2012-02-13 18:15:25

by Russell King - ARM Linux

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Mon, Feb 13, 2012 at 10:13:27AM -0800, Stephen Boyd wrote:
> Thanks. Russell has already merged the original patch to the fixes
> branch. Hopefully he can fold this one in.

Nope, I've asked Linus to pull it.

So do we conclude that the original patch wasn't properly tested? :P

2012-02-13 22:23:59

by Stephen Boyd

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On 02/13/12 10:15, Russell King - ARM Linux wrote:
> On Mon, Feb 13, 2012 at 10:13:27AM -0800, Stephen Boyd wrote:
>> Thanks. Russell has already merged the original patch to the fixes
>> branch. Hopefully he can fold this one in.
> Nope, I've asked Linus to pull it.
>
> So do we conclude that the original patch wasn't properly tested? :P

Sigh. Lockdep strikes again! I promise I tested it with lockdep disabled.

It looks like Linus hasn't pulled yet but maybe he just hasn't
published it.


2012-02-13 23:29:27

by Russell King - ARM Linux

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Mon, Feb 13, 2012 at 02:23:29PM -0800, Stephen Boyd wrote:
> On 02/13/12 10:15, Russell King - ARM Linux wrote:
> > On Mon, Feb 13, 2012 at 10:13:27AM -0800, Stephen Boyd wrote:
> >> Thanks. Russell has already merged the original patch to the fixes
> >> branch. Hopefully he can fold this one in.
> > Nope, I've asked Linus to pull it.
> >
> > So do we conclude that the original patch wasn't properly tested? :P
>
> Sigh. Lockdep strikes again! I promise I tested it with lockdep disabled.
>
> It looks like Linus hasn't pulled yet but maybe he just hasn't
> published it.

It's not nice to change something after you've sent a pull request -
there's no way of knowing when Linus actually pulls it before he's
published it, and if he gets something different then it can raise
questions.

So, it's gone in as-is, and, as I'm not intending to ask for another
pull request so soon after my previous one, this is something that
we will have to live with probably for the remainder of the week.

Note that you should, by default, build your development kernels with
lockdep enabled; it's there as a debugging tool to help you find
logical locking errors faster than people can provoke deadlocks.

2012-02-14 14:23:33

by Rabin Vincent

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Mon, Feb 13, 2012 at 11:29:01PM +0000, Russell King - ARM Linux wrote:
> On Mon, Feb 13, 2012 at 02:23:29PM -0800, Stephen Boyd wrote:
> > On 02/13/12 10:15, Russell King - ARM Linux wrote:
> > > On Mon, Feb 13, 2012 at 10:13:27AM -0800, Stephen Boyd wrote:
> > >> Thanks. Russell has already merged the original patch to the fixes
> > >> branch. Hopefully he can fold this one in.
> > > Nope, I've asked Linus to pull it.
> > >
> > > So do we conclude that the original patch wasn't properly tested? :P
> >
> > Sigh. Lockdep strikes again! I promise I tested it with lockdep disabled.
> >
> > It looks like Linus hasn't pulled yet but maybe he just hasn't
> > published it.
>
> It's not nice to change something after you've sent a pull request -
> there's no way of knowing when Linus actually pulls it before he's
> published it, and if he gets something different then it can raise
> questions.
>
> So, it's gone in as-is, and, as I'm not intending to ask for another
> pull request so soon after my previous one, this is something that
> we will have to live with probably for the remainder of the week.

OK, since it can't be folded in, here is a proper patch:

8<---------
From 26f02624a20a61ed1997a4e8648e4c766a54d91d Mon Sep 17 00:00:00 2001
From: Rabin Vincent <[email protected]>
Date: Tue, 14 Feb 2012 19:22:07 +0530
Subject: [PATCH] ARM: fix v7 boot with lockdep enabled

Bootup with lockdep enabled has been broken on v7 since b46c0f74657d
("ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR").

This is because v7_setup (which is called very early during boot) calls
v7_flush_dcache_all, and the save_and_disable_irqs added by that patch
ends up attempting to call into lockdep C code (trace_hardirqs_off())
when we are in no position to execute it (no stack, MMU off).

Fix this by using a notrace variant of save_and_disable_irqs. The code
already uses the notrace variant of restore_irqs.

Cc: Stephen Boyd <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Nicolas Pitre <[email protected]>
Cc: [email protected]
Signed-off-by: Rabin Vincent <[email protected]>
---
arch/arm/include/asm/assembler.h | 5 +++++
arch/arm/mm/cache-v7.S | 2 +-
2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 62f8095..23371b1 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -137,6 +137,11 @@
disable_irq
.endm

+ .macro save_and_disable_irqs_notrace, oldcpsr
+ mrs \oldcpsr, cpsr
+ disable_irq_notrace
+ .endm
+
/*
* Restore interrupt state previously stored in a register. We don't
* guarantee that this will preserve the flags.
diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
index 7a24d396..a655d3d 100644
--- a/arch/arm/mm/cache-v7.S
+++ b/arch/arm/mm/cache-v7.S
@@ -55,7 +55,7 @@ loop1:
cmp r1, #2 @ see what cache we have at this level
blt skip @ skip if no cache, or just i-cache
#ifdef CONFIG_PREEMPT
- save_and_disable_irqs r9 @ make cssr&csidr read atomic
+ save_and_disable_irqs_notrace r9 @ make cssr&csidr read atomic
#endif
mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr
isb @ isb to sych the new cssr&csidr
--
1.7.9

2012-02-14 17:30:04

by Nicolas Pitre

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On Tue, 14 Feb 2012, Rabin Vincent wrote:

> On Mon, Feb 13, 2012 at 11:29:01PM +0000, Russell King - ARM Linux wrote:
> > On Mon, Feb 13, 2012 at 02:23:29PM -0800, Stephen Boyd wrote:
> > > On 02/13/12 10:15, Russell King - ARM Linux wrote:
> > > > On Mon, Feb 13, 2012 at 10:13:27AM -0800, Stephen Boyd wrote:
> > > >> Thanks. Russell has already merged the original patch to the fixes
> > > >> branch. Hopefully he can fold this one in.
> > > > Nope, I've asked Linus to pull it.
> > > >
> > > > So do we conclude that the original patch wasn't properly tested? :P
> > >
> > > Sigh. Lockdep strikes again! I promise I tested it with lockdep disabled.
> > >
> > > It looks like Linus hasn't pulled yet but maybe he just hasn't
> > > published it.
> >
> > It's not nice to change something after you've sent a pull request -
> > there's no way of knowing when Linus actually pulls it before he's
> > published it, and if he gets something different then it can raise
> > questions.
> >
> > So, it's gone in as-is, and, as I'm not intending to ask for another
> > pull request so soon after my previous one, this is something that
> > we will have to live with probably for the remainder of the week.
>
> OK, since it can't be folded in, here is a proper patch:
>
> 8<---------
> From 26f02624a20a61ed1997a4e8648e4c766a54d91d Mon Sep 17 00:00:00 2001
> From: Rabin Vincent <[email protected]>
> Date: Tue, 14 Feb 2012 19:22:07 +0530
> Subject: [PATCH] ARM: fix v7 boot with lockdep enabled
>
> Bootup with lockdep enabled has been broken on v7 since b46c0f74657d
> ("ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR").
>
> This is because v7_setup (which is called very early during boot) calls
> v7_flush_dcache_all, and the save_and_disable_irqs added by that patch
> ends up attempting to call into lockdep C code (trace_hardirqs_off())
> when we are in no position to execute it (no stack, MMU off).
>
> Fix this by using a notrace variant of save_and_disable_irqs. The code
> already uses the notrace variant of restore_irqs.
>
> Cc: Stephen Boyd <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Nicolas Pitre <[email protected]>

Reviewed-by: Nicolas Pitre <[email protected]>

> Cc: [email protected]
> Signed-off-by: Rabin Vincent <[email protected]>
> ---
> arch/arm/include/asm/assembler.h | 5 +++++
> arch/arm/mm/cache-v7.S | 2 +-
> 2 files changed, 6 insertions(+), 1 deletions(-)
>
> diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
> index 62f8095..23371b1 100644
> --- a/arch/arm/include/asm/assembler.h
> +++ b/arch/arm/include/asm/assembler.h
> @@ -137,6 +137,11 @@
> disable_irq
> .endm
>
> + .macro save_and_disable_irqs_notrace, oldcpsr
> + mrs \oldcpsr, cpsr
> + disable_irq_notrace
> + .endm
> +
> /*
> * Restore interrupt state previously stored in a register. We don't
> * guarantee that this will preserve the flags.
> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
> index 7a24d396..a655d3d 100644
> --- a/arch/arm/mm/cache-v7.S
> +++ b/arch/arm/mm/cache-v7.S
> @@ -55,7 +55,7 @@ loop1:
> cmp r1, #2 @ see what cache we have at this level
> blt skip @ skip if no cache, or just i-cache
> #ifdef CONFIG_PREEMPT
> - save_and_disable_irqs r9 @ make cssr&csidr read atomic
> + save_and_disable_irqs_notrace r9 @ make cssr&csidr read atomic
> #endif
> mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr
> isb @ isb to sych the new cssr&csidr
> --
> 1.7.9
>

2012-02-14 18:07:32

by Stephen Boyd

Subject: Re: [PATCH] ARM: cache-v7: Disable preemption when reading CCSIDR

On 02/14/12 06:15, Rabin Vincent wrote:
> On Mon, Feb 13, 2012 at 11:29:01PM +0000, Russell King - ARM Linux wrote:
>> On Mon, Feb 13, 2012 at 02:23:29PM -0800, Stephen Boyd wrote:
>>> On 02/13/12 10:15, Russell King - ARM Linux wrote:
>>>> On Mon, Feb 13, 2012 at 10:13:27AM -0800, Stephen Boyd wrote:
>>>>> Thanks. Russell has already merged the original patch to the fixes
>>>>> branch. Hopefully he can fold this one in.
>>>> Nope, I've asked Linus to pull it.
>>>>
>>>> So do we conclude that the original patch wasn't properly tested? :P
>>> Sigh. Lockdep strikes again! I promise I tested it with lockdep disabled.
>>>
>>> It looks like Linus hasn't pulled yet but maybe he just hasn't
>>> published it.
>> It's not nice to change something after you've sent a pull request -
>> there's no way of knowing when Linus actually pulls it before he's
>> published it, and if he gets something different then it can raise
>> questions.
>>
>> So, it's gone in as-is, and, as I'm not intending to ask for another
>> pull request so soon after my previous one, this is something that
>> we will have to live with probably for the remainder of the week.
> OK, since it can't be folded in, here is a proper patch:
>
> 8<---------
> From 26f02624a20a61ed1997a4e8648e4c766a54d91d Mon Sep 17 00:00:00 2001
> From: Rabin Vincent <[email protected]>
> Date: Tue, 14 Feb 2012 19:22:07 +0530
> Subject: [PATCH] ARM: fix v7 boot with lockdep enabled
>
> Bootup with lockdep enabled has been broken on v7 since b46c0f74657d
> ("ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR").
>
> This is because v7_setup (which is called very early during boot) calls
> v7_flush_dcache_all, and the save_and_disable_irqs added by that patch
> ends up attempting to call into lockdep C code (trace_hardirqs_off())
> when we are in no position to execute it (no stack, MMU off).
>
> Fix this by using a notrace variant of save_and_disable_irqs. The code
> already uses the notrace variant of restore_irqs.
>
> Cc: Stephen Boyd <[email protected]>

Acked-by: Stephen Boyd <[email protected]>

> Cc: Catalin Marinas <[email protected]>
> Cc: Nicolas Pitre <[email protected]>
> Cc: [email protected]
> Signed-off-by: Rabin Vincent <[email protected]>
> ---
> arch/arm/include/asm/assembler.h | 5 +++++
> arch/arm/mm/cache-v7.S | 2 +-
> 2 files changed, 6 insertions(+), 1 deletions(-)
>
> diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
> index 62f8095..23371b1 100644
> --- a/arch/arm/include/asm/assembler.h
> +++ b/arch/arm/include/asm/assembler.h
> @@ -137,6 +137,11 @@
> disable_irq
> .endm
>
> + .macro save_and_disable_irqs_notrace, oldcpsr
> + mrs \oldcpsr, cpsr
> + disable_irq_notrace
> + .endm
> +
> /*
> * Restore interrupt state previously stored in a register. We don't
> * guarantee that this will preserve the flags.
> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
> index 7a24d396..a655d3d 100644
> --- a/arch/arm/mm/cache-v7.S
> +++ b/arch/arm/mm/cache-v7.S
> @@ -55,7 +55,7 @@ loop1:
> cmp r1, #2 @ see what cache we have at this level
> blt skip @ skip if no cache, or just i-cache
> #ifdef CONFIG_PREEMPT
> - save_and_disable_irqs r9 @ make cssr&csidr read atomic
> + save_and_disable_irqs_notrace r9 @ make cssr&csidr read atomic
> #endif
> mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr
> isb @ isb to sych the new cssr&csidr

