Hello, Linus.
It is very late but this is an important percpu-refcount fix from
Sebastian Ott. The problem is that percpu_ref_*() used __this_cpu_*()
instead of this_cpu_*(). The difference between the two is that the
latter is atomic on the local cpu while the former is not.
this_cpu_inc() is guaranteed to increment the percpu counter on the
cpu that the operation is executed on without any synchronization;
however, __this_cpu_inc() doesn't and if the local cpu invokes the
function from different contexts (e.g. process and irq) of the same
CPU, it's not guaranteed to actually increment as it may be
implemented as rmw.
This bug existed from the get-go but it hasn't been noticed earlier
probably because on x86 __this_cpu_inc() is equivalent to
this_cpu_inc() as both get translated into single instruction;
however, s390 uses the generic rmw implementation and gets affected by
the bug. Kudos to Sebastian and Heiko for diagnosing it.
The change is very low risk and fixes a critical issue on the affected
architectures, so I think it's a good candidate for inclusion although
it's very late in the devel cycle. On the other hand, this has been
broken since v3.11, so backporting it through -stable post -rc1 won't
be the end of the world.
I'll ping Christoph whether __this_cpu_*() ops can be better annotated
so that it can trigger lockdep warning when used from multiple
contexts.
Thanks.
The following changes since commit 5a838c3b60e3a36ade764cf7751b8f17d7c9c2da:
percpu: make pcpu_alloc_chunk() use pcpu_mem_free() instead of kfree() (2014-04-14 16:18:06 -0400)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git for-3.15-fixes
for you to fetch changes up to 0c36b390a546055b6815d4b93a2c9fed4d980ffb:
percpu-refcount: fix usage of this_cpu_ops (2014-06-04 12:12:29 -0400)
----------------------------------------------------------------
Sebastian Ott (1):
percpu-refcount: fix usage of this_cpu_ops
include/linux/percpu-refcount.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 95961f0..0afb48f 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -110,7 +110,7 @@ static inline void percpu_ref_get(struct percpu_ref *ref)
pcpu_count = ACCESS_ONCE(ref->pcpu_count);
if (likely(REF_STATUS(pcpu_count) == PCPU_REF_PTR))
- __this_cpu_inc(*pcpu_count);
+ this_cpu_inc(*pcpu_count);
else
atomic_inc(&ref->count);
@@ -139,7 +139,7 @@ static inline bool percpu_ref_tryget(struct percpu_ref *ref)
pcpu_count = ACCESS_ONCE(ref->pcpu_count);
if (likely(REF_STATUS(pcpu_count) == PCPU_REF_PTR)) {
- __this_cpu_inc(*pcpu_count);
+ this_cpu_inc(*pcpu_count);
ret = true;
}
@@ -164,7 +164,7 @@ static inline void percpu_ref_put(struct percpu_ref *ref)
pcpu_count = ACCESS_ONCE(ref->pcpu_count);
if (likely(REF_STATUS(pcpu_count) == PCPU_REF_PTR))
- __this_cpu_dec(*pcpu_count);
+ this_cpu_dec(*pcpu_count);
else if (unlikely(atomic_dec_and_test(&ref->count)))
ref->release(ref);
Hello, again.
My script applied the patch with the wrong author. It should have
been Heiko instead of Sebastian. I fixed up percpu/for-3.15-fixes
branch accordingly. The updated pull-request output follows. There
is no content change.
Thanks.
The following changes since commit 5a838c3b60e3a36ade764cf7751b8f17d7c9c2da:
percpu: make pcpu_alloc_chunk() use pcpu_mem_free() instead of kfree() (2014-04-14 16:18:06 -0400)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git for-3.15-fixes
for you to fetch changes up to 00930b6b56d260b13924b9600aa37652f538edbf:
percpu-refcount: fix usage of this_cpu_ops (2014-06-04 12:57:24 -0400)
----------------------------------------------------------------
Heiko Carstens (1):
percpu-refcount: fix usage of this_cpu_ops
include/linux/percpu-refcount.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 95961f0..0afb48f 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -110,7 +110,7 @@ static inline void percpu_ref_get(struct percpu_ref *ref)
pcpu_count = ACCESS_ONCE(ref->pcpu_count);
if (likely(REF_STATUS(pcpu_count) == PCPU_REF_PTR))
- __this_cpu_inc(*pcpu_count);
+ this_cpu_inc(*pcpu_count);
else
atomic_inc(&ref->count);
@@ -139,7 +139,7 @@ static inline bool percpu_ref_tryget(struct percpu_ref *ref)
pcpu_count = ACCESS_ONCE(ref->pcpu_count);
if (likely(REF_STATUS(pcpu_count) == PCPU_REF_PTR)) {
- __this_cpu_inc(*pcpu_count);
+ this_cpu_inc(*pcpu_count);
ret = true;
}
@@ -164,7 +164,7 @@ static inline void percpu_ref_put(struct percpu_ref *ref)
pcpu_count = ACCESS_ONCE(ref->pcpu_count);
if (likely(REF_STATUS(pcpu_count) == PCPU_REF_PTR))
- __this_cpu_dec(*pcpu_count);
+ this_cpu_dec(*pcpu_count);
else if (unlikely(atomic_dec_and_test(&ref->count)))
ref->release(ref);
On Wed, Jun 4, 2014 at 9:59 AM, Tejun Heo <[email protected]> wrote:
> Hello, again.
>
> My script applied the patch with the wrong author. It should have
> been Heiko instead of Sebastian. I fixed up percpu/for-3.15-fixes
> branch accordingly.
I'm just too fast for you. The previous one is already pulled and
pushed out, and Ott got authorship.
Linus
On Wed, Jun 04, 2014 at 10:06:07AM -0700, Linus Torvalds wrote:
> On Wed, Jun 4, 2014 at 9:59 AM, Tejun Heo <[email protected]> wrote:
> > Hello, again.
> >
> > My script applied the patch with the wrong author. It should have
> > been Heiko instead of Sebastian. I fixed up percpu/for-3.15-fixes
> > branch accordingly.
>
> I'm just too fast for you. The previous one is already pulled and
> pushed out, and Ott got authorship.
Urgh... Sorry, Heiko. I owe you a beer.
--
tejun
On Wed, 4 Jun 2014, Tejun Heo wrote:
> I'll ping Christoph whether __this_cpu_*() ops can be better annotated
> so that it can trigger lockdep warning when used from multiple
> contexts.
We just added checks that verify the proper preemption mode for
__this_cpu operations. That did not catch it?
Hello,
On Wed, Jun 04, 2014 at 12:16:56PM -0500, Christoph Lameter wrote:
> On Wed, 4 Jun 2014, Tejun Heo wrote:
>
> > I'll ping Christoph whether __this_cpu_*() ops can be better annotated
> > so that it can trigger lockdep warning when used from multiple
> > contexts.
>
> We just added checks that verify the proper preemption mode for
> __this_cpu operations. That did not catch it?
That one just ensures that __this_cpu_*() are called with preemption
disabled. What's necessary is tracking whether each percpu variable
is operated on from different contexts so that things like
"inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage" lockdep warnings
can be triggered in cases like this.
Given that percpu operations can be performed on any allocated percpu
area, it's a bit difficult to achieve tho. The only way I can think
of is percpu allocator creating lockdep key for all the allocated
offsets which isn't too enticing. :(
Thanks.
--
tejun
On Wed, 4 Jun 2014, Tejun Heo wrote:
>
> Given that percpu operations can be performed on any allocated percpu
> area, it's a bit difficult to achieve tho. The only way I can think
> of is percpu allocator creating lockdep key for all the allocated
> offsets which isn't too enticing. :(
Hmmm.. Or an annotation for a percpu variable saying that its being used
from an interrupt context.