2019-01-28 12:28:17

by Elena Reshetova

Subject: [PATCH 0/3] perf refcount_t conversions

Another set of old patches, rebased. This time the commit messages
are also updated, since the docs have been merged in the meantime and
refcount_dec_and_test() will very soon gain ACQUIRE ordering on
success, which the commit messages now reflect.


Elena Reshetova (3):
perf: convert perf_event_context.refcount to refcount_t
perf/ring_buffer: convert ring_buffer.refcount to refcount_t
perf/ring_buffer: convert ring_buffer.aux_refcount to refcount_t

include/linux/perf_event.h | 3 ++-
kernel/events/core.c | 18 +++++++++---------
kernel/events/internal.h | 5 +++--
kernel/events/ring_buffer.c | 8 ++++----
4 files changed, 18 insertions(+), 16 deletions(-)

--
2.7.4



2019-01-28 12:29:10

by Elena Reshetova

Subject: [PATCH 2/3] perf/ring_buffer: convert ring_buffer.refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API, which prevent accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable ring_buffer.refcount is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts. Please check Documentation/core-api/refcount-vs-atomic.rst
for more information.

Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For ring_buffer.refcount it might make a difference
in the following places:
- ring_buffer_get(): increment in refcount_inc_not_zero() only
guarantees control dependency on success vs. fully ordered
atomic counterpart
- ring_buffer_put(): decrement in refcount_dec_and_test() only
provides RELEASE ordering and ACQUIRE ordering + control dependency
on success vs. fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
kernel/events/core.c | 4 ++--
kernel/events/internal.h | 3 ++-
kernel/events/ring_buffer.c | 2 +-
3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index a1e87d2..963cee0 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5388,7 +5388,7 @@ struct ring_buffer *ring_buffer_get(struct perf_event *event)
rcu_read_lock();
rb = rcu_dereference(event->rb);
if (rb) {
- if (!atomic_inc_not_zero(&rb->refcount))
+ if (!refcount_inc_not_zero(&rb->refcount))
rb = NULL;
}
rcu_read_unlock();
@@ -5398,7 +5398,7 @@ struct ring_buffer *ring_buffer_get(struct perf_event *event)

void ring_buffer_put(struct ring_buffer *rb)
{
- if (!atomic_dec_and_test(&rb->refcount))
+ if (!refcount_dec_and_test(&rb->refcount))
return;

WARN_ON_ONCE(!list_empty(&rb->event_list));
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 6dc725a..4718de2 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -4,13 +4,14 @@

#include <linux/hardirq.h>
#include <linux/uaccess.h>
+#include <linux/refcount.h>

/* Buffer handling */

#define RING_BUFFER_WRITABLE 0x01

struct ring_buffer {
- atomic_t refcount;
+ refcount_t refcount;
struct rcu_head rcu_head;
#ifdef CONFIG_PERF_USE_VMALLOC
struct work_struct work;
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 4a99370..e841d48 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -285,7 +285,7 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)
else
rb->overwrite = 1;

- atomic_set(&rb->refcount, 1);
+ refcount_set(&rb->refcount, 1);

INIT_LIST_HEAD(&rb->event_list);
spin_lock_init(&rb->event_lock);
--
2.7.4
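
The ordering note in the commit message above can be modeled in userspace with C11 atomics. The sketch below is only an illustration, not the kernel implementation: `model_buf`, `model_dec_and_test()` and `model_buf_put()` are hypothetical names, and C11 release/acquire orderings are assumed to approximate the RELEASE and ACQUIRE-on-success semantics described for refcount_dec_and_test().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object guarded by a reference count. */
struct model_buf {
	atomic_uint refs;
	int payload;
	bool freed;
};

/*
 * Userspace model of the ordering described for refcount_dec_and_test():
 * the decrement is a RELEASE, and ACQUIRE ordering is added only when
 * the count actually drops to zero.
 */
static bool model_dec_and_test(atomic_uint *refs)
{
	/* RELEASE: earlier accesses to the object are ordered before the drop. */
	if (atomic_fetch_sub_explicit(refs, 1, memory_order_release) != 1)
		return false;

	/* ACQUIRE on success: the release path observes other holders' writes. */
	atomic_thread_fence(memory_order_acquire);
	return true;
}

/* Mirrors the early-return shape of ring_buffer_put() in the diff. */
static void model_buf_put(struct model_buf *b)
{
	if (!model_dec_and_test(&b->refs))
		return;

	b->freed = true; /* stands in for the real free path */
}
```

Code that relied on the fully ordered atomic_dec_and_test() for something other than the final release would not get that ordering from this model on the non-zero path, which is the case the note asks maintainers to check for.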


2019-01-28 12:29:59

by Elena Reshetova

Subject: [PATCH 1/3] perf: convert perf_event_context.refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API, which prevent accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable perf_event_context.refcount is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts. Please check Documentation/core-api/refcount-vs-atomic.rst
for more information.

Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For perf_event_context.refcount it might make a difference
in the following places:
- get_ctx(), perf_event_ctx_lock_nested(), perf_lock_task_context()
and __perf_event_ctx_lock_double(): increment in
refcount_inc_not_zero() only guarantees control dependency
on success vs. fully ordered atomic counterpart
- put_ctx(): decrement in refcount_dec_and_test() provides
RELEASE ordering and ACQUIRE ordering + control dependency on success
vs. fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
include/linux/perf_event.h | 3 ++-
kernel/events/core.c | 12 ++++++------
2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 1d5c551..6a94097 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -55,6 +55,7 @@ struct perf_guest_info_callbacks {
#include <linux/perf_regs.h>
#include <linux/workqueue.h>
#include <linux/cgroup.h>
+#include <linux/refcount.h>
#include <asm/local.h>

struct perf_callchain_entry {
@@ -737,7 +738,7 @@ struct perf_event_context {
int nr_stat;
int nr_freq;
int rotate_disable;
- atomic_t refcount;
+ refcount_t refcount;
struct task_struct *task;

/*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3cd13a3..a1e87d2 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1171,7 +1171,7 @@ static void perf_event_ctx_deactivate(struct perf_event_context *ctx)

static void get_ctx(struct perf_event_context *ctx)
{
- WARN_ON(!atomic_inc_not_zero(&ctx->refcount));
+ WARN_ON(!refcount_inc_not_zero(&ctx->refcount));
}

static void free_ctx(struct rcu_head *head)
@@ -1185,7 +1185,7 @@ static void free_ctx(struct rcu_head *head)

static void put_ctx(struct perf_event_context *ctx)
{
- if (atomic_dec_and_test(&ctx->refcount)) {
+ if (refcount_dec_and_test(&ctx->refcount)) {
if (ctx->parent_ctx)
put_ctx(ctx->parent_ctx);
if (ctx->task && ctx->task != TASK_TOMBSTONE)
@@ -1267,7 +1267,7 @@ perf_event_ctx_lock_nested(struct perf_event *event, int nesting)
again:
rcu_read_lock();
ctx = READ_ONCE(event->ctx);
- if (!atomic_inc_not_zero(&ctx->refcount)) {
+ if (!refcount_inc_not_zero(&ctx->refcount)) {
rcu_read_unlock();
goto again;
}
@@ -1400,7 +1400,7 @@ perf_lock_task_context(struct task_struct *task, int ctxn, unsigned long *flags)
}

if (ctx->task == TASK_TOMBSTONE ||
- !atomic_inc_not_zero(&ctx->refcount)) {
+ !refcount_inc_not_zero(&ctx->refcount)) {
raw_spin_unlock(&ctx->lock);
ctx = NULL;
} else {
@@ -4056,7 +4056,7 @@ static void __perf_event_init_context(struct perf_event_context *ctx)
INIT_LIST_HEAD(&ctx->event_list);
INIT_LIST_HEAD(&ctx->pinned_active);
INIT_LIST_HEAD(&ctx->flexible_active);
- atomic_set(&ctx->refcount, 1);
+ refcount_set(&ctx->refcount, 1);
}

static struct perf_event_context *
@@ -10391,7 +10391,7 @@ __perf_event_ctx_lock_double(struct perf_event *group_leader,
again:
rcu_read_lock();
gctx = READ_ONCE(group_leader->ctx);
- if (!atomic_inc_not_zero(&gctx->refcount)) {
+ if (!refcount_inc_not_zero(&gctx->refcount)) {
rcu_read_unlock();
goto again;
}
--
2.7.4
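
The overflow and underflow protection that motivates this conversion can be illustrated with a saturating counter. This is a userspace sketch under stated assumptions, not the code in lib/refcount.c: `sat_inc_not_zero()` and `sat_dec_and_test()` are hypothetical names, and pinning the counter at UINT_MAX is assumed here as the saturation policy.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference unless the count is zero; never wrap past UINT_MAX. */
static bool sat_inc_not_zero(atomic_uint *refs)
{
	unsigned int old = atomic_load_explicit(refs, memory_order_relaxed);

	do {
		if (old == 0)
			return false;	/* object already released */
		if (old == UINT_MAX)
			return true;	/* saturated: stay pinned, do not wrap */
	} while (!atomic_compare_exchange_weak_explicit(refs, &old, old + 1,
			memory_order_relaxed, memory_order_relaxed));

	return true;
}

/* Drop a reference; returns true only for the final put. */
static bool sat_dec_and_test(atomic_uint *refs)
{
	unsigned int old = atomic_load_explicit(refs, memory_order_relaxed);
	unsigned int next;

	do {
		if (old == UINT_MAX)
			return false;	/* saturated: leak instead of freeing */
		if (old == 0)
			return false;	/* underflow attempt: refuse */
		next = old - 1;
	} while (!atomic_compare_exchange_weak_explicit(refs, &old, next,
			memory_order_release, memory_order_relaxed));

	if (next != 0)
		return false;

	atomic_thread_fence(memory_order_acquire);
	return true;
}
```

A plain atomic_t increment would wrap from UINT_MAX back to 0 and later allow a premature free; in this model the object is leaked instead, trading a memory leak for avoiding a use-after-free.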


2019-01-28 12:30:05

by Elena Reshetova

Subject: [PATCH 3/3] perf/ring_buffer: convert ring_buffer.aux_refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API, which prevent accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable ring_buffer.aux_refcount is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts. Please check Documentation/core-api/refcount-vs-atomic.rst
for more information.

Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For ring_buffer.aux_refcount it might make a difference
in the following places:
- perf_aux_output_begin(): increment in refcount_inc_not_zero() only
guarantees control dependency on success vs. fully ordered
atomic counterpart
- rb_free_aux(): decrement in refcount_dec_and_test() only
provides RELEASE ordering and ACQUIRE ordering + control dependency
on success vs. fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
kernel/events/core.c | 2 +-
kernel/events/internal.h | 2 +-
kernel/events/ring_buffer.c | 6 +++---
3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 963cee0..31ab5d7 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5463,7 +5463,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)

/* this has to be the last one */
rb_free_aux(rb);
- WARN_ON_ONCE(atomic_read(&rb->aux_refcount));
+ WARN_ON_ONCE(refcount_read(&rb->aux_refcount));

mutex_unlock(&event->mmap_mutex);
}
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 4718de2..79c4707 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -49,7 +49,7 @@ struct ring_buffer {
atomic_t aux_mmap_count;
unsigned long aux_mmap_locked;
void (*free_aux)(void *);
- atomic_t aux_refcount;
+ refcount_t aux_refcount;
void **aux_pages;
void *aux_priv;

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index e841d48..0416f01 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -358,7 +358,7 @@ void *perf_aux_output_begin(struct perf_output_handle *handle,
if (!atomic_read(&rb->aux_mmap_count))
goto err;

- if (!atomic_inc_not_zero(&rb->aux_refcount))
+ if (!refcount_inc_not_zero(&rb->aux_refcount))
goto err;

/*
@@ -671,7 +671,7 @@ int rb_alloc_aux(struct ring_buffer *rb, struct perf_event *event,
* we keep a refcount here to make sure either of the two can
* reference them safely.
*/
- atomic_set(&rb->aux_refcount, 1);
+ refcount_set(&rb->aux_refcount, 1);

rb->aux_overwrite = overwrite;
rb->aux_watermark = watermark;
@@ -690,7 +690,7 @@ int rb_alloc_aux(struct ring_buffer *rb, struct perf_event *event,

void rb_free_aux(struct ring_buffer *rb)
{
- if (atomic_dec_and_test(&rb->aux_refcount))
+ if (refcount_dec_and_test(&rb->aux_refcount))
__rb_free_aux(rb);
}

--
2.7.4


2019-01-29 09:39:55

by Peter Zijlstra

Subject: Re: [PATCH 1/3] perf: convert perf_event_context.refcount to refcount_t

On Mon, Jan 28, 2019 at 02:27:26PM +0200, Elena Reshetova wrote:
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 3cd13a3..a1e87d2 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -1171,7 +1171,7 @@ static void perf_event_ctx_deactivate(struct perf_event_context *ctx)
>
> static void get_ctx(struct perf_event_context *ctx)
> {
> - WARN_ON(!atomic_inc_not_zero(&ctx->refcount));
> + WARN_ON(!refcount_inc_not_zero(&ctx->refcount));

This could be refcount_inc(), remember how that already produces a WARN
when we try and increment 0.
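
The difference between the two calls can be sketched in userspace. The model below is an assumption-laden illustration rather than the kernel code: `model_refcount`, `model_inc_not_zero()`, `model_inc()` and the `model_warnings` counter are hypothetical, and the warning on a zero count mirrors only the behaviour described above for the generic implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for refcount_t. */
struct model_refcount {
	atomic_uint refs;
};

static int model_warnings; /* counts simulated WARN()s */

/* Take a reference only if the object is still live (count != 0). */
static bool model_inc_not_zero(struct model_refcount *r)
{
	unsigned int old = atomic_load_explicit(&r->refs, memory_order_relaxed);

	while (old != 0) {
		/* On failure, old is reloaded and the zero check is repeated. */
		if (atomic_compare_exchange_weak_explicit(&r->refs, &old, old + 1,
				memory_order_relaxed, memory_order_relaxed))
			return true;
	}
	return false;
}

/*
 * Unconditional get: the caller is expected to hold a reference already,
 * so a zero count is a bug and is reported instead of being returned.
 */
static void model_inc(struct model_refcount *r)
{
	if (!model_inc_not_zero(r)) {
		model_warnings++;
		fprintf(stderr, "WARN: increment on zero refcount\n");
	}
}
```

With a helper shaped like model_inc(), a get_ctx()-style wrapper has no need for its own WARN_ON() around the increment, which is the simplification being suggested.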


2019-01-29 09:41:19

by Peter Zijlstra

Subject: Re: [PATCH 0/3] perf refcount_t conversions

On Mon, Jan 28, 2019 at 02:27:25PM +0200, Elena Reshetova wrote:
> Another set of old patches, rebased. This time the commit messages
> are also updated, since the docs have been merged in the meantime and
> refcount_dec_and_test() will very soon gain ACQUIRE ordering on
> success, which the commit messages now reflect.
>
>
> Elena Reshetova (3):
> perf: convert perf_event_context.refcount to refcount_t
> perf/ring_buffer: convert ring_buffer.refcount to refcount_t
> perf/ring_buffer: convert ring_buffer.aux_refcount to refcount_t

Fixed that first patch, applied the lot.

2019-01-29 13:55:58

by Elena Reshetova

Subject: RE: [PATCH 1/3] perf: convert perf_event_context.refcount to refcount_t

> On Mon, Jan 28, 2019 at 02:27:26PM +0200, Elena Reshetova wrote:
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 3cd13a3..a1e87d2 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -1171,7 +1171,7 @@ static void perf_event_ctx_deactivate(struct
> perf_event_context *ctx)
> >
> > static void get_ctx(struct perf_event_context *ctx)
> > {
> > - WARN_ON(!atomic_inc_not_zero(&ctx->refcount));
> > + WARN_ON(!refcount_inc_not_zero(&ctx->refcount));
>
> This could be refcount_inc(), remember how that already produces a WARN
> when we try and increment 0.

But is this true for the x86 arch-specific implementation also?

2019-01-30 09:02:49

by Peter Zijlstra

Subject: Re: [PATCH 1/3] perf: convert perf_event_context.refcount to refcount_t

On Tue, Jan 29, 2019 at 01:55:32PM +0000, Reshetova, Elena wrote:
> > On Mon, Jan 28, 2019 at 02:27:26PM +0200, Elena Reshetova wrote:
> > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > index 3cd13a3..a1e87d2 100644
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -1171,7 +1171,7 @@ static void perf_event_ctx_deactivate(struct
> > perf_event_context *ctx)
> > >
> > > static void get_ctx(struct perf_event_context *ctx)
> > > {
> > > - WARN_ON(!atomic_inc_not_zero(&ctx->refcount));
> > > + WARN_ON(!refcount_inc_not_zero(&ctx->refcount));
> >
> > This could be refcount_inc(), remember how that already produces a WARN
> > when we try and increment 0.
>
> But is this true for the x86 arch-specific implementation also?

Dunno; but when debugging you should not have those enabled anyway.

2019-02-01 10:35:13

by Mark Rutland

Subject: Re: [PATCH 1/3] perf: convert perf_event_context.refcount to refcount_t

On Tue, Jan 29, 2019 at 01:55:32PM +0000, Reshetova, Elena wrote:
> > On Mon, Jan 28, 2019 at 02:27:26PM +0200, Elena Reshetova wrote:
> > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > index 3cd13a3..a1e87d2 100644
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -1171,7 +1171,7 @@ static void perf_event_ctx_deactivate(struct
> > perf_event_context *ctx)
> > >
> > > static void get_ctx(struct perf_event_context *ctx)
> > > {
> > > - WARN_ON(!atomic_inc_not_zero(&ctx->refcount));
> > > + WARN_ON(!refcount_inc_not_zero(&ctx->refcount));
> >
> > This could be refcount_inc(), remember how that already produces a WARN
> > when we try and increment 0.
>
> But is this true for the x86 arch-specific implementation also?

If you use refcount_inc_checked(), it will always produce the WARN(),
even when using the x86-specific refcount implementation.

(this was one place I had intended to use the *_checked() forms of the
refcount ops).

Thanks,
Mark.

2019-02-01 15:46:32

by Elena Reshetova

Subject: RE: [PATCH 1/3] perf: convert perf_event_context.refcount to refcount_t

> On Tue, Jan 29, 2019 at 01:55:32PM +0000, Reshetova, Elena wrote:
> > > On Mon, Jan 28, 2019 at 02:27:26PM +0200, Elena Reshetova wrote:
> > > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > > index 3cd13a3..a1e87d2 100644
> > > > --- a/kernel/events/core.c
> > > > +++ b/kernel/events/core.c
> > > > @@ -1171,7 +1171,7 @@ static void perf_event_ctx_deactivate(struct
> > > perf_event_context *ctx)
> > > >
> > > > static void get_ctx(struct perf_event_context *ctx)
> > > > {
> > > > - WARN_ON(!atomic_inc_not_zero(&ctx->refcount));
> > > > + WARN_ON(!refcount_inc_not_zero(&ctx->refcount));
> > >
> > > This could be refcount_inc(), remember how that already produces a WARN
> > > when we try and increment 0.
> >
> > But is this true for the x86 arch-specific implementation also?
>
> If you use refcount_inc_checked(), it will always produce the WARN(),
> even when using the x86-specific refcount implementation.
>
> (this was one place I had intended to use the *_checked() forms of the
> refcount ops).

Yes, with refcount_inc_checked() it would work, but I don't like it
that much when we have functions that behave regardless of refcount
config. It does help for code minimization & clarity like here, but I think
it complicates things even more: two different configs, then functions that
do not obey configs, etc.

Anyhow, I can change this to refcount_inc_checked(), if this is what everyone
thinks is the best.

Best Regards,
Elena.




2019-02-01 16:24:49

by Mark Rutland

Subject: Re: [PATCH 1/3] perf: convert perf_event_context.refcount to refcount_t

On Fri, Feb 01, 2019 at 03:44:38PM +0000, Reshetova, Elena wrote:
> > On Tue, Jan 29, 2019 at 01:55:32PM +0000, Reshetova, Elena wrote:
> > > > On Mon, Jan 28, 2019 at 02:27:26PM +0200, Elena Reshetova wrote:
> > > > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > > > index 3cd13a3..a1e87d2 100644
> > > > > --- a/kernel/events/core.c
> > > > > +++ b/kernel/events/core.c
> > > > > @@ -1171,7 +1171,7 @@ static void perf_event_ctx_deactivate(struct
> > > > perf_event_context *ctx)
> > > > >
> > > > > static void get_ctx(struct perf_event_context *ctx)
> > > > > {
> > > > > - WARN_ON(!atomic_inc_not_zero(&ctx->refcount));
> > > > > + WARN_ON(!refcount_inc_not_zero(&ctx->refcount));
> > > >
> > > > This could be refcount_inc(), remember how that already produces a WARN
> > > > when we try and increment 0.
> > >
> > > But is this true for the x86 arch-specific implementation also?
> >
> > If you use refcount_inc_checked(), it will always produce the WARN(),
> > even when using the x86-specific refcount implementation.
> >
> > (this was one place I had intended to use the *_checked() forms of the
> > refcount ops).
>
> Yes, with refcount_inc_checked() it would work, but I don't like it
> that much when we have functions that behave regardless of refcount
> config. It does help for code minimization & clarity like here, but I think
> it complicates things even more: two different configs, then functions that
> do not obey configs, etc.

Sure. The main idea of having the _checked() forms was to not lose
warnings in a conversion to refcount_t, but I appreciate that people
might not like the existing warnings at all.

> Anyhow, I can change this to refcount_inc_checked(), if this is what everyone
> thinks is the best.

I'll defer to Peter.

Peter, would you prefer refcount_inc() or refcount_inc_checked() here?

Thanks,
Mark.

Subject: [tip:perf/core] perf: Convert perf_event_context.refcount to refcount_t

Commit-ID: 8c94abbbe1ba24961278055434504b7dc3595415
Gitweb: https://git.kernel.org/tip/8c94abbbe1ba24961278055434504b7dc3595415
Author: Elena Reshetova <[email protected]>
AuthorDate: Mon, 28 Jan 2019 14:27:26 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 4 Feb 2019 08:46:15 +0100

perf: Convert perf_event_context.refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:

- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API, which prevent accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable perf_event_context.refcount is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

** Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts. Please check Documentation/core-api/refcount-vs-atomic.rst
for more information.

Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For perf_event_context.refcount it might make a difference
in the following places:

- get_ctx(), perf_event_ctx_lock_nested(), perf_lock_task_context()
and __perf_event_ctx_lock_double(): increment in
refcount_inc_not_zero() only guarantees control dependency
on success vs. fully ordered atomic counterpart
- put_ctx(): decrement in refcount_dec_and_test() provides
RELEASE ordering and ACQUIRE ordering + control dependency on success
vs. fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/perf_event.h | 3 ++-
kernel/events/core.c | 12 ++++++------
2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index a79e59fc3b7d..6cb5d483ab34 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -54,6 +54,7 @@ struct perf_guest_info_callbacks {
#include <linux/sysfs.h>
#include <linux/perf_regs.h>
#include <linux/cgroup.h>
+#include <linux/refcount.h>
#include <asm/local.h>

struct perf_callchain_entry {
@@ -737,7 +738,7 @@ struct perf_event_context {
int nr_stat;
int nr_freq;
int rotate_disable;
- atomic_t refcount;
+ refcount_t refcount;
struct task_struct *task;

/*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5b89de7918d0..677164d54547 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1172,7 +1172,7 @@ static void perf_event_ctx_deactivate(struct perf_event_context *ctx)

static void get_ctx(struct perf_event_context *ctx)
{
- WARN_ON(!atomic_inc_not_zero(&ctx->refcount));
+ refcount_inc(&ctx->refcount);
}

static void free_ctx(struct rcu_head *head)
@@ -1186,7 +1186,7 @@ static void free_ctx(struct rcu_head *head)

static void put_ctx(struct perf_event_context *ctx)
{
- if (atomic_dec_and_test(&ctx->refcount)) {
+ if (refcount_dec_and_test(&ctx->refcount)) {
if (ctx->parent_ctx)
put_ctx(ctx->parent_ctx);
if (ctx->task && ctx->task != TASK_TOMBSTONE)
@@ -1268,7 +1268,7 @@ perf_event_ctx_lock_nested(struct perf_event *event, int nesting)
again:
rcu_read_lock();
ctx = READ_ONCE(event->ctx);
- if (!atomic_inc_not_zero(&ctx->refcount)) {
+ if (!refcount_inc_not_zero(&ctx->refcount)) {
rcu_read_unlock();
goto again;
}
@@ -1401,7 +1401,7 @@ retry:
}

if (ctx->task == TASK_TOMBSTONE ||
- !atomic_inc_not_zero(&ctx->refcount)) {
+ !refcount_inc_not_zero(&ctx->refcount)) {
raw_spin_unlock(&ctx->lock);
ctx = NULL;
} else {
@@ -4057,7 +4057,7 @@ static void __perf_event_init_context(struct perf_event_context *ctx)
INIT_LIST_HEAD(&ctx->event_list);
INIT_LIST_HEAD(&ctx->pinned_active);
INIT_LIST_HEAD(&ctx->flexible_active);
- atomic_set(&ctx->refcount, 1);
+ refcount_set(&ctx->refcount, 1);
}

static struct perf_event_context *
@@ -10613,7 +10613,7 @@ __perf_event_ctx_lock_double(struct perf_event *group_leader,
again:
rcu_read_lock();
gctx = READ_ONCE(group_leader->ctx);
- if (!atomic_inc_not_zero(&gctx->refcount)) {
+ if (!refcount_inc_not_zero(&gctx->refcount)) {
rcu_read_unlock();
goto again;
}

Subject: [tip:perf/core] perf/ring_buffer: Convert ring_buffer.refcount to refcount_t

Commit-ID: fecb8ed2ce7010db373f8517ee815380d8e3c0c4
Gitweb: https://git.kernel.org/tip/fecb8ed2ce7010db373f8517ee815380d8e3c0c4
Author: Elena Reshetova <[email protected]>
AuthorDate: Mon, 28 Jan 2019 14:27:27 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 4 Feb 2019 08:46:16 +0100

perf/ring_buffer: Convert ring_buffer.refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:

- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API, which prevent accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable ring_buffer.refcount is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

** Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts. Please check Documentation/core-api/refcount-vs-atomic.rst
for more information.

Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For ring_buffer.refcount it might make a difference
in the following places:

- ring_buffer_get(): increment in refcount_inc_not_zero() only
guarantees control dependency on success vs. fully ordered
atomic counterpart
- ring_buffer_put(): decrement in refcount_dec_and_test() only
provides RELEASE ordering and ACQUIRE ordering + control dependency
on success vs. fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/events/core.c | 4 ++--
kernel/events/internal.h | 3 ++-
kernel/events/ring_buffer.c | 2 +-
3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 677164d54547..284232edf9be 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5393,7 +5393,7 @@ struct ring_buffer *ring_buffer_get(struct perf_event *event)
rcu_read_lock();
rb = rcu_dereference(event->rb);
if (rb) {
- if (!atomic_inc_not_zero(&rb->refcount))
+ if (!refcount_inc_not_zero(&rb->refcount))
rb = NULL;
}
rcu_read_unlock();
@@ -5403,7 +5403,7 @@ struct ring_buffer *ring_buffer_get(struct perf_event *event)

void ring_buffer_put(struct ring_buffer *rb)
{
- if (!atomic_dec_and_test(&rb->refcount))
+ if (!refcount_dec_and_test(&rb->refcount))
return;

WARN_ON_ONCE(!list_empty(&rb->event_list));
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 6dc725a7e7bc..4718de2a04e6 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -4,13 +4,14 @@

#include <linux/hardirq.h>
#include <linux/uaccess.h>
+#include <linux/refcount.h>

/* Buffer handling */

#define RING_BUFFER_WRITABLE 0x01

struct ring_buffer {
- atomic_t refcount;
+ refcount_t refcount;
struct rcu_head rcu_head;
#ifdef CONFIG_PERF_USE_VMALLOC
struct work_struct work;
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index ed6409300ef5..0a71d16ca41b 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -284,7 +284,7 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)
else
rb->overwrite = 1;

- atomic_set(&rb->refcount, 1);
+ refcount_set(&rb->refcount, 1);

INIT_LIST_HEAD(&rb->event_list);
spin_lock_init(&rb->event_lock);

Subject: [tip:perf/core] perf/ring_buffer: Convert ring_buffer.aux_refcount to refcount_t

Commit-ID: ca3bb3d027f69ac3ab1dafb32bde2f5a3a44439c
Gitweb: https://git.kernel.org/tip/ca3bb3d027f69ac3ab1dafb32bde2f5a3a44439c
Author: Elena Reshetova <[email protected]>
AuthorDate: Mon, 28 Jan 2019 14:27:28 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 4 Feb 2019 08:46:17 +0100

perf/ring_buffer: Convert ring_buffer.aux_refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:

- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API, which prevent accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable ring_buffer.aux_refcount is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

** Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts. Please check Documentation/core-api/refcount-vs-atomic.rst
for more information.

Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For ring_buffer.aux_refcount it might make a difference
in the following places:

- perf_aux_output_begin(): increment in refcount_inc_not_zero() only
guarantees control dependency on success vs. fully ordered
atomic counterpart
- rb_free_aux(): decrement in refcount_dec_and_test() only
provides RELEASE ordering and ACQUIRE ordering + control dependency
on success vs. fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/events/core.c | 2 +-
kernel/events/internal.h | 2 +-
kernel/events/ring_buffer.c | 6 +++---
3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 284232edf9be..5aeb4c74fb99 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5468,7 +5468,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)

/* this has to be the last one */
rb_free_aux(rb);
- WARN_ON_ONCE(atomic_read(&rb->aux_refcount));
+ WARN_ON_ONCE(refcount_read(&rb->aux_refcount));

mutex_unlock(&event->mmap_mutex);
}
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 4718de2a04e6..79c47076700a 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -49,7 +49,7 @@ struct ring_buffer {
atomic_t aux_mmap_count;
unsigned long aux_mmap_locked;
void (*free_aux)(void *);
- atomic_t aux_refcount;
+ refcount_t aux_refcount;
void **aux_pages;
void *aux_priv;

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 0a71d16ca41b..805f0423ee0b 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -357,7 +357,7 @@ void *perf_aux_output_begin(struct perf_output_handle *handle,
if (!atomic_read(&rb->aux_mmap_count))
goto err;

- if (!atomic_inc_not_zero(&rb->aux_refcount))
+ if (!refcount_inc_not_zero(&rb->aux_refcount))
goto err;

/*
@@ -670,7 +670,7 @@ int rb_alloc_aux(struct ring_buffer *rb, struct perf_event *event,
* we keep a refcount here to make sure either of the two can
* reference them safely.
*/
- atomic_set(&rb->aux_refcount, 1);
+ refcount_set(&rb->aux_refcount, 1);

rb->aux_overwrite = overwrite;
rb->aux_watermark = watermark;
@@ -689,7 +689,7 @@ out:

void rb_free_aux(struct ring_buffer *rb)
{
- if (atomic_dec_and_test(&rb->aux_refcount))
+ if (refcount_dec_and_test(&rb->aux_refcount))
__rb_free_aux(rb);
}