2017-11-16 18:36:38

by Elena Reshetova

Subject: [PATCH 00/16] v6 kernel core pieces refcount conversions

Changes in v6:
* memory ordering differences are outlined in each patch
together with potential problematic areas.
Note: I didn't include statements in the individual patches
on why I think the memory ordering changes do not matter
in each particular case, since ultimately these are only
known by the maintainers (unless explicitly documented) and
are very hard to figure out reliably from the code.
Therefore maintainers are expected to double check the
specific functions pointed out and make the final decision.
* rebase on top of today's linux-next/master


Changes in v5:
* Kees caught that the following changes to
perf_event_context.refcount and futex_pi_state.refcount
are not correct when ARCH_HAS_REFCOUNT is enabled:
- WARN_ON(!atomic_inc_not_zero(refcount));
+ refcount_inc(refcount);
So they are now changed back to using refcount_inc_not_zero.
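
As a side note, the two calls are not interchangeable. Below is a
minimal sketch of the difference, assuming the refcount_t API from
include/linux/refcount.h; the wrapper functions are made up purely
for illustration:

#include <linux/refcount.h>

/*
 * refcount_inc() assumes the counter is already non-zero: an
 * increment from zero is treated as an error (it WARNs) rather
 * than being reported to the caller.
 */
static void take_ref(refcount_t *r)
{
	refcount_inc(r);
}

/*
 * refcount_inc_not_zero() refuses to resurrect a counter that has
 * already dropped to zero and tells the caller so, which is what
 * the WARN_ON(!atomic_inc_not_zero(...)) call sites relied on.
 */
static bool try_take_ref(refcount_t *r)
{
	return refcount_inc_not_zero(r);
}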

Changes in v4:
* just rebase and corrections on linux-next/master

Changes in v3:
* SoB chain corrected
* minor corrections based on v2 feedback
* rebase on linux-next/master as of today

Changes in v2:
* dropped already merged patches
* rebase on top of linux-next/master
* By default, refcount_t now maps to atomic_t and uses the
standard atomic operations unless CONFIG_REFCOUNT_FULL is enabled.
This is a compromise for systems where performance is critical
(such as net) and even a slight delay in refcounter operations
cannot be accepted (see the sketch below).
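
Roughly, that arrangement looks as follows; this is a simplified
sketch of include/linux/refcount.h from that time, with error
handling and most operations elided:

#include <linux/atomic.h>

typedef struct refcount_struct {
	atomic_t refs;
} refcount_t;

#define REFCOUNT_INIT(n)	{ .refs = ATOMIC_INIT(n), }

#ifdef CONFIG_REFCOUNT_FULL
/* checked implementations live in lib/refcount.c */
extern void refcount_inc(refcount_t *r);
extern bool refcount_dec_and_test(refcount_t *r);
#else
/* default: plain atomic operations, no overflow checking */
static inline void refcount_inc(refcount_t *r)
{
	atomic_inc(&r->refs);
}

static inline bool refcount_dec_and_test(refcount_t *r)
{
	return atomic_dec_and_test(&r->refs);
}
#endif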

This series, for core kernel components, replaces atomic_t reference
counters with the new refcount_t type and API (see include/linux/refcount.h).
By doing this we prevent intentional or accidental
underflows or overflows that can lead to use-after-free vulnerabilities.
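
Each patch follows the same mechanical pattern. A hedged
before/after sketch with a made-up struct foo (the real call sites
are in the individual diffs):

#include <linux/refcount.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

/* Hypothetical object, for illustration only. */
struct foo {
	refcount_t ref;			/* was: atomic_t ref; */
	struct rcu_head rcu;
};

static void foo_init(struct foo *f)
{
	refcount_set(&f->ref, 1);	/* was: atomic_set(&f->ref, 1); */
}

static void foo_get(struct foo *f)
{
	refcount_inc(&f->ref);		/* was: atomic_inc(&f->ref); */
}

static void foo_put(struct foo *f)
{
	/* was: if (atomic_dec_and_test(&f->ref)) */
	if (refcount_dec_and_test(&f->ref))
		kfree_rcu(f, rcu);
}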

The patches are fully independent and can be cherry-picked separately.
If there are no objections to the patches, please merge them via their respective trees.


Elena Reshetova (16):
futex: convert futex_pi_state.refcount to refcount_t
sched: convert sighand_struct.count to refcount_t
sched: convert signal_struct.sigcnt to refcount_t
sched: convert user_struct.__count to refcount_t
sched: convert numa_group.refcount to refcount_t
sched/task_struct: convert task_struct.usage to refcount_t
sched/task_struct: convert task_struct.stack_refcount to refcount_t
perf: convert perf_event_context.refcount to refcount_t
perf/ring_buffer: convert ring_buffer.refcount to refcount_t
perf/ring_buffer: convert ring_buffer.aux_refcount to refcount_t
uprobes: convert uprobe.ref to refcount_t
nsproxy: convert nsproxy.count to refcount_t
groups: convert group_info.usage to refcount_t
creds: convert cred.usage to refcount_t
kcov: convert kcov.refcount to refcount_t
bdi: convert bdi_writeback_congested.refcnt from atomic_t to
refcount_t

fs/exec.c | 4 ++--
fs/proc/task_nommu.c | 2 +-
include/linux/backing-dev-defs.h | 3 ++-
include/linux/backing-dev.h | 4 ++--
include/linux/cred.h | 13 ++++++------
include/linux/init_task.h | 7 +++---
include/linux/nsproxy.h | 6 +++---
include/linux/perf_event.h | 3 ++-
include/linux/sched.h | 5 +++--
include/linux/sched/signal.h | 5 +++--
include/linux/sched/task.h | 4 ++--
include/linux/sched/task_stack.h | 2 +-
include/linux/sched/user.h | 5 +++--
kernel/cred.c | 46 ++++++++++++++++++++--------------------
kernel/events/core.c | 18 ++++++++--------
kernel/events/internal.h | 5 +++--
kernel/events/ring_buffer.c | 8 +++----
kernel/events/uprobes.c | 8 +++----
kernel/fork.c | 24 ++++++++++-----------
kernel/futex.c | 15 +++++++------
kernel/groups.c | 2 +-
kernel/kcov.c | 9 ++++----
kernel/nsproxy.c | 6 +++---
kernel/sched/fair.c | 12 +++++------
kernel/user.c | 8 +++----
mm/backing-dev.c | 14 ++++++------
26 files changed, 125 insertions(+), 113 deletions(-)

--
2.7.4




2017-11-15 15:05:21

by Elena Reshetova

Subject: [PATCH 09/16] perf/ring_buffer: convert ring_buffer.refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- the counter is initialized to 1 using atomic_set()
- a resource is freed when the counter reaches zero
- once the counter reaches zero, further
increments aren't allowed
- the counter scheme uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable ring_buffer.refcount is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from the refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen at
https://lkml.org/lkml/2017/11/15/57 and will hopefully soon be
merged into the documentation tree.
Normally the differences should not matter, since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases they might.
Please double check that you don't rely on any undocumented
memory ordering guarantees for this variable.

For ring_buffer.refcount it might make a difference
in the following places (see the sketch below):
- ring_buffer_get(): the increment in refcount_inc_not_zero() only
guarantees a control dependency on success, vs. the fully ordered
atomic counterpart
- ring_buffer_put(): the decrement in refcount_dec_and_test() only
provides RELEASE ordering and a control dependency on success,
vs. the fully ordered atomic counterpart
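
To illustrate what the weaker guarantee means in practice, here is
a hedged sketch (not part of this patch) of how ring_buffer_put()
could restore full ordering with smp_acquire__after_ctrl_dep() if
the teardown path ever turned out to rely on it:

void ring_buffer_put(struct ring_buffer *rb)
{
	if (!refcount_dec_and_test(&rb->refcount))
		return;

	/*
	 * refcount_dec_and_test() provides RELEASE ordering plus a
	 * control dependency on success; if ACQUIRE semantics were
	 * also needed here, the control dependency could be upgraded
	 * explicitly:
	 */
	smp_acquire__after_ctrl_dep();

	/* ... proceed with freeing the buffer as before ... */
}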

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
kernel/events/core.c | 4 ++--
kernel/events/internal.h | 3 ++-
kernel/events/ring_buffer.c | 2 +-
3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 29c381f..3497c6a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5020,7 +5020,7 @@ struct ring_buffer *ring_buffer_get(struct perf_event *event)
rcu_read_lock();
rb = rcu_dereference(event->rb);
if (rb) {
- if (!atomic_inc_not_zero(&rb->refcount))
+ if (!refcount_inc_not_zero(&rb->refcount))
rb = NULL;
}
rcu_read_unlock();
@@ -5030,7 +5030,7 @@ struct ring_buffer *ring_buffer_get(struct perf_event *event)

void ring_buffer_put(struct ring_buffer *rb)
{
- if (!atomic_dec_and_test(&rb->refcount))
+ if (!refcount_dec_and_test(&rb->refcount))
return;

WARN_ON_ONCE(!list_empty(&rb->event_list));
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 09b1537..86c5c7f 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -4,13 +4,14 @@

#include <linux/hardirq.h>
#include <linux/uaccess.h>
+#include <linux/refcount.h>

/* Buffer handling */

#define RING_BUFFER_WRITABLE 0x01

struct ring_buffer {
- atomic_t refcount;
+ refcount_t refcount;
struct rcu_head rcu_head;
#ifdef CONFIG_PERF_USE_VMALLOC
struct work_struct work;
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 141aa2c..de12d36 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -284,7 +284,7 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)
else
rb->overwrite = 1;

- atomic_set(&rb->refcount, 1);
+ refcount_set(&rb->refcount, 1);

INIT_LIST_HEAD(&rb->event_list);
spin_lock_init(&rb->event_lock);
--
2.7.4



2017-11-15 16:23:24

by Elena Reshetova

Subject: [PATCH 01/16] futex: convert futex_pi_state.refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- the counter is initialized to 1 using atomic_set()
- a resource is freed when the counter reaches zero
- once the counter reaches zero, further
increments aren't allowed
- the counter scheme uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable futex_pi_state.refcount is used as a pure
reference counter. Convert it to refcount_t and fix up
the operations.

**Important note for maintainers:

Some functions from the refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen at
https://lkml.org/lkml/2017/11/15/57 and will hopefully soon be
merged into the documentation tree.
Normally the differences should not matter, since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases they might.
Please double check that you don't rely on any undocumented
memory ordering guarantees for this variable.

For futex_pi_state.refcount it might make a difference
in the following places:
- get_pi_state() and exit_pi_state_list(): the increment in
refcount_inc_not_zero() only guarantees a control dependency
on success, vs. the fully ordered atomic counterpart
- put_pi_state(): the decrement in refcount_dec_and_test() only
provides RELEASE ordering and a control dependency on success,
vs. the fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
kernel/futex.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 76ed592..907055f 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -67,6 +67,7 @@
#include <linux/freezer.h>
#include <linux/bootmem.h>
#include <linux/fault-inject.h>
+#include <linux/refcount.h>

#include <asm/futex.h>

@@ -209,7 +210,7 @@ struct futex_pi_state {
struct rt_mutex pi_mutex;

struct task_struct *owner;
- atomic_t refcount;
+ refcount_t refcount;

union futex_key key;
} __randomize_layout;
@@ -795,7 +796,7 @@ static int refill_pi_state_cache(void)
INIT_LIST_HEAD(&pi_state->list);
/* pi_mutex gets initialized later */
pi_state->owner = NULL;
- atomic_set(&pi_state->refcount, 1);
+ refcount_set(&pi_state->refcount, 1);
pi_state->key = FUTEX_KEY_INIT;

current->pi_state_cache = pi_state;
@@ -815,7 +816,7 @@ static struct futex_pi_state *alloc_pi_state(void)

static void get_pi_state(struct futex_pi_state *pi_state)
{
- WARN_ON_ONCE(!atomic_inc_not_zero(&pi_state->refcount));
+ WARN_ON_ONCE(!refcount_inc_not_zero(&pi_state->refcount));
}

/*
@@ -827,7 +828,7 @@ static void put_pi_state(struct futex_pi_state *pi_state)
if (!pi_state)
return;

- if (!atomic_dec_and_test(&pi_state->refcount))
+ if (!refcount_dec_and_test(&pi_state->refcount))
return;

/*
@@ -857,7 +858,7 @@ static void put_pi_state(struct futex_pi_state *pi_state)
* refcount is at 0 - put it back to 1.
*/
pi_state->owner = NULL;
- atomic_set(&pi_state->refcount, 1);
+ refcount_set(&pi_state->refcount, 1);
current->pi_state_cache = pi_state;
}
}
@@ -918,7 +919,7 @@ void exit_pi_state_list(struct task_struct *curr)
* In that case; drop the locks to let put_pi_state() make
* progress and retry the loop.
*/
- if (!atomic_inc_not_zero(&pi_state->refcount)) {
+ if (!refcount_inc_not_zero(&pi_state->refcount)) {
raw_spin_unlock_irq(&curr->pi_lock);
cpu_relax();
raw_spin_lock_irq(&curr->pi_lock);
@@ -1074,7 +1075,7 @@ static int attach_to_pi_state(u32 __user *uaddr, u32 uval,
* and futex_wait_requeue_pi() as it cannot go to 0 and consequently
* free pi_state before we can take a reference ourselves.
*/
- WARN_ON(!atomic_read(&pi_state->refcount));
+ WARN_ON(!refcount_read(&pi_state->refcount));

/*
* Now that we have a pi_state, we can acquire wait_lock
--
2.7.4



2017-11-15 14:15:43

by Elena Reshetova

Subject: [PATCH 08/16] perf: convert perf_event_context.refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- the counter is initialized to 1 using atomic_set()
- a resource is freed when the counter reaches zero
- once the counter reaches zero, further
increments aren't allowed
- the counter scheme uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable perf_event_context.refcount is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from the refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen at
https://lkml.org/lkml/2017/11/15/57 and will hopefully soon be
merged into the documentation tree.
Normally the differences should not matter, since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases they might.
Please double check that you don't rely on any undocumented
memory ordering guarantees for this variable.

For perf_event_context.refcount it might make a difference
in the following places:
- get_ctx(), perf_event_ctx_lock_nested(), perf_lock_task_context()
and __perf_event_ctx_lock_double(): the increment in
refcount_inc_not_zero() only guarantees a control dependency
on success, vs. the fully ordered atomic counterpart
- put_ctx(): the decrement in refcount_dec_and_test() only
provides RELEASE ordering and a control dependency on success,
vs. the fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
include/linux/perf_event.h | 3 ++-
kernel/events/core.c | 12 ++++++------
2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 2c9c87d..6a78705 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -54,6 +54,7 @@ struct perf_guest_info_callbacks {
#include <linux/perf_regs.h>
#include <linux/workqueue.h>
#include <linux/cgroup.h>
+#include <linux/refcount.h>
#include <asm/local.h>

struct perf_callchain_entry {
@@ -718,7 +719,7 @@ struct perf_event_context {
int nr_stat;
int nr_freq;
int rotate_disable;
- atomic_t refcount;
+ refcount_t refcount;
struct task_struct *task;

/*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index d084a97..29c381f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1148,7 +1148,7 @@ static void perf_event_ctx_deactivate(struct perf_event_context *ctx)

static void get_ctx(struct perf_event_context *ctx)
{
- WARN_ON(!atomic_inc_not_zero(&ctx->refcount));
+ WARN_ON(!refcount_inc_not_zero(&ctx->refcount));
}

static void free_ctx(struct rcu_head *head)
@@ -1162,7 +1162,7 @@ static void free_ctx(struct rcu_head *head)

static void put_ctx(struct perf_event_context *ctx)
{
- if (atomic_dec_and_test(&ctx->refcount)) {
+ if (refcount_dec_and_test(&ctx->refcount)) {
if (ctx->parent_ctx)
put_ctx(ctx->parent_ctx);
if (ctx->task && ctx->task != TASK_TOMBSTONE)
@@ -1240,7 +1240,7 @@ perf_event_ctx_lock_nested(struct perf_event *event, int nesting)
again:
rcu_read_lock();
ctx = READ_ONCE(event->ctx);
- if (!atomic_inc_not_zero(&ctx->refcount)) {
+ if (!refcount_inc_not_zero(&ctx->refcount)) {
rcu_read_unlock();
goto again;
}
@@ -1373,7 +1373,7 @@ perf_lock_task_context(struct task_struct *task, int ctxn, unsigned long *flags)
}

if (ctx->task == TASK_TOMBSTONE ||
- !atomic_inc_not_zero(&ctx->refcount)) {
+ !refcount_inc_not_zero(&ctx->refcount)) {
raw_spin_unlock(&ctx->lock);
ctx = NULL;
} else {
@@ -3715,7 +3715,7 @@ static void __perf_event_init_context(struct perf_event_context *ctx)
INIT_LIST_HEAD(&ctx->pinned_groups);
INIT_LIST_HEAD(&ctx->flexible_groups);
INIT_LIST_HEAD(&ctx->event_list);
- atomic_set(&ctx->refcount, 1);
+ refcount_set(&ctx->refcount, 1);
}

static struct perf_event_context *
@@ -9793,7 +9793,7 @@ __perf_event_ctx_lock_double(struct perf_event *group_leader,
again:
rcu_read_lock();
gctx = READ_ONCE(group_leader->ctx);
- if (!atomic_inc_not_zero(&gctx->refcount)) {
+ if (!refcount_inc_not_zero(&gctx->refcount)) {
rcu_read_unlock();
goto again;
}
--
2.7.4



2017-11-15 14:14:15

by Elena Reshetova

Subject: [PATCH 02/16] sched: convert sighand_struct.count to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- the counter is initialized to 1 using atomic_set()
- a resource is freed when the counter reaches zero
- once the counter reaches zero, further
increments aren't allowed
- the counter scheme uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable sighand_struct.count is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from the refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen at
https://lkml.org/lkml/2017/11/15/57 and will hopefully soon be
merged into the documentation tree.
Normally the differences should not matter, since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases they might.
Please double check that you don't rely on any undocumented
memory ordering guarantees for this variable.

For sighand_struct.count it might make a difference
in the following places:
- __cleanup_sighand(): the decrement in refcount_dec_and_test() only
provides RELEASE ordering and a control dependency on success,
vs. the fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
fs/exec.c | 4 ++--
fs/proc/task_nommu.c | 2 +-
include/linux/init_task.h | 2 +-
include/linux/sched/signal.h | 3 ++-
kernel/fork.c | 8 ++++----
5 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 19e6325..09d99b5 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1181,7 +1181,7 @@ static int de_thread(struct task_struct *tsk)
flush_itimer_signals();
#endif

- if (atomic_read(&oldsighand->count) != 1) {
+ if (refcount_read(&oldsighand->count) != 1) {
struct sighand_struct *newsighand;
/*
* This ->sighand is shared with the CLONE_SIGHAND
@@ -1191,7 +1191,7 @@ static int de_thread(struct task_struct *tsk)
if (!newsighand)
return -ENOMEM;

- atomic_set(&newsighand->count, 1);
+ refcount_set(&newsighand->count, 1);
memcpy(newsighand->action, oldsighand->action,
sizeof(newsighand->action));

diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 0b60ac6..684f808 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -64,7 +64,7 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
else
bytes += kobjsize(current->files);

- if (current->sighand && atomic_read(&current->sighand->count) > 1)
+ if (current->sighand && refcount_read(&current->sighand->count) > 1)
sbytes += kobjsize(current->sighand);
else
bytes += kobjsize(current->sighand);
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 6a53262..9eb2ce8 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -86,7 +86,7 @@ extern struct fs_struct init_fs;
extern struct nsproxy init_nsproxy;

#define INIT_SIGHAND(sighand) { \
- .count = ATOMIC_INIT(1), \
+ .count = REFCOUNT_INIT(1), \
.action = { { { .sa_handler = SIG_DFL, } }, }, \
.siglock = __SPIN_LOCK_UNLOCKED(sighand.siglock), \
.signalfd_wqh = __WAIT_QUEUE_HEAD_INITIALIZER(sighand.signalfd_wqh), \
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 64d85fc..4a0e2d8 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -8,13 +8,14 @@
#include <linux/sched/jobctl.h>
#include <linux/sched/task.h>
#include <linux/cred.h>
+#include <linux/refcount.h>

/*
* Types defining task->signal and task->sighand and APIs using them:
*/

struct sighand_struct {
- atomic_t count;
+ refcount_t count;
struct k_sigaction action[_NSIG];
spinlock_t siglock;
wait_queue_head_t signalfd_wqh;
diff --git a/kernel/fork.c b/kernel/fork.c
index a1db74e..be451af 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1381,7 +1381,7 @@ static int copy_sighand(unsigned long clone_flags, struct task_struct *tsk)
struct sighand_struct *sig;

if (clone_flags & CLONE_SIGHAND) {
- atomic_inc(&current->sighand->count);
+ refcount_inc(&current->sighand->count);
return 0;
}
sig = kmem_cache_alloc(sighand_cachep, GFP_KERNEL);
@@ -1389,14 +1389,14 @@ static int copy_sighand(unsigned long clone_flags, struct task_struct *tsk)
if (!sig)
return -ENOMEM;

- atomic_set(&sig->count, 1);
+ refcount_set(&sig->count, 1);
memcpy(sig->action, current->sighand->action, sizeof(sig->action));
return 0;
}

void __cleanup_sighand(struct sighand_struct *sighand)
{
- if (atomic_dec_and_test(&sighand->count)) {
+ if (refcount_dec_and_test(&sighand->count)) {
signalfd_cleanup(sighand);
/*
* sighand_cachep is SLAB_TYPESAFE_BY_RCU so we can free it
@@ -2303,7 +2303,7 @@ static int check_unshare_flags(unsigned long unshare_flags)
return -EINVAL;
}
if (unshare_flags & (CLONE_SIGHAND | CLONE_VM)) {
- if (atomic_read(&current->sighand->count) > 1)
+ if (refcount_read(&current->sighand->count) > 1)
return -EINVAL;
}
if (unshare_flags & CLONE_VM) {
--
2.7.4



2017-11-15 14:16:23

by Elena Reshetova

Subject: [PATCH 06/16] sched/task_struct: convert task_struct.usage to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- the counter is initialized to 1 using atomic_set()
- a resource is freed when the counter reaches zero
- once the counter reaches zero, further
increments aren't allowed
- the counter scheme uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable task_struct.usage is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from the refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen at
https://lkml.org/lkml/2017/11/15/57 and will hopefully soon be
merged into the documentation tree.
Normally the differences should not matter, since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases they might.
Please double check that you don't rely on any undocumented
memory ordering guarantees for this variable.

For task_struct.usage it might make a difference
in the following places:
- put_task_struct(): the decrement in refcount_dec_and_test() only
provides RELEASE ordering and a control dependency on success,
vs. the fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
include/linux/init_task.h | 2 +-
include/linux/sched.h | 3 ++-
include/linux/sched/task.h | 4 ++--
kernel/fork.c | 4 ++--
4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 9eb2ce8..1e35fce 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -227,7 +227,7 @@ extern struct cred init_cred;
INIT_TASK_TI(tsk) \
.state = 0, \
.stack = init_stack, \
- .usage = ATOMIC_INIT(2), \
+ .usage = REFCOUNT_INIT(2), \
.flags = PF_KTHREAD, \
.prio = MAX_PRIO-20, \
.static_prio = MAX_PRIO-20, \
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 44f9df5..924a812 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -21,6 +21,7 @@
#include <linux/seccomp.h>
#include <linux/nodemask.h>
#include <linux/rcupdate.h>
+#include <linux/refcount.h>
#include <linux/resource.h>
#include <linux/latencytop.h>
#include <linux/sched/prio.h>
@@ -536,7 +537,7 @@ struct task_struct {
randomized_struct_fields_start

void *stack;
- atomic_t usage;
+ refcount_t usage;
/* Per task flags (PF_*), defined further below: */
unsigned int flags;
unsigned int ptrace;
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 5be31eb..dae8d04 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -86,13 +86,13 @@ extern void sched_exec(void);
#define sched_exec() {}
#endif

-#define get_task_struct(tsk) do { atomic_inc(&(tsk)->usage); } while(0)
+#define get_task_struct(tsk) do { refcount_inc(&(tsk)->usage); } while(0)

extern void __put_task_struct(struct task_struct *t);

static inline void put_task_struct(struct task_struct *t)
{
- if (atomic_dec_and_test(&t->usage))
+ if (refcount_dec_and_test(&t->usage))
__put_task_struct(t);
}

diff --git a/kernel/fork.c b/kernel/fork.c
index a65ec7d..16df4f5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -649,7 +649,7 @@ static inline void put_signal_struct(struct signal_struct *sig)
void __put_task_struct(struct task_struct *tsk)
{
WARN_ON(!tsk->exit_state);
- WARN_ON(atomic_read(&tsk->usage));
+ WARN_ON(refcount_read(&tsk->usage));
WARN_ON(tsk == current);

cgroup_free(tsk);
@@ -824,7 +824,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
* One for us, one for whoever does the "release_task()" (usually
* parent)
*/
- atomic_set(&tsk->usage, 2);
+ refcount_set(&tsk->usage, 2);
#ifdef CONFIG_BLK_DEV_IO_TRACE
tsk->btrace_seq = 0;
#endif
--
2.7.4



2017-11-15 15:04:43

by Elena Reshetova

Subject: [PATCH 03/16] sched: convert signal_struct.sigcnt to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- the counter is initialized to 1 using atomic_set()
- a resource is freed when the counter reaches zero
- once the counter reaches zero, further
increments aren't allowed
- the counter scheme uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable signal_struct.sigcnt is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from the refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen at
https://lkml.org/lkml/2017/11/15/57 and will hopefully soon be
merged into the documentation tree.
Normally the differences should not matter, since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases they might.
Please double check that you don't rely on any undocumented
memory ordering guarantees for this variable.

For signal_struct.sigcnt it might make a difference
in the following places:
- put_signal_struct(): the decrement in refcount_dec_and_test() only
provides RELEASE ordering and a control dependency on success,
vs. the fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
include/linux/sched/signal.h | 2 +-
kernel/fork.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 4a0e2d8..14e3a0c 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -78,7 +78,7 @@ struct thread_group_cputimer {
* the locking of signal_struct.
*/
struct signal_struct {
- atomic_t sigcnt;
+ refcount_t sigcnt;
atomic_t live;
int nr_threads;
struct list_head thread_head;
diff --git a/kernel/fork.c b/kernel/fork.c
index be451af..a65ec7d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -642,7 +642,7 @@ static inline void free_signal_struct(struct signal_struct *sig)

static inline void put_signal_struct(struct signal_struct *sig)
{
- if (atomic_dec_and_test(&sig->sigcnt))
+ if (refcount_dec_and_test(&sig->sigcnt))
free_signal_struct(sig);
}

@@ -1443,7 +1443,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)

sig->nr_threads = 1;
atomic_set(&sig->live, 1);
- atomic_set(&sig->sigcnt, 1);
+ refcount_set(&sig->sigcnt, 1);

/* list_add(thread_node, thread_head) without INIT_LIST_HEAD() */
sig->thread_head = (struct list_head)LIST_HEAD_INIT(tsk->thread_node);
@@ -1952,7 +1952,7 @@ static __latent_entropy struct task_struct *copy_process(
} else {
current->signal->nr_threads++;
atomic_inc(&current->signal->live);
- atomic_inc(&current->signal->sigcnt);
+ refcount_inc(&current->signal->sigcnt);
list_add_tail_rcu(&p->thread_group,
&p->group_leader->thread_group);
list_add_tail_rcu(&p->thread_node,
--
2.7.4



2017-11-15 14:16:36

by Elena Reshetova

Subject: [PATCH 16/16] bdi: convert bdi_writeback_congested.refcnt from atomic_t to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- the counter is initialized to 1 using atomic_set()
- a resource is freed when the counter reaches zero
- once the counter reaches zero, further
increments aren't allowed
- the counter scheme uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable bdi_writeback_congested.refcnt is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from the refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen at
https://lkml.org/lkml/2017/11/15/57 and will hopefully soon be
merged into the documentation tree.
Normally the differences should not matter, since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases they might.
Please double check that you don't rely on any undocumented
memory ordering guarantees for this variable.

For bdi_writeback_congested.refcnt it might make a difference
in the following places (see the sketch below):
- wb_congested_put() in include/linux/backing-dev.h:
the decrement in refcount_dec_and_test() only
provides RELEASE ordering and a control dependency on success,
vs. the fully ordered atomic counterpart
- wb_congested_put() in mm/backing-dev.c: the decrement in
refcount_dec_and_lock() only
provides RELEASE ordering and a control dependency on success,
and holds the spin_lock() on success, vs. the fully ordered
atomic counterpart. Note that there is no difference in the
spin lock handling itself.
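
For reference, refcount_dec_and_lock() keeps the
atomic_dec_and_lock() contract: it returns true with the lock held
only when the count drops to zero. A minimal usage sketch with a
made-up object and lock (the real call site is wb_congested_put()
in the diff below):

#include <linux/list.h>
#include <linux/refcount.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(obj_lock);

struct obj {
	refcount_t refcnt;
	struct list_head node;
};

static void obj_put(struct obj *o)
{
	/* Fast path: drop a reference without taking the lock. */
	if (!refcount_dec_and_lock(&o->refcnt, &obj_lock))
		return;

	/* The count hit zero: we hold obj_lock; unlink, then free. */
	list_del(&o->node);
	spin_unlock(&obj_lock);
	kfree(o);
}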

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
include/linux/backing-dev-defs.h | 3 ++-
include/linux/backing-dev.h | 4 ++--
mm/backing-dev.c | 14 ++++++++------
3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index bfe86b5..0b1bcce 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -5,6 +5,7 @@
#include <linux/list.h>
#include <linux/radix-tree.h>
#include <linux/rbtree.h>
+#include <linux/refcount.h>
#include <linux/spinlock.h>
#include <linux/percpu_counter.h>
#include <linux/percpu-refcount.h>
@@ -76,7 +77,7 @@ enum wb_reason {
*/
struct bdi_writeback_congested {
unsigned long state; /* WB_[a]sync_congested flags */
- atomic_t refcnt; /* nr of attached wb's and blkg */
+ refcount_t refcnt; /* nr of attached wb's and blkg */

#ifdef CONFIG_CGROUP_WRITEBACK
struct backing_dev_info *__bdi; /* the associated bdi, set to NULL
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index e54e7e0..d6bac2e 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -402,13 +402,13 @@ static inline bool inode_cgwb_enabled(struct inode *inode)
static inline struct bdi_writeback_congested *
wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
{
- atomic_inc(&bdi->wb_congested->refcnt);
+ refcount_inc(&bdi->wb_congested->refcnt);
return bdi->wb_congested;
}

static inline void wb_congested_put(struct bdi_writeback_congested *congested)
{
- if (atomic_dec_and_test(&congested->refcnt))
+ if (refcount_dec_and_test(&congested->refcnt))
kfree(congested);
}

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 74b52df..e92a20f 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -440,14 +440,17 @@ wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
node = &parent->rb_left;
else if (congested->blkcg_id > blkcg_id)
node = &parent->rb_right;
- else
- goto found;
+ else {
+ refcount_inc(&congested->refcnt);
+ goto found;
+ }
}

if (new_congested) {
/* !found and storage for new one already allocated, insert */
congested = new_congested;
new_congested = NULL;
+ refcount_set(&congested->refcnt, 1);
rb_link_node(&congested->rb_node, parent, node);
rb_insert_color(&congested->rb_node, &bdi->cgwb_congested_tree);
goto found;
@@ -460,13 +463,12 @@ wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
if (!new_congested)
return NULL;

- atomic_set(&new_congested->refcnt, 0);
+ refcount_set(&new_congested->refcnt, 0);
new_congested->__bdi = bdi;
new_congested->blkcg_id = blkcg_id;
goto retry;

found:
- atomic_inc(&congested->refcnt);
spin_unlock_irqrestore(&cgwb_lock, flags);
kfree(new_congested);
return congested;
@@ -483,7 +485,7 @@ void wb_congested_put(struct bdi_writeback_congested *congested)
unsigned long flags;

local_irq_save(flags);
- if (!atomic_dec_and_lock(&congested->refcnt, &cgwb_lock)) {
+ if (!refcount_dec_and_lock(&congested->refcnt, &cgwb_lock)) {
local_irq_restore(flags);
return;
}
@@ -793,7 +795,7 @@ static int cgwb_bdi_init(struct backing_dev_info *bdi)
if (!bdi->wb_congested)
return -ENOMEM;

- atomic_set(&bdi->wb_congested->refcnt, 1);
+ refcount_set(&bdi->wb_congested->refcnt, 1);

err = wb_init(&bdi->wb, bdi, 1, GFP_KERNEL);
if (err) {
--
2.7.4



2017-11-15 15:03:11

by Elena Reshetova

Subject: [PATCH 05/16] sched: convert numa_group.refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- the counter is initialized to 1 using atomic_set()
- a resource is freed when the counter reaches zero
- once the counter reaches zero, further
increments aren't allowed
- the counter scheme uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable numa_group.refcount is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from the refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen at
https://lkml.org/lkml/2017/11/15/57 and will hopefully soon be
merged into the documentation tree.
Normally the differences should not matter, since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases they might.
Please double check that you don't rely on any undocumented
memory ordering guarantees for this variable.

For numa_group.refcount it might make a difference
in the following places:
- get_numa_group(): the increment in refcount_inc_not_zero() only
guarantees a control dependency on success, vs. the fully ordered
atomic counterpart
- put_numa_group(): the decrement in refcount_dec_and_test() only
provides RELEASE ordering and a control dependency on success,
vs. the fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
kernel/sched/fair.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4037e19..b456b94 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1052,7 +1052,7 @@ unsigned int sysctl_numa_balancing_scan_size = 256;
unsigned int sysctl_numa_balancing_scan_delay = 1000;

struct numa_group {
- atomic_t refcount;
+ refcount_t refcount;

spinlock_t lock; /* nr_tasks, tasks */
int nr_tasks;
@@ -1121,7 +1121,7 @@ static unsigned int task_scan_start(struct task_struct *p)
unsigned long shared = group_faults_shared(ng);
unsigned long private = group_faults_priv(ng);

- period *= atomic_read(&ng->refcount);
+ period *= refcount_read(&ng->refcount);
period *= shared + 1;
period /= private + shared + 1;
}
@@ -1144,7 +1144,7 @@ static unsigned int task_scan_max(struct task_struct *p)
unsigned long private = group_faults_priv(ng);
unsigned long period = smax;

- period *= atomic_read(&ng->refcount);
+ period *= refcount_read(&ng->refcount);
period *= shared + 1;
period /= private + shared + 1;

@@ -2229,12 +2229,12 @@ static void task_numa_placement(struct task_struct *p)

static inline int get_numa_group(struct numa_group *grp)
{
- return atomic_inc_not_zero(&grp->refcount);
+ return refcount_inc_not_zero(&grp->refcount);
}

static inline void put_numa_group(struct numa_group *grp)
{
- if (atomic_dec_and_test(&grp->refcount))
+ if (refcount_dec_and_test(&grp->refcount))
kfree_rcu(grp, rcu);
}

@@ -2255,7 +2255,7 @@ static void task_numa_group(struct task_struct *p, int cpupid, int flags,
if (!grp)
return;

- atomic_set(&grp->refcount, 1);
+ refcount_set(&grp->refcount, 1);
grp->active_nodes = 1;
grp->max_faults_cpu = 0;
spin_lock_init(&grp->lock);
--
2.7.4



2017-11-15 14:17:21

by Elena Reshetova

Subject: [PATCH 07/16] sched/task_struct: convert task_struct.stack_refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- the counter is initialized to 1 using atomic_set()
- a resource is freed when the counter reaches zero
- once the counter reaches zero, further
increments aren't allowed
- the counter scheme uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable task_struct.stack_refcount is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from the refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen at
https://lkml.org/lkml/2017/11/15/57 and will hopefully soon be
merged into the documentation tree.
Normally the differences should not matter, since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases they might.
Please double check that you don't rely on any undocumented
memory ordering guarantees for this variable.

For task_struct.stack_refcount it might make a difference
in the following places (see the sketch below):
- try_get_task_stack(): the increment in refcount_inc_not_zero() only
guarantees a control dependency on success, vs. the fully ordered
atomic counterpart
- put_task_stack(): the decrement in refcount_dec_and_test() only
provides RELEASE ordering and a control dependency on success,
vs. the fully ordered atomic counterpart
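
On the increment side, a control dependency on success only orders
later stores. A hedged sketch (not part of this patch) of how
try_get_task_stack() could be upgraded to ACQUIRE ordering if later
loads also had to be ordered against the increment:

static inline void *try_get_task_stack(struct task_struct *tsk)
{
	if (!refcount_inc_not_zero(&tsk->stack_refcount))
		return NULL;

	/*
	 * The control dependency from the successful increment
	 * orders later stores only; upgrade it to ACQUIRE if later
	 * loads must be ordered as well:
	 */
	smp_acquire__after_ctrl_dep();

	return task_stack_page(tsk);
}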

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
include/linux/init_task.h | 3 ++-
include/linux/sched.h | 2 +-
include/linux/sched/task_stack.h | 2 +-
kernel/fork.c | 6 +++---
4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 1e35fce..6a87579 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -13,6 +13,7 @@
#include <linux/securebits.h>
#include <linux/seqlock.h>
#include <linux/rbtree.h>
+#include <linux/refcount.h>
#include <linux/sched/autogroup.h>
#include <net/net_namespace.h>
#include <linux/sched/rt.h>
@@ -207,7 +208,7 @@ extern struct cred init_cred;
#ifdef CONFIG_THREAD_INFO_IN_TASK
# define INIT_TASK_TI(tsk) \
.thread_info = INIT_THREAD_INFO(tsk), \
- .stack_refcount = ATOMIC_INIT(1),
+ .stack_refcount = REFCOUNT_INIT(1),
#else
# define INIT_TASK_TI(tsk)
#endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 924a812..c8c6d17 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1098,7 +1098,7 @@ struct task_struct {
#endif
#ifdef CONFIG_THREAD_INFO_IN_TASK
/* A live task holds one reference: */
- atomic_t stack_refcount;
+ refcount_t stack_refcount;
#endif
#ifdef CONFIG_LIVEPATCH
int patch_state;
diff --git a/include/linux/sched/task_stack.h b/include/linux/sched/task_stack.h
index cb4828a..4559316 100644
--- a/include/linux/sched/task_stack.h
+++ b/include/linux/sched/task_stack.h
@@ -61,7 +61,7 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
#ifdef CONFIG_THREAD_INFO_IN_TASK
static inline void *try_get_task_stack(struct task_struct *tsk)
{
- return atomic_inc_not_zero(&tsk->stack_refcount) ?
+ return refcount_inc_not_zero(&tsk->stack_refcount) ?
task_stack_page(tsk) : NULL;
}

diff --git a/kernel/fork.c b/kernel/fork.c
index 16df4f5..822efa2 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -362,7 +362,7 @@ static void release_task_stack(struct task_struct *tsk)
#ifdef CONFIG_THREAD_INFO_IN_TASK
void put_task_stack(struct task_struct *tsk)
{
- if (atomic_dec_and_test(&tsk->stack_refcount))
+ if (refcount_dec_and_test(&tsk->stack_refcount))
release_task_stack(tsk);
}
#endif
@@ -380,7 +380,7 @@ void free_task(struct task_struct *tsk)
* If the task had a separate stack allocation, it should be gone
* by now.
*/
- WARN_ON_ONCE(atomic_read(&tsk->stack_refcount) != 0);
+ WARN_ON_ONCE(refcount_read(&tsk->stack_refcount) != 0);
#endif
rt_mutex_debug_task_free(tsk);
ftrace_graph_exit_task(tsk);
@@ -795,7 +795,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
tsk->stack_vm_area = stack_vm_area;
#endif
#ifdef CONFIG_THREAD_INFO_IN_TASK
- atomic_set(&tsk->stack_refcount, 1);
+ refcount_set(&tsk->stack_refcount, 1);
#endif

if (err)
--
2.7.4



2017-11-15 16:24:17

by Elena Reshetova

Subject: [PATCH 10/16] perf/ring_buffer: convert ring_buffer.aux_refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:
- the counter is initialized to 1 using atomic_set()
- a resource is freed when the counter reaches zero
- once the counter reaches zero, further
increments aren't allowed
- the counter scheme uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to the newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable ring_buffer.aux_refcount is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from the refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen at
https://lkml.org/lkml/2017/11/15/57 and will hopefully soon be
merged into the documentation tree.
Normally the differences should not matter, since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases they might.
Please double check that you don't rely on any undocumented
memory ordering guarantees for this variable.

For ring_buffer.aux_refcount it might make a difference
in the following places:
- perf_aux_output_begin(): the increment in refcount_inc_not_zero()
only guarantees a control dependency on success, vs. the fully
ordered atomic counterpart
- rb_free_aux(): the decrement in refcount_dec_and_test() only
provides RELEASE ordering and a control dependency on success,
vs. the fully ordered atomic counterpart

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: David Windsor <[email protected]>
Reviewed-by: Hans Liljestrand <[email protected]>
Signed-off-by: Elena Reshetova <[email protected]>
---
kernel/events/core.c | 2 +-
kernel/events/internal.h | 2 +-
kernel/events/ring_buffer.c | 6 +++---
3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3497c6a..5f087f4 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5095,7 +5095,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)

/* this has to be the last one */
rb_free_aux(rb);
- WARN_ON_ONCE(atomic_read(&rb->aux_refcount));
+ WARN_ON_ONCE(refcount_read(&rb->aux_refcount));

mutex_unlock(&event->mmap_mutex);
}
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 86c5c7f..50ecf00 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -49,7 +49,7 @@ struct ring_buffer {
atomic_t aux_mmap_count;
unsigned long aux_mmap_locked;
void (*free_aux)(void *);
- atomic_t aux_refcount;
+ refcount_t aux_refcount;
void **aux_pages;
void *aux_priv;

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index de12d36..b29d6ce 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -357,7 +357,7 @@ void *perf_aux_output_begin(struct perf_output_handle *handle,
if (!atomic_read(&rb->aux_mmap_count))
goto err;

- if (!atomic_inc_not_zero(&rb->aux_refcount))
+ if (!refcount_inc_not_zero(&rb->aux_refcount))
goto err;

/*
@@ -659,7 +659,7 @@ int rb_alloc_aux(struct ring_buffer *rb, struct perf_event *event,
* we keep a refcount here to make sure either of the two can
* reference them safely.
*/
- atomic_set(&rb->aux_refcount, 1);
+ refcount_set(&rb->aux_refcount, 1);

rb->aux_overwrite = overwrite;
rb->aux_watermark = watermark;
@@ -678,7 +678,7 @@ int rb_alloc_aux(struct ring_buffer *rb, struct perf_event *event,

void rb_free_aux(struct ring_buffer *rb)
{
- if (atomic_dec_and_test(&rb->aux_refcount))
+ if (refcount_dec_and_test(&rb->aux_refcount))
__rb_free_aux(rb);
}

--
2.7.4

