2017-03-03 08:57:36

by Elena Reshetova

Subject: [PATCH 00/17] fs, btrfs refcount conversions

Now that the new refcount_t type and API are finally merged
(see include/linux/refcount.h), the following patches convert
various refcounters in the btrfs filesystem from atomic_t to
refcount_t. By doing this we prevent intentional or accidental
underflows or overflows that can lead to use-after-free
vulnerabilities.
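
For illustration, each conversion has roughly the following shape (a
minimal sketch around a hypothetical struct foo, not taken from any
particular patch in this series):

#include <linux/refcount.h>
#include <linux/slab.h>

struct foo {
        refcount_t refs;
        /* ... payload ... */
};

static struct foo *foo_alloc(void)
{
        struct foo *f = kzalloc(sizeof(*f), GFP_KERNEL);

        if (f)
                refcount_set(&f->refs, 1);      /* the initial reference */
        return f;
}

static void foo_get(struct foo *f)
{
        refcount_inc(&f->refs); /* warns instead of wrapping on overflow */
}

static void foo_put(struct foo *f)
{
        if (refcount_dec_and_test(&f->refs))
                kfree(f);       /* last reference dropped, safe to free */
}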

The patches below are fully independent and can be cherry-picked separately.
Since we convert all kernel subsystems in the same fashion, resulting
in about 300 patches, we have to group them somehow to keep the sending
manageable. Please excuse the long cc list.

These patches have been tested with xfstests by running the btrfs-related
tests. btrfs debug was enabled, which also warns on refcount errors; no
refcount-related output was produced. However, the following errors did
show up during the run:
* tests btrfs/078, btrfs/114 and btrfs/115: no errors anywhere in dmesg,
but the process hangs. They all seem to be around qgroup; sometimes an
error such as "qgroup scan failed -4" is visible before it blocks, but
not always.
* test btrfs/104: dmesg has additional error output:
BTRFS warning (device vdc): qgroup 258 reserved space underflow, have: 0,
to free: 4096
I tried looking at the code for what causes the failure, but could not
figure it out. It doesn't seem to be related to any refcount changes, at
least IMO.

The above test failures are hard for me to understand and interpret, but
they don't seem to be related to the refcount conversions.

Elena Reshetova (17):
fs, btrfs: convert btrfs_bio.refs from atomic_t to refcount_t
fs, btrfs: convert btrfs_transaction.use_count from atomic_t to
refcount_t
fs, btrfs: convert extent_map.refs from atomic_t to refcount_t
fs, btrfs: convert btrfs_ordered_extent.refs from atomic_t to
refcount_t
fs, btrfs: convert btrfs_caching_control.count from atomic_t to
refcount_t
fs, btrfs: convert btrfs_delayed_ref_node.refs from atomic_t to
refcount_t
fs, btrfs: convert btrfs_delayed_node.refs from atomic_t to refcount_t
fs, btrfs: convert btrfs_delayed_item.refs from atomic_t to refcount_t
fs, btrfs: convert btrfs_root.refs from atomic_t to refcount_t
fs, btrfs: convert extent_state.refs from atomic_t to refcount_t
fs, btrfs: convert compressed_bio.pending_bios from atomic_t to
refcount_t
fs, btrfs: convert scrub_recover.refs from atomic_t to refcount_t
fs, btrfs: convert scrub_page.refs from atomic_t to refcount_t
fs, btrfs: convert scrub_block.refs from atomic_t to refcount_t
fs, btrfs: convert scrub_parity.refs from atomic_t to refcount_t
fs, btrfs: convert scrub_ctx.refs from atomic_t to refcount_t
fs, btrfs: convert btrfs_raid_bio.refs from atomic_t to refcount_t

fs/btrfs/backref.c | 2 +-
fs/btrfs/compression.c | 18 ++++++++---------
fs/btrfs/ctree.h | 5 +++--
fs/btrfs/delayed-inode.c | 46 ++++++++++++++++++++++----------------------
fs/btrfs/delayed-inode.h | 5 +++--
fs/btrfs/delayed-ref.c | 8 ++++----
fs/btrfs/delayed-ref.h | 8 +++++---
fs/btrfs/disk-io.c | 6 +++---
fs/btrfs/disk-io.h | 4 ++--
fs/btrfs/extent-tree.c | 20 +++++++++----------
fs/btrfs/extent_io.c | 18 ++++++++---------
fs/btrfs/extent_io.h | 3 ++-
fs/btrfs/extent_map.c | 10 +++++-----
fs/btrfs/extent_map.h | 3 ++-
fs/btrfs/ordered-data.c | 20 +++++++++----------
fs/btrfs/ordered-data.h | 2 +-
fs/btrfs/raid56.c | 19 +++++++++---------
fs/btrfs/scrub.c | 42 ++++++++++++++++++++--------------------
fs/btrfs/transaction.c | 20 +++++++++----------
fs/btrfs/transaction.h | 3 ++-
fs/btrfs/tree-log.c | 2 +-
fs/btrfs/volumes.c | 10 +++++-----
fs/btrfs/volumes.h | 2 +-
include/trace/events/btrfs.h | 4 ++--
24 files changed, 143 insertions(+), 137 deletions(-)

--
2.7.4


2017-03-03 08:57:35

by Elena Reshetova

Subject: [PATCH 01/17] fs, btrfs: convert btrfs_bio.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
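
Note that refcount_inc() itself already warns when asked to increment a
counter that has dropped to zero, which is why the existing WARN_ON() in
btrfs_get_bbio() survives mostly as documentation. Roughly (paraphrasing
include/linux/refcount.h, not quoting it literally):

void refcount_inc(refcount_t *r)
{
        /* on a zero count, refuse to increment and flag the use-after-free */
        WARN_ONCE(!refcount_inc_not_zero(r),
                  "refcount_t: increment on 0; use-after-free.\n");
}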

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/volumes.c | 8 ++++----
fs/btrfs/volumes.h | 2 +-
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b124462..a8fd2e9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5295,22 +5295,22 @@ static struct btrfs_bio *alloc_btrfs_bio(int total_stripes, int real_stripes)
GFP_NOFS|__GFP_NOFAIL);

atomic_set(&bbio->error, 0);
- atomic_set(&bbio->refs, 1);
+ refcount_set(&bbio->refs, 1);

return bbio;
}

void btrfs_get_bbio(struct btrfs_bio *bbio)
{
- WARN_ON(!atomic_read(&bbio->refs));
- atomic_inc(&bbio->refs);
+ WARN_ON(!refcount_read(&bbio->refs));
+ refcount_inc(&bbio->refs);
}

void btrfs_put_bbio(struct btrfs_bio *bbio)
{
if (!bbio)
return;
- if (atomic_dec_and_test(&bbio->refs))
+ if (refcount_dec_and_test(&bbio->refs))
kfree(bbio);
}

diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 59be812..ac0bf7d 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -298,7 +298,7 @@ struct btrfs_bio;
typedef void (btrfs_bio_end_io_t) (struct btrfs_bio *bio, int err);

struct btrfs_bio {
- atomic_t refs;
+ refcount_t refs;
atomic_t stripes_pending;
struct btrfs_fs_info *fs_info;
u64 map_type; /* get from map_lookup->type */
--
2.7.4

2017-03-03 08:59:20

by Elena Reshetova

Subject: [PATCH 02/17] fs, btrfs: convert btrfs_transaction.use_count from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/disk-io.c | 2 +-
fs/btrfs/extent-tree.c | 2 +-
fs/btrfs/ordered-data.c | 2 +-
fs/btrfs/transaction.c | 20 ++++++++++----------
fs/btrfs/transaction.h | 3 ++-
5 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 5f1b08c..dca6432 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4615,7 +4615,7 @@ static int btrfs_cleanup_transaction(struct btrfs_fs_info *fs_info)
t = list_first_entry(&fs_info->trans_list,
struct btrfs_transaction, list);
if (t->state >= TRANS_STATE_COMMIT_START) {
- atomic_inc(&t->use_count);
+ refcount_inc(&t->use_count);
spin_unlock(&fs_info->trans_lock);
btrfs_wait_for_commit(fs_info, t->transid);
btrfs_put_transaction(t);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 6079465..63ba295 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -10849,7 +10849,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device,
spin_lock(&fs_info->trans_lock);
trans = fs_info->running_transaction;
if (trans)
- atomic_inc(&trans->use_count);
+ refcount_inc(&trans->use_count);
spin_unlock(&fs_info->trans_lock);

ret = find_free_dev_extent_start(trans, device, minlen, start,
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 9a46878..da5399f 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -623,7 +623,7 @@ void btrfs_remove_ordered_extent(struct inode *inode,
spin_lock(&fs_info->trans_lock);
trans = fs_info->running_transaction;
if (trans)
- atomic_inc(&trans->use_count);
+ refcount_inc(&trans->use_count);
spin_unlock(&fs_info->trans_lock);

ASSERT(trans);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 61b807d..a7d7a7d 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -60,8 +60,8 @@ static const unsigned int btrfs_blocked_trans_types[TRANS_STATE_MAX] = {

void btrfs_put_transaction(struct btrfs_transaction *transaction)
{
- WARN_ON(atomic_read(&transaction->use_count) == 0);
- if (atomic_dec_and_test(&transaction->use_count)) {
+ WARN_ON(refcount_read(&transaction->use_count) == 0);
+ if (refcount_dec_and_test(&transaction->use_count)) {
BUG_ON(!list_empty(&transaction->list));
WARN_ON(!RB_EMPTY_ROOT(&transaction->delayed_refs.href_root));
if (transaction->delayed_refs.pending_csums)
@@ -207,7 +207,7 @@ static noinline int join_transaction(struct btrfs_fs_info *fs_info,
spin_unlock(&fs_info->trans_lock);
return -EBUSY;
}
- atomic_inc(&cur_trans->use_count);
+ refcount_inc(&cur_trans->use_count);
atomic_inc(&cur_trans->num_writers);
extwriter_counter_inc(cur_trans, type);
spin_unlock(&fs_info->trans_lock);
@@ -257,7 +257,7 @@ static noinline int join_transaction(struct btrfs_fs_info *fs_info,
* One for this trans handle, one so it will live on until we
* commit the transaction.
*/
- atomic_set(&cur_trans->use_count, 2);
+ refcount_set(&cur_trans->use_count, 2);
atomic_set(&cur_trans->pending_ordered, 0);
cur_trans->flags = 0;
cur_trans->start_time = get_seconds();
@@ -432,7 +432,7 @@ static void wait_current_trans(struct btrfs_fs_info *fs_info)
spin_lock(&fs_info->trans_lock);
cur_trans = fs_info->running_transaction;
if (cur_trans && is_transaction_blocked(cur_trans)) {
- atomic_inc(&cur_trans->use_count);
+ refcount_inc(&cur_trans->use_count);
spin_unlock(&fs_info->trans_lock);

wait_event(fs_info->transaction_wait,
@@ -744,7 +744,7 @@ int btrfs_wait_for_commit(struct btrfs_fs_info *fs_info, u64 transid)
list_for_each_entry(t, &fs_info->trans_list, list) {
if (t->transid == transid) {
cur_trans = t;
- atomic_inc(&cur_trans->use_count);
+ refcount_inc(&cur_trans->use_count);
ret = 0;
break;
}
@@ -773,7 +773,7 @@ int btrfs_wait_for_commit(struct btrfs_fs_info *fs_info, u64 transid)
if (t->state == TRANS_STATE_COMPLETED)
break;
cur_trans = t;
- atomic_inc(&cur_trans->use_count);
+ refcount_inc(&cur_trans->use_count);
break;
}
}
@@ -1839,7 +1839,7 @@ int btrfs_commit_transaction_async(struct btrfs_trans_handle *trans,

/* take transaction reference */
cur_trans = trans->transaction;
- atomic_inc(&cur_trans->use_count);
+ refcount_inc(&cur_trans->use_count);

btrfs_end_transaction(trans);

@@ -2015,7 +2015,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
spin_lock(&fs_info->trans_lock);
if (cur_trans->state >= TRANS_STATE_COMMIT_START) {
spin_unlock(&fs_info->trans_lock);
- atomic_inc(&cur_trans->use_count);
+ refcount_inc(&cur_trans->use_count);
ret = btrfs_end_transaction(trans);

wait_for_commit(cur_trans);
@@ -2035,7 +2035,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
prev_trans = list_entry(cur_trans->list.prev,
struct btrfs_transaction, list);
if (prev_trans->state != TRANS_STATE_COMPLETED) {
- atomic_inc(&prev_trans->use_count);
+ refcount_inc(&prev_trans->use_count);
spin_unlock(&fs_info->trans_lock);

wait_for_commit(prev_trans);
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index 5dfb559..2d9ad36 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -18,6 +18,7 @@

#ifndef __BTRFS_TRANSACTION__
#define __BTRFS_TRANSACTION__
+#include <linux/refcount.h>
#include "btrfs_inode.h"
#include "delayed-ref.h"
#include "ctree.h"
@@ -49,7 +50,7 @@ struct btrfs_transaction {
* transaction can end
*/
atomic_t num_writers;
- atomic_t use_count;
+ refcount_t use_count;
atomic_t pending_ordered;

unsigned long flags;
--
2.7.4

2017-03-03 09:02:37

by Elena Reshetova

Subject: [PATCH 04/17] fs, btrfs: convert btrfs_ordered_extent.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/ordered-data.c | 18 +++++++++---------
fs/btrfs/ordered-data.h | 2 +-
include/trace/events/btrfs.h | 2 +-
3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index da5399f..7b40e2e 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -212,7 +212,7 @@ static int __btrfs_add_ordered_extent(struct inode *inode, u64 file_offset,
set_bit(BTRFS_ORDERED_DIRECT, &entry->flags);

/* one ref for the tree */
- atomic_set(&entry->refs, 1);
+ refcount_set(&entry->refs, 1);
init_waitqueue_head(&entry->wait);
INIT_LIST_HEAD(&entry->list);
INIT_LIST_HEAD(&entry->root_extent_list);
@@ -358,7 +358,7 @@ int btrfs_dec_test_first_ordered_pending(struct inode *inode,
out:
if (!ret && cached && entry) {
*cached = entry;
- atomic_inc(&entry->refs);
+ refcount_inc(&entry->refs);
}
spin_unlock_irqrestore(&tree->lock, flags);
return ret == 0;
@@ -425,7 +425,7 @@ int btrfs_dec_test_ordered_pending(struct inode *inode,
out:
if (!ret && cached && entry) {
*cached = entry;
- atomic_inc(&entry->refs);
+ refcount_inc(&entry->refs);
}
spin_unlock_irqrestore(&tree->lock, flags);
return ret == 0;
@@ -456,7 +456,7 @@ void btrfs_get_logged_extents(struct btrfs_inode *inode,
if (test_and_set_bit(BTRFS_ORDERED_LOGGED, &ordered->flags))
continue;
list_add(&ordered->log_list, logged_list);
- atomic_inc(&ordered->refs);
+ refcount_inc(&ordered->refs);
}
spin_unlock_irq(&tree->lock);
}
@@ -565,7 +565,7 @@ void btrfs_put_ordered_extent(struct btrfs_ordered_extent *entry)

trace_btrfs_ordered_extent_put(entry->inode, entry);

- if (atomic_dec_and_test(&entry->refs)) {
+ if (refcount_dec_and_test(&entry->refs)) {
ASSERT(list_empty(&entry->log_list));
ASSERT(list_empty(&entry->trans_list));
ASSERT(list_empty(&entry->root_extent_list));
@@ -690,7 +690,7 @@ int btrfs_wait_ordered_extents(struct btrfs_root *root, int nr,

list_move_tail(&ordered->root_extent_list,
&root->ordered_extents);
- atomic_inc(&ordered->refs);
+ refcount_inc(&ordered->refs);
spin_unlock(&root->ordered_extent_lock);

btrfs_init_work(&ordered->flush_work,
@@ -870,7 +870,7 @@ struct btrfs_ordered_extent *btrfs_lookup_ordered_extent(struct inode *inode,
if (!offset_in_entry(entry, file_offset))
entry = NULL;
if (entry)
- atomic_inc(&entry->refs);
+ refcount_inc(&entry->refs);
out:
spin_unlock_irq(&tree->lock);
return entry;
@@ -911,7 +911,7 @@ struct btrfs_ordered_extent *btrfs_lookup_ordered_range(
}
out:
if (entry)
- atomic_inc(&entry->refs);
+ refcount_inc(&entry->refs);
spin_unlock_irq(&tree->lock);
return entry;
}
@@ -948,7 +948,7 @@ btrfs_lookup_first_ordered_extent(struct inode *inode, u64 file_offset)
goto out;

entry = rb_entry(node, struct btrfs_ordered_extent, rb_node);
- atomic_inc(&entry->refs);
+ refcount_inc(&entry->refs);
out:
spin_unlock_irq(&tree->lock);
return entry;
diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
index 195c93b..e0c1d5b 100644
--- a/fs/btrfs/ordered-data.h
+++ b/fs/btrfs/ordered-data.h
@@ -113,7 +113,7 @@ struct btrfs_ordered_extent {
int compress_type;

/* reference count */
- atomic_t refs;
+ refcount_t refs;

/* the inode we belong to */
struct inode *inode;
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index 9dd29e8..8f206263 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -275,7 +275,7 @@ DECLARE_EVENT_CLASS(btrfs__ordered_extent,
__entry->bytes_left = ordered->bytes_left;
__entry->flags = ordered->flags;
__entry->compress_type = ordered->compress_type;
- __entry->refs = atomic_read(&ordered->refs);
+ __entry->refs = refcount_read(&ordered->refs);
__entry->root_objectid =
BTRFS_I(inode)->root->root_key.objectid;
__entry->truncated_len = ordered->truncated_len;
--
2.7.4

2017-03-03 09:04:43

by Elena Reshetova

Subject: [PATCH 05/17] fs, btrfs: convert btrfs_caching_control.count from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/ctree.h | 3 ++-
fs/btrfs/extent-tree.c | 12 ++++++------
2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 00e3518..557af39 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -38,6 +38,7 @@
#include <linux/security.h>
#include <linux/sizes.h>
#include <linux/dynamic_debug.h>
+#include <linux/refcount.h>
#include "extent_io.h"
#include "extent_map.h"
#include "async-thread.h"
@@ -517,7 +518,7 @@ struct btrfs_caching_control {
struct btrfs_work work;
struct btrfs_block_group_cache *block_group;
u64 progress;
- atomic_t count;
+ refcount_t count;
};

/* Once caching_thread() finds this much free space, it will wake up waiters. */
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 63ba295..796001b 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -315,14 +315,14 @@ get_caching_control(struct btrfs_block_group_cache *cache)
}

ctl = cache->caching_ctl;
- atomic_inc(&ctl->count);
+ refcount_inc(&ctl->count);
spin_unlock(&cache->lock);
return ctl;
}

static void put_caching_control(struct btrfs_caching_control *ctl)
{
- if (atomic_dec_and_test(&ctl->count))
+ if (refcount_dec_and_test(&ctl->count))
kfree(ctl);
}

@@ -598,7 +598,7 @@ static int cache_block_group(struct btrfs_block_group_cache *cache,
init_waitqueue_head(&caching_ctl->wait);
caching_ctl->block_group = cache;
caching_ctl->progress = cache->key.objectid;
- atomic_set(&caching_ctl->count, 1);
+ refcount_set(&caching_ctl->count, 1);
btrfs_init_work(&caching_ctl->work, btrfs_cache_helper,
caching_thread, NULL, NULL);

@@ -619,7 +619,7 @@ static int cache_block_group(struct btrfs_block_group_cache *cache,
struct btrfs_caching_control *ctl;

ctl = cache->caching_ctl;
- atomic_inc(&ctl->count);
+ refcount_inc(&ctl->count);
prepare_to_wait(&ctl->wait, &wait, TASK_UNINTERRUPTIBLE);
spin_unlock(&cache->lock);

@@ -706,7 +706,7 @@ static int cache_block_group(struct btrfs_block_group_cache *cache,
}

down_write(&fs_info->commit_root_sem);
- atomic_inc(&caching_ctl->count);
+ refcount_inc(&caching_ctl->count);
list_add_tail(&caching_ctl->list, &fs_info->caching_block_groups);
up_write(&fs_info->commit_root_sem);

@@ -10415,7 +10415,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
&fs_info->caching_block_groups, list)
if (ctl->block_group == block_group) {
caching_ctl = ctl;
- atomic_inc(&caching_ctl->count);
+ refcount_inc(&caching_ctl->count);
break;
}
}
--
2.7.4

2017-03-03 09:04:42

by Elena Reshetova

Subject: [PATCH 03/17] fs, btrfs: convert extent_map.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/extent_io.c | 4 ++--
fs/btrfs/extent_map.c | 10 +++++-----
fs/btrfs/extent_map.h | 3 ++-
fs/btrfs/tree-log.c | 2 +-
fs/btrfs/volumes.c | 2 +-
include/trace/events/btrfs.h | 2 +-
6 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c3abf84..3cf0b02 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2852,7 +2852,7 @@ __get_extent_map(struct inode *inode, struct page *page, size_t pg_offset,
em = *em_cached;
if (extent_map_in_tree(em) && start >= em->start &&
start < extent_map_end(em)) {
- atomic_inc(&em->refs);
+ refcount_inc(&em->refs);
return em;
}

@@ -2863,7 +2863,7 @@ __get_extent_map(struct inode *inode, struct page *page, size_t pg_offset,
em = get_extent(BTRFS_I(inode), page, pg_offset, start, len, 0);
if (em_cached && !IS_ERR_OR_NULL(em)) {
BUG_ON(*em_cached);
- atomic_inc(&em->refs);
+ refcount_inc(&em->refs);
*em_cached = em;
}
return em;
diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 26f9ac7..6985015 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -55,7 +55,7 @@ struct extent_map *alloc_extent_map(void)
em->flags = 0;
em->compress_type = BTRFS_COMPRESS_NONE;
em->generation = 0;
- atomic_set(&em->refs, 1);
+ refcount_set(&em->refs, 1);
INIT_LIST_HEAD(&em->list);
return em;
}
@@ -71,8 +71,8 @@ void free_extent_map(struct extent_map *em)
{
if (!em)
return;
- WARN_ON(atomic_read(&em->refs) == 0);
- if (atomic_dec_and_test(&em->refs)) {
+ WARN_ON(refcount_read(&em->refs) == 0);
+ if (refcount_dec_and_test(&em->refs)) {
WARN_ON(extent_map_in_tree(em));
WARN_ON(!list_empty(&em->list));
if (test_bit(EXTENT_FLAG_FS_MAPPING, &em->flags))
@@ -322,7 +322,7 @@ static inline void setup_extent_mapping(struct extent_map_tree *tree,
struct extent_map *em,
int modified)
{
- atomic_inc(&em->refs);
+ refcount_inc(&em->refs);
em->mod_start = em->start;
em->mod_len = em->len;

@@ -381,7 +381,7 @@ __lookup_extent_mapping(struct extent_map_tree *tree,
if (strict && !(end > em->start && start < extent_map_end(em)))
return NULL;

- atomic_inc(&em->refs);
+ refcount_inc(&em->refs);
return em;
}

diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h
index eb8b8fa..a67b2de 100644
--- a/fs/btrfs/extent_map.h
+++ b/fs/btrfs/extent_map.h
@@ -2,6 +2,7 @@
#define __EXTENTMAP__

#include <linux/rbtree.h>
+#include <linux/refcount.h>

#define EXTENT_MAP_LAST_BYTE ((u64)-4)
#define EXTENT_MAP_HOLE ((u64)-3)
@@ -41,7 +42,7 @@ struct extent_map {
*/
struct map_lookup *map_lookup;
};
- atomic_t refs;
+ refcount_t refs;
unsigned int compress_type;
struct list_head list;
};
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index a59674c..ccfe9fe 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4196,7 +4196,7 @@ static int btrfs_log_changed_extents(struct btrfs_trans_handle *trans,
if (em->generation <= test_gen)
continue;
/* Need a ref to keep it from getting evicted from cache */
- atomic_inc(&em->refs);
+ refcount_inc(&em->refs);
set_bit(EXTENT_FLAG_LOGGING, &em->flags);
list_add_tail(&em->list, &extents);
num++;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a8fd2e9..c23a383 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4833,7 +4833,7 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
ret = add_extent_mapping(em_tree, em, 0);
if (!ret) {
list_add_tail(&em->list, &trans->transaction->pending_chunks);
- atomic_inc(&em->refs);
+ refcount_inc(&em->refs);
}
write_unlock(&em_tree->lock);
if (ret) {
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index a3c3cab..9dd29e8 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -213,7 +213,7 @@ TRACE_EVENT_CONDITION(btrfs_get_extent,
__entry->block_start = map->block_start;
__entry->block_len = map->block_len;
__entry->flags = map->flags;
- __entry->refs = atomic_read(&map->refs);
+ __entry->refs = refcount_read(&map->refs);
__entry->compress_type = map->compress_type;
),

--
2.7.4

2017-03-03 09:06:14

by Elena Reshetova

Subject: [PATCH 17/17] fs, btrfs: convert btrfs_raid_bio.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
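
Unlike the other patches in this series, this one also drops the
WARN_ON(atomic_read(&rbio->refs) < 0) in __free_raid_bio() instead of
converting it: refcount_read() returns an unsigned value, so the
comparison could never fire, and refcount_dec_and_test() warns on
underflow by itself. Schematically:

/* before: the underflow check had to be open-coded */
WARN_ON(atomic_read(&rbio->refs) < 0);
if (!atomic_dec_and_test(&rbio->refs))
        return;

/* after: the equivalent check lives inside the refcount API */
if (!refcount_dec_and_test(&rbio->refs))
        return;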

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/raid56.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 1571bf2..a8954f5 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -149,7 +149,7 @@ struct btrfs_raid_bio {

int generic_bio_cnt;

- atomic_t refs;
+ refcount_t refs;

atomic_t stripes_pending;

@@ -389,7 +389,7 @@ static void __remove_rbio_from_cache(struct btrfs_raid_bio *rbio)
if (bio_list_empty(&rbio->bio_list)) {
if (!list_empty(&rbio->hash_list)) {
list_del_init(&rbio->hash_list);
- atomic_dec(&rbio->refs);
+ refcount_dec(&rbio->refs);
BUG_ON(!list_empty(&rbio->plug_list));
}
}
@@ -480,7 +480,7 @@ static void cache_rbio(struct btrfs_raid_bio *rbio)

/* bump our ref if we were not in the list before */
if (!test_and_set_bit(RBIO_CACHE_BIT, &rbio->flags))
- atomic_inc(&rbio->refs);
+ refcount_inc(&rbio->refs);

if (!list_empty(&rbio->stripe_cache)){
list_move(&rbio->stripe_cache, &table->stripe_cache);
@@ -689,7 +689,7 @@ static noinline int lock_stripe_add(struct btrfs_raid_bio *rbio)
test_bit(RBIO_CACHE_BIT, &cur->flags) &&
!test_bit(RBIO_RMW_LOCKED_BIT, &cur->flags)) {
list_del_init(&cur->hash_list);
- atomic_dec(&cur->refs);
+ refcount_dec(&cur->refs);

steal_rbio(cur, rbio);
cache_drop = cur;
@@ -738,7 +738,7 @@ static noinline int lock_stripe_add(struct btrfs_raid_bio *rbio)
}
}
lockit:
- atomic_inc(&rbio->refs);
+ refcount_inc(&rbio->refs);
list_add(&rbio->hash_list, &h->hash_list);
out:
spin_unlock_irqrestore(&h->lock, flags);
@@ -784,7 +784,7 @@ static noinline void unlock_stripe(struct btrfs_raid_bio *rbio)
}

list_del_init(&rbio->hash_list);
- atomic_dec(&rbio->refs);
+ refcount_dec(&rbio->refs);

/*
* we use the plug list to hold all the rbios
@@ -801,7 +801,7 @@ static noinline void unlock_stripe(struct btrfs_raid_bio *rbio)
list_del_init(&rbio->plug_list);

list_add(&next->hash_list, &h->hash_list);
- atomic_inc(&next->refs);
+ refcount_inc(&next->refs);
spin_unlock(&rbio->bio_list_lock);
spin_unlock_irqrestore(&h->lock, flags);

@@ -843,8 +843,7 @@ static void __free_raid_bio(struct btrfs_raid_bio *rbio)
{
int i;

- WARN_ON(atomic_read(&rbio->refs) < 0);
- if (!atomic_dec_and_test(&rbio->refs))
+ if (!refcount_dec_and_test(&rbio->refs))
return;

WARN_ON(!list_empty(&rbio->stripe_cache));
@@ -997,7 +996,7 @@ static struct btrfs_raid_bio *alloc_rbio(struct btrfs_fs_info *fs_info,
rbio->stripe_npages = stripe_npages;
rbio->faila = -1;
rbio->failb = -1;
- atomic_set(&rbio->refs, 1);
+ refcount_set(&rbio->refs, 1);
atomic_set(&rbio->error, 0);
atomic_set(&rbio->stripes_pending, 0);

--
2.7.4

2017-03-03 09:06:11

by Elena Reshetova

Subject: [PATCH 16/17] fs, btrfs: convert scrub_ctx.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/scrub.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 4d15112..ace634e 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -202,7 +202,7 @@ struct scrub_ctx {
* doesn't free the scrub context before or while the workers are
* doing the wakeup() call.
*/
- atomic_t refs;
+ refcount_t refs;
};

struct scrub_fixup_nodatasum {
@@ -305,7 +305,7 @@ static void scrub_put_ctx(struct scrub_ctx *sctx);

static void scrub_pending_bio_inc(struct scrub_ctx *sctx)
{
- atomic_inc(&sctx->refs);
+ refcount_inc(&sctx->refs);
atomic_inc(&sctx->bios_in_flight);
}

@@ -356,7 +356,7 @@ static void scrub_pending_trans_workers_inc(struct scrub_ctx *sctx)
{
struct btrfs_fs_info *fs_info = sctx->fs_info;

- atomic_inc(&sctx->refs);
+ refcount_inc(&sctx->refs);
/*
* increment scrubs_running to prevent cancel requests from
* completing as long as a worker is running. we must also
@@ -447,7 +447,7 @@ static noinline_for_stack void scrub_free_ctx(struct scrub_ctx *sctx)

static void scrub_put_ctx(struct scrub_ctx *sctx)
{
- if (atomic_dec_and_test(&sctx->refs))
+ if (refcount_dec_and_test(&sctx->refs))
scrub_free_ctx(sctx);
}

@@ -462,7 +462,7 @@ struct scrub_ctx *scrub_setup_ctx(struct btrfs_device *dev, int is_dev_replace)
sctx = kzalloc(sizeof(*sctx), GFP_KERNEL);
if (!sctx)
goto nomem;
- atomic_set(&sctx->refs, 1);
+ refcount_set(&sctx->refs, 1);
sctx->is_dev_replace = is_dev_replace;
sctx->pages_per_rd_bio = SCRUB_PAGES_PER_RD_BIO;
sctx->curr = -1;
--
2.7.4

2017-03-03 09:06:09

by Elena Reshetova

Subject: [PATCH 15/17] fs, btrfs: convert scrub_parity.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/scrub.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 08895d8..4d15112 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -142,7 +142,7 @@ struct scrub_parity {

int stripe_len;

- atomic_t refs;
+ refcount_t refs;

struct list_head spages;

@@ -2822,12 +2822,12 @@ static inline int scrub_calc_parity_bitmap_len(int nsectors)

static void scrub_parity_get(struct scrub_parity *sparity)
{
- atomic_inc(&sparity->refs);
+ refcount_inc(&sparity->refs);
}

static void scrub_parity_put(struct scrub_parity *sparity)
{
- if (!atomic_dec_and_test(&sparity->refs))
+ if (!refcount_dec_and_test(&sparity->refs))
return;

scrub_parity_check_and_repair(sparity);
@@ -2879,7 +2879,7 @@ static noinline_for_stack int scrub_raid56_parity(struct scrub_ctx *sctx,
sparity->scrub_dev = sdev;
sparity->logic_start = logic_start;
sparity->logic_end = logic_end;
- atomic_set(&sparity->refs, 1);
+ refcount_set(&sparity->refs, 1);
INIT_LIST_HEAD(&sparity->spages);
sparity->dbitmap = sparity->bitmap;
sparity->ebitmap = (void *)sparity->bitmap + bitmap_len;
--
2.7.4

2017-03-03 09:07:42

by Elena Reshetova

Subject: [PATCH 09/17] fs, btrfs: convert btrfs_root.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
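
btrfs_grab_fs_root() is the one conversion here that maps
atomic_inc_not_zero() to refcount_inc_not_zero(): the root is reached
through a lookup structure while the last reference may be dropped
concurrently, so taking a new reference has to fail rather than
resurrect a counter that already hit zero. The generic shape of that
pattern (hypothetical names, for illustration only):

struct foo *foo_try_grab(struct foo *candidate)
{
        /* take a reference only if the object is still live */
        if (refcount_inc_not_zero(&candidate->refs))
                return candidate;
        return NULL;    /* a racing free won; the caller must redo the lookup */
}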

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/ctree.h | 2 +-
fs/btrfs/disk-io.c | 2 +-
fs/btrfs/disk-io.h | 4 ++--
3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 557af39..c01bfca 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1221,7 +1221,7 @@ struct btrfs_root {
dev_t anon_dev;

spinlock_t root_item_lock;
- atomic_t refs;
+ refcount_t refs;

struct mutex delalloc_mutex;
spinlock_t delalloc_lock;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 913df60..ca89ae3 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1340,7 +1340,7 @@ static void __setup_root(struct btrfs_root *root, struct btrfs_fs_info *fs_info,
atomic_set(&root->log_writers, 0);
atomic_set(&root->log_batch, 0);
atomic_set(&root->orphan_inodes, 0);
- atomic_set(&root->refs, 1);
+ refcount_set(&root->refs, 1);
atomic_set(&root->will_be_snapshoted, 0);
atomic_set(&root->qgroup_meta_rsv, 0);
root->log_transid = 0;
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 2e0ec29..21f1ceb 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -101,14 +101,14 @@ struct btrfs_root *btrfs_alloc_dummy_root(struct btrfs_fs_info *fs_info);
*/
static inline struct btrfs_root *btrfs_grab_fs_root(struct btrfs_root *root)
{
- if (atomic_inc_not_zero(&root->refs))
+ if (refcount_inc_not_zero(&root->refs))
return root;
return NULL;
}

static inline void btrfs_put_fs_root(struct btrfs_root *root)
{
- if (atomic_dec_and_test(&root->refs))
+ if (refcount_dec_and_test(&root->refs))
kfree(root);
}

--
2.7.4

2017-03-03 09:08:24

by Elena Reshetova

Subject: [PATCH 10/17] fs, btrfs: convert extent_state.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/extent_io.c | 14 +++++++-------
fs/btrfs/extent_io.h | 3 ++-
2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 3cf0b02..0702be6 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -68,7 +68,7 @@ void btrfs_leak_debug_check(void)
pr_err("BTRFS: state leak: start %llu end %llu state %u in tree %d refs %d\n",
state->start, state->end, state->state,
extent_state_in_tree(state),
- atomic_read(&state->refs));
+ refcount_read(&state->refs));
list_del(&state->leak_list);
kmem_cache_free(extent_state_cache, state);
}
@@ -238,7 +238,7 @@ static struct extent_state *alloc_extent_state(gfp_t mask)
state->failrec = NULL;
RB_CLEAR_NODE(&state->rb_node);
btrfs_leak_debug_add(&state->leak_list, &states);
- atomic_set(&state->refs, 1);
+ refcount_set(&state->refs, 1);
init_waitqueue_head(&state->wq);
trace_alloc_extent_state(state, mask, _RET_IP_);
return state;
@@ -248,7 +248,7 @@ void free_extent_state(struct extent_state *state)
{
if (!state)
return;
- if (atomic_dec_and_test(&state->refs)) {
+ if (refcount_dec_and_test(&state->refs)) {
WARN_ON(extent_state_in_tree(state));
btrfs_leak_debug_del(&state->leak_list);
trace_free_extent_state(state, _RET_IP_);
@@ -641,7 +641,7 @@ static int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
if (cached && extent_state_in_tree(cached) &&
cached->start <= start && cached->end > start) {
if (clear)
- atomic_dec(&cached->refs);
+ refcount_dec(&cached->refs);
state = cached;
goto hit_next;
}
@@ -793,7 +793,7 @@ static void wait_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,

if (state->state & bits) {
start = state->start;
- atomic_inc(&state->refs);
+ refcount_inc(&state->refs);
wait_on_state(tree, state);
free_extent_state(state);
goto again;
@@ -834,7 +834,7 @@ static void cache_state_if_flags(struct extent_state *state,
if (cached_ptr && !(*cached_ptr)) {
if (!flags || (state->state & flags)) {
*cached_ptr = state;
- atomic_inc(&state->refs);
+ refcount_inc(&state->refs);
}
}
}
@@ -1538,7 +1538,7 @@ static noinline u64 find_delalloc_range(struct extent_io_tree *tree,
if (!found) {
*start = state->start;
*cached_state = state;
- atomic_inc(&state->refs);
+ refcount_inc(&state->refs);
}
found++;
*end = state->end;
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index c16260c..9ea9338 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -2,6 +2,7 @@
#define __EXTENTIO__

#include <linux/rbtree.h>
+#include <linux/refcount.h>
#include "ulist.h"

/* bits for the extent state */
@@ -134,7 +135,7 @@ struct extent_state {

/* ADD NEW ELEMENTS AFTER THIS */
wait_queue_head_t wq;
- atomic_t refs;
+ refcount_t refs;
unsigned state;

struct io_failure_record *failrec;
--
2.7.4

2017-03-03 09:29:32

by Elena Reshetova

Subject: [PATCH 08/17] fs, btrfs: convert btrfs_delayed_item.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/delayed-inode.c | 18 +++++++++---------
fs/btrfs/delayed-inode.h | 2 +-
2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 7396c36..8ae409b 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -308,7 +308,7 @@ static struct btrfs_delayed_item *btrfs_alloc_delayed_item(u32 data_len)
item->ins_or_del = 0;
item->bytes_reserved = 0;
item->delayed_node = NULL;
- atomic_set(&item->refs, 1);
+ refcount_set(&item->refs, 1);
}
return item;
}
@@ -483,7 +483,7 @@ static void btrfs_release_delayed_item(struct btrfs_delayed_item *item)
{
if (item) {
__btrfs_remove_delayed_item(item);
- if (atomic_dec_and_test(&item->refs))
+ if (refcount_dec_and_test(&item->refs))
kfree(item);
}
}
@@ -1600,14 +1600,14 @@ bool btrfs_readdir_get_delayed_items(struct inode *inode,
mutex_lock(&delayed_node->mutex);
item = __btrfs_first_delayed_insertion_item(delayed_node);
while (item) {
- atomic_inc(&item->refs);
+ refcount_inc(&item->refs);
list_add_tail(&item->readdir_list, ins_list);
item = __btrfs_next_delayed_item(item);
}

item = __btrfs_first_delayed_deletion_item(delayed_node);
while (item) {
- atomic_inc(&item->refs);
+ refcount_inc(&item->refs);
list_add_tail(&item->readdir_list, del_list);
item = __btrfs_next_delayed_item(item);
}
@@ -1634,13 +1634,13 @@ void btrfs_readdir_put_delayed_items(struct inode *inode,

list_for_each_entry_safe(curr, next, ins_list, readdir_list) {
list_del(&curr->readdir_list);
- if (atomic_dec_and_test(&curr->refs))
+ if (refcount_dec_and_test(&curr->refs))
kfree(curr);
}

list_for_each_entry_safe(curr, next, del_list, readdir_list) {
list_del(&curr->readdir_list);
- if (atomic_dec_and_test(&curr->refs))
+ if (refcount_dec_and_test(&curr->refs))
kfree(curr);
}

@@ -1667,7 +1667,7 @@ int btrfs_should_delete_dir_index(struct list_head *del_list,
list_del(&curr->readdir_list);
ret = (curr->key.offset == index);

- if (atomic_dec_and_test(&curr->refs))
+ if (refcount_dec_and_test(&curr->refs))
kfree(curr);

if (ret)
@@ -1705,7 +1705,7 @@ int btrfs_readdir_delayed_dir_index(struct dir_context *ctx,
list_del(&curr->readdir_list);

if (curr->key.offset < ctx->pos) {
- if (atomic_dec_and_test(&curr->refs))
+ if (refcount_dec_and_test(&curr->refs))
kfree(curr);
continue;
}
@@ -1722,7 +1722,7 @@ int btrfs_readdir_delayed_dir_index(struct dir_context *ctx,
over = !dir_emit(ctx, name, name_len,
location.objectid, d_type);

- if (atomic_dec_and_test(&curr->refs))
+ if (refcount_dec_and_test(&curr->refs))
kfree(curr);

if (over)
diff --git a/fs/btrfs/delayed-inode.h b/fs/btrfs/delayed-inode.h
index d234974..6d4f5a0 100644
--- a/fs/btrfs/delayed-inode.h
+++ b/fs/btrfs/delayed-inode.h
@@ -81,7 +81,7 @@ struct btrfs_delayed_item {
struct list_head readdir_list; /* used for readdir items */
u64 bytes_reserved;
struct btrfs_delayed_node *delayed_node;
- atomic_t refs;
+ refcount_t refs;
int ins_or_del;
u32 data_len;
char data[0];
--
2.7.4

2017-03-03 09:53:18

by Elena Reshetova

Subject: [PATCH 14/17] fs, btrfs: convert scrub_block.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/scrub.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 8299f64..08895d8 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -112,7 +112,7 @@ struct scrub_block {
struct scrub_page *pagev[SCRUB_MAX_PAGES_PER_BLOCK];
int page_count;
atomic_t outstanding_pages;
- atomic_t refs; /* free mem on transition to zero */
+ refcount_t refs; /* free mem on transition to zero */
struct scrub_ctx *sctx;
struct scrub_parity *sparity;
struct {
@@ -1998,12 +1998,12 @@ static int scrub_checksum_super(struct scrub_block *sblock)

static void scrub_block_get(struct scrub_block *sblock)
{
- atomic_inc(&sblock->refs);
+ refcount_inc(&sblock->refs);
}

static void scrub_block_put(struct scrub_block *sblock)
{
- if (atomic_dec_and_test(&sblock->refs)) {
+ if (refcount_dec_and_test(&sblock->refs)) {
int i;

if (sblock->sparity)
@@ -2255,7 +2255,7 @@ static int scrub_pages(struct scrub_ctx *sctx, u64 logical, u64 len,

/* one ref inside this function, plus one for each page added to
* a bio later on */
- atomic_set(&sblock->refs, 1);
+ refcount_set(&sblock->refs, 1);
sblock->sctx = sctx;
sblock->no_io_error_seen = 1;

@@ -2555,7 +2555,7 @@ static int scrub_pages_for_parity(struct scrub_parity *sparity,

/* one ref inside this function, plus one for each page added to
* a bio later on */
- atomic_set(&sblock->refs, 1);
+ refcount_set(&sblock->refs, 1);
sblock->sctx = sctx;
sblock->no_io_error_seen = 1;
sblock->sparity = sparity;
--
2.7.4

2017-03-03 09:54:56

by Elena Reshetova

Subject: [PATCH 13/17] fs, btrfs: convert scrub_page.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/scrub.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index c9406bf..8299f64 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -79,7 +79,7 @@ struct scrub_page {
u64 logical;
u64 physical;
u64 physical_for_dev_replace;
- atomic_t refs;
+ refcount_t refs;
struct {
unsigned int mirror_num:8;
unsigned int have_csum:1;
@@ -2017,12 +2017,12 @@ static void scrub_block_put(struct scrub_block *sblock)

static void scrub_page_get(struct scrub_page *spage)
{
- atomic_inc(&spage->refs);
+ refcount_inc(&spage->refs);
}

static void scrub_page_put(struct scrub_page *spage)
{
- if (atomic_dec_and_test(&spage->refs)) {
+ if (refcount_dec_and_test(&spage->refs)) {
if (spage->page)
__free_page(spage->page);
kfree(spage);
--
2.7.4

2017-03-03 10:35:57

by Elena Reshetova

Subject: [PATCH 11/17] fs, btrfs: convert compressed_bio.pending_bios from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
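
One hunk here is intentionally not a 1:1 substitution: the first
reference on cb->pending_bios is taken with
refcount_set(&cb->pending_bios, 1) rather than with an increment,
because the counter is initialized to 0 and refcount_inc() deliberately
warns when incrementing from zero. Schematically:

refcount_set(&cb->pending_bios, 0);     /* at allocation */
refcount_set(&cb->pending_bios, 1);     /* first ref: inc from 0 would warn */
refcount_inc(&cb->pending_bios);        /* every additional ref */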

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/compression.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index c7721a6..10e6b28 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -44,7 +44,7 @@

struct compressed_bio {
/* number of bios pending for this compressed extent */
- atomic_t pending_bios;
+ refcount_t pending_bios;

/* the pages with the compressed data on them */
struct page **compressed_pages;
@@ -161,7 +161,7 @@ static void end_compressed_bio_read(struct bio *bio)
/* if there are more bios still pending for this compressed
* extent, just exit
*/
- if (!atomic_dec_and_test(&cb->pending_bios))
+ if (!refcount_dec_and_test(&cb->pending_bios))
goto out;

inode = cb->inode;
@@ -274,7 +274,7 @@ static void end_compressed_bio_write(struct bio *bio)
/* if there are more bios still pending for this compressed
* extent, just exit
*/
- if (!atomic_dec_and_test(&cb->pending_bios))
+ if (!refcount_dec_and_test(&cb->pending_bios))
goto out;

/* ok, we're the last bio for this extent, step one is to
@@ -342,7 +342,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 start,
cb = kmalloc(compressed_bio_size(fs_info, compressed_len), GFP_NOFS);
if (!cb)
return -ENOMEM;
- atomic_set(&cb->pending_bios, 0);
+ refcount_set(&cb->pending_bios, 0);
cb->errors = 0;
cb->inode = inode;
cb->start = start;
@@ -363,7 +363,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 start,
bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
bio->bi_private = cb;
bio->bi_end_io = end_compressed_bio_write;
- atomic_inc(&cb->pending_bios);
+ refcount_set(&cb->pending_bios, 1);

/* create and submit bios for the compressed pages */
bytes_left = compressed_len;
@@ -388,7 +388,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 start,
* we inc the count. Otherwise, the cb might get
* freed before we're done setting it up
*/
- atomic_inc(&cb->pending_bios);
+ refcount_inc(&cb->pending_bios);
ret = btrfs_bio_wq_end_io(fs_info, bio,
BTRFS_WQ_ENDIO_DATA);
BUG_ON(ret); /* -ENOMEM */
@@ -607,7 +607,7 @@ int btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
if (!cb)
goto out;

- atomic_set(&cb->pending_bios, 0);
+ refcount_set(&cb->pending_bios, 0);
cb->errors = 0;
cb->inode = inode;
cb->mirror_num = mirror_num;
@@ -656,7 +656,7 @@ int btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
bio_set_op_attrs (comp_bio, REQ_OP_READ, 0);
comp_bio->bi_private = cb;
comp_bio->bi_end_io = end_compressed_bio_read;
- atomic_inc(&cb->pending_bios);
+ refcount_set(&cb->pending_bios, 1);

for (pg_index = 0; pg_index < nr_pages; pg_index++) {
page = cb->compressed_pages[pg_index];
@@ -685,7 +685,7 @@ int btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
* we inc the count. Otherwise, the cb might get
* freed before we're done setting it up
*/
- atomic_inc(&cb->pending_bios);
+ refcount_inc(&cb->pending_bios);

if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)) {
ret = btrfs_lookup_bio_sums(inode, comp_bio,
--
2.7.4

2017-03-03 12:32:48

by Elena Reshetova

Subject: [PATCH 06/17] fs, btrfs: convert btrfs_delayed_ref_node.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
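
Most of the call sites touched here share one pattern: when
mutex_trylock() on a delayed ref head fails, a reference is taken on
head->node so the head cannot be freed while we sleep on the mutex.
Roughly (the put side is outside the hunks below, sketched from the
surrounding code):

refcount_inc(&head->node.refs);         /* pin the head */
spin_unlock(&delayed_refs->lock);
mutex_lock(&head->mutex);               /* may sleep; the head stays valid */
mutex_unlock(&head->mutex);
btrfs_put_delayed_ref(&head->node);     /* drop the pin and retry */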

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/backref.c | 2 +-
fs/btrfs/delayed-ref.c | 8 ++++----
fs/btrfs/delayed-ref.h | 8 +++++---
fs/btrfs/disk-io.c | 2 +-
fs/btrfs/extent-tree.c | 6 +++---
5 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 7699e16..1163383 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -1286,7 +1286,7 @@ static int find_parent_nodes(struct btrfs_trans_handle *trans,
head = btrfs_find_delayed_ref_head(delayed_refs, bytenr);
if (head) {
if (!mutex_trylock(&head->mutex)) {
- atomic_inc(&head->node.refs);
+ refcount_inc(&head->node.refs);
spin_unlock(&delayed_refs->lock);

btrfs_release_path(path);
diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index 6eb8095..be70d90 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -164,7 +164,7 @@ int btrfs_delayed_ref_lock(struct btrfs_trans_handle *trans,
if (mutex_trylock(&head->mutex))
return 0;

- atomic_inc(&head->node.refs);
+ refcount_inc(&head->node.refs);
spin_unlock(&delayed_refs->lock);

mutex_lock(&head->mutex);
@@ -590,7 +590,7 @@ add_delayed_ref_head(struct btrfs_fs_info *fs_info,
delayed_refs = &trans->transaction->delayed_refs;

/* first set the basic ref node struct up */
- atomic_set(&ref->refs, 1);
+ refcount_set(&ref->refs, 1);
ref->bytenr = bytenr;
ref->num_bytes = num_bytes;
ref->ref_mod = count_mod;
@@ -682,7 +682,7 @@ add_delayed_tree_ref(struct btrfs_fs_info *fs_info,
delayed_refs = &trans->transaction->delayed_refs;

/* first set the basic ref node struct up */
- atomic_set(&ref->refs, 1);
+ refcount_set(&ref->refs, 1);
ref->bytenr = bytenr;
ref->num_bytes = num_bytes;
ref->ref_mod = 1;
@@ -739,7 +739,7 @@ add_delayed_data_ref(struct btrfs_fs_info *fs_info,
seq = atomic64_read(&fs_info->tree_mod_seq);

/* first set the basic ref node struct up */
- atomic_set(&ref->refs, 1);
+ refcount_set(&ref->refs, 1);
ref->bytenr = bytenr;
ref->num_bytes = num_bytes;
ref->ref_mod = 1;
diff --git a/fs/btrfs/delayed-ref.h b/fs/btrfs/delayed-ref.h
index 0e537f9..c0264ff 100644
--- a/fs/btrfs/delayed-ref.h
+++ b/fs/btrfs/delayed-ref.h
@@ -18,6 +18,8 @@
#ifndef __DELAYED_REF__
#define __DELAYED_REF__

+#include <linux/refcount.h>
+
/* these are the possible values of struct btrfs_delayed_ref_node->action */
#define BTRFS_ADD_DELAYED_REF 1 /* add one backref to the tree */
#define BTRFS_DROP_DELAYED_REF 2 /* delete one backref from the tree */
@@ -53,7 +55,7 @@ struct btrfs_delayed_ref_node {
u64 seq;

/* ref count on this data structure */
- atomic_t refs;
+ refcount_t refs;

/*
* how many refs is this entry adding or deleting. For
@@ -220,8 +222,8 @@ btrfs_free_delayed_extent_op(struct btrfs_delayed_extent_op *op)

static inline void btrfs_put_delayed_ref(struct btrfs_delayed_ref_node *ref)
{
- WARN_ON(atomic_read(&ref->refs) == 0);
- if (atomic_dec_and_test(&ref->refs)) {
+ WARN_ON(refcount_read(&ref->refs) == 0);
+ if (refcount_dec_and_test(&ref->refs)) {
WARN_ON(ref->in_tree);
switch (ref->type) {
case BTRFS_TREE_BLOCK_REF_KEY:
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index dca6432..913df60 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4343,7 +4343,7 @@ static int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans,
head = rb_entry(node, struct btrfs_delayed_ref_head,
href_node);
if (!mutex_trylock(&head->mutex)) {
- atomic_inc(&head->node.refs);
+ refcount_inc(&head->node.refs);
spin_unlock(&delayed_refs->lock);

mutex_lock(&head->mutex);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 796001b..11a1b07 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -891,7 +891,7 @@ int btrfs_lookup_extent_info(struct btrfs_trans_handle *trans,
head = btrfs_find_delayed_ref_head(delayed_refs, bytenr);
if (head) {
if (!mutex_trylock(&head->mutex)) {
- atomic_inc(&head->node.refs);
+ refcount_inc(&head->node.refs);
spin_unlock(&delayed_refs->lock);

btrfs_release_path(path);
@@ -2979,7 +2979,7 @@ int btrfs_run_delayed_refs(struct btrfs_trans_handle *trans,
struct btrfs_delayed_ref_node *ref;

ref = &head->node;
- atomic_inc(&ref->refs);
+ refcount_inc(&ref->refs);

spin_unlock(&delayed_refs->lock);
/*
@@ -3056,7 +3056,7 @@ static noinline int check_delayed_ref(struct btrfs_root *root,
}

if (!mutex_trylock(&head->mutex)) {
- atomic_inc(&head->node.refs);
+ refcount_inc(&head->node.refs);
spin_unlock(&delayed_refs->lock);

btrfs_release_path(path);
--
2.7.4

2017-03-03 12:57:48

by Elena Reshetova

Subject: [PATCH 12/17] fs, btrfs: convert scrub_recover.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/scrub.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index b0251eb..c9406bf 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -64,7 +64,7 @@ struct scrub_ctx;
#define SCRUB_MAX_PAGES_PER_BLOCK 16 /* 64k per node/leaf/sector */

struct scrub_recover {
- atomic_t refs;
+ refcount_t refs;
struct btrfs_bio *bbio;
u64 map_length;
};
@@ -857,12 +857,12 @@ static void scrub_fixup_nodatasum(struct btrfs_work *work)

static inline void scrub_get_recover(struct scrub_recover *recover)
{
- atomic_inc(&recover->refs);
+ refcount_inc(&recover->refs);
}

static inline void scrub_put_recover(struct scrub_recover *recover)
{
- if (atomic_dec_and_test(&recover->refs)) {
+ if (refcount_dec_and_test(&recover->refs)) {
btrfs_put_bbio(recover->bbio);
kfree(recover);
}
@@ -1343,7 +1343,7 @@ static int scrub_setup_recheck_block(struct scrub_block *original_sblock,
return -ENOMEM;
}

- atomic_set(&recover->refs, 1);
+ refcount_set(&recover->refs, 1);
recover->bbio = bbio;
recover->map_length = mapped_length;

--
2.7.4

2017-03-03 15:37:12

by Elena Reshetova

Subject: [PATCH 07/17] fs, btrfs: convert btrfs_delayed_node.refs from atomic_t to refcount_t

The refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
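
Note the asymmetry between the two lookup paths: in
btrfs_get_delayed_node() the node is already referenced, so
refcount_add(2, &node->refs) is a straight substitution, while in
btrfs_get_or_create_delayed_node() the counter was just initialized to
0, so the addition becomes refcount_set(&node->refs, 2), because
refcount_add(), like refcount_inc(), warns when starting from zero.
Schematically:

refcount_set(&node->refs, 0);   /* btrfs_init_delayed_node() */
refcount_set(&node->refs, 2);   /* fresh node: add from 0 would warn */
refcount_add(2, &node->refs);   /* existing node: plain addition */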

Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
fs/btrfs/delayed-inode.c | 28 ++++++++++++++--------------
fs/btrfs/delayed-inode.h | 3 ++-
2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 1aff676..7396c36 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -52,7 +52,7 @@ static inline void btrfs_init_delayed_node(
{
delayed_node->root = root;
delayed_node->inode_id = inode_id;
- atomic_set(&delayed_node->refs, 0);
+ refcount_set(&delayed_node->refs, 0);
delayed_node->ins_root = RB_ROOT;
delayed_node->del_root = RB_ROOT;
mutex_init(&delayed_node->mutex);
@@ -81,7 +81,7 @@ static struct btrfs_delayed_node *btrfs_get_delayed_node(

node = READ_ONCE(btrfs_inode->delayed_node);
if (node) {
- atomic_inc(&node->refs);
+ refcount_inc(&node->refs);
return node;
}

@@ -89,14 +89,14 @@ static struct btrfs_delayed_node *btrfs_get_delayed_node(
node = radix_tree_lookup(&root->delayed_nodes_tree, ino);
if (node) {
if (btrfs_inode->delayed_node) {
- atomic_inc(&node->refs); /* can be accessed */
+ refcount_inc(&node->refs); /* can be accessed */
BUG_ON(btrfs_inode->delayed_node != node);
spin_unlock(&root->inode_lock);
return node;
}
btrfs_inode->delayed_node = node;
/* can be accessed and cached in the inode */
- atomic_add(2, &node->refs);
+ refcount_add(2, &node->refs);
spin_unlock(&root->inode_lock);
return node;
}
@@ -125,7 +125,7 @@ static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node(
btrfs_init_delayed_node(node, root, ino);

/* cached in the btrfs inode and can be accessed */
- atomic_add(2, &node->refs);
+ refcount_set(&node->refs, 2);

ret = radix_tree_preload(GFP_NOFS);
if (ret) {
@@ -166,7 +166,7 @@ static void btrfs_queue_delayed_node(struct btrfs_delayed_root *root,
} else {
list_add_tail(&node->n_list, &root->node_list);
list_add_tail(&node->p_list, &root->prepare_list);
- atomic_inc(&node->refs); /* inserted into list */
+ refcount_inc(&node->refs); /* inserted into list */
root->nodes++;
set_bit(BTRFS_DELAYED_NODE_IN_LIST, &node->flags);
}
@@ -180,7 +180,7 @@ static void btrfs_dequeue_delayed_node(struct btrfs_delayed_root *root,
spin_lock(&root->lock);
if (test_bit(BTRFS_DELAYED_NODE_IN_LIST, &node->flags)) {
root->nodes--;
- atomic_dec(&node->refs); /* not in the list */
+ refcount_dec(&node->refs); /* not in the list */
list_del_init(&node->n_list);
if (!list_empty(&node->p_list))
list_del_init(&node->p_list);
@@ -201,7 +201,7 @@ static struct btrfs_delayed_node *btrfs_first_delayed_node(

p = delayed_root->node_list.next;
node = list_entry(p, struct btrfs_delayed_node, n_list);
- atomic_inc(&node->refs);
+ refcount_inc(&node->refs);
out:
spin_unlock(&delayed_root->lock);

@@ -228,7 +228,7 @@ static struct btrfs_delayed_node *btrfs_next_delayed_node(
p = node->n_list.next;

next = list_entry(p, struct btrfs_delayed_node, n_list);
- atomic_inc(&next->refs);
+ refcount_inc(&next->refs);
out:
spin_unlock(&delayed_root->lock);

@@ -253,11 +253,11 @@ static void __btrfs_release_delayed_node(
btrfs_dequeue_delayed_node(delayed_root, delayed_node);
mutex_unlock(&delayed_node->mutex);

- if (atomic_dec_and_test(&delayed_node->refs)) {
+ if (refcount_dec_and_test(&delayed_node->refs)) {
bool free = false;
struct btrfs_root *root = delayed_node->root;
spin_lock(&root->inode_lock);
- if (atomic_read(&delayed_node->refs) == 0) {
+ if (refcount_read(&delayed_node->refs) == 0) {
radix_tree_delete(&root->delayed_nodes_tree,
delayed_node->inode_id);
free = true;
@@ -286,7 +286,7 @@ static struct btrfs_delayed_node *btrfs_first_prepared_delayed_node(
p = delayed_root->prepare_list.next;
list_del_init(p);
node = list_entry(p, struct btrfs_delayed_node, p_list);
- atomic_inc(&node->refs);
+ refcount_inc(&node->refs);
out:
spin_unlock(&delayed_root->lock);

@@ -1621,7 +1621,7 @@ bool btrfs_readdir_get_delayed_items(struct inode *inode,
* insert/delete delayed items in this period. So we also needn't
* requeue or dequeue this delayed node.
*/
- atomic_dec(&delayed_node->refs);
+ refcount_dec(&delayed_node->refs);

return true;
}
@@ -1963,7 +1963,7 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root *root)
inode_id = delayed_nodes[n - 1]->inode_id + 1;

for (i = 0; i < n; i++)
- atomic_inc(&delayed_nodes[i]->refs);
+ refcount_inc(&delayed_nodes[i]->refs);
spin_unlock(&root->inode_lock);

for (i = 0; i < n; i++) {
diff --git a/fs/btrfs/delayed-inode.h b/fs/btrfs/delayed-inode.h
index 40327cc..d234974 100644
--- a/fs/btrfs/delayed-inode.h
+++ b/fs/btrfs/delayed-inode.h
@@ -26,6 +26,7 @@
#include <linux/list.h>
#include <linux/wait.h>
#include <linux/atomic.h>
+#include <linux/refcount.h>

#include "ctree.h"

@@ -67,7 +68,7 @@ struct btrfs_delayed_node {
struct rb_root del_root;
struct mutex mutex;
struct btrfs_inode_item inode_item;
- atomic_t refs;
+ refcount_t refs;
u64 index_cnt;
unsigned long flags;
int count;
--
2.7.4
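
One subtlety in this patch: the lookup path in btrfs_get_delayed_node()
can bump a counter that has legitimately reached zero, because
__btrfs_release_delayed_node() removes the node from the radix tree only
after re-checking the count under root->inode_lock. refcount_inc() WARNs
on exactly that 0 -> 1 transition, so a "revival" path like this one would
typically be converted with refcount_inc_not_zero() instead; a rough
sketch, not part of the series as posted:

        spin_lock(&root->inode_lock);
        node = radix_tree_lookup(&root->delayed_nodes_tree, ino);
        /* only take a reference if the node is not already being torn down */
        if (node && !refcount_inc_not_zero(&node->refs))
                node = NULL;    /* treat a dying node as a lookup miss */
        spin_unlock(&root->inode_lock);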

2017-03-06 00:28:00

by Qu Wenruo

[permalink] [raw]
Subject: Re: [PATCH 00/17] fs, btrfs refcount conversions



At 03/03/2017 04:55 PM, Elena Reshetova wrote:
> [...]
> * tests btrfs/078, btrfs/114, btrfs/115, no errors anywhere in dmesg, but
> process hangs. They all seem to be around qgroup, sometimes error visible
> such as qgroup scan failed -4 before it blocks, but not always.

-EINTR (the "-4")? That's strange.

Any blocked process backtrace?

> * test btrfs/104 dmesg has additional error output:
> BTRFS warning (device vdc): qgroup 258 reserved space underflow, have: 0,
> to free: 4096

A known one; fixes were already sent to the mailing list but are not
merged yet:

Thanks,
Qu

> [...]


2017-03-06 04:09:46

by Qu Wenruo

[permalink] [raw]
Subject: Re: [PATCH 00/17] fs, btrfs refcount conversions



At 03/03/2017 04:55 PM, Elena Reshetova wrote:
> [...]
> * tests btrfs/078, btrfs/114, btrfs/115, no errors anywhere in dmesg, but
> process hangs. They all seem to be around qgroup, sometimes error visible
> such as qgroup scan failed -4 before it blocks, but not always.

How reproducible is the hang?

I also see the -EINTR output, but that seems to be by design for
btrfs/11[45].

btrfs/078 is unrelated to qgroup, and all three of these tests pass in my
test environment, which is v4.11-rc1 with your patches applied.

I ran these 3 tests in a row with the default and space_cache=v2 mount
options, 5 times for each mount option, and saw no hang at all.

It would help a lot if more info could be provided, from the blocked
process backtrace to the test mount options and base commit.

Thanks,
Qu

> [...]


2017-03-06 09:45:49

by Elena Reshetova

[permalink] [raw]
Subject: RE: [PATCH 00/17] fs, btrfs refcount conversions


> At 03/03/2017 04:55 PM, Elena Reshetova wrote:
> > [...]
> > * tests btrfs/078, btrfs/114, btrfs/115, no errors anywhere in dmesg, but
> > process hangs. They all seem to be around qgroup, sometimes error visible
> > such as qgroup scan failed -4 before it blocks, but not always.
>
> How reproducible is the hang?

Always in my environment, but I would not dig too deeply into why it happens if it works for you.
My test environment is far from ideal: I am testing in a VM with a rather old userspace and a couple of additional changes in,
so there are many things that can potentially go wrong. Anyway, the strace for 078 is in the attachment.

If the patches pass all tests on your side, could you please take them in and propagate further?
I will continue with other kernel subsystems.

Best Regards,
Elena.


> [...]


Attachments:
btrf_078_log (277.90 kB)
btrf_078_log

2017-03-07 06:29:24

by Qu Wenruo

[permalink] [raw]
Subject: Re: [PATCH 00/17] fs, btrfs refcount conversions



At 03/06/2017 05:43 PM, Reshetova, Elena wrote:
>
>> At 03/03/2017 04:55 PM, Elena Reshetova wrote:
>>> [...]
>>> * tests btrfs/078, btrfs/114, btrfs/115, no errors anywhere in dmesg, but
>>> process hangs. They all seem to be around qgroup, sometimes error visible
>>> such as qgroup scan failed -4 before it blocks, but not always.
>>
>> How reproducible is the hang?
>
> Always in my environment, but I would not dig too deeply into why it happens if it works for you.
> My test environment is far from ideal: I am testing in a VM with a rather old userspace and a couple of additional changes in,
> so there are many things that can potentially go wrong. Anyway, the strace for 078 is in the attachment.

Thanks for the strace.

However no "-f" is passed to strace, so it doesn't contain much useful info.

>
> If the patches pass all tests on your side, could you please take them in and propagate further?
> I will continue with other kernel subsystems.

The patchset itself looks like a common cleanup, but I did encounter
several cases (almost all scrub tests) causing kernel warnings due to
underflow.

So I'm afraid the patchset will not be merged until we fix all the
underflows.

But thanks for the patchset, it helps us expose a lot of problems.

Thanks,
Qu

> [...]


2017-03-07 07:58:42

by Elena Reshetova

[permalink] [raw]
Subject: RE: [PATCH 00/17] fs, btrfs refcount conversions

> At 03/06/2017 05:43 PM, Reshetova, Elena wrote:
> [...]
> The patchset itself looks like a common cleanup, but I did encounter
> several cases (almost all scrub tests) causing kernel warnings due to
> underflow.

Oh, could you please send me the warning outputs? I can hopefully analyze and fix them.

Best Regards,
Elena.

> [...]


2017-03-07 08:18:36

by Qu Wenruo

[permalink] [raw]
Subject: Re: [PATCH 00/17] fs, btrfs refcount conversions



At 03/07/2017 03:41 PM, Reshetova, Elena wrote:
>> [...]
>>
>> The patchset itself looks like a common cleanup, but I did encounter
>> several cases (almost all scrub tests) causing kernel warnings due to
>> underflow.
>
> Oh, could you please send me the warning outputs? I can hopefully analyze and fix them.

Attached. It was generated by running the btrfs/070 test case.
I canceled the test almost instantly, so there is not much output, but it
still contains enough info.

Both refcount_inc() and refcount_sub_and_test() are causing warnings.

So now I'm not sure which is the cause: btrfs, or bad use of refcount?

Thanks,
Qu

> [...]



Attachments:
output (26.53 kB)

2017-03-09 15:30:44

by David Sterba

[permalink] [raw]
Subject: Re: [PATCH 00/17] fs, btrfs refcount conversions

On Tue, Mar 07, 2017 at 03:49:52PM +0800, Qu Wenruo wrote:
> [...]
>
> Attached. It was generated by running the btrfs/070 test case.
> I canceled the test almost instantly, so there is not much output, but it
> still contains enough info.
>
> Both refcount_inc() and refcount_sub_and_test() are causing warnings.
>
> So now I'm not sure which is the cause: btrfs, or bad use of refcount?

We do atomic_inc to get the first reference after initialization in
scrub_pages, instead of atomic_set (or an equivalent):

        spage = kzalloc(sizeof(*spage), GFP_KERNEL);
        if (!spage) {
        ...
        }
        ...
        scrub_page_get(spage);

so the reference count is 0 and refcount_inc() will catch that; the fix is simple.
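
Sketched out (an illustration of the report plus a plausible fix along
those lines, not the actual follow-up commit):

        /* as-is: refs is 0 after kzalloc(), so scrub_page_get()'s
         * refcount_inc() is a 0 -> 1 increment and fires the WARN */
        spage = kzalloc(sizeof(*spage), GFP_KERNEL);
        if (!spage)
                return -ENOMEM;
        scrub_page_get(spage);

        /* plausible fix: take the first reference explicitly instead */
        spage = kzalloc(sizeof(*spage), GFP_KERNEL);
        if (!spage)
                return -ENOMEM;
        refcount_set(&spage->refs, 1);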

The refcount_sub_and_test() reports seem to catch a bug in refcounting; I'm
analyzing it right now.

2017-03-09 16:24:51

by David Sterba

[permalink] [raw]
Subject: Re: [PATCH 00/17] fs, btrfs refcount conversions

On Fri, Mar 03, 2017 at 10:55:09AM +0200, Elena Reshetova wrote:
> [...]

Thanks, the patchset looks good to me, I'm going to add it to the 4.12 queue.

> [...]
> * tests btrfs/078, btrfs/114, btrfs/115, no errors anywhere in dmesg, but
> process hangs. They all seem to be around qgroup, sometimes error visible
> such as qgroup scan failed -4 before it blocks, but not always.
> * test btrfs/104 dmesg has additional error output:
> BTRFS warning (device vdc): qgroup 258 reserved space underflow, have: 0,
> to free: 4096
> [...]

I don't think they're related to the refcount updates, so we'll address
them.

2017-03-13 10:55:56

by Elena Reshetova

[permalink] [raw]
Subject: RE: [PATCH 00/17] fs, btrfs refcount conversions


> On Fri, Mar 03, 2017 at 10:55:09AM +0200, Elena Reshetova wrote:
> > [...]
>
> Thanks, the patchset looks good to me, I'm going to add it to the 4.12 queue.

Thank you very much!

> [...]