2018-05-23 01:21:59

by Kent Overstreet

Subject: [PATCH 1/6] Generic radix trees

Very simple radix tree implementation that supports storing arbitrary-size
entries, up to PAGE_SIZE - upcoming patches will convert existing
flex_array users to genradixes. The new genradix code has a much simpler
API and implementation, and doesn't have a hard limit on the number of
elements like flex_array does.
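
For illustration, a minimal usage sketch of the API added below. struct
foo, its val member and the two helpers are hypothetical, invented for
this example; error handling is abbreviated:

	struct foo {
		int val;
	};

	static GENRADIX(struct foo) foo_genradix;

	static int store_foo(size_t idx, int val)
	{
		/* allocates backing pages as needed; new entries are zeroed */
		struct foo *f = genradix_ptr_alloc(&foo_genradix, idx, GFP_KERNEL);

		if (!f)
			return -ENOMEM;

		f->val = val;
		return 0;
	}

	static void walk_foos(void)
	{
		struct genradix_iter iter;
		struct foo *f;

		/* iter.pos is the index of the entry f points at */
		genradix_for_each(&foo_genradix, iter, f)
			pr_debug("foo %zu: %i\n", iter.pos, f->val);

		/* frees all pages and leaves the genradix reinitialized */
		genradix_free(&foo_genradix);
	}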

Signed-off-by: Kent Overstreet <[email protected]>
---
include/linux/generic-radix-tree.h | 222 +++++++++++++++++++++++++++++
lib/Makefile | 3 +-
lib/generic-radix-tree.c | 180 +++++++++++++++++++++++
3 files changed, 404 insertions(+), 1 deletion(-)
create mode 100644 include/linux/generic-radix-tree.h
create mode 100644 lib/generic-radix-tree.c

diff --git a/include/linux/generic-radix-tree.h b/include/linux/generic-radix-tree.h
new file mode 100644
index 0000000000..3328813322
--- /dev/null
+++ b/include/linux/generic-radix-tree.h
@@ -0,0 +1,222 @@
+#ifndef _LINUX_GENERIC_RADIX_TREE_H
+#define _LINUX_GENERIC_RADIX_TREE_H
+
+/*
+ * Generic radix trees/sparse arrays:
+ *
+ * Very simple and minimalistic, supporting arbitrary size entries up to
+ * PAGE_SIZE.
+ *
+ * A genradix is defined with the type it will store, like so:
+ * static GENRADIX(struct foo) foo_genradix;
+ *
+ * The main operations are:
+ * - genradix_init(radix) - initialize an empty genradix
+ *
+ * - genradix_free(radix) - free all memory owned by the genradix and
+ * reinitialize it
+ *
+ * - genradix_ptr(radix, idx) - gets a pointer to the entry at idx, returning
+ * NULL if that entry does not exist
+ *
+ * - genradix_ptr_alloc(radix, idx, gfp) - gets a pointer to an entry,
+ * allocating it if necessary
+ *
+ * - genradix_for_each(radix, iter, p) - iterate over each entry in a genradix
+ *
+ * The radix tree allocates one page of entries at a time, so entries may exist
+ * that were never explicitly allocated - they will be initialized to all
+ * zeroes.
+ *
+ * Internally, a genradix is just a radix tree of pages, and indexing works in
+ * terms of byte offsets. The wrappers in this header file use sizeof on the
+ * type the radix contains to calculate a byte offset from the index - see
+ * __idx_to_offset.
+ */
+
+#include <asm/page.h>
+#include <linux/bug.h>
+#include <linux/kernel.h>
+#include <linux/log2.h>
+
+struct genradix_node;
+
+struct __genradix {
+ struct genradix_node *root;
+ size_t depth;
+};
+
+#define __GENRADIX_INITIALIZER \
+ { \
+ .tree = { \
+ .root = NULL, \
+ .depth = 0, \
+ } \
+ }
+
+/*
+ * We use a 0 size array to stash the type we're storing without taking any
+ * space at runtime - then the various accessor macros can use typeof() to get
+ * to it for casts/sizeof - we also force the alignment so that storing a type
+ * with a ridiculous alignment doesn't blow up the alignment or size of the
+ * genradix.
+ */
+
+#define GENRADIX(_type) \
+struct { \
+ struct __genradix tree; \
+ _type type[0] __aligned(1); \
+}
+
+#define DEFINE_GENRADIX(_name, _type) \
+ GENRADIX(_type) _name = __GENRADIX_INITIALIZER
+
+/**
+ * genradix_init - initialize a genradix
+ * @_radix: genradix to initialize
+ *
+ * Does not fail
+ */
+#define genradix_init(_radix) \
+do { \
+ *(_radix) = (typeof(*_radix)) __GENRADIX_INITIALIZER; \
+} while (0)
+
+void __genradix_free(struct __genradix *);
+
+/**
+ * genradix_free - free all memory owned by a genradix
+ *
+ * After freeing, @_radix will be reinitialized and empty
+ */
+#define genradix_free(_radix) __genradix_free(&(_radix)->tree)
+
+static inline size_t __idx_to_offset(size_t idx, size_t obj_size)
+{
+ if (__builtin_constant_p(obj_size))
+ BUILD_BUG_ON(obj_size > PAGE_SIZE);
+ else
+ BUG_ON(obj_size > PAGE_SIZE);
+
+ if (!is_power_of_2(obj_size)) {
+ size_t objs_per_page = PAGE_SIZE / obj_size;
+
+ return (idx / objs_per_page) * PAGE_SIZE +
+ (idx % objs_per_page) * obj_size;
+ } else {
+ return idx * obj_size;
+ }
+}
+
+#define __genradix_cast(_radix) (typeof((_radix)->type[0]) *)
+#define __genradix_obj_size(_radix) sizeof((_radix)->type[0])
+#define __genradix_idx_to_offset(_radix, _idx) \
+ __idx_to_offset(_idx, __genradix_obj_size(_radix))
+
+void *__genradix_ptr(struct __genradix *, size_t);
+
+/**
+ * genradix_ptr - get a pointer to a genradix entry
+ * @_radix: genradix to access
+ * @_idx: index to fetch
+ *
+ * Returns a pointer to entry at @_idx, or NULL if that entry does not exist.
+ */
+#define genradix_ptr(_radix, _idx) \
+ (__genradix_cast(_radix) \
+ __genradix_ptr(&(_radix)->tree, \
+ __genradix_idx_to_offset(_radix, _idx)))
+
+void *__genradix_ptr_alloc(struct __genradix *, size_t, gfp_t);
+
+/**
+ * genradix_ptr_alloc - get a pointer to a genradix entry, allocating it if necessary
+ * @_radix: genradix to access
+ * @_idx: index to fetch
+ * @_gfp: gfp mask
+ *
+ * Returns a pointer to entry at @_idx, or NULL on allocation failure
+ */
+#define genradix_ptr_alloc(_radix, _idx, _gfp) \
+ (__genradix_cast(_radix) \
+ __genradix_ptr_alloc(&(_radix)->tree, \
+ __genradix_idx_to_offset(_radix, _idx), \
+ _gfp))
+
+struct genradix_iter {
+ size_t offset;
+ size_t pos;
+};
+
+/**
+ * genradix_iter_init - initialize a genradix_iter
+ * @_radix: genradix that will be iterated over
+ * @_idx: index to start iterating from
+ */
+#define genradix_iter_init(_radix, _idx) \
+ ((struct genradix_iter) { \
+ .pos = (_idx), \
+ .offset = __genradix_idx_to_offset((_radix), (_idx)),\
+ })
+
+void *__genradix_iter_peek(struct genradix_iter *, struct __genradix *, size_t);
+
+/**
+ * genradix_iter_peek - get first entry at or above iterator's current
+ * position
+ * @_iter: a genradix_iter
+ * @_radix: genradix being iterated over
+ *
+ * If no more entries exist at or above @_iter's current position, returns NULL
+ */
+#define genradix_iter_peek(_iter, _radix) \
+ (__genradix_cast(_radix) \
+ __genradix_iter_peek(_iter, &(_radix)->tree, \
+ PAGE_SIZE / __genradix_obj_size(_radix)))
+
+static inline void __genradix_iter_advance(struct genradix_iter *iter,
+ size_t obj_size)
+{
+ iter->offset += obj_size;
+
+ if (!is_power_of_2(obj_size) &&
+ (iter->offset & (PAGE_SIZE - 1)) + obj_size > PAGE_SIZE)
+ iter->offset = round_up(iter->offset, PAGE_SIZE);
+
+ iter->pos++;
+}
+
+#define genradix_iter_advance(_iter, _radix) \
+ __genradix_iter_advance(_iter, __genradix_obj_size(_radix))
+
+/**
+ * genradix_for_each - iterate over each entry in a genradix
+ * @_radix: genradix to iterate over
+ * @_iter: a genradix_iter to track current position
+ * @_p: pointer to genradix entry type
+ *
+ * On every iteration, @_p will point to the current entry, and @_iter.pos
+ * will be the current entry's index.
+ */
+#define genradix_for_each(_radix, _iter, _p) \
+ for (_iter = genradix_iter_init(_radix, 0); \
+ (_p = genradix_iter_peek(&(_iter), _radix)) != NULL; \
+ genradix_iter_advance(&(_iter), _radix))
+
+int __genradix_prealloc(struct __genradix *, size_t, gfp_t);
+
+/**
+ * genradix_prealloc - preallocate entries in a generic radix tree
+ * @_radix: genradix to preallocate
+ * @_nr: number of entries to preallocate
+ * @_gfp: gfp mask
+ *
+ * Returns 0 on success, -ENOMEM on failure
+ */
+#define genradix_prealloc(_radix, _nr, _gfp) \
+ __genradix_prealloc(&(_radix)->tree, \
+ __genradix_idx_to_offset(_radix, _nr + 1),\
+ _gfp)
+
+
+#endif /* _LINUX_GENERIC_RADIX_TREE_H */
diff --git a/lib/Makefile b/lib/Makefile
index a90d4fcd74..5db5a7fb1e 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -39,7 +39,8 @@ obj-y += bcd.o div64.o sort.o parser.o debug_locks.o random32.o \
gcd.o lcm.o list_sort.o uuid.o flex_array.o iov_iter.o clz_ctz.o \
bsearch.o find_bit.o llist.o memweight.o kfifo.o \
percpu-refcount.o percpu_ida.o rhashtable.o reciprocal_div.o \
- once.o refcount.o usercopy.o errseq.o bucket_locks.o
+ once.o refcount.o usercopy.o errseq.o bucket_locks.o \
+ generic-radix-tree.o
obj-$(CONFIG_STRING_SELFTEST) += test_string.o
obj-y += string_helpers.o
obj-$(CONFIG_TEST_STRING_HELPERS) += test-string_helpers.o
diff --git a/lib/generic-radix-tree.c b/lib/generic-radix-tree.c
new file mode 100644
index 0000000000..4537c7c62c
--- /dev/null
+++ b/lib/generic-radix-tree.c
@@ -0,0 +1,180 @@
+
+#include <linux/export.h>
+#include <linux/generic-radix-tree.h>
+#include <linux/gfp.h>
+
+#define GENRADIX_ARY (PAGE_SIZE / sizeof(struct genradix_node *))
+#define GENRADIX_ARY_SHIFT ilog2(GENRADIX_ARY)
+
+struct genradix_node {
+ union {
+ /* Interior node: */
+ struct genradix_node *children[GENRADIX_ARY];
+
+ /* Leaf: */
+ u8 data[PAGE_SIZE];
+ };
+};
+
+static inline unsigned genradix_depth_shift(unsigned depth)
+{
+ return PAGE_SHIFT + GENRADIX_ARY_SHIFT * depth;
+}
+
+/*
+ * Returns size (of data, in bytes) that a tree of a given depth holds:
+ */
+static inline size_t genradix_depth_size(unsigned depth)
+{
+ return 1UL << genradix_depth_shift(depth);
+}
+
+/*
+ * Returns pointer to the specified byte @offset within @radix, or NULL if not
+ * allocated
+ */
+void *__genradix_ptr(struct __genradix *radix, size_t offset)
+{
+ size_t level = radix->depth;
+ struct genradix_node *n = radix->root;
+
+ if (offset >= genradix_depth_size(radix->depth))
+ return NULL;
+
+ while (1) {
+ if (!n)
+ return NULL;
+ if (!level)
+ break;
+
+ level--;
+
+ n = n->children[offset >> genradix_depth_shift(level)];
+ offset &= genradix_depth_size(level) - 1;
+ }
+
+ return &n->data[offset];
+}
+EXPORT_SYMBOL(__genradix_ptr);
+
+/*
+ * Returns pointer to the specified byte @offset within @radix, allocating it if
+ * necessary - newly allocated slots are always zeroed out:
+ */
+void *__genradix_ptr_alloc(struct __genradix *radix, size_t offset,
+ gfp_t gfp_mask)
+{
+ struct genradix_node **n;
+ size_t level;
+
+ /* Increase tree depth if necessary: */
+
+ while (offset >= genradix_depth_size(radix->depth)) {
+ struct genradix_node *new_root =
+ (void *) __get_free_page(gfp_mask|__GFP_ZERO);
+
+ if (!new_root)
+ return NULL;
+
+ new_root->children[0] = radix->root;
+ radix->root = new_root;
+ radix->depth++;
+ }
+
+ n = &radix->root;
+ level = radix->depth;
+
+ while (1) {
+ if (!*n) {
+ *n = (void *) __get_free_page(gfp_mask|__GFP_ZERO);
+ if (!*n)
+ return NULL;
+ }
+
+ if (!level)
+ break;
+
+ level--;
+
+ n = &(*n)->children[offset >> genradix_depth_shift(level)];
+ offset &= genradix_depth_size(level) - 1;
+ }
+
+ return &(*n)->data[offset];
+}
+EXPORT_SYMBOL(__genradix_ptr_alloc);
+
+void *__genradix_iter_peek(struct genradix_iter *iter,
+ struct __genradix *radix,
+ size_t objs_per_page)
+{
+ struct genradix_node *n;
+ size_t level, i;
+
+ if (!radix->root)
+ return NULL;
+restart:
+ if (iter->offset >= genradix_depth_size(radix->depth))
+ return NULL;
+
+ n = radix->root;
+ level = radix->depth;
+
+ while (level) {
+ level--;
+
+ i = (iter->offset >> genradix_depth_shift(level)) &
+ (GENRADIX_ARY - 1);
+
+ while (!n->children[i]) {
+ i++;
+ iter->offset = round_down(iter->offset +
+ genradix_depth_size(level),
+ genradix_depth_size(level));
+ iter->pos = (iter->offset >> PAGE_SHIFT) *
+ objs_per_page;
+ if (i == GENRADIX_ARY)
+ goto restart;
+ }
+
+ n = n->children[i];
+ }
+
+ return &n->data[iter->offset & (PAGE_SIZE - 1)];
+}
+EXPORT_SYMBOL(__genradix_iter_peek);
+
+static void genradix_free_recurse(struct genradix_node *n, unsigned level)
+{
+ if (level) {
+ unsigned i;
+
+ for (i = 0; i < GENRADIX_ARY; i++)
+ if (n->children[i])
+ genradix_free_recurse(n->children[i], level - 1);
+ }
+
+ free_page((unsigned long) n);
+}
+
+int __genradix_prealloc(struct __genradix *radix, size_t size,
+ gfp_t gfp_mask)
+{
+ size_t offset;
+
+ for (offset = 0; offset < size; offset += PAGE_SIZE)
+ if (!__genradix_ptr_alloc(radix, offset, gfp_mask))
+ return -ENOMEM;
+
+ return 0;
+}
+EXPORT_SYMBOL(__genradix_prealloc);
+
+void __genradix_free(struct __genradix *radix)
+{
+ genradix_free_recurse(radix->root, radix->depth);
+
+ radix->root = NULL;
+ radix->depth = 0;
+}
+EXPORT_SYMBOL(__genradix_free);
--
2.17.0



2018-05-23 01:19:25

by Kent Overstreet

Subject: [PATCH 5/6] selinux: convert to genradix

The new generic radix trees have a simpler API and implementation, and
no limitation on the number of elements, so all flex_array users are
being converted.
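
The conversion is mechanical; a sketch of the pattern for a preallocated
table of pointers (struct foo, table and the helper functions are
hypothetical, not lifted from the SELinux code):

	static GENRADIX(struct foo *) table;

	static int table_init(unsigned nslot)
	{
		genradix_init(&table);

		/* replaces flex_array_alloc() + flex_array_prealloc();
		 * preallocating means the later stores cannot fail */
		return genradix_prealloc(&table, nslot, GFP_KERNEL);
	}

	static void table_set(unsigned i, struct foo *p)
	{
		/* replaces flex_array_put_ptr(fa, i, p, GFP_KERNEL | __GFP_ZERO) */
		*genradix_ptr(&table, i) = p;
	}

	static struct foo *table_get(unsigned i)
	{
		/* replaces flex_array_get_ptr(fa, i) */
		return *genradix_ptr(&table, i);
	}

	static void table_exit(void)
	{
		/* replaces flex_array_free(fa) */
		genradix_free(&table);
	}

genradix_ptr() returns NULL for entries that were never allocated, so
the preallocation step is what lets the store above dereference the
returned pointer unconditionally - the same reason the flex_array users
preallocated their tables.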

Signed-off-by: Kent Overstreet <[email protected]>
---
security/selinux/ss/avtab.c | 44 +++++------
security/selinux/ss/avtab.h | 5 +-
security/selinux/ss/conditional.c | 8 +-
security/selinux/ss/policydb.c | 127 +++++++++++-------------------
security/selinux/ss/policydb.h | 12 ++-
security/selinux/ss/services.c | 25 +++---
6 files changed, 86 insertions(+), 135 deletions(-)

diff --git a/security/selinux/ss/avtab.c b/security/selinux/ss/avtab.c
index 2c3c7d010d..7066d52e74 100644
--- a/security/selinux/ss/avtab.c
+++ b/security/selinux/ss/avtab.c
@@ -93,12 +93,10 @@ avtab_insert_node(struct avtab *h, int hvalue,
newnode->next = prev->next;
prev->next = newnode;
} else {
- newnode->next = flex_array_get_ptr(h->htable, hvalue);
- if (flex_array_put_ptr(h->htable, hvalue, newnode,
- GFP_KERNEL|__GFP_ZERO)) {
- kmem_cache_free(avtab_node_cachep, newnode);
- return NULL;
- }
+ struct avtab_node **n = genradix_ptr(&h->htable, hvalue);
+
+ newnode->next = *n;
+ *n = newnode;
}

h->nel++;
@@ -111,11 +109,11 @@ static int avtab_insert(struct avtab *h, struct avtab_key *key, struct avtab_dat
struct avtab_node *prev, *cur, *newnode;
u16 specified = key->specified & ~(AVTAB_ENABLED|AVTAB_ENABLED_OLD);

- if (!h || !h->htable)
+ if (!h)
return -EINVAL;

hvalue = avtab_hash(key, h->mask);
- for (prev = NULL, cur = flex_array_get_ptr(h->htable, hvalue);
+ for (prev = NULL, cur = *genradix_ptr(&h->htable, hvalue);
cur;
prev = cur, cur = cur->next) {
if (key->source_type == cur->key.source_type &&
@@ -156,10 +154,10 @@ avtab_insert_nonunique(struct avtab *h, struct avtab_key *key, struct avtab_datu
struct avtab_node *prev, *cur;
u16 specified = key->specified & ~(AVTAB_ENABLED|AVTAB_ENABLED_OLD);

- if (!h || !h->htable)
+ if (!h)
return NULL;
hvalue = avtab_hash(key, h->mask);
- for (prev = NULL, cur = flex_array_get_ptr(h->htable, hvalue);
+ for (prev = NULL, cur = *genradix_ptr(&h->htable, hvalue);
cur;
prev = cur, cur = cur->next) {
if (key->source_type == cur->key.source_type &&
@@ -186,11 +184,11 @@ struct avtab_datum *avtab_search(struct avtab *h, struct avtab_key *key)
struct avtab_node *cur;
u16 specified = key->specified & ~(AVTAB_ENABLED|AVTAB_ENABLED_OLD);

- if (!h || !h->htable)
+ if (!h)
return NULL;

hvalue = avtab_hash(key, h->mask);
- for (cur = flex_array_get_ptr(h->htable, hvalue); cur;
+ for (cur = *genradix_ptr(&h->htable, hvalue); cur;
cur = cur->next) {
if (key->source_type == cur->key.source_type &&
key->target_type == cur->key.target_type &&
@@ -222,11 +220,11 @@ avtab_search_node(struct avtab *h, struct avtab_key *key)
struct avtab_node *cur;
u16 specified = key->specified & ~(AVTAB_ENABLED|AVTAB_ENABLED_OLD);

- if (!h || !h->htable)
+ if (!h)
return NULL;

hvalue = avtab_hash(key, h->mask);
- for (cur = flex_array_get_ptr(h->htable, hvalue); cur;
+ for (cur = *genradix_ptr(&h->htable, hvalue); cur;
cur = cur->next) {
if (key->source_type == cur->key.source_type &&
key->target_type == cur->key.target_type &&
@@ -281,11 +279,11 @@ void avtab_destroy(struct avtab *h)
int i;
struct avtab_node *cur, *temp;

- if (!h || !h->htable)
+ if (!h)
return;

for (i = 0; i < h->nslot; i++) {
- cur = flex_array_get_ptr(h->htable, i);
+ cur = *genradix_ptr(&h->htable, i);
while (cur) {
temp = cur;
cur = cur->next;
@@ -295,15 +293,14 @@ void avtab_destroy(struct avtab *h)
kmem_cache_free(avtab_node_cachep, temp);
}
}
- flex_array_free(h->htable);
- h->htable = NULL;
+ genradix_free(&h->htable);
h->nslot = 0;
h->mask = 0;
}

int avtab_init(struct avtab *h)
{
- h->htable = NULL;
+ genradix_init(&h->htable);
h->nel = 0;
return 0;
}
@@ -329,9 +326,8 @@ int avtab_alloc(struct avtab *h, u32 nrules)
nslot = MAX_AVTAB_HASH_BUCKETS;
mask = nslot - 1;

- h->htable = flex_array_alloc(sizeof(struct avtab_node *), nslot,
- GFP_KERNEL | __GFP_ZERO);
- if (!h->htable)
+ genradix_init(&h->htable);
+ if (genradix_prealloc(&h->htable, nslot, GFP_KERNEL))
return -ENOMEM;

avtab_alloc_out:
@@ -353,7 +349,7 @@ void avtab_hash_eval(struct avtab *h, char *tag)
max_chain_len = 0;
chain2_len_sum = 0;
for (i = 0; i < h->nslot; i++) {
- cur = flex_array_get_ptr(h->htable, i);
+ cur = *genradix_ptr(&h->htable, i);
if (cur) {
slots_used++;
chain_len = 0;
@@ -645,7 +641,7 @@ int avtab_write(struct policydb *p, struct avtab *a, void *fp)
return rc;

for (i = 0; i < a->nslot; i++) {
- for (cur = flex_array_get_ptr(a->htable, i); cur;
+ for (cur = *genradix_ptr(&a->htable, i); cur;
cur = cur->next) {
rc = avtab_write_item(p, cur, fp);
if (rc)
diff --git a/security/selinux/ss/avtab.h b/security/selinux/ss/avtab.h
index 725853cadc..7f00c41e96 100644
--- a/security/selinux/ss/avtab.h
+++ b/security/selinux/ss/avtab.h
@@ -24,7 +24,7 @@
#define _SS_AVTAB_H_

#include "security.h"
-#include <linux/flex_array.h>
+#include <linux/generic-radix-tree.h>

struct avtab_key {
u16 source_type; /* source type */
@@ -84,11 +84,10 @@ struct avtab_node {
};

struct avtab {
- struct flex_array *htable;
+ GENRADIX(struct avtab_node *) htable;
u32 nel; /* number of elements */
u32 nslot; /* number of hash slots */
u32 mask; /* mask to compute hash func */
-
};

int avtab_init(struct avtab *);
diff --git a/security/selinux/ss/conditional.c b/security/selinux/ss/conditional.c
index c91543a617..d25ef70748 100644
--- a/security/selinux/ss/conditional.c
+++ b/security/selinux/ss/conditional.c
@@ -195,7 +195,6 @@ int cond_index_bool(void *key, void *datum, void *datap)
{
struct policydb *p;
struct cond_bool_datum *booldatum;
- struct flex_array *fa;

booldatum = datum;
p = datap;
@@ -203,10 +202,9 @@ int cond_index_bool(void *key, void *datum, void *datap)
if (!booldatum->value || booldatum->value > p->p_bools.nprim)
return -EINVAL;

- fa = p->sym_val_to_name[SYM_BOOLS];
- if (flex_array_put_ptr(fa, booldatum->value - 1, key,
- GFP_KERNEL | __GFP_ZERO))
- BUG();
+ *genradix_ptr(&p->sym_val_to_name[SYM_BOOLS],
+ booldatum->value - 1) = key;
+
p->bool_val_to_struct[booldatum->value - 1] = booldatum;

return 0;
diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
index 6e8c8056d7..3df39258c9 100644
--- a/security/selinux/ss/policydb.c
+++ b/security/selinux/ss/policydb.c
@@ -36,7 +36,6 @@
#include <linux/string.h>
#include <linux/errno.h>
#include <linux/audit.h>
-#include <linux/flex_array.h>
#include "security.h"

#include "policydb.h"
@@ -341,17 +340,15 @@ static int common_index(void *key, void *datum, void *datap)
{
struct policydb *p;
struct common_datum *comdatum;
- struct flex_array *fa;

comdatum = datum;
p = datap;
if (!comdatum->value || comdatum->value > p->p_commons.nprim)
return -EINVAL;

- fa = p->sym_val_to_name[SYM_COMMONS];
- if (flex_array_put_ptr(fa, comdatum->value - 1, key,
- GFP_KERNEL | __GFP_ZERO))
- BUG();
+ *genradix_ptr(&p->sym_val_to_name[SYM_COMMONS],
+ comdatum->value - 1) = key;
+
return 0;
}

@@ -359,16 +356,15 @@ static int class_index(void *key, void *datum, void *datap)
{
struct policydb *p;
struct class_datum *cladatum;
- struct flex_array *fa;

cladatum = datum;
p = datap;
if (!cladatum->value || cladatum->value > p->p_classes.nprim)
return -EINVAL;
- fa = p->sym_val_to_name[SYM_CLASSES];
- if (flex_array_put_ptr(fa, cladatum->value - 1, key,
- GFP_KERNEL | __GFP_ZERO))
- BUG();
+
+ *genradix_ptr(&p->sym_val_to_name[SYM_CLASSES],
+ cladatum->value - 1) = key;
+
p->class_val_to_struct[cladatum->value - 1] = cladatum;
return 0;
}
@@ -377,7 +373,6 @@ static int role_index(void *key, void *datum, void *datap)
{
struct policydb *p;
struct role_datum *role;
- struct flex_array *fa;

role = datum;
p = datap;
@@ -386,10 +381,9 @@ static int role_index(void *key, void *datum, void *datap)
|| role->bounds > p->p_roles.nprim)
return -EINVAL;

- fa = p->sym_val_to_name[SYM_ROLES];
- if (flex_array_put_ptr(fa, role->value - 1, key,
- GFP_KERNEL | __GFP_ZERO))
- BUG();
+ *genradix_ptr(&p->sym_val_to_name[SYM_ROLES],
+ role->value - 1) = key;
+
p->role_val_to_struct[role->value - 1] = role;
return 0;
}
@@ -398,7 +392,6 @@ static int type_index(void *key, void *datum, void *datap)
{
struct policydb *p;
struct type_datum *typdatum;
- struct flex_array *fa;

typdatum = datum;
p = datap;
@@ -408,15 +401,11 @@ static int type_index(void *key, void *datum, void *datap)
|| typdatum->value > p->p_types.nprim
|| typdatum->bounds > p->p_types.nprim)
return -EINVAL;
- fa = p->sym_val_to_name[SYM_TYPES];
- if (flex_array_put_ptr(fa, typdatum->value - 1, key,
- GFP_KERNEL | __GFP_ZERO))
- BUG();
+ *genradix_ptr(&p->sym_val_to_name[SYM_TYPES],
+ typdatum->value - 1) = key;

- fa = p->type_val_to_struct_array;
- if (flex_array_put_ptr(fa, typdatum->value - 1, typdatum,
- GFP_KERNEL | __GFP_ZERO))
- BUG();
+ *genradix_ptr(&p->type_val_to_struct_array,
+ typdatum->value - 1) = typdatum;
}

return 0;
@@ -426,7 +415,6 @@ static int user_index(void *key, void *datum, void *datap)
{
struct policydb *p;
struct user_datum *usrdatum;
- struct flex_array *fa;

usrdatum = datum;
p = datap;
@@ -435,10 +423,9 @@ static int user_index(void *key, void *datum, void *datap)
|| usrdatum->bounds > p->p_users.nprim)
return -EINVAL;

- fa = p->sym_val_to_name[SYM_USERS];
- if (flex_array_put_ptr(fa, usrdatum->value - 1, key,
- GFP_KERNEL | __GFP_ZERO))
- BUG();
+ *genradix_ptr(&p->sym_val_to_name[SYM_USERS],
+ usrdatum->value - 1) = key;
+
p->user_val_to_struct[usrdatum->value - 1] = usrdatum;
return 0;
}
@@ -447,7 +434,6 @@ static int sens_index(void *key, void *datum, void *datap)
{
struct policydb *p;
struct level_datum *levdatum;
- struct flex_array *fa;

levdatum = datum;
p = datap;
@@ -456,10 +442,8 @@ static int sens_index(void *key, void *datum, void *datap)
if (!levdatum->level->sens ||
levdatum->level->sens > p->p_levels.nprim)
return -EINVAL;
- fa = p->sym_val_to_name[SYM_LEVELS];
- if (flex_array_put_ptr(fa, levdatum->level->sens - 1, key,
- GFP_KERNEL | __GFP_ZERO))
- BUG();
+ *genradix_ptr(&p->sym_val_to_name[SYM_LEVELS],
+ levdatum->level->sens - 1) = key;
}

return 0;
@@ -469,7 +453,6 @@ static int cat_index(void *key, void *datum, void *datap)
{
struct policydb *p;
struct cat_datum *catdatum;
- struct flex_array *fa;

catdatum = datum;
p = datap;
@@ -477,10 +460,8 @@ static int cat_index(void *key, void *datum, void *datap)
if (!catdatum->isalias) {
if (!catdatum->value || catdatum->value > p->p_cats.nprim)
return -EINVAL;
- fa = p->sym_val_to_name[SYM_CATS];
- if (flex_array_put_ptr(fa, catdatum->value - 1, key,
- GFP_KERNEL | __GFP_ZERO))
- BUG();
+ *genradix_ptr(&p->sym_val_to_name[SYM_CATS],
+ catdatum->value - 1) = key;
}

return 0;
@@ -566,15 +547,10 @@ static int policydb_index(struct policydb *p)
if (!p->user_val_to_struct)
return -ENOMEM;

- /* Yes, I want the sizeof the pointer, not the structure */
- p->type_val_to_struct_array = flex_array_alloc(sizeof(struct type_datum *),
- p->p_types.nprim,
- GFP_KERNEL | __GFP_ZERO);
- if (!p->type_val_to_struct_array)
- return -ENOMEM;
+ genradix_init(&p->type_val_to_struct_array);

- rc = flex_array_prealloc(p->type_val_to_struct_array, 0,
- p->p_types.nprim, GFP_KERNEL | __GFP_ZERO);
+ rc = genradix_prealloc(&p->type_val_to_struct_array,
+ p->p_types.nprim, GFP_KERNEL);
if (rc)
goto out;

@@ -583,15 +559,10 @@ static int policydb_index(struct policydb *p)
goto out;

for (i = 0; i < SYM_NUM; i++) {
- p->sym_val_to_name[i] = flex_array_alloc(sizeof(char *),
- p->symtab[i].nprim,
- GFP_KERNEL | __GFP_ZERO);
- if (!p->sym_val_to_name[i])
- return -ENOMEM;
+ genradix_init(&p->sym_val_to_name[i]);

- rc = flex_array_prealloc(p->sym_val_to_name[i],
- 0, p->symtab[i].nprim,
- GFP_KERNEL | __GFP_ZERO);
+ rc = genradix_prealloc(&p->sym_val_to_name[i],
+ p->symtab[i].nprim, GFP_KERNEL);
if (rc)
goto out;

@@ -807,16 +778,13 @@ void policydb_destroy(struct policydb *p)
hashtab_destroy(p->symtab[i].table);
}

- for (i = 0; i < SYM_NUM; i++) {
- if (p->sym_val_to_name[i])
- flex_array_free(p->sym_val_to_name[i]);
- }
+ for (i = 0; i < SYM_NUM; i++)
+ genradix_free(&p->sym_val_to_name[i]);

kfree(p->class_val_to_struct);
kfree(p->role_val_to_struct);
kfree(p->user_val_to_struct);
- if (p->type_val_to_struct_array)
- flex_array_free(p->type_val_to_struct_array);
+ genradix_free(&p->type_val_to_struct_array);

avtab_destroy(&p->te_avtab);

@@ -869,17 +837,15 @@ void policydb_destroy(struct policydb *p)
hashtab_map(p->range_tr, range_tr_destroy, NULL);
hashtab_destroy(p->range_tr);

- if (p->type_attr_map_array) {
- for (i = 0; i < p->p_types.nprim; i++) {
- struct ebitmap *e;
+ for (i = 0; i < p->p_types.nprim; i++) {
+ struct ebitmap *e;

- e = flex_array_get(p->type_attr_map_array, i);
- if (!e)
- continue;
- ebitmap_destroy(e);
- }
- flex_array_free(p->type_attr_map_array);
+ e = genradix_ptr(&p->type_attr_map_array, i);
+ if (!e)
+ continue;
+ ebitmap_destroy(e);
}
+ genradix_free(&p->type_attr_map_array);

ebitmap_destroy(&p->filename_trans_ttypes);
ebitmap_destroy(&p->policycaps);
@@ -1760,8 +1726,8 @@ static int type_bounds_sanity_check(void *key, void *datum, void *datap)
return -EINVAL;
}

- upper = flex_array_get_ptr(p->type_val_to_struct_array,
- upper->bounds - 1);
+ upper = *genradix_ptr(&p->type_val_to_struct_array,
+ upper->bounds - 1);
BUG_ON(!upper);

if (upper->attribute) {
@@ -2518,21 +2484,16 @@ int policydb_read(struct policydb *p, void *fp)
if (rc)
goto bad;

- rc = -ENOMEM;
- p->type_attr_map_array = flex_array_alloc(sizeof(struct ebitmap),
- p->p_types.nprim,
- GFP_KERNEL | __GFP_ZERO);
- if (!p->type_attr_map_array)
- goto bad;
+ genradix_init(&p->type_attr_map_array);

/* preallocate so we don't have to worry about the put ever failing */
- rc = flex_array_prealloc(p->type_attr_map_array, 0, p->p_types.nprim,
- GFP_KERNEL | __GFP_ZERO);
+ rc = genradix_prealloc(&p->type_attr_map_array, p->p_types.nprim,
+ GFP_KERNEL);
if (rc)
goto bad;

for (i = 0; i < p->p_types.nprim; i++) {
- struct ebitmap *e = flex_array_get(p->type_attr_map_array, i);
+ struct ebitmap *e = genradix_ptr(&p->type_attr_map_array, i);

BUG_ON(!e);
ebitmap_init(e);
@@ -3523,7 +3484,7 @@ int policydb_write(struct policydb *p, void *fp)
return rc;

for (i = 0; i < p->p_types.nprim; i++) {
- struct ebitmap *e = flex_array_get(p->type_attr_map_array, i);
+ struct ebitmap *e = genradix_ptr(&p->type_attr_map_array, i);

BUG_ON(!e);
rc = ebitmap_write(e, fp);
diff --git a/security/selinux/ss/policydb.h b/security/selinux/ss/policydb.h
index 215f8f30ac..1ba102463e 100644
--- a/security/selinux/ss/policydb.h
+++ b/security/selinux/ss/policydb.h
@@ -24,7 +24,7 @@
#ifndef _SS_POLICYDB_H_
#define _SS_POLICYDB_H_

-#include <linux/flex_array.h>
+#include <linux/generic-radix-tree.h>

#include "symtab.h"
#include "avtab.h"
@@ -251,13 +251,13 @@ struct policydb {
#define p_cats symtab[SYM_CATS]

/* symbol names indexed by (value - 1) */
- struct flex_array *sym_val_to_name[SYM_NUM];
+ GENRADIX(char *) sym_val_to_name[SYM_NUM];

/* class, role, and user attributes indexed by (value - 1) */
struct class_datum **class_val_to_struct;
struct role_datum **role_val_to_struct;
struct user_datum **user_val_to_struct;
- struct flex_array *type_val_to_struct_array;
+ GENRADIX(struct type_datum *) type_val_to_struct_array;

/* type enforcement access vectors and transitions */
struct avtab te_avtab;
@@ -294,7 +294,7 @@ struct policydb {
struct hashtab *range_tr;

/* type -> attribute reverse mapping */
- struct flex_array *type_attr_map_array;
+ GENRADIX(struct ebitmap) type_attr_map_array;

struct ebitmap policycaps;

@@ -369,9 +369,7 @@ static inline int put_entry(const void *buf, size_t bytes, int num, struct polic

static inline char *sym_name(struct policydb *p, unsigned int sym_num, unsigned int element_nr)
{
- struct flex_array *fa = p->sym_val_to_name[sym_num];
-
- return flex_array_get_ptr(fa, element_nr);
+ return *genradix_ptr(&p->sym_val_to_name[sym_num], element_nr);
}

extern u16 string_to_security_class(struct policydb *p, const char *name);
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 8900ea5cba..79a56e4653 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -50,7 +50,6 @@
#include <linux/audit.h>
#include <linux/mutex.h>
#include <linux/selinux.h>
-#include <linux/flex_array.h>
#include <linux/vmalloc.h>
#include <net/netlabel.h>

@@ -562,15 +561,15 @@ static void type_attribute_bounds_av(struct context *scontext,
struct type_datum *target;
u32 masked = 0;

- source = flex_array_get_ptr(policydb.type_val_to_struct_array,
- scontext->type - 1);
+ source = *genradix_ptr(&policydb.type_val_to_struct_array,
+ scontext->type - 1);
BUG_ON(!source);

if (!source->bounds)
return;

- target = flex_array_get_ptr(policydb.type_val_to_struct_array,
- tcontext->type - 1);
+ target = *genradix_ptr(&policydb.type_val_to_struct_array,
+ tcontext->type - 1);
BUG_ON(!target);

memset(&lo_avd, 0, sizeof(lo_avd));
@@ -669,9 +668,9 @@ static void context_struct_compute_av(struct context *scontext,
*/
avkey.target_class = tclass;
avkey.specified = AVTAB_AV | AVTAB_XPERMS;
- sattr = flex_array_get(policydb.type_attr_map_array, scontext->type - 1);
+ sattr = genradix_ptr(&policydb.type_attr_map_array, scontext->type - 1);
BUG_ON(!sattr);
- tattr = flex_array_get(policydb.type_attr_map_array, tcontext->type - 1);
+ tattr = genradix_ptr(&policydb.type_attr_map_array, tcontext->type - 1);
BUG_ON(!tattr);
ebitmap_for_each_positive_bit(sattr, snode, i) {
ebitmap_for_each_positive_bit(tattr, tnode, j) {
@@ -895,8 +894,8 @@ int security_bounded_transition(u32 old_sid, u32 new_sid)

index = new_context->type;
while (true) {
- type = flex_array_get_ptr(policydb.type_val_to_struct_array,
- index - 1);
+ type = *genradix_ptr(&policydb.type_val_to_struct_array,
+ index - 1);
BUG_ON(!type);

/* not bounded anymore */
@@ -1053,11 +1052,11 @@ void security_compute_xperms_decision(u32 ssid,

avkey.target_class = tclass;
avkey.specified = AVTAB_XPERMS;
- sattr = flex_array_get(policydb.type_attr_map_array,
- scontext->type - 1);
+ sattr = genradix_ptr(&policydb.type_attr_map_array,
+ scontext->type - 1);
BUG_ON(!sattr);
- tattr = flex_array_get(policydb.type_attr_map_array,
- tcontext->type - 1);
+ tattr = genradix_ptr(&policydb.type_attr_map_array,
+ tcontext->type - 1);
BUG_ON(!tattr);
ebitmap_for_each_positive_bit(sattr, snode, i) {
ebitmap_for_each_positive_bit(tattr, tnode, j) {
--
2.17.0


2018-05-23 01:19:49

by Kent Overstreet

Subject: [PATCH 3/6] md: convert to genradix

The new generic radix trees have a simpler API and implementation, and
no limitation on the number of elements, so all flex_array users are
being converted.
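
Unlike the other conversions, raid5's scribble buffers have an element
size that is only known at runtime, so this patch uses the lower-level
byte-offset interface (__genradix_ptr(), __genradix_prealloc(),
__idx_to_offset()) directly. A sketch of that pattern, with hypothetical
helper names and a simplified obj_size:

	/* r is assumed to start out zeroed, i.e. an empty tree */
	static int scribble_setup(struct __genradix *r, size_t obj_size,
				  unsigned nr_entries, gfp_t gfp)
	{
		/* preallocate every entry up front; obj_size <= PAGE_SIZE */
		return __genradix_prealloc(r,
					   __idx_to_offset(nr_entries, obj_size),
					   gfp);
	}

	static void *scribble_entry(struct __genradix *r, size_t obj_size,
				    unsigned i)
	{
		/* index -> byte offset; entries never straddle a page */
		return __genradix_ptr(r, __idx_to_offset(i, obj_size));
	}

	static void scribble_exit(struct __genradix *r)
	{
		__genradix_free(r);
	}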

Signed-off-by: Kent Overstreet <[email protected]>
---
drivers/md/raid5-ppl.c | 5 ++-
drivers/md/raid5.c | 77 ++++++++++++++++++++----------------------
drivers/md/raid5.h | 4 ++-
3 files changed, 42 insertions(+), 44 deletions(-)

diff --git a/drivers/md/raid5-ppl.c b/drivers/md/raid5-ppl.c
index 42890a0837..ffb4ae0679 100644
--- a/drivers/md/raid5-ppl.c
+++ b/drivers/md/raid5-ppl.c
@@ -16,7 +16,6 @@
#include <linux/blkdev.h>
#include <linux/slab.h>
#include <linux/crc32c.h>
-#include <linux/flex_array.h>
#include <linux/async_tx.h>
#include <linux/raid/md_p.h>
#include "md.h"
@@ -165,7 +164,7 @@ ops_run_partial_parity(struct stripe_head *sh, struct raid5_percpu *percpu,
struct dma_async_tx_descriptor *tx)
{
int disks = sh->disks;
- struct page **srcs = flex_array_get(percpu->scribble, 0);
+ struct page **srcs = __genradix_ptr(&percpu->scribble, 0);
int count = 0, pd_idx = sh->pd_idx, i;
struct async_submit_ctl submit;

@@ -196,7 +195,7 @@ ops_run_partial_parity(struct stripe_head *sh, struct raid5_percpu *percpu,
}

init_async_submit(&submit, ASYNC_TX_FENCE|ASYNC_TX_XOR_ZERO_DST, tx,
- NULL, sh, flex_array_get(percpu->scribble, 0)
+ NULL, sh, __genradix_ptr(&percpu->scribble, 0)
+ sizeof(struct page *) * (sh->disks + 2));

if (count == 1)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index b5d2601483..31f88db83c 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -54,7 +54,6 @@
#include <linux/slab.h>
#include <linux/ratelimit.h>
#include <linux/nodemask.h>
-#include <linux/flex_array.h>

#include <trace/events/block.h>
#include <linux/list_sort.h>
@@ -1396,7 +1395,9 @@ static addr_conv_t *to_addr_conv(struct stripe_head *sh,
{
void *addr;

- addr = flex_array_get(percpu->scribble, i);
+ addr = __genradix_ptr(&percpu->scribble,
+ __idx_to_offset(i, percpu->scribble_obj_size));
+
return addr + sizeof(struct page *) * (sh->disks + 2);
}

@@ -1405,7 +1406,8 @@ static struct page **to_addr_page(struct raid5_percpu *percpu, int i)
{
void *addr;

- addr = flex_array_get(percpu->scribble, i);
+ addr = __genradix_ptr(&percpu->scribble,
+ __idx_to_offset(i, percpu->scribble_obj_size));
return addr;
}

@@ -2235,21 +2237,21 @@ static int grow_stripes(struct r5conf *conf, int num)
* calculate over all devices (not just the data blocks), using zeros in place
* of the P and Q blocks.
*/
-static struct flex_array *scribble_alloc(int num, int cnt, gfp_t flags)
+static int scribble_alloc(struct raid5_percpu *percpu,
+ int num, int cnt, gfp_t flags)
{
- struct flex_array *ret;
- size_t len;
+ size_t obj_size =
+ sizeof(struct page *) * (num+2) +
+ sizeof(addr_conv_t) * (num+2);
+ int ret;

- len = sizeof(struct page *) * (num+2) + sizeof(addr_conv_t) * (num+2);
- ret = flex_array_alloc(len, cnt, flags);
- if (!ret)
- return NULL;
- /* always prealloc all elements, so no locking is required */
- if (flex_array_prealloc(ret, 0, cnt, flags)) {
- flex_array_free(ret);
- return NULL;
- }
- return ret;
+ ret = __genradix_prealloc(&percpu->scribble,
+ __idx_to_offset(cnt + 1, obj_size), flags);
+ if (ret)
+ return ret;
+
+ percpu->scribble_obj_size = obj_size;
+ return 0;
}

static int resize_chunks(struct r5conf *conf, int new_disks, int new_sectors)
@@ -2267,23 +2269,18 @@ static int resize_chunks(struct r5conf *conf, int new_disks, int new_sectors)
return 0;
mddev_suspend(conf->mddev);
get_online_cpus();
+
for_each_present_cpu(cpu) {
struct raid5_percpu *percpu;
- struct flex_array *scribble;

percpu = per_cpu_ptr(conf->percpu, cpu);
- scribble = scribble_alloc(new_disks,
- new_sectors / STRIPE_SECTORS,
- GFP_NOIO);
-
- if (scribble) {
- flex_array_free(percpu->scribble);
- percpu->scribble = scribble;
- } else {
- err = -ENOMEM;
+ err = scribble_alloc(percpu, new_disks,
+ new_sectors / STRIPE_SECTORS,
+ GFP_NOIO);
+ if (err)
break;
- }
}
+
put_online_cpus();
mddev_resume(conf->mddev);
if (!err) {
@@ -6716,25 +6713,25 @@ raid5_size(struct mddev *mddev, sector_t sectors, int raid_disks)
static void free_scratch_buffer(struct r5conf *conf, struct raid5_percpu *percpu)
{
safe_put_page(percpu->spare_page);
- if (percpu->scribble)
- flex_array_free(percpu->scribble);
percpu->spare_page = NULL;
- percpu->scribble = NULL;
+ __genradix_free(&percpu->scribble);
}

static int alloc_scratch_buffer(struct r5conf *conf, struct raid5_percpu *percpu)
{
- if (conf->level == 6 && !percpu->spare_page)
+ if (conf->level == 6 && !percpu->spare_page) {
percpu->spare_page = alloc_page(GFP_KERNEL);
- if (!percpu->scribble)
- percpu->scribble = scribble_alloc(max(conf->raid_disks,
- conf->previous_raid_disks),
- max(conf->chunk_sectors,
- conf->prev_chunk_sectors)
- / STRIPE_SECTORS,
- GFP_KERNEL);
-
- if (!percpu->scribble || (conf->level == 6 && !percpu->spare_page)) {
+ if (!percpu->spare_page)
+ return -ENOMEM;
+ }
+
+ if (scribble_alloc(percpu,
+ max(conf->raid_disks,
+ conf->previous_raid_disks),
+ max(conf->chunk_sectors,
+ conf->prev_chunk_sectors)
+ / STRIPE_SECTORS,
+ GFP_KERNEL)) {
free_scratch_buffer(conf, percpu);
return -ENOMEM;
}
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 3f8da26032..2c39bfce32 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -4,6 +4,7 @@

#include <linux/raid/xor.h>
#include <linux/dmaengine.h>
+#include <linux/generic-radix-tree.h>

/*
*
@@ -637,10 +638,11 @@ struct r5conf {
/* per cpu variables */
struct raid5_percpu {
struct page *spare_page; /* Used when checking P/Q in raid6 */
- struct flex_array *scribble; /* space for constructing buffer
+ struct __genradix scribble; /* space for constructing buffer
* lists and performing address
* conversions
*/
+ int scribble_obj_size;
} __percpu *percpu;
int scribble_disks;
int scribble_sectors;
--
2.17.0


2018-05-23 01:19:54

by Kent Overstreet

Subject: [PATCH 6/6] Drop flex_arrays

All existing flex_array users have been converted to generic radix trees.

Signed-off-by: Kent Overstreet <[email protected]>
---
Documentation/core-api/flexible-arrays.rst | 130 -------
Documentation/flexible-arrays.txt | 123 -------
include/linux/flex_array.h | 149 --------
include/linux/poison.h | 3 -
lib/Makefile | 2 +-
lib/flex_array.c | 398 ---------------------
tools/include/linux/poison.h | 3 -
7 files changed, 1 insertion(+), 807 deletions(-)
delete mode 100644 Documentation/core-api/flexible-arrays.rst
delete mode 100644 Documentation/flexible-arrays.txt
delete mode 100644 include/linux/flex_array.h
delete mode 100644 lib/flex_array.c

diff --git a/Documentation/core-api/flexible-arrays.rst b/Documentation/core-api/flexible-arrays.rst
deleted file mode 100644
index b6b85a1b51..0000000000
--- a/Documentation/core-api/flexible-arrays.rst
+++ /dev/null
@@ -1,130 +0,0 @@
-
-===================================
-Using flexible arrays in the kernel
-===================================
-
-Large contiguous memory allocations can be unreliable in the Linux kernel.
-Kernel programmers will sometimes respond to this problem by allocating
-pages with :c:func:`vmalloc()`. This solution not ideal, though. On 32-bit
-systems, memory from vmalloc() must be mapped into a relatively small address
-space; it's easy to run out. On SMP systems, the page table changes required
-by vmalloc() allocations can require expensive cross-processor interrupts on
-all CPUs. And, on all systems, use of space in the vmalloc() range increases
-pressure on the translation lookaside buffer (TLB), reducing the performance
-of the system.
-
-In many cases, the need for memory from vmalloc() can be eliminated by piecing
-together an array from smaller parts; the flexible array library exists to make
-this task easier.
-
-A flexible array holds an arbitrary (within limits) number of fixed-sized
-objects, accessed via an integer index. Sparse arrays are handled
-reasonably well. Only single-page allocations are made, so memory
-allocation failures should be relatively rare. The down sides are that the
-arrays cannot be indexed directly, individual object size cannot exceed the
-system page size, and putting data into a flexible array requires a copy
-operation. It's also worth noting that flexible arrays do no internal
-locking at all; if concurrent access to an array is possible, then the
-caller must arrange for appropriate mutual exclusion.
-
-The creation of a flexible array is done with :c:func:`flex_array_alloc()`::
-
- #include <linux/flex_array.h>
-
- struct flex_array *flex_array_alloc(int element_size,
- unsigned int total,
- gfp_t flags);
-
-The individual object size is provided by ``element_size``, while total is the
-maximum number of objects which can be stored in the array. The flags
-argument is passed directly to the internal memory allocation calls. With
-the current code, using flags to ask for high memory is likely to lead to
-notably unpleasant side effects.
-
-It is also possible to define flexible arrays at compile time with::
-
- DEFINE_FLEX_ARRAY(name, element_size, total);
-
-This macro will result in a definition of an array with the given name; the
-element size and total will be checked for validity at compile time.
-
-Storing data into a flexible array is accomplished with a call to
-:c:func:`flex_array_put()`::
-
- int flex_array_put(struct flex_array *array, unsigned int element_nr,
- void *src, gfp_t flags);
-
-This call will copy the data from src into the array, in the position
-indicated by ``element_nr`` (which must be less than the maximum specified when
-the array was created). If any memory allocations must be performed, flags
-will be used. The return value is zero on success, a negative error code
-otherwise.
-
-There might possibly be a need to store data into a flexible array while
-running in some sort of atomic context; in this situation, sleeping in the
-memory allocator would be a bad thing. That can be avoided by using
-``GFP_ATOMIC`` for the flags value, but, often, there is a better way. The
-trick is to ensure that any needed memory allocations are done before
-entering atomic context, using :c:func:`flex_array_prealloc()`::
-
- int flex_array_prealloc(struct flex_array *array, unsigned int start,
- unsigned int nr_elements, gfp_t flags);
-
-This function will ensure that memory for the elements indexed in the range
-defined by ``start`` and ``nr_elements`` has been allocated. Thereafter, a
-``flex_array_put()`` call on an element in that range is guaranteed not to
-block.
-
-Getting data back out of the array is done with :c:func:`flex_array_get()`::
-
- void *flex_array_get(struct flex_array *fa, unsigned int element_nr);
-
-The return value is a pointer to the data element, or NULL if that
-particular element has never been allocated.
-
-Note that it is possible to get back a valid pointer for an element which
-has never been stored in the array. Memory for array elements is allocated
-one page at a time; a single allocation could provide memory for several
-adjacent elements. Flexible array elements are normally initialized to the
-value ``FLEX_ARRAY_FREE`` (defined as 0x6c in <linux/poison.h>), so errors
-involving that number probably result from use of unstored array entries.
-Note that, if array elements are allocated with ``__GFP_ZERO``, they will be
-initialized to zero and this poisoning will not happen.
-
-Individual elements in the array can be cleared with
-:c:func:`flex_array_clear()`::
-
- int flex_array_clear(struct flex_array *array, unsigned int element_nr);
-
-This function will set the given element to ``FLEX_ARRAY_FREE`` and return
-zero. If storage for the indicated element is not allocated for the array,
-``flex_array_clear()`` will return ``-EINVAL`` instead. Note that clearing an
-element does not release the storage associated with it; to reduce the
-allocated size of an array, call :c:func:`flex_array_shrink()`::
-
- int flex_array_shrink(struct flex_array *array);
-
-The return value will be the number of pages of memory actually freed.
-This function works by scanning the array for pages containing nothing but
-``FLEX_ARRAY_FREE`` bytes, so (1) it can be expensive, and (2) it will not work
-if the array's pages are allocated with ``__GFP_ZERO``.
-
-It is possible to remove all elements of an array with a call to
-:c:func:`flex_array_free_parts()`::
-
- void flex_array_free_parts(struct flex_array *array);
-
-This call frees all elements, but leaves the array itself in place.
-Freeing the entire array is done with :c:func:`flex_array_free()`::
-
- void flex_array_free(struct flex_array *array);
-
-As of this writing, there are no users of flexible arrays in the mainline
-kernel. The functions described here are also not exported to modules;
-that will probably be fixed when somebody comes up with a need for it.
-
-
-Flexible array functions
-------------------------
-
-.. kernel-doc:: include/linux/flex_array.h
diff --git a/Documentation/flexible-arrays.txt b/Documentation/flexible-arrays.txt
deleted file mode 100644
index a0f2989dd8..0000000000
--- a/Documentation/flexible-arrays.txt
+++ /dev/null
@@ -1,123 +0,0 @@
-===================================
-Using flexible arrays in the kernel
-===================================
-
-:Updated: Last updated for 2.6.32
-:Author: Jonathan Corbet <[email protected]>
-
-Large contiguous memory allocations can be unreliable in the Linux kernel.
-Kernel programmers will sometimes respond to this problem by allocating
-pages with vmalloc(). This solution not ideal, though. On 32-bit systems,
-memory from vmalloc() must be mapped into a relatively small address space;
-it's easy to run out. On SMP systems, the page table changes required by
-vmalloc() allocations can require expensive cross-processor interrupts on
-all CPUs. And, on all systems, use of space in the vmalloc() range
-increases pressure on the translation lookaside buffer (TLB), reducing the
-performance of the system.
-
-In many cases, the need for memory from vmalloc() can be eliminated by
-piecing together an array from smaller parts; the flexible array library
-exists to make this task easier.
-
-A flexible array holds an arbitrary (within limits) number of fixed-sized
-objects, accessed via an integer index. Sparse arrays are handled
-reasonably well. Only single-page allocations are made, so memory
-allocation failures should be relatively rare. The down sides are that the
-arrays cannot be indexed directly, individual object size cannot exceed the
-system page size, and putting data into a flexible array requires a copy
-operation. It's also worth noting that flexible arrays do no internal
-locking at all; if concurrent access to an array is possible, then the
-caller must arrange for appropriate mutual exclusion.
-
-The creation of a flexible array is done with::
-
- #include <linux/flex_array.h>
-
- struct flex_array *flex_array_alloc(int element_size,
- unsigned int total,
- gfp_t flags);
-
-The individual object size is provided by element_size, while total is the
-maximum number of objects which can be stored in the array. The flags
-argument is passed directly to the internal memory allocation calls. With
-the current code, using flags to ask for high memory is likely to lead to
-notably unpleasant side effects.
-
-It is also possible to define flexible arrays at compile time with::
-
- DEFINE_FLEX_ARRAY(name, element_size, total);
-
-This macro will result in a definition of an array with the given name; the
-element size and total will be checked for validity at compile time.
-
-Storing data into a flexible array is accomplished with a call to::
-
- int flex_array_put(struct flex_array *array, unsigned int element_nr,
- void *src, gfp_t flags);
-
-This call will copy the data from src into the array, in the position
-indicated by element_nr (which must be less than the maximum specified when
-the array was created). If any memory allocations must be performed, flags
-will be used. The return value is zero on success, a negative error code
-otherwise.
-
-There might possibly be a need to store data into a flexible array while
-running in some sort of atomic context; in this situation, sleeping in the
-memory allocator would be a bad thing. That can be avoided by using
-GFP_ATOMIC for the flags value, but, often, there is a better way. The
-trick is to ensure that any needed memory allocations are done before
-entering atomic context, using::
-
- int flex_array_prealloc(struct flex_array *array, unsigned int start,
- unsigned int nr_elements, gfp_t flags);
-
-This function will ensure that memory for the elements indexed in the range
-defined by start and nr_elements has been allocated. Thereafter, a
-flex_array_put() call on an element in that range is guaranteed not to
-block.
-
-Getting data back out of the array is done with::
-
- void *flex_array_get(struct flex_array *fa, unsigned int element_nr);
-
-The return value is a pointer to the data element, or NULL if that
-particular element has never been allocated.
-
-Note that it is possible to get back a valid pointer for an element which
-has never been stored in the array. Memory for array elements is allocated
-one page at a time; a single allocation could provide memory for several
-adjacent elements. Flexible array elements are normally initialized to the
-value FLEX_ARRAY_FREE (defined as 0x6c in <linux/poison.h>), so errors
-involving that number probably result from use of unstored array entries.
-Note that, if array elements are allocated with __GFP_ZERO, they will be
-initialized to zero and this poisoning will not happen.
-
-Individual elements in the array can be cleared with::
-
- int flex_array_clear(struct flex_array *array, unsigned int element_nr);
-
-This function will set the given element to FLEX_ARRAY_FREE and return
-zero. If storage for the indicated element is not allocated for the array,
-flex_array_clear() will return -EINVAL instead. Note that clearing an
-element does not release the storage associated with it; to reduce the
-allocated size of an array, call::
-
- int flex_array_shrink(struct flex_array *array);
-
-The return value will be the number of pages of memory actually freed.
-This function works by scanning the array for pages containing nothing but
-FLEX_ARRAY_FREE bytes, so (1) it can be expensive, and (2) it will not work
-if the array's pages are allocated with __GFP_ZERO.
-
-It is possible to remove all elements of an array with a call to::
-
- void flex_array_free_parts(struct flex_array *array);
-
-This call frees all elements, but leaves the array itself in place.
-Freeing the entire array is done with::
-
- void flex_array_free(struct flex_array *array);
-
-As of this writing, there are no users of flexible arrays in the mainline
-kernel. The functions described here are also not exported to modules;
-that will probably be fixed when somebody comes up with a need for it.
diff --git a/include/linux/flex_array.h b/include/linux/flex_array.h
deleted file mode 100644
index b94fa61b51..0000000000
--- a/include/linux/flex_array.h
+++ /dev/null
@@ -1,149 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _FLEX_ARRAY_H
-#define _FLEX_ARRAY_H
-
-#include <linux/types.h>
-#include <linux/reciprocal_div.h>
-#include <asm/page.h>
-
-#define FLEX_ARRAY_PART_SIZE PAGE_SIZE
-#define FLEX_ARRAY_BASE_SIZE PAGE_SIZE
-
-struct flex_array_part;
-
-/*
- * This is meant to replace cases where an array-like
- * structure has gotten too big to fit into kmalloc()
- * and the developer is getting tempted to use
- * vmalloc().
- */
-
-struct flex_array {
- union {
- struct {
- int element_size;
- int total_nr_elements;
- int elems_per_part;
- struct reciprocal_value reciprocal_elems;
- struct flex_array_part *parts[];
- };
- /*
- * This little trick makes sure that
- * sizeof(flex_array) == PAGE_SIZE
- */
- char padding[FLEX_ARRAY_BASE_SIZE];
- };
-};
-
-/* Number of bytes left in base struct flex_array, excluding metadata */
-#define FLEX_ARRAY_BASE_BYTES_LEFT \
- (FLEX_ARRAY_BASE_SIZE - offsetof(struct flex_array, parts))
-
-/* Number of pointers in base to struct flex_array_part pages */
-#define FLEX_ARRAY_NR_BASE_PTRS \
- (FLEX_ARRAY_BASE_BYTES_LEFT / sizeof(struct flex_array_part *))
-
-/* Number of elements of size that fit in struct flex_array_part */
-#define FLEX_ARRAY_ELEMENTS_PER_PART(size) \
- (FLEX_ARRAY_PART_SIZE / size)
-
-/*
- * Defines a statically allocated flex array and ensures its parameters are
- * valid.
- */
-#define DEFINE_FLEX_ARRAY(__arrayname, __element_size, __total) \
- struct flex_array __arrayname = { { { \
- .element_size = (__element_size), \
- .total_nr_elements = (__total), \
- } } }; \
- static inline void __arrayname##_invalid_parameter(void) \
- { \
- BUILD_BUG_ON((__total) > FLEX_ARRAY_NR_BASE_PTRS * \
- FLEX_ARRAY_ELEMENTS_PER_PART(__element_size)); \
- }
-
-/**
- * flex_array_alloc() - Creates a flexible array.
- * @element_size: individual object size.
- * @total: maximum number of objects which can be stored.
- * @flags: GFP flags
- *
- * Return: Returns an object of structure flex_array.
- */
-struct flex_array *flex_array_alloc(int element_size, unsigned int total,
- gfp_t flags);
-
-/**
- * flex_array_prealloc() - Ensures that memory for the elements indexed in the
- * range defined by start and nr_elements has been allocated.
- * @fa: array to allocate memory to.
- * @start: start address
- * @nr_elements: number of elements to be allocated.
- * @flags: GFP flags
- *
- */
-int flex_array_prealloc(struct flex_array *fa, unsigned int start,
- unsigned int nr_elements, gfp_t flags);
-
-/**
- * flex_array_free() - Removes all elements of a flexible array.
- * @fa: array to be freed.
- */
-void flex_array_free(struct flex_array *fa);
-
-/**
- * flex_array_free_parts() - Removes all elements of a flexible array, but
- * leaves the array itself in place.
- * @fa: array to be emptied.
- */
-void flex_array_free_parts(struct flex_array *fa);
-
-/**
- * flex_array_put() - Stores data into a flexible array.
- * @fa: array where element is to be stored.
- * @element_nr: position to copy, must be less than the maximum specified when
- * the array was created.
- * @src: data source to be copied into the array.
- * @flags: GFP flags
- *
- * Return: Returns zero on success, a negative error code otherwise.
- */
-int flex_array_put(struct flex_array *fa, unsigned int element_nr, void *src,
- gfp_t flags);
-
-/**
- * flex_array_clear() - Clears an individual element in the array, sets the
- * given element to FLEX_ARRAY_FREE.
- * @element_nr: element position to clear.
- * @fa: array to which element to be cleared belongs.
- *
- * Return: Returns zero on success, -EINVAL otherwise.
- */
-int flex_array_clear(struct flex_array *fa, unsigned int element_nr);
-
-/**
- * flex_array_get() - Retrieves data into a flexible array.
- *
- * @element_nr: Element position to retrieve data from.
- * @fa: array from which data is to be retrieved.
- *
- * Return: Returns a pointer to the data element, or NULL if that
- * particular element has never been allocated.
- */
-void *flex_array_get(struct flex_array *fa, unsigned int element_nr);
-
-/**
- * flex_array_shrink() - Reduces the allocated size of an array.
- * @fa: array to shrink.
- *
- * Return: Returns number of pages of memory actually freed.
- *
- */
-int flex_array_shrink(struct flex_array *fa);
-
-#define flex_array_put_ptr(fa, nr, src, gfp) \
- flex_array_put(fa, nr, (void *)&(src), gfp)
-
-void *flex_array_get_ptr(struct flex_array *fa, unsigned int element_nr);
-
-#endif /* _FLEX_ARRAY_H */
diff --git a/include/linux/poison.h b/include/linux/poison.h
index 15927ebc22..10173f989a 100644
--- a/include/linux/poison.h
+++ b/include/linux/poison.h
@@ -83,9 +83,6 @@
#define MUTEX_DEBUG_FREE 0x22
#define MUTEX_POISON_WW_CTX ((void *) 0x500 + POISON_POINTER_DELTA)

-/********** lib/flex_array.c **********/
-#define FLEX_ARRAY_FREE 0x6c /* for use-after-free poisoning */
-
/********** security/ **********/
#define KEY_DESTROY 0xbd

diff --git a/lib/Makefile b/lib/Makefile
index 5db5a7fb1e..2c5245d683 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -36,7 +36,7 @@ obj-y += lockref.o

obj-y += bcd.o div64.o sort.o parser.o debug_locks.o random32.o \
bust_spinlocks.o kasprintf.o bitmap.o scatterlist.o \
- gcd.o lcm.o list_sort.o uuid.o flex_array.o iov_iter.o clz_ctz.o \
+ gcd.o lcm.o list_sort.o uuid.o iov_iter.o clz_ctz.o \
bsearch.o find_bit.o llist.o memweight.o kfifo.o \
percpu-refcount.o percpu_ida.o rhashtable.o reciprocal_div.o \
once.o refcount.o usercopy.o errseq.o bucket_locks.o \
diff --git a/lib/flex_array.c b/lib/flex_array.c
deleted file mode 100644
index 2eed22fa50..0000000000
--- a/lib/flex_array.c
+++ /dev/null
@@ -1,398 +0,0 @@
-/*
- * Flexible array managed in PAGE_SIZE parts
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
- *
- * Copyright IBM Corporation, 2009
- *
- * Author: Dave Hansen <[email protected]>
- */
-
-#include <linux/flex_array.h>
-#include <linux/slab.h>
-#include <linux/stddef.h>
-#include <linux/export.h>
-#include <linux/reciprocal_div.h>
-
-struct flex_array_part {
- char elements[FLEX_ARRAY_PART_SIZE];
-};
-
-/*
- * If a user requests an allocation which is small
- * enough, we may simply use the space in the
- * flex_array->parts[] array to store the user
- * data.
- */
-static inline int elements_fit_in_base(struct flex_array *fa)
-{
- int data_size = fa->element_size * fa->total_nr_elements;
- if (data_size <= FLEX_ARRAY_BASE_BYTES_LEFT)
- return 1;
- return 0;
-}
-
-/**
- * flex_array_alloc - allocate a new flexible array
- * @element_size: the size of individual elements in the array
- * @total: total number of elements that this should hold
- * @flags: page allocation flags to use for base array
- *
- * Note: all locking must be provided by the caller.
- *
- * @total is used to size internal structures. If the user ever
- * accesses any array indexes >=@total, it will produce errors.
- *
- * The maximum number of elements is defined as: the number of
- * elements that can be stored in a page times the number of
- * page pointers that we can fit in the base structure or (using
- * integer math):
- *
- * (PAGE_SIZE/element_size) * (PAGE_SIZE-8)/sizeof(void *)
- *
- * Here's a table showing example capacities. Note that the maximum
- * index that the get/put() functions is just nr_objects-1. This
- * basically means that you get 4MB of storage on 32-bit and 2MB on
- * 64-bit.
- *
- *
- * Element size | Objects | Objects |
- * PAGE_SIZE=4k | 32-bit | 64-bit |
- * ---------------------------------|
- * 1 bytes | 4177920 | 2088960 |
- * 2 bytes | 2088960 | 1044480 |
- * 3 bytes | 1392300 | 696150 |
- * 4 bytes | 1044480 | 522240 |
- * 32 bytes | 130560 | 65408 |
- * 33 bytes | 126480 | 63240 |
- * 2048 bytes | 2040 | 1020 |
- * 2049 bytes | 1020 | 510 |
- * void * | 1044480 | 261120 |
- *
- * Since 64-bit pointers are twice the size, we lose half the
- * capacity in the base structure. Also note that no effort is made
- * to efficiently pack objects across page boundaries.
- */
-struct flex_array *flex_array_alloc(int element_size, unsigned int total,
- gfp_t flags)
-{
- struct flex_array *ret;
- int elems_per_part = 0;
- int max_size = 0;
- struct reciprocal_value reciprocal_elems = { 0 };
-
- if (element_size) {
- elems_per_part = FLEX_ARRAY_ELEMENTS_PER_PART(element_size);
- reciprocal_elems = reciprocal_value(elems_per_part);
- max_size = FLEX_ARRAY_NR_BASE_PTRS * elems_per_part;
- }
-
- /* max_size will end up 0 if element_size > PAGE_SIZE */
- if (total > max_size)
- return NULL;
- ret = kzalloc(sizeof(struct flex_array), flags);
- if (!ret)
- return NULL;
- ret->element_size = element_size;
- ret->total_nr_elements = total;
- ret->elems_per_part = elems_per_part;
- ret->reciprocal_elems = reciprocal_elems;
- if (elements_fit_in_base(ret) && !(flags & __GFP_ZERO))
- memset(&ret->parts[0], FLEX_ARRAY_FREE,
- FLEX_ARRAY_BASE_BYTES_LEFT);
- return ret;
-}
-EXPORT_SYMBOL(flex_array_alloc);
-
-static int fa_element_to_part_nr(struct flex_array *fa,
- unsigned int element_nr)
-{
- /*
- * if element_size == 0 we don't get here, so we never touch
- * the zeroed fa->reciprocal_elems, which would yield invalid
- * results
- */
- return reciprocal_divide(element_nr, fa->reciprocal_elems);
-}
-
-/**
- * flex_array_free_parts - just free the second-level pages
- * @fa: the flex array from which to free parts
- *
- * This is to be used in cases where the base 'struct flex_array'
- * has been statically allocated and should not be free.
- */
-void flex_array_free_parts(struct flex_array *fa)
-{
- int part_nr;
-
- if (elements_fit_in_base(fa))
- return;
- for (part_nr = 0; part_nr < FLEX_ARRAY_NR_BASE_PTRS; part_nr++)
- kfree(fa->parts[part_nr]);
-}
-EXPORT_SYMBOL(flex_array_free_parts);
-
-void flex_array_free(struct flex_array *fa)
-{
- flex_array_free_parts(fa);
- kfree(fa);
-}
-EXPORT_SYMBOL(flex_array_free);
-
-static unsigned int index_inside_part(struct flex_array *fa,
- unsigned int element_nr,
- unsigned int part_nr)
-{
- unsigned int part_offset;
-
- part_offset = element_nr - part_nr * fa->elems_per_part;
- return part_offset * fa->element_size;
-}
-
-static struct flex_array_part *
-__fa_get_part(struct flex_array *fa, int part_nr, gfp_t flags)
-{
- struct flex_array_part *part = fa->parts[part_nr];
- if (!part) {
- part = kmalloc(sizeof(struct flex_array_part), flags);
- if (!part)
- return NULL;
- if (!(flags & __GFP_ZERO))
- memset(part, FLEX_ARRAY_FREE,
- sizeof(struct flex_array_part));
- fa->parts[part_nr] = part;
- }
- return part;
-}
-
-/**
- * flex_array_put - copy data into the array at @element_nr
- * @fa: the flex array to copy data into
- * @element_nr: index of the position in which to insert
- * the new element.
- * @src: address of data to copy into the array
- * @flags: page allocation flags to use for array expansion
- *
- *
- * Note that this *copies* the contents of @src into
- * the array. If you are trying to store an array of
- * pointers, make sure to pass in &ptr instead of ptr.
- * You may instead wish to use the flex_array_put_ptr()
- * helper function.
- *
- * Locking must be provided by the caller.
- */
-int flex_array_put(struct flex_array *fa, unsigned int element_nr, void *src,
- gfp_t flags)
-{
- int part_nr = 0;
- struct flex_array_part *part;
- void *dst;
-
- if (element_nr >= fa->total_nr_elements)
- return -ENOSPC;
- if (!fa->element_size)
- return 0;
- if (elements_fit_in_base(fa))
- part = (struct flex_array_part *)&fa->parts[0];
- else {
- part_nr = fa_element_to_part_nr(fa, element_nr);
- part = __fa_get_part(fa, part_nr, flags);
- if (!part)
- return -ENOMEM;
- }
- dst = &part->elements[index_inside_part(fa, element_nr, part_nr)];
- memcpy(dst, src, fa->element_size);
- return 0;
-}
-EXPORT_SYMBOL(flex_array_put);
-
-/**
- * flex_array_clear - clear element in array at @element_nr
- * @fa: the flex array of the element.
- * @element_nr: index of the position to clear.
- *
- * Locking must be provided by the caller.
- */
-int flex_array_clear(struct flex_array *fa, unsigned int element_nr)
-{
- int part_nr = 0;
- struct flex_array_part *part;
- void *dst;
-
- if (element_nr >= fa->total_nr_elements)
- return -ENOSPC;
- if (!fa->element_size)
- return 0;
- if (elements_fit_in_base(fa))
- part = (struct flex_array_part *)&fa->parts[0];
- else {
- part_nr = fa_element_to_part_nr(fa, element_nr);
- part = fa->parts[part_nr];
- if (!part)
- return -EINVAL;
- }
- dst = &part->elements[index_inside_part(fa, element_nr, part_nr)];
- memset(dst, FLEX_ARRAY_FREE, fa->element_size);
- return 0;
-}
-EXPORT_SYMBOL(flex_array_clear);
-
-/**
- * flex_array_prealloc - guarantee that array space exists
- * @fa: the flex array for which to preallocate parts
- * @start: index of first array element for which space is allocated
- * @nr_elements: number of elements for which space is allocated
- * @flags: page allocation flags
- *
- * This will guarantee that no future calls to flex_array_put()
- * will allocate memory. It can be used if you are expecting to
- * be holding a lock or in some atomic context while writing
- * data into the array.
- *
- * Locking must be provided by the caller.
- */
-int flex_array_prealloc(struct flex_array *fa, unsigned int start,
- unsigned int nr_elements, gfp_t flags)
-{
- int start_part;
- int end_part;
- int part_nr;
- unsigned int end;
- struct flex_array_part *part;
-
- if (!start && !nr_elements)
- return 0;
- if (start >= fa->total_nr_elements)
- return -ENOSPC;
- if (!nr_elements)
- return 0;
-
- end = start + nr_elements - 1;
-
- if (end >= fa->total_nr_elements)
- return -ENOSPC;
- if (!fa->element_size)
- return 0;
- if (elements_fit_in_base(fa))
- return 0;
- start_part = fa_element_to_part_nr(fa, start);
- end_part = fa_element_to_part_nr(fa, end);
- for (part_nr = start_part; part_nr <= end_part; part_nr++) {
- part = __fa_get_part(fa, part_nr, flags);
- if (!part)
- return -ENOMEM;
- }
- return 0;
-}
-EXPORT_SYMBOL(flex_array_prealloc);
-
-/**
- * flex_array_get - pull data back out of the array
- * @fa: the flex array from which to extract data
- * @element_nr: index of the element to fetch from the array
- *
- * Returns a pointer to the data at index @element_nr. Note
- * that this is a copy of the data that was passed in. If you
- * are using this to store pointers, you'll get back &ptr. You
- * may instead wish to use the flex_array_get_ptr helper.
- *
- * Locking must be provided by the caller.
- */
-void *flex_array_get(struct flex_array *fa, unsigned int element_nr)
-{
- int part_nr = 0;
- struct flex_array_part *part;
-
- if (!fa->element_size)
- return NULL;
- if (element_nr >= fa->total_nr_elements)
- return NULL;
- if (elements_fit_in_base(fa))
- part = (struct flex_array_part *)&fa->parts[0];
- else {
- part_nr = fa_element_to_part_nr(fa, element_nr);
- part = fa->parts[part_nr];
- if (!part)
- return NULL;
- }
- return &part->elements[index_inside_part(fa, element_nr, part_nr)];
-}
-EXPORT_SYMBOL(flex_array_get);
-
-/**
- * flex_array_get_ptr - pull a ptr back out of the array
- * @fa: the flex array from which to extract data
- * @element_nr: index of the element to fetch from the array
- *
- * Returns the pointer placed in the flex array at element_nr using
- * flex_array_put_ptr(). This function should not be called if the
- * element in question was not set using the _put_ptr() helper.
- */
-void *flex_array_get_ptr(struct flex_array *fa, unsigned int element_nr)
-{
- void **tmp;
-
- tmp = flex_array_get(fa, element_nr);
- if (!tmp)
- return NULL;
-
- return *tmp;
-}
-EXPORT_SYMBOL(flex_array_get_ptr);
-
-static int part_is_free(struct flex_array_part *part)
-{
- int i;
-
- for (i = 0; i < sizeof(struct flex_array_part); i++)
- if (part->elements[i] != FLEX_ARRAY_FREE)
- return 0;
- return 1;
-}
-
-/**
- * flex_array_shrink - free unused second-level pages
- * @fa: the flex array to shrink
- *
- * Frees all second-level pages that consist solely of unused
- * elements. Returns the number of pages freed.
- *
- * Locking must be provided by the caller.
- */
-int flex_array_shrink(struct flex_array *fa)
-{
- struct flex_array_part *part;
- int part_nr;
- int ret = 0;
-
- if (!fa->total_nr_elements || !fa->element_size)
- return 0;
- if (elements_fit_in_base(fa))
- return ret;
- for (part_nr = 0; part_nr < FLEX_ARRAY_NR_BASE_PTRS; part_nr++) {
- part = fa->parts[part_nr];
- if (!part)
- continue;
- if (part_is_free(part)) {
- fa->parts[part_nr] = NULL;
- kfree(part);
- ret++;
- }
- }
- return ret;
-}
-EXPORT_SYMBOL(flex_array_shrink);
diff --git a/tools/include/linux/poison.h b/tools/include/linux/poison.h
index 9fdcd3eaac..d297257691 100644
--- a/tools/include/linux/poison.h
+++ b/tools/include/linux/poison.h
@@ -87,9 +87,6 @@
#define MUTEX_DEBUG_INIT 0x11
#define MUTEX_DEBUG_FREE 0x22

-/********** lib/flex_array.c **********/
-#define FLEX_ARRAY_FREE 0x6c /* for use-after-free poisoning */
-
/********** security/ **********/
#define KEY_DESTROY 0xbd

--
2.17.0


2018-05-23 01:20:58

by Kent Overstreet

[permalink] [raw]
Subject: [PATCH 4/6] openvswitch: convert to genradix

the new generic radix trees have a simpler API and implementation, and
no limitations on number of elements, so all flex_array users are being
converted

Signed-off-by: Kent Overstreet <[email protected]>
---
net/openvswitch/flow.h | 1 -
net/openvswitch/flow_netlink.h | 1 -
net/openvswitch/flow_table.c | 45 ++++++++++------------------------
net/openvswitch/flow_table.h | 4 +--
4 files changed, 15 insertions(+), 36 deletions(-)

diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index c670dd24b8..4f06278166 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -30,7 +30,6 @@
#include <linux/in6.h>
#include <linux/jiffies.h>
#include <linux/time.h>
-#include <linux/flex_array.h>
#include <linux/cpumask.h>
#include <net/inet_ecn.h>
#include <net/ip_tunnels.h>
diff --git a/net/openvswitch/flow_netlink.h b/net/openvswitch/flow_netlink.h
index 6657606b2b..66f9553758 100644
--- a/net/openvswitch/flow_netlink.h
+++ b/net/openvswitch/flow_netlink.h
@@ -30,7 +30,6 @@
#include <linux/in6.h>
#include <linux/jiffies.h>
#include <linux/time.h>
-#include <linux/flex_array.h>

#include <net/inet_ecn.h>
#include <net/ip_tunnels.h>
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 80ea2a7185..284f44d832 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -111,27 +111,19 @@ int ovs_flow_tbl_count(const struct flow_table *table)
return table->count;
}

-static struct flex_array *alloc_buckets(unsigned int n_buckets)
+static int alloc_buckets(struct table_instance *ti, unsigned int n_buckets)
{
- struct flex_array *buckets;
- int i, err;
+ int i;

- buckets = flex_array_alloc(sizeof(struct hlist_head),
- n_buckets, GFP_KERNEL);
- if (!buckets)
- return NULL;
+ genradix_init(&ti->buckets);

- err = flex_array_prealloc(buckets, 0, n_buckets, GFP_KERNEL);
- if (err) {
- flex_array_free(buckets);
- return NULL;
- }
+ if (genradix_prealloc(&ti->buckets, n_buckets, GFP_KERNEL))
+ return -ENOMEM;

for (i = 0; i < n_buckets; i++)
- INIT_HLIST_HEAD((struct hlist_head *)
- flex_array_get(buckets, i));
+ INIT_HLIST_HEAD(genradix_ptr(&ti->buckets, i));

- return buckets;
+ return 0;
}

static void flow_free(struct sw_flow *flow)
@@ -168,15 +160,9 @@ void ovs_flow_free(struct sw_flow *flow, bool deferred)
flow_free(flow);
}

-static void free_buckets(struct flex_array *buckets)
-{
- flex_array_free(buckets);
-}
-
-
static void __table_instance_destroy(struct table_instance *ti)
{
- free_buckets(ti->buckets);
+ genradix_free(&ti->buckets);
kfree(ti);
}

@@ -187,9 +173,7 @@ static struct table_instance *table_instance_alloc(int new_size)
if (!ti)
return NULL;

- ti->buckets = alloc_buckets(new_size);
-
- if (!ti->buckets) {
+ if (alloc_buckets(ti, new_size)) {
kfree(ti);
return NULL;
}
@@ -249,7 +233,7 @@ static void table_instance_destroy(struct table_instance *ti,

for (i = 0; i < ti->n_buckets; i++) {
struct sw_flow *flow;
- struct hlist_head *head = flex_array_get(ti->buckets, i);
+ struct hlist_head *head = genradix_ptr(&ti->buckets, i);
struct hlist_node *n;
int ver = ti->node_ver;
int ufid_ver = ufid_ti->node_ver;
@@ -294,7 +278,7 @@ struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
ver = ti->node_ver;
while (*bucket < ti->n_buckets) {
i = 0;
- head = flex_array_get(ti->buckets, *bucket);
+ head = genradix_ptr(&ti->buckets, *bucket);
hlist_for_each_entry_rcu(flow, head, flow_table.node[ver]) {
if (i < *last) {
i++;
@@ -313,8 +297,7 @@ struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
static struct hlist_head *find_bucket(struct table_instance *ti, u32 hash)
{
hash = jhash_1word(hash, ti->hash_seed);
- return flex_array_get(ti->buckets,
- (hash & (ti->n_buckets - 1)));
+ return genradix_ptr(&ti->buckets, hash & (ti->n_buckets - 1));
}

static void table_instance_insert(struct table_instance *ti,
@@ -347,9 +330,7 @@ static void flow_table_copy_flows(struct table_instance *old,
/* Insert in new table. */
for (i = 0; i < old->n_buckets; i++) {
struct sw_flow *flow;
- struct hlist_head *head;
-
- head = flex_array_get(old->buckets, i);
+ struct hlist_head *head = genradix_ptr(&old->buckets, i);

if (ufid)
hlist_for_each_entry(flow, head,
diff --git a/net/openvswitch/flow_table.h b/net/openvswitch/flow_table.h
index 2dd9900f53..57fe93570a 100644
--- a/net/openvswitch/flow_table.h
+++ b/net/openvswitch/flow_table.h
@@ -29,7 +29,7 @@
#include <linux/in6.h>
#include <linux/jiffies.h>
#include <linux/time.h>
-#include <linux/flex_array.h>
+#include <linux/generic-radix-tree.h>

#include <net/inet_ecn.h>
#include <net/ip_tunnels.h>
@@ -37,7 +37,7 @@
#include "flow.h"

struct table_instance {
- struct flex_array *buckets;
+ GENRADIX(struct hlist_head) buckets;
unsigned int n_buckets;
struct rcu_head rcu;
int node_ver;
--
2.17.0


2018-05-23 01:22:13

by Kent Overstreet

[permalink] [raw]
Subject: [PATCH 2/6] proc: commit to genradix

the new generic radix trees have a simpler API and implementation, and
no limitations on number of elements, so all flex_array users are being
converted

Signed-off-by: Kent Overstreet <[email protected]>
---
fs/proc/base.c | 48 +++++++++++++++++-------------------------------
1 file changed, 17 insertions(+), 31 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 9298324325..0a762eda64 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -59,6 +59,7 @@
#include <linux/capability.h>
#include <linux/file.h>
#include <linux/fdtable.h>
+#include <linux/generic-radix-tree.h>
#include <linux/string.h>
#include <linux/seq_file.h>
#include <linux/namei.h>
@@ -92,7 +93,6 @@
#include <linux/sched/coredump.h>
#include <linux/sched/debug.h>
#include <linux/sched/stat.h>
-#include <linux/flex_array.h>
#include <linux/posix-timers.h>
#ifdef CONFIG_HARDWALL
#include <asm/hardwall.h>
@@ -2140,11 +2140,12 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx)
struct task_struct *task;
struct mm_struct *mm;
unsigned long nr_files, pos, i;
- struct flex_array *fa = NULL;
- struct map_files_info info;
+ GENRADIX(struct map_files_info) fa;
struct map_files_info *p;
int ret;

+ genradix_init(&fa);
+
ret = -ENOENT;
task = get_proc_task(file_inode(file));
if (!task)
@@ -2176,35 +2177,21 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx)
*/

for (vma = mm->mmap, pos = 2; vma; vma = vma->vm_next) {
- if (vma->vm_file && ++pos > ctx->pos)
- nr_files++;
- }
+ if (!vma->vm_file)
+ continue;
+ if (++pos <= ctx->pos)
+ continue;

- if (nr_files) {
- fa = flex_array_alloc(sizeof(info), nr_files,
- GFP_KERNEL);
- if (!fa || flex_array_prealloc(fa, 0, nr_files,
- GFP_KERNEL)) {
+ p = genradix_ptr_alloc(&fa, nr_files++, GFP_KERNEL);
+ if (!p) {
ret = -ENOMEM;
- if (fa)
- flex_array_free(fa);
up_read(&mm->mmap_sem);
- mmput(mm);
- goto out_put_task;
+ goto out_put_mm;
}
- for (i = 0, vma = mm->mmap, pos = 2; vma;
- vma = vma->vm_next) {
- if (!vma->vm_file)
- continue;
- if (++pos <= ctx->pos)
- continue;

- info.start = vma->vm_start;
- info.end = vma->vm_end;
- info.mode = vma->vm_file->f_mode;
- if (flex_array_put(fa, i++, &info, GFP_KERNEL))
- BUG();
- }
+ p->start = vma->vm_start;
+ p->end = vma->vm_end;
+ p->mode = vma->vm_file->f_mode;
}
up_read(&mm->mmap_sem);

@@ -2212,7 +2199,7 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx)
char buf[4 * sizeof(long) + 2]; /* max: %lx-%lx\0 */
unsigned int len;

- p = flex_array_get(fa, i);
+ p = genradix_ptr(&fa, i);
len = snprintf(buf, sizeof(buf), "%lx-%lx", p->start, p->end);
if (!proc_fill_cache(file, ctx,
buf, len,
@@ -2222,13 +2209,12 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx)
break;
ctx->pos++;
}
- if (fa)
- flex_array_free(fa);
+out_put_mm:
mmput(mm);
-
out_put_task:
put_task_struct(task);
out:
+ genradix_free(&fa);
return ret;
}

--
2.17.0


2018-05-23 11:29:15

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 2/6] proc: commit to genradix

On Tue, May 22, 2018 at 09:18:17PM -0400, Kent Overstreet wrote:
> @@ -2140,11 +2140,12 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx)
> struct task_struct *task;
> struct mm_struct *mm;
> unsigned long nr_files, pos, i;
> - struct flex_array *fa = NULL;
> - struct map_files_info info;
> + GENRADIX(struct map_files_info) fa;
> struct map_files_info *p;
> int ret;
>
> + genradix_init(&fa);

Could we have a DEFINE_GENRADIX(type, name) which initialises the tree?


2018-05-23 12:35:13

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 4/6] openvswitch: convert to genradix

On Tue, May 22, 2018 at 09:18:19PM -0400, Kent Overstreet wrote:
> the new generic radix trees have a simpler API and implementation, and
> no limitations on number of elements, so all flex_array users are being
> converted

This doesn't really feel like it should be a flexarray / genradix user.
It just wants to allocate a lot of buckets and use them. Maybe kvmalloc
is the right approach for this user of flexarray?

Signed-off-by: Matthew Wilcox <[email protected]>

net/openvswitch/flow.h | 1 -
net/openvswitch/flow_netlink.h | 1 -
net/openvswitch/flow_table.c | 49 +++++++++++-------------------------------
net/openvswitch/flow_table.h | 3 +--
4 files changed, 13 insertions(+), 41 deletions(-)

diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index c670dd24b8b7..4f06278166d9 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -30,7 +30,6 @@
#include <linux/in6.h>
#include <linux/jiffies.h>
#include <linux/time.h>
-#include <linux/flex_array.h>
#include <linux/cpumask.h>
#include <net/inet_ecn.h>
#include <net/ip_tunnels.h>
diff --git a/net/openvswitch/flow_netlink.h b/net/openvswitch/flow_netlink.h
index 6657606b2b47..66f9553758a5 100644
--- a/net/openvswitch/flow_netlink.h
+++ b/net/openvswitch/flow_netlink.h
@@ -30,7 +30,6 @@
#include <linux/in6.h>
#include <linux/jiffies.h>
#include <linux/time.h>
-#include <linux/flex_array.h>

#include <net/inet_ecn.h>
#include <net/ip_tunnels.h>
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 80ea2a71852e..909372bd2621 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -111,29 +111,6 @@ int ovs_flow_tbl_count(const struct flow_table *table)
return table->count;
}

-static struct flex_array *alloc_buckets(unsigned int n_buckets)
-{
- struct flex_array *buckets;
- int i, err;
-
- buckets = flex_array_alloc(sizeof(struct hlist_head),
- n_buckets, GFP_KERNEL);
- if (!buckets)
- return NULL;
-
- err = flex_array_prealloc(buckets, 0, n_buckets, GFP_KERNEL);
- if (err) {
- flex_array_free(buckets);
- return NULL;
- }
-
- for (i = 0; i < n_buckets; i++)
- INIT_HLIST_HEAD((struct hlist_head *)
- flex_array_get(buckets, i));
-
- return buckets;
-}
-
static void flow_free(struct sw_flow *flow)
{
int cpu;
@@ -168,31 +145,30 @@ void ovs_flow_free(struct sw_flow *flow, bool deferred)
flow_free(flow);
}

-static void free_buckets(struct flex_array *buckets)
-{
- flex_array_free(buckets);
-}
-
-
static void __table_instance_destroy(struct table_instance *ti)
{
- free_buckets(ti->buckets);
+ kvfree(ti->buckets);
kfree(ti);
}

static struct table_instance *table_instance_alloc(int new_size)
{
struct table_instance *ti = kmalloc(sizeof(*ti), GFP_KERNEL);
+ int i;

if (!ti)
return NULL;

- ti->buckets = alloc_buckets(new_size);
-
+ ti->buckets = kvmalloc(sizeof(struct hlist_head) * new_size,
+ GFP_KERNEL);
if (!ti->buckets) {
kfree(ti);
return NULL;
}
+
+ for (i = 0; i < new_size; i++)
+ INIT_HLIST_HEAD(&ti->buckets[i]);
+
ti->n_buckets = new_size;
ti->node_ver = 0;
ti->keep_flows = false;
@@ -249,7 +225,7 @@ static void table_instance_destroy(struct table_instance *ti,

for (i = 0; i < ti->n_buckets; i++) {
struct sw_flow *flow;
- struct hlist_head *head = flex_array_get(ti->buckets, i);
+ struct hlist_head *head = &ti->buckets[i];
struct hlist_node *n;
int ver = ti->node_ver;
int ufid_ver = ufid_ti->node_ver;
@@ -294,7 +270,7 @@ struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
ver = ti->node_ver;
while (*bucket < ti->n_buckets) {
i = 0;
- head = flex_array_get(ti->buckets, *bucket);
+ head = &ti->buckets[*bucket];
hlist_for_each_entry_rcu(flow, head, flow_table.node[ver]) {
if (i < *last) {
i++;
@@ -313,8 +289,7 @@ struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,
static struct hlist_head *find_bucket(struct table_instance *ti, u32 hash)
{
hash = jhash_1word(hash, ti->hash_seed);
- return flex_array_get(ti->buckets,
- (hash & (ti->n_buckets - 1)));
+ return &ti->buckets[hash & (ti->n_buckets - 1)];
}

static void table_instance_insert(struct table_instance *ti,
@@ -349,7 +324,7 @@ static void flow_table_copy_flows(struct table_instance *old,
struct sw_flow *flow;
struct hlist_head *head;

- head = flex_array_get(old->buckets, i);
+ head = &old->buckets[i];

if (ufid)
hlist_for_each_entry(flow, head,
diff --git a/net/openvswitch/flow_table.h b/net/openvswitch/flow_table.h
index 2dd9900f533d..de5ec6cf5174 100644
--- a/net/openvswitch/flow_table.h
+++ b/net/openvswitch/flow_table.h
@@ -29,7 +29,6 @@
#include <linux/in6.h>
#include <linux/jiffies.h>
#include <linux/time.h>
-#include <linux/flex_array.h>

#include <net/inet_ecn.h>
#include <net/ip_tunnels.h>
@@ -37,7 +36,7 @@
#include "flow.h"

struct table_instance {
- struct flex_array *buckets;
+ struct hlist_head *buckets;
unsigned int n_buckets;
struct rcu_head rcu;
int node_ver;

2018-05-23 13:41:23

by Jonathan Corbet

[permalink] [raw]
Subject: Re: [PATCH 6/6] Drop flex_arrays

On Tue, 22 May 2018 21:18:21 -0400
Kent Overstreet <[email protected]> wrote:

> All existing users have been converted to generic radix trees
>
> Signed-off-by: Kent Overstreet <[email protected]>
> ---
> Documentation/core-api/flexible-arrays.rst | 130 -------
> Documentation/flexible-arrays.txt | 123 -------
> include/linux/flex_array.h | 149 --------
> include/linux/poison.h | 3 -
> lib/Makefile | 2 +-
> lib/flex_array.c | 398 ---------------------
> tools/include/linux/poison.h | 3 -
> 7 files changed, 1 insertion(+), 807 deletions(-)
> delete mode 100644 Documentation/core-api/flexible-arrays.rst
> delete mode 100644 Documentation/flexible-arrays.txt
> delete mode 100644 include/linux/flex_array.h
> delete mode 100644 lib/flex_array.c

Interesting, I didn't realize that flexible-arrays.txt was still there;
that should go regardless (and 00-INDEX adjusted accordingly). If you zap
the RST file, though, you should also fix Documentation/core-api/index.rst
or you'll break the docs build.
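
As a rough sketch of that follow-up change - the real toctree contents
aren't reproduced here, only the one entry that would have to go along
with the deleted .rst file:

    --- a/Documentation/core-api/index.rst
    +++ b/Documentation/core-api/index.rst
    @@ (hunk location omitted) @@
     .. toctree::
        :maxdepth: 1
    
    -   flexible-arrays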

Thanks,

jon

2018-05-23 17:24:47

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 6/6] Drop flex_arrays

Thanks for the heads-up, Matthew!

On 05/22/2018 06:18 PM, Kent Overstreet wrote:
> All existing users have been converted to generic radix trees

Well, flex_arrays had a good run, and the new radix trees do look quite
a bit nicer.

Feel free to add my ack.

2018-05-23 22:06:58

by Kent Overstreet

[permalink] [raw]
Subject: Re: [PATCH 6/6] Drop flex_arrays

On Wed, May 23, 2018 at 10:24:18AM -0700, Dave Hansen wrote:
> Thanks for the heads-up, Matthew!

Sorry, I was just using scripts/get_maintainers, forgot to check the actual file
for the original author :)

> On 05/22/2018 06:18 PM, Kent Overstreet wrote:
> > All existing users have been converted to generic radix trees
>
> Well, flex_arrays had a good run, and the new radix trees do look quite
> a bit nicer.
>
> Feel free to add my ack.

Cool, thanks!

2018-05-23 22:19:26

by Kent Overstreet

[permalink] [raw]
Subject: Re: [PATCH 2/6] proc: commit to genradix

On Wed, May 23, 2018 at 04:28:23AM -0700, Matthew Wilcox wrote:
> On Tue, May 22, 2018 at 09:18:17PM -0400, Kent Overstreet wrote:
> > @@ -2140,11 +2140,12 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx)
> > struct task_struct *task;
> > struct mm_struct *mm;
> > unsigned long nr_files, pos, i;
> > - struct flex_array *fa = NULL;
> > - struct map_files_info info;
> > + GENRADIX(struct map_files_info) fa;
> > struct map_files_info *p;
> > int ret;
> >
> > + genradix_init(&fa);
>
> Could we have a DEFINE_GENRADIX(type, name) which initialises the tree?

Already exists. I kind of prefer not to use it when I don't need to though, and
stick to things that look more like normal declarations instead.
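
For reference, a minimal sketch of the two styles being compared, using
the macros from the patch - struct foo and the names below are just
placeholders:

        /* static definition, initialized at its definition: */
        static DEFINE_GENRADIX(foo_genradix, struct foo);

        /* "normal looking" declaration plus an explicit init: */
        void foo_example(void)
        {
                GENRADIX(struct foo) fa;
                struct foo *p;

                genradix_init(&fa);
                p = genradix_ptr_alloc(&fa, 0, GFP_KERNEL);
                /* ... use *p if non-NULL ... */
                genradix_free(&fa);
        }

Note that the existing macro takes (name, type), per the patch, rather
than the (type, name) ordering suggested above.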

2018-05-23 22:25:06

by Kent Overstreet

[permalink] [raw]
Subject: Re: [PATCH 4/6] openvswitch: convert to genradix

On Wed, May 23, 2018 at 05:34:01AM -0700, Matthew Wilcox wrote:
> On Tue, May 22, 2018 at 09:18:19PM -0400, Kent Overstreet wrote:
> > the new generic radix trees have a simpler API and implementation, and
> > no limitations on number of elements, so all flex_array users are being
> > converted
>
> This doesn't really feel like it should be a flexarray / genradix user.
> It just wants to allocate a lot of buckets and use them. Maybe kvmalloc
> is the right approach for this user of flexarray?

Yeah probably, or a rhashtable.

Now that you mention it, I think the md code should definitely be using kvmalloc
too.
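
For comparison, a rough sketch of what the rhashtable route would look
like - the element type, field names and params below are made up for
illustration, not a proposed ovs change:

        #include <linux/rhashtable.h>

        struct flow_entry {                     /* hypothetical element */
                u32 key;
                struct rhash_head node;
        };

        static const struct rhashtable_params flow_params = {
                .key_len                = sizeof(u32),
                .key_offset             = offsetof(struct flow_entry, key),
                .head_offset            = offsetof(struct flow_entry, node),
                .automatic_shrinking    = true,
        };

        static struct rhashtable flow_ht;

        static int flow_ht_example(struct flow_entry *e)
        {
                int err = rhashtable_init(&flow_ht, &flow_params);

                if (err)
                        return err;

                /* bucket sizing and resizing are handled internally: */
                err = rhashtable_insert_fast(&flow_ht, &e->node, flow_params);
                if (!err)
                        e = rhashtable_lookup_fast(&flow_ht, &e->key,
                                                   flow_params);

                rhashtable_destroy(&flow_ht);
                return err;
        }

The tradeoff versus the kvmalloc version above is that the table manages
its own bucket count, at the cost of intrusive rhash_head members and a
params struct.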

2018-05-26 03:17:12

by Liu Bo

[permalink] [raw]
Subject: Re: [PATCH 1/6] Generic radix trees

Hi Kent,

(Add all ML to cc this time.)

On Wed, May 23, 2018 at 9:18 AM, Kent Overstreet
<[email protected]> wrote:
> Very simple radix tree implementation that supports storing arbitrary
> size entries, up to PAGE_SIZE - upcoming patches will convert existing
> flex_array users to genradixes. The new genradix code has a much simpler
> API and implementation, and doesn't have a hard limit on the number of
> elements like flex_array does.
>
> Signed-off-by: Kent Overstreet <[email protected]>
> ---
> include/linux/generic-radix-tree.h | 222 +++++++++++++++++++++++++++++
> lib/Makefile | 3 +-
> lib/generic-radix-tree.c | 180 +++++++++++++++++++++++
> 3 files changed, 404 insertions(+), 1 deletion(-)
> create mode 100644 include/linux/generic-radix-tree.h
> create mode 100644 lib/generic-radix-tree.c
>
> diff --git a/include/linux/generic-radix-tree.h b/include/linux/generic-radix-tree.h
> new file mode 100644
> index 0000000000..3328813322
> --- /dev/null
> +++ b/include/linux/generic-radix-tree.h
> @@ -0,0 +1,222 @@
> +#ifndef _LINUX_GENERIC_RADIX_TREE_H
> +#define _LINUX_GENERIC_RADIX_TREE_H
> +
> +/*
> + * Generic radix trees/sparse arrays:
> + *
> + * Very simple and minimalistic, supporting arbitrary size entries up to
> + * PAGE_SIZE.
> + *
> + * A genradix is defined with the type it will store, like so:
> + * static GENRADIX(struct foo) foo_genradix;
> + *
> + * The main operations are:
> + * - genradix_init(radix) - initialize an empty genradix
> + *
> + * - genradix_free(radix) - free all memory owned by the genradix and
> + * reinitialize it
> + *
> + * - genradix_ptr(radix, idx) - gets a pointer to the entry at idx, returning
> + * NULL if that entry does not exist
> + *
> + * - genradix_ptr_alloc(radix, idx, gfp) - gets a pointer to an entry,
> + * allocating it if necessary
> + *
> + * - genradix_for_each(radix, iter, p) - iterate over each entry in a genradix
> + *
> + * The radix tree allocates one page of entries at a time, so entries may exist
> + * that were never explicitly allocated - they will be initialized to all
> + * zeroes.
> + *
> + * Internally, a genradix is just a radix tree of pages, and indexing works in
> + * terms of byte offsets. The wrappers in this header file use sizeof on the
> + * type the radix contains to calculate a byte offset from the index - see
> + * __idx_to_offset.
> + */
> +
> +#include <asm/page.h>
> +#include <linux/bug.h>
> +#include <linux/kernel.h>
> +#include <linux/log2.h>
> +
> +struct genradix_node;
> +
> +struct __genradix {
> + struct genradix_node *root;
> + size_t depth;
> +};
> +
> +#define __GENRADIX_INITIALIZER \
> + { \
> + .tree = { \
> + .root = NULL, \
> + .depth = 0, \
> + } \
> + }
> +
> +/*
> + * We use a 0 size array to stash the type we're storing without taking any
> + * space at runtime - then the various accessor macros can use typeof() to get
> + * to it for casts/sizeof - we also force the alignment so that storing a type
> + * with a ridiculous alignment doesn't blow up the alignment or size of the
> + * genradix.
> + */
> +
> +#define GENRADIX(_type) \
> +struct { \
> + struct __genradix tree; \
> + _type type[0] __aligned(1); \
> +}
> +
> +#define DEFINE_GENRADIX(_name, _type) \
> + GENRADIX(_type) _name = __GENRADIX_INITIALIZER
> +
> +/**
> + * genradix_init - initialize a genradix
> + * @_radix: genradix to initialize
> + *
> + * Does not fail
> + */
> +#define genradix_init(_radix) \
> +do { \
> + *(_radix) = (typeof(*_radix)) __GENRADIX_INITIALIZER; \
> +} while (0)
> +
> +void __genradix_free(struct __genradix *);
> +
> +/**
> + * genradix_free: free all memory owned by a genradix
> + *
> + * After freeing, @_radix will be reinitialized and empty
> + */
> +#define genradix_free(_radix) __genradix_free(&(_radix)->tree)
> +
> +static inline size_t __idx_to_offset(size_t idx, size_t obj_size)
> +{
> + if (__builtin_constant_p(obj_size))
> + BUILD_BUG_ON(obj_size > PAGE_SIZE);
> + else
> + BUG_ON(obj_size > PAGE_SIZE);
> +
> + if (!is_power_of_2(obj_size)) {
> + size_t objs_per_page = PAGE_SIZE / obj_size;
> +
> + return (idx / objs_per_page) * PAGE_SIZE +
> + (idx % objs_per_page) * obj_size;
> + } else {
> + return idx * obj_size;
> + }
> +}
> +
> +#define __genradix_cast(_radix) (typeof((_radix)->type[0]) *)
> +#define __genradix_obj_size(_radix) sizeof((_radix)->type[0])
> +#define __genradix_idx_to_offset(_radix, _idx) \
> + __idx_to_offset(_idx, __genradix_obj_size(_radix))
> +
> +void *__genradix_ptr(struct __genradix *, size_t);
> +
> +/**
> + * genradix_ptr - get a pointer to a genradix entry
> + * @_radix: genradix to access
> + * @_idx: index to fetch
> + *
> + * Returns a pointer to entry at @_idx, or NULL if that entry does not exist.
> + */
> +#define genradix_ptr(_radix, _idx) \
> + (__genradix_cast(_radix) \
> + __genradix_ptr(&(_radix)->tree, \
> + __genradix_idx_to_offset(_radix, _idx)))
> +
> +void *__genradix_ptr_alloc(struct __genradix *, size_t, gfp_t);
> +
> +/**
> + * genradix_ptr_alloc - get a pointer to a genradix entry, allocating it if necessary
> + * @_radix: genradix to access
> + * @_idx: index to fetch
> + * @_gfp: gfp mask
> + *
> + * Returns a pointer to entry at @_idx, or NULL on allocation failure
> + */
> +#define genradix_ptr_alloc(_radix, _idx, _gfp) \
> + (__genradix_cast(_radix) \
> + __genradix_ptr_alloc(&(_radix)->tree, \
> + __genradix_idx_to_offset(_radix, _idx), \
> + _gfp))
> +
> +struct genradix_iter {
> + size_t offset;
> + size_t pos;
> +};
> +
> +/**
> + * genradix_iter_init - initialize a genradix_iter
> + * @_radix: genradix that will be iterated over
> + * @_idx: index to start iterating from
> + */
> +#define genradix_iter_init(_radix, _idx) \
> + ((struct genradix_iter) { \
> + .pos = (_idx), \
> + .offset = __genradix_idx_to_offset((_radix), (_idx)),\
> + })
> +
> +void *__genradix_iter_peek(struct genradix_iter *, struct __genradix *, size_t);
> +
> +/**
> + * genradix_iter_peek - get first entry at or above iterator's current
> + * position
> + * @_iter: a genradix_iter
> + * @_radix: genradix being iterated over
> + *
> + * If no more entries exist at or above @_iter's current position, returns NULL
> + */
> +#define genradix_iter_peek(_iter, _radix) \
> + (__genradix_cast(_radix) \
> + __genradix_iter_peek(_iter, &(_radix)->tree, \
> + PAGE_SIZE / __genradix_obj_size(_radix)))
> +
> +static inline void __genradix_iter_advance(struct genradix_iter *iter,
> + size_t obj_size)
> +{
> + iter->offset += obj_size;
> +
> + if (!is_power_of_2(obj_size) &&
> + (iter->offset & (PAGE_SIZE - 1)) + obj_size > PAGE_SIZE)
> + iter->offset = round_up(iter->offset, PAGE_SIZE);
> +
> + iter->pos++;
> +}
> +
> +#define genradix_iter_advance(_iter, _radix) \
> + __genradix_iter_advance(_iter, __genradix_obj_size(_radix))
> +
> +/**
> + * genradix_for_each - iterate over entry in a genradix
> + * @_radix: genradix to iterate over
> + * @_iter: a genradix_iter to track current position
> + * @_p: pointer to genradix entry type
> + *
> + * On every iteration, @_p will point to the current entry, and @_iter.pos
> + * will be the current entry's index.
> + */
> +#define genradix_for_each(_radix, _iter, _p) \
> + for (_iter = genradix_iter_init(_radix, 0); \
> + _p = genradix_iter_peek(&(_iter), _radix); \
> + genradix_iter_advance(&(_iter), _radix))
> +
> +int __genradix_prealloc(struct __genradix *, size_t, gfp_t);
> +
> +/**
> + * genradix_prealloc - preallocate entries in a generic radix tree
> + * @_radix: genradix to preallocate
> + * @_nr: number of entries to preallocate
> + * @_gfp: gfp mask
> + *
> + * Returns 0 on success, -ENOMEM on failure
> + */
> +#define genradix_prealloc(_radix, _nr, _gfp) \
> + __genradix_prealloc(&(_radix)->tree, \
> + __genradix_idx_to_offset(_radix, _nr + 1),\
> + _gfp)
> +
> +
> +#endif /* _LINUX_GENERIC_RADIX_TREE_H */
> diff --git a/lib/Makefile b/lib/Makefile
> index a90d4fcd74..5db5a7fb1e 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -39,7 +39,8 @@ obj-y += bcd.o div64.o sort.o parser.o debug_locks.o random32.o \
> gcd.o lcm.o list_sort.o uuid.o flex_array.o iov_iter.o clz_ctz.o \
> bsearch.o find_bit.o llist.o memweight.o kfifo.o \
> percpu-refcount.o percpu_ida.o rhashtable.o reciprocal_div.o \
> - once.o refcount.o usercopy.o errseq.o bucket_locks.o
> + once.o refcount.o usercopy.o errseq.o bucket_locks.o \
> + generic-radix-tree.o
> obj-$(CONFIG_STRING_SELFTEST) += test_string.o
> obj-y += string_helpers.o
> obj-$(CONFIG_TEST_STRING_HELPERS) += test-string_helpers.o
> diff --git a/lib/generic-radix-tree.c b/lib/generic-radix-tree.c
> new file mode 100644
> index 0000000000..4537c7c62c
> --- /dev/null
> +++ b/lib/generic-radix-tree.c
> @@ -0,0 +1,180 @@
> +
> +#include <linux/export.h>
> +#include <linux/generic-radix-tree.h>
> +#include <linux/gfp.h>
> +
> +#define GENRADIX_ARY (PAGE_SIZE / sizeof(struct genradix_node *))
> +#define GENRADIX_ARY_SHIFT ilog2(GENRADIX_ARY)
> +
> +struct genradix_node {
> + union {
> + /* Interior node: */
> + struct genradix_node *children[GENRADIX_ARY];
> +
> + /* Leaf: */
> + u8 data[PAGE_SIZE];
> + };
> +};
> +
> +static inline unsigned genradix_depth_shift(unsigned depth)
> +{
> + return PAGE_SHIFT + GENRADIX_ARY_SHIFT * depth;
> +}
> +
> +/*
> + * Returns size (of data, in bytes) that a tree of a given depth holds:
> + */
> +static inline size_t genradix_depth_size(unsigned depth)
> +{
> + return 1UL << genradix_depth_shift(depth);
> +}
> +
> +/*
> + * Returns pointer to the specified byte @offset within @radix, or NULL if not
> + * allocated
> + */
> +void *__genradix_ptr(struct __genradix *radix, size_t offset)
> +{
> + size_t level = radix->depth;
> + struct genradix_node *n = radix->root;
> +
> + if (offset >= genradix_depth_size(radix->depth))
> + return NULL;
> +
> + while (1) {
> + if (!n)
> + return NULL;
> + if (!level)
> + break;
> +
> + level--;
> +
> + n = n->children[offset >> genradix_depth_shift(level)];
> + offset &= genradix_depth_size(level) - 1;
> + }
> +
> + return &n->data[offset];
> +}
> +EXPORT_SYMBOL(__genradix_ptr);
> +
> +/*
> + * Returns pointer to the specified byte @offset within @radix, allocating it if
> + * necessary - newly allocated slots are always zeroed out:
> + */
> +void *__genradix_ptr_alloc(struct __genradix *radix, size_t offset,
> + gfp_t gfp_mask)
> +{
> + struct genradix_node **n;

Any reason that " struct genradix_node ** " is used here instead of "
struct genradix_node * "?

Looks like this function only manipulates *n, am I missing something?

thanks,
liubo

> + size_t level;
> +
> + /* Increase tree depth if necessary: */
> +
> + while (offset >= genradix_depth_size(radix->depth)) {
> + struct genradix_node *new_root =
> + (void *) __get_free_page(gfp_mask|__GFP_ZERO);
> +
> + if (!new_root)
> + return NULL;
> +
> + new_root->children[0] = radix->root;
> + radix->root = new_root;
> + radix->depth++;
> + }
> +
> + n = &radix->root;
> + level = radix->depth;
> +
> + while (1) {
> + if (!*n) {
> + *n = (void *) __get_free_page(gfp_mask|__GFP_ZERO);
> + if (!*n)
> + return NULL;
> + }
> +
> + if (!level)
> + break;
> +
> + level--;
> +
> + n = &(*n)->children[offset >> genradix_depth_shift(level)];
> + offset &= genradix_depth_size(level) - 1;
> + }
> +
> + return &(*n)->data[offset];
> +}
> +EXPORT_SYMBOL(__genradix_ptr_alloc);
> +
> +void *__genradix_iter_peek(struct genradix_iter *iter,
> + struct __genradix *radix,
> + size_t objs_per_page)
> +{
> + struct genradix_node *n;
> + size_t level, i;
> +
> + if (!radix->root)
> + return NULL;
> +restart:
> + if (iter->offset >= genradix_depth_size(radix->depth))
> + return NULL;
> +
> + n = radix->root;
> + level = radix->depth;
> +
> + while (level) {
> + level--;
> +
> + i = (iter->offset >> genradix_depth_shift(level)) &
> + (GENRADIX_ARY - 1);
> +
> + while (!n->children[i]) {
> + i++;
> + iter->offset = round_down(iter->offset +
> + genradix_depth_size(level),
> + genradix_depth_size(level));
> + iter->pos = (iter->offset >> PAGE_SHIFT) *
> + objs_per_page;
> + if (i == GENRADIX_ARY)
> + goto restart;
> + }
> +
> + n = n->children[i];
> + }
> +
> + return &n->data[iter->offset & (PAGE_SIZE - 1)];
> +}
> +EXPORT_SYMBOL(__genradix_iter_peek);
> +
> +static void genradix_free_recurse(struct genradix_node *n, unsigned level)
> +{
> + if (level) {
> + unsigned i;
> +
> + for (i = 0; i < GENRADIX_ARY; i++)
> + if (n->children[i])
> + genradix_free_recurse(n->children[i], level - 1);
> + }
> +
> + free_page((unsigned long) n);
> +}
> +
> +int __genradix_prealloc(struct __genradix *radix, size_t size,
> + gfp_t gfp_mask)
> +{
> + size_t offset;
> +
> + for (offset = 0; offset < size; offset += PAGE_SIZE)
> + if (!__genradix_ptr_alloc(radix, offset, gfp_mask))
> + return -ENOMEM;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(__genradix_prealloc);
> +
> +void __genradix_free(struct __genradix *radix)
> +{
> + genradix_free_recurse(radix->root, radix->depth);
> +
> + radix->root = NULL;
> + radix->depth = 0;
> +}
> +EXPORT_SYMBOL(__genradix_free);
> --
> 2.17.0
>

2018-05-26 05:56:58

by Kent Overstreet

[permalink] [raw]
Subject: Re: [PATCH 1/6] Generic radix trees

On Sat, May 26, 2018 at 11:16:42AM +0800, Liu Bo wrote:
> > +/*
> > + * Returns pointer to the specified byte @offset within @radix, allocating it if
> > + * necessary - newly allocated slots are always zeroed out:
> > + */
> > +void *__genradix_ptr_alloc(struct __genradix *radix, size_t offset,
> > + gfp_t gfp_mask)
> > +{
> > + struct genradix_node **n;
>
> Any reason that " struct genradix_node ** " is used here instead of "
> struct genradix_node * "?
>
> Looks like this function only manipulates *n, am I missing something?

It stores to *n, when it has to allocate a node (including the root)
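
A stripped-down userspace illustration of that pointer-to-pointer idiom
(simplified from __genradix_ptr_alloc above; the names are made up):

        #include <stdlib.h>

        struct node {
                struct node *children[16];
        };

        /*
         * n points at the slot that holds a child pointer - &root or
         * &parent->children[i] - so writing through *n installs a freshly
         * allocated node into its parent (or into the root) in place,
         * which a plain "struct node *" argument could not do:
         */
        static struct node *get_or_alloc(struct node **n)
        {
                if (!*n)
                        *n = calloc(1, sizeof(**n));
                return *n;
        }

        static struct node *walk_one_level(struct node **root, unsigned i)
        {
                struct node **n = root;

                if (!get_or_alloc(n))           /* may allocate the root itself */
                        return NULL;

                n = &(*n)->children[i & 15];    /* descend to a child slot */
                return get_or_alloc(n);         /* may allocate that child */
        }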

2018-05-29 04:00:13

by Liu Bo

[permalink] [raw]
Subject: Re: [PATCH 1/6] Generic radix trees

On Sat, May 26, 2018 at 1:56 PM, Kent Overstreet
<[email protected]> wrote:
> On Sat, May 26, 2018 at 11:16:42AM +0800, Liu Bo wrote:
>> > +/*
>> > + * Returns pointer to the specified byte @offset within @radix, allocating it if
>> > + * necessary - newly allocated slots are always zeroed out:
>> > + */
>> > +void *__genradix_ptr_alloc(struct __genradix *radix, size_t offset,
>> > + gfp_t gfp_mask)
>> > +{
>> > + struct genradix_node **n;
>>
>> Any reason that " struct genradix_node ** " is used here instead of "
>> struct genradix_node * "?
>>
>> Looks like this function only manipulates *n, am I missing something?
>
> It stores to *n, when it has to allocate a node (including the root)

I see, thanks for the explanation.

thanks,
liubo