2019-09-15 16:57:53

by Pengfei Li

Subject: [PATCH v4 0/7] mm, slab: Make kmalloc_info[] contain all types of names

Changes in v4
--
1. [old] abandon patch 4/4
2. [new] patch 4/7:
- return ZERO_SIZE_ALLOC instead of 0 for zero-sized requests
3. [new] patch 5/7:
- reorder kmalloc_info[] and kmalloc_caches[] by size
- hard to split further, so this patch is slightly larger
4. [new] patch 6/7:
- initialize kmalloc_caches[] entries of the same size but different
types together
5. [new] patch 7/7:
- modify kmalloc_caches[type][idx] to kmalloc_caches[idx][type]

Patches 4-7 are newly added; more information can be found in their
commit messages.

Changes in v3
--
1. restore __initconst (patch 1/4)
2. rename patch 3/4
3. add more clarification for patch 4/4

Changes in v2
--
1. remove __initconst (patch 1/5)
2. squash patch 2/5
3. add ack tag from Vlastimil Babka


There are three types of kmalloc: KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is stored in kmalloc_info[].name, but the
names of KMALLOC_RECLAIM and KMALLOC_DMA are generated dynamically at
boot by kmalloc_cache_name().

Patch 1 predefines the names of all kmalloc types, saving the time
otherwise spent generating them dynamically.

With patches 1-3, the time spent by new_kmalloc_cache() is reduced by
approximately 36.3%.

Time spent by new_kmalloc_cache() (CPU cycles)
5.3-rc7              66264
5.3-rc7+patch_1-3    42188
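
Concretely, the series replaces the boot-time name generation with a
static, type-indexed table. A minimal before/after sketch (the real
table macro is SET_KMALLOC_SIZE, introduced in patch 1):

	/* before: RECLAIM/DMA cache names are built at boot time */
	name = kmalloc_cache_name("kmalloc-rcl", kmalloc_info[idx].size);

	/* after: all names are predefined and simply looked up by type */
	name = kmalloc_info[idx].name[type];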


bloat-o-meter
--
$ ./scripts/bloat-o-meter vmlinux.5.3-rc8 vmlinux.5.3-rc8+patch_1-7
add/remove: 1/2 grow/shrink: 6/65 up/down: 872/-1621 (-749)
Function old new delta
all_kmalloc_info - 832 +832
crypto_create_tfm 211 225 +14
ieee80211_key_alloc 1159 1169 +10
nl80211_parse_sched_scan 2787 2795 +8
tg3_self_test 4255 4259 +4
find_get_context.isra 634 637 +3
sd_probe 947 948 +1
nf_queue 671 670 -1
trace_parser_get_init 71 69 -2
pkcs7_verify.cold 318 316 -2
units 323 320 -3
nl80211_set_reg 642 639 -3
pkcs7_verify 1503 1495 -8
i915_sw_fence_await_dma_fence 445 437 -8
nla_strdup 143 134 -9
kmalloc_slab 102 93 -9
xhci_alloc_tt_info 349 338 -11
xhci_segment_alloc 303 289 -14
xhci_alloc_container_ctx 221 207 -14
xfrm_policy_alloc 277 263 -14
selinux_sk_alloc_security 119 105 -14
sdev_evt_send_simple 124 110 -14
sdev_evt_alloc 85 71 -14
sbitmap_queue_init_node 424 410 -14
regulatory_hint_found_beacon 400 386 -14
nf_ct_tmpl_alloc 91 77 -14
gss_create_cred 146 132 -14
drm_flip_work_allocate_task 76 62 -14
cfg80211_stop_iface 266 252 -14
cfg80211_sinfo_alloc_tid_stats 83 69 -14
cfg80211_port_authorized 218 204 -14
cfg80211_ibss_joined 341 327 -14
call_usermodehelper_setup 155 141 -14
bpf_prog_alloc_no_stats 188 174 -14
blk_alloc_flush_queue 197 183 -14
bdi_alloc_node 201 187 -14
_netlbl_catmap_getnode 253 239 -14
__igmp_group_dropped 629 615 -14
____ip_mc_inc_group 481 467 -14
xhci_alloc_command 221 205 -16
audit_log_d_path 204 188 -16
xprt_switch_alloc 145 128 -17
xhci_ring_alloc 378 361 -17
xhci_mem_init 3673 3656 -17
xhci_alloc_virt_device 505 488 -17
xhci_alloc_stream_info 727 710 -17
tcp_sendmsg_locked 3129 3112 -17
tcp_md5_do_add 783 766 -17
tcp_fastopen_defer_connect 279 262 -17
sr_read_tochdr.isra 260 243 -17
sr_read_tocentry.isra 337 320 -17
sr_is_xa 385 368 -17
sr_get_mcn 269 252 -17
scsi_probe_and_add_lun 2947 2930 -17
ring_buffer_read_prepare 103 86 -17
request_firmware_nowait 405 388 -17
ohci_urb_enqueue 3185 3168 -17
nfs_alloc_seqid 96 79 -17
nfs4_get_state_owner 1049 1032 -17
nfs4_do_close 587 570 -17
mempool_create_node 173 156 -17
ip6_setup_cork 1030 1013 -17
ida_alloc_range 951 934 -17
gss_import_sec_context 187 170 -17
dma_pool_alloc 419 402 -17
devres_open_group 223 206 -17
cfg80211_parse_mbssid_data 2406 2389 -17
ip_setup_cork 374 354 -20
kmalloc_caches 336 312 -24
__i915_sw_fence_await_sw_fence 429 405 -24
kmalloc_cache_name 57 - -57
new_kmalloc_cache 112 - -112
create_kmalloc_caches 270 148 -122
kmalloc_info 432 8 -424
Total: Before=14874616, After=14873867, chg -0.01%


Pengfei Li (7):
mm, slab: Make kmalloc_info[] contain all types of names
mm, slab: Remove unused kmalloc_size()
mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc
caches
mm, slab: Return ZERO_SIZE_ALLOC for zero sized kmalloc requests
mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE
mm, slab_common: Initialize the same size of kmalloc_caches[]
mm, slab_common: Modify kmalloc_caches[type][idx] to
kmalloc_caches[idx][type]

include/linux/slab.h | 136 ++++++++++++++------------
mm/slab.c | 11 ++-
mm/slab.h | 10 +-
mm/slab_common.c | 224 ++++++++++++++++++-------------------------
mm/slub.c | 12 +--
5 files changed, 183 insertions(+), 210 deletions(-)

--
2.21.0


2019-09-15 16:58:20

by Pengfei Li

Subject: [PATCH v4 2/7] mm, slab: Remove unused kmalloc_size()

The size of each kmalloc cache can be obtained from kmalloc_info[].size,
so remove kmalloc_size(), which is no longer used.
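
The conversion at each call site is mechanical, as in this sketch
(based on the hunks below):

	unsigned int size = kmalloc_size(i);		/* before */
	unsigned int size = kmalloc_info[i].size;	/* after  */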

Signed-off-by: Pengfei Li <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Roman Gushchin <[email protected]>
---
include/linux/slab.h | 20 --------------------
mm/slab.c | 5 +++--
mm/slab_common.c | 5 ++---
3 files changed, 5 insertions(+), 25 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 56c9c7eed34e..e773e5764b7b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -557,26 +557,6 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
return __kmalloc(size, flags);
}

-/*
- * Determine size used for the nth kmalloc cache.
- * return size or 0 if a kmalloc cache for that
- * size does not exist
- */
-static __always_inline unsigned int kmalloc_size(unsigned int n)
-{
-#ifndef CONFIG_SLOB
- if (n > 2)
- return 1U << n;
-
- if (n == 1 && KMALLOC_MIN_SIZE <= 32)
- return 96;
-
- if (n == 2 && KMALLOC_MIN_SIZE <= 64)
- return 192;
-#endif
- return 0;
-}
-
static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
{
#ifndef CONFIG_SLOB
diff --git a/mm/slab.c b/mm/slab.c
index c42b6211f42e..7bc4e90e1147 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1248,8 +1248,9 @@ void __init kmem_cache_init(void)
*/
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
- kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
- 0, kmalloc_size(INDEX_NODE));
+ kmalloc_info[INDEX_NODE].size,
+ ARCH_KMALLOC_FLAGS, 0,
+ kmalloc_info[INDEX_NODE].size);
slab_state = PARTIAL_NODE;
setup_kmalloc_cache_index_table();

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 002e16673581..8b542cfcc4f2 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1239,11 +1239,10 @@ void __init create_kmalloc_caches(slab_flags_t flags)
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];

if (s) {
- unsigned int size = kmalloc_size(i);
-
kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
kmalloc_info[i].name[KMALLOC_DMA],
- size, SLAB_CACHE_DMA | flags, 0, 0);
+ kmalloc_info[i].size,
+ SLAB_CACHE_DMA | flags, 0, 0);
}
}
#endif
--
2.21.0

2019-09-15 16:59:07

by Pengfei Li

Subject: [PATCH v4 3/7] mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc caches

The *type* parameter of new_kmalloc_cache() and the corresponding
local variable in create_kmalloc_caches() should be of type
enum kmalloc_cache_type instead of int, so correct them.

Signed-off-by: Pengfei Li <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Roman Gushchin <[email protected]>
---
mm/slab_common.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8b542cfcc4f2..af45b5278fdc 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1192,7 +1192,7 @@ void __init setup_kmalloc_cache_index_table(void)
}

static void __init
-new_kmalloc_cache(int idx, int type, slab_flags_t flags)
+new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
{
if (type == KMALLOC_RECLAIM)
flags |= SLAB_RECLAIM_ACCOUNT;
@@ -1210,7 +1210,8 @@ new_kmalloc_cache(int idx, int type, slab_flags_t flags)
*/
void __init create_kmalloc_caches(slab_flags_t flags)
{
- int i, type;
+ int i;
+ enum kmalloc_cache_type type;

for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
--
2.21.0

2019-09-15 17:00:12

by Pengfei Li

Subject: [PATCH v4 1/7] mm, slab: Make kmalloc_info[] contain all types of names

There are three types of kmalloc: KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is stored in kmalloc_info[].name, but the
names of KMALLOC_RECLAIM and KMALLOC_DMA are generated dynamically at
boot by kmalloc_cache_name().

This patch predefines the names of all kmalloc types, saving the time
otherwise spent generating them dynamically.

Besides, remove kmalloc_cache_name(), which is no longer used.
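
For illustration, SET_KMALLOC_SIZE(1024, 1k) expands (with
CONFIG_ZONE_DMA enabled) to the initializer:

	{
		.name[KMALLOC_NORMAL]  = "kmalloc-1k",
		.name[KMALLOC_RECLAIM] = "kmalloc-rcl-1k",
		.name[KMALLOC_DMA]     = "dma-kmalloc-1k",
		.size = 1024,
	}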

Signed-off-by: Pengfei Li <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Roman Gushchin <[email protected]>
---
mm/slab.c | 2 +-
mm/slab.h | 2 +-
mm/slab_common.c | 91 ++++++++++++++++++++++++++----------------------
3 files changed, 51 insertions(+), 44 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 9df370558e5d..c42b6211f42e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1247,7 +1247,7 @@ void __init kmem_cache_init(void)
* structures first. Without this, further allocations will bug.
*/
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
- kmalloc_info[INDEX_NODE].name,
+ kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
0, kmalloc_size(INDEX_NODE));
slab_state = PARTIAL_NODE;
diff --git a/mm/slab.h b/mm/slab.h
index 9057b8056b07..2fc8f956906a 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -76,7 +76,7 @@ extern struct kmem_cache *kmem_cache;

/* A table of kmalloc cache names and sizes */
extern const struct kmalloc_info_struct {
- const char *name;
+ const char *name[NR_KMALLOC_TYPES];
unsigned int size;
} kmalloc_info[];

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 807490fe217a..002e16673581 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1092,26 +1092,56 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
return kmalloc_caches[kmalloc_type(flags)][index];
}

+#ifdef CONFIG_ZONE_DMA
+#define SET_KMALLOC_SIZE(__size, __short_size) \
+{ \
+ .name[KMALLOC_NORMAL] = "kmalloc-" #__short_size, \
+ .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size, \
+ .name[KMALLOC_DMA] = "dma-kmalloc-" #__short_size, \
+ .size = __size, \
+}
+#else
+#define SET_KMALLOC_SIZE(__size, __short_size) \
+{ \
+ .name[KMALLOC_NORMAL] = "kmalloc-" #__short_size, \
+ .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size, \
+ .size = __size, \
+}
+#endif
+
/*
* kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
* kmalloc_index() supports up to 2^26=64MB, so the final entry of the table is
* kmalloc-67108864.
*/
const struct kmalloc_info_struct kmalloc_info[] __initconst = {
- {NULL, 0}, {"kmalloc-96", 96},
- {"kmalloc-192", 192}, {"kmalloc-8", 8},
- {"kmalloc-16", 16}, {"kmalloc-32", 32},
- {"kmalloc-64", 64}, {"kmalloc-128", 128},
- {"kmalloc-256", 256}, {"kmalloc-512", 512},
- {"kmalloc-1k", 1024}, {"kmalloc-2k", 2048},
- {"kmalloc-4k", 4096}, {"kmalloc-8k", 8192},
- {"kmalloc-16k", 16384}, {"kmalloc-32k", 32768},
- {"kmalloc-64k", 65536}, {"kmalloc-128k", 131072},
- {"kmalloc-256k", 262144}, {"kmalloc-512k", 524288},
- {"kmalloc-1M", 1048576}, {"kmalloc-2M", 2097152},
- {"kmalloc-4M", 4194304}, {"kmalloc-8M", 8388608},
- {"kmalloc-16M", 16777216}, {"kmalloc-32M", 33554432},
- {"kmalloc-64M", 67108864}
+ SET_KMALLOC_SIZE(0, 0),
+ SET_KMALLOC_SIZE(96, 96),
+ SET_KMALLOC_SIZE(192, 192),
+ SET_KMALLOC_SIZE(8, 8),
+ SET_KMALLOC_SIZE(16, 16),
+ SET_KMALLOC_SIZE(32, 32),
+ SET_KMALLOC_SIZE(64, 64),
+ SET_KMALLOC_SIZE(128, 128),
+ SET_KMALLOC_SIZE(256, 256),
+ SET_KMALLOC_SIZE(512, 512),
+ SET_KMALLOC_SIZE(1024, 1k),
+ SET_KMALLOC_SIZE(2048, 2k),
+ SET_KMALLOC_SIZE(4096, 4k),
+ SET_KMALLOC_SIZE(8192, 8k),
+ SET_KMALLOC_SIZE(16384, 16k),
+ SET_KMALLOC_SIZE(32768, 32k),
+ SET_KMALLOC_SIZE(65536, 64k),
+ SET_KMALLOC_SIZE(131072, 128k),
+ SET_KMALLOC_SIZE(262144, 256k),
+ SET_KMALLOC_SIZE(524288, 512k),
+ SET_KMALLOC_SIZE(1048576, 1M),
+ SET_KMALLOC_SIZE(2097152, 2M),
+ SET_KMALLOC_SIZE(4194304, 4M),
+ SET_KMALLOC_SIZE(8388608, 8M),
+ SET_KMALLOC_SIZE(16777216, 16M),
+ SET_KMALLOC_SIZE(33554432, 32M),
+ SET_KMALLOC_SIZE(67108864, 64M)
};

/*
@@ -1161,36 +1191,14 @@ void __init setup_kmalloc_cache_index_table(void)
}
}

-static const char *
-kmalloc_cache_name(const char *prefix, unsigned int size)
-{
-
- static const char units[3] = "\0kM";
- int idx = 0;
-
- while (size >= 1024 && (size % 1024 == 0)) {
- size /= 1024;
- idx++;
- }
-
- return kasprintf(GFP_NOWAIT, "%s-%u%c", prefix, size, units[idx]);
-}
-
static void __init
new_kmalloc_cache(int idx, int type, slab_flags_t flags)
{
- const char *name;
-
- if (type == KMALLOC_RECLAIM) {
+ if (type == KMALLOC_RECLAIM)
flags |= SLAB_RECLAIM_ACCOUNT;
- name = kmalloc_cache_name("kmalloc-rcl",
- kmalloc_info[idx].size);
- BUG_ON(!name);
- } else {
- name = kmalloc_info[idx].name;
- }

- kmalloc_caches[type][idx] = create_kmalloc_cache(name,
+ kmalloc_caches[type][idx] = create_kmalloc_cache(
+ kmalloc_info[idx].name[type],
kmalloc_info[idx].size, flags, 0,
kmalloc_info[idx].size);
}
@@ -1232,11 +1240,10 @@ void __init create_kmalloc_caches(slab_flags_t flags)

if (s) {
unsigned int size = kmalloc_size(i);
- const char *n = kmalloc_cache_name("dma-kmalloc", size);

- BUG_ON(!n);
kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
- n, size, SLAB_CACHE_DMA | flags, 0, 0);
+ kmalloc_info[i].name[KMALLOC_DMA],
+ size, SLAB_CACHE_DMA | flags, 0, 0);
}
}
#endif
--
2.21.0

2019-09-15 17:00:16

by Pengfei Li

Subject: [PATCH v4 4/7] mm, slab: Return ZERO_SIZE_ALLOC for zero sized kmalloc requests

This is a preparation patch: replace 0 with ZERO_SIZE_ALLOC as the
return value of kmalloc_index() for zero-sized requests. Once a later
patch in this series makes kmalloc_caches[] start at index 0, the
value 0 becomes a valid cache index and can no longer double as the
zero-size marker.
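
The two sentinels remain distinct: kmalloc_index() reports a zero-sized
request as ZERO_SIZE_ALLOC (UINT_MAX, never a valid cache index), and
its callers translate that into the existing ZERO_SIZE_PTR, as in this
sketch of the pattern used below:

	index = kmalloc_index(size);

	if (index == ZERO_SIZE_ALLOC)
		return ZERO_SIZE_PTR;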

Signed-off-by: Pengfei Li <[email protected]>
---
include/linux/slab.h | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index e773e5764b7b..1f05f68f2c3e 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -121,14 +121,20 @@
#define SLAB_DEACTIVATED ((slab_flags_t __force)0x10000000U)

/*
- * ZERO_SIZE_PTR will be returned for zero sized kmalloc requests.
+ * ZERO_SIZE_ALLOC will be returned by kmalloc_index() for zero sized
+ * kmalloc requests.
*
+ * ZERO_SIZE_PTR will then be returned by the caller of
+ * kmalloc_index().
+ *
* Dereferencing ZERO_SIZE_PTR will lead to a distinct access fault.
*
* ZERO_SIZE_PTR can be passed to kfree though in the same way that NULL can.
* Both make kfree a no-op.
*/
-#define ZERO_SIZE_PTR ((void *)16)
+#define ZERO_SIZE_ALLOC (UINT_MAX)
+
+#define ZERO_SIZE_PTR ((void *)16)

#define ZERO_OR_NULL_PTR(x) ((unsigned long)(x) <= \
(unsigned long)ZERO_SIZE_PTR)
@@ -350,7 +356,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
static __always_inline unsigned int kmalloc_index(size_t size)
{
if (!size)
- return 0;
+ return ZERO_SIZE_ALLOC;

if (size <= KMALLOC_MIN_SIZE)
return KMALLOC_SHIFT_LOW;
@@ -546,7 +552,7 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
#ifndef CONFIG_SLOB
index = kmalloc_index(size);

- if (!index)
+ if (index == ZERO_SIZE_ALLOC)
return ZERO_SIZE_PTR;

return kmem_cache_alloc_trace(
@@ -564,7 +570,7 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
size <= KMALLOC_MAX_CACHE_SIZE) {
unsigned int i = kmalloc_index(size);

- if (!i)
+ if (i == ZERO_SIZE_ALLOC)
return ZERO_SIZE_PTR;

return kmem_cache_alloc_node_trace(
--
2.21.0

2019-09-15 17:00:40

by Pengfei Li

Subject: [PATCH v4 5/7] mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE

Currently, kmalloc_caches[] is not sorted by size: kmalloc_caches[0]
is kmalloc-96 and kmalloc_caches[1] is kmalloc-192 (when
ARCH_DMA_MINALIGN is not defined).

As suggested by Vlastimil Babka,

"Since you're doing these cleanups, have you considered reordering
kmalloc_info, size_index, kmalloc_index() etc so that sizes 96 and 192
are ordered naturally between 64, 128 and 256? That should remove
various special casing such as in create_kmalloc_caches(). I can't
guarantee it will be possible without breaking e.g. constant folding
optimizations etc., but seems to me it should be feasible. (There are
definitely more places to change than those I listed.)"

So this patch reorders kmalloc_info[] and kmalloc_caches[], and
modifies kmalloc_index() and kmalloc_slab() accordingly.

As a result, there is no more subtle special-casing of sizes in
create_kmalloc_caches(), and kmalloc_caches[] is initialized from
index 0 instead of KMALLOC_SHIFT_LOW.

I used ./scripts/bloat-o-meter to measure the impact of this patch on
code size. The results show that it brings some benefit: taking the
size change of kmalloc_info[] into account, the code itself is
actually about 641 bytes smaller.

(Note: the original kmalloc_info[] was renamed to all_kmalloc_info[].)
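
For example, with KMALLOC_SHIFT_LOW == 3 (KMALLOC_MIN_SIZE == 8), the
adjustment macros below come out as KMALLOC_IDX_ADJ_0 == 3,
KMALLOC_IDX_ADJ_1 == 2 and KMALLOC_IDX_ADJ_2 == 1, so kmalloc_index()
maps sizes densely and in order:

	kmalloc_index(8)   == 3 - 3 == 0	/* kmalloc-8   */
	kmalloc_index(64)  == 6 - 3 == 3	/* kmalloc-64  */
	kmalloc_index(96)  == 7 - 3 == 4	/* kmalloc-96  */
	kmalloc_index(128) == 7 - 2 == 5	/* kmalloc-128 */
	kmalloc_index(192) == 8 - 2 == 6	/* kmalloc-192 */
	kmalloc_index(256) == 8 - 1 == 7	/* kmalloc-256 */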

$ ./scripts/bloat-o-meter vmlinux.old vmlinux.patch_1-5
add/remove: 1/2 grow/shrink: 6/64 up/down: 872/-1113 (-241)
Function old new delta
all_kmalloc_info - 832 +832
crypto_create_tfm 211 225 +14
ieee80211_key_alloc 1159 1169 +10
nl80211_parse_sched_scan 2787 2795 +8
ida_alloc_range 951 955 +4
find_get_context.isra 634 637 +3
sd_probe 947 948 +1
nla_strdup 143 142 -1
trace_parser_get_init 71 69 -2
pkcs7_verify.cold 318 316 -2
xhci_alloc_tt_info 349 346 -3
units 323 320 -3
nl80211_set_reg 642 639 -3
i915_sw_fence_await_dma_fence 445 441 -4
nf_queue 671 666 -5
kmalloc_slab 102 97 -5
xhci_segment_alloc 303 297 -6
xhci_alloc_container_ctx 221 215 -6
xfrm_policy_alloc 277 271 -6
selinux_sk_alloc_security 119 113 -6
sdev_evt_send_simple 124 118 -6
sdev_evt_alloc 85 79 -6
sbitmap_queue_init_node 424 418 -6
regulatory_hint_found_beacon 400 394 -6
nf_ct_tmpl_alloc 91 85 -6
gss_create_cred 146 140 -6
drm_flip_work_allocate_task 76 70 -6
cfg80211_stop_iface 266 260 -6
cfg80211_sinfo_alloc_tid_stats 83 77 -6
cfg80211_port_authorized 218 212 -6
cfg80211_ibss_joined 341 335 -6
call_usermodehelper_setup 155 149 -6
bpf_prog_alloc_no_stats 188 182 -6
blk_alloc_flush_queue 197 191 -6
bdi_alloc_node 201 195 -6
_netlbl_catmap_getnode 253 247 -6
____ip_mc_inc_group 481 475 -6
pkcs7_verify 1503 1495 -8
audit_log_d_path 204 196 -8
xprt_switch_alloc 145 136 -9
xhci_ring_alloc 378 369 -9
xhci_mem_init 3673 3664 -9
xhci_alloc_virt_device 505 496 -9
xhci_alloc_stream_info 727 718 -9
xhci_alloc_command 221 212 -9
tcp_sendmsg_locked 3129 3120 -9
tcp_md5_do_add 783 774 -9
tcp_fastopen_defer_connect 279 270 -9
sr_read_tochdr.isra 260 251 -9
sr_read_tocentry.isra 337 328 -9
sr_is_xa 385 376 -9
sr_get_mcn 269 260 -9
scsi_probe_and_add_lun 2947 2938 -9
ring_buffer_read_prepare 103 94 -9
request_firmware_nowait 405 396 -9
ohci_urb_enqueue 3185 3176 -9
nfs_alloc_seqid 96 87 -9
nfs4_get_state_owner 1049 1040 -9
nfs4_do_close 587 578 -9
mempool_create_node 173 164 -9
ip6_setup_cork 1030 1021 -9
dma_pool_alloc 419 410 -9
devres_open_group 223 214 -9
cfg80211_parse_mbssid_data 2406 2397 -9
__igmp_group_dropped 629 619 -10
gss_import_sec_context 187 176 -11
ip_setup_cork 374 362 -12
__i915_sw_fence_await_sw_fence 429 417 -12
kmalloc_caches 336 312 -24
create_kmalloc_caches 270 214 -56
kmalloc_cache_name 57 - -57
new_kmalloc_cache 112 - -112
kmalloc_info 432 8 -424
Total: Before=14874616, After=14874375, chg -0.00%

Signed-off-by: Pengfei Li <[email protected]>
---
include/linux/slab.h | 96 ++++++++++++++++----------
mm/slab.h | 10 +--
mm/slab_common.c | 158 ++++++++++++++++---------------------------
mm/slub.c | 12 ++--
4 files changed, 133 insertions(+), 143 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 1f05f68f2c3e..f53bb6980110 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -297,6 +297,23 @@ static inline void __check_heap_object(const void *ptr, unsigned long n,
#define KMALLOC_MIN_SIZE (1 << KMALLOC_SHIFT_LOW)
#endif

+#define KMALLOC_CACHE_MIN_NUM (KMALLOC_SHIFT_HIGH - KMALLOC_SHIFT_LOW + 1)
+
+#if KMALLOC_MIN_SIZE > 64
+ #define KMALLOC_SIZE_96_EXIST (0)
+ #define KMALLOC_SIZE_192_EXIST (0)
+#elif KMALLOC_MIN_SIZE > 32
+ #define KMALLOC_SIZE_96_EXIST (0)
+ #define KMALLOC_SIZE_192_EXIST (1)
+#else
+ #define KMALLOC_SIZE_96_EXIST (1)
+ #define KMALLOC_SIZE_192_EXIST (1)
+#endif
+
+#define KMALLOC_CACHE_NUM (KMALLOC_CACHE_MIN_NUM \
+ + KMALLOC_SIZE_96_EXIST \
+ + KMALLOC_SIZE_192_EXIST)
+
/*
* This restriction comes from byte sized index implementation.
* Page size is normally 2^12 bytes and, in this case, if we want to use
@@ -323,7 +340,7 @@ enum kmalloc_cache_type {

#ifndef CONFIG_SLOB
extern struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
+kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM];

static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
@@ -345,13 +362,18 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
#endif
}

+/* kmalloc_index adjust size: (0, 96] */
+#define KMALLOC_IDX_ADJ_0 (KMALLOC_SHIFT_LOW)
+
+/* kmalloc_index adjust size: (96, 192] */
+#define KMALLOC_IDX_ADJ_1 (KMALLOC_IDX_ADJ_0 - KMALLOC_SIZE_96_EXIST)
+
+/* kmalloc_index adjust size: (192, N] */
+#define KMALLOC_IDX_ADJ_2 (KMALLOC_IDX_ADJ_1 - KMALLOC_SIZE_192_EXIST)
+
/*
* Figure out which kmalloc slab an allocation of a certain size
* belongs to.
- * 0 = zero alloc
- * 1 = 65 .. 96 bytes
- * 2 = 129 .. 192 bytes
- * n = 2^(n-1)+1 .. 2^n
*/
static __always_inline unsigned int kmalloc_index(size_t size)
{
@@ -359,36 +381,40 @@ static __always_inline unsigned int kmalloc_index(size_t size)
return ZERO_SIZE_ALLOC;

if (size <= KMALLOC_MIN_SIZE)
- return KMALLOC_SHIFT_LOW;
-
- if (KMALLOC_MIN_SIZE <= 32 && size > 64 && size <= 96)
- return 1;
- if (KMALLOC_MIN_SIZE <= 64 && size > 128 && size <= 192)
- return 2;
- if (size <= 8) return 3;
- if (size <= 16) return 4;
- if (size <= 32) return 5;
- if (size <= 64) return 6;
- if (size <= 128) return 7;
- if (size <= 256) return 8;
- if (size <= 512) return 9;
- if (size <= 1024) return 10;
- if (size <= 2 * 1024) return 11;
- if (size <= 4 * 1024) return 12;
- if (size <= 8 * 1024) return 13;
- if (size <= 16 * 1024) return 14;
- if (size <= 32 * 1024) return 15;
- if (size <= 64 * 1024) return 16;
- if (size <= 128 * 1024) return 17;
- if (size <= 256 * 1024) return 18;
- if (size <= 512 * 1024) return 19;
- if (size <= 1024 * 1024) return 20;
- if (size <= 2 * 1024 * 1024) return 21;
- if (size <= 4 * 1024 * 1024) return 22;
- if (size <= 8 * 1024 * 1024) return 23;
- if (size <= 16 * 1024 * 1024) return 24;
- if (size <= 32 * 1024 * 1024) return 25;
- if (size <= 64 * 1024 * 1024) return 26;
+ return 0;
+
+#if KMALLOC_SIZE_96_EXIST == 1
+ if (size > 64 && size <= 96) return (7 - KMALLOC_IDX_ADJ_0);
+#endif
+
+#if KMALLOC_SIZE_192_EXIST == 1
+ if (size > 128 && size <= 192) return (8 - KMALLOC_IDX_ADJ_1);
+#endif
+
+ if (size <= 8) return ( 3 - KMALLOC_IDX_ADJ_0);
+ if (size <= 16) return ( 4 - KMALLOC_IDX_ADJ_0);
+ if (size <= 32) return ( 5 - KMALLOC_IDX_ADJ_0);
+ if (size <= 64) return ( 6 - KMALLOC_IDX_ADJ_0);
+ if (size <= 128) return ( 7 - KMALLOC_IDX_ADJ_1);
+ if (size <= 256) return ( 8 - KMALLOC_IDX_ADJ_2);
+ if (size <= 512) return ( 9 - KMALLOC_IDX_ADJ_2);
+ if (size <= 1024) return (10 - KMALLOC_IDX_ADJ_2);
+ if (size <= 2 * 1024) return (11 - KMALLOC_IDX_ADJ_2);
+ if (size <= 4 * 1024) return (12 - KMALLOC_IDX_ADJ_2);
+ if (size <= 8 * 1024) return (13 - KMALLOC_IDX_ADJ_2);
+ if (size <= 16 * 1024) return (14 - KMALLOC_IDX_ADJ_2);
+ if (size <= 32 * 1024) return (15 - KMALLOC_IDX_ADJ_2);
+ if (size <= 64 * 1024) return (16 - KMALLOC_IDX_ADJ_2);
+ if (size <= 128 * 1024) return (17 - KMALLOC_IDX_ADJ_2);
+ if (size <= 256 * 1024) return (18 - KMALLOC_IDX_ADJ_2);
+ if (size <= 512 * 1024) return (19 - KMALLOC_IDX_ADJ_2);
+ if (size <= 1024 * 1024) return (20 - KMALLOC_IDX_ADJ_2);
+ if (size <= 2 * 1024 * 1024) return (21 - KMALLOC_IDX_ADJ_2);
+ if (size <= 4 * 1024 * 1024) return (22 - KMALLOC_IDX_ADJ_2);
+ if (size <= 8 * 1024 * 1024) return (23 - KMALLOC_IDX_ADJ_2);
+ if (size <= 16 * 1024 * 1024) return (24 - KMALLOC_IDX_ADJ_2);
+ if (size <= 32 * 1024 * 1024) return (25 - KMALLOC_IDX_ADJ_2);
+ if (size <= 64 * 1024 * 1024) return (26 - KMALLOC_IDX_ADJ_2);
BUG();

/* Will never be reached. Needed because the compiler may complain */
diff --git a/mm/slab.h b/mm/slab.h
index 2fc8f956906a..3ada65ef1118 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -63,6 +63,11 @@ enum slab_state {
FULL /* Everything is working */
};

+struct kmalloc_info_struct {
+ const char *name[NR_KMALLOC_TYPES];
+ unsigned int size;
+};
+
extern enum slab_state slab_state;

/* The slab cache mutex protects the management structures during changes */
@@ -75,10 +80,7 @@ extern struct list_head slab_caches;
extern struct kmem_cache *kmem_cache;

/* A table of kmalloc cache names and sizes */
-extern const struct kmalloc_info_struct {
- const char *name[NR_KMALLOC_TYPES];
- unsigned int size;
-} kmalloc_info[];
+extern const struct kmalloc_info_struct * const kmalloc_info;

#ifndef CONFIG_SLOB
/* Kmalloc array related functions */
diff --git a/mm/slab_common.c b/mm/slab_common.c
index af45b5278fdc..2aed30deb071 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1028,7 +1028,7 @@ struct kmem_cache *__init create_kmalloc_cache(const char *name,
}

struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1] __ro_after_init =
+kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM] __ro_after_init =
{ /* initialization for https://bugs.llvm.org/show_bug.cgi?id=42570 */ };
EXPORT_SYMBOL(kmalloc_caches);

@@ -1039,30 +1039,30 @@ EXPORT_SYMBOL(kmalloc_caches);
* fls.
*/
static u8 size_index[24] __ro_after_init = {
- 3, /* 8 */
- 4, /* 16 */
- 5, /* 24 */
- 5, /* 32 */
- 6, /* 40 */
- 6, /* 48 */
- 6, /* 56 */
- 6, /* 64 */
- 1, /* 72 */
- 1, /* 80 */
- 1, /* 88 */
- 1, /* 96 */
- 7, /* 104 */
- 7, /* 112 */
- 7, /* 120 */
- 7, /* 128 */
- 2, /* 136 */
- 2, /* 144 */
- 2, /* 152 */
- 2, /* 160 */
- 2, /* 168 */
- 2, /* 176 */
- 2, /* 184 */
- 2 /* 192 */
+ (3 - KMALLOC_IDX_ADJ_0), /* 8 */
+ (4 - KMALLOC_IDX_ADJ_0), /* 16 */
+ (5 - KMALLOC_IDX_ADJ_0), /* 24 */
+ (5 - KMALLOC_IDX_ADJ_0), /* 32 */
+ (6 - KMALLOC_IDX_ADJ_0), /* 40 */
+ (6 - KMALLOC_IDX_ADJ_0), /* 48 */
+ (6 - KMALLOC_IDX_ADJ_0), /* 56 */
+ (6 - KMALLOC_IDX_ADJ_0), /* 64 */
+ (7 - KMALLOC_IDX_ADJ_0), /* 72 */
+ (7 - KMALLOC_IDX_ADJ_0), /* 80 */
+ (7 - KMALLOC_IDX_ADJ_0), /* 88 */
+ (7 - KMALLOC_IDX_ADJ_0), /* 96 */
+ (7 - KMALLOC_IDX_ADJ_1), /* 104 */
+ (7 - KMALLOC_IDX_ADJ_1), /* 112 */
+ (7 - KMALLOC_IDX_ADJ_1), /* 120 */
+ (7 - KMALLOC_IDX_ADJ_1), /* 128 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 136 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 144 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 152 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 160 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 168 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 176 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 184 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 192 */
};

static inline unsigned int size_index_elem(unsigned int bytes)
@@ -1086,13 +1086,17 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
} else {
if (WARN_ON_ONCE(size > KMALLOC_MAX_CACHE_SIZE))
return NULL;
- index = fls(size - 1);
+
+ index = fls(size - 1) - KMALLOC_IDX_ADJ_2;
}

return kmalloc_caches[kmalloc_type(flags)][index];
}

#ifdef CONFIG_ZONE_DMA
+
+#define KMALLOC_INFO_SHIFT_LOW (3)
+#define KMALLOC_INFO_START_IDX (KMALLOC_SHIFT_LOW - KMALLOC_INFO_SHIFT_LOW)
#define SET_KMALLOC_SIZE(__size, __short_size) \
{ \
.name[KMALLOC_NORMAL] = "kmalloc-" #__short_size, \
@@ -1110,40 +1114,35 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
#endif

/*
- * kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
- * kmalloc_index() supports up to 2^26=64MB, so the final entry of the table is
- * kmalloc-67108864.
+ * all_kmalloc_info[] is to make slub_debug=, kmalloc-xx option work at boot
+ * time. kmalloc_index() supports up to 2^26=64MB, so the final entry of the
+ * table is kmalloc-67108864.
*/
-const struct kmalloc_info_struct kmalloc_info[] __initconst = {
- SET_KMALLOC_SIZE(0, 0),
- SET_KMALLOC_SIZE(96, 96),
- SET_KMALLOC_SIZE(192, 192),
- SET_KMALLOC_SIZE(8, 8),
- SET_KMALLOC_SIZE(16, 16),
- SET_KMALLOC_SIZE(32, 32),
- SET_KMALLOC_SIZE(64, 64),
- SET_KMALLOC_SIZE(128, 128),
- SET_KMALLOC_SIZE(256, 256),
- SET_KMALLOC_SIZE(512, 512),
- SET_KMALLOC_SIZE(1024, 1k),
- SET_KMALLOC_SIZE(2048, 2k),
- SET_KMALLOC_SIZE(4096, 4k),
- SET_KMALLOC_SIZE(8192, 8k),
- SET_KMALLOC_SIZE(16384, 16k),
- SET_KMALLOC_SIZE(32768, 32k),
- SET_KMALLOC_SIZE(65536, 64k),
- SET_KMALLOC_SIZE(131072, 128k),
- SET_KMALLOC_SIZE(262144, 256k),
- SET_KMALLOC_SIZE(524288, 512k),
- SET_KMALLOC_SIZE(1048576, 1M),
- SET_KMALLOC_SIZE(2097152, 2M),
- SET_KMALLOC_SIZE(4194304, 4M),
- SET_KMALLOC_SIZE(8388608, 8M),
- SET_KMALLOC_SIZE(16777216, 16M),
- SET_KMALLOC_SIZE(33554432, 32M),
- SET_KMALLOC_SIZE(67108864, 64M)
+const struct kmalloc_info_struct all_kmalloc_info[] __initconst = {
+ SET_KMALLOC_SIZE( 8, 8), SET_KMALLOC_SIZE( 16, 16),
+ SET_KMALLOC_SIZE( 32, 32), SET_KMALLOC_SIZE( 64, 64),
+#if KMALLOC_SIZE_96_EXIST == 1
+ SET_KMALLOC_SIZE( 96, 96),
+#endif
+ SET_KMALLOC_SIZE( 128, 128),
+#if KMALLOC_SIZE_192_EXIST == 1
+ SET_KMALLOC_SIZE( 192, 192),
+#endif
+ SET_KMALLOC_SIZE( 256, 256), SET_KMALLOC_SIZE( 512, 512),
+ SET_KMALLOC_SIZE( 1024, 1k), SET_KMALLOC_SIZE( 2048, 2k),
+ SET_KMALLOC_SIZE( 4096, 4k), SET_KMALLOC_SIZE( 8192, 8k),
+ SET_KMALLOC_SIZE( 16384, 16k), SET_KMALLOC_SIZE( 32768, 32k),
+ SET_KMALLOC_SIZE( 65536, 64k), SET_KMALLOC_SIZE( 131072, 128k),
+ SET_KMALLOC_SIZE( 262144, 256k), SET_KMALLOC_SIZE( 524288, 512k),
+ SET_KMALLOC_SIZE( 1048576, 1M), SET_KMALLOC_SIZE( 2097152, 2M),
+ SET_KMALLOC_SIZE( 4194304, 4M), SET_KMALLOC_SIZE( 8388608, 8M),
+ SET_KMALLOC_SIZE(16777216, 16M), SET_KMALLOC_SIZE(33554432, 32M),
+ SET_KMALLOC_SIZE(67108864, 64M)
};

+const struct kmalloc_info_struct * const __initconst
+kmalloc_info = &all_kmalloc_info[KMALLOC_INFO_START_IDX];
+
/*
* Patch up the size_index table if we have strange large alignment
* requirements for the kmalloc array. This is only the case for
@@ -1162,33 +1161,8 @@ void __init setup_kmalloc_cache_index_table(void)
BUILD_BUG_ON(KMALLOC_MIN_SIZE > 256 ||
(KMALLOC_MIN_SIZE & (KMALLOC_MIN_SIZE - 1)));

- for (i = 8; i < KMALLOC_MIN_SIZE; i += 8) {
- unsigned int elem = size_index_elem(i);
-
- if (elem >= ARRAY_SIZE(size_index))
- break;
- size_index[elem] = KMALLOC_SHIFT_LOW;
- }
-
- if (KMALLOC_MIN_SIZE >= 64) {
- /*
- * The 96 byte size cache is not used if the alignment
- * is 64 byte.
- */
- for (i = 64 + 8; i <= 96; i += 8)
- size_index[size_index_elem(i)] = 7;
-
- }
-
- if (KMALLOC_MIN_SIZE >= 128) {
- /*
- * The 192 byte sized cache is not used if the alignment
- * is 128 byte. Redirect kmalloc to use the 256 byte cache
- * instead.
- */
- for (i = 128 + 8; i <= 192; i += 8)
- size_index[size_index_elem(i)] = 8;
- }
+ for (i = 8; i < KMALLOC_MIN_SIZE && i <= 192; i += 8)
+ size_index[size_index_elem(i)] = 0;
}

static void __init
@@ -1214,21 +1188,9 @@ void __init create_kmalloc_caches(slab_flags_t flags)
enum kmalloc_cache_type type;

for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
- for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
+ for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
if (!kmalloc_caches[type][i])
new_kmalloc_cache(i, type, flags);
-
- /*
- * Caches that are not of the two-to-the-power-of size.
- * These have to be created immediately after the
- * earlier power of two caches
- */
- if (KMALLOC_MIN_SIZE <= 32 && i == 6 &&
- !kmalloc_caches[type][1])
- new_kmalloc_cache(1, type, flags);
- if (KMALLOC_MIN_SIZE <= 64 && i == 7 &&
- !kmalloc_caches[type][2])
- new_kmalloc_cache(2, type, flags);
}
}

@@ -1236,7 +1198,7 @@ void __init create_kmalloc_caches(slab_flags_t flags)
slab_state = UP;

#ifdef CONFIG_ZONE_DMA
- for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
+ for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];

if (s) {
diff --git a/mm/slub.c b/mm/slub.c
index 8834563cdb4b..0e92ebdcacc9 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4711,7 +4711,7 @@ static void __init resiliency_test(void)
pr_err("\n1. kmalloc-16: Clobber Redzone/next pointer 0x12->0x%p\n\n",
p + 16);

- validate_slab_cache(kmalloc_caches[type][4]);
+ validate_slab_cache(kmalloc_caches[type][1]);

/* Hmmm... The next two are dangerous */
p = kzalloc(32, GFP_KERNEL);
@@ -4720,33 +4720,33 @@ static void __init resiliency_test(void)
p);
pr_err("If allocated object is overwritten then not detectable\n\n");

- validate_slab_cache(kmalloc_caches[type][5]);
+ validate_slab_cache(kmalloc_caches[type][2]);
p = kzalloc(64, GFP_KERNEL);
p += 64 + (get_cycles() & 0xff) * sizeof(void *);
*p = 0x56;
pr_err("\n3. kmalloc-64: corrupting random byte 0x56->0x%p\n",
p);
pr_err("If allocated object is overwritten then not detectable\n\n");
- validate_slab_cache(kmalloc_caches[type][6]);
+ validate_slab_cache(kmalloc_caches[type][3]);

pr_err("\nB. Corruption after free\n");
p = kzalloc(128, GFP_KERNEL);
kfree(p);
*p = 0x78;
pr_err("1. kmalloc-128: Clobber first word 0x78->0x%p\n\n", p);
- validate_slab_cache(kmalloc_caches[type][7]);
+ validate_slab_cache(kmalloc_caches[type][5]);

p = kzalloc(256, GFP_KERNEL);
kfree(p);
p[50] = 0x9a;
pr_err("\n2. kmalloc-256: Clobber 50th byte 0x9a->0x%p\n\n", p);
- validate_slab_cache(kmalloc_caches[type][8]);
+ validate_slab_cache(kmalloc_caches[type][7]);

p = kzalloc(512, GFP_KERNEL);
kfree(p);
p[512] = 0xab;
pr_err("\n3. kmalloc-512: Clobber redzone 0xab->0x%p\n\n", p);
- validate_slab_cache(kmalloc_caches[type][9]);
+ validate_slab_cache(kmalloc_caches[type][8]);
}
#else
#ifdef CONFIG_SYSFS
--
2.21.0

2019-09-15 17:01:21

by Pengfei Li

Subject: [PATCH v4 6/7] mm, slab_common: Initialize the same size of kmalloc_caches[]

In the current implementation, the KMALLOC_RECLAIM caches are not
initialized until all of the KMALLOC_NORMAL sizes have been
initialized.

But create_kmalloc_caches() executes faster if, for each size, the
different kmalloc types are initialized back to back.

$ ./scripts/bloat-o-meter vmlinux.old vmlinux.patch_1-5
add/remove: 1/2 grow/shrink: 6/64 up/down: 872/-1113 (-241)
Function old new delta
create_kmalloc_caches 270 214 -56

$ ./scripts/bloat-o-meter vmlinux.old vmlinux.patch_1-6
add/remove: 1/2 grow/shrink: 6/64 up/down: 872/-1172 (-300)
Function old new delta
create_kmalloc_caches 270 155 -115

The numbers confirm that this really brings a benefit.

Besides, KMALLOC_DMA is currently initialized after "slab_state = UP",
which does not seem necessary.

Commit f97d5f634d3b ("slab: Common function to create the kmalloc
array") introduced create_kmalloc_caches(). I found that for SLAB,
KMALLOC_DMA was initialized before "slab_state = UP", but for SLUB it
was initialized after.

Based on this, I think it is okay to initialize KMALLOC_DMA before
"slab_state = UP".

Signed-off-by: Pengfei Li <[email protected]>
---
mm/slab_common.c | 35 ++++++++++++-----------------------
1 file changed, 12 insertions(+), 23 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 2aed30deb071..e7903bd28b1f 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1165,12 +1165,9 @@ void __init setup_kmalloc_cache_index_table(void)
size_index[size_index_elem(i)] = 0;
}

-static void __init
+static __always_inline void __init
new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
{
- if (type == KMALLOC_RECLAIM)
- flags |= SLAB_RECLAIM_ACCOUNT;
-
kmalloc_caches[type][idx] = create_kmalloc_cache(
kmalloc_info[idx].name[type],
kmalloc_info[idx].size, flags, 0,
@@ -1185,30 +1182,22 @@ new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
void __init create_kmalloc_caches(slab_flags_t flags)
{
int i;
- enum kmalloc_cache_type type;

- for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
- for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
- if (!kmalloc_caches[type][i])
- new_kmalloc_cache(i, type, flags);
- }
- }
+ for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
+ if (!kmalloc_caches[KMALLOC_NORMAL][i])
+ new_kmalloc_cache(i, KMALLOC_NORMAL, flags);

- /* Kmalloc array is now usable */
- slab_state = UP;
+ new_kmalloc_cache(i, KMALLOC_RECLAIM,
+ flags | SLAB_RECLAIM_ACCOUNT);

#ifdef CONFIG_ZONE_DMA
- for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
- struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
-
- if (s) {
- kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
- kmalloc_info[i].name[KMALLOC_DMA],
- kmalloc_info[i].size,
- SLAB_CACHE_DMA | flags, 0, 0);
- }
- }
+ new_kmalloc_cache(i, KMALLOC_DMA,
+ flags | SLAB_CACHE_DMA);
#endif
+ }
+
+ /* Kmalloc array is now usable */
+ slab_state = UP;
}
#endif /* !CONFIG_SLOB */

--
2.21.0

2019-09-15 17:04:29

by Pengfei Li

Subject: [PATCH v4 7/7] mm, slab_common: Modify kmalloc_caches[type][idx] to kmalloc_caches[idx][type]

KMALLOC_NORMAL is the most frequently accessed type, and since the
previous patch kmalloc_caches[] is initialized by iterating over the
different types of each size.

So changing the indexing from kmalloc_caches[type][idx] to
kmalloc_caches[idx][type] will benefit performance.
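
A sketch of the layout change:

	/* before: caches of the same *type* are adjacent in memory */
	struct kmem_cache *kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM];

	/*
	 * after: caches of the same *size* are adjacent, matching both
	 * the per-size initialization order and lookups of the form
	 * kmalloc_caches[constant index][runtime type]
	 */
	struct kmem_cache *kmalloc_caches[KMALLOC_CACHE_NUM][NR_KMALLOC_TYPES];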

$ ./scripts/bloat-o-meter vmlinux.patch_1-6 vmlinux.patch_1-7
add/remove: 0/0 grow/shrink: 2/57 up/down: 8/-457 (-449)
Function old new delta
tg3_self_test 4255 4259 +4
nf_queue 666 670 +4
kmalloc_slab 97 93 -4
i915_sw_fence_await_dma_fence 441 437 -4
__igmp_group_dropped 619 615 -4
gss_import_sec_context 176 170 -6
xhci_alloc_command 212 205 -7
create_kmalloc_caches 155 148 -7
xprt_switch_alloc 136 128 -8
xhci_segment_alloc 297 289 -8
xhci_ring_alloc 369 361 -8
xhci_mem_init 3664 3656 -8
xhci_alloc_virt_device 496 488 -8
xhci_alloc_tt_info 346 338 -8
xhci_alloc_stream_info 718 710 -8
xhci_alloc_container_ctx 215 207 -8
xfrm_policy_alloc 271 263 -8
tcp_sendmsg_locked 3120 3112 -8
tcp_md5_do_add 774 766 -8
tcp_fastopen_defer_connect 270 262 -8
sr_read_tochdr.isra 251 243 -8
sr_read_tocentry.isra 328 320 -8
sr_is_xa 376 368 -8
sr_get_mcn 260 252 -8
selinux_sk_alloc_security 113 105 -8
sdev_evt_send_simple 118 110 -8
sdev_evt_alloc 79 71 -8
scsi_probe_and_add_lun 2938 2930 -8
sbitmap_queue_init_node 418 410 -8
ring_buffer_read_prepare 94 86 -8
request_firmware_nowait 396 388 -8
regulatory_hint_found_beacon 394 386 -8
ohci_urb_enqueue 3176 3168 -8
nla_strdup 142 134 -8
nfs_alloc_seqid 87 79 -8
nfs4_get_state_owner 1040 1032 -8
nfs4_do_close 578 570 -8
nf_ct_tmpl_alloc 85 77 -8
mempool_create_node 164 156 -8
ip_setup_cork 362 354 -8
ip6_setup_cork 1021 1013 -8
gss_create_cred 140 132 -8
drm_flip_work_allocate_task 70 62 -8
dma_pool_alloc 410 402 -8
devres_open_group 214 206 -8
cfg80211_stop_iface 260 252 -8
cfg80211_sinfo_alloc_tid_stats 77 69 -8
cfg80211_port_authorized 212 204 -8
cfg80211_parse_mbssid_data 2397 2389 -8
cfg80211_ibss_joined 335 327 -8
call_usermodehelper_setup 149 141 -8
bpf_prog_alloc_no_stats 182 174 -8
blk_alloc_flush_queue 191 183 -8
bdi_alloc_node 195 187 -8
audit_log_d_path 196 188 -8
_netlbl_catmap_getnode 247 239 -8
____ip_mc_inc_group 475 467 -8
__i915_sw_fence_await_sw_fence 417 405 -12
ida_alloc_range 955 934 -21
Total: Before=14874316, After=14873867, chg -0.00%

Signed-off-by: Pengfei Li <[email protected]>
---
include/linux/slab.h | 6 +++---
mm/slab.c | 4 ++--
mm/slab_common.c | 8 ++++----
mm/slub.c | 12 ++++++------
4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index f53bb6980110..0842db5f7053 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -340,7 +340,7 @@ enum kmalloc_cache_type {

#ifndef CONFIG_SLOB
extern struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM];
+kmalloc_caches[KMALLOC_CACHE_NUM][NR_KMALLOC_TYPES];

static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
@@ -582,7 +582,7 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
return ZERO_SIZE_PTR;

return kmem_cache_alloc_trace(
- kmalloc_caches[kmalloc_type(flags)][index],
+ kmalloc_caches[index][kmalloc_type(flags)],
flags, size);
#endif
}
@@ -600,7 +600,7 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
return ZERO_SIZE_PTR;

return kmem_cache_alloc_node_trace(
- kmalloc_caches[kmalloc_type(flags)][i],
+ kmalloc_caches[i][kmalloc_type(flags)],
flags, node, size);
}
#endif
diff --git a/mm/slab.c b/mm/slab.c
index 7bc4e90e1147..079c3e6ced1f 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1246,7 +1246,7 @@ void __init kmem_cache_init(void)
* Initialize the caches that provide memory for the kmem_cache_node
* structures first. Without this, further allocations will bug.
*/
- kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
+ kmalloc_caches[INDEX_NODE][KMALLOC_NORMAL] = create_kmalloc_cache(
kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
kmalloc_info[INDEX_NODE].size,
ARCH_KMALLOC_FLAGS, 0,
@@ -1263,7 +1263,7 @@ void __init kmem_cache_init(void)
for_each_online_node(nid) {
init_list(kmem_cache, &init_kmem_cache_node[CACHE_CACHE + nid], nid);

- init_list(kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE],
+ init_list(kmalloc_caches[INDEX_NODE][KMALLOC_NORMAL],
&init_kmem_cache_node[SIZE_NODE + nid], nid);
}
}
diff --git a/mm/slab_common.c b/mm/slab_common.c
index e7903bd28b1f..0f465eae32f6 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1028,7 +1028,7 @@ struct kmem_cache *__init create_kmalloc_cache(const char *name,
}

struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM] __ro_after_init =
+kmalloc_caches[KMALLOC_CACHE_NUM][NR_KMALLOC_TYPES] __ro_after_init =
{ /* initialization for https://bugs.llvm.org/show_bug.cgi?id=42570 */ };
EXPORT_SYMBOL(kmalloc_caches);

@@ -1090,7 +1090,7 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
index = fls(size - 1) - KMALLOC_IDX_ADJ_2;
}

- return kmalloc_caches[kmalloc_type(flags)][index];
+ return kmalloc_caches[index][kmalloc_type(flags)];
}

#ifdef CONFIG_ZONE_DMA
@@ -1168,7 +1168,7 @@ void __init setup_kmalloc_cache_index_table(void)
static __always_inline void __init
new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
{
- kmalloc_caches[type][idx] = create_kmalloc_cache(
+ kmalloc_caches[idx][type] = create_kmalloc_cache(
kmalloc_info[idx].name[type],
kmalloc_info[idx].size, flags, 0,
kmalloc_info[idx].size);
@@ -1184,7 +1184,7 @@ void __init create_kmalloc_caches(slab_flags_t flags)
int i;

for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
- if (!kmalloc_caches[KMALLOC_NORMAL][i])
+ if (!kmalloc_caches[i][KMALLOC_NORMAL])
new_kmalloc_cache(i, KMALLOC_NORMAL, flags);

new_kmalloc_cache(i, KMALLOC_RECLAIM,
diff --git a/mm/slub.c b/mm/slub.c
index 0e92ebdcacc9..e87243a16768 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4711,7 +4711,7 @@ static void __init resiliency_test(void)
pr_err("\n1. kmalloc-16: Clobber Redzone/next pointer 0x12->0x%p\n\n",
p + 16);

- validate_slab_cache(kmalloc_caches[type][1]);
+ validate_slab_cache(kmalloc_caches[1][type]);

/* Hmmm... The next two are dangerous */
p = kzalloc(32, GFP_KERNEL);
@@ -4720,33 +4720,33 @@ static void __init resiliency_test(void)
p);
pr_err("If allocated object is overwritten then not detectable\n\n");

- validate_slab_cache(kmalloc_caches[type][2]);
+ validate_slab_cache(kmalloc_caches[2][type]);
p = kzalloc(64, GFP_KERNEL);
p += 64 + (get_cycles() & 0xff) * sizeof(void *);
*p = 0x56;
pr_err("\n3. kmalloc-64: corrupting random byte 0x56->0x%p\n",
p);
pr_err("If allocated object is overwritten then not detectable\n\n");
- validate_slab_cache(kmalloc_caches[type][3]);
+ validate_slab_cache(kmalloc_caches[3][type]);

pr_err("\nB. Corruption after free\n");
p = kzalloc(128, GFP_KERNEL);
kfree(p);
*p = 0x78;
pr_err("1. kmalloc-128: Clobber first word 0x78->0x%p\n\n", p);
- validate_slab_cache(kmalloc_caches[type][5]);
+ validate_slab_cache(kmalloc_caches[5][type]);

p = kzalloc(256, GFP_KERNEL);
kfree(p);
p[50] = 0x9a;
pr_err("\n2. kmalloc-256: Clobber 50th byte 0x9a->0x%p\n\n", p);
- validate_slab_cache(kmalloc_caches[type][7]);
+ validate_slab_cache(kmalloc_caches[7][type]);

p = kzalloc(512, GFP_KERNEL);
kfree(p);
p[512] = 0xab;
pr_err("\n3. kmalloc-512: Clobber redzone 0xab->0x%p\n\n", p);
- validate_slab_cache(kmalloc_caches[type][8]);
+ validate_slab_cache(kmalloc_caches[8][type]);
}
#else
#ifdef CONFIG_SYSFS
--
2.21.0

diff --git a/mm/slab_common.c b/mm/slab_common.c
index af45b5278fdc..2aed30deb071 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1028,7 +1028,7 @@ struct kmem_cache *__init create_kmalloc_cache(const char *name,
}

struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1] __ro_after_init =
+kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM] __ro_after_init =
{ /* initialization for https://bugs.llvm.org/show_bug.cgi?id=42570 */ };
EXPORT_SYMBOL(kmalloc_caches);

@@ -1039,30 +1039,30 @@ EXPORT_SYMBOL(kmalloc_caches);
* fls.
*/
static u8 size_index[24] __ro_after_init = {
- 3, /* 8 */
- 4, /* 16 */
- 5, /* 24 */
- 5, /* 32 */
- 6, /* 40 */
- 6, /* 48 */
- 6, /* 56 */
- 6, /* 64 */
- 1, /* 72 */
- 1, /* 80 */
- 1, /* 88 */
- 1, /* 96 */
- 7, /* 104 */
- 7, /* 112 */
- 7, /* 120 */
- 7, /* 128 */
- 2, /* 136 */
- 2, /* 144 */
- 2, /* 152 */
- 2, /* 160 */
- 2, /* 168 */
- 2, /* 176 */
- 2, /* 184 */
- 2 /* 192 */
+ (3 - KMALLOC_IDX_ADJ_0), /* 8 */
+ (4 - KMALLOC_IDX_ADJ_0), /* 16 */
+ (5 - KMALLOC_IDX_ADJ_0), /* 24 */
+ (5 - KMALLOC_IDX_ADJ_0), /* 32 */
+ (6 - KMALLOC_IDX_ADJ_0), /* 40 */
+ (6 - KMALLOC_IDX_ADJ_0), /* 48 */
+ (6 - KMALLOC_IDX_ADJ_0), /* 56 */
+ (6 - KMALLOC_IDX_ADJ_0), /* 64 */
+ (7 - KMALLOC_IDX_ADJ_0), /* 72 */
+ (7 - KMALLOC_IDX_ADJ_0), /* 80 */
+ (7 - KMALLOC_IDX_ADJ_0), /* 88 */
+ (7 - KMALLOC_IDX_ADJ_0), /* 96 */
+ (7 - KMALLOC_IDX_ADJ_1), /* 104 */
+ (7 - KMALLOC_IDX_ADJ_1), /* 112 */
+ (7 - KMALLOC_IDX_ADJ_1), /* 120 */
+ (7 - KMALLOC_IDX_ADJ_1), /* 128 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 136 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 144 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 152 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 160 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 168 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 176 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 184 */
+ (8 - KMALLOC_IDX_ADJ_1), /* 192 */
};

static inline unsigned int size_index_elem(unsigned int bytes)
@@ -1086,13 +1086,17 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
} else {
if (WARN_ON_ONCE(size > KMALLOC_MAX_CACHE_SIZE))
return NULL;
- index = fls(size - 1);
+
+ index = fls(size - 1) - KMALLOC_IDX_ADJ_2;
}

return kmalloc_caches[kmalloc_type(flags)][index];
}

#ifdef CONFIG_ZONE_DMA
+
+#define KMALLOC_INFO_SHIFT_LOW (3)
+#define KMALLOC_INFO_START_IDX (KMALLOC_SHIFT_LOW - KMALLOC_INFO_SHIFT_LOW)
#define SET_KMALLOC_SIZE(__size, __short_size) \
{ \
.name[KMALLOC_NORMAL] = "kmalloc-" #__short_size, \
@@ -1110,40 +1114,35 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
#endif

/*
- * kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
- * kmalloc_index() supports up to 2^26=64MB, so the final entry of the table is
- * kmalloc-67108864.
+ * all_kmalloc_info[] is to make the slub_debug=,kmalloc-xx option work at boot
+ * time. kmalloc_index() supports up to 2^26=64MB, so the final entry of the
+ * table is kmalloc-67108864.
*/
-const struct kmalloc_info_struct kmalloc_info[] __initconst = {
- SET_KMALLOC_SIZE(0, 0),
- SET_KMALLOC_SIZE(96, 96),
- SET_KMALLOC_SIZE(192, 192),
- SET_KMALLOC_SIZE(8, 8),
- SET_KMALLOC_SIZE(16, 16),
- SET_KMALLOC_SIZE(32, 32),
- SET_KMALLOC_SIZE(64, 64),
- SET_KMALLOC_SIZE(128, 128),
- SET_KMALLOC_SIZE(256, 256),
- SET_KMALLOC_SIZE(512, 512),
- SET_KMALLOC_SIZE(1024, 1k),
- SET_KMALLOC_SIZE(2048, 2k),
- SET_KMALLOC_SIZE(4096, 4k),
- SET_KMALLOC_SIZE(8192, 8k),
- SET_KMALLOC_SIZE(16384, 16k),
- SET_KMALLOC_SIZE(32768, 32k),
- SET_KMALLOC_SIZE(65536, 64k),
- SET_KMALLOC_SIZE(131072, 128k),
- SET_KMALLOC_SIZE(262144, 256k),
- SET_KMALLOC_SIZE(524288, 512k),
- SET_KMALLOC_SIZE(1048576, 1M),
- SET_KMALLOC_SIZE(2097152, 2M),
- SET_KMALLOC_SIZE(4194304, 4M),
- SET_KMALLOC_SIZE(8388608, 8M),
- SET_KMALLOC_SIZE(16777216, 16M),
- SET_KMALLOC_SIZE(33554432, 32M),
- SET_KMALLOC_SIZE(67108864, 64M)
+const struct kmalloc_info_struct all_kmalloc_info[] __initconst = {
+ SET_KMALLOC_SIZE( 8, 8), SET_KMALLOC_SIZE( 16, 16),
+ SET_KMALLOC_SIZE( 32, 32), SET_KMALLOC_SIZE( 64, 64),
+#if KMALLOC_SIZE_96_EXIST == 1
+ SET_KMALLOC_SIZE( 96, 96),
+#endif
+ SET_KMALLOC_SIZE( 128, 128),
+#if KMALLOC_SIZE_192_EXIST == 1
+ SET_KMALLOC_SIZE( 192, 192),
+#endif
+ SET_KMALLOC_SIZE( 256, 256), SET_KMALLOC_SIZE( 512, 512),
+ SET_KMALLOC_SIZE( 1024, 1k), SET_KMALLOC_SIZE( 2048, 2k),
+ SET_KMALLOC_SIZE( 4096, 4k), SET_KMALLOC_SIZE( 8192, 8k),
+ SET_KMALLOC_SIZE( 16384, 16k), SET_KMALLOC_SIZE( 32768, 32k),
+ SET_KMALLOC_SIZE( 65536, 64k), SET_KMALLOC_SIZE( 131072, 128k),
+ SET_KMALLOC_SIZE( 262144, 256k), SET_KMALLOC_SIZE( 524288, 512k),
+ SET_KMALLOC_SIZE( 1048576, 1M), SET_KMALLOC_SIZE( 2097152, 2M),
+ SET_KMALLOC_SIZE( 4194304, 4M), SET_KMALLOC_SIZE( 8388608, 8M),
+ SET_KMALLOC_SIZE(16777216, 16M), SET_KMALLOC_SIZE(33554432, 32M),
+ SET_KMALLOC_SIZE(67108864, 64M)
};

+const struct kmalloc_info_struct * const __initconst
+kmalloc_info = &all_kmalloc_info[KMALLOC_INFO_START_IDX];
+
/*
* Patch up the size_index table if we have strange large alignment
* requirements for the kmalloc array. This is only the case for
@@ -1162,33 +1161,8 @@ void __init setup_kmalloc_cache_index_table(void)
BUILD_BUG_ON(KMALLOC_MIN_SIZE > 256 ||
(KMALLOC_MIN_SIZE & (KMALLOC_MIN_SIZE - 1)));

- for (i = 8; i < KMALLOC_MIN_SIZE; i += 8) {
- unsigned int elem = size_index_elem(i);
-
- if (elem >= ARRAY_SIZE(size_index))
- break;
- size_index[elem] = KMALLOC_SHIFT_LOW;
- }
-
- if (KMALLOC_MIN_SIZE >= 64) {
- /*
- * The 96 byte size cache is not used if the alignment
- * is 64 byte.
- */
- for (i = 64 + 8; i <= 96; i += 8)
- size_index[size_index_elem(i)] = 7;
-
- }
-
- if (KMALLOC_MIN_SIZE >= 128) {
- /*
- * The 192 byte sized cache is not used if the alignment
- * is 128 byte. Redirect kmalloc to use the 256 byte cache
- * instead.
- */
- for (i = 128 + 8; i <= 192; i += 8)
- size_index[size_index_elem(i)] = 8;
- }
+ for (i = 8; i < KMALLOC_MIN_SIZE && i <= 192; i += 8)
+ size_index[size_index_elem(i)] = 0;
}

static void __init
@@ -1214,21 +1188,9 @@ void __init create_kmalloc_caches(slab_flags_t flags)
enum kmalloc_cache_type type;

for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
- for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
+ for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
if (!kmalloc_caches[type][i])
new_kmalloc_cache(i, type, flags);
-
- /*
- * Caches that are not of the two-to-the-power-of size.
- * These have to be created immediately after the
- * earlier power of two caches
- */
- if (KMALLOC_MIN_SIZE <= 32 && i == 6 &&
- !kmalloc_caches[type][1])
- new_kmalloc_cache(1, type, flags);
- if (KMALLOC_MIN_SIZE <= 64 && i == 7 &&
- !kmalloc_caches[type][2])
- new_kmalloc_cache(2, type, flags);
}
}

@@ -1236,7 +1198,7 @@ void __init create_kmalloc_caches(slab_flags_t flags)
slab_state = UP;

#ifdef CONFIG_ZONE_DMA
- for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
+ for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];

if (s) {
diff --git a/mm/slub.c b/mm/slub.c
index 8834563cdb4b..0e92ebdcacc9 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4711,7 +4711,7 @@ static void __init resiliency_test(void)
pr_err("\n1. kmalloc-16: Clobber Redzone/next pointer 0x12->0x%p\n\n",
p + 16);

- validate_slab_cache(kmalloc_caches[type][4]);
+ validate_slab_cache(kmalloc_caches[type][1]);

/* Hmmm... The next two are dangerous */
p = kzalloc(32, GFP_KERNEL);
@@ -4720,33 +4720,33 @@ static void __init resiliency_test(void)
p);
pr_err("If allocated object is overwritten then not detectable\n\n");

- validate_slab_cache(kmalloc_caches[type][5]);
+ validate_slab_cache(kmalloc_caches[type][2]);
p = kzalloc(64, GFP_KERNEL);
p += 64 + (get_cycles() & 0xff) * sizeof(void *);
*p = 0x56;
pr_err("\n3. kmalloc-64: corrupting random byte 0x56->0x%p\n",
p);
pr_err("If allocated object is overwritten then not detectable\n\n");
- validate_slab_cache(kmalloc_caches[type][6]);
+ validate_slab_cache(kmalloc_caches[type][3]);

pr_err("\nB. Corruption after free\n");
p = kzalloc(128, GFP_KERNEL);
kfree(p);
*p = 0x78;
pr_err("1. kmalloc-128: Clobber first word 0x78->0x%p\n\n", p);
- validate_slab_cache(kmalloc_caches[type][7]);
+ validate_slab_cache(kmalloc_caches[type][5]);

p = kzalloc(256, GFP_KERNEL);
kfree(p);
p[50] = 0x9a;
pr_err("\n2. kmalloc-256: Clobber 50th byte 0x9a->0x%p\n\n", p);
- validate_slab_cache(kmalloc_caches[type][8]);
+ validate_slab_cache(kmalloc_caches[type][7]);

p = kzalloc(512, GFP_KERNEL);
kfree(p);
p[512] = 0xab;
pr_err("\n3. kmalloc-512: Clobber redzone 0xab->0x%p\n\n", p);
- validate_slab_cache(kmalloc_caches[type][9]);
+ validate_slab_cache(kmalloc_caches[type][8]);
}
#else
#ifdef CONFIG_SYSFS
--
2.21.0
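
For readers tracing the index arithmetic in the patch above, the
following stand-alone user-space sketch mirrors the compressed mapping
for the common KMALLOC_SHIFT_LOW == 3 configuration (so both the 96-
and 192-byte caches exist). It is illustrative only; the helper names
are made up for the sketch and are not kernel code.

#include <assert.h>
#include <stdio.h>

#define IDX_ADJ_0 3                 /* sizes (0, 96]:   nothing skipped yet */
#define IDX_ADJ_1 (IDX_ADJ_0 - 1)   /* sizes (96, 192]: 96-byte slot added  */
#define IDX_ADJ_2 (IDX_ADJ_1 - 1)   /* sizes > 192:     192-byte slot added */

/* log2 rounded up, valid for size >= 1 */
static unsigned int ceil_log2(unsigned long size)
{
        unsigned int n = 0;

        while ((1UL << n) < size)
                n++;
        return n;
}

/* Same mapping as the patched kmalloc_index() for this configuration. */
static unsigned int slab_index(unsigned long size)
{
        if (size <= 8)
                return 3 - IDX_ADJ_0;
        if (size > 64 && size <= 96)
                return 7 - IDX_ADJ_0;
        if (size > 128 && size <= 192)
                return 8 - IDX_ADJ_1;
        if (size <= 64)
                return ceil_log2(size) - IDX_ADJ_0;
        if (size <= 128)
                return ceil_log2(size) - IDX_ADJ_1;
        return ceil_log2(size) - IDX_ADJ_2;
}

int main(void)
{
        /* Cache sizes in increasing order must map to dense indices 0..n. */
        unsigned long sizes[] = { 8, 16, 32, 64, 96, 128, 192, 256, 512, 1024 };
        unsigned int i;

        for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
                printf("kmalloc-%lu -> index %u\n",
                       sizes[i], slab_index(sizes[i]));
                assert(slab_index(sizes[i]) == i);
        }
        return 0;
}

With the unused 96/192 slots compiled out, the indices are dense, which
is what lets kmalloc_caches[] shrink from KMALLOC_SHIFT_HIGH + 1 to
KMALLOC_CACHE_NUM entries per type.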

2019-09-15 20:57:28

by Pengfei Li

[permalink] [raw]
Subject: [PATCH v4 7/7] mm, slab_common: Modify kmalloc_caches[type][idx] to kmalloc_caches[idx][type]

KMALLOC_NORMAL is the most frequently accessed kmalloc type, and
kmalloc_caches[] is initialized with caches of the same size but
different types together.

So changing the indexing from kmalloc_caches[type][idx] to
kmalloc_caches[idx][type] benefits performance.
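
One way to see where the win comes from (this mechanism is my reading,
not spelled out above): for a compile-time-constant allocation size the
slab index is itself a constant, so with the swapped order
kmalloc_caches[index] is a fixed base address and only the type
survives as a small runtime offset, while the type variants of one size
also end up adjacent in memory. A minimal user-space sketch with
hypothetical names:

#include <stdio.h>

enum ktype { NORMAL, RECLAIM, DMA, NR_TYPES };
#define NR_CACHES 13

struct kmem_cache;                              /* opaque in this sketch */

static struct kmem_cache *by_type[NR_TYPES][NR_CACHES]; /* old order */
static struct kmem_cache *by_size[NR_CACHES][NR_TYPES]; /* new order */

int main(void)
{
        /*
         * On a 64-bit build: in the old order the entries for one size
         * are NR_CACHES * 8 bytes apart; in the new order they are
         * adjacent, and &by_size[index] folds to a constant when index
         * is known at compile time.
         */
        printf("old: stride between types of one size = %td bytes\n",
               (char *)&by_type[1][1] - (char *)&by_type[0][1]);
        printf("new: stride between types of one size = %td bytes\n",
               (char *)&by_size[1][1] - (char *)&by_size[1][0]);
        return 0;
}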

$ ./scripts/bloat-o-meter vmlinux.patch_1-6 vmlinux.patch_1-7
add/remove: 0/0 grow/shrink: 2/57 up/down: 8/-457 (-449)
Function old new delta
tg3_self_test 4255 4259 +4
nf_queue 666 670 +4
kmalloc_slab 97 93 -4
i915_sw_fence_await_dma_fence 441 437 -4
__igmp_group_dropped 619 615 -4
gss_import_sec_context 176 170 -6
xhci_alloc_command 212 205 -7
create_kmalloc_caches 155 148 -7
xprt_switch_alloc 136 128 -8
xhci_segment_alloc 297 289 -8
xhci_ring_alloc 369 361 -8
xhci_mem_init 3664 3656 -8
xhci_alloc_virt_device 496 488 -8
xhci_alloc_tt_info 346 338 -8
xhci_alloc_stream_info 718 710 -8
xhci_alloc_container_ctx 215 207 -8
xfrm_policy_alloc 271 263 -8
tcp_sendmsg_locked 3120 3112 -8
tcp_md5_do_add 774 766 -8
tcp_fastopen_defer_connect 270 262 -8
sr_read_tochdr.isra 251 243 -8
sr_read_tocentry.isra 328 320 -8
sr_is_xa 376 368 -8
sr_get_mcn 260 252 -8
selinux_sk_alloc_security 113 105 -8
sdev_evt_send_simple 118 110 -8
sdev_evt_alloc 79 71 -8
scsi_probe_and_add_lun 2938 2930 -8
sbitmap_queue_init_node 418 410 -8
ring_buffer_read_prepare 94 86 -8
request_firmware_nowait 396 388 -8
regulatory_hint_found_beacon 394 386 -8
ohci_urb_enqueue 3176 3168 -8
nla_strdup 142 134 -8
nfs_alloc_seqid 87 79 -8
nfs4_get_state_owner 1040 1032 -8
nfs4_do_close 578 570 -8
nf_ct_tmpl_alloc 85 77 -8
mempool_create_node 164 156 -8
ip_setup_cork 362 354 -8
ip6_setup_cork 1021 1013 -8
gss_create_cred 140 132 -8
drm_flip_work_allocate_task 70 62 -8
dma_pool_alloc 410 402 -8
devres_open_group 214 206 -8
cfg80211_stop_iface 260 252 -8
cfg80211_sinfo_alloc_tid_stats 77 69 -8
cfg80211_port_authorized 212 204 -8
cfg80211_parse_mbssid_data 2397 2389 -8
cfg80211_ibss_joined 335 327 -8
call_usermodehelper_setup 149 141 -8
bpf_prog_alloc_no_stats 182 174 -8
blk_alloc_flush_queue 191 183 -8
bdi_alloc_node 195 187 -8
audit_log_d_path 196 188 -8
_netlbl_catmap_getnode 247 239 -8
____ip_mc_inc_group 475 467 -8
__i915_sw_fence_await_sw_fence 417 405 -12
ida_alloc_range 955 934 -21
Total: Before=14874316, After=14873867, chg -0.00%

Signed-off-by: Pengfei Li <[email protected]>
---
include/linux/slab.h | 6 +++---
mm/slab.c | 4 ++--
mm/slab_common.c | 8 ++++----
mm/slub.c | 12 ++++++------
4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index f53bb6980110..0842db5f7053 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -340,7 +340,7 @@ enum kmalloc_cache_type {

#ifndef CONFIG_SLOB
extern struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM];
+kmalloc_caches[KMALLOC_CACHE_NUM][NR_KMALLOC_TYPES];

static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
@@ -582,7 +582,7 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
return ZERO_SIZE_PTR;

return kmem_cache_alloc_trace(
- kmalloc_caches[kmalloc_type(flags)][index],
+ kmalloc_caches[index][kmalloc_type(flags)],
flags, size);
#endif
}
@@ -600,7 +600,7 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
return ZERO_SIZE_PTR;

return kmem_cache_alloc_node_trace(
- kmalloc_caches[kmalloc_type(flags)][i],
+ kmalloc_caches[i][kmalloc_type(flags)],
flags, node, size);
}
#endif
diff --git a/mm/slab.c b/mm/slab.c
index 7bc4e90e1147..079c3e6ced1f 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1246,7 +1246,7 @@ void __init kmem_cache_init(void)
* Initialize the caches that provide memory for the kmem_cache_node
* structures first. Without this, further allocations will bug.
*/
- kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
+ kmalloc_caches[INDEX_NODE][KMALLOC_NORMAL] = create_kmalloc_cache(
kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
kmalloc_info[INDEX_NODE].size,
ARCH_KMALLOC_FLAGS, 0,
@@ -1263,7 +1263,7 @@ void __init kmem_cache_init(void)
for_each_online_node(nid) {
init_list(kmem_cache, &init_kmem_cache_node[CACHE_CACHE + nid], nid);

- init_list(kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE],
+ init_list(kmalloc_caches[INDEX_NODE][KMALLOC_NORMAL],
&init_kmem_cache_node[SIZE_NODE + nid], nid);
}
}
diff --git a/mm/slab_common.c b/mm/slab_common.c
index e7903bd28b1f..0f465eae32f6 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1028,7 +1028,7 @@ struct kmem_cache *__init create_kmalloc_cache(const char *name,
}

struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM] __ro_after_init =
+kmalloc_caches[KMALLOC_CACHE_NUM][NR_KMALLOC_TYPES] __ro_after_init =
{ /* initialization for https://bugs.llvm.org/show_bug.cgi?id=42570 */ };
EXPORT_SYMBOL(kmalloc_caches);

@@ -1090,7 +1090,7 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
index = fls(size - 1) - KMALLOC_IDX_ADJ_2;
}

- return kmalloc_caches[kmalloc_type(flags)][index];
+ return kmalloc_caches[index][kmalloc_type(flags)];
}

#ifdef CONFIG_ZONE_DMA
@@ -1168,7 +1168,7 @@ void __init setup_kmalloc_cache_index_table(void)
static __always_inline void __init
new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
{
- kmalloc_caches[type][idx] = create_kmalloc_cache(
+ kmalloc_caches[idx][type] = create_kmalloc_cache(
kmalloc_info[idx].name[type],
kmalloc_info[idx].size, flags, 0,
kmalloc_info[idx].size);
@@ -1184,7 +1184,7 @@ void __init create_kmalloc_caches(slab_flags_t flags)
int i;

for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
- if (!kmalloc_caches[KMALLOC_NORMAL][i])
+ if (!kmalloc_caches[i][KMALLOC_NORMAL])
new_kmalloc_cache(i, KMALLOC_NORMAL, flags);

new_kmalloc_cache(i, KMALLOC_RECLAIM,
diff --git a/mm/slub.c b/mm/slub.c
index 0e92ebdcacc9..e87243a16768 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4711,7 +4711,7 @@ static void __init resiliency_test(void)
pr_err("\n1. kmalloc-16: Clobber Redzone/next pointer 0x12->0x%p\n\n",
p + 16);

- validate_slab_cache(kmalloc_caches[type][1]);
+ validate_slab_cache(kmalloc_caches[1][type]);

/* Hmmm... The next two are dangerous */
p = kzalloc(32, GFP_KERNEL);
@@ -4720,33 +4720,33 @@ static void __init resiliency_test(void)
p);
pr_err("If allocated object is overwritten then not detectable\n\n");

- validate_slab_cache(kmalloc_caches[type][2]);
+ validate_slab_cache(kmalloc_caches[2][type]);
p = kzalloc(64, GFP_KERNEL);
p += 64 + (get_cycles() & 0xff) * sizeof(void *);
*p = 0x56;
pr_err("\n3. kmalloc-64: corrupting random byte 0x56->0x%p\n",
p);
pr_err("If allocated object is overwritten then not detectable\n\n");
- validate_slab_cache(kmalloc_caches[type][3]);
+ validate_slab_cache(kmalloc_caches[3][type]);

pr_err("\nB. Corruption after free\n");
p = kzalloc(128, GFP_KERNEL);
kfree(p);
*p = 0x78;
pr_err("1. kmalloc-128: Clobber first word 0x78->0x%p\n\n", p);
- validate_slab_cache(kmalloc_caches[type][5]);
+ validate_slab_cache(kmalloc_caches[5][type]);

p = kzalloc(256, GFP_KERNEL);
kfree(p);
p[50] = 0x9a;
pr_err("\n2. kmalloc-256: Clobber 50th byte 0x9a->0x%p\n\n", p);
- validate_slab_cache(kmalloc_caches[type][7]);
+ validate_slab_cache(kmalloc_caches[7][type]);

p = kzalloc(512, GFP_KERNEL);
kfree(p);
p[512] = 0xab;
pr_err("\n3. kmalloc-512: Clobber redzone 0xab->0x%p\n\n", p);
- validate_slab_cache(kmalloc_caches[type][8]);
+ validate_slab_cache(kmalloc_caches[8][type]);
}
#else
#ifdef CONFIG_SYSFS
--
2.21.0