Changes in v4
--
1. [old] abandon patch 4/4
2. [new] patch 4/7:
- return ZERO_SIZE_ALLOC instead of 0 for zero-sized requests
3. [new] patch 5/7:
- reorder kmalloc_info[] and kmalloc_caches[] (in order of size)
- hard to split, so this patch is slightly larger
4. [new] patch 6/7:
- initialize kmalloc_caches[] entries with the same size but different
  types
5. [new] patch 7/7:
- modify kmalloc_caches[type][idx] to kmalloc_caches[idx][type]
Patches 4-7 are newly added; more information can be found in their
commit messages.
Changes in v3
--
1. restore __initconst (patch 1/4)
2. rename patch 3/4
3. add more clarification for patch 4/4
Changes in v2
--
1. remove __initconst (patch 1/5)
2. squash patch 2/5
3. add ack tag from Vlastimil Babka
There are three types of kmalloc caches: KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.
The KMALLOC_NORMAL names are contained in kmalloc_info[].name, but the
KMALLOC_RECLAIM and KMALLOC_DMA names are generated dynamically by
kmalloc_cache_name().
Patch 1 predefines the names of all kmalloc types so that no time is
spent generating names at boot.
This change is worthwhile: the time spent by new_kmalloc_cache() is
reduced by approximately 36.3%.
Time spent by new_kmalloc_cache() (CPU cycles)
  5.3-rc7              66264
  5.3-rc7+patch_1-3    42188
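For illustration, the idea behind patch 1/7 is roughly the following
(a simplified sketch only, not the exact code from the patch; the
INIT_KMALLOC_INFO macro name and the layout shown here are just
shorthand):

/*
 * Sketch: every kmalloc type gets a predefined, compile-time name,
 * so kmalloc_cache_name() no longer has to build names at boot.
 */
struct kmalloc_info_struct {
	const char *name[NR_KMALLOC_TYPES];
	unsigned int size;
};

#define INIT_KMALLOC_INFO(__size, __short_size)			\
{								\
	.name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,	\
	.name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,	\
	.name[KMALLOC_DMA]     = "dma-kmalloc-" #__short_size,	\
	.size = __size,						\
}

const struct kmalloc_info_struct kmalloc_info[] __initconst = {
	INIT_KMALLOC_INFO(0, 0),
	INIT_KMALLOC_INFO(96, 96),
	INIT_KMALLOC_INFO(192, 192),
	INIT_KMALLOC_INFO(8, 8),
	/* ... and so on, up to KMALLOC_SHIFT_HIGH ... */
};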
bloat-o-meter
--
$ ./scripts/bloat-o-meter vmlinux.5.3-rc8 vmlinux.5.3-rc8+patch_1-7
add/remove: 1/2 grow/shrink: 6/65 up/down: 872/-1621 (-749)
Function old new delta
all_kmalloc_info - 832 +832
crypto_create_tfm 211 225 +14
ieee80211_key_alloc 1159 1169 +10
nl80211_parse_sched_scan 2787 2795 +8
tg3_self_test 4255 4259 +4
find_get_context.isra 634 637 +3
sd_probe 947 948 +1
nf_queue 671 670 -1
trace_parser_get_init 71 69 -2
pkcs7_verify.cold 318 316 -2
units 323 320 -3
nl80211_set_reg 642 639 -3
pkcs7_verify 1503 1495 -8
i915_sw_fence_await_dma_fence 445 437 -8
nla_strdup 143 134 -9
kmalloc_slab 102 93 -9
xhci_alloc_tt_info 349 338 -11
xhci_segment_alloc 303 289 -14
xhci_alloc_container_ctx 221 207 -14
xfrm_policy_alloc 277 263 -14
selinux_sk_alloc_security 119 105 -14
sdev_evt_send_simple 124 110 -14
sdev_evt_alloc 85 71 -14
sbitmap_queue_init_node 424 410 -14
regulatory_hint_found_beacon 400 386 -14
nf_ct_tmpl_alloc 91 77 -14
gss_create_cred 146 132 -14
drm_flip_work_allocate_task 76 62 -14
cfg80211_stop_iface 266 252 -14
cfg80211_sinfo_alloc_tid_stats 83 69 -14
cfg80211_port_authorized 218 204 -14
cfg80211_ibss_joined 341 327 -14
call_usermodehelper_setup 155 141 -14
bpf_prog_alloc_no_stats 188 174 -14
blk_alloc_flush_queue 197 183 -14
bdi_alloc_node 201 187 -14
_netlbl_catmap_getnode 253 239 -14
__igmp_group_dropped 629 615 -14
____ip_mc_inc_group 481 467 -14
xhci_alloc_command 221 205 -16
audit_log_d_path 204 188 -16
xprt_switch_alloc 145 128 -17
xhci_ring_alloc 378 361 -17
xhci_mem_init 3673 3656 -17
xhci_alloc_virt_device 505 488 -17
xhci_alloc_stream_info 727 710 -17
tcp_sendmsg_locked 3129 3112 -17
tcp_md5_do_add 783 766 -17
tcp_fastopen_defer_connect 279 262 -17
sr_read_tochdr.isra 260 243 -17
sr_read_tocentry.isra 337 320 -17
sr_is_xa 385 368 -17
sr_get_mcn 269 252 -17
scsi_probe_and_add_lun 2947 2930 -17
ring_buffer_read_prepare 103 86 -17
request_firmware_nowait 405 388 -17
ohci_urb_enqueue 3185 3168 -17
nfs_alloc_seqid 96 79 -17
nfs4_get_state_owner 1049 1032 -17
nfs4_do_close 587 570 -17
mempool_create_node 173 156 -17
ip6_setup_cork 1030 1013 -17
ida_alloc_range 951 934 -17
gss_import_sec_context 187 170 -17
dma_pool_alloc 419 402 -17
devres_open_group 223 206 -17
cfg80211_parse_mbssid_data 2406 2389 -17
ip_setup_cork 374 354 -20
kmalloc_caches 336 312 -24
__i915_sw_fence_await_sw_fence 429 405 -24
kmalloc_cache_name 57 - -57
new_kmalloc_cache 112 - -112
create_kmalloc_caches 270 148 -122
kmalloc_info 432 8 -424
Total: Before=14874616, After=14873867, chg -0.01%
Pengfei Li (7):
mm, slab: Make kmalloc_info[] contain all types of names
mm, slab: Remove unused kmalloc_size()
mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc
caches
mm, slab: Return ZERO_SIZE_ALLOC for zero sized kmalloc requests
mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE
mm, slab_common: Initialize the same size of kmalloc_caches[]
mm, slab_common: Modify kmalloc_caches[type][idx] to
kmalloc_caches[idx][type]
include/linux/slab.h | 136 ++++++++++++++------------
mm/slab.c | 11 ++-
mm/slab.h | 10 +-
mm/slab_common.c | 224 ++++++++++++++++++-------------------------
mm/slub.c | 12 +--
5 files changed, 183 insertions(+), 210 deletions(-)
--
2.21.0
The size of each kmalloc cache can be obtained from kmalloc_info[],
so remove kmalloc_size(), which is no longer used.
Signed-off-by: Pengfei Li <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Roman Gushchin <[email protected]>
---
include/linux/slab.h | 20 --------------------
mm/slab.c | 5 +++--
mm/slab_common.c | 5 ++---
3 files changed, 5 insertions(+), 25 deletions(-)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 56c9c7eed34e..e773e5764b7b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -557,26 +557,6 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
return __kmalloc(size, flags);
}
-/*
- * Determine size used for the nth kmalloc cache.
- * return size or 0 if a kmalloc cache for that
- * size does not exist
- */
-static __always_inline unsigned int kmalloc_size(unsigned int n)
-{
-#ifndef CONFIG_SLOB
- if (n > 2)
- return 1U << n;
-
- if (n == 1 && KMALLOC_MIN_SIZE <= 32)
- return 96;
-
- if (n == 2 && KMALLOC_MIN_SIZE <= 64)
- return 192;
-#endif
- return 0;
-}
-
static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
{
#ifndef CONFIG_SLOB
diff --git a/mm/slab.c b/mm/slab.c
index c42b6211f42e..7bc4e90e1147 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1248,8 +1248,9 @@ void __init kmem_cache_init(void)
*/
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
- kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
- 0, kmalloc_size(INDEX_NODE));
+ kmalloc_info[INDEX_NODE].size,
+ ARCH_KMALLOC_FLAGS, 0,
+ kmalloc_info[INDEX_NODE].size);
slab_state = PARTIAL_NODE;
setup_kmalloc_cache_index_table();
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 002e16673581..8b542cfcc4f2 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1239,11 +1239,10 @@ void __init create_kmalloc_caches(slab_flags_t flags)
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
if (s) {
- unsigned int size = kmalloc_size(i);
-
kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
kmalloc_info[i].name[KMALLOC_DMA],
- size, SLAB_CACHE_DMA | flags, 0, 0);
+ kmalloc_info[i].size,
+ SLAB_CACHE_DMA | flags, 0, 0);
}
}
#endif
--
2.21.0
The type of local variable *type* of new_kmalloc_cache() should
be enum kmalloc_cache_type instead of int, so correct it.
Signed-off-by: Pengfei Li <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Roman Gushchin <[email protected]>
---
mm/slab_common.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8b542cfcc4f2..af45b5278fdc 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1192,7 +1192,7 @@ void __init setup_kmalloc_cache_index_table(void)
}
static void __init
-new_kmalloc_cache(int idx, int type, slab_flags_t flags)
+new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
{
if (type == KMALLOC_RECLAIM)
flags |= SLAB_RECLAIM_ACCOUNT;
@@ -1210,7 +1210,8 @@ new_kmalloc_cache(int idx, int type, slab_flags_t flags)
*/
void __init create_kmalloc_caches(slab_flags_t flags)
{
- int i, type;
+ int i;
+ enum kmalloc_cache_type type;
for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
--
2.21.0
In the current implementation, the KMALLOC_RECLAIM caches are not
initialized until all of the KMALLOC_NORMAL sizes have been initialized.
But create_kmalloc_caches() can run faster if, for each particular size,
the different kmalloc types are initialized one after the other.
$ ./scripts/bloat-o-meter vmlinux.old vmlinux.patch_1-5
add/remove: 1/2 grow/shrink: 6/64 up/down: 872/-1113 (-241)
Function old new delta
create_kmalloc_caches 270 214 -56
$ ./scripts/bloat-o-meter vmlinux.old vmlinux.patch_1-6
add/remove: 1/2 grow/shrink: 6/64 up/down: 872/-1172 (-300)
Function old new delta
create_kmalloc_caches 270 155 -115
We can see that this change really does bring a benefit.
Besides, KMALLOC_DMA is currently initialized after "slab_state = UP",
which does not seem to be necessary.
Commit f97d5f634d3b ("slab: Common function to create the kmalloc
array") introduced create_kmalloc_caches().
I found that for SLAB, KMALLOC_DMA was initialized before
"slab_state = UP", but for SLUB it was initialized after
"slab_state = UP".
Based on this, I think it is okay to initialize KMALLOC_DMA before
"slab_state = UP".
Signed-off-by: Pengfei Li <[email protected]>
---
mm/slab_common.c | 35 ++++++++++++-----------------------
1 file changed, 12 insertions(+), 23 deletions(-)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 2aed30deb071..e7903bd28b1f 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1165,12 +1165,9 @@ void __init setup_kmalloc_cache_index_table(void)
size_index[size_index_elem(i)] = 0;
}
-static void __init
+static __always_inline void __init
new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
{
- if (type == KMALLOC_RECLAIM)
- flags |= SLAB_RECLAIM_ACCOUNT;
-
kmalloc_caches[type][idx] = create_kmalloc_cache(
kmalloc_info[idx].name[type],
kmalloc_info[idx].size, flags, 0,
@@ -1185,30 +1182,22 @@ new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
void __init create_kmalloc_caches(slab_flags_t flags)
{
int i;
- enum kmalloc_cache_type type;
- for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
- for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
- if (!kmalloc_caches[type][i])
- new_kmalloc_cache(i, type, flags);
- }
- }
+ for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
+ if (!kmalloc_caches[KMALLOC_NORMAL][i])
+ new_kmalloc_cache(i, KMALLOC_NORMAL, flags);
- /* Kmalloc array is now usable */
- slab_state = UP;
+ new_kmalloc_cache(i, KMALLOC_RECLAIM,
+ flags | SLAB_RECLAIM_ACCOUNT);
#ifdef CONFIG_ZONE_DMA
- for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
- struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
-
- if (s) {
- kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
- kmalloc_info[i].name[KMALLOC_DMA],
- kmalloc_info[i].size,
- SLAB_CACHE_DMA | flags, 0, 0);
- }
- }
+ new_kmalloc_cache(i, KMALLOC_DMA,
+ flags | SLAB_CACHE_DMA);
#endif
+ }
+
+ /* Kmalloc array is now usable */
+ slab_state = UP;
}
#endif /* !CONFIG_SLOB */
--
2.21.0
On Mon, 16 Sep 2019, Pengfei Li wrote:
> The type of local variable *type* of new_kmalloc_cache() should
> be enum kmalloc_cache_type instead of int, so correct it.
>
> Signed-off-by: Pengfei Li <[email protected]>
> Acked-by: Vlastimil Babka <[email protected]>
> Acked-by: Roman Gushchin <[email protected]>
Acked-by: David Rientjes <[email protected]>
On Mon, 16 Sep 2019, Pengfei Li wrote:
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 2aed30deb071..e7903bd28b1f 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -1165,12 +1165,9 @@ void __init setup_kmalloc_cache_index_table(void)
> size_index[size_index_elem(i)] = 0;
> }
>
> -static void __init
> +static __always_inline void __init
> new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
> {
> - if (type == KMALLOC_RECLAIM)
> - flags |= SLAB_RECLAIM_ACCOUNT;
> -
> kmalloc_caches[type][idx] = create_kmalloc_cache(
> kmalloc_info[idx].name[type],
> kmalloc_info[idx].size, flags, 0,
> @@ -1185,30 +1182,22 @@ new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
> void __init create_kmalloc_caches(slab_flags_t flags)
> {
> int i;
> - enum kmalloc_cache_type type;
>
> - for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
> - for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
> - if (!kmalloc_caches[type][i])
> - new_kmalloc_cache(i, type, flags);
> - }
> - }
> + for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
> + if (!kmalloc_caches[KMALLOC_NORMAL][i])
> + new_kmalloc_cache(i, KMALLOC_NORMAL, flags);
>
> - /* Kmalloc array is now usable */
> - slab_state = UP;
> + new_kmalloc_cache(i, KMALLOC_RECLAIM,
> + flags | SLAB_RECLAIM_ACCOUNT);
This seems less robust, no? Previously we verified that the cache doesn't
exist before creating a new cache over top of it (for NORMAL and RECLAIM).
Now we presume that the RECLAIM cache never exists.
Can we just move a check to new_kmalloc_cache() to see if
kmalloc_caches[type][idx] already exists and, if so, just return? This
should be more robust and simplify create_kmalloc_caches() slightly more.
On Mon, 16 Sep 2019, Pengfei Li wrote:
> The size of kmalloc can be obtained from kmalloc_info[],
> so remove kmalloc_size() that will not be used anymore.
>
> Signed-off-by: Pengfei Li <[email protected]>
> Acked-by: Vlastimil Babka <[email protected]>
> Acked-by: Roman Gushchin <[email protected]>
Acked-by: David Rientjes <[email protected]>
On Mon, Sep 16, 2019 at 5:38 AM David Rientjes <[email protected]> wrote:
Thanks for your review comments!
>
> On Mon, 16 Sep 2019, Pengfei Li wrote:
>
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index 2aed30deb071..e7903bd28b1f 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -1165,12 +1165,9 @@ void __init setup_kmalloc_cache_index_table(void)
> > size_index[size_index_elem(i)] = 0;
> > }
> >
> > -static void __init
> > +static __always_inline void __init
> > new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
> > {
> > - if (type == KMALLOC_RECLAIM)
> > - flags |= SLAB_RECLAIM_ACCOUNT;
> > -
> > kmalloc_caches[type][idx] = create_kmalloc_cache(
> > kmalloc_info[idx].name[type],
> > kmalloc_info[idx].size, flags, 0,
> > @@ -1185,30 +1182,22 @@ new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
> > void __init create_kmalloc_caches(slab_flags_t flags)
> > {
> > int i;
> > - enum kmalloc_cache_type type;
> >
> > - for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
> > - for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
> > - if (!kmalloc_caches[type][i])
> > - new_kmalloc_cache(i, type, flags);
> > - }
> > - }
> > + for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
> > + if (!kmalloc_caches[KMALLOC_NORMAL][i])
> > + new_kmalloc_cache(i, KMALLOC_NORMAL, flags);
> >
> > - /* Kmalloc array is now usable */
> > - slab_state = UP;
> > + new_kmalloc_cache(i, KMALLOC_RECLAIM,
> > + flags | SLAB_RECLAIM_ACCOUNT);
>
> This seems less robust, no? Previously we verified that the cache doesn't
> exist before creating a new cache over top of it (for NORMAL and RECLAIM).
> Now we presume that the RECLAIM cache never exists.
>
Agreed, this is indeed less robust.
I checked the code and found that there is no place that initializes
kmalloc-rcl-xxx before create_kmalloc_caches(), so I assumed that
kmalloc-rcl-xxx is NULL there.
> Can we just move a check to new_kmalloc_cache() to see if
> kmalloc_caches[type][idx] already exists and, if so, just return? This
> should be more robust and simplify create_kmalloc_caches() slightly more.
For better robustness, I will do it as you suggested in v5.
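Roughly, I am thinking of something like the following (untested
sketch; the create_kmalloc_cache() call itself stays as in this patch):

static __always_inline void __init
new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
{
	/* Be robust: skip caches that were already created earlier in boot. */
	if (kmalloc_caches[type][idx])
		return;

	kmalloc_caches[type][idx] = create_kmalloc_cache(
					kmalloc_info[idx].name[type],
					kmalloc_info[idx].size, flags, 0,
					kmalloc_info[idx].size);
}

With the check inside new_kmalloc_cache(), the explicit
!kmalloc_caches[KMALLOC_NORMAL][i] test in create_kmalloc_caches() can
be dropped as well.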