2019-08-29 05:51:06

by Peikan Tsai

[permalink] [raw]
Subject: [PATCH] binder: Use kmem_cache for binder_thread

Hi,

The allocated size for each binder_thread is 512 bytes by kzalloc.
Because the size of binder_thread is fixed and it's only 304 bytes.
It will save 208 bytes per binder_thread when use create a kmem_cache
for the binder_thread.

Signed-off-by: Peikan Tsai <[email protected]>
---
drivers/android/binder.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index dc1c83eafc22..043e0ebd0fe7 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -87,6 +87,8 @@ static struct dentry *binder_debugfs_dir_entry_root;
static struct dentry *binder_debugfs_dir_entry_proc;
static atomic_t binder_last_id;

+static struct kmem_cache *binder_thread_cachep;
+
static int proc_show(struct seq_file *m, void *unused);
DEFINE_SHOW_ATTRIBUTE(proc);

@@ -4696,14 +4698,15 @@ static struct binder_thread *binder_get_thread(struct binder_proc *proc)
thread = binder_get_thread_ilocked(proc, NULL);
binder_inner_proc_unlock(proc);
if (!thread) {
- new_thread = kzalloc(sizeof(*thread), GFP_KERNEL);
+ new_thread = kmem_cache_zalloc(binder_thread_cachep,
+ GFP_KERNEL);
if (new_thread == NULL)
return NULL;
binder_inner_proc_lock(proc);
thread = binder_get_thread_ilocked(proc, new_thread);
binder_inner_proc_unlock(proc);
if (thread != new_thread)
- kfree(new_thread);
+ kmem_cache_free(binder_thread_cachep, new_thread);
}
return thread;
}
@@ -4723,7 +4726,7 @@ static void binder_free_thread(struct binder_thread *thread)
BUG_ON(!list_empty(&thread->todo));
binder_stats_deleted(BINDER_STAT_THREAD);
binder_proc_dec_tmpref(thread->proc);
- kfree(thread);
+ kmem_cache_free(binder_thread_cachep, thread);
}

static int binder_thread_release(struct binder_proc *proc,
@@ -6095,6 +6098,12 @@ static int __init binder_init(void)
if (ret)
return ret;

+ binder_thread_cachep = kmem_cache_create("binder_thread",
+ sizeof(struct binder_thread),
+ 0, 0, NULL);
+ if (!binder_thread_cachep)
+ return -ENOMEM;
+
atomic_set(&binder_transaction_log.cur, ~0U);
atomic_set(&binder_transaction_log_failed.cur, ~0U);

@@ -6167,6 +6176,7 @@ static int __init binder_init(void)

err_alloc_device_names_failed:
debugfs_remove_recursive(binder_debugfs_dir_entry_root);
+ kmem_cache_destroy(binder_thread_cachep);

return ret;
}
--
2.17.1


2019-08-29 06:43:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] binder: Use kmem_cache for binder_thread

On Thu, Aug 29, 2019 at 01:49:53PM +0800, Peikan Tsai wrote:
> Hi,

No need for that in a changelog text :)

> The allocated size for each binder_thread is 512 bytes by kzalloc.
> Because the size of binder_thread is fixed and it's only 304 bytes.
> It will save 208 bytes per binder_thread when use create a kmem_cache
> for the binder_thread.

Are you _sure_ it really will save that much memory? You want to do
allocations based on a nice alignment for lots of good reasons,
especially for something that needs quick accesses.

Did you test your change on a system that relies on binder and find any
speed improvement or decrease, and any actual memory savings?

If so, can you post your results?

thanks,

greg k-h

2019-08-29 13:44:45

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH] binder: Use kmem_cache for binder_thread

On Thu, Aug 29, 2019 at 01:49:53PM +0800, Peikan Tsai wrote:
> Hi,
>
> The allocated size for each binder_thread is 512 bytes by kzalloc.
> Because the size of binder_thread is fixed and it's only 304 bytes.
> It will save 208 bytes per binder_thread when use create a kmem_cache
> for the binder_thread.

Awesome change and observation!!!

Reviewed-by: Joel Fernandes (Google) <[email protected]>

(Another thought: how did you discover this? Are you using some tools to look
into slab fragmentation?).

thanks,

- Joel

> Signed-off-by: Peikan Tsai <[email protected]>
> ---
> drivers/android/binder.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> index dc1c83eafc22..043e0ebd0fe7 100644
> --- a/drivers/android/binder.c
> +++ b/drivers/android/binder.c
> @@ -87,6 +87,8 @@ static struct dentry *binder_debugfs_dir_entry_root;
> static struct dentry *binder_debugfs_dir_entry_proc;
> static atomic_t binder_last_id;
>
> +static struct kmem_cache *binder_thread_cachep;
> +
> static int proc_show(struct seq_file *m, void *unused);
> DEFINE_SHOW_ATTRIBUTE(proc);
>
> @@ -4696,14 +4698,15 @@ static struct binder_thread *binder_get_thread(struct binder_proc *proc)
> thread = binder_get_thread_ilocked(proc, NULL);
> binder_inner_proc_unlock(proc);
> if (!thread) {
> - new_thread = kzalloc(sizeof(*thread), GFP_KERNEL);
> + new_thread = kmem_cache_zalloc(binder_thread_cachep,
> + GFP_KERNEL);
> if (new_thread == NULL)
> return NULL;
> binder_inner_proc_lock(proc);
> thread = binder_get_thread_ilocked(proc, new_thread);
> binder_inner_proc_unlock(proc);
> if (thread != new_thread)
> - kfree(new_thread);
> + kmem_cache_free(binder_thread_cachep, new_thread);
> }
> return thread;
> }
> @@ -4723,7 +4726,7 @@ static void binder_free_thread(struct binder_thread *thread)
> BUG_ON(!list_empty(&thread->todo));
> binder_stats_deleted(BINDER_STAT_THREAD);
> binder_proc_dec_tmpref(thread->proc);
> - kfree(thread);
> + kmem_cache_free(binder_thread_cachep, thread);
> }
>
> static int binder_thread_release(struct binder_proc *proc,
> @@ -6095,6 +6098,12 @@ static int __init binder_init(void)
> if (ret)
> return ret;
>
> + binder_thread_cachep = kmem_cache_create("binder_thread",
> + sizeof(struct binder_thread),
> + 0, 0, NULL);
> + if (!binder_thread_cachep)
> + return -ENOMEM;
> +
> atomic_set(&binder_transaction_log.cur, ~0U);
> atomic_set(&binder_transaction_log_failed.cur, ~0U);
>
> @@ -6167,6 +6176,7 @@ static int __init binder_init(void)
>
> err_alloc_device_names_failed:
> debugfs_remove_recursive(binder_debugfs_dir_entry_root);
> + kmem_cache_destroy(binder_thread_cachep);
>
> return ret;
> }
> --
> 2.17.1
>

2019-08-29 13:55:27

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH] binder: Use kmem_cache for binder_thread

On Thu, Aug 29, 2019 at 08:42:29AM +0200, Greg KH wrote:
> On Thu, Aug 29, 2019 at 01:49:53PM +0800, Peikan Tsai wrote:
[snip]
> > The allocated size for each binder_thread is 512 bytes by kzalloc.
> > Because the size of binder_thread is fixed and it's only 304 bytes.
> > It will save 208 bytes per binder_thread when use create a kmem_cache
> > for the binder_thread.
>
> Are you _sure_ it really will save that much memory? You want to do
> allocations based on a nice alignment for lots of good reasons,
> especially for something that needs quick accesses.

Alignment can be done for slab allocations, kmem_cache_create() takes an
align argument. I am not sure what the default alignment of objects is
though (probably no default alignment). What is an optimal alignment in your
view?

> Did you test your change on a system that relies on binder and find any
> speed improvement or decrease, and any actual memory savings?
>
> If so, can you post your results?

That's certainly worth it and I thought of asking for the same, but spoke too
soon!

Independent note: In general I find the internal fragmentation with large
kmalloc()s troubling in the kernel :-(. Say you have a 5000 objects of 512
allocation, each 300 bytes. 212 * 5000 is around 1MB. Which is arguably not
neglible on a small memory system, right?

thanks,

- Joel

2019-08-29 15:28:41

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] binder: Use kmem_cache for binder_thread

On Thu, Aug 29, 2019 at 09:53:59AM -0400, Joel Fernandes wrote:
> On Thu, Aug 29, 2019 at 08:42:29AM +0200, Greg KH wrote:
> > On Thu, Aug 29, 2019 at 01:49:53PM +0800, Peikan Tsai wrote:
> [snip]
> > > The allocated size for each binder_thread is 512 bytes by kzalloc.
> > > Because the size of binder_thread is fixed and it's only 304 bytes.
> > > It will save 208 bytes per binder_thread when use create a kmem_cache
> > > for the binder_thread.
> >
> > Are you _sure_ it really will save that much memory? You want to do
> > allocations based on a nice alignment for lots of good reasons,
> > especially for something that needs quick accesses.
>
> Alignment can be done for slab allocations, kmem_cache_create() takes an
> align argument. I am not sure what the default alignment of objects is
> though (probably no default alignment). What is an optimal alignment in your
> view?

Probably SLAB_HWCACHE_ALIGN would make most sense.

>
> > Did you test your change on a system that relies on binder and find any
> > speed improvement or decrease, and any actual memory savings?
> >
> > If so, can you post your results?
>
> That's certainly worth it and I thought of asking for the same, but spoke too
> soon!

Yeah, it'd be interesting to see what difference this actually makes.

Christian

2019-08-29 19:00:18

by Peikan Tsai

[permalink] [raw]
Subject: Re: [PATCH] binder: Use kmem_cache for binder_thread

On Thu, Aug 29, 2019 at 05:27:22PM +0200, Christian Brauner wrote:
> On Thu, Aug 29, 2019 at 09:53:59AM -0400, Joel Fernandes wrote:
> > On Thu, Aug 29, 2019 at 08:42:29AM +0200, Greg KH wrote:
> > > On Thu, Aug 29, 2019 at 01:49:53PM +0800, Peikan Tsai wrote:
> > [snip]
> > > > The allocated size for each binder_thread is 512 bytes by kzalloc.
> > > > Because the size of binder_thread is fixed and it's only 304 bytes.
> > > > It will save 208 bytes per binder_thread when use create a kmem_cache
> > > > for the binder_thread.
> > >
> > > Are you _sure_ it really will save that much memory? You want to do
> > > allocations based on a nice alignment for lots of good reasons,
> > > especially for something that needs quick accesses.
> >
> > Alignment can be done for slab allocations, kmem_cache_create() takes an
> > align argument. I am not sure what the default alignment of objects is
> > though (probably no default alignment). What is an optimal alignment in your
> > view?
>
> Probably SLAB_HWCACHE_ALIGN would make most sense.
>

Agree. Thanks for yours comments and suggestions.
I'll put SLAB_HWCACHE_ALIGN it in patch v2.

> >
> > > Did you test your change on a system that relies on binder and find any
> > > speed improvement or decrease, and any actual memory savings?
> > >
> > > If so, can you post your results?
> >
> > That's certainly worth it and I thought of asking for the same, but spoke too
> > soon!
>
> Yeah, it'd be interesting to see what difference this actually makes.
>
> Christian

I tested this change on an Android device(arm) with AOSP kernel 4.19 and
observed
memory usage of binder_thread. But I didn't do binder benchmark yet.

On my platform the memory usage of binder_thread reduce about 90 KB as
the
following result.
nr obj obj size total
before: 624 512 319488 bytes
after: 728 312 227136 bytes

2019-08-29 19:33:11

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH] binder: Use kmem_cache for binder_thread



On August 29, 2019 2:59:01 PM EDT, Peikan Tsai <[email protected]> wrote:
>On Thu, Aug 29, 2019 at 05:27:22PM +0200, Christian Brauner wrote:
>> On Thu, Aug 29, 2019 at 09:53:59AM -0400, Joel Fernandes wrote:
>> > On Thu, Aug 29, 2019 at 08:42:29AM +0200, Greg KH wrote:
>> > > On Thu, Aug 29, 2019 at 01:49:53PM +0800, Peikan Tsai wrote:
>> > [snip]
>> > > > The allocated size for each binder_thread is 512 bytes by
>kzalloc.
>> > > > Because the size of binder_thread is fixed and it's only 304
>bytes.
>> > > > It will save 208 bytes per binder_thread when use create a
>kmem_cache
>> > > > for the binder_thread.
>> > >
>> > > Are you _sure_ it really will save that much memory? You want to
>do
>> > > allocations based on a nice alignment for lots of good reasons,
>> > > especially for something that needs quick accesses.
>> >
>> > Alignment can be done for slab allocations, kmem_cache_create()
>takes an
>> > align argument. I am not sure what the default alignment of objects
>is
>> > though (probably no default alignment). What is an optimal
>alignment in your
>> > view?
>>
>> Probably SLAB_HWCACHE_ALIGN would make most sense.
>>
>
>Agree. Thanks for yours comments and suggestions.
>I'll put SLAB_HWCACHE_ALIGN it in patch v2.
>
>> >
>> > > Did you test your change on a system that relies on binder and
>find any
>> > > speed improvement or decrease, and any actual memory savings?
>> > >
>> > > If so, can you post your results?
>> >
>> > That's certainly worth it and I thought of asking for the same, but
>spoke too
>> > soon!
>>
>> Yeah, it'd be interesting to see what difference this actually makes.
>
>>
>> Christian
>
>I tested this change on an Android device(arm) with AOSP kernel 4.19
>and
>observed
>memory usage of binder_thread. But I didn't do binder benchmark yet.
>
>On my platform the memory usage of binder_thread reduce about 90 KB as
>the
>following result.
> nr obj obj size total
> before: 624 512 319488 bytes
> after: 728 312 227136 bytes

And add this to the changelog as well. Curious- why is nrobj higher with the patch?

Please don't use my reviewed-by tag yet and I will review the new patch and provide tag separately.

Thank you.

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2019-08-30 06:40:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] binder: Use kmem_cache for binder_thread

On Thu, Aug 29, 2019 at 05:27:22PM +0200, Christian Brauner wrote:
> On Thu, Aug 29, 2019 at 09:53:59AM -0400, Joel Fernandes wrote:
> > On Thu, Aug 29, 2019 at 08:42:29AM +0200, Greg KH wrote:
> > > On Thu, Aug 29, 2019 at 01:49:53PM +0800, Peikan Tsai wrote:
> > [snip]
> > > > The allocated size for each binder_thread is 512 bytes by kzalloc.
> > > > Because the size of binder_thread is fixed and it's only 304 bytes.
> > > > It will save 208 bytes per binder_thread when use create a kmem_cache
> > > > for the binder_thread.
> > >
> > > Are you _sure_ it really will save that much memory? You want to do
> > > allocations based on a nice alignment for lots of good reasons,
> > > especially for something that needs quick accesses.
> >
> > Alignment can be done for slab allocations, kmem_cache_create() takes an
> > align argument. I am not sure what the default alignment of objects is
> > though (probably no default alignment). What is an optimal alignment in your
> > view?
>
> Probably SLAB_HWCACHE_ALIGN would make most sense.

This isn't memory accessing hardware, so I don't think it would, right?

Anyway, some actual performance tests need to be run to see if any of
this make any difference at all please...

thanks,

greg k-h


>
> >
> > > Did you test your change on a system that relies on binder and find any
> > > speed improvement or decrease, and any actual memory savings?
> > >
> > > If so, can you post your results?
> >
> > That's certainly worth it and I thought of asking for the same, but spoke too
> > soon!
>
> Yeah, it'd be interesting to see what difference this actually makes.
>
> Christian

2019-08-30 06:41:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] binder: Use kmem_cache for binder_thread

On Fri, Aug 30, 2019 at 02:59:01AM +0800, Peikan Tsai wrote:
> On Thu, Aug 29, 2019 at 05:27:22PM +0200, Christian Brauner wrote:
> > On Thu, Aug 29, 2019 at 09:53:59AM -0400, Joel Fernandes wrote:
> > > On Thu, Aug 29, 2019 at 08:42:29AM +0200, Greg KH wrote:
> > > > On Thu, Aug 29, 2019 at 01:49:53PM +0800, Peikan Tsai wrote:
> > > [snip]
> > > > > The allocated size for each binder_thread is 512 bytes by kzalloc.
> > > > > Because the size of binder_thread is fixed and it's only 304 bytes.
> > > > > It will save 208 bytes per binder_thread when use create a kmem_cache
> > > > > for the binder_thread.
> > > >
> > > > Are you _sure_ it really will save that much memory? You want to do
> > > > allocations based on a nice alignment for lots of good reasons,
> > > > especially for something that needs quick accesses.
> > >
> > > Alignment can be done for slab allocations, kmem_cache_create() takes an
> > > align argument. I am not sure what the default alignment of objects is
> > > though (probably no default alignment). What is an optimal alignment in your
> > > view?
> >
> > Probably SLAB_HWCACHE_ALIGN would make most sense.
> >
>
> Agree. Thanks for yours comments and suggestions.
> I'll put SLAB_HWCACHE_ALIGN it in patch v2.
>
> > >
> > > > Did you test your change on a system that relies on binder and find any
> > > > speed improvement or decrease, and any actual memory savings?
> > > >
> > > > If so, can you post your results?
> > >
> > > That's certainly worth it and I thought of asking for the same, but spoke too
> > > soon!
> >
> > Yeah, it'd be interesting to see what difference this actually makes.
> >
> > Christian
>
> I tested this change on an Android device(arm) with AOSP kernel 4.19 and
> observed
> memory usage of binder_thread. But I didn't do binder benchmark yet.
>
> On my platform the memory usage of binder_thread reduce about 90 KB as
> the
> following result.
> nr obj obj size total
> before: 624 512 319488 bytes
> after: 728 312 227136 bytes

You have more objects???

2019-08-30 12:15:58

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] binder: Use kmem_cache for binder_thread

On Fri, Aug 30, 2019 at 08:38:51AM +0200, Greg KH wrote:
> On Thu, Aug 29, 2019 at 05:27:22PM +0200, Christian Brauner wrote:
> > On Thu, Aug 29, 2019 at 09:53:59AM -0400, Joel Fernandes wrote:
> > > On Thu, Aug 29, 2019 at 08:42:29AM +0200, Greg KH wrote:
> > > > On Thu, Aug 29, 2019 at 01:49:53PM +0800, Peikan Tsai wrote:
> > > [snip]
> > > > > The allocated size for each binder_thread is 512 bytes by kzalloc.
> > > > > Because the size of binder_thread is fixed and it's only 304 bytes.
> > > > > It will save 208 bytes per binder_thread when use create a kmem_cache
> > > > > for the binder_thread.
> > > >
> > > > Are you _sure_ it really will save that much memory? You want to do
> > > > allocations based on a nice alignment for lots of good reasons,
> > > > especially for something that needs quick accesses.
> > >
> > > Alignment can be done for slab allocations, kmem_cache_create() takes an
> > > align argument. I am not sure what the default alignment of objects is
> > > though (probably no default alignment). What is an optimal alignment in your
> > > view?
> >
> > Probably SLAB_HWCACHE_ALIGN would make most sense.
>
> This isn't memory accessing hardware, so I don't think it would, right?

I was more thinking of cacheline bouncing under contention. But maybe
that's not worth it in this case...

2019-09-02 14:14:48

by Peikan Tsai

[permalink] [raw]
Subject: Re: [PATCH] binder: Use kmem_cache for binder_thread

On Fri, Aug 30, 2019 at 08:39:43AM +0200, Greg KH wrote:
> On Fri, Aug 30, 2019 at 02:59:01AM +0800, Peikan Tsai wrote:
> > On Thu, Aug 29, 2019 at 05:27:22PM +0200, Christian Brauner wrote:
> > > On Thu, Aug 29, 2019 at 09:53:59AM -0400, Joel Fernandes wrote:
> > > > On Thu, Aug 29, 2019 at 08:42:29AM +0200, Greg KH wrote:
> > > > > On Thu, Aug 29, 2019 at 01:49:53PM +0800, Peikan Tsai wrote:
> > > > [snip]
> > > > > > The allocated size for each binder_thread is 512 bytes by kzalloc.
> > > > > > Because the size of binder_thread is fixed and it's only 304 bytes.
> > > > > > It will save 208 bytes per binder_thread when use create a kmem_cache
> > > > > > for the binder_thread.
> > > > >
> > > > > Are you _sure_ it really will save that much memory? You want to do
> > > > > allocations based on a nice alignment for lots of good reasons,
> > > > > especially for something that needs quick accesses.
> > > >
> > > > Alignment can be done for slab allocations, kmem_cache_create() takes an
> > > > align argument. I am not sure what the default alignment of objects is
> > > > though (probably no default alignment). What is an optimal alignment in your
> > > > view?
> > >
> > > Probably SLAB_HWCACHE_ALIGN would make most sense.
> > >
> >
> > Agree. Thanks for yours comments and suggestions.
> > I'll put SLAB_HWCACHE_ALIGN it in patch v2.
> >
> > > >
> > > > > Did you test your change on a system that relies on binder and find any
> > > > > speed improvement or decrease, and any actual memory savings?
> > > > >
> > > > > If so, can you post your results?
> > > >
> > > > That's certainly worth it and I thought of asking for the same, but spoke too
> > > > soon!
> > >
> > > Yeah, it'd be interesting to see what difference this actually makes.
> > >
> > > Christian
> >
> > I tested this change on an Android device(arm) with AOSP kernel 4.19 and
> > observed
> > memory usage of binder_thread. But I didn't do binder benchmark yet.
> >
> > On my platform the memory usage of binder_thread reduce about 90 KB as
> > the
> > following result.
> > nr obj obj size total
> > before: 624 512 319488 bytes
> > after: 728 312 227136 bytes
>
> You have more objects???
>

Sorry, it's total objects which include some inactive objects ...
And because I tested it on an Android platform so there may be some noise.

So I try 'adb stop' and 'echo 3 > /proc/sys/vm/drop_caches' before starting
test to reduce the noise, and the result are as following.

objs
kzalloc 220 (kmalloc-512 alloc by binder_get_thread)

active_objs total objs objperslab slabdata
kmem_cache 194 403 13 31

Seems there are more objects when use kmemcache for binder_thread...
But as I understand it, those inactive objects can be free by kmemcahe shrink?

Also, I tested the throughput by using performace test of Android VTS.

size(bytes) kzalloc(byte/ns) kmemcache(byte/ns)
4 0.17 0.17
8 0.33 0.32
16 0.66 0.66
32 1.36 1.42
64 2.66 2.61
128 5.4 5.26
256 10.29 10.77
512 21.51 21.36
1k 41 40.26
2k 82.12 80.28
4k 149.24 146.95
8k 262.34 256
16k 417.96 422.2
32k 596.66 590.23
64k 600.84 601.25