2020-06-01 08:47:00

by Lai Jiangshan

[permalink] [raw]
Subject: [PATCH 4/4] workqueue: slash half memory usage in 32bit system

The major memory ussage in workqueue is on the pool_workqueue.
The pool_workqueue has alignment requirement which often leads
to padding.

Reducing the memory usage for the pool_workqueue is valuable.

And 32bit system often has less memory, less workqueues,
less works, less concurrent flush_workqueue()s, so we can
slash the flush color on 32bit system to reduce memory usage

Before patch:
The sizeof the struct pool_workqueue is 256 bytes,
only 136 bytes is in use in 32bit system

After patch:
The sizeof the struct pool_workqueue is 128 bytes,
only 104 bytes is in use in 32bit system, there is still
room for future usage.

Setting WORK_STRUCT_COLOR_BITS to 3 can't reduce the sizeof
the struct pool_workqueue in 64bit system, unless combined
with big refactor for unbound pwq.

Signed-off-by: Lai Jiangshan <[email protected]>
---
include/linux/workqueue.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 26de0cae2a0a..c0f311926d01 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -39,7 +39,11 @@ enum {
WORK_STRUCT_COLOR_SHIFT = 4, /* color for workqueue flushing */
#endif

+#if BITS_PER_LONG == 32
+ WORK_STRUCT_COLOR_BITS = 3,
+#else
WORK_STRUCT_COLOR_BITS = 4,
+#endif

WORK_STRUCT_PENDING = 1 << WORK_STRUCT_PENDING_BIT,
WORK_STRUCT_DELAYED = 1 << WORK_STRUCT_DELAYED_BIT,
@@ -65,6 +69,8 @@ enum {
* Reserve 8 bits off of pwq pointer w/ debugobjects turned off.
* This makes pwqs aligned to 256 bytes and allows 15 workqueue
* flush colors.
+ * For 32 bit system, the numbers are 7 bits, 128 bytes, 7 colors
+ * respectively.
*/
WORK_STRUCT_FLAG_BITS = WORK_STRUCT_COLOR_SHIFT +
WORK_STRUCT_COLOR_BITS,
--
2.20.1


2020-06-01 15:10:54

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 4/4] workqueue: slash half memory usage in 32bit system

On Mon, Jun 01, 2020 at 08:44:42AM +0000, Lai Jiangshan wrote:
> The major memory ussage in workqueue is on the pool_workqueue.
> The pool_workqueue has alignment requirement which often leads
> to padding.
>
> Reducing the memory usage for the pool_workqueue is valuable.
>
> And 32bit system often has less memory, less workqueues,
> less works, less concurrent flush_workqueue()s, so we can
> slash the flush color on 32bit system to reduce memory usage
>
> Before patch:
> The sizeof the struct pool_workqueue is 256 bytes,
> only 136 bytes is in use in 32bit system
>
> After patch:
> The sizeof the struct pool_workqueue is 128 bytes,
> only 104 bytes is in use in 32bit system, there is still
> room for future usage.
>
> Setting WORK_STRUCT_COLOR_BITS to 3 can't reduce the sizeof
> the struct pool_workqueue in 64bit system, unless combined
> with big refactor for unbound pwq.

Have you calculated how much memory is actually saved this way on a typical
system?

Thanks.

--
tejun

2020-06-02 00:10:37

by Lai Jiangshan

[permalink] [raw]
Subject: Re: [PATCH 4/4] workqueue: slash half memory usage in 32bit system

On Mon, Jun 1, 2020 at 11:07 PM Tejun Heo <[email protected]> wrote:
>
> On Mon, Jun 01, 2020 at 08:44:42AM +0000, Lai Jiangshan wrote:
> > The major memory ussage in workqueue is on the pool_workqueue.
> > The pool_workqueue has alignment requirement which often leads
> > to padding.
> >
> > Reducing the memory usage for the pool_workqueue is valuable.
> >
> > And 32bit system often has less memory, less workqueues,
> > less works, less concurrent flush_workqueue()s, so we can
> > slash the flush color on 32bit system to reduce memory usage
> >
> > Before patch:
> > The sizeof the struct pool_workqueue is 256 bytes,
> > only 136 bytes is in use in 32bit system
> >
> > After patch:
> > The sizeof the struct pool_workqueue is 128 bytes,
> > only 104 bytes is in use in 32bit system, there is still
> > room for future usage.
> >
> > Setting WORK_STRUCT_COLOR_BITS to 3 can't reduce the sizeof
> > the struct pool_workqueue in 64bit system, unless combined
> > with big refactor for unbound pwq.
>
> Have you calculated how much memory is actually saved this way on a typical
> system?

It is not noticable from the "free" command.
By counting the number of allocated pwq (mainly percpu pwq),
it saves 20k in my simple kvm guest (4cpu).
I guess it highly various in different boxes with various
kernel modules loaded.

>
> Thanks.
>
> --
> tejun

2020-06-02 16:23:50

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 4/4] workqueue: slash half memory usage in 32bit system

Hello,

On Tue, Jun 02, 2020 at 08:08:10AM +0800, Lai Jiangshan wrote:
> It is not noticable from the "free" command.
> By counting the number of allocated pwq (mainly percpu pwq),
> it saves 20k in my simple kvm guest (4cpu).
> I guess it highly various in different boxes with various
> kernel modules loaded.

Hmm... I find it difficult to judge in any direction. 32 bit machines are
smaller and have less of everything - including CPUs and workqueues
themselves, so while changing configuration for 32bit systems would reduce
memory usage the impact isn't gonna be that big either and I have no data
supporting whether the reduction is gonna help or hurt.

Thanks.

--
tejun