The major memory usage in workqueue is in the pool_workqueue.
The pool_workqueue has an alignment requirement, which often leads
to padding.

Reducing the memory usage of the pool_workqueue is valuable.

32-bit systems often have less memory, fewer workqueues,
fewer work items, and fewer concurrent flush_workqueue()s, so we can
cut the number of flush colors on 32-bit systems to reduce memory usage.

Before patch:
sizeof(struct pool_workqueue) is 256 bytes, of which
only 136 bytes are in use on a 32-bit system.

After patch:
sizeof(struct pool_workqueue) is 128 bytes, of which
only 104 bytes are in use on a 32-bit system, so there is still
room for future use.

Setting WORK_STRUCT_COLOR_BITS to 3 can't reduce
sizeof(struct pool_workqueue) on 64-bit systems unless combined
with a big refactor of the unbound pwq.

Signed-off-by: Lai Jiangshan <[email protected]>
---
include/linux/workqueue.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 26de0cae2a0a..c0f311926d01 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -39,7 +39,11 @@ enum {
WORK_STRUCT_COLOR_SHIFT = 4, /* color for workqueue flushing */
#endif
+#if BITS_PER_LONG == 32
+ WORK_STRUCT_COLOR_BITS = 3,
+#else
WORK_STRUCT_COLOR_BITS = 4,
+#endif
WORK_STRUCT_PENDING = 1 << WORK_STRUCT_PENDING_BIT,
WORK_STRUCT_DELAYED = 1 << WORK_STRUCT_DELAYED_BIT,
@@ -65,6 +69,8 @@ enum {
* Reserve 8 bits off of pwq pointer w/ debugobjects turned off.
* This makes pwqs aligned to 256 bytes and allows 15 workqueue
* flush colors.
+ * On 32-bit systems, the numbers are 7 bits, 128 bytes and 7 colors,
+ * respectively.
*/
WORK_STRUCT_FLAG_BITS = WORK_STRUCT_COLOR_SHIFT +
WORK_STRUCT_COLOR_BITS,
--
2.20.1
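
The arithmetic behind the numbers in the changelog and in the patched comment can be sketched as follows. This is only an illustration, not kernel code: the helper names are hypothetical, COLOR_SHIFT = 4 is taken from the hunk context, and the step from 136 to 104 used bytes assumes the per-color in-flight counters are a 4-byte int array with one entry per flush color.

```c
#include <assert.h>

/* Shift of the color field; 4 is the value visible in the hunk context
 * (the branch without debugobjects). */
enum { COLOR_SHIFT = 4 };

/* Low bits of the pwq pointer reserved for flags and color. */
static unsigned int flag_bits(unsigned int color_bits)
{
	return COLOR_SHIFT + color_bits;
}

/* A pwq must be aligned so that those low pointer bits are zero. */
static unsigned int pwq_align(unsigned int color_bits)
{
	return 1u << flag_bits(color_bits);
}

/* One color value is set aside to mean "no color". */
static unsigned int nr_colors(unsigned int color_bits)
{
	return (1u << color_bits) - 1;
}

/* sizeof() is the used bytes rounded up to the power-of-two alignment. */
static unsigned int padded_size(unsigned int used, unsigned int align)
{
	return (used + align - 1) & ~(align - 1);
}

/* Re-derive the figures quoted in the changelog for a 32-bit build. */
static void check_patch_numbers(void)
{
	/* Before: 4 color bits -> 8 flag bits, 256-byte alignment, 15 colors. */
	assert(flag_bits(4) == 8);
	assert(pwq_align(4) == 256);
	assert(nr_colors(4) == 15);
	assert(padded_size(136, pwq_align(4)) == 256);

	/* After: 3 color bits -> 7 flag bits, 128-byte alignment, 7 colors. */
	assert(flag_bits(3) == 7);
	assert(pwq_align(3) == 128);
	assert(nr_colors(3) == 7);

	/* Assumption: dropping 8 colors removes 8 four-byte counters. */
	assert(136 - (nr_colors(4) - nr_colors(3)) * 4 == 104);
	assert(padded_size(104, pwq_align(3)) == 128);
}
```

Calling check_patch_numbers() from a small main() passes silently if the figures are consistent. It also shows why 64-bit is not helped: the struct only shrinks if its used bytes fit under the new 128-byte boundary.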
On Mon, Jun 01, 2020 at 08:44:42AM +0000, Lai Jiangshan wrote:
> The major memory usage in workqueue is in the pool_workqueue.
> The pool_workqueue has an alignment requirement, which often leads
> to padding.
>
> Reducing the memory usage of the pool_workqueue is valuable.
>
> 32-bit systems often have less memory, fewer workqueues,
> fewer work items, and fewer concurrent flush_workqueue()s, so we can
> cut the number of flush colors on 32-bit systems to reduce memory usage.
>
> Before patch:
> sizeof(struct pool_workqueue) is 256 bytes, of which
> only 136 bytes are in use on a 32-bit system.
>
> After patch:
> sizeof(struct pool_workqueue) is 128 bytes, of which
> only 104 bytes are in use on a 32-bit system, so there is still
> room for future use.
>
> Setting WORK_STRUCT_COLOR_BITS to 3 can't reduce
> sizeof(struct pool_workqueue) on 64-bit systems unless combined
> with a big refactor of the unbound pwq.
Have you calculated how much memory is actually saved this way on a typical
system?
Thanks.
--
tejun
On Mon, Jun 1, 2020 at 11:07 PM Tejun Heo <[email protected]> wrote:
>
> On Mon, Jun 01, 2020 at 08:44:42AM +0000, Lai Jiangshan wrote:
> > The major memory usage in workqueue is in the pool_workqueue.
> > The pool_workqueue has an alignment requirement, which often leads
> > to padding.
> >
> > Reducing the memory usage of the pool_workqueue is valuable.
> >
> > 32-bit systems often have less memory, fewer workqueues,
> > fewer work items, and fewer concurrent flush_workqueue()s, so we can
> > cut the number of flush colors on 32-bit systems to reduce memory usage.
> >
> > Before patch:
> > sizeof(struct pool_workqueue) is 256 bytes, of which
> > only 136 bytes are in use on a 32-bit system.
> >
> > After patch:
> > sizeof(struct pool_workqueue) is 128 bytes, of which
> > only 104 bytes are in use on a 32-bit system, so there is still
> > room for future use.
> >
> > Setting WORK_STRUCT_COLOR_BITS to 3 can't reduce
> > sizeof(struct pool_workqueue) on 64-bit systems unless combined
> > with a big refactor of the unbound pwq.
>
> Have you calculated how much memory is actually saved this way on a typical
> system?
It is not noticeable from the "free" command.
By counting the number of allocated pwqs (mainly percpu pwqs),
it saves 20k in my simple KVM guest (4 CPUs).
I guess it varies widely across boxes, depending on which
kernel modules are loaded.
>
> Thanks.
>
> --
> tejun
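
As a rough cross-check of the 20k figure above, the saving can be turned back into a pwq count. This is a back-of-the-envelope sketch only: it assumes "20k" means 20 KiB, that every pwq shrinks from 256 to 128 bytes, and that all of the saving comes from per-CPU pwqs. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes saved per pool_workqueue on 32-bit: 256 before, 128 after. */
static size_t saved_per_pwq(void)
{
	return 256 - 128;
}

/* Number of pwqs needed to account for a given total saving. */
static size_t pwqs_for_saving(size_t saved_bytes)
{
	return saved_bytes / saved_per_pwq();
}

/* Per-CPU workqueues implied if each has one pwq per CPU. */
static size_t percpu_wqs_for_saving(size_t saved_bytes, size_t nr_cpus)
{
	return pwqs_for_saving(saved_bytes) / nr_cpus;
}
```

Under those assumptions, 20 KiB corresponds to 160 pwqs, or about 40 per-CPU workqueues on the 4-CPU guest. The count on a real machine depends on which modules allocate workqueues.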
Hello,
On Tue, Jun 02, 2020 at 08:08:10AM +0800, Lai Jiangshan wrote:
> It is not noticeable from the "free" command.
> By counting the number of allocated pwqs (mainly percpu pwqs),
> it saves 20k in my simple KVM guest (4 CPUs).
> I guess it varies widely across boxes, depending on which
> kernel modules are loaded.
Hmm... I find it difficult to judge in any direction. 32 bit machines are
smaller and have less of everything - including CPUs and workqueues
themselves, so while changing configuration for 32bit systems would reduce
memory usage the impact isn't gonna be that big either and I have no data
supporting whether the reduction is gonna help or hurt.
Thanks.
--
tejun