On Sun, Oct 17, 2021 at 01:36:18PM +0000, Hyeonggon Yoo wrote:
> On Sun, Oct 17, 2021 at 04:28:52AM +0000, Hyeonggon Yoo wrote:
> > I've been reading SLUB/SLOB code for a while. SLUB recently became
> > real time compatible by reducing its locking area.
> >
> > for now, SLUB is the only slab allocator for PREEMPT_RT because
> > it works better than SLAB on RT and SLOB uses non-deterministic method,
> > sequential fit.
> >
> > But memory usage of SLUB is too high for systems with low memory.
> > So in my local repository I made SLOB use the segregated free list
> > method, which is more deterministic, to provide bounded latency.
> >
> > This can be done by managing a global list of partial pages for every
> > power-of-two size (8, 16, 32, ..., PAGE_SIZE) on each NUMA node.
> > The minimum allocation size is the size of a pointer, because each free
> > object stores the pointer to the next free object, like SLUB.
> >
> > Because all objects in a page have the same size, there is no
> > need to iterate over free blocks in a page. (Iterating over pages isn't needed either.)
> >
> > Some cleanups and more tests (especially with NUMA/RT configs) are needed,
> > but I want to hear your opinion about the idea. I did not test on RT yet.
> >
> > Below are the benchmark results and memory usage (on !RT).
> > With a 13% increase in memory usage, it is nine times faster, has
> > bounded fragmentation, and importantly provides predictable execution time.
> >
>
> Hello linux-mm, I improved it: it now uses less memory than and is
> 9x~13x faster than the original SLOB. It also shows much less fragmentation
> after hackbench.
>
> Rather than managing global freelists for power-of-2 sizes,
> I made each kmem_cache manage its own freelist (one per NUMA node) and
> added support for slab merging. So it looks quite like a lightweight SLUB now.
>
> I'll send an RFC patch after some testing and code cleanup.
>
> I think it is more RT-friendly because it uses a more deterministic
> algorithm (but the lock is still shared among CPUs). Any opinions for RT?
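The segregated free list idea quoted above (one free list per power-of-two size class, with the next-free pointer stored inside each free object) can be illustrated with a small userspace sketch. This is not the patch under discussion: the names `seg_alloc`, `seg_free`, `class_index` and `refill` are hypothetical, the page size is assumed to be 4096, there is a single set of lists instead of one per NUMA node, and there is no locking. Unlike a kernel allocator, the caller has to pass the size back on free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE   4096u   /* assumed page size for this sketch */
#define MIN_SHIFT   3u      /* smallest class is 8 bytes: room for one pointer */
#define NUM_CLASSES 10u     /* classes 8, 16, 32, ..., 4096 */

/* A free object stores the pointer to the next free object in its own memory. */
struct free_obj {
    struct free_obj *next;
};

/* One LIFO free list per size class (a real allocator would keep these per node). */
static struct free_obj *freelist[NUM_CLASSES];

/* Map a request size to the index of the smallest power-of-two class that fits. */
static unsigned class_index(size_t size)
{
    unsigned idx = 0;
    size_t cap = (size_t)1 << MIN_SHIFT;

    while (cap < size) {
        cap <<= 1;
        idx++;
    }
    return idx;
}

/* Carve one fresh page into equal-sized objects and push them on the class list. */
static int refill(unsigned idx)
{
    size_t objsize = (size_t)1 << (idx + MIN_SHIFT);
    char *page = aligned_alloc(PAGE_SIZE, PAGE_SIZE);

    if (!page)
        return -1;
    for (size_t off = 0; off + objsize <= PAGE_SIZE; off += objsize) {
        struct free_obj *obj = (struct free_obj *)(page + off);

        obj->next = freelist[idx];
        freelist[idx] = obj;
    }
    return 0;
}

/* Allocation pops the list head: no walk over free blocks or pages is needed. */
static void *seg_alloc(size_t size)
{
    if (size == 0 || size > PAGE_SIZE)
        return NULL;

    unsigned idx = class_index(size);

    if (!freelist[idx] && refill(idx) != 0)
        return NULL;

    struct free_obj *obj = freelist[idx];

    freelist[idx] = obj->next;
    return obj;
}

/* Freeing pushes the object back on the list of its size class. */
static void seg_free(void *ptr, size_t size)
{
    unsigned idx = class_index(size);
    struct free_obj *obj = ptr;

    assert(idx < NUM_CLASSES);
    obj->next = freelist[idx];
    freelist[idx] = obj;
}
```

For example, `seg_alloc(24)` is served from the 32-byte class, and freeing that object followed by `seg_alloc(17)` hands the same object back, because both sizes round up to the same class. The cost is internal fragmentation: a 17-byte request occupies 32 bytes.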
Hi there. After some thinking, I have a new question:
if a lightweight SLUB is better than SLOB,
do we really need SLOB nowadays?
And one more question:
Christoph's presentation [1] says SLOB uses
300 KB of memory, but on my system it uses almost 8000 KB.
What makes the difference?
[1] https://events.static.linuxfound.org/sites/events/files/slides/slaballocators.pdf
SLUB without cpu partials:
memory usage:
after boot:
Slab: 8672 kB
after hackbench:
Slab: 9540 kB
Performance counter stats for 'hackbench -g 4 -l 10000':
48463.05 msec cpu-clock # 1.995 CPUs utilized
944154 context-switches # 19.482 K/sec
8161 cpu-migrations # 168.396 /sec
4117 page-faults # 84.951 /sec
52570808507 cycles # 1.085 GHz
65083778667 instructions # 1.24 insn per cycle
234990576 branch-misses
23628671709 cache-references # 487.561 M/sec
739599271 cache-misses # 3.130 % of all cache refs
24.287392120 seconds time elapsed
1.509198000 seconds user
46.942748000 seconds sys
> current SLOB:
> memory usage:
> after boot:
> Slab: 7908 kB
> after hackbench:
> Slab: 8544 kB
>
> Time: 189.947
> Performance counter stats for 'hackbench -g 4 -l 10000':
> 379413.20 msec cpu-clock # 1.997 CPUs utilized
> 8818226 context-switches # 23.242 K/sec
> 375186 cpu-migrations # 988.859 /sec
> 3954 page-faults # 10.421 /sec
> 269923095290 cycles # 0.711 GHz
> 212341582012 instructions # 0.79 insn per cycle
> 2361087153 branch-misses
> 58222839688 cache-references # 153.455 M/sec
> 6786521959 cache-misses # 11.656 % of all cache refs
>
> 190.002062273 seconds time elapsed
>
> 3.486150000 seconds user
> 375.599495000 seconds sys
>
> SLOB with segregated list + slab merging:
> memory usage:
> after boot:
> Slab: 7560 kB
> after hackbench:
> Slab: 7836 kB
>
> hackbench:
> Time: 20.780
> Performance counter stats for 'hackbench -g 4 -l 10000':
> 41509.79 msec cpu-clock # 1.996 CPUs utilized
> 630032 context-switches # 15.178 K/sec
> 8287 cpu-migrations # 199.640 /sec
> 4036 page-faults # 97.230 /sec
> 57477161020 cycles # 1.385 GHz
> 62775453932 instructions # 1.09 insn per cycle
> 164902523 branch-misses
> 22559952993 cache-references # 543.485 M/sec
> 832404011 cache-misses # 3.690 % of all cache refs
>
> 20.791893590 seconds time elapsed
>
> 1.423282000 seconds user
> 40.072449000 seconds sys
> -
> Thanks,
> Hyeonggon
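The figures in the message above can be compared directly. The sketch below only restates the elapsed times and Slab: values quoted there and derives two numbers from them: the speedup relative to current SLOB and the slab growth during hackbench. The struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

struct run {
    const char *name;
    double elapsed_s;   /* "seconds time elapsed" from perf stat */
    unsigned boot_kb;   /* Slab: after boot */
    unsigned after_kb;  /* Slab: after hackbench */
};

/* Values copied from the message above. */
static const struct run runs[] = {
    { "SLUB without cpu partials",                24.287392120, 8672, 9540 },
    { "current SLOB",                            190.002062273, 7908, 8544 },
    { "SLOB with segregated list + slab merging", 20.791893590, 7560, 7836 },
};

/* How many times faster `candidate` finished than `baseline`. */
static double speedup(const struct run *baseline, const struct run *candidate)
{
    assert(candidate->elapsed_s > 0.0);
    return baseline->elapsed_s / candidate->elapsed_s;
}

/* Slab memory added while hackbench ran. */
static unsigned slab_growth_kb(const struct run *r)
{
    return r->after_kb - r->boot_kb;
}

/* Print one line per run, using current SLOB (runs[1]) as the baseline. */
static void print_report(void)
{
    for (size_t i = 0; i < sizeof(runs) / sizeof(runs[0]); i++)
        printf("%-42s %5.2fx vs SLOB, slab grew %u kB\n",
               runs[i].name, speedup(&runs[1], &runs[i]),
               slab_growth_kb(&runs[i]));
}
```

With these inputs the segregated-list variant comes out a little over 9x faster than current SLOB in elapsed time, and its slab usage grows by 276 kB during the benchmark, against 636 kB for current SLOB and 868 kB for SLUB without cpu partials.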
On Sun, Oct 17, 2021 at 01:57:08PM +0000, Hyeonggon Yoo wrote:
> On Sun, Oct 17, 2021 at 01:36:18PM +0000, Hyeonggon Yoo wrote:
> > On Sun, Oct 17, 2021 at 04:28:52AM +0000, Hyeonggon Yoo wrote:
> > > I've been reading SLUB/SLOB code for a while. SLUB recently became
> > > real time compatible by reducing its locking area.
> > >
> > > for now, SLUB is the only slab allocator for PREEMPT_RT because
> > > it works better than SLAB on RT and SLOB uses non-deterministic method,
> > > sequential fit.
> > >
> > > But memory usage of SLUB is too high for systems with low memory.
> > > So In my local repository I made SLOB to use segregated free list
> > > method, which is more more deterministic, to provide bounded latency.
> > >
> > > This can be done by managing list of partial pages globally
> > > for every power of two sizes (8, 16, 32, ..., PAGE_SIZE) per NUMA nodes.
> > > minimal allocation size is size of pointers to keep pointer of next free object
> > > like SLUB.
> > >
> > > By making objects in same page to have same size, there's no
> > > need to iterate free blocks in a page. (Also iterating pages isn't needed)
> > >
> > > Some cleanups and more tests (especially with NUMA/RT configs) needed,
> > > but want to hear your opinion about the idea. Did not test on RT yet.
> > >
> > > Below is result of benchmarks and memory usage. (on !RT)
> > > with 13% increase in memory usage, it's nine times faster and
> > > bounded fragmentation, and importantly provides predictable execution time.
> > >
> >
> > Hello linux-mm, I improved it and it uses lower memory
> > and 9x~13x faster than original SLOB. it shows much less fragmentation
> > after hackbench.
> >
> > Rather than managing global freelist that has power of 2 sizes,
> > I made a kmem_cache to manage its own freelist (for each NUMA nodes) and
> > Added support for slab merging. So It quite looks like a lightweight SLUB now.
> >
> > I'll send rfc patch after some testing and code cleaning.
> >
> > I think it is more RT-friendly becuase it's uses more deterministic
> > algorithm (But lock is still shared among cpus). Any opinions for RT?
>
> Hi there. after some thinking, I got a new question:
> If a lightweight SLUB is better than SLOB,
> Do we really need SLOB nowdays?
Better for what use case? SLOB is for machines with 1-16MB of RAM.
On Mon, 18 Oct 2021, Hyeonggon Yoo wrote:
> > Better for what use case? SLOB is for machines with 1-16MB of RAM.
> >
>
> 1~16M is smaller than I thought. Hmm... I'm going to see how it works on
> tiny configuration. Thank you Matthew!
Is there any reference where we can see such a configuration? Are you sure it does
not work with SLUB too?
On Sun, 17 Oct 2021, Hyeonggon Yoo wrote:
> And one more question:
> in Christoph's presentation [1], it says SLOB uses
> 300 KB of memory. but on my system it uses almost 8000 KB.
> what's is differences?
Hmmm.... Someone already made "improvements" to SLOB? Kernel needs to be
compiled for minimal overhead and debugging removed.
On Mon, Oct 25, 2021 at 10:17:08AM +0200, Christoph Lameter wrote:
> On Mon, 18 Oct 2021, Hyeonggon Yoo wrote:
>
> > > Better for what use case? SLOB is for machines with 1-16MB of RAM.
> > >
> >
> > 1~16M is smaller than I thought. Hmm... I'm going to see how it works on
> > tiny configuration. Thank you Matthew!
>
> Is there any reference where we can see such a configuration? Sure it does
> not work with SLUB too?
I think Matthew said "SLOB is for machines with 1-16MB of RAM"
because a machine with that little memory is sensitive to the allocator's memory usage.
(But I still doubt that we can run Linux on machines like that.)
On Thu, Oct 28, 2021 at 10:04:14AM +0000, Hyeonggon Yoo wrote:
> On Mon, Oct 25, 2021 at 10:17:08AM +0200, Christoph Lameter wrote:
> > On Mon, 18 Oct 2021, Hyeonggon Yoo wrote:
> >
> > > > Better for what use case? SLOB is for machines with 1-16MB of RAM.
> > > >
> > >
> > > 1~16M is smaller than I thought. Hmm... I'm going to see how it works on
> > > tiny configuration. Thank you Matthew!
> >
> > Is there any reference where we can see such a configuration? Sure it does
> > not work with SLUB too?
>
> I thought why Matthew said "SLOB is for machines with 1-16MB of RAM"
> is because if memory is so low, then it is sensitive to memory usage.
>
> (But I still have doubt if we can run linux on machines like that.)
I sent you a series of articles about making Linux run in 1MB.
On Thu, Oct 28, 2021 at 01:08:02PM +0100, Matthew Wilcox wrote:
> On Thu, Oct 28, 2021 at 10:04:14AM +0000, Hyeonggon Yoo wrote:
> > On Mon, Oct 25, 2021 at 10:17:08AM +0200, Christoph Lameter wrote:
> > > On Mon, 18 Oct 2021, Hyeonggon Yoo wrote:
> > >
> > > > > Better for what use case? SLOB is for machines with 1-16MB of RAM.
> > > > >
> > > >
> > > > 1~16M is smaller than I thought. Hmm... I'm going to see how it works on
> > > > tiny configuration. Thank you Matthew!
> > >
> > > Is there any reference where we can see such a configuration? Sure it does
> > > not work with SLUB too?
> >
> > I thought why Matthew said "SLOB is for machines with 1-16MB of RAM"
> > is because if memory is so low, then it is sensitive to memory usage.
> >
> > (But I still have doubt if we can run linux on machines like that.)
>
> I sent you a series of articles about making Linux run in 1MB.
Oh I missed your mail, I'm gonna read this!
Thanks!
Thanks,
Hyeonggon.
On Thu, Oct 28, 2021 at 01:08:02PM +0100, Matthew Wilcox wrote:
> On Thu, Oct 28, 2021 at 10:04:14AM +0000, Hyeonggon Yoo wrote:
> > On Mon, Oct 25, 2021 at 10:17:08AM +0200, Christoph Lameter wrote:
> > > On Mon, 18 Oct 2021, Hyeonggon Yoo wrote:
> > >
> > > > > Better for what use case? SLOB is for machines with 1-16MB of RAM.
> > > > >
> > > >
> > > > 1~16M is smaller than I thought. Hmm... I'm going to see how it works on
> > > > tiny configuration. Thank you Matthew!
> > >
> > > Is there any reference where we can see such a configuration? Sure it does
> > > not work with SLUB too?
> >
> > I thought why Matthew said "SLOB is for machines with 1-16MB of RAM"
> > is because if memory is so low, then it is sensitive to memory usage.
> >
> > (But I still have doubt if we can run linux on machines like that.)
>
> I sent you a series of articles about making Linux run in 1MB.
After some time playing with the size of the kernel,
I was able to run Linux in 6.6MiB of RAM, and SLOB used
around 300KiB of memory.
Running Linux in 1MiB seems almost impossible without introducing
XIP (eXecute In Place), which executes the binary directly from ROM or flash.
(That doesn't actually reduce the kernel size; it reduces the RAM required to boot.)
SLOB seems to be useful when the machine has really tiny memory,
because the slab allocator can consume a large share of memory when memory is that
small. But if the machine has some megabytes of RAM,
I think SLUB is the right allocator to choose.
Thank you for sending that link.
It was a very nice article.
On Fri, 10 Dec 2021, Hyeonggon Yoo wrote:
> > > (But I still have doubt if we can run linux on machines like that.)
> >
> > I sent you a series of articles about making Linux run in 1MB.
>
> After some time playing with the size of kernel,
> I was able to run linux in 6.6MiB of RAM. and the SLOB used
> around 300KiB of memory.
What is the minimal size you need for SLUB?
On 12/10/21 13:06, Christoph Lameter wrote:
> On Fri, 10 Dec 2021, Hyeonggon Yoo wrote:
>
>> > > (But I still have doubt if we can run linux on machines like that.)
>> >
>> > I sent you a series of articles about making Linux run in 1MB.
>>
>> After some time playing with the size of kernel,
>> I was able to run linux in 6.6MiB of RAM. and the SLOB used
>> around 300KiB of memory.
>
> What is the minimal size you need for SLUB?
Good question. Meanwhile I tried to compare Slab: in /proc/meminfo on a virtme run:
virtme-run --mods=auto --kdir /home/vbabka/wrk/linux/ --memory 2G,slots=2,maxmem=4G --qemu-opts --smp 4
Got ~30800kB with SLOB, 34500kB with SLUB without DEBUG and PERCPU_PARTIAL.
Then did a quick and dirty patch (below) to never load c->slab in
___slab_alloc() and got to 32200kB. Fiddling with
slub_min_order/slub_max_order didn't actually help, probably due to causing
more internal fragmentation.
So that's relatively close, but on a really small system the difference can
be possibly more prominent. Also my test doesn't account for text/data or
percpu usage differences.
diff --git a/mm/slub.c b/mm/slub.c
index 68aa112e469b..fd9c853971d1 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3054,6 +3054,8 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 		 */
 		goto return_single;
 
+	goto return_single;
+
 retry_load_slab:
 
 	local_lock_irqsave(&s->cpu_slab->lock, flags);
On Tue, Dec 14, 2021 at 06:24:58PM +0100, Vlastimil Babka wrote:
> On 12/10/21 13:06, Christoph Lameter wrote:
> > On Fri, 10 Dec 2021, Hyeonggon Yoo wrote:
> >
> >> > > (But I still have doubt if we can run linux on machines like that.)
> >> >
> >> > I sent you a series of articles about making Linux run in 1MB.
> >>
> >> After some time playing with the size of kernel,
> >> I was able to run linux in 6.6MiB of RAM. and the SLOB used
> >> around 300KiB of memory.
> >
> > What is the minimal size you need for SLUB?
>
I don't know why Christoph's mail is not in my mailbox. Maybe I deleted it
by mistake or I was not cc'ed.
Anyway, I measured this again with SLUB and SLOB.
SLUB uses a bit under two hundred kilobytes more slab memory than SLOB
(see the Slab: lines below).
There isn't much difference in 'Memory required to boot'
(interestingly, SLUB requires less).
'Memory required to boot' is measured by reducing memory
until the kernel says 'System is deadlocked on memory'. I don't know
the exact reason why they differ.
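The measurement described above reduces memory step by step until the boot fails. A binary search does the same job in fewer boots. That is a swapped-in technique, not what was used here, and it assumes that a kernel which boots with N kB also boots with any larger amount. In the sketch, `boots_fn` and the `fake_boots_*` predicates are hypothetical stand-ins for actually launching the guest with a given memory size and checking whether it reaches userspace.

```c
#include <assert.h>

/* Hypothetical predicate: nonzero if the system boots with mem_kb of RAM. */
typedef int (*boots_fn)(unsigned mem_kb);

/*
 * Smallest memory size in [lo_kb, hi_kb] for which boots() succeeds.
 * Assumes boots(hi_kb) succeeds and that success is monotonic in memory.
 */
static unsigned min_boot_kb(unsigned lo_kb, unsigned hi_kb, boots_fn boots)
{
    assert(lo_kb <= hi_kb);
    while (lo_kb < hi_kb) {
        unsigned mid = lo_kb + (hi_kb - lo_kb) / 2;

        if (boots(mid))
            hi_kb = mid;        /* mid is enough, try less */
        else
            lo_kb = mid + 1;    /* mid is too small, need more */
    }
    return hi_kb;
}

/* Stand-ins that mimic the thresholds reported in this mail. */
static int fake_boots_slob(unsigned mem_kb) { return mem_kb >= 6950; }
static int fake_boots_slub(unsigned mem_kb) { return mem_kb >= 6800; }
```

Searching between 1024 kB and 16384 kB with the stand-in predicates returns 6950 for SLOB and 6800 for SLUB after about 14 trial boots each.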
Note that the configuration is based on tinyconfig, to which
I added initramfs support, the tty layer (plus a UART driver), procfs support,
ELF binary support, etc.
There isn't even a block layer, but it's a good starting point to see
what happens on a small system.
SLOB:
Memory required to boot: 6950K
/proc/meminfo:
MemTotal: 4820 kB
MemFree: 1172 kB
MemAvailable: 800 kB
Buffers: 0 kB
Cached: 2528 kB
SwapCached: 0 kB
Active: 4 kB
Inactive: 100 kB
Active(anon): 4 kB
Inactive(anon): 100 kB
Active(file): 0 kB
Inactive(file): 0 kB
Unevictable: 2528 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 120 kB
Mapped: 848 kB
Shmem: 0 kB
KReclaimable: 0 kB
Slab: 368 kB
SReclaimable: 0 kB
SUnreclaim: 368 kB
KernelStack: 128 kB
PageTables: 28 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 2408 kB
Committed_AS: 524 kB
VmallocTotal: 1032192 kB
VmallocUsed: 16 kB
VmallocChunk: 0 kB
Percpu: 32 kB
SLUB:
Memory required to boot: 6800K
/proc/meminfo:
MemTotal: 4660 kB
MemFree: 836 kB
MemAvailable: 568 kB
Buffers: 0 kB
Cached: 2528 kB
SwapCached: 0 kB
Active: 4 kB
Inactive: 100 kB
Active(anon): 4 kB
Inactive(anon): 100 kB
Active(file): 0 kB
Inactive(file): 0 kB
Unevictable: 2528 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 120 kB
Mapped: 848 kB
Shmem: 0 kB
KReclaimable: 188 kB
Slab: 552 kB
SReclaimable: 188 kB
SUnreclaim: 364 kB
KernelStack: 128 kB
PageTables: 28 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 2328 kB
Committed_AS: 524 kB
VmallocTotal: 1032192 kB
VmallocUsed: 16 kB
VmallocChunk: 0 kB
Percpu: 32 kB
SLUB with slab merging:
Memory required to boot: 6800K
/proc/meminfo:
MemTotal: 4660 kB
MemFree: 840 kB
MemAvailable: 572 kB
Buffers: 0 kB
Cached: 2528 kB
SwapCached: 0 kB
Active: 4 kB
Inactive: 100 kB
Active(anon): 4 kB
Inactive(anon): 100 kB
Active(file): 0 kB
Inactive(file): 0 kB
Unevictable: 2528 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 120 kB
Mapped: 848 kB
Shmem: 0 kB
KReclaimable: 188 kB
Slab: 536 kB
SReclaimable: 188 kB
SUnreclaim: 348 kB
KernelStack: 128 kB
PageTables: 28 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 2328 kB
Committed_AS: 524 kB
VmallocTotal: 1032192 kB
VmallocUsed: 16 kB
VmallocChunk: 0 kB
Percpu: 32 kB
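To compare the three dumps above without reading them by eye, a field such as Slab: can be pulled out of /proc/meminfo-style text. The helper below is a minimal sketch with a hypothetical name (`meminfo_kb`). It works on a string so that it can be fed either the contents of /proc/meminfo or an excerpt like the samples, which are copied from the dumps above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the value (in kB) of the field `key` in /proc/meminfo-style text,
 * or -1 if no line starts with "key:". `key` is given without the colon.
 */
static long meminfo_kb(const char *text, const char *key)
{
    assert(text != NULL && key != NULL);

    size_t klen = strlen(key);

    for (const char *line = text; line && *line; ) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':')
            return strtol(line + klen + 1, NULL, 10);
        line = strchr(line, '\n');  /* advance to the next line, if any */
        if (line)
            line++;
    }
    return -1;
}

/* Excerpts of the three dumps above. */
static const char slob_info[] =
    "Slab: 368 kB\nSReclaimable: 0 kB\nSUnreclaim: 368 kB\n";
static const char slub_info[] =
    "Slab: 552 kB\nSReclaimable: 188 kB\nSUnreclaim: 364 kB\n";
static const char slub_merged_info[] =
    "Slab: 536 kB\nSReclaimable: 188 kB\nSUnreclaim: 348 kB\n";
```

With these excerpts, `meminfo_kb(slub_info, "Slab") - meminfo_kb(slob_info, "Slab")` is 184 and the merged SLUB case differs from SLOB by 168. The SUnreclaim values are close in all three dumps (368, 364, 348), and SReclaimable reads 0 kB under SLOB against 188 kB under SLUB.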
If you're interested in reproducing this,
the links below will help:
https://hyeyoo.com/148 (written in Korean, but the commands and pictures will help)
http://events17.linuxfoundation.org/sites/events/files/slides/tiny.pdf
http://events17.linuxfoundation.org/sites/events/files/slides/opdenacker-embedded-linux-size-reduction-techniques_0.pdf
https://lukaszgemborowski.github.io/articles/minimalistic-linux-system-on-qemu-arm.html
https://weeraman.com/building-a-tiny-linux-kernel-8c07579ae79d
The target board is the ARM Versatile Platform Board (based on ARMv5),
and I ran this on QEMU.
Thanks,
Hyeonggon.
> Good question. Meanwhile I tried to compare Slab: in /proc/meminfo on a virtme run:
> virtme-run --mods=auto --kdir /home/vbabka/wrk/linux/ --memory 2G,slots=2,maxmem=4G --qemu-opts --smp 4
>
> Got ~30800kB with SLOB, 34500kB with SLUB without DEBUG and PERCPU_PARTIAL.
> Then did a quick and dirty patch (below) to never load c->slab in
> ___slab_alloc() and got to 32200kB. Fiddling with
> slub_min_order/slub_max_order didn't actually help, probably due to causing
> more internal fragmentation.
>
> So that's relatively close, but on a really small system the difference can
> be possibly more prominent. Also my test doesn't account for text/data or
> percpu usage differences.
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 68aa112e469b..fd9c853971d1 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3054,6 +3054,8 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> */
> goto return_single;
>
> + goto return_single;
> +
> retry_load_slab:
>
> local_lock_irqsave(&s->cpu_slab->lock, flags);
>
>
On 12/15/21 07:29, Hyeonggon Yoo wrote:
> On Tue, Dec 14, 2021 at 06:24:58PM +0100, Vlastimil Babka wrote:
>> On 12/10/21 13:06, Christoph Lameter wrote:
>> > On Fri, 10 Dec 2021, Hyeonggon Yoo wrote:
>> >
>> >> > > (But I still have doubt if we can run linux on machines like that.)
>> >> >
>> >> > I sent you a series of articles about making Linux run in 1MB.
>> >>
>> >> After some time playing with the size of kernel,
>> >> I was able to run linux in 6.6MiB of RAM. and the SLOB used
>> >> around 300KiB of memory.
>> >
>> > What is the minimal size you need for SLUB?
>>
>
> I don't know why Christoph's mail is not in my mailbox. maybe I deleted it
> by mistake or I'm not cc-ed.
>
> Anyway, I tried to measure this again with SLUB and SLOB.
>
> SLUB uses few hundreds of bytes than SLOB.
>
> There isn't much difference in 'Memory required to boot'.
> (interestingly SLUB requires less)
>
> 'Memory required to boot' is measured by reducing memory
> until it says 'System is deadlocked on memory'. I don't know
> exact reason why they differ.
>
> Note that the configuration is based on tinyconfig and
> I added initramfs support + tty layer (+ uart driver) + procfs support,
> + ELF binary support + etc.
>
> there isn't even block layer, but it's good starting point to see
> what happens in small system.
>
> SLOB:
>
> Memory required to boot: 6950K
>
> Slab: 368 kB
>
> SLUB:
> Memory required to boot: 6800K
>
> Slab: 552 kB
>
> SLUB with slab merging:
>
> Slab: 536 kB
A 168kB difference on a system with less than 8MB of memory looks too
significant to me to simply delete SLOB, I'm afraid.
On Wed, 15 Dec 2021, Vlastimil Babka wrote:
> > SLOB:
> >
> > Memory required to boot: 6950K
> >
> > Slab: 368 kB
> >
> > SLUB:
> > Memory required to boot: 6800K
> >
> > Slab: 552 kB
> >
> > SLUB with slab merging:
> >
> > Slab: 536 kB
>
> 168kB different on a system with less than 8MB memory looks rather
> significant to me to simply delete SLOB, I'm afraid.
This looks more like a bug/difference in SLAB accounting of SLOB.
How could SLOB require more memory to boot but use less SLAB memory?
This looks to me like a significant reason enough to remove SLOB since
SLUB works with less memory than SLOB.
On Fri, Feb 18, 2022 at 10:13:29AM +0000, Hyeonggon Yoo wrote:
> On Wed, Dec 15, 2021 at 11:10:06AM +0100, Vlastimil Babka wrote:
> > On 12/15/21 07:29, Hyeonggon Yoo wrote:
> > > On Tue, Dec 14, 2021 at 06:24:58PM +0100, Vlastimil Babka wrote:
> > >> On 12/10/21 13:06, Christoph Lameter wrote:
> > >> > On Fri, 10 Dec 2021, Hyeonggon Yoo wrote:
> > >> >
> > >> >> > > (But I still have doubt if we can run linux on machines like that.)
> > >> >> >
> > >> >> > I sent you a series of articles about making Linux run in 1MB.
> > >> >>
> > >> >> After some time playing with the size of kernel,
> > >> >> I was able to run linux in 6.6MiB of RAM. and the SLOB used
> > >> >> around 300KiB of memory.
> > >> >
> > >> > What is the minimal size you need for SLUB?
> > >>
> > >
> > > I don't know why Christoph's mail is not in my mailbox. maybe I deleted it
> > > by mistake or I'm not cc-ed.
> > >
> > > Anyway, I tried to measure this again with SLUB and SLOB.
> > >
> > > SLUB uses few hundreds of bytes than SLOB.
> > >
> > > There isn't much difference in 'Memory required to boot'.
> > > (interestingly SLUB requires less)
> > >
> > > 'Memory required to boot' is measured by reducing memory
> > > until it says 'System is deadlocked on memory'. I don't know
> > > exact reason why they differ.
> > >
> > > Note that the configuration is based on tinyconfig and
> > > I added initramfs support + tty layer (+ uart driver) + procfs support,
> > > + ELF binary support + etc.
> > >
> > > there isn't even block layer, but it's good starting point to see
> > > what happens in small system.
> > >
> > > SLOB:
> > >
> > > Memory required to boot: 6950K
> > >
> > > Slab: 368 kB
> > >
> > > SLUB:
> > > Memory required to boot: 6800K
> > >
> > > Slab: 552 kB
> > >
> > > SLUB with slab merging:
> > >
> > > Slab: 536 kB
> >
> > 168kB different on a system with less than 8MB memory looks rather
> > significant to me to simply delete SLOB, I'm afraid.
>
> Just FYI...
> Some experiment based on v5.17-rc3:
>
> SLOB:
> Slab: 388 kB
>
> SLUB:
> Slab: 540 kB (+152kb)
>
> SLUB with s->min_partial = 0:
> Slab: 452 kB (+64kb)
>
> SLUB with s->min_partial = 0 && slub_max_order = 0:
> Slab: 436 kB (+48kb)
>
> SLUB with s->min_partial = 0 && slub_max_order = 0
> + merging slabs crazily (just ignore SLAB_NEVER_MERGE/SLAB_MERGE_SAME):
> Slab: 408 kB (+20kb)
>
> Decreasing further seem to be hard and
> I guess +20kb are due to partial slabs.
>
> I think SLUB can be memory-efficient as SLOB.
> Is SLOB (Address-Ordered next fit) stronger to fragmentation than SLUB?
(Address-Ordered *first* fit)
On Wed, Dec 15, 2021 at 11:10:06AM +0100, Vlastimil Babka wrote:
> On 12/15/21 07:29, Hyeonggon Yoo wrote:
> > On Tue, Dec 14, 2021 at 06:24:58PM +0100, Vlastimil Babka wrote:
> >> On 12/10/21 13:06, Christoph Lameter wrote:
> >> > On Fri, 10 Dec 2021, Hyeonggon Yoo wrote:
> >> >
> >> >> > > (But I still have doubt if we can run linux on machines like that.)
> >> >> >
> >> >> > I sent you a series of articles about making Linux run in 1MB.
> >> >>
> >> >> After some time playing with the size of kernel,
> >> >> I was able to run linux in 6.6MiB of RAM. and the SLOB used
> >> >> around 300KiB of memory.
> >> >
> >> > What is the minimal size you need for SLUB?
> >>
> >
> > I don't know why Christoph's mail is not in my mailbox. maybe I deleted it
> > by mistake or I'm not cc-ed.
> >
> > Anyway, I tried to measure this again with SLUB and SLOB.
> >
> > SLUB uses few hundreds of bytes than SLOB.
> >
> > There isn't much difference in 'Memory required to boot'.
> > (interestingly SLUB requires less)
> >
> > 'Memory required to boot' is measured by reducing memory
> > until it says 'System is deadlocked on memory'. I don't know
> > exact reason why they differ.
> >
> > Note that the configuration is based on tinyconfig and
> > I added initramfs support + tty layer (+ uart driver) + procfs support,
> > + ELF binary support + etc.
> >
> > there isn't even block layer, but it's good starting point to see
> > what happens in small system.
> >
> > SLOB:
> >
> > Memory required to boot: 6950K
> >
> > Slab: 368 kB
> >
> > SLUB:
> > Memory required to boot: 6800K
> >
> > Slab: 552 kB
> >
> > SLUB with slab merging:
> >
> > Slab: 536 kB
>
> 168kB different on a system with less than 8MB memory looks rather
> significant to me to simply delete SLOB, I'm afraid.
Just FYI...
Some experiments based on v5.17-rc3:

SLOB:
Slab: 388 kB

SLUB:
Slab: 540 kB (+152 kB)

SLUB with s->min_partial = 0:
Slab: 452 kB (+64 kB)

SLUB with s->min_partial = 0 && slub_max_order = 0:
Slab: 436 kB (+48 kB)

SLUB with s->min_partial = 0 && slub_max_order = 0
+ merging slabs crazily (just ignore SLAB_NEVER_MERGE/SLAB_MERGE_SAME):
Slab: 408 kB (+20 kB)

Decreasing it further seems to be hard, and
I guess the remaining +20 kB is due to partial slabs.

I think SLUB can be as memory-efficient as SLOB.
Is SLOB (Address-Ordered next fit) more resistant to fragmentation than SLUB?
From: Hyeonggon Yoo
> Sent: 18 February 2022 10:13
...
> I think SLUB can be memory-efficient as SLOB.
> Is SLOB (Address-Ordered next^Wfirst fit) stronger to fragmentation than SLUB?
Dunno, but I had to patch the vxworks malloc to use 'best fit'
because 'first fit' based on a fifo free list was really horrid.
I can't imagine an address ordered 'first fit' really being that much better.
There are probably a lot more allocs and frees than the kernel used to have.
Also, isn't the performance of a 'first fit' going to get horrid
when there are a lot of small items on the free list?
Does SLUB split pages into 3s and 5s (on cache line boundaries)
as well as powers of 2?
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
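The 'first fit' versus 'best fit' distinction raised above can be made concrete with a minimal sketch over an array of free block sizes. The function names are hypothetical. The array order stands in for free list order, so an address-ordered list would simply be an array sorted by address. This is not the VxWorks or SLOB code.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the first free block that is large enough, or -1 if none fits. */
static int first_fit(const size_t *free_sizes, size_t n, size_t want)
{
    assert(free_sizes != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (free_sizes[i] >= want)
            return (int)i;
    return -1;
}

/* Index of the smallest free block that is large enough, or -1 if none fits. */
static int best_fit(const size_t *free_sizes, size_t n, size_t want)
{
    int best = -1;

    assert(free_sizes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (free_sizes[i] < want)
            continue;
        if (best < 0 || free_sizes[i] < free_sizes[(size_t)best])
            best = (int)i;
    }
    return best;
}
```

For free blocks of 64, 16, 32 and 24 bytes and a 20-byte request, first fit picks the 64-byte block at index 0 and leaves 44 bytes over, while best fit picks the 24-byte block at index 3. Both walk the list linearly, which is where the concern about many small items on the free list comes from: every undersized block ahead of a usable one has to be skipped.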
On Fri, Feb 18, 2022 at 04:10:28PM +0000, David Laight wrote:
> From: Hyeonggon Yoo
> > Sent: 18 February 2022 10:13
> ...
> > I think SLUB can be memory-efficient as SLOB.
> > Is SLOB (Address-Ordered next^Wfirst fit) stronger to fragmentation than SLUB?
>
> Dunno, but I had to patch the vxworks malloc to use 'best fit'
> because 'first fit' based on a fifo free list was really horrid.
>
> I can't imagine an address ordered 'first fit' really being that much better.
>
> There are probably a lot more allocs and frees than the kernel used to have.
>
> Also isn't the performance of a 'first fit' going to get horrid
> when there are a lot of small items on the free list.
SLOB is focused on low memory usage, at the cost of poor performance.
Its speed is not a concern.
I think the Address-Ordered sequential fit method works pretty well in terms of
low memory usage.
And I think SLUB may replace SLOB, but we need to be sure SLUB is
the absolute winner. I wonder what the slab maintainers think?
>
> Does SLUB split pages into 3s and 5s (on cache lime boundaries)
> as well as powers of 2?
>
SLUB/SLAB use a different strategy than SLOB, for better allocation
performance. It's a variant of the segregated storage method.
SLUB/SLAB both create dedicated "caches" for each type of object. For
example, on my system, there are slab caches for dentry(192), filp(256),
fs_cache(64), etc.
Objects of different types are by default managed by different caches,
each of which manages its own set of pages. Slab caches can be merged for better cacheline
utilization.
SLUB/SLAB also create global kmalloc caches at boot time for power-of-2
object sizes (128, 256, 512, 1K, 2K, 4K, 8K on my system).
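The per-type caches and the merging mentioned above can be sketched as follows. This is a deliberately simplified model with hypothetical names: it treats two caches as mergeable when their object sizes match, whereas the kernel's real criteria also involve alignment, flags and constructors. The `allow_merge` parameter is an assumption of the sketch, not a kernel API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CACHES 16

struct cache {
    const char *name;     /* name given by the first creator */
    size_t object_size;   /* size of each object served by this cache */
    unsigned refcount;    /* number of users sharing this cache */
};

static struct cache caches[MAX_CACHES];
static size_t nr_caches;

/*
 * Create a cache for objects of `object_size` bytes. With merging allowed,
 * an existing cache of the same object size is reused instead (simplified
 * criterion; see the lead-in).
 */
static struct cache *cache_create(const char *name, size_t object_size,
                                  int allow_merge)
{
    assert(name != NULL);

    if (allow_merge) {
        for (size_t i = 0; i < nr_caches; i++) {
            if (caches[i].object_size == object_size) {
                caches[i].refcount++;
                return &caches[i];
            }
        }
    }
    if (nr_caches == MAX_CACHES)
        return NULL;

    struct cache *c = &caches[nr_caches++];

    c->name = name;
    c->object_size = object_size;
    c->refcount = 1;
    return c;
}
```

Creating caches for dentry (192) and filp (256) yields two caches. A third, made-up 192-byte type created with merging allowed gets the dentry cache back with its refcount raised to 2, so both types share the same partially filled pages instead of each keeping its own.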
Thanks,
Hyeonggon.
> David
>