2017-03-23 13:01:19

by Chao Peng

Subject: [PATCH] x86/boot: Support uncompressed kernel

A compressed kernel has a drawback: decompressing it takes time. For most
cases that time is short enough to ignore, but where boot time is critical
it is still significant. In our ongoing kernel boot time optimization, the
measured overall kernel boot time is ~90ms, of which decompression with
gzip takes ~50ms.

This patch adds a 'CONFIG_KERNEL_RAW' configuration choice so that the
built image needs no decompression at all. Measurements:

    kernel               kernel size    time in decompress_kernel
    compressed (gzip)    3.3M           53ms
    uncompressed         14M            3ms

Signed-off-by: Chao Peng <[email protected]>
---
 arch/x86/boot/compressed/Makefile |  3 +++
 arch/x86/boot/compressed/misc.c   | 14 ++++++++++++++
 init/Kconfig                      |  7 +++++++
 scripts/Makefile.lib              |  8 ++++++++
 4 files changed, 32 insertions(+)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index f9ce75d..fc0e1c0 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -73,6 +73,8 @@ $(obj)/vmlinux.relocs: vmlinux FORCE
 vmlinux.bin.all-y := $(obj)/vmlinux.bin
 vmlinux.bin.all-$(CONFIG_X86_NEED_RELOCS) += $(obj)/vmlinux.relocs
 
+$(obj)/vmlinux.bin.raw: $(vmlinux.bin.all-y) FORCE
+	$(call if_changed,raw)
 $(obj)/vmlinux.bin.gz: $(vmlinux.bin.all-y) FORCE
 	$(call if_changed,gzip)
 $(obj)/vmlinux.bin.bz2: $(vmlinux.bin.all-y) FORCE
@@ -86,6 +88,7 @@ $(obj)/vmlinux.bin.lzo: $(vmlinux.bin.all-y) FORCE
 $(obj)/vmlinux.bin.lz4: $(vmlinux.bin.all-y) FORCE
 	$(call if_changed,lz4)
 
+suffix-$(CONFIG_KERNEL_RAW) := raw
 suffix-$(CONFIG_KERNEL_GZIP) := gz
 suffix-$(CONFIG_KERNEL_BZIP2) := bz2
 suffix-$(CONFIG_KERNEL_LZMA) := lzma
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index 79dac17..fb3cd43 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -123,6 +123,20 @@ static char *vidmem;
 static int vidport;
 static int lines, cols;
 
+#ifdef CONFIG_KERNEL_RAW
+#include <linux/decompress/mm.h>
+static int __decompress(unsigned char *buf, long len,
+			long (*fill)(void*, unsigned long),
+			long (*flush)(void*, unsigned long),
+			unsigned char *outbuf, long olen,
+			long *pos,
+			void (*error)(char *x))
+{
+	memcpy(outbuf, buf, olen);
+	return 0;
+}
+#endif
+
 #ifdef CONFIG_KERNEL_GZIP
 #include "../../../../lib/decompress_inflate.c"
 #endif
diff --git a/init/Kconfig b/init/Kconfig
index 2232080..1db2ea2 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -137,6 +137,13 @@ choice

 	  If in doubt, select 'gzip'
 
+config KERNEL_RAW
+	bool "RAW"
+	help
+	  No compression. It creates much bigger kernel and uses much more
+	  space (disk/memory) than other choices. It can be useful when
+	  decompression speed is the most concern while space is not a problem.
+
 config KERNEL_GZIP
 	bool "Gzip"
 	depends on HAVE_KERNEL_GZIP
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 2edbcad..384128d 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -344,6 +344,14 @@ cmd_lz4 = (cat $(filter-out FORCE,$^) | \
 	lz4c -l -c1 stdin stdout && $(call size_append, $(filter-out FORCE,$^))) > $@ || \
 	(rm -f $@ ; false)
 
+# RAW
+# ---------------------------------------------------------------------------
+quiet_cmd_raw = RAW     $@
+cmd_raw = (cat $(filter-out FORCE,$^) && \
+	$(call size_append, $(filter-out FORCE,$^))) > $@ || \
+	(rm -f $@ ; false)
+
+
 # U-Boot mkimage
 # ---------------------------------------------------------------------------

--
1.8.3.1
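
As an illustration of what the RAW path amounts to, here is a minimal
user-space sketch in C; it is not the kernel code itself. It assumes that
size_append adds the payload length as a 4-byte little-endian trailer, as it
does for the existing compressors. raw_trailer_size() and raw_decompress()
are hypothetical names. The sketch uses memmove() rather than the patch's
memcpy() so that the copy stays well defined if the source and destination
buffers overlap.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read the 4-byte little-endian length trailer at the end of the payload
 * (assumption: same layout that size_append produces for gzip/lz4 images).
 */
uint32_t raw_trailer_size(const unsigned char *buf, size_t len)
{
	assert(len >= 4);
	const unsigned char *t = buf + len - 4;

	return (uint32_t)t[0] | (uint32_t)t[1] << 8 |
	       (uint32_t)t[2] << 16 | (uint32_t)t[3] << 24;
}

/*
 * Hypothetical stand-in for the patch's __decompress(): "decompression"
 * is a plain copy of the image bytes that precede the trailer.
 * Returns 0 on success, -1 if the trailer disagrees with the input
 * length or the output buffer is too small.
 */
int raw_decompress(const unsigned char *buf, size_t len,
		   unsigned char *outbuf, size_t olen)
{
	if (len < 4)
		return -1;
	if (raw_trailer_size(buf, len) != len - 4 || olen < len - 4)
		return -1;
	/* memmove() tolerates overlapping source and destination. */
	memmove(outbuf, buf, len - 4);
	return 0;
}
```

With a 3-byte payload "abc" followed by the trailer bytes 03 00 00 00,
raw_decompress() copies the three bytes and returns 0; a trailer that does
not match the payload length makes it return -1.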


2017-03-23 15:07:53

by Yinghai Lu

Subject: Re: [PATCH] x86/boot: Support uncompressed kernel

On Thu, Mar 23, 2017 at 5:51 AM, Chao Peng <[email protected]> wrote:
> Compressed kernel has its own drawback: uncompressing takes time. Even
> though the time is short enough to ignore for most cases but for cases that
> time is critical this is still a big number. In our on-going optimization
> for kernel boot time, the measured overall kernel boot time is ~90ms while
> the uncompressing takes ~50ms with gzip.
>
> The patch adds a 'CONFIG_KERNEL_RAW' configure choice so the built binary
> can have no uncompressing at all. The experiment shows:
>
>     kernel               kernel size    time in decompress_kernel
>     compressed (gzip)    3.3M           53ms
>     uncompressed         14M            3ms

How about the time difference for bootloader to read kernel from
flash/disk/network to ram?

Thanks

Yinghai

2017-03-23 15:32:35

by Sergey Senozhatsky

Subject: Re: [PATCH] x86/boot: Support uncompressed kernel

On (03/23/17 08:07), Yinghai Lu wrote:
> On Thu, Mar 23, 2017 at 5:51 AM, Chao Peng <[email protected]> wrote:
> > Compressed kernel has its own drawback: uncompressing takes time. Even
> > though the time is short enough to ignore for most cases but for cases that
> > time is critical this is still a big number. In our on-going optimization
> > for kernel boot time, the measured overall kernel boot time is ~90ms while
> > the uncompressing takes ~50ms with gzip.
> >
> > The patch adds a 'CONFIG_KERNEL_RAW' configure choice so the built binary
> > can have no uncompressing at all. The experiment shows:
> >
> >     kernel               kernel size    time in decompress_kernel
> >     compressed (gzip)    3.3M           53ms
> >     uncompressed         14M            3ms
>
> How about the time difference for bootloader to read kernel from
> flash/disk/network to ram?

there are also faster de-compressors than gzip out there. LZ4, for instance.
LZ4, as far as I remember, can be quite fast, like ~10 times faster than gzip.
have you tested it?

-ss

2017-03-24 05:35:58

by Chao Peng

Subject: Re: [PATCH] x86/boot: Support uncompressed kernel


> > > The patch adds a 'CONFIG_KERNEL_RAW' configure choice so the built binary
> > > can have no uncompressing at all. The experiment shows:
> > >
> > >     kernel               kernel size    time in decompress_kernel
> > >     compressed (gzip)    3.3M           53ms
> > >     uncompressed         14M            3ms
> >
> > How about the time difference for bootloader to read kernel from
> > flash/disk/network to ram?

The bootloader's load time can grow as the image gets bigger, but that
depends on the media it reads from. For our use case this is not a big
problem: we run the kernel in virtual machines and launch thousands of
instances on the same physical machine, so only the first instance has to
read the image from the file; later instances just copy it from memory.
What really matters for us is how fast the majority of instances can boot.

>
> there are also faster de-compressors than gzip out there. LZ4, for instance.
> LZ4, as far as I remember, can be quite fast, like ~10 times faster than gzip.
> have you tested it?

Yes. LZ4 is the fastest of the compressors and takes 16ms to complete the
decompression. That is still noticeably longer than the 3ms for the
uncompressed kernel.

Chao

2017-03-24 08:24:40

by Michal Marek

Subject: Re: [PATCH] x86/boot: Support uncompressed kernel

On 2017-03-23 13:51, Chao Peng wrote:
> Compressed kernel has its own drawback: uncompressing takes time. Even
> though the time is short enough to ignore for most cases but for cases that
> time is critical this is still a big number. In our on-going optimization
> for kernel boot time, the measured overall kernel boot time is ~90ms while
> the uncompressing takes ~50ms with gzip.
>
> The patch adds a 'CONFIG_KERNEL_RAW' configure choice so the built binary
> can have no uncompressing at all. The experiment shows:
>
>     kernel               kernel size    time in decompress_kernel
>     compressed (gzip)    3.3M           53ms
>     uncompressed         14M            3ms
>
> Signed-off-by: Chao Peng <[email protected]>
> ---
> arch/x86/boot/compressed/Makefile | 3 +++
> arch/x86/boot/compressed/misc.c | 14 ++++++++++++++
> init/Kconfig | 7 +++++++
> scripts/Makefile.lib | 8 ++++++++
> 4 files changed, 32 insertions(+)
>
> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> index f9ce75d..fc0e1c0 100644
> --- a/arch/x86/boot/compressed/Makefile
> +++ b/arch/x86/boot/compressed/Makefile
> @@ -73,6 +73,8 @@ $(obj)/vmlinux.relocs: vmlinux FORCE
> vmlinux.bin.all-y := $(obj)/vmlinux.bin
> vmlinux.bin.all-$(CONFIG_X86_NEED_RELOCS) += $(obj)/vmlinux.relocs
>
> +$(obj)/vmlinux.bin.raw: $(vmlinux.bin.all-y) FORCE
> + $(call if_changed,raw)
> $(obj)/vmlinux.bin.gz: $(vmlinux.bin.all-y) FORCE
> $(call if_changed,gzip)
> $(obj)/vmlinux.bin.bz2: $(vmlinux.bin.all-y) FORCE
> @@ -86,6 +88,7 @@ $(obj)/vmlinux.bin.lzo: $(vmlinux.bin.all-y) FORCE
> $(obj)/vmlinux.bin.lz4: $(vmlinux.bin.all-y) FORCE
> $(call if_changed,lz4)
>
> +suffix-$(CONFIG_KERNEL_RAW) := raw
> suffix-$(CONFIG_KERNEL_GZIP) := gz
> suffix-$(CONFIG_KERNEL_BZIP2) := bz2
> suffix-$(CONFIG_KERNEL_LZMA) := lzma
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index 79dac17..fb3cd43 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -123,6 +123,20 @@ static char *vidmem;
> static int vidport;
> static int lines, cols;
>
> +#ifdef CONFIG_KERNEL_RAW
> +#include <linux/decompress/mm.h>
> +static int __decompress(unsigned char *buf, long len,
> + long (*fill)(void*, unsigned long),
> + long (*flush)(void*, unsigned long),
> + unsigned char *outbuf, long olen,
> + long *pos,
> + void (*error)(char *x))
> +{
> + memcpy(outbuf, buf, olen);
> + return 0;
> +}
> +#endif
> +
> #ifdef CONFIG_KERNEL_GZIP
> #include "../../../../lib/decompress_inflate.c"
> #endif
> diff --git a/init/Kconfig b/init/Kconfig
> index 2232080..1db2ea2 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -137,6 +137,13 @@ choice
>
> If in doubt, select 'gzip'
>
> +config KERNEL_RAW
> + bool "RAW"
> + help
> + No compression. It creates much bigger kernel and uses much more
> + space (disk/memory) than other choices. It can be useful when
> + decompression speed is the most concern while space is not a problem.

This needs to depend on a HAVE_KERNEL_RAW config that is selected by the
architectures that implement this target (x86).

Michal

by Sebastian Andrzej Siewior

Subject: Re: [PATCH] x86/boot: Support uncompressed kernel

On 2017-03-24 13:35:40 [+0800], Chao Peng wrote:
>
> > > >     kernel               kernel size    time in decompress_kernel
> > > >     compressed (gzip)    3.3M           53ms
> > > >     uncompressed         14M            3ms
> > >
> Exactly, LZ4 is the fastest. It takes 16ms to complete the
> decompression. Still sounds a little longer when compared to
> uncompressed kernel.

Are we seriously talking here about one-time improvement of 13ms
boot time?

> Chao

Sebastian

2017-03-27 09:27:43

by Chao Peng

Subject: Re: [PATCH] x86/boot: Support uncompressed kernel

On Mon, 2017-03-27 at 09:58 +0200, Sebastian Andrzej Siewior wrote:
> On 2017-03-24 13:35:40 [+0800], Chao Peng wrote:
> >
> > > > > >     kernel               kernel size    time in decompress_kernel
> > > > >     compressed (gzip)    3.3M           53ms
> > > > >     uncompressed         14M            3ms
> > > >
> > Exactly, LZ4 is the fastest. It takes 16ms to complete the
> > decompression. Still sounds a little longer when compared to
> > uncompressed kernel.
>
> Are we seriously talking here about one-time improvement of 13ms
> boot time?

Our usage model is to launch the kernel in virtual machines; thousands of
instances are launched, shut down and relaunched frequently, so every
single millisecond helps. And 13ms is a 20% improvement on top of our
existing optimization (everything besides decompression has been
optimized down to ~40ms).

Chao
>
> >
> > Chao
>
> Sebastian
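
The arithmetic behind the 13ms and 20% figures can be sketched as follows.
The inputs are the numbers quoted in this thread (53ms gzip, 16ms LZ4, 3ms
uncompressed, ~40ms for the rest of boot); the helper names are
hypothetical, and the result is only as precise as those rounded
measurements.

```c
#include <assert.h>

/* Total boot time in ms: the decompression step plus everything else. */
double boot_total_ms(double decompress_ms, double rest_ms)
{
	assert(decompress_ms >= 0 && rest_ms >= 0);
	return decompress_ms + rest_ms;
}

/* Percentage of the baseline total that the candidate saves. */
double saving_percent(double baseline_ms, double candidate_ms)
{
	assert(baseline_ms > 0);
	return 100.0 * (baseline_ms - candidate_ms) / baseline_ms;
}
```

boot_total_ms(16, 40) gives 56ms for LZ4 and boot_total_ms(3, 40) gives
43ms uncompressed, a 13ms difference; saving_percent(56, 43) comes to about
23%, in line with the roughly 20% mentioned above.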

2017-03-27 11:49:00

by Michal Marek

Subject: Re: [PATCH] x86/boot: Support uncompressed kernel

On 27.3.2017 at 09:58, Sebastian Andrzej Siewior wrote:
> On 2017-03-24 13:35:40 [+0800], Chao Peng wrote:
>>
>>>>>     kernel               kernel size    time in decompress_kernel
>>>>>     compressed (gzip)    3.3M           53ms
>>>>>     uncompressed         14M            3ms
>>>>
>> Exactly, LZ4 is the fastest. It takes 16ms to complete the
>> decompression. Still sounds a little longer when compared to
>> uncompressed kernel.
>
> Are we seriously talking here about one-time improvement of 13ms
> boot time?

If the use case is launching new VM instances continuously, then
compressing the kernel image is about as useful as compressing /bin/bash.

Michal

2017-03-27 13:25:15

by Arnd Bergmann

Subject: Re: [PATCH] x86/boot: Support uncompressed kernel

On Mon, Mar 27, 2017 at 1:47 PM, Michal Marek <[email protected]> wrote:
> On 27.3.2017 at 09:58, Sebastian Andrzej Siewior wrote:
>> On 2017-03-24 13:35:40 [+0800], Chao Peng wrote:
>>>
>>>>>>     kernel               kernel size    time in decompress_kernel
>>>>>>     compressed (gzip)    3.3M           53ms
>>>>>>     uncompressed         14M            3ms
>>>>>
>>> Exactly, LZ4 is the fastest. It takes 16ms to complete the
>>> decompression. Still sounds a little longer when compared to
>>> uncompressed kernel.
>>
>> Are we seriously talking here about one-time improvement of 13ms
>> boot time?
>
> If the use case is launching new VM instances continuously, then
> compressing the kernel image is about as useful as compressing /bin/bash.

I guess the next step would be to use CONFIG_XIP_KERNEL on x86,
which requires an uncompressed kernel but has the additional advantage
of sharing the read-only sections of the kernel image across virtual
machines, resulting in better RAM and cache usage.

Arnd

2017-03-28 12:01:44

by Chao Peng

Subject: Re: [PATCH] x86/boot: Support uncompressed kernel

On Mon, 2017-03-27 at 15:25 +0200, Arnd Bergmann wrote:
> On Mon, Mar 27, 2017 at 1:47 PM, Michal Marek <[email protected]> wrote:
> >
> > On 27.3.2017 at 09:58, Sebastian Andrzej Siewior wrote:
> > >
> > > On 2017-03-24 13:35:40 [+0800], Chao Peng wrote:
> > > >
> > > > > > > >     kernel               kernel size    time in decompress_kernel
> > > > > > >     compressed (gzip)    3.3M           53ms
> > > > > > >     uncompressed         14M            3ms
> > > > > >
> > > > Exactly, LZ4 is the fastest. It takes 16ms to complete the
> > > > decompression. Still sounds a little longer when compared to
> > > > uncompressed kernel.
> > >
> > > Are we seriously talking here about one-time improvement of 13ms
> > > boot time?
> >
> > If the use case is launching new VM instances continuously, then
> > compressing the kernel image is about as useful as compressing /bin/bash.
>
> I guess the next step would be to use CONFIG_XIP_KERNEL on x86,
> which requires an uncompressed kernel but has the additional advantage
> of sharing the read-only sections of the kernel image across virtual
> machines, resulting in better RAM and cache usage.

That is something we wanna look into :)

Chao

2017-03-28 22:50:24

by H. Peter Anvin

Subject: Re: [PATCH] x86/boot: Support uncompressed kernel

On 03/28/17 05:01, Chao Peng wrote:
>>
>> I guess the next step would be to use CONFIG_XIP_KERNEL on x86,
>> which requires an uncompressed kernel but has the additional advantage
>> of sharing the read-only sections of the kernel image across virtual
>> machines, resulting in better RAM and cache usage.
>
> That is something we wanna look into :)
>

It is, but that is a second order thing... especially since the x86
kernel makes heavy use of self-patching at the moment. What would be
more significant, though, would be to avoid the memcpy() and instead
decode the uncompressed kernel in-place.

-hpa


2017-03-28 22:56:16

by Andy Lutomirski

Subject: Re: [PATCH] x86/boot: Support uncompressed kernel

On Tue, Mar 28, 2017 at 3:38 PM, H. Peter Anvin <[email protected]> wrote:
> On 03/28/17 05:01, Chao Peng wrote:
>>>
>>> I guess the next step would be to use CONFIG_XIP_KERNEL on x86,
>>> which requires an uncompressed kernel but has the additional advantage
>>> of sharing the read-only sections of the kernel image across virtual
>>> machines, resulting in better RAM and cache usage.
>>
>> That is something we wanna look into :)
>>
>
> It is, but that is a second order thing... especially since the x86
> kernel makes heavy use of self-patching at the moment. What would be
> more significant, though, would be to avoid the memcpy() and instead
> decode the uncompressed kernel in-place.
>

Having looked at this code recently, I'd rather fix it differently:
use the streaming decompression API and integrate it with the ELF
parsing code so we can decompress directly into the actual load
location. Also, the parse_elf() code seriously needs better
documentation and hardening. It's absurdly fragile right now.
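
To make the "place segments directly at the load location" idea concrete,
here is a rough user-space sketch in C. It covers only the segment
placement half for an already uncompressed 64-bit ELF image; it does not
implement streaming decompression and it is not the kernel's parse_elf().
The function name and the load_base parameter are hypothetical, and
<elf.h> is assumed to be the glibc header.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy every PT_LOAD segment of a 64-bit ELF image to its place in dest.
 * load_base is the physical address that corresponds to dest[0]
 * (hypothetical parameter standing in for the kernel's load address).
 * Returns the number of segments placed, or -1 on a malformed image.
 */
int load_elf64_segments(const unsigned char *image, size_t image_len,
			unsigned char *dest, size_t dest_len,
			Elf64_Addr load_base)
{
	Elf64_Ehdr eh;
	int placed = 0;

	assert(image && dest);
	if (image_len < sizeof(eh))
		return -1;
	memcpy(&eh, image, sizeof(eh));
	if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_ident[EI_CLASS] != ELFCLASS64 ||
	    eh.e_phentsize != sizeof(Elf64_Phdr))
		return -1;
	/* The whole program header table must lie inside the image. */
	if (eh.e_phoff > image_len ||
	    (image_len - eh.e_phoff) / sizeof(Elf64_Phdr) < eh.e_phnum)
		return -1;

	for (unsigned int i = 0; i < eh.e_phnum; i++) {
		Elf64_Phdr ph;
		Elf64_Addr off;

		memcpy(&ph, image + eh.e_phoff + i * sizeof(ph), sizeof(ph));
		if (ph.p_type != PT_LOAD)
			continue;
		/* Source bytes must lie inside the image. */
		if (ph.p_offset > image_len ||
		    ph.p_filesz > image_len - ph.p_offset)
			return -1;
		/* Target range must lie inside the destination buffer. */
		if (ph.p_paddr < load_base)
			return -1;
		off = ph.p_paddr - load_base;
		if (off > dest_len || ph.p_filesz > dest_len - off)
			return -1;
		/* memmove(): image and dest may overlap when loading in place. */
		memmove(dest + off, image + ph.p_offset, ph.p_filesz);
		placed++;
	}
	return placed;
}
```

Every offset and size taken from the headers is checked before it is used,
which is the kind of hardening the remark about fragility points at. A
streaming variant would feed decompressed bytes into the same placement
logic instead of reading them from a fully decompressed image.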