2018-04-20 14:38:04

by Paul Menzel

Subject: How to disable Linux kernel self-extraction (KERNEL_GZIP, KERNEL_BZIP2, …)?

Dear Linux folks,


I am trying to decrease boot time. My system has an SSD and enough space,
so loading 18 MB instead of 12 MB doesn’t make a difference, but the
self-extraction is noticeable. So, I would like to disable it.

From `init/Kconfig`:

> The linux kernel is a kind of self-extracting executable.

Unfortunately, I couldn’t find a way to disable it. Should an option
`KERNEL_NONE` be added?


Kind regards,

Paul



2018-04-22 10:21:55

by Pavel Machek

Subject: Re: How to disable Linux kernel self-extraction (KERNEL_GZIP, KERNEL_BZIP2, … )?

On Fri 2018-04-20 16:36:00, Paul Menzel wrote:
> Dear Linux folks,
>
>
> I try to decrease boot time, and my system has an SSD and enough space, so
> loading 18 instead of 12 MB doesn’t make a difference, but the
> self-extraction is noticeable. So, I like to disable it.

How long does GZIP extraction take on your hardware?


--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2018-04-22 19:59:47

by Paul Menzel

Subject: Re: How to disable Linux kernel self-extraction (KERNEL_GZIP, KERNEL_BZIP2, …)?

Dear Pavel,


Am 22.04.2018 um 12:20 schrieb Pavel Machek:
> On Fri 2018-04-20 16:36:00, Paul Menzel wrote:

>> I try to decrease boot time, and my system has an SSD and enough space, so
>> loading 18 instead of 12 MB doesn’t make a difference, but the
>> self-extraction is noticeable. So, I like to disable it.
>
> How long does GZIP extraction take on your hardware?

It’s hard to measure – at least I didn’t find a way to do so –, but
counting from the last GRUB message to the first message of Linux (with
`quiet` removed from the command line), it takes roughly *two* seconds.

```
$ ls -l /boot/vmlinuz-4.15.0-3-686-pae
-rw-r--r-- 1 root root 3987200 Apr 19 12:13 /boot/vmlinuz-4.15.0-3-686-pae
$ ls -lh /boot/vmlinuz-4.15.0-3-686-pae
-rw-r--r-- 1 root root 3,9M Apr 19 12:13 /boot/vmlinuz-4.15.0-3-686-pae
$ time scripts/extract-vmlinux vmlinuz-4.15.0-3-686-pae > bla.txt

real 0m1,204s
user 0m1,041s
sys 0m0,245s
```

As another data point, here is my self-built (*bigger*) image, which uses
XZ for compression.

```
$ ls -lh /boot/vmlinuz-4.16.0+
-rw-r--r-- 1 root root 6,1M Apr 14 10:48 /boot/vmlinuz-4.16.0+
$ time /usr/src/linux-headers-4.17.0-rc1+/scripts/extract-vmlinux vmlinuz-4.16.0+ > bla.txt

real 0m2,190s
user 0m1,792s
sys 0m0,500s
```

So, it’s really noticeable if the rest of the system starts in less than
five seconds to a login prompt, or if you are trying to display the LUKS
passphrase dialog more or less instantly.


Kind regards,

Paul

2018-04-22 20:53:48

by Pavel Machek

Subject: Re: How to disable Linux kernel self-extraction (KERNEL_GZIP, KERNEL_BZIP2, … )?

Hi!

> >>I try to decrease boot time, and my system has an SSD and enough space, so
> >>loading 18 instead of 12 MB doesn’t make a difference, but the
> >>self-extraction is noticeable. So, I like to disable it.
> >
> >How long does GZIP extraction take on your hardware?
>
> It’s hard to measure – at least I didn’t find a way to do so –, but counting
> from the last GRUB message to the first message of Linux (with `quiet`
> removed from the command line), it takes roughly *two* seconds.
>
> ```
> $ ls -l /boot/vmlinuz-4.15.0-3-686-pae
> -rw-r--r-- 1 root root 3987200 Apr 19 12:13 /boot/vmlinuz-4.15.0-3-686-pae
> $ ls -lh /boot/vmlinuz-4.15.0-3-686-pae
> -rw-r--r-- 1 root root 3,9M Apr 19 12:13 /boot/vmlinuz-4.15.0-3-686-pae
> $ time scripts/extract-vmlinux vmlinuz-4.15.0-3-686-pae > bla.txt
>
> real 0m1,204s
> user 0m1,041s
> sys 0m0,245s
> ```

Interesting; looks like I have a faster machine (ThinkPad X220).

pavel@duo:/data/l/linux$ time scripts/extract-vmlinux /tmp/vmlinux.bin > /tmp/delme
0.21user 0.15system 0.66 (0m0.660s) elapsed 56.36%CPU
pavel@duo:/data/l/linux$ ls -al /tmp/delme
-rw-r--r-- 1 pavel pavel 18275356 Apr 22 22:44 /tmp/delme
pavel@duo:/data/l/linux$ gzip /tmp/delme
pavel@duo:/data/l/linux$ time gzip -d /tmp/delme.gz
0.21user 0.02system 0.23 (0m0.232s) elapsed 100.03%CPU
pavel@duo:/data/l/linux$ gzip -1 /tmp/delme
pavel@duo:/data/l/linux$ time gzip -d /tmp/delme.gz
0.22user 0.03system 0.25 (0m0.257s) elapsed 100.11%CPU
pavel@duo:/data/l/linux$ time gzip -d /tmp/delme.gz

...and it also looks like extract-vmlinux is significantly slower than
gzip -9, because it is not using all the CPU. Strange.

Interesting. I somehow assumed gzip -d would be faster than that on
modern machines.
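
For anyone who wants to repeat the userspace comparison without a kernel tree at hand, here is a minimal Python sketch. It assumes synthetic data instead of a real vmlinux.bin; the function name and the sample buffer are made up for illustration, and the absolute numbers will not match the kernel's own decompressor.

```python
import time
import zlib
from typing import Tuple


def time_gzip_decompress(data: bytes, level: int) -> Tuple[int, float]:
    """Compress `data` at `level`, then time a single decompression.

    Returns (compressed_size, seconds).
    """
    # wbits=31 selects the gzip container instead of a raw zlib stream.
    comp = zlib.compressobj(level, zlib.DEFLATED, 31)
    blob = comp.compress(data) + comp.flush()

    start = time.perf_counter()
    out = zlib.decompress(blob, 31)
    elapsed = time.perf_counter() - start

    assert out == data  # sanity check: the round trip must be lossless
    return len(blob), elapsed


if __name__ == "__main__":
    # Hypothetical stand-in for an ~18 MB kernel image.
    sample = bytes(range(256)) * 4096 * 18
    for level in (1, 6, 9):
        size, secs = time_gzip_decompress(sample, level)
        print(f"level {level}: {size} bytes compressed, {secs:.3f} s")
```

Running it against the real `vmlinux.bin` contents instead of `sample` would be closer to the measurements above.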

So yes, it looks like an uncompressed kernel image may be a good idea.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-04-23 17:04:24

by Pavel Machek

Subject: Re: How to disable Linux kernel self-extraction (KERNEL_GZIP, KERNEL_BZIP2, … )?

Hi!

> > >>I try to decrease boot time, and my system has an SSD and enough space, so
> > >>loading 18 instead of 12 MB doesn’t make a difference, but the
> > >>self-extraction is noticeable. So, I like to disable it.
> > >
> > >How long does GZIP extraction take on your hardware?
> >
> > It’s hard to measure – at least I didn’t find a way to do so –, but counting
> > from the last GRUB message to the first message of Linux (with `quiet`
> > removed from the command line), it takes roughly *two* seconds.
> >
> > ```
> > $ ls -l /boot/vmlinuz-4.15.0-3-686-pae
> > -rw-r--r-- 1 root root 3987200 Apr 19 12:13 /boot/vmlinuz-4.15.0-3-686-pae
> > $ ls -lh /boot/vmlinuz-4.15.0-3-686-pae
> > -rw-r--r-- 1 root root 3,9M Apr 19 12:13 /boot/vmlinuz-4.15.0-3-686-pae
> > $ time scripts/extract-vmlinux vmlinuz-4.15.0-3-686-pae > bla.txt
> >
> > real 0m1,204s
> > user 0m1,041s
> > sys 0m0,245s
> > ```
>
> Interesting; looks like I have faster machine (thinkpad X220).
>
> pavel@duo:/data/l/linux$ time scripts/extract-vmlinux /tmp/vmlinux.bin
> > /tmp/delme
> 0.21user 0.15system 0.66 (0m0.660s) elapsed 56.36%CPU
> pavel@duo:/data/l/linux$ ls -al /tmp/delme
> -rw-r--r-- 1 pavel pavel 18275356 Apr 22 22:44 /tmp/delme
> pavel@duo:/data/l/linux$ gzip /tmp/delme
> pavel@duo:/data/l/linux$ time gzip -d /tmp/delme.gz
> 0.21user 0.02system 0.23 (0m0.232s) elapsed 100.03%CPU
> pavel@duo:/data/l/linux$ gzip -1 /tmp/delme
> pavel@duo:/data/l/linux$ time gzip -d /tmp/delme.gz
> 0.22user 0.03system 0.25 (0m0.257s) elapsed 100.11%CPU
> pavel@duo:/data/l/linux$ time gzip -d /tmp/delme.gz
>
> ..and it also looks like extract-kernel is significantly slower than
> gzip -9, because it is not using all the CPU. Strange.
>
> Interesting. I somehow assumed gzip -d would be faster than that on
> modern machines.
>
> So yes, looks like uncompressed kernel image may be good idea.

Actually... Compressors usually have a mode in which they store the data
uncompressed. So you should be able to prepare a .gz image which is not
really compressed inside, and thus really fast to uncompress.
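
To illustrate the "stored" idea in userspace: zlib's compression level 0 emits stored DEFLATE blocks inside an ordinary gzip container. The sketch below is Python, with a made-up helper name and a synthetic payload; whether the kernel build and boot path would accept such an image is an assumption that this snippet does not test.

```python
import gzip
import zlib


def gzip_stored(data: bytes) -> bytes:
    """Wrap `data` in a gzip container using level 0 (stored blocks only)."""
    # Level 0 means no compression; wbits=31 selects the gzip framing.
    comp = zlib.compressobj(0, zlib.DEFLATED, 31)
    return comp.compress(data) + comp.flush()


if __name__ == "__main__":
    payload = bytes(range(256)) * 1024  # hypothetical stand-in for vmlinux.bin
    blob = gzip_stored(payload)
    # The result is slightly larger than the input: gzip header and trailer
    # plus a few bytes of framing per stored block.
    print(len(payload), len(blob))
    assert gzip.decompress(blob) == payload
```

Any standard gzip decompressor can read the result, because stored blocks are part of the DEFLATE format.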

Or maybe even better -- there should be some compression algorithms
that decompress fast enough to cause no slowdown. Maybe use one of
those?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-04-24 02:10:37

by Adam Borowski

Subject: Re: How to disable Linux kernel self-extraction (KERNEL_GZIP, KERNEL_BZIP2, … )?

On Mon, Apr 23, 2018 at 07:02:30PM +0200, Pavel Machek wrote:
> > > >>I try to decrease boot time, and my system has an SSD and enough space, so
> > > >>loading 18 instead of 12 MB doesn’t make a difference, but the
> > > >>self-extraction is noticeable. So, I like to disable it.
> > > >
> > > >How long does GZIP extraction take on your hardware?
> > >
> > > It’s hard to measure – at least I didn’t find a way to do so –, but counting
> > > from the last GRUB message to the first message of Linux (with `quiet`
> > > removed from the command line), it takes roughly *two* seconds.

I took a somewhat different approach: I recorded the output from grub+kernel
to ttyrec over serial line, and rigged ttyrec2ansi to output timestamp
difference from the last checkpoint every time an '\e' or '\n' is seen.
'\e' is important, as there's no other marking for when grub stops the
interactive phase and starts the actual boot.
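
The checkpoint idea can be sketched in a few lines of Python. This is not the modified ttyrec2ansi, only an illustration under two assumptions: the recording uses the usual ttyrec layout (little-endian seconds, microseconds and payload length before each chunk), and per-chunk timestamps are precise enough. All names are made up for the example.

```python
import struct
from typing import Iterator, Tuple

HEADER = struct.Struct("<III")  # seconds, microseconds, payload length


def read_ttyrec(blob: bytes) -> Iterator[Tuple[float, bytes]]:
    """Yield (timestamp, payload) for each record in a ttyrec byte string."""
    pos = 0
    while pos + HEADER.size <= len(blob):
        sec, usec, length = HEADER.unpack_from(blob, pos)
        pos += HEADER.size
        yield sec + usec / 1e6, blob[pos:pos + length]
        pos += length


def annotate(blob: bytes) -> str:
    """Put 〈delta〉 before every ESC or newline, measured from the last mark."""
    out = []
    last = None
    for stamp, payload in read_ttyrec(blob):
        if last is None:
            last = stamp  # the first record starts the clock
        for byte in payload:
            if byte in (0x1B, 0x0A):  # '\e' or '\n' is a checkpoint
                out.append(f"〈{stamp - last:.6f}〉")
                last = stamp
            out.append(chr(byte))
    return "".join(out)
```

Feeding it the bytes of a saved recording, for example `annotate(open("boot.ttyrec", "rb").read())` with a hypothetical file name, gives output in the same spirit as the logs in this thread.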

Turns out that, reading from SSD, GRUB is way, way slower than it should
be. The machine is old (AMD Phenom II X6 1055T), SSD is Crucial
CT240M500SSD1.

I also have the zstd patch applied, which adds another data point.

The two "Loading XXX ..." lines come from GRUB, those timestamped within []
brackets from the kernel, 〈〉 are ttyrec timestamps, and ⤸ marks wrapped lines.


zstd:

Loading Linux 4.17.0-rc2-debug-00025-gd426b0ba363d ...〈0.739823〉
^MLoading initial ramdisk ...〈0.402010〉
^M[ 0.000000] Linux version 4.17.0-rc2-debug-00025-gd426b0ba363d ⤸
(kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Mon Apr 23⤸
10:25:58 CEST 2018^M〈0.785922〉
[ 0.000000] Command line: BOOT_IMAGE=/sys/boot/vmlinuz-4.17.0-rc2-debug-00025-gd426b0ba363d⤸
root=UUID=b7c38da9-ae84-4083-a1f8-6d7b4fc33961 ro rootflags=subvol=sys syscall.x32=y⤸
console=tty0 console=ttyS0,115200n8 no_console_suspend^M〈0.020199〉

gzip:

Loading Linux 4.17.0-rc2-debug-gz-00025-gd426b0ba363d ...〈0.724988〉
^MLoading initial ramdisk ...〈0.357951〉
^M[ 0.000000] Linux version 4.17.0-rc2-debug-gz-00025-gd426b0ba363d ⤸
(kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Mon Apr 23 ⤸
23:15:07 CEST 2018^M〈0.777977〉
[ 0.000000] Command line: BOOT_IMAGE=/sys/boot/vmlinuz-4.17.0-rc2-debug-gz-00025-gd426b0ba363d⤸
root=UUID=b7c38da9-ae84-4083-a1f8-6d7b4fc33961 ro rootflags=subvol=sys syscall.x32=y⤸
console=tty0 console=ttyS0,115200n8 no_console_suspend^M〈0.020117〉

lz4:

Loading Linux 4.17.0-rc2-debug-none-00025-gd426b0ba363d ...〈0.799969〉
^MLoading initial ramdisk ...〈0.424959〉
^M[ 0.000000] Linux version 4.17.0-rc2-debug-lz4-00025-gd426b0ba363d ⤸
(kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Tue Apr 24 ⤸
00:34:59 CEST 2018^M〈0.732925〉
[ 0.000000] Command line: BOOT_IMAGE=/sys/boot/vmlinuz-4.17.0-rc2-debug-lz4-00025-gd426b0ba363d⤸
root=UUID=b7c38da9-ae84-4083-a1f8-6d7b4fc33961 ro rootflags=subvol=sys syscall.x32=y⤸
console=tty0 console=ttyS0,115200n8 no_console_suspend^M〈0.021019〉

zstd again:

Loading Linux 4.17.0-rc2-debug-00025-gd426b0ba363d ...〈0.728852〉
^MLoading initial ramdisk ...〈0.399968〉
^M[ 0.000000] Linux version 4.17.0-rc2-debug-00025-gd426b0ba363d ⤸
(kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Mon Apr 23 ⤸
10:25:58 CEST 2018^M〈0.786964〉
[ 0.000000] Command line: BOOT_IMAGE=/sys/boot/vmlinuz-4.17.0-rc2-debug-00025-gd426b0ba363d⤸
root=UUID=b7c38da9-ae84-4083-a1f8-6d7b4fc33961 ro rootflags=subvol=sys syscall.x32=y⤸
console=tty0 console=ttyS0,115200n8 no_console_suspend^M〈0.020071〉

lz4 rigged for no compression:

Loading Linux 4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty ...〈0.479834〉
^MLoading initial ramdisk ...〈2.246985〉
^M[ 0.000000] Linux version 4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty ⤸
(kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #5 SMP Tue Apr 24 ⤸
02:57:18 CEST 2018^M〈0.711949〉
[ 0.000000] Command line: BOOT_IMAGE=/sys/boot/vmlinuz-4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty⤸
root=UUID=b7c38da9-ae84-4083-a1f8-6d7b4fc33961 ro rootflags=subvol=sys syscall.x32=y⤸
console=tty0 console=ttyS0,115200n8 no_console_suspend^M〈0.021902〉

Sizes of relevant files:

14826134 initrd.img-4.17.0-rc2-debug-00025-gd426b0ba363d (zstd)
14826352 initrd.img-4.17.0-rc2-debug-gz-00025-gd426b0ba363d
14826909 initrd.img-4.17.0-rc2-debug-lz4-00025-gd426b0ba363d
14826761 initrd.img-4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty
6567408 vmlinuz-4.17.0-rc2-debug-00025-gd426b0ba363d (zstd)
7230960 vmlinuz-4.17.0-rc2-debug-gz-00025-gd426b0ba363d
8775152 vmlinuz-4.17.0-rc2-debug-lz4-00025-gd426b0ba363d
27821552 vmlinuz-4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty
(I did not alter initrd compression, which is zstd in all cases).

> > So yes, looks like uncompressed kernel image may be good idea.

It seems that reading this far bigger file from the disk, the inefficient
way GRUB does it, costs more time than faster decompression saves. You can
eliminate the decompression step altogether by avoiding copying, but it
still looks like it's not a win.
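
A back-of-the-envelope check of that claim, as a Python sketch. It assumes that the delta printed after "Loading initial ramdisk ..." is the time GRUB spent reading the kernel image, which the logs suggest but do not prove; the helper name is made up.

```python
def rate_mb_per_s(size_bytes: int, seconds: float) -> float:
    """Effective read rate in MB/s (10**6 bytes per second)."""
    return size_bytes / seconds / 1e6


# (vmlinuz size in bytes, delta after "Loading initial ramdisk ..."),
# copied from the first run of each variant in the logs above.
RUNS = {
    "zstd": (6567408, 0.402010),
    "gzip": (7230960, 0.357951),
    "lz4": (8775152, 0.424959),
    "none": (27821552, 2.246985),
}

if __name__ == "__main__":
    for name, (size, secs) in RUNS.items():
        print(f"{name:5s} {rate_mb_per_s(size, secs):6.1f} MB/s")
```

Under that assumption every variant lands far below what the SSD itself can deliver, so the extra megabytes of the uncompressed image are expensive to load.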

I've seen u-boot taking ~60 seconds to read from an SD card, too.

Another surprise is that zstd is a notch _slower_ than gzip (in userspace
it's drastically faster for the same compression ratio), but reduced disk
space is still nice. It's worth investigating why it's not as fast as it
should be.

> Actually... Compressors usually have a mode when they store the data
> uncompressed. So you should be able to prepare .gz image which is not
> really compressed inside, and thus really fast to uncompress.

I can't seem to find any. IIRC xz format can store uncompressed blocks but
the tool doesn't appear to expose this as an option.

> Or maybe even better -- there should be some compression algorithms
> that are fast enough to uncompress that there should be no
> slowdown. Maybe use one of those?

Perhaps my method is totally wrong, but differences in decompression speed
are surprisingly small, dwarfed by whatever else the kernel does between
messages.

I did not test xz, nor did I run the tests more than once, but it's 4am, so
these are things to do tomorrow.


Meow!
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋⠀ ... what's the frequency of that 5V DC?
⠈⠳⣄⠀⠀⠀⠀

2018-04-24 09:27:16

by Paul Menzel

Subject: Re: How to disable Linux kernel self-extraction (KERNEL_GZIP, KERNEL_BZIP2, …)?

Dear Adam,


Thank you very much for joining the discussion.

On 04/24/18 04:08, Adam Borowski wrote:
> On Mon, Apr 23, 2018 at 07:02:30PM +0200, Pavel Machek wrote:
>>>>>> I try to decrease boot time, and my system has an SSD and enough space, so
>>>>>> loading 18 instead of 12 MB doesn’t make a difference, but the
>>>>>> self-extraction is noticeable. So, I like to disable it.
>>>>>
>>>>> How long does GZIP extraction take on your hardware?
>>>>
>>>> It’s hard to measure – at least I didn’t find a way to do so –, but counting
>>>> from the last GRUB message to the first message of Linux (with `quiet`
>>>> removed from the command line), it takes roughly *two* seconds.
>
> I took a somewhat different approach: I recorded the output from grub+kernel
> to ttyrec over serial line, and rigged ttyrec2ansi to output timestamp
> difference from the last checkpoint every time an '\e' or '\n' is seen.
> '\e' is important, as there's no other marking for when grub stops the
> interactive phase and starts the actual boot.

Nice work. It’d be great if you shared these patches, so that others and I
can reproduce it.

(Also, `scripts/extract-vmlinux` needs to be updated for LZ4.)

> Turns out that, reading from SSD, grub is way way slower than it should take
> normally. The machine is old (AMD Phenom II X6 1055T), SSD is Crucial
> CT240M500SSD1.

What firmware does the device (mainboard) have? Legacy BIOS or UEFI (or
even coreboot ;-))? It’s my understanding that GRUB does not use a
native driver with legacy BIOS and UEFI, and just uses the BIOS
calls or the UEFI equivalent.

> I also have the zstd patch applied, which adds another data point.
>
> The two "Loading XXX ..." lines come from grub, those timestamped within []
> brackets from the kernel, 〈〉are ttyrec timestamps, ⤸ is wrapped lines.
>
>
> zstd:
>
> Loading Linux 4.17.0-rc2-debug-00025-gd426b0ba363d ...〈0.739823〉
> ^MLoading initial ramdisk ...〈0.402010〉
> ^M[ 0.000000] Linux version 4.17.0-rc2-debug-00025-gd426b0ba363d ⤸
> (kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Mon Apr 23⤸
> 10:25:58 CEST 2018^M〈0.785922〉
> [ 0.000000] Command line: BOOT_IMAGE=/sys/boot/vmlinuz-4.17.0-rc2-debug-00025-gd426b0ba363d⤸
> root=UUID=b7c38da9-ae84-4083-a1f8-6d7b4fc33961 ro rootflags=subvol=sys syscall.x32=y⤸
> console=tty0 console=ttyS0,115200n8 no_console_suspend^M〈0.020199〉
>
> gzip:
>
> Loading Linux 4.17.0-rc2-debug-gz-00025-gd426b0ba363d ...〈0.724988〉
> ^MLoading initial ramdisk ...〈0.357951〉
> ^M[ 0.000000] Linux version 4.17.0-rc2-debug-gz-00025-gd426b0ba363d ⤸
> (kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Mon Apr 23 ⤸
> 23:15:07 CEST 2018^M〈0.777977〉
> [ 0.000000] Command line: BOOT_IMAGE=/sys/boot/vmlinuz-4.17.0-rc2-debug-gz-00025-gd426b0ba363d⤸
> root=UUID=b7c38da9-ae84-4083-a1f8-6d7b4fc33961 ro rootflags=subvol=sys syscall.x32=y⤸
> console=tty0 console=ttyS0,115200n8 no_console_suspend^M〈0.020117〉
>
> lz4:
>
> Loading Linux 4.17.0-rc2-debug-none-00025-gd426b0ba363d ...〈0.799969〉
> ^MLoading initial ramdisk ...〈0.424959〉
> ^M[ 0.000000] Linux version 4.17.0-rc2-debug-lz4-00025-gd426b0ba363d ⤸
> (kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Tue Apr 24 ⤸
> 00:34:59 CEST 2018^M〈0.732925〉
> [ 0.000000] Command line: BOOT_IMAGE=/sys/boot/vmlinuz-4.17.0-rc2-debug-lz4-00025-gd426b0ba363d⤸
> root=UUID=b7c38da9-ae84-4083-a1f8-6d7b4fc33961 ro rootflags=subvol=sys syscall.x32=y⤸
> console=tty0 console=ttyS0,115200n8 no_console_suspend^M〈0.021019〉
>
> zstd again:
>
> Loading Linux 4.17.0-rc2-debug-00025-gd426b0ba363d ...〈0.728852〉
> ^MLoading initial ramdisk ...〈0.399968〉
> ^M[ 0.000000] Linux version 4.17.0-rc2-debug-00025-gd426b0ba363d ⤸
> (kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Mon Apr 23 ⤸
> 10:25:58 CEST 2018^M〈0.786964〉
> [ 0.000000] Command line: BOOT_IMAGE=/sys/boot/vmlinuz-4.17.0-rc2-debug-00025-gd426b0ba363d⤸
> root=UUID=b7c38da9-ae84-4083-a1f8-6d7b4fc33961 ro rootflags=subvol=sys syscall.x32=y⤸
> console=tty0 console=ttyS0,115200n8 no_console_suspend^M〈0.020071〉
>
> lz4 rigged for no compression:
>
> Loading Linux 4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty ...〈0.479834〉
> ^MLoading initial ramdisk ...〈2.246985〉

Just to be sure: the 2.2 seconds are from loading the 27 MB Linux kernel
image, right?

> ^M[ 0.000000] Linux version 4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty ⤸
> (kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #5 SMP Tue Apr 24 ⤸
> 02:57:18 CEST 2018^M〈0.711949〉
> [ 0.000000] Command line: BOOT_IMAGE=/sys/boot/vmlinuz-4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty⤸
> root=UUID=b7c38da9-ae84-4083-a1f8-6d7b4fc33961 ro rootflags=subvol=sys syscall.x32=y⤸
> console=tty0 console=ttyS0,115200n8 no_console_suspend^M〈0.021902〉
>
> Sizes of relevant files:
>
> 14826134 initrd.img-4.17.0-rc2-debug-00025-gd426b0ba363d (zstd)
> 14826352 initrd.img-4.17.0-rc2-debug-gz-00025-gd426b0ba363d
> 14826909 initrd.img-4.17.0-rc2-debug-lz4-00025-gd426b0ba363d
> 14826761 initrd.img-4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty
> 6567408 vmlinuz-4.17.0-rc2-debug-00025-gd426b0ba363d (zstd)
> 7230960 vmlinuz-4.17.0-rc2-debug-gz-00025-gd426b0ba363d
> 8775152 vmlinuz-4.17.0-rc2-debug-lz4-00025-gd426b0ba363d
> 27821552 vmlinuz-4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty
> (I did not alter initrd compression, which is zstd in all cases).

Does the size of the uncompressed image match the size in
`arch/x86/boot/compressed/vmlinux.bin`?

>>> So yes, looks like uncompressed kernel image may be good idea.
>
> Seems like the time to actually read this far bigger file from the disk
> using grub's inefficient way, takes longer than the gains from faster
> decompression. You can eliminate the decompression step altogether by
> avoiding copying, but it still looks like it's not a win.
>
> I've seen u-boot taking ~60 seconds to read from a SD card, too.

I could test on my coreboot systems whether GRUB is faster with the native
AHCI driver.

> Another surprise is that zstd is a notch _slower_ than gzip (in userspace
> it's drastically faster for the same compression ratio), but reduced disk
> space is still nice. It's worth investigating why it's not as fast as it
> should be.

Maybe that should be done in a separate thread. I’ll split it out.

>> Actually... Compressors usually have a mode when they store the data
>> uncompressed. So you should be able to prepare .gz image which is not
>> really compressed inside, and thus really fast to uncompress.
>
> I can't seem to find any. IIRC xz format can store uncompressed blocks but
> the tool doesn't appear to expose this as an option.
>
>> Or maybe even better -- there should be some compression algorithms
>> that are fast enough to uncompress that there should be no
>> slowdown. Maybe use one of those?
>
> Perhaps my method is totally wrong, but differences in decompression speed
> are surprisingly small, dwarfed by whatever else the kernel does between
> messages.
>
> I did not test xz, nor ran tests more than once, but it's 4am so these are
> things to do tomorrow.

It’d be interesting to find out what is happening in the first 700 ms,
before the first Linux message is transmitted over the serial line. It
could be that the serial line affects the time stamps, for example, or
that it takes that long to set up the serial console.


Kind regards,

Paul



2018-04-24 09:57:40

by Pavel Machek

Subject: Re: How to disable Linux kernel self-extraction (KERNEL_GZIP, KERNEL_BZIP2, … )?

Hi!

> > Actually... Compressors usually have a mode when they store the data
> > uncompressed. So you should be able to prepare .gz image which is not
> > really compressed inside, and thus really fast to uncompress.
>
> I can't seem to find any. IIRC xz format can store uncompressed blocks but
> the tool doesn't appear to expose this as an option.

I believe most compressors should be able to do that. But yes, that
option would need to be exported.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-04-24 23:09:57

by Adam Borowski

Subject: Re: How to disable Linux kernel self-extraction (KERNEL_GZIP, KERNEL_BZIP2, … )?

On Tue, Apr 24, 2018 at 11:08:34AM +0200, Paul Menzel wrote:
> On 04/24/18 04:08, Adam Borowski wrote:
> > On Mon, Apr 23, 2018 at 07:02:30PM +0200, Pavel Machek wrote:
> > > > > > > I try to decrease boot time, and my system has an SSD and enough space, so
> > > > > > > loading 18 instead of 12 MB doesn’t make a difference, but the
> > > > > > > self-extraction is noticeable. So, I like to disable it.
> > > > > >
> > > > > > How long does GZIP extraction take on your hardware?
> > > > >
> > > > > It’s hard to measure – at least I didn’t find a way to do so –, but counting
> > > > > from the last GRUB message to the first message of Linux (with `quiet`
> > > > > > removed from the command line), it takes roughly *two* seconds.
> >
> > I took a somewhat different approach: I recorded the output from grub+kernel
> > to ttyrec over serial line, and rigged ttyrec2ansi to output timestamp
> > difference from the last checkpoint every time an '\e' or '\n' is seen.
> > '\e' is important, as there's no other marking for when grub stops the
> > interactive phase and starts the actual boot.
>
> Nice work. It’d be great, if you shared these patches, so others and I can
> reproduce it.

ttyrec2ansi.c attached. To use: save the serial output as ttyrec (via the
eponymous tool, termrec, conversion from a similar format, etc.), then pipe
it through the modified ttyrec2ansi to a terminal or "less -R".


userland lz4:
diff --git a/lib/lz4.c b/lib/lz4.c
index 213b085..39d2cff 100644
--- a/lib/lz4.c
+++ b/lib/lz4.c
@@ -499,6 +499,7 @@ LZ4_FORCE_INLINE U32 LZ4_hashPosition(const void* const p, tableType_t const tab

 static void LZ4_putPositionOnHash(const BYTE* p, U32 h, void* tableBase, tableType_t const tableType, const BYTE* srcBase)
 {
+    return;
     switch (tableType)
     {
     case byPtr: { const BYTE** hashTable = (const BYTE**)tableBase; hashTable[h] = p; return; }

Somehow this affects only /usr/bin/lz4, not /usr/bin/lz4c, which I did not
bother to fix but worked around via:

diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index f2014876405f..91534a801090 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -332,7 +332,7 @@ cmd_lzo = (cat $(filter-out FORCE,$^) | \

 quiet_cmd_lz4 = LZ4 $@
 cmd_lz4 = (cat $(filter-out FORCE,$^) | \
-	lz4c -l -c1 stdin stdout && $(call size_append, $(filter-out FORCE,$^))) > $@ || \
+	lz4 -l -c1 stdin stdout && $(call size_append, $(filter-out FORCE,$^))) > $@ || \
 	(rm -f $@ ; false)

 # U-Boot mkimage


I have serious doubts this approach is sound, but it worked well enough at
least to spot the GRUB read slowness issue.

> > Turns out that, reading from SSD, grub is way way slower than it should take
> > normally. The machine is old (AMD Phenom II X6 1055T), SSD is Crucial
> > CT240M500SSD1.
>
> What firmware does the device (mainboard) have? Legacy BIOS or UEFI (or even
> coreboot ;-)). It’s my understanding, that GRUB does not have a native
> driver with the legacy BIOS and UEFI, and just uses the BIOS calls or the
> UEFI equivalent.

An old machine -- real BIOS.

> > I also have the zstd patch applied, which adds another data point.
> >
> > The two "Loading XXX ..." lines come from grub, those timestamped within []
> > brackets from the kernel, 〈〉are ttyrec timestamps, ⤸ is wrapped lines.
> >
> >
> > zstd:
> >
> > Loading Linux 4.17.0-rc2-debug-00025-gd426b0ba363d ...〈0.739823〉
> > ^MLoading initial ramdisk ...〈0.402010〉
> > ^M[ 0.000000] Linux version 4.17.0-rc2-debug-00025-gd426b0ba363d ⤸
> > (kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Mon Apr 23⤸
> > 10:25:58 CEST 2018^M〈0.785922〉
> >
> > gzip:
> >
> > Loading Linux 4.17.0-rc2-debug-gz-00025-gd426b0ba363d ...〈0.724988〉
> > ^MLoading initial ramdisk ...〈0.357951〉
> > ^M[ 0.000000] Linux version 4.17.0-rc2-debug-gz-00025-gd426b0ba363d ⤸
> > (kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Mon Apr 23 ⤸
> > 23:15:07 CEST 2018^M〈0.777977〉
> >
> > lz4:
> >
> > Loading Linux 4.17.0-rc2-debug-none-00025-gd426b0ba363d ...〈0.799969〉
> > ^MLoading initial ramdisk ...〈0.424959〉
> > ^M[ 0.000000] Linux version 4.17.0-rc2-debug-lz4-00025-gd426b0ba363d ⤸
> > (kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Tue Apr 24 ⤸
> > 00:34:59 CEST 2018^M〈0.732925〉
> >
> > zstd again:
> >
> > Loading Linux 4.17.0-rc2-debug-00025-gd426b0ba363d ...〈0.728852〉
> > ^MLoading initial ramdisk ...〈0.399968〉
> > ^M[ 0.000000] Linux version 4.17.0-rc2-debug-00025-gd426b0ba363d ⤸
> > (kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Mon Apr 23 ⤸
> > 10:25:58 CEST 2018^M〈0.786964〉
> >
> > lz4 rigged for no compression:
> >
> > Loading Linux 4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty ...〈0.479834〉
> > ^MLoading initial ramdisk ...〈2.246985〉
>
> Just to be sure. The 2.2 seconds are from loading the 27 MB Linux kernel
> image, right?

I don't see any other obvious explanation, yeah.

> > ^M[ 0.000000] Linux version 4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty ⤸
> > (kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #5 SMP Tue Apr 24 ⤸
> > 02:57:18 CEST 2018^M〈0.711949〉

lz4 rigged for no compression, re-run:

Loading Linux 4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty ...〈0.476784〉
^MLoading initial ramdisk ...〈2.229852〉
^M[ 0.000000] Linux version 4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty ⤸
(kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #5 SMP Tue Apr 24 ⤸
02:57:18 CEST 2018^M〈0.711020〉

xz:

Loading Linux 4.17.0-rc2-debug-xz-00025-gd426b0ba363d ...〈0.489154〉
^MLoading initial ramdisk ...〈0.278774〉
^M[ 0.000000] Linux version 4.17.0-rc2-debug-xz-00025-gd426b0ba363d ⤸
(kilobyte@umbar) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Tue Apr 24 ⤸
11:30:49 CEST 2018^M〈1.221916〉


> > Sizes of relevant files:
> >
> > 14826134 initrd.img-4.17.0-rc2-debug-00025-gd426b0ba363d (zstd)
> > 14826352 initrd.img-4.17.0-rc2-debug-gz-00025-gd426b0ba363d
> > 14826909 initrd.img-4.17.0-rc2-debug-lz4-00025-gd426b0ba363d
> > 14826761 initrd.img-4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty
14825759 initrd.img-4.17.0-rc2-debug-xz-00025-gd426b0ba363d
> > 6567408 vmlinuz-4.17.0-rc2-debug-00025-gd426b0ba363d (zstd)
> > 7230960 vmlinuz-4.17.0-rc2-debug-gz-00025-gd426b0ba363d
> > 8775152 vmlinuz-4.17.0-rc2-debug-lz4-00025-gd426b0ba363d
> > 27821552 vmlinuz-4.17.0-rc2-debug-none-00025-gd426b0ba363d-dirty
5654000 vmlinuz-4.17.0-rc2-debug-xz-00025-gd426b0ba363d
> > (I did not alter initrd compression, which is zstd in all cases).
>
> Does the size of the uncompressed image match the size in
> `arch/x86/boot/compressed/vmlinux.bin`?

Nearly -- lz4 has a weird inefficient way of storing length of literals: a
long run is stored as 255 255 255 255 255 255 255 42 to say 7*255+42, at a
cost of 0.4% space. It might be good to use a more faithful null
compressor, although sticking with real lz4 might be safer.
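
For the curious, the length encoding can be written down as a short Python sketch. It follows the LZ4 block format as I understand it: the token nibble holds up to 15, and each following byte adds up to 255 until one byte is below 255. The function names are made up, and the example above leaves out the 15 carried by the token for brevity.

```python
from typing import List, Tuple


def encode_literal_length(length: int) -> Tuple[int, List[int]]:
    """Return (token nibble, extension bytes) for an LZ4 literal run."""
    if length < 15:
        return length, []
    rest = length - 15           # the token nibble itself accounts for 15
    extra = [255] * (rest // 255)
    extra.append(rest % 255)     # a final byte below 255 ends the sequence
    return 15, extra


def overhead_ratio(length: int) -> float:
    """Extension bytes spent per literal byte."""
    _, extra = encode_literal_length(length)
    return len(extra) / length


if __name__ == "__main__":
    print(encode_literal_length(15 + 7 * 255 + 42))
    print(f"{overhead_ratio(8 * 1024 * 1024):.4%}")
```

One extension byte per 255 literal bytes works out to roughly 0.4%, which matches the size difference mentioned above.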

> > > > So yes, looks like uncompressed kernel image may be good idea.
> >
> > Seems like the time to actually read this far bigger file from the disk
> > using grub's inefficient way, takes longer than the gains from faster
> > decompression. You can eliminate the decompression step altogether by
> > avoiding copying, but it still looks like it's not a win.
> >
> > I've seen u-boot taking ~60 seconds to read from a SD card, too.
>
> I could test on my coreboot systems, if GRUB is faster with the native AHCI
> driver.

Yeah -- this machine's SSD is no speed demon, a mere 500MB/s, but that's a
wee bit faster than the ridiculous slowness of GRUB.

And it's not just GRUB:

Retrieving file: /initrd.img-4.16.0-00199-ge68d78e24cde〈0.004987〉
6042402 bytes read in 354 ms (16.3 MiB/s)〈0.380119〉
Retrieving file: /vmlinuz-4.16.0-00199-ge68d78e24cde〈0.004918〉
15809024 bytes read in 68814 ms (223.6 KiB/s)〈68.842893〉
append: root=/dev/mmcblk0p5 ro console=ttyS0,115200n8 rootwait⤸
earlycon=uart,mmio32,0x01c28000 loglevel=8〈0.009030〉
Retrieving file: /dtbs/4.16.0-00199-ge68d78e24cde/allwinner/sun50i-a64-pine64-plus.dtb〈0.007933〉
19458 bytes read in 524 ms (36.1 KiB/s)〈0.550027〉
## Flattened Device Tree blob at 4fa00000〈0.003124〉
Booting using the fdt blob at 0x4fa00000〈0.003997〉
Loading Ramdisk to 49a3c000, end 49fff322 ... OK〈0.010845〉
Loading Device Tree to 0000000049a34000, end 0000000049a3bc01 ... OK〈0.007020〉
〈0.005008〉
Starting kernel ...〈0.002145〉
〈0.000886〉
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]〈0.018013〉
[ 0.000000] Linux version 4.16.0-00199-ge68d78e24cde (kilobyte@sirius) ⤸
(gcc version 7.3.0 (Debian 7.3.0-14)) #1 SMP PREEMPT Fri Apr 6 03:17:52 CEST 2018〈0.013964〉


> > Another surprise is that zstd is a notch _slower_ than gzip (in userspace
> > it's drastically faster for the same compression ratio), but reduced disk
> > space is still nice. It's worth investigating why it's not as fast as it
> > should be.
>
> Maybe that should be done in a separate thread. I’ll split it out.

It'd be good to gather data for initrd compression as well to get both in
one go. In these existing logs, all had zstd:

AMD Phenom II:
[ 2.367726] Decompressing using zstd.〈0.003109〉
[ 2.515786] Freeing initrd memory: 14480K〈0.148877〉

Allwinner A64:
[ 1.924208] Decompressing using zstd.〈0.003906〉
[ 2.230483] Freeing initrd memory: 5900K〈0.306135〉

Lemme re-run with other compressors.
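
To compare such runs, the two kernel timestamps can be turned into a rough rate. The Python sketch below assumes that the interval between the two messages is dominated by initrd decompression and that the "K" figure is the size of the compressed initrd; both are assumptions, and the names are made up.

```python
import re

STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def kernel_stamp(line: str) -> float:
    """Extract the [seconds.micros] printk timestamp from a log line."""
    match = STAMP.match(line)
    if match is None:
        raise ValueError(f"no kernel timestamp in: {line!r}")
    return float(match.group(1))


def initrd_rate_mib_s(start_line: str, end_line: str, initrd_kib: int) -> float:
    """MiB of initrd consumed per second between the two log lines."""
    elapsed = kernel_stamp(end_line) - kernel_stamp(start_line)
    return initrd_kib / 1024 / elapsed


if __name__ == "__main__":
    phenom = initrd_rate_mib_s(
        "[ 2.367726] Decompressing using zstd.",
        "[ 2.515786] Freeing initrd memory: 14480K",
        14480,
    )
    a64 = initrd_rate_mib_s(
        "[ 1.924208] Decompressing using zstd.",
        "[ 2.230483] Freeing initrd memory: 5900K",
        5900,
    )
    print(f"Phenom II: {phenom:.1f} MiB/s, A64: {a64:.1f} MiB/s")
```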

> > Perhaps my method is totally wrong, but differences in decompression speed
> > are surprisingly small, dwarfed by whatever else the kernel does between
> > messages.
> >
> > I did not test xz, nor ran tests more than once, but it's 4am so these are
> > things to do tomorrow.
>
> It’d be interesting to find out, what is happening in the first 700 ms,
> before the first Linux message is transmitted over serial line. It could be,
> that the serial line affects the time stamps for example, or takes so long
> to set up the serial console.

printk only gathers data in a buffer during early init, waiting until not
only the kernel is decompressed, but also a bunch of various other kinds of
initialization are done before the serial line is set up, right? Not sure
if there's a way to time early init, when all in-kernel timestamps show up
as [ 0.000000].

As for the serial line, it _is_ slow: at 115200 it takes 5ms to send a line
which slows down boot considerably, especially with debug output (~1000
lines here) -- but 12ms for the first wrapped line is a far cry from 700ms.
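
The serial arithmetic, as a small Python sketch. It assumes 8N1 framing (as in `console=ttyS0,115200n8`), so each character costs 10 bits on the wire; the function name and the 58-character line length are illustrative.

```python
def serial_seconds(n_chars: int, baud: int = 115200, bits_per_char: int = 10) -> float:
    """Time to push n_chars through a UART; 8N1 framing is 10 bits per char."""
    return n_chars * bits_per_char / baud


if __name__ == "__main__":
    per_line = serial_seconds(58)  # a line of about 58 characters
    print(f"one line: {per_line * 1000:.2f} ms")
    print(f"1000 such lines: {1000 * per_line:.2f} s")
```

At that rate a thousand debug lines cost several seconds in total, but no single line comes anywhere near 700 ms.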


Meow!
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋⠀ Certified airhead; got the CT scan to prove that!
⠈⠳⣄⠀⠀⠀⠀


Attachments:
ttyrec2ansi.c (1.79 kB)

2018-04-25 01:14:04

by Adam Borowski

On Tue, Apr 24, 2018 at 11:08:34AM +0200, Paul Menzel wrote:
> (Also, `scripts/extract-vmlinux` needs to be updated for LZ4.)

'ere you go. KERNEL_ZSTD is not in mainline yet, but knowing its magic
can't hurt -- especially as scripts may be out of sync with the installed
kernel. LZ4 has been in since 3.11.

--8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<--
From 30886e965e7aeae8d3729c4bacf614a19e103cea Mon Sep 17 00:00:00 2001
From: Adam Borowski <[email protected]>
Date: Wed, 25 Apr 2018 02:29:18 +0200
Subject: [PATCH] scripts: teach extract-vmlinux about LZ4 and ZSTD

Note that the LZ4 signature is different than that of modern LZ4 as we
use the "legacy" format which suffers from some downsides like inability
to disable compression.

Signed-off-by: Adam Borowski <[email protected]>
---
scripts/extract-vmlinux | 2 ++
1 file changed, 2 insertions(+)

diff --git a/scripts/extract-vmlinux b/scripts/extract-vmlinux
index 5061abcc2540..e6239f39abad 100755
--- a/scripts/extract-vmlinux
+++ b/scripts/extract-vmlinux
@@ -57,6 +57,8 @@ try_decompress '\3757zXZ\000' abcde unxz
 try_decompress 'BZh' xy bunzip2
 try_decompress '\135\0\0\0' xxx unlzma
 try_decompress '\211\114\132' xy 'lzop -d'
+try_decompress '\002!L\030' xxx 'lz4 -d'
+try_decompress '(\265/\375' xxx unzstd

 # Bail out:
 echo "$me: Cannot find vmlinux." >&2
--
2.17.0
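
For readers who want to see what those magic strings are in plain bytes, here is a Python sketch that scans an image for the signatures visible in the hunk above. It is not a port of extract-vmlinux: it only reports offsets, the names are made up, and a hit can be a false positive, which is why the script tries to decompress at each candidate.

```python
from typing import List, Tuple

# Octal escapes from the patch, rewritten as hex bytes.
MAGICS = {
    "xz": b"\xfd7zXZ\x00",              # '\3757zXZ\000'
    "bzip2": b"BZh",
    "lzma": b"\x5d\x00\x00\x00",        # '\135\0\0\0'
    "lzop": b"\x89\x4c\x5a",            # '\211\114\132'
    "lz4 (legacy)": b"\x02\x21\x4c\x18",  # '\002!L\030'
    "zstd": b"\x28\xb5\x2f\xfd",        # '(\265/\375'
}


def find_magics(image: bytes) -> List[Tuple[int, str]]:
    """Return sorted (offset, compressor name) pairs for every magic found."""
    hits = []
    for name, magic in MAGICS.items():
        pos = image.find(magic)
        while pos != -1:
            hits.append((pos, name))
            pos = image.find(magic, pos + 1)
    return sorted(hits)
```

A call such as `find_magics(open("/boot/vmlinuz-...", "rb").read())` lists the candidate offsets to try.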