2022-01-29 16:27:20

by Feng Tang

Subject: Possible bug for ZSTD kernel decompressing

Hi All,

Recently 0Day reported a 32-bit i386 kernel decompression failure for my
patch [1], which essentially increases the size of the kernel data
section from 19MB to 53MB. The error message is:

early console in setup code
early console in extract_kernel
input_data: 0x05077079
input_len: 0x00f8a633
output: 0x01000000
output_len: 0x045c4328
kernel_total_size: 0x05040000
needed_size: 0x05040000

Decompressing Linux...

ZSTD-compressed data is corrupt

-- System halted
BUG: kernel hang in boot stage
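
The addresses in the log above are easier to reason about once the region ends are worked out. The following is a minimal sketch of that arithmetic, not kernel code: the helper names `region_end` and `regions_overlap` are hypothetical, and reading `input_data`/`input_len` as the compressed payload and `output`/`output_len` as the decompression target is an assumption based on the field names.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied verbatim from the boot log. */
#define INPUT_DATA  UINT32_C(0x05077079) /* assumed: start of compressed payload */
#define INPUT_LEN   UINT32_C(0x00f8a633) /* assumed: compressed size in bytes */
#define OUTPUT      UINT32_C(0x01000000) /* assumed: start of decompressed image */
#define OUTPUT_LEN  UINT32_C(0x045c4328) /* assumed: decompressed size in bytes */
#define NEEDED_SIZE UINT32_C(0x05040000) /* size of the window starting at OUTPUT */

/* First address past a region that starts at `start` and spans `len` bytes. */
static uint32_t region_end(uint32_t start, uint32_t len)
{
	/* The sum must not wrap around a 32-bit address space. */
	assert(len <= UINT32_MAX - start);
	return start + len;
}

/* Nonzero when [a_start, a_end) and [b_start, b_end) share at least one byte. */
static int regions_overlap(uint32_t a_start, uint32_t a_end,
			   uint32_t b_start, uint32_t b_end)
{
	return a_start < b_end && b_start < a_end;
}
```

With the logged values, `region_end(INPUT_DATA, INPUT_LEN)` is 0x060016ac, `region_end(OUTPUT, OUTPUT_LEN)` is 0x055c4328, and `region_end(OUTPUT, NEEDED_SIZE)` is 0x06040000. Under the assumptions above, the compressed input lies inside the `needed_size` window and overlaps the tail of the decompressed image, which is what an in-place decompression layout would look like. This arithmetic alone does not identify the cause of the failure.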

From debugging, this looks like a problem in the ZSTD decompression code:
when I reverted my patch and instead hacked the kernel to grow the data
section by 32MB, the same error occurred.

Some other hints:
* the same i386 config boots fine with lz4 and xz compression
* X86_64 + zstd also boots fine

This can be reproduced with the following qemu command:

qemu-system-i386 -machine pc -cpu host -enable-kvm -kernel bzImage -m 2048m -smp 4 -serial stdio --append "earlyprintk=ttyS0,115200 console=ttyS0,115200"

The i386 kernel config is attached, and the debug patch is below:
---
diff --git a/init/main.c b/init/main.c
index 767ee2672176..873f40ddf96e 100644
--- a/init/main.c
+++ b/init/main.c
@@ -162,6 +162,10 @@ static size_t initargs_offs;
 static char *execute_command;
 static char *ramdisk_execute_command = "/init";
 
+#define DT_SIZE 8192000
+static unsigned long tbuf[DT_SIZE] = { 1, 2, 3, 4, };
+
 /*
  * Used to generate warnings if static_key manipulation functions are used
  * before jump_label_init is called.
@@ -690,6 +694,11 @@ noinline void __ref rest_init(void)
 	struct task_struct *tsk;
 	int pid;
 
+	unsigned long i, j = 0;
+	for (i = 0; i < DT_SIZE; i++)
+		j += tbuf[i];
+	printk("j = 0x%lx\n", j);
+
 	rcu_scheduler_starting();
 	/*
 	 * We need to spawn init first so that it obtains pid 1, however
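
As a sanity check on the "32MB" figure, the size of `tbuf` can be worked out from `DT_SIZE`. This is a rough sketch rather than anything from the patch itself: `I386_ULONG_BYTES`, `array_bytes` and `bytes_to_mib` are hypothetical names, and the 4-byte `unsigned long` is an assumption about the i386 ABI. Because the array has a non-zero initializer, it is placed in initialized data rather than .bss, which is why it grows the data section that gets compressed.

```c
#include <assert.h>
#include <stdint.h>

#define DT_SIZE          UINT64_C(8192000) /* element count from the debug patch */
#define I386_ULONG_BYTES UINT64_C(4)       /* assumption: unsigned long is 4 bytes on i386 */

/* Total size in bytes of an array with `count` elements of `elem_size` bytes. */
static uint64_t array_bytes(uint64_t count, uint64_t elem_size)
{
	/* Guard against overflow of the 64-bit product. */
	assert(elem_size == 0 || count <= UINT64_MAX / elem_size);
	return count * elem_size;
}

/* Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes). */
static double bytes_to_mib(uint64_t bytes)
{
	return (double)bytes / (1024.0 * 1024.0);
}
```

Here `array_bytes(DT_SIZE, I386_ULONG_BYTES)` is 32,768,000 bytes, which `bytes_to_mib` reports as 31.25 MiB, so roughly the 32MB mentioned above. On a target where `unsigned long` is 8 bytes, the same array would be twice that size.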

Please let me know if you need more info.

[1.] https://lore.kernel.org/lkml/[email protected]/

Thanks,
Feng


Attachments:
i386_decompress_fail.config (148.98 kB)

2022-01-31 11:31:15

by Nick Terrell

Subject: Re: Possible bug for ZSTD kernel decompressing



> On Jan 27, 2022, at 8:53 PM, Feng Tang <[email protected]> wrote:
>
> Hi All,
>
> Recently 0Day reported a 32-bit i386 kernel decompression failure for my
> patch [1], which essentially increases the size of the kernel data
> section from 19MB to 53MB. The error message is:
>
> early console in setup code
> early console in extract_kernel
> input_data: 0x05077079
> input_len: 0x00f8a633
> output: 0x01000000
> output_len: 0x045c4328
> kernel_total_size: 0x05040000
> needed_size: 0x05040000
>
> Decompressing Linux...
>
> ZSTD-compressed data is corrupt

Thanks for the report! I will look into it this weekend.

-Nick


2022-02-01 20:49:20

by Nick Terrell

Subject: Re: Possible bug for ZSTD kernel decompressing




I've been unable to reproduce this issue using the provided patch + config based on
Linux v5.17-rc2.

What version of Linux are you testing on? Zstd was updated in v5.16, so if you're not
testing on v5.16 or later, can you please re-test on v5.17-rc2?

Best,
Nick Terrell


2022-02-03 09:21:21

by Feng Tang

Subject: Re: Possible bug for ZSTD kernel decompressing

Hi Nick,

On Mon, Jan 31, 2022 at 08:31:10PM +0000, Nick Terrell wrote:
> I've been unable to reproduce this issue using the provided patch + config based on
> Linux v5.17-rc2.
>
> What version of Linux are you testing on? Zstd was updated in v5.16, so if you're not
> testing on v5.16 or later, can you please re-test on v5.17-rc2?

The original report I got was against commit 8cd7c588decf
("mm/vmscan: throttle reclaim until some writeback completes if congested"),
which is post-5.15.

I just retested, and the issue can _not_ be reproduced against 5.17-rc2.
Thanks for the check and fix, and sorry for not trying the latest kernel.

- Feng