Hi Tom,

Lianbo reported that the kdump kernel fails to boot when 'nokaslr' is
added, and that KASLR has to be enabled in the kdump kernel for it to
boot successfully. This blocks his work on enabling SME for kexec/kdump.
On some machines, an SME kernel also fails to boot as the 1st kernel.

I checked the SME implementation and found the root cause. The failures
above are caused by the SME code in sme_encrypt_kernel(). In
sme_encrypt_kernel(), a 2M encryption work area is used as an
intermediate buffer to encrypt the kernel in place, and that work area
sits just after _end of the kernel.

This happens to work in the 1st kernel, but it will definitely fail in
the kexec/kdump kernel, because kexec-tools loads realmode/kernel/initrd
from top to down. In kexec-tools, realmode is put just after the kernel
image. If KASLR is enabled, the kernel may be randomized to another
position, and then the kdump kernel can boot. However, if nokaslr is
specified, the 2M intermediate encryption work area will definitely
stomp on the realmode area that follows, and the kexec/kdump kernel
fails to boot.

I hacked kexec-tools to put the real mode area 4M away from the end of
the kernel image. It works, which confirms my finding. So the current
SME in-place encryption is not only a kexec/kdump issue, but also an
issue in the 1st kernel. KASLR could put the kernel at the end of an
available memory region, so how do we make sure the next 2M intermediate
work area exists? And if KASLR puts the kernel close to the starting
address of any cmdline/initrd/setup_data, how do we make sure the gap
between them is larger than 2M?
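
To make the collision concrete, here is a minimal userspace sketch (not
kernel code) of the check described above. It assumes the work area
starts on the first 2M boundary at or after the end of the kernel. The
helper names, the work area length passed in, and all addresses are
hypothetical and only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SIZE_2M (2ULL << 20)	/* 2MB, the alignment assumed for the work area */

/* Round addr up to the next multiple of align (align must be a power of two). */
uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/* Return 1 if [a_start, a_end) and [b_start, b_end) share at least one byte. */
int ranges_overlap(uint64_t a_start, uint64_t a_end,
		   uint64_t b_start, uint64_t b_end)
{
	return a_start < b_end && b_start < a_end;
}

/*
 * Hypothetical helper: would a work area of workarea_len bytes, placed on
 * the first 2M boundary at or after kernel_end, collide with the region
 * [next_start, next_end) that the loader put after the kernel?
 */
int workarea_hits(uint64_t kernel_end, uint64_t workarea_len,
		  uint64_t next_start, uint64_t next_end)
{
	uint64_t wa_start = align_up(kernel_end, SIZE_2M);

	return ranges_overlap(wa_start, wa_start + workarea_len,
			      next_start, next_end);
}
```

With a made-up kernel end of 0x2d373000 and a 2M work area, a region
placed at 0x2d400000 is hit (workarea_hits() returns 1), while the same
region moved 4M past the kernel end, to 0x2d773000, is not (returns 0).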
Thanks
Baoquan
On 6/4/19 8:49 AM, Baoquan He wrote:
> Hi Tom,
>
> Lianbo reported kdump kernel can't boot well with 'nokaslr' added, and
> have to enable KASLR in kdump kernel to make it boot successfully. This
> blocked his work on enabling sme for kexec/kdump. And on some machines
> SME kernel can't boot in 1st kernel.
>
> I checked code of SME implementation, and found out the root cause. The
> above failures are caused by SME code, sme_encrypt_kernel(). In
> sme_encrypt_kernel(), you get a 2M of encryption work area as intermediate
> buffer to encrypt kernel in-place. And the work area is just after _end of
> kernel.
I remember worrying about something like this back when I was testing the
kexec support. I had come up with a patch to address it, but never got the
time to test and submit it. I've included it here if you'd like to test
it (I haven't run this patch in quite some time). If it works, we can
think about submitting it.
Thanks,
Tom
---
x86/mm: Create an SME workarea in the kernel for early encryption
From: Tom Lendacky <[email protected]>
The SME workarea used during early encryption of the kernel during boot
is situated on a 2MB boundary after the end of the kernel text, data,
etc. sections (_end). This works well during initial boot of a compressed
kernel because of the relocation used for decompression of the kernel.
But when performing a kexec boot, there's a chance that the SME workarea
may not be mapped by the kexec pagetables or that some of the other data
used by kexec could exist in this range.
Create a section for SME in the vmlinux.lds.S. Position it after "_end"
so that the memory will be reclaimed during boot and since it is all
zeroes it compresses well. Since this new section will be part of the
kernel, kexec will account for it in pagetable mappings and placement of
data after the kernel.
Here's an example of a kernel size without and with the SME section:
without:
vmlinux: 36,501,616
bzImage: 6,497,344
100000000-47f37ffff : System RAM
1e4000000-1e47677d4 : Kernel code (0x7677d4)
1e47677d5-1e4e2e0bf : Kernel data (0x6c68ea)
1e5074000-1e5372fff : Kernel bss (0x2fefff)
with:
vmlinux: 44,419,408
bzImage: 6,503,136
880000000-c7ff7ffff : System RAM
8cf000000-8cf7677d4 : Kernel code (0x7677d4)
8cf7677d5-8cfe2e0bf : Kernel data (0x6c68ea)
8d0074000-8d0372fff : Kernel bss (0x2fefff)
Signed-off-by: Tom Lendacky <[email protected]>
---
arch/x86/kernel/vmlinux.lds.S | 16 ++++++++++++++++
arch/x86/mm/mem_encrypt_identity.c | 22 ++++++++++++++++++++--
2 files changed, 36 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 0850b5149345..8c4377983e54 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -379,6 +379,22 @@ SECTIONS
. = ALIGN(PAGE_SIZE); /* keep VO_INIT_SIZE page aligned */
_end = .;
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ /*
+ * SME workarea section: Lives outside of the kernel proper
+ * (_text - _end) for performing in-place encryption. Resides
+ * on a 2MB boundary to simplify the pagetable setup used for
+ * the encryption.
+ */
+ . = ALIGN(HPAGE_SIZE);
+ .sme : AT(ADDR(.sme) - LOAD_OFFSET) {
+ __sme_begin = .;
+ *(.sme)
+ . = ALIGN(HPAGE_SIZE);
+ __sme_end = .;
+ }
+#endif
+
STABS_DEBUG
DWARF_DEBUG
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 4aa9b1480866..c55c2ec8fb12 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -73,6 +73,19 @@ struct sme_populate_pgd_data {
unsigned long vaddr_end;
};
+/*
+ * This work area lives in the .sme section, which lives outside of
+ * the kernel proper. It is sized to hold the intermediate copy buffer
+ * and more than enough pagetable pages.
+ *
+ * By using this section, the kernel can be encrypted in place and we
+ * avoid any possibility of boot parameters or initramfs images being
+ * placed such that the in-place encryption logic overwrites them. This
+ * section is 2MB aligned to allow for simple pagetable setup using only
+ * PMD entries (see vmlinux.lds.S).
+ */
+static char sme_workarea[2 * PMD_PAGE_SIZE] __section(.sme);
+
static char sme_cmdline_arg[] __initdata = "mem_encrypt";
static char sme_cmdline_on[] __initdata = "on";
static char sme_cmdline_off[] __initdata = "off";
@@ -314,8 +327,13 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
}
#endif
- /* Set the encryption workarea to be immediately after the kernel */
- workarea_start = kernel_end;
+ /*
+ * We're running identity mapped, so we must obtain the address to the
+ * SME encryption workarea using rip-relative addressing.
+ */
+ asm ("lea sme_workarea(%%rip), %0"
+ : "=r" (workarea_start)
+ : "p" (sme_workarea));
/*
* Calculate required number of workarea bytes needed:
On 06/04/19 at 03:56pm, Lendacky, Thomas wrote:
> On 6/4/19 8:49 AM, Baoquan He wrote:
> > Hi Tom,
> >
> > Lianbo reported kdump kernel can't boot well with 'nokaslr' added, and
> > have to enable KASLR in kdump kernel to make it boot successfully. This
> > blocked his work on enabling sme for kexec/kdump. And on some machines
> > SME kernel can't boot in 1st kernel.
> >
> > I checked code of SME implementation, and found out the root cause. The
> > above failures are caused by SME code, sme_encrypt_kernel(). In
> > sme_encrypt_kernel(), you get a 2M of encryption work area as intermediate
> > buffer to encrypt kernel in-place. And the work area is just after _end of
> > kernel.
>
> I remember worrying about something like this back when I was testing the
> kexec support. I had come up with a patch to address it, but never got the
> time to test and submit it. I've included it here if you'd like to test
> it (I haven't done run this patch in quite some time). If it works, we can
> think about submitting it.
Thanks for your quick response and for making this patch, Tom.
I tested it on a Speedway machine. It got into the kernel, but failed at
the stage below. I tested twice and it failed both times.
[ 4.978521] Freeing unused decrypted memory: 2040K
[ 4.983800] Freeing unused kernel image memory: 2344K
[ 4.988943] Write protecting the kernel read-only data: 18432k
[ 4.995306] Freeing unused kernel image memory: 2012K
[ 5.000488] Freeing unused kernel image memory: 256K
[ 5.005540] Run /init as init process
[ 5.009443] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
[ 5.017230] CPU: 0 PID: 1 Comm: init Not tainted 5.2.0-rc2+ #38
[ 5.023251] Hardware name: AMD Corporation Speedway/Speedway, BIOS RSW1004B 10/18/2017
[ 5.031299] Call Trace:
[ 5.033793] dump_stack+0x46/0x60
[ 5.037169] panic+0xfb/0x2cb
[ 5.040191] do_exit.cold.21+0x59/0x81
[ 5.044004] do_group_exit+0x3a/0xa0
[ 5.047640] __x64_sys_exit_group+0x14/0x20
[ 5.051899] do_syscall_64+0x55/0x1c0
[ 5.055627] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 5.060764] RIP: 0033:0x7fa1b1fc9e2e
[ 5.064404] Code: Bad RIP value.
[ 5.067687] RSP: 002b:00007fffc5abb778 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
[ 5.075296] RAX: ffffffffffffffda RBX: 00007fa1b1fd2528 RCX: 00007fa1b1fc9e2e
[ 5.082625] RDX: 000000000000007f RSI: 000000000000003c RDI: 000000000000007f
[ 5.089879] RBP: 00007fa1b21d8d00 R08: 00000000000000e7 R09: 00007fffc5abb688
[ 5.097134] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002
[ 5.104386] R13: 0000000000000001 R14: 00007fa1b21d8d40 R15: 00007fa1b21d8d30
[ 5.111645] Kernel Offset: disabled
[ 5.423002] Rebooting in 10 seconds..
[ 15.429641] ACPI MEMORY or I/O RESET_REG.
On 6/4/19 7:56 PM, Baoquan He wrote:
> On 06/04/19 at 03:56pm, Lendacky, Thomas wrote:
>> On 6/4/19 8:49 AM, Baoquan He wrote:
>>> Hi Tom,
>>>
>>> Lianbo reported kdump kernel can't boot well with 'nokaslr' added, and
>>> have to enable KASLR in kdump kernel to make it boot successfully. This
>>> blocked his work on enabling sme for kexec/kdump. And on some machines
>>> SME kernel can't boot in 1st kernel.
>>>
>>> I checked code of SME implementation, and found out the root cause. The
>>> above failures are caused by SME code, sme_encrypt_kernel(). In
>>> sme_encrypt_kernel(), you get a 2M of encryption work area as intermediate
>>> buffer to encrypt kernel in-place. And the work area is just after _end of
>>> kernel.
>>
>> I remember worrying about something like this back when I was testing the
>> kexec support. I had come up with a patch to address it, but never got the
>> time to test and submit it. I've included it here if you'd like to test
>> it (I haven't done run this patch in quite some time). If it works, we can
>> think about submitting it.
>
> Thanks for your quick response and making this patch, Tom.
>
> Tested on a speedway machine, it entered into kernel, but failed in
> below stage. Tested two times, always happened.
Is this the initial kernel boot or the kexec kernel boot?
It looks like this is related to the initrd/initramfs decryption. Not
sure what could be happening there. I just tried the patch on my Naples
system and a 5.2.0-rc3 kernel and have been able to repeatedly kexec boot
a number of times so far.
Thanks,
Tom
On 06/05/19 at 04:04pm, Lendacky, Thomas wrote:
> On 6/4/19 7:56 PM, Baoquan He wrote:
> > On 06/04/19 at 03:56pm, Lendacky, Thomas wrote:
> >> On 6/4/19 8:49 AM, Baoquan He wrote:
> >>> Hi Tom,
> >>>
> >>> Lianbo reported kdump kernel can't boot well with 'nokaslr' added, and
> >>> have to enable KASLR in kdump kernel to make it boot successfully. This
> >>> blocked his work on enabling sme for kexec/kdump. And on some machines
> >>> SME kernel can't boot in 1st kernel.
> >>>
> >>> I checked code of SME implementation, and found out the root cause. The
> >>> above failures are caused by SME code, sme_encrypt_kernel(). In
> >>> sme_encrypt_kernel(), you get a 2M of encryption work area as intermediate
> >>> buffer to encrypt kernel in-place. And the work area is just after _end of
> >>> kernel.
> >>
> >> I remember worrying about something like this back when I was testing the
> >> kexec support. I had come up with a patch to address it, but never got the
> >> time to test and submit it. I've included it here if you'd like to test
> >> it (I haven't done run this patch in quite some time). If it works, we can
> >> think about submitting it.
> >
> > Thanks for your quick response and making this patch, Tom.
> >
> > Tested on a speedway machine, it entered into kernel, but failed in
> > below stage. Tested two times, always happened.
>
> Is this the initial kernel boot or the kexec kernel boot?
It's the kexec kernel boot.
>
> It looks like this is related to the initrd/initramfs decryption. Not
> sure what could be happening there. I just tried the patch on my Naples
> system and a 5.2.0-rc3 kernel and have been able to repeatedly kexec boot
> a number of times so far.
On 2019-06-06 at 00:04, Lendacky, Thomas wrote:
> On 6/4/19 7:56 PM, Baoquan He wrote:
>> On 06/04/19 at 03:56pm, Lendacky, Thomas wrote:
>>> On 6/4/19 8:49 AM, Baoquan He wrote:
>>>> Hi Tom,
>>>>
>>>> Lianbo reported kdump kernel can't boot well with 'nokaslr' added, and
>>>> have to enable KASLR in kdump kernel to make it boot successfully. This
>>>> blocked his work on enabling sme for kexec/kdump. And on some machines
>>>> SME kernel can't boot in 1st kernel.
>>>>
>>>> I checked code of SME implementation, and found out the root cause. The
>>>> above failures are caused by SME code, sme_encrypt_kernel(). In
>>>> sme_encrypt_kernel(), you get a 2M of encryption work area as intermediate
>>>> buffer to encrypt kernel in-place. And the work area is just after _end of
>>>> kernel.
>>>
>>> I remember worrying about something like this back when I was testing the
>>> kexec support. I had come up with a patch to address it, but never got the
>>> time to test and submit it. I've included it here if you'd like to test
>>> it (I haven't done run this patch in quite some time). If it works, we can
>>> think about submitting it.
>>
>> Thanks for your quick response and making this patch, Tom.
>>
>> Tested on a speedway machine, it entered into kernel, but failed in
>> below stage. Tested two times, always happened.
>
> Is this the initial kernel boot or the kexec kernel boot?
>
> It looks like this is related to the initrd/initramfs decryption. Not
> sure what could be happening there. I just tried the patch on my Naples
> system and a 5.2.0-rc3 kernel and have been able to repeatedly kexec boot
> a number of times so far.
>
I tested with the kexec-tools hacked by Baoquan, and both the kexec'd kernel
and the kdump kernel worked well. But Tom's patch only worked for the kexec'd
kernel; the kdump kernel could not boot successfully.
What's the difference between them?
Thanks
Lianbo
On 2019-06-09 at 11:45, lijiang wrote:
> On 2019-06-06 at 00:04, Lendacky, Thomas wrote:
>> On 6/4/19 7:56 PM, Baoquan He wrote:
>>> On 06/04/19 at 03:56pm, Lendacky, Thomas wrote:
>>>> On 6/4/19 8:49 AM, Baoquan He wrote:
>>>>> Hi Tom,
>>>>>
>>>>> Lianbo reported kdump kernel can't boot well with 'nokaslr' added, and
>>>>> have to enable KASLR in kdump kernel to make it boot successfully. This
>>>>> blocked his work on enabling sme for kexec/kdump. And on some machines
>>>>> SME kernel can't boot in 1st kernel.
>>>>>
>>>>> I checked code of SME implementation, and found out the root cause. The
>>>>> above failures are caused by SME code, sme_encrypt_kernel(). In
>>>>> sme_encrypt_kernel(), you get a 2M of encryption work area as intermediate
>>>>> buffer to encrypt kernel in-place. And the work area is just after _end of
>>>>> kernel.
>>>>
>>>> I remember worrying about something like this back when I was testing the
>>>> kexec support. I had come up with a patch to address it, but never got the
>>>> time to test and submit it. I've included it here if you'd like to test
>>>> it (I haven't done run this patch in quite some time). If it works, we can
>>>> think about submitting it.
>>>
>>> Thanks for your quick response and making this patch, Tom.
>>>
>>> Tested on a speedway machine, it entered into kernel, but failed in
>>> below stage. Tested two times, always happened.
>>
>> Is this the initial kernel boot or the kexec kernel boot?
>>
>> It looks like this is related to the initrd/initramfs decryption. Not
>> sure what could be happening there. I just tried the patch on my Naples
>> system and a 5.2.0-rc3 kernel and have been able to repeatedly kexec boot
>> a number of times so far.
>>
>
> I used the hacked kexec-tools(by Baoquan) to test it, the kexec-d kernel and
> kdump kernel worked well. But Tom's patch only worked for the kexec-d kernel,
> and the kdump kernel did not work(kdump kernel could not successfully boot).
> What's the difference between them?
>
After applying Tom's patch, I changed the memory reserved for the crash kernel
to more than 256M, such as crashkernel=320M, 384M, 512M..., and the kdump kernel
works and successfully dumps the vmcore.
But with 256M or less, the kdump kernel always panicked or failed to boot. On
the HP machine, I noticed that it printed OOM, meaning the kdump kernel had too
little memory. I never saw the OOM on the Speedway machine (probably because
earlyprintk doesn't work there and many logs are lost).
After removing the 'CONFIG_DEBUG_INFO' option from .config, I tested again. The
kdump kernel no longer panicked with crashkernel=256M, and it works and
successfully dumps the vmcore on both the HP machine and the Speedway machine.
It seems that too little memory caused the earlier failure in the kdump kernel.
I would suggest posting this patch upstream. Tom, Baoquan and others, what's
your opinion? Do you have any comments?
Thanks.
Lianbo
Hi Tom,
On 06/11/19 at 05:52pm, lijiang wrote:
> After applied Tom's patch, i changed the reserved memory(for crash kernel) to the
> above 256M(>256M), such as crashkernel=320M or 384M,512M..., the kdump kernel can
> work and successfully dump the vmcore.
>
> But the kdump kernel always happened the panic or could not boot successfully in
> the 256M(<= 256M) case, and on HP machine, i noticed that it printed OOM, the kdump
> kernel was too smaller memory. But i never see the OOM on speedway machine(probably
> related to the earlyprintk, it doesn't work and it loses many logs).
>
> After removing the option 'CONFIG_DEBUG_INFO' from .config, i tested again, the kdump
> kernel did not happen the panic in the 256M(crashkernel=256M), the kdump kernel can
> work and succeed to dump the vmcore on HP machine or speedway machine.
>
> It seems that the small memory caused the previous failure in kdump kernel. I would
> suggest to post this patch to upstream. What's your opinion? Tom, Baoquan and other
> people. Or do you have any comment?
As Lianbo said above, the earlier failure in the kdump kernel was caused
by OOM. The log on Speedway was incomplete, so I wasn't sure what had
happened. After investigation, your patch does fix the issue.
Could you post it for review?
Thanks
Baoquan