For AMD machine with SME feature, makedumpfile tools need to know
whether the crash kernel was encrypted or not. So it is necessary
to write the sme_me_mask to vmcoreinfo.
Signed-off-by: Lianbo Jiang <[email protected]>
---
arch/x86/kernel/machine_kexec_64.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 4c8acdfdc5a7..dcfdb64d1097 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -357,6 +357,8 @@ void arch_crash_save_vmcoreinfo(void)
vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
pgtable_l5_enabled());
+ VMCOREINFO_NUMBER(sme_me_mask);
+
#ifdef CONFIG_NUMA
VMCOREINFO_SYMBOL(node_data);
VMCOREINFO_LENGTH(node_data, MAX_NUMNODES);
--
2.17.1
On October 26, 2018 10:36:30 AM GMT+01:00, Lianbo Jiang <[email protected]> wrote:
>For AMD machine with SME feature, makedumpfile tools need to know
>whether the crash kernel was encrypted or not.
Why?
> So it is necessary
>to write the sme_me_mask to vmcoreinfo.
>
>Signed-off-by: Lianbo Jiang <[email protected]>
>---
> arch/x86/kernel/machine_kexec_64.c | 2 ++
> 1 file changed, 2 insertions(+)
>
>diff --git a/arch/x86/kernel/machine_kexec_64.c
>b/arch/x86/kernel/machine_kexec_64.c
>index 4c8acdfdc5a7..dcfdb64d1097 100644
>--- a/arch/x86/kernel/machine_kexec_64.c
>+++ b/arch/x86/kernel/machine_kexec_64.c
>@@ -357,6 +357,8 @@ void arch_crash_save_vmcoreinfo(void)
> vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
> pgtable_l5_enabled());
>
>+ VMCOREINFO_NUMBER(sme_me_mask);
No we're not going to expose a kernel-internal mask to userspace.
If at all needed, add functions to kexec which figure out whether we are encrypted or not and export that result as a kexec variable.
--
Sent from a small device: formatting sux and brevity is inevitable.
在 2018年10月26日 17:43, Boris Petkov 写道:
> On October 26, 2018 10:36:30 AM GMT+01:00, Lianbo Jiang <[email protected]> wrote:
>> For AMD machine with SME feature, makedumpfile tools need to know
>> whether the crash kernel was encrypted or not.
>
> Why?
>
If SME is enabled in the first kernel, the crash kernel's page table(pgd/pud/pmd/pte)
contains the memory encryption mask, so i have to remove the sme mask to obtain the
true physical address when dump vmcore.
>> So it is necessary
>> to write the sme_me_mask to vmcoreinfo.
>>
>> Signed-off-by: Lianbo Jiang <[email protected]>
>> ---
>> arch/x86/kernel/machine_kexec_64.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/x86/kernel/machine_kexec_64.c
>> b/arch/x86/kernel/machine_kexec_64.c
>> index 4c8acdfdc5a7..dcfdb64d1097 100644
>> --- a/arch/x86/kernel/machine_kexec_64.c
>> +++ b/arch/x86/kernel/machine_kexec_64.c
>> @@ -357,6 +357,8 @@ void arch_crash_save_vmcoreinfo(void)
>> vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
>> pgtable_l5_enabled());
>>
>> + VMCOREINFO_NUMBER(sme_me_mask);
>
> No we're not going to expose a kernel-internal mask to userspace.
>
If so, can i set a variable flag for the 'sme_me_mask' and export the
variable flag? For example:
void arch_crash_save_vmcoreinfo(void) {
....
if (sme_active())
sme_enabled = 1;
VMCOREINFO_NUMBER(sme_enabled);
....
}
> If at all needed, add functions to kexec which figure out whether we are encrypted or not and export that result as a kexec variable.
>
>
For AMD machine with the SME feature, the msr 'MSR_K8_SYSCFG' can examine
whether SME is enabled in kernel, but the kexec is also userspace tool,
it has no permission to access the msr.
Furthermore, i also tried to read the "/dev/cpu/cpu[number]/msr", but
the value depends on BIOS's configuration. That is to say, if SME is
set in BIOS, the value of msr is always 0xF40000 whatever the kernel
commandline parameter is "mem_encrypt=on" or "mem_encrypt=off".
If i made a mistake, please help to point it out.
Thanks.
Lianbo
On Fri, 26 Oct 2018 20:32:11 +0800
lijiang <[email protected]> wrote:
>[...]
> For AMD machine with the SME feature, the msr 'MSR_K8_SYSCFG' can examine
> whether SME is enabled in kernel, but the kexec is also userspace tool,
> it has no permission to access the msr.
But we need the MSR value from the panic kernel environment, not while
the production kernel is still running, right?
If so, then this reminds me that I have wanted for a long time to store
more of the hardware state in a vmcore NOTE after a kernel crash ...
control registers, MSRs and whatnot. Of course, this would be a
long-term project, but I wonder what other people think about it in
general.
Just my 2 cents,
Petr T
> Furthermore, i also tried to read the "/dev/cpu/cpu[number]/msr", but
> the value depends on BIOS's configuration. That is to say, if SME is
> set in BIOS, the value of msr is always 0xF40000 whatever the kernel
> commandline parameter is "mem_encrypt=on" or "mem_encrypt=off".
>
> If i made a mistake, please help to point it out.
>
> Thanks.
> Lianbo
>
> _______________________________________________
> kexec mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/kexec
On Fri, Oct 26, 2018 at 08:32:11PM +0800, lijiang wrote:
> If SME is enabled in the first kernel, the crash kernel's page table(pgd/pud/pmd/pte)
> contains the memory encryption mask, so i have to remove the sme mask to obtain the
> true physical address when dump vmcore.
Sorry, I have no clue what makedumpfile does exactly so you'd have to
be more detailed (or wait until I look at it :)). Which kernel accesses
which kernel's pagetable?
/me goes and looks at the makedumpfile's manpage...
Ok, it uses vmcoreinfo to exclude pages which would mean, it accesses
the first kernel's pagetable and traverses it.
Am I close?
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
On Fri, Oct 26, 2018 at 06:24:40PM +0200, Petr Tesarik wrote:
> But we need the MSR value from the panic kernel environment, not while
> the production kernel is still running, right?
Actually, we need only the encryption bit number (and it should be 0
otherwise to denote SME wasn't enabled).
I guess something like
VMCOREINFO_NUMBER(sme_mask);
which gets written by the kexec-ed kernel.
> If so, then this reminds me that I have wanted for a long time to store
> more of the hardware state in a vmcore NOTE after a kernel crash ...
> control registers, MSRs and whatnot. Of course, this would be a
> long-term project, but I wonder what other people think about it in
> general.
I guess that sounds like a good idea - the more relevant hw info for
debugging, the better. Determining the important MSRs to save would need
a bit of a pondering over though. For example, some MSRs are per-core,
some per-socket, etc...
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
在 2018年10月27日 00:35, Borislav Petkov 写道:
> On Fri, Oct 26, 2018 at 08:32:11PM +0800, lijiang wrote:
>> If SME is enabled in the first kernel, the crash kernel's page table(pgd/pud/pmd/pte)
>> contains the memory encryption mask, so i have to remove the sme mask to obtain the
>> true physical address when dump vmcore.
>
> Sorry, I have no clue what makedumpfile does exactly so you'd have to
> be more detailed (or wait until I look at it :)). Which kernel accesses
> which kernel's pagetable?
>
> /me goes and looks at the makedumpfile's manpage...
>
So sorry that i should provide more detail about this.
Thanks a lot for spending time reading the manpage.
> Ok, it uses vmcoreinfo to exclude pages which would mean, it accesses
> the first kernel's pagetable and traverses it.
>
> Am I close?
>
Yes, your explanation is perfect.
Thanks.
Lianbo
在 2018年10月27日 06:25, Borislav Petkov 写道:
> On Fri, Oct 26, 2018 at 06:24:40PM +0200, Petr Tesarik wrote:
>> But we need the MSR value from the panic kernel environment, not while
>> the production kernel is still running, right?
>
> Actually, we need only the encryption bit number (and it should be 0
> otherwise to denote SME wasn't enabled).
>
Thanks for your comment.
For this patch, it really needs only the encryption bit number.
For the AMD machine with SME feature, the OS or HV sets bit 47 of a
physical address to 1 in the page table entry to indicate the page
should be encrypted.
Thanks.
Lianbo
> I guess something like
>
> VMCOREINFO_NUMBER(sme_mask);
>
> which gets written by the kexec-ed kernel.
>
>> If so, then this reminds me that I have wanted for a long time to store
>> more of the hardware state in a vmcore NOTE after a kernel crash ...
>> control registers, MSRs and whatnot. Of course, this would be a
>> long-term project, but I wonder what other people think about it in
>> general.
>
> I guess that sounds like a good idea - the more relevant hw info for
> debugging, the better. Determining the important MSRs to save would need
> a bit of a pondering over though. For example, some MSRs are per-core,
> some per-socket, etc...
>
On 10/27/18 at 12:25am, Borislav Petkov wrote:
> On Fri, Oct 26, 2018 at 06:24:40PM +0200, Petr Tesarik wrote:
> > But we need the MSR value from the panic kernel environment, not while
> > the production kernel is still running, right?
>
> Actually, we need only the encryption bit number (and it should be 0
> otherwise to denote SME wasn't enabled).
>
> I guess something like
>
> VMCOREINFO_NUMBER(sme_mask);
Yes, agree. So sme_me_mask itself or the encryption bit number, both is
fine.
We need read and parse the memory content of crashed kernel which
is mapped to /prov/vmcore by user space makedumpfile. Not only is it
because user space makedumpfile can't access and get if it's sme
enabled from system, we may use cp to copy /proc/vmcore to a file
directly, then analyze it in another compupter. This often happen when
there's something wrong with makedumpfile, we need debug makedumpfile
with the complete copied file.
And only telling whether sme is enabled might be not enough. Since AMD
CPU might extend to support 5-level, then the current fixed bit positon
47 would need be changed.
Thanks
Baoquan
On Sat, Oct 27, 2018 at 04:13:43PM +0800, Baoquan He wrote:
> Yes, agree. So sme_me_mask itself or the encryption bit number, both is
> fine.
You need the encryption bit position and it better be properly formatted
and extracted into a vmcoreinfo-specific variable because we don't
expose arch-specific details like sme_me_mask to the outside.
> we may use cp to copy /proc/vmcore to a file directly, then analyze
> it in another compupter. This often happen when there's something
> wrong with makedumpfile, we need debug makedumpfile with the complete
> copied file.
So for the analyze-on-another-computer scenario you absolutely must copy
anything from the first kernel decrypted because you can't decrypt it on
the other machine.
Which means, in a sensitive environment, you probably should copy and
*encrypt* the dump again with gpg or so.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
On 10/27/18 at 11:10am, Borislav Petkov wrote:
> On Sat, Oct 27, 2018 at 04:13:43PM +0800, Baoquan He wrote:
> > Yes, agree. So sme_me_mask itself or the encryption bit number, both is
> > fine.
>
> You need the encryption bit position and it better be properly formatted
> and extracted into a vmcoreinfo-specific variable because we don't
> expose arch-specific details like sme_me_mask to the outside.
Not very sure about this, we have arch_crash_save_vmcoreinfo() in
arch/x86/kernel/machine_kexec_64.c to export arch-specific stuffs
outside. Is there any special reason about a mask in one architecture
when expose it out? In fact it's only that bit position set mask, no
other information. Surely it's also fine if we export encryption bit
position, then convert it to mask in makedumpfile.
>
> > we may use cp to copy /proc/vmcore to a file directly, then analyze
> > it in another compupter. This often happen when there's something
> > wrong with makedumpfile, we need debug makedumpfile with the complete
> > copied file.
>
> So for the analyze-on-another-computer scenario you absolutely must copy
> anything from the first kernel decrypted because you can't decrypt it on
> the other machine.
>
> Which means, in a sensitive environment, you probably should copy and
> *encrypt* the dump again with gpg or so.
Hmm, no matter it's makedumpfile or cp directly, when we read
/proc/vmcore in kdump kernel, it will call __read_vmcore() or related
functions in fs/proc/vmcore.c. With regarding to SME memory reading,
Lianbo should have handle it in previous SME support in kdump patches.
Just those page tables in crashed kernel are also memory content, they
are decrypted and copied out, while with SME bit position enabled to
indicate encryption, that bit position is added by kernel, now the 1st
kernel is dead, we need wipe it out in makedumpfile when parsing. So
this 'cp', it's still need be done in kdump kernel by 'cp' /proc/vmcore.
Thanks
Baoquan
On Sat, Oct 27, 2018 at 05:39:17PM +0800, Baoquan He wrote:
> Not very sure about this, we have arch_crash_save_vmcoreinfo() in
> arch/x86/kernel/machine_kexec_64.c to export arch-specific stuffs
> outside. Is there any special reason about a mask in one architecture
> when expose it out?
Yes, we don't export random arch-specific details to the outside which
we then cannot change later. So vmcoreinfo needs to define its own.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
On 10/27/18 at 12:12pm, Borislav Petkov wrote:
> On Sat, Oct 27, 2018 at 05:39:17PM +0800, Baoquan He wrote:
> > Not very sure about this, we have arch_crash_save_vmcoreinfo() in
> > arch/x86/kernel/machine_kexec_64.c to export arch-specific stuffs
> > outside. Is there any special reason about a mask in one architecture
> > when expose it out?
>
> Yes, we don't export random arch-specific details to the outside which
> we then cannot change later. So vmcoreinfo needs to define its own.
OK, then it's fine to get the bit number, e.g calling
find_first_bit(sme_me_mask, BITS_PER_LONG), and export it to
vmcoreinfo. Thanks.
On October 27, 2018 12:08:58 PM GMT+01:00, Baoquan He <[email protected]> wrote:
>OK, then it's fine to get the bit number, e.g calling
>find_first_bit(sme_me_mask, BITS_PER_LONG), and export it to
>vmcoreinfo. Thanks.
You can simply assign sme_me_mask for now...
--
Sent from a small device: formatting sux and brevity is inevitable.
在 2018年10月27日 21:17, Boris Petkov 写道:
> On October 27, 2018 12:08:58 PM GMT+01:00, Baoquan He <[email protected]> wrote:
>> OK, then it's fine to get the bit number, e.g calling
>> find_first_bit(sme_me_mask, BITS_PER_LONG), and export it to
>> vmcoreinfo. Thanks.
>
> You can simply assign sme_me_mask for now...
>
Thank you, Boris and Baoquan.
Actually, the value of 'sme_me_mask' is 0x800000000000 when SME is enabled, otherwise it is 0.
That is to say, if the bit 47 is set, the bit number is also 0x800000000000 (1 << 47UL);
At present, they are both the same value.
Regards,
Lianbo
On Sat, Oct 27, 2018 at 10:41:56PM +0800, lijiang wrote:
> Actually, the value of 'sme_me_mask' is 0x800000000000 when SME is
> enabled, otherwise it is 0. That is to say, if the bit 47 is set, the
> bit number is also 0x800000000000 (1 << 47UL);
Yes, and you can simply copy the mask into your variable and export
that. Thinking about it more, though, it might be better if you instead
export a smaller value - not an u64 - and construct the mask in
userspace. I.e., a u8 which should be enough for your current purposes.
I say current because if it turns out we need to export more SME-related
info to userspace, exporting an u64 and then OR-ing in more information
in it would allow that. u8 not so much. So doing something like:
[ misc ][ enc bit ][ other misc SME info ]
0000_0000_0000_0000_1000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000
63 59 55 51 47 43 39 35 31 27 23 19 15 11 7 3
would allow for exporting that additional info.
Especially if we want to use VMCOREINFO for more than kexec things.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
在 2018年10月27日 22:51, Borislav Petkov 写道:
> On Sat, Oct 27, 2018 at 10:41:56PM +0800, lijiang wrote:
>> Actually, the value of 'sme_me_mask' is 0x800000000000 when SME is
>> enabled, otherwise it is 0. That is to say, if the bit 47 is set, the
>> bit number is also 0x800000000000 (1 << 47UL);
>
> Yes, and you can simply copy the mask into your variable and export
> that.
Ok, if you agree with the following patch, i will post it again.
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 4c8acdfdc5a7..de363796ed20 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -29,6 +29,9 @@
#include <asm/setup.h>
#include <asm/set_memory.h>
+u64 sme_mask;
+EXPORT_SYMBOL(sme_mask);
+
#ifdef CONFIG_KEXEC_FILE
const struct kexec_file_ops * const kexec_file_loaders[] = {
&kexec_bzImage64_ops,
@@ -357,6 +360,9 @@ void arch_crash_save_vmcoreinfo(void)
vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
pgtable_l5_enabled());
+ sme_mask = sme_me_mask;
+ VMCOREINFO_NUMBER(sme_mask);
+
#ifdef CONFIG_NUMA
VMCOREINFO_SYMBOL(node_data);
VMCOREINFO_LENGTH(node_data, MAX_NUMNODES);
--
> Thinking about it more, though, it might be better if you instead
> export a smaller value - not an u64 - and construct the mask in
> userspace. I.e., a u8 which should be enough for your current purposes.
>
> I say current because if it turns out we need to export more SME-related
> info to userspace, exporting an u64 and then OR-ing in more information
> in it would allow that. u8 not so much. So doing something like:
>
> [ misc ][ enc bit ][ other misc SME info ]
> 0000_0000_0000_0000_1000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000
> 63 59 55 51 47 43 39 35 31 27 23 19 15 11 7 3
>
> would allow for exporting that additional info.
>
> Especially if we want to use VMCOREINFO for more than kexec things.
>
As you mentioned, if need, the definition of this variable(u64 sme_mask) can be extended
conveniently in the future(The bit might be redefined).
Thanks.
Lianbo
On 10/29/18 at 03:59pm, lijiang wrote:
> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
> index 4c8acdfdc5a7..de363796ed20 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -29,6 +29,9 @@
> #include <asm/setup.h>
> #include <asm/set_memory.h>
>
> +u64 sme_mask;
> +EXPORT_SYMBOL(sme_mask);
Wondering why it is global. Who else will use it?
> +
> #ifdef CONFIG_KEXEC_FILE
> const struct kexec_file_ops * const kexec_file_loaders[] = {
> &kexec_bzImage64_ops,
> @@ -357,6 +360,9 @@ void arch_crash_save_vmcoreinfo(void)
> vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
> pgtable_l5_enabled());
>
> + sme_mask = sme_me_mask;
> + VMCOREINFO_NUMBER(sme_mask);
> +
> #ifdef CONFIG_NUMA
> VMCOREINFO_SYMBOL(node_data);
> VMCOREINFO_LENGTH(node_data, MAX_NUMNODES);
在 2018年10月29日 16:31, Baoquan He 写道:
> On 10/29/18 at 03:59pm, lijiang wrote:
>> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
>> index 4c8acdfdc5a7..de363796ed20 100644
>> --- a/arch/x86/kernel/machine_kexec_64.c
>> +++ b/arch/x86/kernel/machine_kexec_64.c
>> @@ -29,6 +29,9 @@
>> #include <asm/setup.h>
>> #include <asm/set_memory.h>
>>
>> +u64 sme_mask;
>> +EXPORT_SYMBOL(sme_mask);
>
> Wondering why it is global. Who else will use it?
>
In general, no need to export the symbol of local variable.
Ok, i will change it to a local variable and export it.
Thanks.
Lianbo
>> +
>> #ifdef CONFIG_KEXEC_FILE
>> const struct kexec_file_ops * const kexec_file_loaders[] = {
>> &kexec_bzImage64_ops,
>> @@ -357,6 +360,9 @@ void arch_crash_save_vmcoreinfo(void)
>> vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
>> pgtable_l5_enabled());
>>
>> + sme_mask = sme_me_mask;
>> + VMCOREINFO_NUMBER(sme_mask);
>> +
>> #ifdef CONFIG_NUMA
>> VMCOREINFO_SYMBOL(node_data);
>> VMCOREINFO_LENGTH(node_data, MAX_NUMNODES);
On Mon, Oct 29, 2018 at 05:29:16PM +0800, lijiang wrote:
> Ok, i will change it to a local variable and export it.
Why do you have to export it at all?!
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
在 2018年10月29日 17:57, Borislav Petkov 写道:
> On Mon, Oct 29, 2018 at 05:29:16PM +0800, lijiang wrote:
>> Ok, i will change it to a local variable and export it.
>
> Why do you have to export it at all?!
>
I mean that i will write the value of local variable to the vmcoreinfo. For example:
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 4c8acdfdc5a7..3cc975e7ff8f 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -352,11 +352,16 @@ void machine_kexec(struct kimage *image)
void arch_crash_save_vmcoreinfo(void)
{
+ u64 sme_mask = 0;
+
VMCOREINFO_NUMBER(phys_base);
VMCOREINFO_SYMBOL(init_top_pgt);
vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
pgtable_l5_enabled());
+ sme_mask = sme_me_mask;
+ VMCOREINFO_NUMBER(sme_mask);
+
#ifdef CONFIG_NUMA
VMCOREINFO_SYMBOL(node_data);
VMCOREINFO_LENGTH(node_data, MAX_NUMNODES);
--
Because the crash kernel's page table(pgd/pud/pmd/pte) contains
the memory encryption mask(bit 47), makedumpfile needs to remove
the sme mask to obtain the true physical address.
Thanks.
Lianbo
On 10/29/18 at 06:12pm, lijiang wrote:
> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
> index 4c8acdfdc5a7..3cc975e7ff8f 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -352,11 +352,16 @@ void machine_kexec(struct kimage *image)
>
void arch_crash_save_vmcoreinfo(void)
{
+ u64 sme_mask = sme_me_mask;
+
This may looks better.
VMCOREINFO_NUMBER(phys_base);
VMCOREINFO_SYMBOL(init_top_pgt);
vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
pgtable_l5_enabled());
+ VMCOREINFO_NUMBER(sme_mask);
+
#ifdef CONFIG_NUMA
VMCOREINFO_SYMBOL(node_data);
VMCOREINFO_LENGTH(node_data, MAX_NUMNODES);
在 2018年10月29日 19:44, Baoquan He 写道:
> On 10/29/18 at 06:12pm, lijiang wrote:
>> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
>> index 4c8acdfdc5a7..3cc975e7ff8f 100644
>> --- a/arch/x86/kernel/machine_kexec_64.c
>> +++ b/arch/x86/kernel/machine_kexec_64.c
>> @@ -352,11 +352,16 @@ void machine_kexec(struct kimage *image)
>>
> void arch_crash_save_vmcoreinfo(void)
> {
> + u64 sme_mask = sme_me_mask;
> +
>
> This may looks better.
>
Ok, thanks for your comment.
> VMCOREINFO_NUMBER(phys_base);
> VMCOREINFO_SYMBOL(init_top_pgt);
> vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
> pgtable_l5_enabled());
>
> + VMCOREINFO_NUMBER(sme_mask);
> +
> #ifdef CONFIG_NUMA
> VMCOREINFO_SYMBOL(node_data);
> VMCOREINFO_LENGTH(node_data, MAX_NUMNODES);
>
On Mon, Oct 29, 2018 at 09:41:26PM +0800, lijiang wrote:
> > VMCOREINFO_NUMBER(phys_base);
> > VMCOREINFO_SYMBOL(init_top_pgt);
> > vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
> > pgtable_l5_enabled());
> >
> > + VMCOREINFO_NUMBER(sme_mask);
What I'm still missing from the whole fun is where are we documenting
the contents of that vmcoreinfo thing?
If not, we need it documented.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
在 2018年10月29日 21:49, Borislav Petkov 写道:
> On Mon, Oct 29, 2018 at 09:41:26PM +0800, lijiang wrote:
>>> VMCOREINFO_NUMBER(phys_base);
>>> VMCOREINFO_SYMBOL(init_top_pgt);
>>> vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
>>> pgtable_l5_enabled());
>>>
>>> + VMCOREINFO_NUMBER(sme_mask);
>
> What I'm still missing from the whole fun is where are we documenting
> the contents of that vmcoreinfo thing?
>
I'm not sure whether the following document is what you need.
Documentation/ABI/testing/sysfs-kernel-vmcoreinfo
Thanks.
Lianbo
> If not, we need it documented.
>
On 10/30/18 at 12:46pm, lijiang wrote:
> 在 2018年10月29日 21:49, Borislav Petkov 写道:
> > On Mon, Oct 29, 2018 at 09:41:26PM +0800, lijiang wrote:
> >>> VMCOREINFO_NUMBER(phys_base);
> >>> VMCOREINFO_SYMBOL(init_top_pgt);
> >>> vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
> >>> pgtable_l5_enabled());
> >>>
> >>> + VMCOREINFO_NUMBER(sme_mask);
> >
> > What I'm still missing from the whole fun is where are we documenting
> > the contents of that vmcoreinfo thing?
> >
>
> I'm not sure whether the following document is what you need.
> Documentation/ABI/testing/sysfs-kernel-vmcoreinfo
It is not the content, I think it is a good catch from Boris, it would
be good to document the exported things in somewhere eg.
Documentation/kdump/vmcoreinfo.txt
>
> Thanks.
> Lianbo
> > If not, we need it documented.
> >
Thanks
Dave
On Tue, Oct 30, 2018 at 01:09:00PM +0800, Dave Young wrote:
> It is not the content, I think it is a good catch from Boris, it would
> be good to document the exported things in somewhere eg.
> Documentation/kdump/vmcoreinfo.txt
Yes.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
Hi Boris, DaveY and Lianbo,
On 10/30/18 at 10:15am, Borislav Petkov wrote:
> On Tue, Oct 30, 2018 at 01:09:00PM +0800, Dave Young wrote:
> > It is not the content, I think it is a good catch from Boris, it would
> > be good to document the exported things in somewhere eg.
> > Documentation/kdump/vmcoreinfo.txt
For the vmcoreinfo variables document, I personally think it might be
not necessary. The reason is that all the old varialbles are exported
with the name of themselves. We know what they are or what they
represent since they are all kernel symbols or macro. Only this me_mask,
it's a local variable and store the value of sme_me_mask for now, may
store more info later like Petr mentioned, and also will store the memory
encryption information of TME (which is intel's transparent memory encryption).
We can add code comment around to tell these.
Personal opinion.
Thanks
Baoquan
在 2018年10月30日 17:23, Baoquan He 写道:
>
> Hi Boris, DaveY and Lianbo,
>
> On 10/30/18 at 10:15am, Borislav Petkov wrote:
>> On Tue, Oct 30, 2018 at 01:09:00PM +0800, Dave Young wrote:
>>> It is not the content, I think it is a good catch from Boris, it would
>>> be good to document the exported things in somewhere eg.
>>> Documentation/kdump/vmcoreinfo.txt
>
> For the vmcoreinfo variables document, I personally think it might be
> not necessary. The reason is that all the old varialbles are exported
> with the name of themselves. We know what they are or what they
> represent since they are all kernel symbols or macro. Only this me_mask,
> it's a local variable and store the value of sme_me_mask for now, may
> store more info later like Petr mentioned, and also will store the memory
> encryption information of TME (which is intel's transparent memory encryption).
> We can add code comment around to tell these.
>
Thank you, everyone.
I personally agree with Baoquan's opinion. What do you think about? Boris and other reviewer?
If the vmcoreinfo document is necessary, would you like to help me to provide an outline?
I can try my best to write the document.
Anyway, we should make a choice.
Regards,
Lianbo
> Personal opinion.
>
> Thanks
> Baoquan
>
Add Kazu, the makedumpfile maintainer in cc.
On 10/31/18 at 10:26am, lijiang wrote:
> 在 2018年10月30日 17:23, Baoquan He 写道:
> >
> > Hi Boris, DaveY and Lianbo,
> >
> > On 10/30/18 at 10:15am, Borislav Petkov wrote:
> >> On Tue, Oct 30, 2018 at 01:09:00PM +0800, Dave Young wrote:
> >>> It is not the content, I think it is a good catch from Boris, it would
> >>> be good to document the exported things in somewhere eg.
> >>> Documentation/kdump/vmcoreinfo.txt
> >
> > For the vmcoreinfo variables document, I personally think it might be
> > not necessary. The reason is that all the old varialbles are exported
> > with the name of themselves. We know what they are or what they
> > represent since they are all kernel symbols or macro. Only this me_mask,
> > it's a local variable and store the value of sme_me_mask for now, may
> > store more info later like Petr mentioned, and also will store the memory
> > encryption information of TME (which is intel's transparent memory encryption).
> > We can add code comment around to tell these.
> >
> Thank you, everyone.
>
> I personally agree with Baoquan's opinion. What do you think about? Boris and other reviewer?
>
> If the vmcoreinfo document is necessary, would you like to help me to provide an outline?
> I can try my best to write the document.
It is true like what Baoquan said, these information are mainly used by
makedumpfile tool for creating a filtered vmcore. But these vmcoreinfo
hide in a lot of different code paths:
kernel/crash_core.c, printk code, arch code etc.
It is a mist only a few kdump people know them, documenting them will help
people to understand and review. It will also be clearer instead of
digging into code?
The document can briefly explain what is vmcoreinfo, why we need it, and
some background info. Then list the exported values with some
classifying by core kernel, arch related, string or number etc. For
most of them like Baoquan said no need more explanation, but add
explanations for something if needed like this sme mask.
But I think this can be done separately instead of blocking this patch.
Maybe Kazu knows more about it, so I would like to ask Kazu about his opinion
and if he want to do it.
>
> Anyway, we should make a choice.
>
> Regards,
> Lianbo
>
> > Personal opinion.
> >
> > Thanks
> > Baoquan
> >
Thanks
Dave
在 2018年10月31日 10:47, Dave Young 写道:
> Add Kazu, the makedumpfile maintainer in cc.
>
> On 10/31/18 at 10:26am, lijiang wrote:
>> 在 2018年10月30日 17:23, Baoquan He 写道:
>>>
>>> Hi Boris, DaveY and Lianbo,
>>>
>>> On 10/30/18 at 10:15am, Borislav Petkov wrote:
>>>> On Tue, Oct 30, 2018 at 01:09:00PM +0800, Dave Young wrote:
>>>>> It is not the content, I think it is a good catch from Boris, it would
>>>>> be good to document the exported things in somewhere eg.
>>>>> Documentation/kdump/vmcoreinfo.txt
>>>
>>> For the vmcoreinfo variables document, I personally think it might be
>>> not necessary. The reason is that all the old varialbles are exported
>>> with the name of themselves. We know what they are or what they
>>> represent since they are all kernel symbols or macro. Only this me_mask,
>>> it's a local variable and store the value of sme_me_mask for now, may
>>> store more info later like Petr mentioned, and also will store the memory
>>> encryption information of TME (which is intel's transparent memory encryption).
>>> We can add code comment around to tell these.
>>>
>> Thank you, everyone.
>>
>> I personally agree with Baoquan's opinion. What do you think about? Boris and other reviewer?
>>
>> If the vmcoreinfo document is necessary, would you like to help me to provide an outline?
>> I can try my best to write the document.
>
> It is true like what Baoquan said, these information are mainly used by
> makedumpfile tool for creating a filtered vmcore. But these vmcoreinfo
> hide in a lot of different code paths:
> kernel/crash_core.c, printk code, arch code etc.
>
> It is a mist only a few kdump people know them, documenting them will help
> people to understand and review. It will also be clearer instead of
> digging into code?
>
> The document can briefly explain what is vmcoreinfo, why we need it, and
> some background info. Then list the exported values with some
> classifying by core kernel, arch related, string or number etc. For
> most of them like Baoquan said no need more explanation, but add
> explanations for something if needed like this sme mask.
>
> But I think this can be done separately instead of blocking this patch.
For this patch, i think that it had been reached an agreement. So i will
update my patch log and post v2 if there is no objection.
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 4c8acdfdc5a7..ca9bdf46b8e7 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -352,10 +352,23 @@ void machine_kexec(struct kimage *image)
void arch_crash_save_vmcoreinfo(void)
{
+ u64 sme_mask = sme_me_mask;
+
VMCOREINFO_NUMBER(phys_base);
VMCOREINFO_SYMBOL(init_top_pgt);
vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
pgtable_l5_enabled());
+ /*
+ * Currently, the local variable 'sme_mask' stores the value of
+ * sme_me_mask(bit 47), and also write the value of sme_mask to
+ * the vmcoreinfo.
+ * If need, the bit(sme_mask) might be redefined in the future.
+ * For example:
+ * [ misc ][ enc bit ][ other misc SME info ]
+ * 0000_0000_0000_0000_1000_0000_0000_0000_0000_0000_..._0000
+ * 63 59 55 51 47 43 39 35 31 27 ... 3
+ */
+ VMCOREINFO_NUMBER(sme_mask);
#ifdef CONFIG_NUMA
VMCOREINFO_SYMBOL(node_data);
> Maybe Kazu knows more about it, so I would like to ask Kazu about his opinion
> and if he want to do it.
> Ok, thank you, Dave Young.
About the vmcoreinfo document issue, once make a decision, i will try my best to
do this based on your outline.
Regards,
Lianbo
>>
>> Anyway, we should make a choice.
>>
>> Regards,
>> Lianbo
>>
>>> Personal opinion.
>>>
>>> Thanks
>>> Baoquan
>>>
>
> Thanks
> Dave
>
On Wed, Oct 31, 2018 at 10:47:48AM +0800, Dave Young wrote:
> It is a mist only a few kdump people know them, documenting them will help
> people to understand and review. It will also be clearer instead of
> digging into code?
Wholeheartedly agreed. Especially if people start using vmcoreinfo for
other stuff, like live debugging:
https://lkml.kernel.org/r/[email protected]
> The document can briefly explain what is vmcoreinfo, why we need it, and
> some background info. Then list the exported values with some
> classifying by core kernel, arch related, string or number etc. For
> most of them like Baoquan said no need more explanation, but add
> explanations for something if needed like this sme mask.
And add explanations for *all* of them!
If this becomes an API, then it better be documented.
> But I think this can be done separately instead of blocking this patch.
We have three months to the next merge window - I'm sure you guys can do
both. :-)
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On 10/31/2018 6:10 AM, Borislav Petkov wrote:
> On Wed, Oct 31, 2018 at 10:47:48AM +0800, Dave Young wrote:
> > It is a mist only a few kdump people know them, documenting them will help
> > people to understand and review. It will also be clearer instead of
> > digging into code?
>
> Wholeheartedly agreed. Especially if people start using vmcoreinfo for
> other stuff, like live debugging:
>
> https://lkml.kernel.org/r/[email protected]
I also agree. If it can help reviewers and other users to understand
vmcoreinfo and can help itself to become more standard, it would be
better to write it.
One small thing as a vmcoreinfo user (not about the documentation),
I think it might be better to export each information to each variable
separately, not OR-ing them into a variable, because of code simpleness
of both kernel and tools, if there is no limitation in kernel.
Thanks,
Kazu