2020-09-11 19:35:37

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

The following series adds the support for PCID/INVPCID on AMD guests.
While doing it re-structured the vmcb_control_area data structure to
combine all the intercept vectors into one 32 bit array. Makes it easy
for future additions. Re-arranged few pcid related code to make it common
between SVM and VMX.

INVPCID interceptions are added only when the guest is running with shadow
page table enabled. In this case the hypervisor needs to handle the tlbflush
based on the type of invpcid instruction.

For the guests with nested page table (NPT) support, the INVPCID feature
works as running it natively. KVM does not need to do any special handling.

AMD documentation for INVPCID feature is available at "AMD64 Architecture
Programmer’s Manual Volume 2: System Programming, Pub. 24593 Rev. 3.34(or later)"

The documentation can be obtained at the links below:
Link: https://www.amd.com/system/files/TechDocs/24593.pdf
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
---

v6:
One minor change in patch #04. Otherwise same as v5.
Updated all the patches by Reviewed-by.

v5:
https://lore.kernel.org/lkml/159846887637.18873.14677728679411578606.stgit@bmoger-ubuntu/
All the changes are related to rebase.
Aplies cleanly on mainline and kvm(master) tree.
Resending it to get some attention.

v4:
https://lore.kernel.org/lkml/159676101387.12805.18038347880482984693.stgit@bmoger-ubuntu/
1. Changed the functions __set_intercept/__clr_intercept/__is_intercept to
to vmcb_set_intercept/vmcb_clr_intercept/vmcb_is_intercept by passing
vmcb_control_area structure(Suggested by Paolo).
2. Rearranged the commit 7a35e515a7055 ("KVM: VMX: Properly handle kvm_read/write_guest_virt*())
to make it common across both SVM/VMX(Suggested by Jim Mattson).
3. Took care of few other comments from Jim Mattson. Dropped "Reviewed-by"
on few patches which I have changed since v3.

v3:
https://lore.kernel.org/lkml/159597929496.12744.14654593948763926416.stgit@bmoger-ubuntu/
1. Addressing the comments from Jim Mattson. Follow the v2 link below
for the context.
2. Introduced the generic __set_intercept, __clr_intercept and is_intercept
using native __set_bit, clear_bit and test_bit.
3. Combined all the intercepts vectors into single 32 bit array.
4. Removed set_intercept_cr, clr_intercept_cr, set_exception_intercepts,
clr_exception_intercept etc. Used the generic set_intercept and
clr_intercept where applicable.
5. Tested both L1 guest and l2 nested guests.

v2:
https://lore.kernel.org/lkml/159234483706.6230.13753828995249423191.stgit@bmoger-ubuntu/
- Taken care of few comments from Jim Mattson.
- KVM interceptions added only when tdp is off. No interceptions
when tdp is on.
- Reverted the fault priority to original order in VMX.

v1:
https://lore.kernel.org/lkml/159191202523.31436.11959784252237488867.stgit@bmoger-ubuntu/

Babu Moger (12):
KVM: SVM: Introduce vmcb_(set_intercept/clr_intercept/_is_intercept)
KVM: SVM: Change intercept_cr to generic intercepts
KVM: SVM: Change intercept_dr to generic intercepts
KVM: SVM: Modify intercept_exceptions to generic intercepts
KVM: SVM: Modify 64 bit intercept field to two 32 bit vectors
KVM: SVM: Add new intercept vector in vmcb_control_area
KVM: nSVM: Cleanup nested_state data structure
KVM: SVM: Remove set_cr_intercept, clr_cr_intercept and is_cr_intercept
KVM: SVM: Remove set_exception_intercept and clr_exception_intercept
KVM: X86: Rename and move the function vmx_handle_memory_failure to x86.c
KVM: X86: Move handling of INVPCID types to x86
KVM:SVM: Enable INVPCID feature on AMD


arch/x86/include/asm/svm.h | 117 +++++++++++++++++++++++++----------
arch/x86/include/uapi/asm/svm.h | 2 +
arch/x86/kvm/svm/nested.c | 66 +++++++++-----------
arch/x86/kvm/svm/svm.c | 131 ++++++++++++++++++++++++++-------------
arch/x86/kvm/svm/svm.h | 87 +++++++++-----------------
arch/x86/kvm/trace.h | 21 ++++--
arch/x86/kvm/vmx/nested.c | 12 ++--
arch/x86/kvm/vmx/vmx.c | 95 ----------------------------
arch/x86/kvm/vmx/vmx.h | 2 -
arch/x86/kvm/x86.c | 106 ++++++++++++++++++++++++++++++++
arch/x86/kvm/x86.h | 3 +
11 files changed, 364 insertions(+), 278 deletions(-)

--
Signature


2020-09-11 19:35:56

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v6 12/12] KVM:SVM: Enable INVPCID feature on AMD

The following intercept bit has been added to support VMEXIT
for INVPCID instruction:
Code Name Cause
A2h VMEXIT_INVPCID INVPCID instruction

The following bit has been added to the VMCB layout control area
to control intercept of INVPCID:
Byte Offset Bit(s) Function
14h 2 intercept INVPCID

Enable the interceptions when the the guest is running with shadow
page table enabled and handle the tlbflush based on the invpcid
instruction type.

For the guests with nested page table (NPT) support, the INVPCID
feature works as running it natively. KVM does not need to do any
special handling in this case.

AMD documentation for INVPCID feature is available at "AMD64
Architecture Programmer’s Manual Volume 2: System Programming,
Pub. 24593 Rev. 3.34(or later)"

The documentation can be obtained at the links below:
Link: https://www.amd.com/system/files/TechDocs/24593.pdf
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537

Signed-off-by: Babu Moger <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
---
arch/x86/include/uapi/asm/svm.h | 2 ++
arch/x86/kvm/svm/svm.c | 51 +++++++++++++++++++++++++++++++++++++++
2 files changed, 53 insertions(+)

diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index 2e8a30f06c74..522d42dfc28c 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -76,6 +76,7 @@
#define SVM_EXIT_MWAIT_COND 0x08c
#define SVM_EXIT_XSETBV 0x08d
#define SVM_EXIT_RDPRU 0x08e
+#define SVM_EXIT_INVPCID 0x0a2
#define SVM_EXIT_NPF 0x400
#define SVM_EXIT_AVIC_INCOMPLETE_IPI 0x401
#define SVM_EXIT_AVIC_UNACCELERATED_ACCESS 0x402
@@ -171,6 +172,7 @@
{ SVM_EXIT_MONITOR, "monitor" }, \
{ SVM_EXIT_MWAIT, "mwait" }, \
{ SVM_EXIT_XSETBV, "xsetbv" }, \
+ { SVM_EXIT_INVPCID, "invpcid" }, \
{ SVM_EXIT_NPF, "npf" }, \
{ SVM_EXIT_AVIC_INCOMPLETE_IPI, "avic_incomplete_ipi" }, \
{ SVM_EXIT_AVIC_UNACCELERATED_ACCESS, "avic_unaccelerated_access" }, \
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 96617b61e531..5c6b8d0f7628 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -813,6 +813,9 @@ static __init void svm_set_cpu_caps(void)
if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD) ||
boot_cpu_has(X86_FEATURE_AMD_SSBD))
kvm_cpu_cap_set(X86_FEATURE_VIRT_SSBD);
+
+ /* Enable INVPCID feature */
+ kvm_cpu_cap_check_and_set(X86_FEATURE_INVPCID);
}

static __init int svm_hardware_setup(void)
@@ -985,6 +988,21 @@ static u64 svm_write_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
return svm->vmcb->control.tsc_offset;
}

+static void svm_check_invpcid(struct vcpu_svm *svm)
+{
+ /*
+ * Intercept INVPCID instruction only if shadow page table is
+ * enabled. Interception is not required with nested page table
+ * enabled.
+ */
+ if (kvm_cpu_cap_has(X86_FEATURE_INVPCID)) {
+ if (!npt_enabled)
+ svm_set_intercept(svm, INTERCEPT_INVPCID);
+ else
+ svm_clr_intercept(svm, INTERCEPT_INVPCID);
+ }
+}
+
static void init_vmcb(struct vcpu_svm *svm)
{
struct vmcb_control_area *control = &svm->vmcb->control;
@@ -1114,6 +1132,8 @@ static void init_vmcb(struct vcpu_svm *svm)
svm_clr_intercept(svm, INTERCEPT_PAUSE);
}

+ svm_check_invpcid(svm);
+
if (kvm_vcpu_apicv_active(&svm->vcpu))
avic_init_vmcb(svm);

@@ -2730,6 +2750,33 @@ static int mwait_interception(struct vcpu_svm *svm)
return nop_interception(svm);
}

+static int invpcid_interception(struct vcpu_svm *svm)
+{
+ struct kvm_vcpu *vcpu = &svm->vcpu;
+ unsigned long type;
+ gva_t gva;
+
+ if (!guest_cpuid_has(vcpu, X86_FEATURE_INVPCID)) {
+ kvm_queue_exception(vcpu, UD_VECTOR);
+ return 1;
+ }
+
+ /*
+ * For an INVPCID intercept:
+ * EXITINFO1 provides the linear address of the memory operand.
+ * EXITINFO2 provides the contents of the register operand.
+ */
+ type = svm->vmcb->control.exit_info_2;
+ gva = svm->vmcb->control.exit_info_1;
+
+ if (type > 3) {
+ kvm_inject_gp(vcpu, 0);
+ return 1;
+ }
+
+ return kvm_handle_invpcid(vcpu, type, gva);
+}
+
static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
[SVM_EXIT_READ_CR0] = cr_interception,
[SVM_EXIT_READ_CR3] = cr_interception,
@@ -2792,6 +2839,7 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
[SVM_EXIT_MWAIT] = mwait_interception,
[SVM_EXIT_XSETBV] = xsetbv_interception,
[SVM_EXIT_RDPRU] = rdpru_interception,
+ [SVM_EXIT_INVPCID] = invpcid_interception,
[SVM_EXIT_NPF] = npf_interception,
[SVM_EXIT_RSM] = rsm_interception,
[SVM_EXIT_AVIC_INCOMPLETE_IPI] = avic_incomplete_ipi_interception,
@@ -3622,6 +3670,9 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
guest_cpuid_has(&svm->vcpu, X86_FEATURE_NRIPS);

+ /* Check again if INVPCID interception if required */
+ svm_check_invpcid(svm);
+
if (!kvm_vcpu_apicv_active(vcpu))
return;


2020-09-11 19:35:57

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v6 06/12] KVM: SVM: Add new intercept vector in vmcb_control_area

The new intercept bits have been added in vmcb control area to support
few more interceptions. Here are the some of them.
- INTERCEPT_INVLPGB,
- INTERCEPT_INVLPGB_ILLEGAL,
- INTERCEPT_INVPCID,
- INTERCEPT_MCOMMIT,
- INTERCEPT_TLBSYNC,

Add new intercept vector in vmcb_control_area to support these instructions.
Also update kvm_nested_vmrun trace function to support the new addition.

AMD documentation for these instructions is available at "AMD64
Architecture Programmer’s Manual Volume 2: System Programming, Pub. 24593
Rev. 3.34(or later)"

The documentation can be obtained at the links below:
Link: https://www.amd.com/system/files/TechDocs/24593.pdf
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537

Signed-off-by: Babu Moger <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
---
arch/x86/include/asm/svm.h | 7 +++++++
arch/x86/kvm/svm/nested.c | 3 ++-
arch/x86/kvm/trace.h | 13 ++++++++-----
3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 9f0fa02fc838..623c392a55ac 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -16,6 +16,7 @@ enum vector_offset {
EXCEPTION_VECTOR,
INTERCEPT_VECTOR_3,
INTERCEPT_VECTOR_4,
+ INTERCEPT_VECTOR_5,
MAX_VECTORS,
};

@@ -124,6 +125,12 @@ enum {
INTERCEPT_MWAIT_COND,
INTERCEPT_XSETBV,
INTERCEPT_RDPRU,
+ /* Byte offset 014h (Vector 5) */
+ INTERCEPT_INVLPGB = 160,
+ INTERCEPT_INVLPGB_ILLEGAL,
+ INTERCEPT_INVPCID,
+ INTERCEPT_MCOMMIT,
+ INTERCEPT_TLBSYNC,
};


diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 353d550a5bb7..c833f6265b59 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -489,7 +489,8 @@ int nested_svm_vmrun(struct vcpu_svm *svm)
nested_vmcb->control.intercepts[CR_VECTOR] >> 16,
nested_vmcb->control.intercepts[EXCEPTION_VECTOR],
nested_vmcb->control.intercepts[INTERCEPT_VECTOR_3],
- nested_vmcb->control.intercepts[INTERCEPT_VECTOR_4]);
+ nested_vmcb->control.intercepts[INTERCEPT_VECTOR_4],
+ nested_vmcb->control.intercepts[INTERCEPT_VECTOR_5]);

/* Clear internal status */
kvm_clear_exception_queue(&svm->vcpu);
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 6e7262229e6a..11046171b5d9 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -544,9 +544,10 @@ TRACE_EVENT(kvm_nested_vmrun,
);

TRACE_EVENT(kvm_nested_intercepts,
- TP_PROTO(__u16 cr_read, __u16 cr_write, __u32 exceptions, __u32 intercept1,
- __u32 intercept2),
- TP_ARGS(cr_read, cr_write, exceptions, intercept1, intercept2),
+ TP_PROTO(__u16 cr_read, __u16 cr_write, __u32 exceptions,
+ __u32 intercept1, __u32 intercept2, __u32 intercept3),
+ TP_ARGS(cr_read, cr_write, exceptions, intercept1,
+ intercept2, intercept3),

TP_STRUCT__entry(
__field( __u16, cr_read )
@@ -554,6 +555,7 @@ TRACE_EVENT(kvm_nested_intercepts,
__field( __u32, exceptions )
__field( __u32, intercept1 )
__field( __u32, intercept2 )
+ __field( __u32, intercept3 )
),

TP_fast_assign(
@@ -562,12 +564,13 @@ TRACE_EVENT(kvm_nested_intercepts,
__entry->exceptions = exceptions;
__entry->intercept1 = intercept1;
__entry->intercept2 = intercept2;
+ __entry->intercept3 = intercept3;
),

TP_printk("cr_read: %04x cr_write: %04x excp: %08x "
- "intercept1: %08x intercept2: %08x",
+ "intercept1: %08x intercept2: %08x intercept3: %08x",
__entry->cr_read, __entry->cr_write, __entry->exceptions,
- __entry->intercept1, __entry->intercept2)
+ __entry->intercept1, __entry->intercept2, __entry->intercept3)
);
/*
* Tracepoint for #VMEXIT while nested

2020-09-11 19:38:19

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v6 01/12] KVM: SVM: Introduce vmcb_(set_intercept/clr_intercept/_is_intercept)

This is in preparation for the future intercept vector additions.

Add new functions vmcb_set_intercept, vmcb_clr_intercept and vmcb_is_intercept
using kernel APIs __set_bit, __clear_bit and test_bit espectively.

Signed-off-by: Babu Moger <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
---
arch/x86/kvm/svm/svm.h | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index a798e1731709..1cff7644e70b 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -214,6 +214,21 @@ static inline struct vmcb *get_host_vmcb(struct vcpu_svm *svm)
return svm->vmcb;
}

+static inline void vmcb_set_intercept(struct vmcb_control_area *control, int bit)
+{
+ __set_bit(bit, (unsigned long *)&control->intercept_cr);
+}
+
+static inline void vmcb_clr_intercept(struct vmcb_control_area *control, int bit)
+{
+ __clear_bit(bit, (unsigned long *)&control->intercept_cr);
+}
+
+static inline bool vmcb_is_intercept(struct vmcb_control_area *control, int bit)
+{
+ return test_bit(bit, (unsigned long *)&control->intercept_cr);
+}
+
static inline void set_cr_intercept(struct vcpu_svm *svm, int bit)
{
struct vmcb *vmcb = get_host_vmcb(svm);

2020-09-11 19:38:39

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v6 02/12] KVM: SVM: Change intercept_cr to generic intercepts

Change intercept_cr to generic intercepts in vmcb_control_area.
Use the new vmcb_set_intercept, vmcb_clr_intercept and vmcb_is_intercept
where applicable.

Signed-off-by: Babu Moger <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
---
arch/x86/include/asm/svm.h | 42 ++++++++++++++++++++++++++++++++----------
arch/x86/kvm/svm/nested.c | 26 +++++++++++++++++---------
arch/x86/kvm/svm/svm.c | 4 ++--
arch/x86/kvm/svm/svm.h | 12 ++++++------
4 files changed, 57 insertions(+), 27 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 8a1f5382a4ea..d4739f4eae63 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -4,6 +4,37 @@

#include <uapi/asm/svm.h>

+/*
+ * VMCB Control Area intercept bits starting
+ * at Byte offset 000h (Vector 0).
+ */
+
+enum vector_offset {
+ CR_VECTOR = 0,
+ MAX_VECTORS,
+};
+
+enum {
+ /* Byte offset 000h (Vector 0) */
+ INTERCEPT_CR0_READ = 0,
+ INTERCEPT_CR1_READ,
+ INTERCEPT_CR2_READ,
+ INTERCEPT_CR3_READ,
+ INTERCEPT_CR4_READ,
+ INTERCEPT_CR5_READ,
+ INTERCEPT_CR6_READ,
+ INTERCEPT_CR7_READ,
+ INTERCEPT_CR8_READ,
+ INTERCEPT_CR0_WRITE = 16,
+ INTERCEPT_CR1_WRITE,
+ INTERCEPT_CR2_WRITE,
+ INTERCEPT_CR3_WRITE,
+ INTERCEPT_CR4_WRITE,
+ INTERCEPT_CR5_WRITE,
+ INTERCEPT_CR6_WRITE,
+ INTERCEPT_CR7_WRITE,
+ INTERCEPT_CR8_WRITE,
+};

enum {
INTERCEPT_INTR,
@@ -57,7 +88,7 @@ enum {


struct __attribute__ ((__packed__)) vmcb_control_area {
- u32 intercept_cr;
+ u32 intercepts[MAX_VECTORS];
u32 intercept_dr;
u32 intercept_exceptions;
u64 intercept;
@@ -240,15 +271,6 @@ struct __attribute__ ((__packed__)) vmcb {
#define SVM_SELECTOR_READ_MASK SVM_SELECTOR_WRITE_MASK
#define SVM_SELECTOR_CODE_MASK (1 << 3)

-#define INTERCEPT_CR0_READ 0
-#define INTERCEPT_CR3_READ 3
-#define INTERCEPT_CR4_READ 4
-#define INTERCEPT_CR8_READ 8
-#define INTERCEPT_CR0_WRITE (16 + 0)
-#define INTERCEPT_CR3_WRITE (16 + 3)
-#define INTERCEPT_CR4_WRITE (16 + 4)
-#define INTERCEPT_CR8_WRITE (16 + 8)
-
#define INTERCEPT_DR0_READ 0
#define INTERCEPT_DR1_READ 1
#define INTERCEPT_DR2_READ 2
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index fb68467e6049..5f65b759abcb 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -98,6 +98,7 @@ static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu)
void recalc_intercepts(struct vcpu_svm *svm)
{
struct vmcb_control_area *c, *h, *g;
+ unsigned int i;

vmcb_mark_dirty(svm->vmcb, VMCB_INTERCEPTS);

@@ -110,15 +111,17 @@ void recalc_intercepts(struct vcpu_svm *svm)

svm->nested.host_intercept_exceptions = h->intercept_exceptions;

- c->intercept_cr = h->intercept_cr;
+ for (i = 0; i < MAX_VECTORS; i++)
+ c->intercepts[i] = h->intercepts[i];
+
c->intercept_dr = h->intercept_dr;
c->intercept_exceptions = h->intercept_exceptions;
c->intercept = h->intercept;

if (g->int_ctl & V_INTR_MASKING_MASK) {
/* We only want the cr8 intercept bits of L1 */
- c->intercept_cr &= ~(1U << INTERCEPT_CR8_READ);
- c->intercept_cr &= ~(1U << INTERCEPT_CR8_WRITE);
+ vmcb_clr_intercept(c, INTERCEPT_CR8_READ);
+ vmcb_clr_intercept(c, INTERCEPT_CR8_WRITE);

/*
* Once running L2 with HF_VINTR_MASK, EFLAGS.IF does not
@@ -131,7 +134,9 @@ void recalc_intercepts(struct vcpu_svm *svm)
/* We don't want to see VMMCALLs from a nested guest */
c->intercept &= ~(1ULL << INTERCEPT_VMMCALL);

- c->intercept_cr |= g->intercept_cr;
+ for (i = 0; i < MAX_VECTORS; i++)
+ c->intercepts[i] |= g->intercepts[i];
+
c->intercept_dr |= g->intercept_dr;
c->intercept_exceptions |= g->intercept_exceptions;
c->intercept |= g->intercept;
@@ -140,7 +145,11 @@ void recalc_intercepts(struct vcpu_svm *svm)
static void copy_vmcb_control_area(struct vmcb_control_area *dst,
struct vmcb_control_area *from)
{
- dst->intercept_cr = from->intercept_cr;
+ unsigned int i;
+
+ for (i = 0; i < MAX_VECTORS; i++)
+ dst->intercepts[i] = from->intercepts[i];
+
dst->intercept_dr = from->intercept_dr;
dst->intercept_exceptions = from->intercept_exceptions;
dst->intercept = from->intercept;
@@ -487,8 +496,8 @@ int nested_svm_vmrun(struct vcpu_svm *svm)
nested_vmcb->control.event_inj,
nested_vmcb->control.nested_ctl);

- trace_kvm_nested_intercepts(nested_vmcb->control.intercept_cr & 0xffff,
- nested_vmcb->control.intercept_cr >> 16,
+ trace_kvm_nested_intercepts(nested_vmcb->control.intercepts[CR_VECTOR] & 0xffff,
+ nested_vmcb->control.intercepts[CR_VECTOR] >> 16,
nested_vmcb->control.intercept_exceptions,
nested_vmcb->control.intercept);

@@ -765,8 +774,7 @@ static int nested_svm_intercept(struct vcpu_svm *svm)
vmexit = nested_svm_intercept_ioio(svm);
break;
case SVM_EXIT_READ_CR0 ... SVM_EXIT_WRITE_CR8: {
- u32 bit = 1U << (exit_code - SVM_EXIT_READ_CR0);
- if (svm->nested.ctl.intercept_cr & bit)
+ if (vmcb_is_intercept(&svm->nested.ctl, exit_code))
vmexit = NESTED_EXIT_DONE;
break;
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 03dd7bac8034..523936b80dda 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2813,8 +2813,8 @@ static void dump_vmcb(struct kvm_vcpu *vcpu)
}

pr_err("VMCB Control Area:\n");
- pr_err("%-20s%04x\n", "cr_read:", control->intercept_cr & 0xffff);
- pr_err("%-20s%04x\n", "cr_write:", control->intercept_cr >> 16);
+ pr_err("%-20s%04x\n", "cr_read:", control->intercepts[CR_VECTOR] & 0xffff);
+ pr_err("%-20s%04x\n", "cr_write:", control->intercepts[CR_VECTOR] >> 16);
pr_err("%-20s%04x\n", "dr_read:", control->intercept_dr & 0xffff);
pr_err("%-20s%04x\n", "dr_write:", control->intercept_dr >> 16);
pr_err("%-20s%08x\n", "exceptions:", control->intercept_exceptions);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 1cff7644e70b..e775c502a074 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -216,24 +216,24 @@ static inline struct vmcb *get_host_vmcb(struct vcpu_svm *svm)

static inline void vmcb_set_intercept(struct vmcb_control_area *control, int bit)
{
- __set_bit(bit, (unsigned long *)&control->intercept_cr);
+ __set_bit(bit, (unsigned long *)&control->intercepts);
}

static inline void vmcb_clr_intercept(struct vmcb_control_area *control, int bit)
{
- __clear_bit(bit, (unsigned long *)&control->intercept_cr);
+ __clear_bit(bit, (unsigned long *)&control->intercepts);
}

static inline bool vmcb_is_intercept(struct vmcb_control_area *control, int bit)
{
- return test_bit(bit, (unsigned long *)&control->intercept_cr);
+ return test_bit(bit, (unsigned long *)&control->intercepts);
}

static inline void set_cr_intercept(struct vcpu_svm *svm, int bit)
{
struct vmcb *vmcb = get_host_vmcb(svm);

- vmcb->control.intercept_cr |= (1U << bit);
+ vmcb_set_intercept(&vmcb->control, bit);

recalc_intercepts(svm);
}
@@ -242,7 +242,7 @@ static inline void clr_cr_intercept(struct vcpu_svm *svm, int bit)
{
struct vmcb *vmcb = get_host_vmcb(svm);

- vmcb->control.intercept_cr &= ~(1U << bit);
+ vmcb_clr_intercept(&vmcb->control, bit);

recalc_intercepts(svm);
}
@@ -251,7 +251,7 @@ static inline bool is_cr_intercept(struct vcpu_svm *svm, int bit)
{
struct vmcb *vmcb = get_host_vmcb(svm);

- return vmcb->control.intercept_cr & (1U << bit);
+ return vmcb_is_intercept(&vmcb->control, bit);
}

static inline void set_dr_intercepts(struct vcpu_svm *svm)

2020-09-12 17:10:33

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On 11/09/20 21:27, Babu Moger wrote:
> The following series adds the support for PCID/INVPCID on AMD guests.
> While doing it re-structured the vmcb_control_area data structure to
> combine all the intercept vectors into one 32 bit array. Makes it easy
> for future additions. Re-arranged few pcid related code to make it common
> between SVM and VMX.
>
> INVPCID interceptions are added only when the guest is running with shadow
> page table enabled. In this case the hypervisor needs to handle the tlbflush
> based on the type of invpcid instruction.
>
> For the guests with nested page table (NPT) support, the INVPCID feature
> works as running it natively. KVM does not need to do any special handling.
>
> AMD documentation for INVPCID feature is available at "AMD64 Architecture
> Programmer’s Manual Volume 2: System Programming, Pub. 24593 Rev. 3.34(or later)"
>
> The documentation can be obtained at the links below:
> Link: https://www.amd.com/system/files/TechDocs/24593.pdf
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> ---
>
> v6:
> One minor change in patch #04. Otherwise same as v5.
> Updated all the patches by Reviewed-by.
>
> v5:
> https://lore.kernel.org/lkml/159846887637.18873.14677728679411578606.stgit@bmoger-ubuntu/
> All the changes are related to rebase.
> Aplies cleanly on mainline and kvm(master) tree.
> Resending it to get some attention.
>
> v4:
> https://lore.kernel.org/lkml/159676101387.12805.18038347880482984693.stgit@bmoger-ubuntu/
> 1. Changed the functions __set_intercept/__clr_intercept/__is_intercept to
> to vmcb_set_intercept/vmcb_clr_intercept/vmcb_is_intercept by passing
> vmcb_control_area structure(Suggested by Paolo).
> 2. Rearranged the commit 7a35e515a7055 ("KVM: VMX: Properly handle kvm_read/write_guest_virt*())
> to make it common across both SVM/VMX(Suggested by Jim Mattson).
> 3. Took care of few other comments from Jim Mattson. Dropped "Reviewed-by"
> on few patches which I have changed since v3.
>
> v3:
> https://lore.kernel.org/lkml/159597929496.12744.14654593948763926416.stgit@bmoger-ubuntu/
> 1. Addressing the comments from Jim Mattson. Follow the v2 link below
> for the context.
> 2. Introduced the generic __set_intercept, __clr_intercept and is_intercept
> using native __set_bit, clear_bit and test_bit.
> 3. Combined all the intercepts vectors into single 32 bit array.
> 4. Removed set_intercept_cr, clr_intercept_cr, set_exception_intercepts,
> clr_exception_intercept etc. Used the generic set_intercept and
> clr_intercept where applicable.
> 5. Tested both L1 guest and l2 nested guests.
>
> v2:
> https://lore.kernel.org/lkml/159234483706.6230.13753828995249423191.stgit@bmoger-ubuntu/
> - Taken care of few comments from Jim Mattson.
> - KVM interceptions added only when tdp is off. No interceptions
> when tdp is on.
> - Reverted the fault priority to original order in VMX.
>
> v1:
> https://lore.kernel.org/lkml/159191202523.31436.11959784252237488867.stgit@bmoger-ubuntu/
>
> Babu Moger (12):
> KVM: SVM: Introduce vmcb_(set_intercept/clr_intercept/_is_intercept)
> KVM: SVM: Change intercept_cr to generic intercepts
> KVM: SVM: Change intercept_dr to generic intercepts
> KVM: SVM: Modify intercept_exceptions to generic intercepts
> KVM: SVM: Modify 64 bit intercept field to two 32 bit vectors
> KVM: SVM: Add new intercept vector in vmcb_control_area
> KVM: nSVM: Cleanup nested_state data structure
> KVM: SVM: Remove set_cr_intercept, clr_cr_intercept and is_cr_intercept
> KVM: SVM: Remove set_exception_intercept and clr_exception_intercept
> KVM: X86: Rename and move the function vmx_handle_memory_failure to x86.c
> KVM: X86: Move handling of INVPCID types to x86
> KVM:SVM: Enable INVPCID feature on AMD
>
>
> arch/x86/include/asm/svm.h | 117 +++++++++++++++++++++++++----------
> arch/x86/include/uapi/asm/svm.h | 2 +
> arch/x86/kvm/svm/nested.c | 66 +++++++++-----------
> arch/x86/kvm/svm/svm.c | 131 ++++++++++++++++++++++++++-------------
> arch/x86/kvm/svm/svm.h | 87 +++++++++-----------------
> arch/x86/kvm/trace.h | 21 ++++--
> arch/x86/kvm/vmx/nested.c | 12 ++--
> arch/x86/kvm/vmx/vmx.c | 95 ----------------------------
> arch/x86/kvm/vmx/vmx.h | 2 -
> arch/x86/kvm/x86.c | 106 ++++++++++++++++++++++++++++++++
> arch/x86/kvm/x86.h | 3 +
> 11 files changed, 364 insertions(+), 278 deletions(-)
>
> --
> Signature
>

Queued except for patch 9 with only some changes to the names (mostly
replacing "vector" with "word"). It should get to kvm.git on Monday or
Tuesday, please give it a shot.

Thanks!

Paolo

2020-09-14 15:08:58

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On Sat, Sep 12, 2020 at 07:08:05PM +0200, Paolo Bonzini wrote:
> Queued except for patch 9 with only some changes to the names (mostly
> replacing "vector" with "word"). It should get to kvm.git on Monday or
> Tuesday, please give it a shot.

Belated vote for s/vector/word, I found EXCEPTION_VECTOR quite confusing.

2020-09-14 18:34:45

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 9/12/20 12:08 PM, Paolo Bonzini wrote:
> On 11/09/20 21:27, Babu Moger wrote:
>> The following series adds the support for PCID/INVPCID on AMD guests.
>> While doing it re-structured the vmcb_control_area data structure to
>> combine all the intercept vectors into one 32 bit array. Makes it easy
>> for future additions. Re-arranged few pcid related code to make it common
>> between SVM and VMX.
>>
>> INVPCID interceptions are added only when the guest is running with shadow
>> page table enabled. In this case the hypervisor needs to handle the tlbflush
>> based on the type of invpcid instruction.
>>
>> For the guests with nested page table (NPT) support, the INVPCID feature
>> works as running it natively. KVM does not need to do any special handling.
>>
>> AMD documentation for INVPCID feature is available at "AMD64 Architecture
>> Programmer’s Manual Volume 2: System Programming, Pub. 24593 Rev. 3.34(or later)"
>>
>> The documentation can be obtained at the links below:
>> Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.amd.com%2Fsystem%2Ffiles%2FTechDocs%2F24593.pdf&amp;data=02%7C01%7Cbabu.moger%40amd.com%7Cd2bca7c6209743a7fe0e08d8573e70fd%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637355274033139116&amp;sdata=C3EGywJcz3rAPmjckWGKbm7GkHR1Xyrl%2BIL9sEijhcQ%3D&amp;reserved=0
>> Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D206537&amp;data=02%7C01%7Cbabu.moger%40amd.com%7Cd2bca7c6209743a7fe0e08d8573e70fd%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637355274033139116&amp;sdata=29n8WNNpcUgVQRUyxbiSPcWJGTL5uV%2FaHgHXU1b9BjI%3D&amp;reserved=0
>> ---
>>
>> v6:
>> One minor change in patch #04. Otherwise same as v5.
>> Updated all the patches by Reviewed-by.
>>
>> v5:
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F159846887637.18873.14677728679411578606.stgit%40bmoger-ubuntu%2F&amp;data=02%7C01%7Cbabu.moger%40amd.com%7Cd2bca7c6209743a7fe0e08d8573e70fd%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637355274033139116&amp;sdata=D7HvBj6OArmpKsiaZj0Qk3mIHWYOOUN23f53ajhQpOY%3D&amp;reserved=0
>> All the changes are related to rebase.
>> Aplies cleanly on mainline and kvm(master) tree.
>> Resending it to get some attention.
>>
>> v4:
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F159676101387.12805.18038347880482984693.stgit%40bmoger-ubuntu%2F&amp;data=02%7C01%7Cbabu.moger%40amd.com%7Cd2bca7c6209743a7fe0e08d8573e70fd%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637355274033139116&amp;sdata=7og620g0qsxee7Wd60emz5YdbA44Al4tiUJX5n46MhE%3D&amp;reserved=0
>> 1. Changed the functions __set_intercept/__clr_intercept/__is_intercept to
>> to vmcb_set_intercept/vmcb_clr_intercept/vmcb_is_intercept by passing
>> vmcb_control_area structure(Suggested by Paolo).
>> 2. Rearranged the commit 7a35e515a7055 ("KVM: VMX: Properly handle kvm_read/write_guest_virt*())
>> to make it common across both SVM/VMX(Suggested by Jim Mattson).
>> 3. Took care of few other comments from Jim Mattson. Dropped "Reviewed-by"
>> on few patches which I have changed since v3.
>>
>> v3:
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F159597929496.12744.14654593948763926416.stgit%40bmoger-ubuntu%2F&amp;data=02%7C01%7Cbabu.moger%40amd.com%7Cd2bca7c6209743a7fe0e08d8573e70fd%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637355274033139116&amp;sdata=hvPNH827bmo1VL%2F%2FIv%2F%2ByQdVBygOpI1tkgQ6ASf5Wt8%3D&amp;reserved=0
>> 1. Addressing the comments from Jim Mattson. Follow the v2 link below
>> for the context.
>> 2. Introduced the generic __set_intercept, __clr_intercept and is_intercept
>> using native __set_bit, clear_bit and test_bit.
>> 3. Combined all the intercepts vectors into single 32 bit array.
>> 4. Removed set_intercept_cr, clr_intercept_cr, set_exception_intercepts,
>> clr_exception_intercept etc. Used the generic set_intercept and
>> clr_intercept where applicable.
>> 5. Tested both L1 guest and l2 nested guests.
>>
>> v2:
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F159234483706.6230.13753828995249423191.stgit%40bmoger-ubuntu%2F&amp;data=02%7C01%7Cbabu.moger%40amd.com%7Cd2bca7c6209743a7fe0e08d8573e70fd%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637355274033139116&amp;sdata=rP%2BlRJ91tk1VXS3YX8TdP2L9vORiIj8gN3ZZLKIXfeY%3D&amp;reserved=0
>> - Taken care of few comments from Jim Mattson.
>> - KVM interceptions added only when tdp is off. No interceptions
>> when tdp is on.
>> - Reverted the fault priority to original order in VMX.
>>
>> v1:
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F159191202523.31436.11959784252237488867.stgit%40bmoger-ubuntu%2F&amp;data=02%7C01%7Cbabu.moger%40amd.com%7Cd2bca7c6209743a7fe0e08d8573e70fd%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637355274033139116&amp;sdata=IGmv%2BLF60dmGVSCwcTU6sTDMvW1%2BEWUqTA5K%2FAowuxM%3D&amp;reserved=0
>>
>> Babu Moger (12):
>> KVM: SVM: Introduce vmcb_(set_intercept/clr_intercept/_is_intercept)
>> KVM: SVM: Change intercept_cr to generic intercepts
>> KVM: SVM: Change intercept_dr to generic intercepts
>> KVM: SVM: Modify intercept_exceptions to generic intercepts
>> KVM: SVM: Modify 64 bit intercept field to two 32 bit vectors
>> KVM: SVM: Add new intercept vector in vmcb_control_area
>> KVM: nSVM: Cleanup nested_state data structure
>> KVM: SVM: Remove set_cr_intercept, clr_cr_intercept and is_cr_intercept
>> KVM: SVM: Remove set_exception_intercept and clr_exception_intercept
>> KVM: X86: Rename and move the function vmx_handle_memory_failure to x86.c
>> KVM: X86: Move handling of INVPCID types to x86
>> KVM:SVM: Enable INVPCID feature on AMD
>>
>>
>> arch/x86/include/asm/svm.h | 117 +++++++++++++++++++++++++----------
>> arch/x86/include/uapi/asm/svm.h | 2 +
>> arch/x86/kvm/svm/nested.c | 66 +++++++++-----------
>> arch/x86/kvm/svm/svm.c | 131 ++++++++++++++++++++++++++-------------
>> arch/x86/kvm/svm/svm.h | 87 +++++++++-----------------
>> arch/x86/kvm/trace.h | 21 ++++--
>> arch/x86/kvm/vmx/nested.c | 12 ++--
>> arch/x86/kvm/vmx/vmx.c | 95 ----------------------------
>> arch/x86/kvm/vmx/vmx.h | 2 -
>> arch/x86/kvm/x86.c | 106 ++++++++++++++++++++++++++++++++
>> arch/x86/kvm/x86.h | 3 +
>> 11 files changed, 364 insertions(+), 278 deletions(-)
>>
>> --
>> Signature
>>
>
> Queued except for patch 9 with only some changes to the names (mostly
> replacing "vector" with "word"). It should get to kvm.git on Monday or
> Tuesday, please give it a shot.

Thanks Paolo. Tested Guest/nested guest/kvm units tests. Everything works
as expected.

2021-01-19 23:04:43

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On Mon, Sep 14, 2020 at 11:33 AM Babu Moger <[email protected]> wrote:

> Thanks Paolo. Tested Guest/nested guest/kvm units tests. Everything works
> as expected.

Debian 9 does not like this patch set. As a kvm guest, it panics on a
Milan CPU unless booted with 'nopcid'. Gmail mangles long lines, so
please see the attached kernel log snippet. Debian 10 is fine, so I
assume this is a guest bug.


Attachments:
debian9.panic (3.34 kB)

2021-01-19 23:50:48

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 1/19/21 5:01 PM, Jim Mattson wrote:
> On Mon, Sep 14, 2020 at 11:33 AM Babu Moger <[email protected]> wrote:
>
>> Thanks Paolo. Tested Guest/nested guest/kvm units tests. Everything works
>> as expected.
>
> Debian 9 does not like this patch set. As a kvm guest, it panics on a
> Milan CPU unless booted with 'nopcid'. Gmail mangles long lines, so
> please see the attached kernel log snippet. Debian 10 is fine, so I
> assume this is a guest bug.
>

We had an issue with PCID feature earlier. This was showing only with SEV
guests. It is resolved recently. Do you think it is not related that?
Here are the patch set.
https://lore.kernel.org/kvm/160521930597.32054.4906933314022910996.stgit@bmoger-ubuntu/

2021-01-21 01:21:23

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On Tue, Jan 19, 2021 at 3:45 PM Babu Moger <[email protected]> wrote:
>
>
>
> On 1/19/21 5:01 PM, Jim Mattson wrote:
> > On Mon, Sep 14, 2020 at 11:33 AM Babu Moger <[email protected]> wrote:
> >
> >> Thanks Paolo. Tested Guest/nested guest/kvm units tests. Everything works
> >> as expected.
> >
> > Debian 9 does not like this patch set. As a kvm guest, it panics on a
> > Milan CPU unless booted with 'nopcid'. Gmail mangles long lines, so
> > please see the attached kernel log snippet. Debian 10 is fine, so I
> > assume this is a guest bug.
> >
>
> We had an issue with PCID feature earlier. This was showing only with SEV
> guests. It is resolved recently. Do you think it is not related that?
> Here are the patch set.
> https://lore.kernel.org/kvm/160521930597.32054.4906933314022910996.stgit@bmoger-ubuntu/

The Debian 9 release we tested is not an SEV guest.

2021-01-21 02:36:02

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 1/20/21 3:14 PM, Jim Mattson wrote:
> On Tue, Jan 19, 2021 at 3:45 PM Babu Moger <[email protected]> wrote:
>>
>>
>>
>> On 1/19/21 5:01 PM, Jim Mattson wrote:
>>> On Mon, Sep 14, 2020 at 11:33 AM Babu Moger <[email protected]> wrote:
>>>
>>>> Thanks Paolo. Tested Guest/nested guest/kvm units tests. Everything works
>>>> as expected.
>>>
>>> Debian 9 does not like this patch set. As a kvm guest, it panics on a
>>> Milan CPU unless booted with 'nopcid'. Gmail mangles long lines, so
>>> please see the attached kernel log snippet. Debian 10 is fine, so I
>>> assume this is a guest bug.
>>>
>>
>> We had an issue with PCID feature earlier. This was showing only with SEV
>> guests. It is resolved recently. Do you think it is not related that?
>> Here are the patch set.
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fkvm%2F160521930597.32054.4906933314022910996.stgit%40bmoger-ubuntu%2F&amp;data=04%7C01%7Cbabu.moger%40amd.com%7C507e52200cc5478e3b9308d8bd8860bc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637467740754159704%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=Nxhg4Atzr6wZ1L7egyxQVZ%2FmVCE473%2F%2F5Fi0savgUfk%3D&amp;reserved=0
>
> The Debian 9 release we tested is not an SEV guest.
ok. I have not tested Debian 9 before. I will try now. Will let you know
how it goes. thanks

2021-01-21 03:39:45

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 1/20/21 3:45 PM, Babu Moger wrote:
>
>
> On 1/20/21 3:14 PM, Jim Mattson wrote:
>> On Tue, Jan 19, 2021 at 3:45 PM Babu Moger <[email protected]> wrote:
>>>
>>>
>>>
>>> On 1/19/21 5:01 PM, Jim Mattson wrote:
>>>> On Mon, Sep 14, 2020 at 11:33 AM Babu Moger <[email protected]> wrote:
>>>>
>>>>> Thanks Paolo. Tested Guest/nested guest/kvm units tests. Everything works
>>>>> as expected.
>>>>
>>>> Debian 9 does not like this patch set. As a kvm guest, it panics on a
>>>> Milan CPU unless booted with 'nopcid'. Gmail mangles long lines, so
>>>> please see the attached kernel log snippet. Debian 10 is fine, so I
>>>> assume this is a guest bug.
>>>>
>>>
>>> We had an issue with PCID feature earlier. This was showing only with SEV
>>> guests. It is resolved recently. Do you think it is not related that?
>>> Here are the patch set.
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fkvm%2F160521930597.32054.4906933314022910996.stgit%40bmoger-ubuntu%2F&amp;data=04%7C01%7CBabu.Moger%40amd.com%7C562d8b8ea61c41a61fe608d8bda0ae3b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637467845105800757%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=l%2FhF%2FlDAqFN10SzDQ1L05FH1joXrLiuMwHAibBGHOqw%3D&amp;reserved=0
>>
>> The Debian 9 release we tested is not an SEV guest.
> ok. I have not tested Debian 9 before. I will try now. Will let you know
> how it goes. thanks
>

I have reproduced the issue locally. Will investigate. thanks

2021-01-22 01:03:51

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 1/20/21 9:10 PM, Babu Moger wrote:
>
>
> On 1/20/21 3:45 PM, Babu Moger wrote:
>>
>>
>> On 1/20/21 3:14 PM, Jim Mattson wrote:
>>> On Tue, Jan 19, 2021 at 3:45 PM Babu Moger <[email protected]> wrote:
>>>>
>>>>
>>>>
>>>> On 1/19/21 5:01 PM, Jim Mattson wrote:
>>>>> On Mon, Sep 14, 2020 at 11:33 AM Babu Moger <[email protected]> wrote:
>>>>>
>>>>>> Thanks Paolo. Tested Guest/nested guest/kvm units tests. Everything works
>>>>>> as expected.
>>>>>
>>>>> Debian 9 does not like this patch set. As a kvm guest, it panics on a
>>>>> Milan CPU unless booted with 'nopcid'. Gmail mangles long lines, so
>>>>> please see the attached kernel log snippet. Debian 10 is fine, so I
>>>>> assume this is a guest bug.
>>>>>
>>>>
>>>> We had an issue with PCID feature earlier. This was showing only with SEV
>>>> guests. It is resolved recently. Do you think it is not related that?
>>>> Here are the patch set.
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fkvm%2F160521930597.32054.4906933314022910996.stgit%40bmoger-ubuntu%2F&amp;data=04%7C01%7CBabu.Moger%40amd.com%7C3009e5f7f32b4dbd4aee08d8bdc045c9%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637467980841376327%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=%2Bva7em372XD7uaCrSy3UBH6a9n8xaTTXWCAlA3gJX78%3D&amp;reserved=0
>>>
>>> The Debian 9 release we tested is not an SEV guest.
>> ok. I have not tested Debian 9 before. I will try now. Will let you know
>> how it goes. thanks
>>
>
> I have reproduced the issue locally. Will investigate. thanks
>
Few updates.
1. Like Jim mentioned earlier, this appears to be guest kernel issue.
Debian 9 runs the base kernel 4.9.0-14. Problem can be seen consistently
with this kernel.

2. This guest kernel(4.9.0-14) does not like the new feature INVPCID.

3. System comes up fine when invpcid feature is disabled with the boot
parameter "noinvpcid" and also with "nopcid". nopcid disables both pcid
and invpcid.

4. Upgraded the guest kernel to v5.0 and system comes up fine.

5. Also system comes up fine with latest guest kernel 5.11.0-rc4.

I did not bisect further yet.
Babu
Thanks

2021-01-23 01:54:23

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 1/21/21 5:51 PM, Babu Moger wrote:
>
>
> On 1/20/21 9:10 PM, Babu Moger wrote:
>>
>>
>> On 1/20/21 3:45 PM, Babu Moger wrote:
>>>
>>>
>>> On 1/20/21 3:14 PM, Jim Mattson wrote:
>>>> On Tue, Jan 19, 2021 at 3:45 PM Babu Moger <[email protected]> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 1/19/21 5:01 PM, Jim Mattson wrote:
>>>>>> On Mon, Sep 14, 2020 at 11:33 AM Babu Moger <[email protected]> wrote:
>>>>>>
>>>>>>> Thanks Paolo. Tested Guest/nested guest/kvm units tests. Everything works
>>>>>>> as expected.
>>>>>>
>>>>>> Debian 9 does not like this patch set. As a kvm guest, it panics on a
>>>>>> Milan CPU unless booted with 'nopcid'. Gmail mangles long lines, so
>>>>>> please see the attached kernel log snippet. Debian 10 is fine, so I
>>>>>> assume this is a guest bug.
>>>>>>
>>>>>
>>>>> We had an issue with PCID feature earlier. This was showing only with SEV
>>>>> guests. It is resolved recently. Do you think it is not related that?
>>>>> Here are the patch set.
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fkvm%2F160521930597.32054.4906933314022910996.stgit%40bmoger-ubuntu%2F&amp;data=04%7C01%7CBabu.Moger%40amd.com%7C3009e5f7f32b4dbd4aee08d8bdc045c9%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637467980841376327%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=%2Bva7em372XD7uaCrSy3UBH6a9n8xaTTXWCAlA3gJX78%3D&amp;reserved=0
>>>>
>>>> The Debian 9 release we tested is not an SEV guest.
>>> ok. I have not tested Debian 9 before. I will try now. Will let you know
>>> how it goes. thanks
>>>
>>
>> I have reproduced the issue locally. Will investigate. thanks
>>
> Few updates.
> 1. Like Jim mentioned earlier, this appears to be guest kernel issue.
> Debian 9 runs the base kernel 4.9.0-14. Problem can be seen consistently
> with this kernel.
>
> 2. This guest kernel(4.9.0-14) does not like the new feature INVPCID.
>
> 3. System comes up fine when invpcid feature is disabled with the boot
> parameter "noinvpcid" and also with "nopcid". nopcid disables both pcid
> and invpcid.
>
> 4. Upgraded the guest kernel to v5.0 and system comes up fine.
>
> 5. Also system comes up fine with latest guest kernel 5.11.0-rc4.
>
> I did not bisect further yet.
> Babu
> Thanks


Some more update:
System comes up fine with kernel v4.9(checked out on upstream tag v4.9).
So, I am assuming this is something specific to Debian 4.9.0-14 kernel.

Note: I couldn't go back prior versions(v4.8 or earlier) due to compile
issues.
Thanks
Babu

2021-02-24 01:12:01

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

Any updates? What should we be telling customers with Debian 9 guests? :-)

On Fri, Jan 22, 2021 at 5:52 PM Babu Moger <[email protected]> wrote:
>
>
>
> On 1/21/21 5:51 PM, Babu Moger wrote:
> >
> >
> > On 1/20/21 9:10 PM, Babu Moger wrote:
> >>
> >>
> >> On 1/20/21 3:45 PM, Babu Moger wrote:
> >>>
> >>>
> >>> On 1/20/21 3:14 PM, Jim Mattson wrote:
> >>>> On Tue, Jan 19, 2021 at 3:45 PM Babu Moger <[email protected]> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 1/19/21 5:01 PM, Jim Mattson wrote:
> >>>>>> On Mon, Sep 14, 2020 at 11:33 AM Babu Moger <[email protected]> wrote:
> >>>>>>
> >>>>>>> Thanks Paolo. Tested Guest/nested guest/kvm units tests. Everything works
> >>>>>>> as expected.
> >>>>>>
> >>>>>> Debian 9 does not like this patch set. As a kvm guest, it panics on a
> >>>>>> Milan CPU unless booted with 'nopcid'. Gmail mangles long lines, so
> >>>>>> please see the attached kernel log snippet. Debian 10 is fine, so I
> >>>>>> assume this is a guest bug.
> >>>>>>
> >>>>>
> >>>>> We had an issue with PCID feature earlier. This was showing only with SEV
> >>>>> guests. It is resolved recently. Do you think it is not related that?
> >>>>> Here are the patch set.
> >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fkvm%2F160521930597.32054.4906933314022910996.stgit%40bmoger-ubuntu%2F&amp;data=04%7C01%7CBabu.Moger%40amd.com%7C3009e5f7f32b4dbd4aee08d8bdc045c9%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637467980841376327%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=%2Bva7em372XD7uaCrSy3UBH6a9n8xaTTXWCAlA3gJX78%3D&amp;reserved=0
> >>>>
> >>>> The Debian 9 release we tested is not an SEV guest.
> >>> ok. I have not tested Debian 9 before. I will try now. Will let you know
> >>> how it goes. thanks
> >>>
> >>
> >> I have reproduced the issue locally. Will investigate. thanks
> >>
> > Few updates.
> > 1. Like Jim mentioned earlier, this appears to be guest kernel issue.
> > Debian 9 runs the base kernel 4.9.0-14. Problem can be seen consistently
> > with this kernel.
> >
> > 2. This guest kernel(4.9.0-14) does not like the new feature INVPCID.
> >
> > 3. System comes up fine when invpcid feature is disabled with the boot
> > parameter "noinvpcid" and also with "nopcid". nopcid disables both pcid
> > and invpcid.
> >
> > 4. Upgraded the guest kernel to v5.0 and system comes up fine.
> >
> > 5. Also system comes up fine with latest guest kernel 5.11.0-rc4.
> >
> > I did not bisect further yet.
> > Babu
> > Thanks
>
>
> Some more update:
> System comes up fine with kernel v4.9(checked out on upstream tag v4.9).
> So, I am assuming this is something specific to Debian 4.9.0-14 kernel.
>
> Note: I couldn't go back prior versions(v4.8 or earlier) due to compile
> issues.
> Thanks
> Babu
>

2021-02-25 03:52:10

by Moger, Babu

[permalink] [raw]
Subject: RE: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



> -----Original Message-----
> From: Jim Mattson <[email protected]>
> Sent: Tuesday, February 23, 2021 6:14 PM
> To: Moger, Babu <[email protected]>
> Cc: Paolo Bonzini <[email protected]>; Vitaly Kuznetsov
> <[email protected]>; Wanpeng Li <[email protected]>; kvm list
> <[email protected]>; Joerg Roedel <[email protected]>; the arch/x86
> maintainers <[email protected]>; LKML <[email protected]>; Ingo
> Molnar <[email protected]>; Borislav Petkov <[email protected]>; H . Peter Anvin
> <[email protected]>; Thomas Gleixner <[email protected]>; Makarand Sonare
> <[email protected]>; Sean Christopherson <[email protected]>
> Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support
>
> Any updates? What should we be telling customers with Debian 9 guests? :-)

Found another problem with pcid feature om SVM. It is do with CR4 flags
reset during bootup. Problem was showing up with kexec loading on VM.
I am not sure if this is related to that. Will send the patch soon.

>
> On Fri, Jan 22, 2021 at 5:52 PM Babu Moger <[email protected]> wrote:
> >
> >
> >
> > On 1/21/21 5:51 PM, Babu Moger wrote:
> > >
> > >
> > > On 1/20/21 9:10 PM, Babu Moger wrote:
> > >>
> > >>
> > >> On 1/20/21 3:45 PM, Babu Moger wrote:
> > >>>
> > >>>
> > >>> On 1/20/21 3:14 PM, Jim Mattson wrote:
> > >>>> On Tue, Jan 19, 2021 at 3:45 PM Babu Moger <[email protected]>
> wrote:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On 1/19/21 5:01 PM, Jim Mattson wrote:
> > >>>>>> On Mon, Sep 14, 2020 at 11:33 AM Babu Moger
> <[email protected]> wrote:
> > >>>>>>
> > >>>>>>> Thanks Paolo. Tested Guest/nested guest/kvm units tests.
> > >>>>>>> Everything works as expected.
> > >>>>>>
> > >>>>>> Debian 9 does not like this patch set. As a kvm guest, it
> > >>>>>> panics on a Milan CPU unless booted with 'nopcid'. Gmail
> > >>>>>> mangles long lines, so please see the attached kernel log
> > >>>>>> snippet. Debian 10 is fine, so I assume this is a guest bug.
> > >>>>>>
> > >>>>>
> > >>>>> We had an issue with PCID feature earlier. This was showing only
> > >>>>> with SEV guests. It is resolved recently. Do you think it is not related
> that?
> > >>>>> Here are the patch set.
> > >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%
> > >>>>>
> 2Flore.kernel.org%2Fkvm%2F160521930597.32054.4906933314022910996
> > >>>>> .stgit%40bmoger-
> ubuntu%2F&amp;data=04%7C01%7Cbabu.moger%40amd.co
> > >>>>>
> m%7C9558672ca21c4f6c2d5308d8d85919dc%7C3dd8961fe4884e608e11a82d9
> > >>>>>
> 94e183d%7C0%7C0%7C637497224490455772%7CUnknown%7CTWFpbGZsb3d8
> eyJ
> > >>>>>
> WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%
> > >>>>>
> 7C1000&amp;sdata=4QzTNHaYllwPd1U0kumq75dpwp7Rg0ZXsSQ631jMeqs%3D
> &
> > >>>>> amp;reserved=0
> > >>>>
> > >>>> The Debian 9 release we tested is not an SEV guest.
> > >>> ok. I have not tested Debian 9 before. I will try now. Will let
> > >>> you know how it goes. thanks
> > >>>
> > >>
> > >> I have reproduced the issue locally. Will investigate. thanks
> > >>
> > > Few updates.
> > > 1. Like Jim mentioned earlier, this appears to be guest kernel issue.
> > > Debian 9 runs the base kernel 4.9.0-14. Problem can be seen
> > > consistently with this kernel.
> > >
> > > 2. This guest kernel(4.9.0-14) does not like the new feature INVPCID.
> > >
> > > 3. System comes up fine when invpcid feature is disabled with the
> > > boot parameter "noinvpcid" and also with "nopcid". nopcid disables
> > > both pcid and invpcid.
> > >
> > > 4. Upgraded the guest kernel to v5.0 and system comes up fine.
> > >
> > > 5. Also system comes up fine with latest guest kernel 5.11.0-rc4.
> > >
> > > I did not bisect further yet.
> > > Babu
> > > Thanks
> >
> >
> > Some more update:
> > System comes up fine with kernel v4.9(checked out on upstream tag v4.9).
> > So, I am assuming this is something specific to Debian 4.9.0-14 kernel.
> >
> > Note: I couldn't go back prior versions(v4.8 or earlier) due to
> > compile issues.
> > Thanks
> > Babu
> >

2021-03-10 01:13:15

by Moger, Babu

[permalink] [raw]
Subject: RE: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



> -----Original Message-----
> From: Babu Moger <[email protected]>
> Sent: Wednesday, February 24, 2021 4:17 PM
> To: Jim Mattson <[email protected]>
> Cc: Paolo Bonzini <[email protected]>; Vitaly Kuznetsov
> <[email protected]>; Wanpeng Li <[email protected]>; kvm list
> <[email protected]>; Joerg Roedel <[email protected]>; the arch/x86
> maintainers <[email protected]>; LKML <[email protected]>; Ingo
> Molnar <[email protected]>; Borislav Petkov <[email protected]>; H . Peter
> Anvin <[email protected]>; Thomas Gleixner <[email protected]>; Makarand
> Sonare <[email protected]>; Sean Christopherson
> <[email protected]>
> Subject: RE: [PATCH v6 00/12] SVM cleanup and INVPCID feature support
>
>
>
> > -----Original Message-----
> > From: Jim Mattson <[email protected]>
> > Sent: Tuesday, February 23, 2021 6:14 PM
> > To: Moger, Babu <[email protected]>
> > Cc: Paolo Bonzini <[email protected]>; Vitaly Kuznetsov
> > <[email protected]>; Wanpeng Li <[email protected]>; kvm list
> > <[email protected]>; Joerg Roedel <[email protected]>; the arch/x86
> > maintainers <[email protected]>; LKML <[email protected]>;
> > Ingo Molnar <[email protected]>; Borislav Petkov <[email protected]>; H .
> > Peter Anvin <[email protected]>; Thomas Gleixner <[email protected]>;
> > Makarand Sonare <[email protected]>; Sean Christopherson
> > <[email protected]>
> > Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support
> >
> > Any updates? What should we be telling customers with Debian 9 guests?
> > :-)
>
> Found another problem with pcid feature om SVM. It is do with CR4 flags
> reset during bootup. Problem was showing up with kexec loading on VM.
> I am not sure if this is related to that. Will send the patch soon.

Tried to reproduce the problem on upstream kernel versions without any
success. Tried v4.9-0 and v4.8-0. Both these upstream versions are
working fine. So "git bisect" on upstream is ruled out.

Debian kernel 4.10(tag 4.10~rc6-1~exp1) also works fine. It appears the
problem is on Debian 4.9 kernel. I am not sure how to run git bisect on
Debian kernel. Tried anyway. It is pointing to

47811c66356d875e76a6ca637a9d384779a659bb is the first bad commit
commit 47811c66356d875e76a6ca637a9d384779a659bb
Author: Ben Hutchings <[email protected]>
Date: Mon Mar 8 01:17:32 2021 +0100

Prepare to release linux (4.9.258-1).

It does not appear to be the right commit. I am out of ideas now.
hanks
Babu

>
> >
> > On Fri, Jan 22, 2021 at 5:52 PM Babu Moger <[email protected]>
> wrote:
> > >
> > >
> > >
> > > On 1/21/21 5:51 PM, Babu Moger wrote:
> > > >
> > > >
> > > > On 1/20/21 9:10 PM, Babu Moger wrote:
> > > >>
> > > >>
> > > >> On 1/20/21 3:45 PM, Babu Moger wrote:
> > > >>>
> > > >>>
> > > >>> On 1/20/21 3:14 PM, Jim Mattson wrote:
> > > >>>> On Tue, Jan 19, 2021 at 3:45 PM Babu Moger
> <[email protected]>
> > wrote:
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> On 1/19/21 5:01 PM, Jim Mattson wrote:
> > > >>>>>> On Mon, Sep 14, 2020 at 11:33 AM Babu Moger
> > <[email protected]> wrote:
> > > >>>>>>
> > > >>>>>>> Thanks Paolo. Tested Guest/nested guest/kvm units tests.
> > > >>>>>>> Everything works as expected.
> > > >>>>>>
> > > >>>>>> Debian 9 does not like this patch set. As a kvm guest, it
> > > >>>>>> panics on a Milan CPU unless booted with 'nopcid'. Gmail
> > > >>>>>> mangles long lines, so please see the attached kernel log
> > > >>>>>> snippet. Debian 10 is fine, so I assume this is a guest bug.
> > > >>>>>>
> > > >>>>>
> > > >>>>> We had an issue with PCID feature earlier. This was showing
> > > >>>>> only with SEV guests. It is resolved recently. Do you think it
> > > >>>>> is not related
> > that?
> > > >>>>> Here are the patch set.
> > > >>>>>
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2
> > > >>>>> F%25
> > > >>>>>
> > 2Flore.kernel.org%2Fkvm%2F160521930597.32054.4906933314022910996
> > > >>>>> .stgit%40bmoger-
> > ubuntu%2F&amp;data=04%7C01%7Cbabu.moger%40amd.co
> > > >>>>>
> >
> m%7C9558672ca21c4f6c2d5308d8d85919dc%7C3dd8961fe4884e608e11a82d9
> > > >>>>>
> >
> 94e183d%7C0%7C0%7C637497224490455772%7CUnknown%7CTWFpbGZsb3d
> 8
> > eyJ
> > > >>>>>
> >
> WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%
> > > >>>>>
> >
> 7C1000&amp;sdata=4QzTNHaYllwPd1U0kumq75dpwp7Rg0ZXsSQ631jMeqs%
> 3D
> > &
> > > >>>>> amp;reserved=0
> > > >>>>
> > > >>>> The Debian 9 release we tested is not an SEV guest.
> > > >>> ok. I have not tested Debian 9 before. I will try now. Will let
> > > >>> you know how it goes. thanks
> > > >>>
> > > >>
> > > >> I have reproduced the issue locally. Will investigate. thanks
> > > >>
> > > > Few updates.
> > > > 1. Like Jim mentioned earlier, this appears to be guest kernel issue.
> > > > Debian 9 runs the base kernel 4.9.0-14. Problem can be seen
> > > > consistently with this kernel.
> > > >
> > > > 2. This guest kernel(4.9.0-14) does not like the new feature INVPCID.
> > > >
> > > > 3. System comes up fine when invpcid feature is disabled with the
> > > > boot parameter "noinvpcid" and also with "nopcid". nopcid disables
> > > > both pcid and invpcid.
> > > >
> > > > 4. Upgraded the guest kernel to v5.0 and system comes up fine.
> > > >
> > > > 5. Also system comes up fine with latest guest kernel 5.11.0-rc4.
> > > >
> > > > I did not bisect further yet.
> > > > Babu
> > > > Thanks
> > >
> > >
> > > Some more update:
> > > System comes up fine with kernel v4.9(checked out on upstream tag
> v4.9).
> > > So, I am assuming this is something specific to Debian 4.9.0-14 kernel.
> > >
> > > Note: I couldn't go back prior versions(v4.8 or earlier) due to
> > > compile issues.
> > > Thanks
> > > Babu
> > >

2021-03-10 09:10:38

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On 10/03/21 02:04, Babu Moger wrote:
> Debian kernel 4.10(tag 4.10~rc6-1~exp1) also works fine. It appears the
> problem is on Debian 4.9 kernel. I am not sure how to run git bisect on
> Debian kernel. Tried anyway. It is pointing to
>
> 47811c66356d875e76a6ca637a9d384779a659bb is the first bad commit
> commit 47811c66356d875e76a6ca637a9d384779a659bb
> Author: Ben Hutchings<[email protected]>
> Date: Mon Mar 8 01:17:32 2021 +0100
>
> Prepare to release linux (4.9.258-1).
>
> It does not appear to be the right commit. I am out of ideas now.
> hanks

Have you tried bisecting the upstream stable kernels (from 4.9.0 to
4.9.258)?

Paolo

2021-03-10 14:58:03

by Moger, Babu

[permalink] [raw]
Subject: RE: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



> -----Original Message-----
> From: Paolo Bonzini <[email protected]>
> Sent: Wednesday, March 10, 2021 3:09 AM
> To: Moger, Babu <[email protected]>; Jim Mattson
> <[email protected]>
> Cc: Vitaly Kuznetsov <[email protected]>; Wanpeng Li
> <[email protected]>; kvm list <[email protected]>; Joerg Roedel
> <[email protected]>; the arch/x86 maintainers <[email protected]>; LKML <linux-
> [email protected]>; Ingo Molnar <[email protected]>; Borislav Petkov
> <[email protected]>; H . Peter Anvin <[email protected]>; Thomas Gleixner
> <[email protected]>; Makarand Sonare <[email protected]>; Sean
> Christopherson <[email protected]>
> Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support
>
> On 10/03/21 02:04, Babu Moger wrote:
> > Debian kernel 4.10(tag 4.10~rc6-1~exp1) also works fine. It appears
> > the problem is on Debian 4.9 kernel. I am not sure how to run git
> > bisect on Debian kernel. Tried anyway. It is pointing to
> >
> > 47811c66356d875e76a6ca637a9d384779a659bb is the first bad commit
> > commit 47811c66356d875e76a6ca637a9d384779a659bb
> > Author: Ben Hutchings<[email protected]>
> > Date: Mon Mar 8 01:17:32 2021 +0100
> >
> > Prepare to release linux (4.9.258-1).
> >
> > It does not appear to be the right commit. I am out of ideas now.
> > hanks
>
> Have you tried bisecting the upstream stable kernels (from 4.9.0 to 4.9.258)?

I couldn't reproduce the issue on any of the upstream versions. I have
tried v4.9, v4.8 and even on latest v5.11. No issues there.

Jim mentioned Debian 10 which is based of kernel version 4.19 is also
fine. Issue appears to be only affecting Debian 9(kernel v4.9.0-14).

2021-03-10 15:01:01

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 3/10/21 8:55 AM, Babu Moger wrote:
>
>
>> -----Original Message-----
>> From: Paolo Bonzini <[email protected]>
>> Sent: Wednesday, March 10, 2021 3:09 AM
>> To: Moger, Babu <[email protected]>; Jim Mattson
>> <[email protected]>
>> Cc: Vitaly Kuznetsov <[email protected]>; Wanpeng Li
>> <[email protected]>; kvm list <[email protected]>; Joerg Roedel
>> <[email protected]>; the arch/x86 maintainers <[email protected]>; LKML <linux-
>> [email protected]>; Ingo Molnar <[email protected]>; Borislav Petkov
>> <[email protected]>; H . Peter Anvin <[email protected]>; Thomas Gleixner
>> <[email protected]>; Makarand Sonare <[email protected]>; Sean
>> Christopherson <[email protected]>
>> Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support
>>
>> On 10/03/21 02:04, Babu Moger wrote:
>>> Debian kernel 4.10(tag 4.10~rc6-1~exp1) also works fine. It appears
>>> the problem is on Debian 4.9 kernel. I am not sure how to run git
>>> bisect on Debian kernel. Tried anyway. It is pointing to
>>>
>>> 47811c66356d875e76a6ca637a9d384779a659bb is the first bad commit
>>> commit 47811c66356d875e76a6ca637a9d384779a659bb
>>> Author: Ben Hutchings<[email protected]>
>>> Date: Mon Mar 8 01:17:32 2021 +0100
>>>
>>> Prepare to release linux (4.9.258-1).
>>>
>>> It does not appear to be the right commit. I am out of ideas now.
>>> hanks
>>
>> Have you tried bisecting the upstream stable kernels (from 4.9.0 to 4.9.258)?

I couldn't reproduce the issue on any of the upstream versions. I have
tried v4.9, v4.8 and even on latest v5.11. No issues there. There is no
upstream version 4.9.258.

Jim mentioned Debian 10 which is based of kernel version 4.19 is also
fine. Issue appears to be only affecting Debian 9(kernel v4.9.0-14).

2021-03-10 15:34:00

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On 10/03/21 15:58, Babu Moger wrote:
> There is no upstream version 4.9.258.

Sure there is, check out https://cdn.kernel.org/pub/linux/kernel/v4.x/

The easiest way to do it is to bisect on the linux-4.9.y branch of
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git.

paolo

2021-03-11 01:24:02

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 3/10/21 9:31 AM, Paolo Bonzini wrote:
> On 10/03/21 15:58, Babu Moger wrote:
>> There is no upstream version 4.9.258.
>
> Sure there is, check out
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcdn.kernel.org%2Fpub%2Flinux%2Fkernel%2Fv4.x%2F&amp;data=04%7C01%7Cbabu.moger%40amd.com%7Caeefc58416ed490faa7f08d8e3d99d72%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637509871127634618%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=re2Jj5P7IjN2UdmPTjTuKd1KIJLek84KlcnsXxgKYRc%3D&amp;reserved=0
>
>
> The easiest way to do it is to bisect on the linux-4.9.y branch of
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git.
>
Paolo, Thanks for pointing that out. Bisected linux-4.9.y branch.
It is pointing at

# git bisect good
59094faf3f618b2d2b2a45acb916437d611cede6 is the first bad commit
commit 59094faf3f618b2d2b2a45acb916437d611cede6
Author: Borislav Petkov <[email protected]>
Date: Mon Dec 25 13:57:16 2017 +0100

x86/kaiser: Move feature detection up


... before the first use of kaiser_enabled as otherwise funky
things happen:

about to get started...
(XEN) d0v0 Unhandled page fault fault/trap [#14, ec=0000]
(XEN) Pagetable walk from ffff88022a449090:
(XEN) L4[0x110] = 0000000229e0e067 0000000000001e0e
(XEN) L3[0x008] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S: fault at ffff82d08033fd08
entry.o#create_bounce_frame+0x135/0x14d
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.9.1_02-3.21 x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff81007460>]
(XEN) RFLAGS: 0000000000000286 EM: 1 CONTEXT: pv guest (d0v0)

Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

:040000 040000 e56bbc975c3fd1a774b6cc0d6699c0c24e66be1c
e06231dccc8589b4baa0cd5759a37899b7ec71c1 M arch

Not sure what is going on with this commit. Still looking.

2021-03-11 20:09:31

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On Wed, Mar 10, 2021 at 07:21:23PM -0600, Babu Moger wrote:
> # git bisect good
> 59094faf3f618b2d2b2a45acb916437d611cede6 is the first bad commit
> commit 59094faf3f618b2d2b2a45acb916437d611cede6
> Author: Borislav Petkov <[email protected]>
> Date: Mon Dec 25 13:57:16 2017 +0100
>
> x86/kaiser: Move feature detection up

What is the reproducer?

Boot latest 4.9 stable kernel in a SEV guest? Can you send guest
.config?

Upthread is talking about PCID, so I'm guessing host needs to be Zen3
with PCID. Anything else?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-03-11 20:36:29

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On Thu, Mar 11, 2021 at 09:07:55PM +0100, Borislav Petkov wrote:
> On Wed, Mar 10, 2021 at 07:21:23PM -0600, Babu Moger wrote:
> > # git bisect good
> > 59094faf3f618b2d2b2a45acb916437d611cede6 is the first bad commit
> > commit 59094faf3f618b2d2b2a45acb916437d611cede6
> > Author: Borislav Petkov <[email protected]>
> > Date: Mon Dec 25 13:57:16 2017 +0100
> >
> > x86/kaiser: Move feature detection up
>
> What is the reproducer?
>
> Boot latest 4.9 stable kernel in a SEV guest? Can you send guest
> .config?
>
> Upthread is talking about PCID, so I'm guessing host needs to be Zen3
> with PCID. Anything else?

That oops points to:

[ 1.237515] kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!

which is:

local_flush_tlb();
sync_core();
/* Could also do a CLFLUSH here to speed up CPU recovery; but
that causes hangs on some VIA CPUs. */
for (i = 0; i < len; i++)
BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]); <---
local_irq_restore(flags);
return addr;

in text_poke() which basically says that the patching verification
fails. And you have a local_flush_tlb() before that. And with PCID maybe
it is not flushing properly or whatnot.

And deep down in the TLB flushing code, it does:

if (kaiser_enabled)
kaiser_flush_tlb_on_return_to_user();

and that uses PCID...

Anyway, needs more info.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-03-11 20:58:38

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 3/11/21 2:32 PM, Borislav Petkov wrote:
> On Thu, Mar 11, 2021 at 09:07:55PM +0100, Borislav Petkov wrote:
>> On Wed, Mar 10, 2021 at 07:21:23PM -0600, Babu Moger wrote:
>>> # git bisect good
>>> 59094faf3f618b2d2b2a45acb916437d611cede6 is the first bad commit
>>> commit 59094faf3f618b2d2b2a45acb916437d611cede6
>>> Author: Borislav Petkov <[email protected]>
>>> Date: Mon Dec 25 13:57:16 2017 +0100
>>>
>>> x86/kaiser: Move feature detection up
>>
>> What is the reproducer?
>>
>> Boot latest 4.9 stable kernel in a SEV guest? Can you send guest
>> .config?
>>
>> Upthread is talking about PCID, so I'm guessing host needs to be Zen3
>> with PCID. Anything else?
>
> That oops points to:
>
> [ 1.237515] kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
>
> which is:
>
> local_flush_tlb();
> sync_core();
> /* Could also do a CLFLUSH here to speed up CPU recovery; but
> that causes hangs on some VIA CPUs. */
> for (i = 0; i < len; i++)
> BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]); <---
> local_irq_restore(flags);
> return addr;
>
> in text_poke() which basically says that the patching verification
> fails. And you have a local_flush_tlb() before that. And with PCID maybe
> it is not flushing properly or whatnot.
>
> And deep down in the TLB flushing code, it does:
>
> if (kaiser_enabled)
> kaiser_flush_tlb_on_return_to_user();
>
> and that uses PCID...
>
> Anyway, needs more info.

Boris,
It is related PCID and INVPCID combination. Few more details.
1. System comes up fine with "noinvpid". So, it happens when invpcid is
enabled.
2. Host is coming up fine. Problem is with the guest.
3. Problem happens with Debian 9. Debian kernel version is 4.9.0-14.
4. Debian 10 is fine.
5. Upstream kernels are fine. Tried on v5.11 and it is working fine.
6. Git bisect pointed to commit 47811c66356d875e76a6ca637a9d384779a659bb.

Let me know if want me to try something else.
thanks
Babu


2021-03-11 21:25:55

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On Thu, Mar 11, 2021 at 12:32 PM Borislav Petkov <[email protected]> wrote:
>
> On Thu, Mar 11, 2021 at 09:07:55PM +0100, Borislav Petkov wrote:
> > On Wed, Mar 10, 2021 at 07:21:23PM -0600, Babu Moger wrote:
> > > # git bisect good
> > > 59094faf3f618b2d2b2a45acb916437d611cede6 is the first bad commit
> > > commit 59094faf3f618b2d2b2a45acb916437d611cede6
> > > Author: Borislav Petkov <[email protected]>
> > > Date: Mon Dec 25 13:57:16 2017 +0100
> > >
> > > x86/kaiser: Move feature detection up
> >
> > What is the reproducer?
> >
> > Boot latest 4.9 stable kernel in a SEV guest? Can you send guest
> > .config?
> >
> > Upthread is talking about PCID, so I'm guessing host needs to be Zen3
> > with PCID. Anything else?
>
> That oops points to:
>
> [ 1.237515] kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
>
> which is:
>
> local_flush_tlb();
> sync_core();
> /* Could also do a CLFLUSH here to speed up CPU recovery; but
> that causes hangs on some VIA CPUs. */
> for (i = 0; i < len; i++)
> BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]); <---
> local_irq_restore(flags);
> return addr;
>
> in text_poke() which basically says that the patching verification
> fails. And you have a local_flush_tlb() before that. And with PCID maybe
> it is not flushing properly or whatnot.
>
> And deep down in the TLB flushing code, it does:
>
> if (kaiser_enabled)
> kaiser_flush_tlb_on_return_to_user();
>
> and that uses PCID...

I would expect kaiser_enabled to be false (and PCIDs not to be used),
since AMD CPUs are not vulnerable to Meltdown.

2021-03-11 21:38:16

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On Thu, Mar 11, 2021 at 01:23:47PM -0800, Jim Mattson wrote:
> I would expect kaiser_enabled to be false (and PCIDs not to be used),
> since AMD CPUs are not vulnerable to Meltdown.

Ah, of course. The guest dmesg should have

"Kernel/User page tables isolation: disabled."

Lemme see if I can reproduce.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-03-11 21:43:49

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On Thu, Mar 11, 2021 at 02:57:04PM -0600, Babu Moger wrote:
> It is related PCID and INVPCID combination. Few more details.
> 1. System comes up fine with "noinvpid". So, it happens when invpcid is
> enabled.

Which system, host or guest?

> 2. Host is coming up fine. Problem is with the guest.

Aha, guest.

> 3. Problem happens with Debian 9. Debian kernel version is 4.9.0-14.
> 4. Debian 10 is fine.
> 5. Upstream kernels are fine. Tried on v5.11 and it is working fine.
> 6. Git bisect pointed to commit 47811c66356d875e76a6ca637a9d384779a659bb.
>
> Let me know if want me to try something else.

Yes, I assume host has the patches which belong to this thread?

So please describe:

1. host has these patches, cmdline params, etc.
2. guest is a 4.9 kernel, cmdline params, etc.

Please be exact and specific so that I can properly reproduce.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-03-11 21:52:24

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 3/11/21 3:36 PM, Borislav Petkov wrote:
> On Thu, Mar 11, 2021 at 01:23:47PM -0800, Jim Mattson wrote:
>> I would expect kaiser_enabled to be false (and PCIDs not to be used),
>> since AMD CPUs are not vulnerable to Meltdown.
>
> Ah, of course. The guest dmesg should have
>
> "Kernel/User page tables isolation: disabled."

yes.
#dmesg |grep isolation
[ 0.000000] Kernel/User page tables isolation: disabled

>
> Lemme see if I can reproduce.
>

2021-03-11 22:07:36

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 3/11/21 3:40 PM, Borislav Petkov wrote:
> On Thu, Mar 11, 2021 at 02:57:04PM -0600, Babu Moger wrote:
>> It is related PCID and INVPCID combination. Few more details.
>> 1. System comes up fine with "noinvpid". So, it happens when invpcid is
>> enabled.
>
> Which system, host or guest?
>
>> 2. Host is coming up fine. Problem is with the guest.
>
> Aha, guest.
>
>> 3. Problem happens with Debian 9. Debian kernel version is 4.9.0-14.
>> 4. Debian 10 is fine.
>> 5. Upstream kernels are fine. Tried on v5.11 and it is working fine.
>> 6. Git bisect pointed to commit 47811c66356d875e76a6ca637a9d384779a659bb.
>>
>> Let me know if want me to try something else.
>
> Yes, I assume host has the patches which belong to this thread?

Yes. Host has all these patches. Right now I am on 5.12.0-rc2. I just
updated yesterday. I was able to reproduce 5.11 also.


>
> So please describe:
>
> 1. host has these patches, cmdline params, etc.

# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.12.0-rc2+ root=/dev/mapper/rhel-root ro
crashkernel=auto resume=/dev/mapper/rhel-swap rd.lvm.lv=rhel/root
rd.lvm.lv=rhel/swap ras=cec_disable nmi_watchdog=0 warn_ud2=on selinux=0
earlyprintk=serial,ttyS1,115200n8 console=ttyS1,115200n8


> 2. guest is a 4.9 kernel, cmdline params, etc.

I use qemu command line to bring up the guest. Make sure to use "-cpu host".

qemu-system-x86_64 -name deb9 -m 16384 -smp cores=16,threads=1,sockets=1
-hda vdisk-deb.qcow2 -enable-kvm -net nic -net
bridge,br=virbr0,helper=/usr/libexec/qemu-bridge-helper -cpu host,+svm
-nographic


The grub command line looks like this on the guest.

cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.9.0-14-amd64
root=UUID=a0069240-cd60-4795-a391-273266dbae29 ro console=ttyS0,112500n8
earlyprintk

2021-03-11 22:19:43

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 3/11/21 4:04 PM, Babu Moger wrote:
>
>
> On 3/11/21 3:40 PM, Borislav Petkov wrote:
>> On Thu, Mar 11, 2021 at 02:57:04PM -0600, Babu Moger wrote:
>>> It is related PCID and INVPCID combination. Few more details.
>>> 1. System comes up fine with "noinvpid". So, it happens when invpcid is
>>> enabled.
>>
>> Which system, host or guest?
>>
>>> 2. Host is coming up fine. Problem is with the guest.
>>
>> Aha, guest.
>>
>>> 3. Problem happens with Debian 9. Debian kernel version is 4.9.0-14.
>>> 4. Debian 10 is fine.
>>> 5. Upstream kernels are fine. Tried on v5.11 and it is working fine.
>>> 6. Git bisect pointed to commit 47811c66356d875e76a6ca637a9d384779a659bb.
>>>
>>> Let me know if want me to try something else.
>>
>> Yes, I assume host has the patches which belong to this thread?
>
> Yes. Host has all these patches. Right now I am on 5.12.0-rc2. I just
> updated yesterday. I was able to reproduce 5.11 also.
>
>
>>
>> So please describe:
>>
>> 1. host has these patches, cmdline params, etc.

My host is
# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.3 (Ootpa)
# uname -r
5.12.0-rc2+


>
> # cat /proc/cmdline
> BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.12.0-rc2+ root=/dev/mapper/rhel-root ro
> crashkernel=auto resume=/dev/mapper/rhel-swap rd.lvm.lv=rhel/root
> rd.lvm.lv=rhel/swap ras=cec_disable nmi_watchdog=0 warn_ud2=on selinux=0
> earlyprintk=serial,ttyS1,115200n8 console=ttyS1,115200n8
>
>
>> 2. guest is a 4.9 kernel, cmdline params, etc.
>
> I use qemu command line to bring up the guest. Make sure to use "-cpu host".
>
> qemu-system-x86_64 -name deb9 -m 16384 -smp cores=16,threads=1,sockets=1
> -hda vdisk-deb.qcow2 -enable-kvm -net nic -net
> bridge,br=virbr0,helper=/usr/libexec/qemu-bridge-helper -cpu host,+svm
> -nographic
>
>
> The grub command line looks like this on the guest.
>
> cat /proc/cmdline
> BOOT_IMAGE=/boot/vmlinuz-4.9.0-14-amd64
> root=UUID=a0069240-cd60-4795-a391-273266dbae29 ro console=ttyS0,112500n8
> earlyprintk
>

2021-03-11 23:54:05

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On Thu, Mar 11, 2021 at 04:15:37PM -0600, Babu Moger wrote:
> My host is
> # cat /etc/redhat-release
> Red Hat Enterprise Linux release 8.3 (Ootpa)
> # uname -r
> 5.12.0-rc2+

Please upload host and guest .config.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-03-12 14:57:53

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



On 3/11/21 5:52 PM, Borislav Petkov wrote:
> On Thu, Mar 11, 2021 at 04:15:37PM -0600, Babu Moger wrote:
>> My host is
>> # cat /etc/redhat-release
>> Red Hat Enterprise Linux release 8.3 (Ootpa)
>> # uname -r
>> 5.12.0-rc2+
>
> Please upload host and guest .config.

Attached. Thanks


Attachments:
config-host-v5.12 (206.05 kB)
config-guest-4.9.0-14 (182.06 kB)
Download all attachments

2021-03-12 16:16:23

by Moger, Babu

[permalink] [raw]
Subject: RE: [PATCH v6 00/12] SVM cleanup and INVPCID feature support



> -----Original Message-----
> From: Borislav Petkov <[email protected]>
> Sent: Thursday, March 11, 2021 5:52 PM
> To: Moger, Babu <[email protected]>
> Cc: Paolo Bonzini <[email protected]>; Jim Mattson
> <[email protected]>; Vitaly Kuznetsov <[email protected]>; Wanpeng
> Li <[email protected]>; kvm list <[email protected]>; Joerg Roedel
> <[email protected]>; the arch/x86 maintainers <[email protected]>; LKML <linux-
> [email protected]>; Ingo Molnar <[email protected]>; H . Peter Anvin
> <[email protected]>; Thomas Gleixner <[email protected]>; Makarand Sonare
> <[email protected]>; Sean Christopherson <[email protected]>
> Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support
>
> On Thu, Mar 11, 2021 at 04:15:37PM -0600, Babu Moger wrote:
> > My host is
> > # cat /etc/redhat-release
> > Red Hat Enterprise Linux release 8.3 (Ootpa) # uname -r 5.12.0-rc2+
>
> Please upload host and guest .config.

Host config.

https://pastebin.com/wuLzEqZr

Guest config

https://pastebin.com/mvzEEq6R

2021-03-25 03:26:52

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

Ok,

some more experimenting Babu and I did lead us to:

---
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index f5ca15622dc9..259aa4889cad 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned long addr)
*/
if (kaiser_enabled)
invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
+ else
+ asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+
invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
}

applied on the guest kernel which fixes the issue. And let me add Hugh
who did that PCID stuff at the time. So lemme summarize for Hugh and to
ask him nicely to sanity-check me. :-)

Basically, you have an AMD host which supports PCID and INVPCID and you
boot on it a 4.9 guest. It explodes like the panic below.

What fixes it is this:

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index f5ca15622dc9..259aa4889cad 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned long addr)
*/
if (kaiser_enabled)
invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
+ else
+ asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+
invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
}

---

and the reason why it does, IMHO, is because on AMD, kaiser_enabled is
false because AMD is not affected by Meltdown, which means, there's no
user/kernel pagetables split.

And that also means, you have global TLB entries which means that if you
look at that __native_flush_tlb_single() function, it needs to flush
global TLB entries on CPUs with X86_FEATURE_INVPCID_SINGLE by doing an
INVLPG in the kaiser_enabled=0 case. Errgo, the above hunk.

But I might be completely off here thus this note...

Thoughts?

Thx.


[ 1.235726] ------------[ cut here ]------------
[ 1.237515] kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
[ 1.240926] invalid opcode: 0000 [#1] SMP
[ 1.243301] Modules linked in:
[ 1.244585] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 Debian 4.9.228-1
[ 1.247657] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ 1.251249] task: ffff909363e94040 task.stack: ffffa41bc0194000
[ 1.253519] RIP: 0010:[<ffffffff8fa2e40c>] [<ffffffff8fa2e40c>] text_poke+0x18c/0x240
[ 1.256593] RSP: 0018:ffffa41bc0197d90 EFLAGS: 00010096
[ 1.258657] RAX: 000000000000000f RBX: 0000000001020800 RCX: 00000000feda3203
[ 1.261388] RDX: 00000000178bfbff RSI: 0000000000000000 RDI: ffffffffff57a000
[ 1.264168] RBP: ffffffff8fbd3eca R08: 0000000000000000 R09: 0000000000000003
[ 1.266983] R10: 0000000000000003 R11: 0000000000000112 R12: 0000000000000001
[ 1.269702] R13: ffffa41bc0197dcf R14: 0000000000000286 R15: ffffed1c40407500
[ 1.272572] FS: 0000000000000000(0000) GS:ffff909366300000(0000) knlGS:0000000000000000
[ 1.275791] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.278032] CR2: 0000000000000000 CR3: 0000000010c08000 CR4: 00000000003606f0
[ 1.280815] Stack:
[ 1.281630] ffffffff8fbd3eca 0000000000000005 ffffa41bc0197e03 ffffffff8fbd3ecb
[ 1.284660] 0000000000000000 0000000000000000 ffffffff8fa2e835 ccffffff8fad4326
[ 1.287729] 1ccd0231874d55d3 ffffffff8fbd3eca ffffa41bc0197e03 ffffffff90203844
[ 1.290852] Call Trace:
[ 1.291782] [<ffffffff8fbd3eca>] ? swap_entry_free+0x12a/0x300
[ 1.294900] [<ffffffff8fbd3ecb>] ? swap_entry_free+0x12b/0x300
[ 1.297267] [<ffffffff8fa2e835>] ? text_poke_bp+0x55/0xe0
[ 1.299473] [<ffffffff8fbd3eca>] ? swap_entry_free+0x12a/0x300
[ 1.301896] [<ffffffff8fa2b64c>] ? arch_jump_label_transform+0x9c/0x120
[ 1.304557] [<ffffffff9073e81f>] ? set_debug_rodata+0xc/0xc
[ 1.306790] [<ffffffff8fb81d92>] ? __jump_label_update+0x72/0x80
[ 1.309255] [<ffffffff8fb8206f>] ? static_key_slow_inc+0x8f/0xa0
[ 1.311680] [<ffffffff8fbd7a57>] ? frontswap_register_ops+0x107/0x1d0
[ 1.314281] [<ffffffff9077078c>] ? init_zswap+0x282/0x3f6
[ 1.316547] [<ffffffff9077050a>] ? init_frontswap+0x8c/0x8c
[ 1.318784] [<ffffffff8fa0223e>] ? do_one_initcall+0x4e/0x180
[ 1.321067] [<ffffffff9073e81f>] ? set_debug_rodata+0xc/0xc
[ 1.323366] [<ffffffff9073f08d>] ? kernel_init_freeable+0x16b/0x1ec
[ 1.325873] [<ffffffff90011d50>] ? rest_init+0x80/0x80
[ 1.327989] [<ffffffff90011d5a>] ? kernel_init+0xa/0x100
[ 1.330092] [<ffffffff9001f424>] ? ret_from_fork+0x44/0x70
[ 1.332311] Code: 00 0f a2 4d 85 e4 74 4a 0f b6 45 00 41 38 45 00 75 19 31 c0 83 c0 01 48 63 d0 49 39 d4 76 33 41 0f b6 4c 15 00 38 4c 15 00 74 e9 <0f> 0b 48 89 ef e8 da d6 19 00 48 8d bd 00 10 00 00 48 89 c3 e8
[ 1.342818] RIP [<ffffffff8fa2e40c>] text_poke+0x18c/0x240
[ 1.345859] RSP <ffffa41bc0197d90>
[ 1.347285] ---[ end trace 0a1c5ab5eb16de89 ]---
[ 1.349169] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 1.349169]
[ 1.352885] Kernel Offset: 0xea00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 1.357039] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 1.357039]


--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-03-25 03:30:43

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On 24/03/21 22:21, Borislav Petkov wrote:
> if (kaiser_enabled)
> invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> + else
> + asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> +
> invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> }

I think the kernel ASID flush can also be moved under the "if"?

> and the reason why it does, IMHO, is because on AMD, kaiser_enabled is
> false because AMD is not affected by Meltdown, which means, there's no
> user/kernel pagetables split.
>
> And that also means, you have global TLB entries which means that if you
> look at that __native_flush_tlb_single() function, it needs to flush
> global TLB entries on CPUs with X86_FEATURE_INVPCID_SINGLE by doing an
> INVLPG in the kaiser_enabled=0 case. Errgo, the above hunk.

Makes sense.

Paolo

2021-03-25 03:37:02

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On Wed, 24 Mar 2021, Borislav Petkov wrote:

> Ok,
>
> some more experimenting Babu and I did lead us to:
>
> ---
> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index f5ca15622dc9..259aa4889cad 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned long addr)
> */
> if (kaiser_enabled)
> invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> + else
> + asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> +
> invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> }
>
> applied on the guest kernel which fixes the issue. And let me add Hugh
> who did that PCID stuff at the time. So lemme summarize for Hugh and to
> ask him nicely to sanity-check me. :-)

Just a brief interim note to assure you that I'm paying attention,
but wow, it's a long time since I gave any thought down here!
Trying to page it all back in...

I see no harm in your workaround if it works, but it's not as if
this is a previously untried path: so I'm suspicious how an issue
here with Globals could have gone unnoticed for so long, and need
to understand it better.

Hugh

>
> Basically, you have an AMD host which supports PCID and INVPCID and you
> boot on it a 4.9 guest. It explodes like the panic below.
>
> What fixes it is this:
>
> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index f5ca15622dc9..259aa4889cad 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned long addr)
> */
> if (kaiser_enabled)
> invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> + else
> + asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> +
> invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> }
>
> ---
>
> and the reason why it does, IMHO, is because on AMD, kaiser_enabled is
> false because AMD is not affected by Meltdown, which means, there's no
> user/kernel pagetables split.
>
> And that also means, you have global TLB entries which means that if you
> look at that __native_flush_tlb_single() function, it needs to flush
> global TLB entries on CPUs with X86_FEATURE_INVPCID_SINGLE by doing an
> INVLPG in the kaiser_enabled=0 case. Errgo, the above hunk.
>
> But I might be completely off here thus this note...
>
> Thoughts?
>
> Thx.
>
>
> [ 1.235726] ------------[ cut here ]------------
> [ 1.237515] kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
> [ 1.240926] invalid opcode: 0000 [#1] SMP
> [ 1.243301] Modules linked in:
> [ 1.244585] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 Debian 4.9.228-1
> [ 1.247657] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> [ 1.251249] task: ffff909363e94040 task.stack: ffffa41bc0194000
> [ 1.253519] RIP: 0010:[<ffffffff8fa2e40c>] [<ffffffff8fa2e40c>] text_poke+0x18c/0x240
> [ 1.256593] RSP: 0018:ffffa41bc0197d90 EFLAGS: 00010096
> [ 1.258657] RAX: 000000000000000f RBX: 0000000001020800 RCX: 00000000feda3203
> [ 1.261388] RDX: 00000000178bfbff RSI: 0000000000000000 RDI: ffffffffff57a000
> [ 1.264168] RBP: ffffffff8fbd3eca R08: 0000000000000000 R09: 0000000000000003
> [ 1.266983] R10: 0000000000000003 R11: 0000000000000112 R12: 0000000000000001
> [ 1.269702] R13: ffffa41bc0197dcf R14: 0000000000000286 R15: ffffed1c40407500
> [ 1.272572] FS: 0000000000000000(0000) GS:ffff909366300000(0000) knlGS:0000000000000000
> [ 1.275791] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1.278032] CR2: 0000000000000000 CR3: 0000000010c08000 CR4: 00000000003606f0
> [ 1.280815] Stack:
> [ 1.281630] ffffffff8fbd3eca 0000000000000005 ffffa41bc0197e03 ffffffff8fbd3ecb
> [ 1.284660] 0000000000000000 0000000000000000 ffffffff8fa2e835 ccffffff8fad4326
> [ 1.287729] 1ccd0231874d55d3 ffffffff8fbd3eca ffffa41bc0197e03 ffffffff90203844
> [ 1.290852] Call Trace:
> [ 1.291782] [<ffffffff8fbd3eca>] ? swap_entry_free+0x12a/0x300
> [ 1.294900] [<ffffffff8fbd3ecb>] ? swap_entry_free+0x12b/0x300
> [ 1.297267] [<ffffffff8fa2e835>] ? text_poke_bp+0x55/0xe0
> [ 1.299473] [<ffffffff8fbd3eca>] ? swap_entry_free+0x12a/0x300
> [ 1.301896] [<ffffffff8fa2b64c>] ? arch_jump_label_transform+0x9c/0x120
> [ 1.304557] [<ffffffff9073e81f>] ? set_debug_rodata+0xc/0xc
> [ 1.306790] [<ffffffff8fb81d92>] ? __jump_label_update+0x72/0x80
> [ 1.309255] [<ffffffff8fb8206f>] ? static_key_slow_inc+0x8f/0xa0
> [ 1.311680] [<ffffffff8fbd7a57>] ? frontswap_register_ops+0x107/0x1d0
> [ 1.314281] [<ffffffff9077078c>] ? init_zswap+0x282/0x3f6
> [ 1.316547] [<ffffffff9077050a>] ? init_frontswap+0x8c/0x8c
> [ 1.318784] [<ffffffff8fa0223e>] ? do_one_initcall+0x4e/0x180
> [ 1.321067] [<ffffffff9073e81f>] ? set_debug_rodata+0xc/0xc
> [ 1.323366] [<ffffffff9073f08d>] ? kernel_init_freeable+0x16b/0x1ec
> [ 1.325873] [<ffffffff90011d50>] ? rest_init+0x80/0x80
> [ 1.327989] [<ffffffff90011d5a>] ? kernel_init+0xa/0x100
> [ 1.330092] [<ffffffff9001f424>] ? ret_from_fork+0x44/0x70
> [ 1.332311] Code: 00 0f a2 4d 85 e4 74 4a 0f b6 45 00 41 38 45 00 75 19 31 c0 83 c0 01 48 63 d0 49 39 d4 76 33 41 0f b6 4c 15 00 38 4c 15 00 74 e9 <0f> 0b 48 89 ef e8 da d6 19 00 48 8d bd 00 10 00 00 48 89 c3 e8
> [ 1.342818] RIP [<ffffffff8fa2e40c>] text_poke+0x18c/0x240
> [ 1.345859] RSP <ffffa41bc0197d90>
> [ 1.347285] ---[ end trace 0a1c5ab5eb16de89 ]---
> [ 1.349169] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [ 1.349169]
> [ 1.352885] Kernel Offset: 0xea00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [ 1.357039] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [ 1.357039]
>
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
>

2021-03-25 03:42:56

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On Wed, 24 Mar 2021, Hugh Dickins wrote:
> On Wed, 24 Mar 2021, Borislav Petkov wrote:
>
> > Ok,
> >
> > some more experimenting Babu and I did lead us to:
> >
> > ---
> > diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> > index f5ca15622dc9..259aa4889cad 100644
> > --- a/arch/x86/include/asm/tlbflush.h
> > +++ b/arch/x86/include/asm/tlbflush.h
> > @@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned long addr)
> > */
> > if (kaiser_enabled)
> > invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> > + else
> > + asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> > +
> > invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> > }
> >
> > applied on the guest kernel which fixes the issue. And let me add Hugh
> > who did that PCID stuff at the time. So lemme summarize for Hugh and to
> > ask him nicely to sanity-check me. :-)
>
> Just a brief interim note to assure you that I'm paying attention,
> but wow, it's a long time since I gave any thought down here!
> Trying to page it all back in...
>
> I see no harm in your workaround if it works, but it's not as if
> this is a previously untried path: so I'm suspicious how an issue
> here with Globals could have gone unnoticed for so long, and need
> to understand it better.

Right, after looking into it more, I completely agree with you:
the Kaiser series (in both 4.4-stable and 4.9-stable) was simply
wrong to lose that invlpg - fine in the kaiser case when we don't
enable Globals at all, but plain wrong in the !kaiser_enabled case.
One way or another, we have somehow got away with it for three years.

I do agree with Paolo that the PCID_ASID_KERN flush would be better
moved under the "if (kaiser_enabled)" now. (And if this were ongoing
development, I'd want to rewrite the function altogether: but no,
these old stable trees are not the place for that.)

Boris, may I leave both -stable fixes to you?
Let me know if you'd prefer me to clean up my mess.

Thanks a lot for tracking this down,
Hugh

> >
> > Basically, you have an AMD host which supports PCID and INVPCID and you
> > boot on it a 4.9 guest. It explodes like the panic below.
> >
> > What fixes it is this:
> >
> > diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> > index f5ca15622dc9..259aa4889cad 100644
> > --- a/arch/x86/include/asm/tlbflush.h
> > +++ b/arch/x86/include/asm/tlbflush.h
> > @@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned long addr)
> > */
> > if (kaiser_enabled)
> > invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> > + else
> > + asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> > +
> > invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> > }
> >
> > ---
> >
> > and the reason why it does, IMHO, is because on AMD, kaiser_enabled is
> > false because AMD is not affected by Meltdown, which means, there's no
> > user/kernel pagetables split.
> >
> > And that also means, you have global TLB entries which means that if you
> > look at that __native_flush_tlb_single() function, it needs to flush
> > global TLB entries on CPUs with X86_FEATURE_INVPCID_SINGLE by doing an
> > INVLPG in the kaiser_enabled=0 case. Errgo, the above hunk.
> >
> > But I might be completely off here thus this note...
> >
> > Thoughts?
> >
> > Thx.
> >
> >
> > [ 1.235726] ------------[ cut here ]------------
> > [ 1.237515] kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
> > [ 1.240926] invalid opcode: 0000 [#1] SMP
> > [ 1.243301] Modules linked in:
> > [ 1.244585] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 Debian 4.9.228-1
> > [ 1.247657] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > [ 1.251249] task: ffff909363e94040 task.stack: ffffa41bc0194000
> > [ 1.253519] RIP: 0010:[<ffffffff8fa2e40c>] [<ffffffff8fa2e40c>] text_poke+0x18c/0x240
> > [ 1.256593] RSP: 0018:ffffa41bc0197d90 EFLAGS: 00010096
> > [ 1.258657] RAX: 000000000000000f RBX: 0000000001020800 RCX: 00000000feda3203
> > [ 1.261388] RDX: 00000000178bfbff RSI: 0000000000000000 RDI: ffffffffff57a000
> > [ 1.264168] RBP: ffffffff8fbd3eca R08: 0000000000000000 R09: 0000000000000003
> > [ 1.266983] R10: 0000000000000003 R11: 0000000000000112 R12: 0000000000000001
> > [ 1.269702] R13: ffffa41bc0197dcf R14: 0000000000000286 R15: ffffed1c40407500
> > [ 1.272572] FS: 0000000000000000(0000) GS:ffff909366300000(0000) knlGS:0000000000000000
> > [ 1.275791] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 1.278032] CR2: 0000000000000000 CR3: 0000000010c08000 CR4: 00000000003606f0
> > [ 1.280815] Stack:
> > [ 1.281630] ffffffff8fbd3eca 0000000000000005 ffffa41bc0197e03 ffffffff8fbd3ecb
> > [ 1.284660] 0000000000000000 0000000000000000 ffffffff8fa2e835 ccffffff8fad4326
> > [ 1.287729] 1ccd0231874d55d3 ffffffff8fbd3eca ffffa41bc0197e03 ffffffff90203844
> > [ 1.290852] Call Trace:
> > [ 1.291782] [<ffffffff8fbd3eca>] ? swap_entry_free+0x12a/0x300
> > [ 1.294900] [<ffffffff8fbd3ecb>] ? swap_entry_free+0x12b/0x300
> > [ 1.297267] [<ffffffff8fa2e835>] ? text_poke_bp+0x55/0xe0
> > [ 1.299473] [<ffffffff8fbd3eca>] ? swap_entry_free+0x12a/0x300
> > [ 1.301896] [<ffffffff8fa2b64c>] ? arch_jump_label_transform+0x9c/0x120
> > [ 1.304557] [<ffffffff9073e81f>] ? set_debug_rodata+0xc/0xc
> > [ 1.306790] [<ffffffff8fb81d92>] ? __jump_label_update+0x72/0x80
> > [ 1.309255] [<ffffffff8fb8206f>] ? static_key_slow_inc+0x8f/0xa0
> > [ 1.311680] [<ffffffff8fbd7a57>] ? frontswap_register_ops+0x107/0x1d0
> > [ 1.314281] [<ffffffff9077078c>] ? init_zswap+0x282/0x3f6
> > [ 1.316547] [<ffffffff9077050a>] ? init_frontswap+0x8c/0x8c
> > [ 1.318784] [<ffffffff8fa0223e>] ? do_one_initcall+0x4e/0x180
> > [ 1.321067] [<ffffffff9073e81f>] ? set_debug_rodata+0xc/0xc
> > [ 1.323366] [<ffffffff9073f08d>] ? kernel_init_freeable+0x16b/0x1ec
> > [ 1.325873] [<ffffffff90011d50>] ? rest_init+0x80/0x80
> > [ 1.327989] [<ffffffff90011d5a>] ? kernel_init+0xa/0x100
> > [ 1.330092] [<ffffffff9001f424>] ? ret_from_fork+0x44/0x70
> > [ 1.332311] Code: 00 0f a2 4d 85 e4 74 4a 0f b6 45 00 41 38 45 00 75 19 31 c0 83 c0 01 48 63 d0 49 39 d4 76 33 41 0f b6 4c 15 00 38 4c 15 00 74 e9 <0f> 0b 48 89 ef e8 da d6 19 00 48 8d bd 00 10 00 00 48 89 c3 e8
> > [ 1.342818] RIP [<ffffffff8fa2e40c>] text_poke+0x18c/0x240
> > [ 1.345859] RSP <ffffa41bc0197d90>
> > [ 1.347285] ---[ end trace 0a1c5ab5eb16de89 ]---
> > [ 1.349169] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> > [ 1.349169]
> > [ 1.352885] Kernel Offset: 0xea00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > [ 1.357039] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> > [ 1.357039]
> >
> >
> > --
> > Regards/Gruss,
> > Boris.
> >
> > https://people.kernel.org/tglx/notes-about-netiquette

2021-03-25 09:58:27

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

On Wed, Mar 24, 2021 at 07:43:29PM -0700, Hugh Dickins wrote:
> Right, after looking into it more, I completely agree with you:
> the Kaiser series (in both 4.4-stable and 4.9-stable) was simply
> wrong to lose that invlpg - fine in the kaiser case when we don't
> enable Globals at all, but plain wrong in the !kaiser_enabled case.
> One way or another, we have somehow got away with it for three years.

Yeah, because there were no boxes with kaiser_enabled=0 *and* PCID
which would set INVPCID_SINGLE. Before those, it would INVLPG in the
!INVPCID_SINGLE case.

Oh, btw, booting with "pci=on" "fixes" the issue too. And I tried
reproducing this on an Intel box with "pti=off" but it booted fine
so I'm probably missing some other aspect or triggering it there is
harder/different due to TLB differences or whatnot.

And Babu triggered the same issue on a AMD baremetal yesterday.

> I do agree with Paolo that the PCID_ASID_KERN flush would be better
> moved under the "if (kaiser_enabled)" now.

Ok.

> (And if this were ongoing development, I'd want to rewrite the
> function altogether: but no, these old stable trees are not the place
> for that.)

Bah, it brought some very mixed memories, wading through that code
after years. And yeah, people should stop using all these dead kernels
already! So yeah, no, you don't want to clean up stuff there - let
sleeping dogs lie.

> Boris, may I leave both -stable fixes to you?
> Let me know if you'd prefer me to clean up my mess.

No worries, I'll take care of it.

> Thanks a lot for tracking this down,

Thanks for double-checking me so quickly, lemme whip up a patch.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-03-25 10:32:14

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH] x86/tlb: Flush global mappings when KAISER is disabled

Ok,

I tried to be as specific as possible in the commit message so that we
don't forget. Please lemme know if I've missed something.

Babu, Jim, I'd appreciate it if you ran this to confirm.

Thx.

---
From: Borislav Petkov <[email protected]>
Date: Thu, 25 Mar 2021 11:02:31 +0100

Jim Mattson reported that Debian 9 guests using a 4.9-stable kernel
are exploding during alternatives patching:

kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
invalid opcode: 0000 [#1] SMP
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 Debian 4.9.228-1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
swap_entry_free
swap_entry_free
text_poke_bp
swap_entry_free
arch_jump_label_transform
set_debug_rodata
__jump_label_update
static_key_slow_inc
frontswap_register_ops
init_zswap
init_frontswap
do_one_initcall
set_debug_rodata
kernel_init_freeable
rest_init
kernel_init
ret_from_fork

triggering the BUG_ON in text_poke() which verifies whether patched
instruction bytes have actually landed at the destination.

Further debugging showed that the TLB flush before that check is
insufficient because there could be global mappings left in the TLB,
leading to a stale mapping getting used.

I say "global mappings" because the hardware configuration is a new one:
machine is an AMD, which means, KAISER/PTI doesn't need to be enabled
there, which also means there's no user/kernel pagetables split and
therefore the TLB can have global mappings.

And the configuration is new one for a second reason: because that AMD
machine supports PCID and INVPCID, which leads the CPU detection code to
set the synthetic X86_FEATURE_INVPCID_SINGLE flag.

Now, __native_flush_tlb_single() does invalidate global mappings when
X86_FEATURE_INVPCID_SINGLE is *not* set and returns.

When X86_FEATURE_INVPCID_SINGLE is set, however, it invalidates the
requested address from both PCIDs in the KAISER-enabled case. But if
KAISER is not enabled and the machine has global mappings in the TLB,
then those global mappings do not get invalidated, which would lead to
the above mismatch from using a stale TLB entry.

So make sure to flush those global mappings in the KAISER disabled case.

Co-debugged by Babu Moger <[email protected]>.

Reported-by: Jim Mattson <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/CALMp9eRDSW66%[email protected]
---
arch/x86/include/asm/tlbflush.h | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index f5ca15622dc9..2bfa4deb8cae 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -245,12 +245,15 @@ static inline void __native_flush_tlb_single(unsigned long addr)
* ASID. But, userspace flushes are probably much more
* important performance-wise.
*
- * Make sure to do only a single invpcid when KAISER is
- * disabled and we have only a single ASID.
+ * In the KAISER disabled case, do an INVLPG to make sure
+ * the mapping is flushed in case it is a global one.
*/
- if (kaiser_enabled)
+ if (kaiser_enabled) {
invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
- invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
+ invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
+ } else {
+ asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+ }
}

static inline void __flush_tlb_all(void)
--
2.29.2

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-03-25 10:55:48

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] x86/tlb: Flush global mappings when KAISER is disabled

On 25/03/21 11:29, Borislav Petkov wrote:
> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index f5ca15622dc9..2bfa4deb8cae 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -245,12 +245,15 @@ static inline void __native_flush_tlb_single(unsigned long addr)
> * ASID. But, userspace flushes are probably much more
> * important performance-wise.
> *
> - * Make sure to do only a single invpcid when KAISER is
> - * disabled and we have only a single ASID.
> + * In the KAISER disabled case, do an INVLPG to make sure
> + * the mapping is flushed in case it is a global one.
> */
> - if (kaiser_enabled)
> + if (kaiser_enabled) {
> invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> - invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> + invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> + } else {
> + asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> + }
> }
>
> static inline void __flush_tlb_all(void)
>

Reviewed-by: Paolo Bonzini <[email protected]>

2021-03-25 15:15:34

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH] x86/tlb: Flush global mappings when KAISER is disabled



On 3/25/21 5:29 AM, Borislav Petkov wrote:
> Ok,
>
> I tried to be as specific as possible in the commit message so that we
> don't forget. Please lemme know if I've missed something.
>
> Babu, Jim, I'd appreciate it if you ran this to confirm.
>
> Thx.
>
> ---
> From: Borislav Petkov <[email protected]>
> Date: Thu, 25 Mar 2021 11:02:31 +0100
>
> Jim Mattson reported that Debian 9 guests using a 4.9-stable kernel
> are exploding during alternatives patching:
>
> kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
> invalid opcode: 0000 [#1] SMP
> Modules linked in:
> CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 Debian 4.9.228-1
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
> swap_entry_free
> swap_entry_free
> text_poke_bp
> swap_entry_free
> arch_jump_label_transform
> set_debug_rodata
> __jump_label_update
> static_key_slow_inc
> frontswap_register_ops
> init_zswap
> init_frontswap
> do_one_initcall
> set_debug_rodata
> kernel_init_freeable
> rest_init
> kernel_init
> ret_from_fork
>
> triggering the BUG_ON in text_poke() which verifies whether patched
> instruction bytes have actually landed at the destination.
>
> Further debugging showed that the TLB flush before that check is
> insufficient because there could be global mappings left in the TLB,
> leading to a stale mapping getting used.
>
> I say "global mappings" because the hardware configuration is a new one:
> machine is an AMD, which means, KAISER/PTI doesn't need to be enabled
> there, which also means there's no user/kernel pagetables split and
> therefore the TLB can have global mappings.
>
> And the configuration is new one for a second reason: because that AMD
> machine supports PCID and INVPCID, which leads the CPU detection code to
> set the synthetic X86_FEATURE_INVPCID_SINGLE flag.
>
> Now, __native_flush_tlb_single() does invalidate global mappings when
> X86_FEATURE_INVPCID_SINGLE is *not* set and returns.
>
> When X86_FEATURE_INVPCID_SINGLE is set, however, it invalidates the
> requested address from both PCIDs in the KAISER-enabled case. But if
> KAISER is not enabled and the machine has global mappings in the TLB,
> then those global mappings do not get invalidated, which would lead to
> the above mismatch from using a stale TLB entry.
>
> So make sure to flush those global mappings in the KAISER disabled case.
>
> Co-debugged by Babu Moger <[email protected]>.
>
> Reported-by: Jim Mattson <[email protected]>
> Signed-off-by: Borislav Petkov <[email protected]>
> Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flkml.kernel.org%2Fr%2FCALMp9eRDSW66%252BXvbHVF4ohL7XhThoPoT0BrB0TcS0cgk%3DdkcBg%40mail.gmail.com&amp;data=04%7C01%7Cbabu.moger%40amd.com%7Cf4e0aacf81744dc8be4408d8ef78f2cf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637522650066097649%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=1c4MQ9I9KrLxWLqghGCI%2BC%2Bvs0c9vYaNC5d%2FiYL0oMA%3D&amp;reserved=0
> ---
> arch/x86/include/asm/tlbflush.h | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index f5ca15622dc9..2bfa4deb8cae 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -245,12 +245,15 @@ static inline void __native_flush_tlb_single(unsigned long addr)
> * ASID. But, userspace flushes are probably much more
> * important performance-wise.
> *
> - * Make sure to do only a single invpcid when KAISER is
> - * disabled and we have only a single ASID.
> + * In the KAISER disabled case, do an INVLPG to make sure
> + * the mapping is flushed in case it is a global one.
> */
> - if (kaiser_enabled)
> + if (kaiser_enabled) {
> invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> - invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> + invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> + } else {
> + asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> + }
> }
>
> static inline void __flush_tlb_all(void)
>

Thanks Boris. As you updated the patch little bit since yesterday, I
retested them again both on host and guest kernel. They are looking good.

Tested-by: Babu Moger <[email protected]>

2021-03-25 16:35:13

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] x86/tlb: Flush global mappings when KAISER is disabled

On Thu, 25 Mar 2021, Borislav Petkov wrote:

> Ok,
>
> I tried to be as specific as possible in the commit message so that we
> don't forget. Please lemme know if I've missed something.
>
> Babu, Jim, I'd appreciate it if you ran this to confirm.
>
> Thx.
>
> ---
> From: Borislav Petkov <[email protected]>
> Date: Thu, 25 Mar 2021 11:02:31 +0100
>
> Jim Mattson reported that Debian 9 guests using a 4.9-stable kernel
> are exploding during alternatives patching:
>
> kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
> invalid opcode: 0000 [#1] SMP
> Modules linked in:
> CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 Debian 4.9.228-1
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
> swap_entry_free
> swap_entry_free
> text_poke_bp
> swap_entry_free
> arch_jump_label_transform
> set_debug_rodata
> __jump_label_update
> static_key_slow_inc
> frontswap_register_ops
> init_zswap
> init_frontswap
> do_one_initcall
> set_debug_rodata
> kernel_init_freeable
> rest_init
> kernel_init
> ret_from_fork
>
> triggering the BUG_ON in text_poke() which verifies whether patched
> instruction bytes have actually landed at the destination.
>
> Further debugging showed that the TLB flush before that check is
> insufficient because there could be global mappings left in the TLB,
> leading to a stale mapping getting used.
>
> I say "global mappings" because the hardware configuration is a new one:
> machine is an AMD, which means, KAISER/PTI doesn't need to be enabled
> there, which also means there's no user/kernel pagetables split and
> therefore the TLB can have global mappings.
>
> And the configuration is new one for a second reason: because that AMD
> machine supports PCID and INVPCID, which leads the CPU detection code to
> set the synthetic X86_FEATURE_INVPCID_SINGLE flag.
>
> Now, __native_flush_tlb_single() does invalidate global mappings when
> X86_FEATURE_INVPCID_SINGLE is *not* set and returns.
>
> When X86_FEATURE_INVPCID_SINGLE is set, however, it invalidates the
> requested address from both PCIDs in the KAISER-enabled case. But if
> KAISER is not enabled and the machine has global mappings in the TLB,
> then those global mappings do not get invalidated, which would lead to
> the above mismatch from using a stale TLB entry.
>
> So make sure to flush those global mappings in the KAISER disabled case.
>
> Co-debugged by Babu Moger <[email protected]>.
>
> Reported-by: Jim Mattson <[email protected]>
> Signed-off-by: Borislav Petkov <[email protected]>
> Link: https://lkml.kernel.org/r/CALMp9eRDSW66%[email protected]

Acked-by: Hugh Dickins <[email protected]>

Great write-up too: many thanks.

> ---
> arch/x86/include/asm/tlbflush.h | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index f5ca15622dc9..2bfa4deb8cae 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -245,12 +245,15 @@ static inline void __native_flush_tlb_single(unsigned long addr)
> * ASID. But, userspace flushes are probably much more
> * important performance-wise.
> *
> - * Make sure to do only a single invpcid when KAISER is
> - * disabled and we have only a single ASID.
> + * In the KAISER disabled case, do an INVLPG to make sure
> + * the mapping is flushed in case it is a global one.
> */
> - if (kaiser_enabled)
> + if (kaiser_enabled) {
> invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> - invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> + invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> + } else {
> + asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> + }
> }
>
> static inline void __flush_tlb_all(void)
> --
> 2.29.2
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

2021-03-25 19:03:00

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH] x86/tlb: Flush global mappings when KAISER is disabled

On Thu, Mar 25, 2021 at 9:33 AM Hugh Dickins <[email protected]> wrote:
>
> On Thu, 25 Mar 2021, Borislav Petkov wrote:
>
> > Ok,
> >
> > I tried to be as specific as possible in the commit message so that we
> > don't forget. Please lemme know if I've missed something.
> >
> > Babu, Jim, I'd appreciate it if you ran this to confirm.
Tested-by: Jim Mattson <[email protected]>

2021-03-25 20:11:28

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH] x86/tlb: Flush global mappings when KAISER is disabled

Hi stable folks,

the patch below fixes kernels 4.4 and 4.9 booting on AMD platforms with
PCID support. It doesn't have an upstream counterpart because it patches
the KAISER code which didn't go upstream. It applies fine to both of the
aforementioned kernels - please pick it up.

Thx.

---
From: Borislav Petkov <[email protected]>
Date: Thu, 25 Mar 2021 11:02:31 +0100
Subject: [PATCH] x86/tlb: Flush global mappings when KAISER is disabled

Jim Mattson reported that Debian 9 guests using a 4.9-stable kernel
are exploding during alternatives patching:

kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
invalid opcode: 0000 [#1] SMP
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 Debian 4.9.228-1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
swap_entry_free
swap_entry_free
text_poke_bp
swap_entry_free
arch_jump_label_transform
set_debug_rodata
__jump_label_update
static_key_slow_inc
frontswap_register_ops
init_zswap
init_frontswap
do_one_initcall
set_debug_rodata
kernel_init_freeable
rest_init
kernel_init
ret_from_fork

triggering the BUG_ON in text_poke() which verifies whether patched
instruction bytes have actually landed at the destination.

Further debugging showed that the TLB flush before that check is
insufficient because there could be global mappings left in the TLB,
leading to a stale mapping getting used.

I say "global mappings" because the hardware configuration is a new one:
machine is an AMD, which means, KAISER/PTI doesn't need to be enabled
there, which also means there's no user/kernel pagetables split and
therefore the TLB can have global mappings.

And the configuration is new one for a second reason: because that AMD
machine supports PCID and INVPCID, which leads the CPU detection code to
set the synthetic X86_FEATURE_INVPCID_SINGLE flag.

Now, __native_flush_tlb_single() does invalidate global mappings when
X86_FEATURE_INVPCID_SINGLE is *not* set and returns.

When X86_FEATURE_INVPCID_SINGLE is set, however, it invalidates the
requested address from both PCIDs in the KAISER-enabled case. But if
KAISER is not enabled and the machine has global mappings in the TLB,
then those global mappings do not get invalidated, which would lead to
the above mismatch from using a stale TLB entry.

So make sure to flush those global mappings in the KAISER disabled case.

Co-debugged by Babu Moger <[email protected]>.

Reported-by: Jim Mattson <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Hugh Dickins <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Tested-by: Babu Moger <[email protected]>
Tested-by: Jim Mattson <[email protected]>
Link: https://lkml.kernel.org/r/CALMp9eRDSW66%[email protected]
---
arch/x86/include/asm/tlbflush.h | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index f5ca15622dc9..2bfa4deb8cae 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -245,12 +245,15 @@ static inline void __native_flush_tlb_single(unsigned long addr)
* ASID. But, userspace flushes are probably much more
* important performance-wise.
*
- * Make sure to do only a single invpcid when KAISER is
- * disabled and we have only a single ASID.
+ * In the KAISER disabled case, do an INVLPG to make sure
+ * the mapping is flushed in case it is a global one.
*/
- if (kaiser_enabled)
+ if (kaiser_enabled) {
invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
- invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
+ invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
+ } else {
+ asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+ }
}

static inline void __flush_tlb_all(void)
--
2.29.2


--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-03-25 20:38:40

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH] x86/tlb: Flush global mappings when KAISER is disabled

On Thu, Mar 25, 2021 at 09:09:42PM +0100, Borislav Petkov wrote:
>Hi stable folks,
>
>the patch below fixes kernels 4.4 and 4.9 booting on AMD platforms with
>PCID support. It doesn't have an upstream counterpart because it patches
>the KAISER code which didn't go upstream. It applies fine to both of the
>aforementioned kernels - please pick it up.

Queued up for 4.9 and 4.4, thanks!

>Jim Mattson reported that Debian 9 guests using a 4.9-stable kernel
>are exploding during alternatives patching:

(Cc Ben & Salvatore)

I'm not sure if 4.9 or Debian is still alive or not, but FYI...

--
Thanks,
Sasha

2021-03-25 23:22:06

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH] x86/tlb: Flush global mappings when KAISER is disabled

On Thu, Mar 25, 2021 at 04:36:55PM -0400, Sasha Levin wrote:
>On Thu, Mar 25, 2021 at 09:09:42PM +0100, Borislav Petkov wrote:
>>Hi stable folks,
>>
>>the patch below fixes kernels 4.4 and 4.9 booting on AMD platforms with
>>PCID support. It doesn't have an upstream counterpart because it patches
>>the KAISER code which didn't go upstream. It applies fine to both of the
>>aforementioned kernels - please pick it up.
>
>Queued up for 4.9 and 4.4, thanks!
>
>>Jim Mattson reported that Debian 9 guests using a 4.9-stable kernel
>>are exploding during alternatives patching:
>
>(Cc Ben & Salvatore)
>
>I'm not sure if 4.9 or Debian is still alive or not, but FYI...
*on

:)
--
Thanks,
Sasha

2021-03-26 00:00:59

by Ben Hutchings

[permalink] [raw]
Subject: Re: [PATCH] x86/tlb: Flush global mappings when KAISER is disabled

On Thu, 2021-03-25 at 19:19 -0400, Sasha Levin wrote:
> On Thu, Mar 25, 2021 at 04:36:55PM -0400, Sasha Levin wrote:
> > On Thu, Mar 25, 2021 at 09:09:42PM +0100, Borislav Petkov wrote:
> > > Hi stable folks,
> > >
> > > the patch below fixes kernels 4.4 and 4.9 booting on AMD platforms with
> > > PCID support. It doesn't have an upstream counterpart because it patches
> > > the KAISER code which didn't go upstream. It applies fine to both of the
> > > aforementioned kernels - please pick it up.
> >
> > Queued up for 4.9 and 4.4, thanks!
> >
> > > Jim Mattson reported that Debian 9 guests using a 4.9-stable kernel
> > > are exploding during alternatives patching:
> >
> > (Cc Ben & Salvatore)
> >
> > I'm not sure if 4.9 or Debian is still alive or not, but FYI...
>                     *on
>
>                     :)

We're supporting both 4.9 and 4.19 in Debian 9. The general rule is we
carry on with the same stable kernel branch for the whole 5 year
support period, but add the option of using the kernel version from the
next stable release.

Ben.

--
Ben Hutchings
Teamwork is essential - it allows you to blame someone else.


Attachments:
signature.asc (849.00 B)
This is a digitally signed message part