On the Intel side, restoration of the guest's IA32_SPEC_CTRL is done
as late as possible before vmentry, with the comment:
* IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
* there must not be any returns or indirect branches between this code
* and vmentry.
On AMD, there is no need to avoid returns or indirect branches between
wrmsr and vmrun because Linux doesn't use IBRS; however, restoration
of the host IA32_SPEC_CTRL value is definitely way too late. With
respect to the user/kernel boundary, AMD says, "If software chooses to
toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." Assuming the same requirements apply to the guest/host
boundary, KVM does not respect this recommendation: the return thunk
training sequence is in vmenter.S, quite close to the VM-exit, while
the host's IA32_SPEC_CTRL value is only restored much later for hosts
without V_SPEC_CTRL.
In the absence of clarification from AMD, move all the SPEC_CTRL
handling to assembly code and, in passing, make the Intel and AMD
code a bit more similar to each other.
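For illustration, here is the ordering constraint expressed as a
minimal C sketch. This is not part of the series, and the C form is
hypothetical: the real code lives in vmenter.S precisely because no
returns or indirect branches may execute while the guest's value is
still in IA32_SPEC_CTRL.

  /*
   * Hypothetical sketch of the intended VM-exit ordering. The MSR
   * accessors and the x86_spec_ctrl_current per-CPU variable are the
   * kernel's; the function itself is made up for illustration.
   */
  static noinstr void vmexit_spec_ctrl_order(struct vcpu_svm *svm)
  {
          /* 1. Save the guest value if the MSR was not intercepted. */
          svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);

          /* 2. Restore the host value, which may include STIBP... */
          native_wrmsrl(MSR_IA32_SPEC_CTRL,
                        this_cpu_read(x86_spec_ctrl_current));

          /*
           * 3. ...and only then run the return thunk training
           * sequence (UNTRAIN_RET).
           */
  }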
Patches 1-2 are the Intel side, which is just a cleanup.
Patch 3 prepares arch/x86/kvm/svm/svm.h for inclusion by asm-offsets.c,
and patches 4-5 are a cleanup similar to the earlier VMX ones.
Patch 6 is the bulk of the change, and finally patch 7 removes now
dead code in asm/spec-ctrl.h and arch/x86/kernel/.
This is RFC because I haven't tested SEV-ES or 32-bit yet.
Paolo
Paolo Bonzini (7):
KVM: VMX: remove regs argument of __vmx_vcpu_run
KVM: VMX: more cleanups to __vmx_vcpu_run
KVM: SVM: extract VMCB accessors to a new file
KVM: SVM: replace argument of __svm_vcpu_run with vcpu_svm
KVM: SVM: adjust register allocation for __svm_vcpu_run
KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and
callers
arch/x86/include/asm/spec-ctrl.h | 10 +-
arch/x86/kernel/asm-offsets.c | 8 ++
arch/x86/kernel/cpu/bugs.c | 15 +--
arch/x86/kvm/svm/avic.c | 1 +
arch/x86/kvm/svm/nested.c | 1 +
arch/x86/kvm/svm/sev.c | 1 +
arch/x86/kvm/svm/svm.c | 39 +++---
arch/x86/kvm/svm/svm.h | 204 +-----------------------------
arch/x86/kvm/svm/svm_onhyperv.c | 1 +
arch/x86/kvm/svm/vmcb.h | 211 +++++++++++++++++++++++++++++++
arch/x86/kvm/svm/vmenter.S | 164 ++++++++++++++++++------
arch/x86/kvm/vmx/nested.c | 3 +-
arch/x86/kvm/vmx/vmenter.S | 92 ++++++--------
arch/x86/kvm/vmx/vmx.c | 3 +-
arch/x86/kvm/vmx/vmx.h | 3 +-
15 files changed, 419 insertions(+), 337 deletions(-)
create mode 100644 arch/x86/kvm/svm/vmcb.h
--
2.31.1
Slightly improve register allocation: load @vmx only once
before vmlaunch/vmresume.
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kvm/vmx/vmenter.S | 40 +++++++++++++++++---------------------
1 file changed, 18 insertions(+), 22 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 1362fe5859f9..0aea6b348a96 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -81,13 +81,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
* there must not be any returns or indirect branches between this code
* and vmentry.
*/
- movl VMX_spec_ctrl(%_ASM_DI), %edi
+ movl VMX_spec_ctrl(%_ASM_DI), %eax
movl PER_CPU_VAR(x86_spec_ctrl_current), %esi
- cmp %edi, %esi
+ cmp %eax, %esi
je .Lspec_ctrl_done
mov $MSR_IA32_SPEC_CTRL, %ecx
xor %edx, %edx
- mov %edi, %eax
wrmsr
.Lspec_ctrl_done:
@@ -97,31 +96,28 @@ SYM_FUNC_START(__vmx_vcpu_run)
* an LFENCE to stop speculation from skipping the wrmsr.
*/
- /* Load @vmx to RAX. */
- mov WORD_SIZE(%_ASM_SP), %_ASM_AX
-
/* Check if vmlaunch or vmresume is needed */
testb $VMX_RUN_VMRESUME, %bl
/* Load guest registers. Don't clobber flags. */
- mov VCPU_RCX(%_ASM_AX), %_ASM_CX
- mov VCPU_RDX(%_ASM_AX), %_ASM_DX
- mov VCPU_RBX(%_ASM_AX), %_ASM_BX
- mov VCPU_RBP(%_ASM_AX), %_ASM_BP
- mov VCPU_RSI(%_ASM_AX), %_ASM_SI
- mov VCPU_RDI(%_ASM_AX), %_ASM_DI
+ mov VCPU_RAX(%_ASM_DI), %_ASM_AX
+ mov VCPU_RCX(%_ASM_DI), %_ASM_CX
+ mov VCPU_RDX(%_ASM_DI), %_ASM_DX
+ mov VCPU_RBX(%_ASM_DI), %_ASM_BX
+ mov VCPU_RBP(%_ASM_DI), %_ASM_BP
+ mov VCPU_RSI(%_ASM_DI), %_ASM_SI
#ifdef CONFIG_X86_64
- mov VCPU_R8 (%_ASM_AX), %r8
- mov VCPU_R9 (%_ASM_AX), %r9
- mov VCPU_R10(%_ASM_AX), %r10
- mov VCPU_R11(%_ASM_AX), %r11
- mov VCPU_R12(%_ASM_AX), %r12
- mov VCPU_R13(%_ASM_AX), %r13
- mov VCPU_R14(%_ASM_AX), %r14
- mov VCPU_R15(%_ASM_AX), %r15
+ mov VCPU_R8 (%_ASM_DI), %r8
+ mov VCPU_R9 (%_ASM_DI), %r9
+ mov VCPU_R10(%_ASM_DI), %r10
+ mov VCPU_R11(%_ASM_DI), %r11
+ mov VCPU_R12(%_ASM_DI), %r12
+ mov VCPU_R13(%_ASM_DI), %r13
+ mov VCPU_R14(%_ASM_DI), %r14
+ mov VCPU_R15(%_ASM_DI), %r15
#endif
- /* Load guest RAX. This kills the @vmx pointer! */
- mov VCPU_RAX(%_ASM_AX), %_ASM_AX
+ /* Load guest RDI. This kills the @vmx pointer! */
+ mov VCPU_RDI(%_ASM_DI), %_ASM_DI
/* Check EFLAGS.ZF from 'testb' above */
jz .Lvmlaunch
--
2.31.1
x86_virt_spec_ctrl now deals only with the paravirtualized
MSR_IA32_VIRT_SPEC_CTRL and no longer handles MSR_IA32_SPEC_CTRL;
remove the now-unused argument.
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/include/asm/spec-ctrl.h | 10 +++++-----
arch/x86/kernel/cpu/bugs.c | 2 +-
arch/x86/kvm/svm/svm.c | 4 ++--
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index 5393babc0598..cb0386fc4dc3 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -13,7 +13,7 @@
* Takes the guest view of SPEC_CTRL MSR as a parameter and also
* the guest's version of VIRT_SPEC_CTRL, if emulated.
*/
-extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool guest);
+extern void x86_virt_spec_ctrl(u64 guest_virt_spec_ctrl, bool guest);
/**
* x86_spec_ctrl_set_guest - Set speculation control registers for the guest
@@ -24,9 +24,9 @@ extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bo
* Avoids writing to the MSR if the content/bits are the same
*/
static inline
-void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
+void x86_spec_ctrl_set_guest(u64 guest_virt_spec_ctrl)
{
- x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, true);
+ x86_virt_spec_ctrl(guest_virt_spec_ctrl, true);
}
/**
@@ -38,9 +38,9 @@ void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
* Avoids writing to the MSR if the content/bits are the same
*/
static inline
-void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
+void x86_spec_ctrl_restore_host(u64 guest_virt_spec_ctrl)
{
- x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, false);
+ x86_virt_spec_ctrl(guest_virt_spec_ctrl, false);
}
/* AMD specific Speculative Store Bypass MSR data */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6ec0b7ce7453..3e3230cccaa7 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -200,7 +200,7 @@ void __init check_bugs(void)
* MSR_IA32_SPEC_CTRL for SSBD.
*/
void
-x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
+x86_virt_spec_ctrl(u64 guest_virt_spec_ctrl, bool setguest)
{
u64 guestval, hostval;
struct thread_info *ti = current_thread_info();
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a79cdeebc181..7b87a4e43708 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4011,7 +4011,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
* being speculatively taken.
*/
if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
- x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
+ x86_spec_ctrl_set_guest(svm->virt_spec_ctrl);
svm_vcpu_enter_exit(vcpu);
@@ -4019,7 +4019,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
reload_tss(vcpu);
if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
- x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
+ x86_spec_ctrl_restore_host(svm->virt_spec_ctrl);
if (!sev_es_guest(vcpu->kvm)) {
vcpu->arch.cr2 = svm->vmcb->save.cr2;
--
2.31.1
Restoration of the host IA32_SPEC_CTRL value is probably too late
with respect to the return thunk training sequence.
With respect to the user/kernel boundary, AMD says, "If software chooses
to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." I assume the same requirements apply to the guest/host
boundary. The return thunk training sequence is in vmenter.S, quite close
to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
IA32_SPEC_CTRL value is not restored until much later.
To avoid this, move the restoration of host SPEC_CTRL to assembly and,
for consistency, move the restoration of the guest SPEC_CTRL as well.
This is not particularly difficult, apart from some care to cover both
32- and 64-bit, and to share code between SEV-ES and normal vmentry.
Suggested-by: Jim Mattson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kernel/asm-offsets.c | 1 +
arch/x86/kernel/cpu/bugs.c | 13 ++---
arch/x86/kvm/svm/svm.c | 34 +++++--------
arch/x86/kvm/svm/svm.h | 4 +-
arch/x86/kvm/svm/vmenter.S | 93 +++++++++++++++++++++++++++++++++--
5 files changed, 109 insertions(+), 36 deletions(-)
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 7f1dd1138117..9dad8849c1ef 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -113,6 +113,7 @@ static void __used common(void)
if (IS_ENABLED(CONFIG_KVM_AMD)) {
BLANK();
OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
+ OFFSET(SVM_spec_ctrl, vcpu_svm, spec_ctrl);
}
if (IS_ENABLED(CONFIG_KVM_INTEL)) {
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index da7c361f47e0..6ec0b7ce7453 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -196,22 +196,15 @@ void __init check_bugs(void)
}
/*
- * NOTE: This function is *only* called for SVM. VMX spec_ctrl handling is
- * done in vmenter.S.
+ * NOTE: This function is *only* called for SVM, since Intel uses
+ * MSR_IA32_SPEC_CTRL for SSBD.
*/
void
x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
{
- u64 msrval, guestval = guest_spec_ctrl, hostval = spec_ctrl_current();
+ u64 guestval, hostval;
struct thread_info *ti = current_thread_info();
- if (static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
- if (hostval != guestval) {
- msrval = setguest ? guestval : hostval;
- wrmsrl(MSR_IA32_SPEC_CTRL, msrval);
- }
- }
-
/*
* If SSBD is not handled in MSR_SPEC_CTRL on AMD, update
* MSR_AMD64_L2_CFG or MSR_VIRT_SPEC_CTRL if supported.
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 64f5f0544b4f..a79cdeebc181 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3918,10 +3918,21 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
unsigned long vmcb_pa = svm->current_vmcb->pa;
+ /*
+ * For non-nested case:
+ * If the L01 MSR bitmap does not intercept the MSR, then we need to
+ * save it.
+ *
+ * For nested case:
+ * If the L02 MSR bitmap does not intercept the MSR, then we need to
+ * save it.
+ */
+ bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);
+
guest_state_enter_irqoff();
if (sev_es_guest(vcpu->kvm)) {
- __svm_sev_es_vcpu_run(vmcb_pa);
+ __svm_sev_es_vcpu_run(vmcb_pa, svm, spec_ctrl_intercepted);
} else {
struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
@@ -3932,7 +3943,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
* vmcb02 when switching vmcbs for nested virtualization.
*/
vmload(svm->vmcb01.pa);
- __svm_vcpu_run(vmcb_pa, svm);
+ __svm_vcpu_run(vmcb_pa, svm, spec_ctrl_intercepted);
vmsave(svm->vmcb01.pa);
vmload(__sme_page_pa(sd->save_area));
@@ -4004,25 +4015,6 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
svm_vcpu_enter_exit(vcpu);
- /*
- * We do not use IBRS in the kernel. If this vCPU has used the
- * SPEC_CTRL MSR it may have left it on; save the value and
- * turn it off. This is much more efficient than blindly adding
- * it to the atomic save/restore list. Especially as the former
- * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
- *
- * For non-nested case:
- * If the L01 MSR bitmap does not intercept the MSR, then we need to
- * save it.
- *
- * For nested case:
- * If the L02 MSR bitmap does not intercept the MSR, then we need to
- * save it.
- */
- if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL) &&
- unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
- svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
-
if (!sev_es_guest(vcpu->kvm))
reload_tss(vcpu);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 5f8dfc9cd9a7..f61c05116ea5 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -483,7 +483,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
/* vmenter.S */
-void __svm_sev_es_vcpu_run(unsigned long vmcb_pa);
-void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm);
+void __svm_sev_es_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm, bool spec_ctrl_intercepted);
+void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm, bool spec_ctrl_intercepted);
#endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 51a63a47d74f..89402e450071 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -29,10 +29,65 @@
.section .noinstr.text, "ax"
+.macro RESTORE_GUEST_SPEC_CTRL
+ /* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
+ ALTERNATIVE_2 "jmp 999f", \
+ "", X86_FEATURE_MSR_SPEC_CTRL, \
+ "jmp 999f", X86_FEATURE_V_SPEC_CTRL
+
+ /*
+ * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
+ * host's, write the MSR.
+ *
+ * IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
+ * there must not be any returns or indirect branches between this code
+ * and vmentry.
+ */
+ movl SVM_spec_ctrl(%_ASM_DI), %eax
+ cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
+ je 999f
+ mov $MSR_IA32_SPEC_CTRL, %ecx
+ xor %edx, %edx
+ wrmsr
+999:
+
+.endm
+
+.macro RESTORE_HOST_SPEC_CTRL
+ /* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
+ ALTERNATIVE_2 "jmp 999f", \
+ "", X86_FEATURE_MSR_SPEC_CTRL, \
+ "jmp 999f", X86_FEATURE_V_SPEC_CTRL
+
+ mov $MSR_IA32_SPEC_CTRL, %ecx
+
+ /*
+ * Load the value that the guest had written into MSR_IA32_SPEC_CTRL,
+ * if it was not intercepted during guest execution.
+ */
+ cmpb $0, (%_ASM_SP)
+ jnz 998f
+ rdmsr
+ movl %eax, SVM_spec_ctrl(%_ASM_DI)
+998:
+
+ /* Now restore the host value of the MSR if different from the guest's. */
+ movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
+ cmp SVM_spec_ctrl(%_ASM_DI), %eax
+ je 999f
+ xor %edx, %edx
+ wrmsr
+999:
+
+.endm
+
+
+
/**
* __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
* @vmcb_pa: unsigned long
* @svm: struct vcpu_svm *
+ * @spec_ctrl_intercepted: bool
*/
SYM_FUNC_START(__svm_vcpu_run)
push %_ASM_BP
@@ -47,6 +102,9 @@ SYM_FUNC_START(__svm_vcpu_run)
#endif
push %_ASM_BX
+ /* Save @spec_ctrl_intercepted. */
+ push %_ASM_ARG3
+
/* Save @svm. */
push %_ASM_ARG2
@@ -56,6 +114,7 @@ SYM_FUNC_START(__svm_vcpu_run)
/* Move @svm to RDI. */
mov %_ASM_ARG2, %_ASM_DI
+ RESTORE_GUEST_SPEC_CTRL
/* "POP" @vmcb to RAX. */
pop %_ASM_AX
@@ -90,7 +149,7 @@ SYM_FUNC_START(__svm_vcpu_run)
FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
#endif
- /* "POP" @svm to RAX. */
+ /* "POP" @svm to RAX while it's the only available register. */
pop %_ASM_AX
/* Save all guest registers. */
@@ -111,6 +170,9 @@ SYM_FUNC_START(__svm_vcpu_run)
mov %r15, VCPU_R15(%_ASM_AX)
#endif
+ mov %_ASM_AX, %_ASM_DI
+ RESTORE_HOST_SPEC_CTRL
+
/*
* Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
* untrained as soon as we exit the VM and are back to the
@@ -146,6 +208,9 @@ SYM_FUNC_START(__svm_vcpu_run)
xor %r15d, %r15d
#endif
+ /* "Pop" @spec_ctrl_intercepted. */
+ pop %_ASM_BX
+
pop %_ASM_BX
#ifdef CONFIG_X86_64
@@ -171,6 +236,8 @@ SYM_FUNC_END(__svm_vcpu_run)
/**
* __svm_sev_es_vcpu_run - Run a SEV-ES vCPU via a transition to SVM guest mode
* @vmcb_pa: unsigned long
+ * @svm: struct vcpu_svm *
+ * @spec_ctrl_intercepted: bool
*/
SYM_FUNC_START(__svm_sev_es_vcpu_run)
push %_ASM_BP
@@ -185,8 +252,22 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
#endif
push %_ASM_BX
- /* Move @vmcb to RAX. */
- mov %_ASM_ARG1, %_ASM_AX
+ /* Save @spec_ctrl_intercepted. */
+ push %_ASM_ARG3
+
+ /* Save @svm. */
+ push %_ASM_ARG2
+
+ /* Save @vmcb. */
+ push %_ASM_ARG1
+
+ /* Move @svm to RDI for RESTORE_GUEST_SPEC_CTRL. */
+ mov %_ASM_ARG2, %_ASM_DI
+
+ RESTORE_GUEST_SPEC_CTRL
+
+ /* Pop @vmcb to RAX. */
+ pop %_ASM_AX
/* Enter guest mode */
sti
@@ -200,6 +281,9 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
#endif
+ pop %_ASM_DI
+ RESTORE_HOST_SPEC_CTRL
+
/*
* Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
* untrained as soon as we exit the VM and are back to the
@@ -209,6 +293,9 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
*/
UNTRAIN_RET
+ /* "Pop" @spec_ctrl_intercepted. */
+ pop %_ASM_BX
+
pop %_ASM_BX
#ifdef CONFIG_X86_64
--
2.31.1
In preparation for moving SPEC_CTRL access to __svm_vcpu_run,
keep the pointer to the struct vcpu_svm in %rdi, which is not
used by rdmsr/wrmsr.
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kvm/svm/vmenter.S | 39 +++++++++++++++++++-------------------
1 file changed, 20 insertions(+), 19 deletions(-)
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 8fac744361e5..51a63a47d74f 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -53,30 +53,31 @@ SYM_FUNC_START(__svm_vcpu_run)
/* Save @vmcb. */
push %_ASM_ARG1
- /* Move @svm to RAX. */
- mov %_ASM_ARG2, %_ASM_AX
+ /* Move @svm to RDI. */
+ mov %_ASM_ARG2, %_ASM_DI
- /* Load guest registers. */
- mov VCPU_RCX(%_ASM_AX), %_ASM_CX
- mov VCPU_RDX(%_ASM_AX), %_ASM_DX
- mov VCPU_RBX(%_ASM_AX), %_ASM_BX
- mov VCPU_RBP(%_ASM_AX), %_ASM_BP
- mov VCPU_RSI(%_ASM_AX), %_ASM_SI
- mov VCPU_RDI(%_ASM_AX), %_ASM_DI
-#ifdef CONFIG_X86_64
- mov VCPU_R8 (%_ASM_AX), %r8
- mov VCPU_R9 (%_ASM_AX), %r9
- mov VCPU_R10(%_ASM_AX), %r10
- mov VCPU_R11(%_ASM_AX), %r11
- mov VCPU_R12(%_ASM_AX), %r12
- mov VCPU_R13(%_ASM_AX), %r13
- mov VCPU_R14(%_ASM_AX), %r14
- mov VCPU_R15(%_ASM_AX), %r15
-#endif
/* "POP" @vmcb to RAX. */
pop %_ASM_AX
+ /* Load guest registers. */
+ mov VCPU_RCX(%_ASM_DI), %_ASM_CX
+ mov VCPU_RDX(%_ASM_DI), %_ASM_DX
+ mov VCPU_RBX(%_ASM_DI), %_ASM_BX
+ mov VCPU_RBP(%_ASM_DI), %_ASM_BP
+ mov VCPU_RSI(%_ASM_DI), %_ASM_SI
+#ifdef CONFIG_X86_64
+ mov VCPU_R8 (%_ASM_DI), %r8
+ mov VCPU_R9 (%_ASM_DI), %r9
+ mov VCPU_R10(%_ASM_DI), %r10
+ mov VCPU_R11(%_ASM_DI), %r11
+ mov VCPU_R12(%_ASM_DI), %r12
+ mov VCPU_R13(%_ASM_DI), %r13
+ mov VCPU_R14(%_ASM_DI), %r14
+ mov VCPU_R15(%_ASM_DI), %r15
+#endif
+ mov VCPU_RDI(%_ASM_DI), %_ASM_DI
+
/* Enter guest mode */
sti
--
2.31.1
Registers are reachable through vcpu_vmx, no need to pass
a separate pointer to the regs[] array.
No functional change intended.
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kernel/asm-offsets.c | 1 +
arch/x86/kvm/vmx/nested.c | 3 +-
arch/x86/kvm/vmx/vmenter.S | 58 +++++++++++++++--------------------
arch/x86/kvm/vmx/vmx.c | 3 +-
arch/x86/kvm/vmx/vmx.h | 3 +-
5 files changed, 29 insertions(+), 39 deletions(-)
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index cb50589a7102..90da275ad223 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -111,6 +111,7 @@ static void __used common(void)
if (IS_ENABLED(CONFIG_KVM_INTEL)) {
BLANK();
+ OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
}
}
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 61a2e551640a..3f62bdaffb0b 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3094,8 +3094,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
vmx->loaded_vmcs->host_state.cr4 = cr4;
}
- vm_fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
- __vmx_vcpu_run_flags(vmx));
+ vm_fail = __vmx_vcpu_run(vmx, __vmx_vcpu_run_flags(vmx));
if (vmx->msr_autoload.host.nr)
vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 63b4ad54331b..1362fe5859f9 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -11,24 +11,24 @@
#define WORD_SIZE (BITS_PER_LONG / 8)
-#define VCPU_RAX __VCPU_REGS_RAX * WORD_SIZE
-#define VCPU_RCX __VCPU_REGS_RCX * WORD_SIZE
-#define VCPU_RDX __VCPU_REGS_RDX * WORD_SIZE
-#define VCPU_RBX __VCPU_REGS_RBX * WORD_SIZE
+#define VCPU_RAX (VMX_vcpu_arch_regs + __VCPU_REGS_RAX * WORD_SIZE)
+#define VCPU_RCX (VMX_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
+#define VCPU_RDX (VMX_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
+#define VCPU_RBX (VMX_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
/* Intentionally omit RSP as it's context switched by hardware */
-#define VCPU_RBP __VCPU_REGS_RBP * WORD_SIZE
-#define VCPU_RSI __VCPU_REGS_RSI * WORD_SIZE
-#define VCPU_RDI __VCPU_REGS_RDI * WORD_SIZE
+#define VCPU_RBP (VMX_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
+#define VCPU_RSI (VMX_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
+#define VCPU_RDI (VMX_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)
#ifdef CONFIG_X86_64
-#define VCPU_R8 __VCPU_REGS_R8 * WORD_SIZE
-#define VCPU_R9 __VCPU_REGS_R9 * WORD_SIZE
-#define VCPU_R10 __VCPU_REGS_R10 * WORD_SIZE
-#define VCPU_R11 __VCPU_REGS_R11 * WORD_SIZE
-#define VCPU_R12 __VCPU_REGS_R12 * WORD_SIZE
-#define VCPU_R13 __VCPU_REGS_R13 * WORD_SIZE
-#define VCPU_R14 __VCPU_REGS_R14 * WORD_SIZE
-#define VCPU_R15 __VCPU_REGS_R15 * WORD_SIZE
+#define VCPU_R8 (VMX_vcpu_arch_regs + __VCPU_REGS_R8 * WORD_SIZE)
+#define VCPU_R9 (VMX_vcpu_arch_regs + __VCPU_REGS_R9 * WORD_SIZE)
+#define VCPU_R10 (VMX_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
+#define VCPU_R11 (VMX_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
+#define VCPU_R12 (VMX_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
+#define VCPU_R13 (VMX_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
+#define VCPU_R14 (VMX_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
+#define VCPU_R15 (VMX_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
#endif
.section .noinstr.text, "ax"
@@ -36,7 +36,6 @@
/**
* __vmx_vcpu_run - Run a vCPU via a transition to VMX guest mode
* @vmx: struct vcpu_vmx *
- * @regs: unsigned long * (to guest registers)
* @flags: VMX_RUN_VMRESUME: use VMRESUME instead of VMLAUNCH
* VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
*
@@ -61,22 +60,19 @@ SYM_FUNC_START(__vmx_vcpu_run)
push %_ASM_ARG1
/* Save @flags for SPEC_CTRL handling */
- push %_ASM_ARG3
-
- /*
- * Save @regs, _ASM_ARG2 may be modified by vmx_update_host_rsp() and
- * @regs is needed after VM-Exit to save the guest's register values.
- */
push %_ASM_ARG2
- /* Copy @flags to BL, _ASM_ARG3 is volatile. */
- mov %_ASM_ARG3B, %bl
+ /* Copy @flags to BL, _ASM_ARG2 is volatile. */
+ mov %_ASM_ARG2B, %bl
lea (%_ASM_SP), %_ASM_ARG2
call vmx_update_host_rsp
ALTERNATIVE "jmp .Lspec_ctrl_done", "", X86_FEATURE_MSR_SPEC_CTRL
+ /* Reload @vmx, _ASM_ARG1 may be modified by vmx_update_host_rsp(). */
+ mov WORD_SIZE(%_ASM_SP), %_ASM_DI
+
/*
* SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
* host's, write the MSR.
@@ -85,7 +81,6 @@ SYM_FUNC_START(__vmx_vcpu_run)
* there must not be any returns or indirect branches between this code
* and vmentry.
*/
- mov 2*WORD_SIZE(%_ASM_SP), %_ASM_DI
movl VMX_spec_ctrl(%_ASM_DI), %edi
movl PER_CPU_VAR(x86_spec_ctrl_current), %esi
cmp %edi, %esi
@@ -102,8 +97,8 @@ SYM_FUNC_START(__vmx_vcpu_run)
* an LFENCE to stop speculation from skipping the wrmsr.
*/
- /* Load @regs to RAX. */
- mov (%_ASM_SP), %_ASM_AX
+ /* Load @vmx to RAX. */
+ mov WORD_SIZE(%_ASM_SP), %_ASM_AX
/* Check if vmlaunch or vmresume is needed */
testb $VMX_RUN_VMRESUME, %bl
@@ -125,7 +120,7 @@ SYM_FUNC_START(__vmx_vcpu_run)
mov VCPU_R14(%_ASM_AX), %r14
mov VCPU_R15(%_ASM_AX), %r15
#endif
- /* Load guest RAX. This kills the @regs pointer! */
+ /* Load guest RAX. This kills the @vmx pointer! */
mov VCPU_RAX(%_ASM_AX), %_ASM_AX
/* Check EFLAGS.ZF from 'testb' above */
@@ -163,8 +158,8 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
/* Temporarily save guest's RAX. */
push %_ASM_AX
- /* Reload @regs to RAX. */
- mov WORD_SIZE(%_ASM_SP), %_ASM_AX
+ /* Reload @vmx to RAX. */
+ mov 2*WORD_SIZE(%_ASM_SP), %_ASM_AX
/* Save all guest registers, including RAX from the stack */
pop VCPU_RAX(%_ASM_AX)
@@ -189,9 +184,6 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
xor %ebx, %ebx
.Lclear_regs:
- /* Discard @regs. The register is irrelevant, it just can't be RBX. */
- pop %_ASM_AX
-
/*
* Clear all general purpose registers except RSP and RBX to prevent
* speculative use of the guest's values, even those that are reloaded
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 05a747c9a9ff..42cda7a5c009 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7084,8 +7084,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
if (vcpu->arch.cr2 != native_read_cr2())
native_write_cr2(vcpu->arch.cr2);
- vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
- flags);
+ vmx->fail = __vmx_vcpu_run(vmx, flags);
vcpu->arch.cr2 = native_read_cr2();
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index a3da84f4ea45..d90cdbea0e4c 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -422,8 +422,7 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu);
void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
void vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx, unsigned int flags);
unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx);
-bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs,
- unsigned int flags);
+bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned int flags);
int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr);
void vmx_ept_load_pdptrs(struct kvm_vcpu *vcpu);
--
2.31.1
Having inline functions confuses the compilation of asm-offsets.c,
which cannot find kvm_cache_regs.h because arch/x86/kvm is not in
asm-offsets.c's include path. Just extract the functions to a
new file.
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kvm/svm/avic.c | 1 +
arch/x86/kvm/svm/nested.c | 1 +
arch/x86/kvm/svm/sev.c | 1 +
arch/x86/kvm/svm/svm.c | 1 +
arch/x86/kvm/svm/svm.h | 200 ------------------------------
arch/x86/kvm/svm/svm_onhyperv.c | 1 +
arch/x86/kvm/svm/vmcb.h | 211 ++++++++++++++++++++++++++++++++
7 files changed, 216 insertions(+), 200 deletions(-)
create mode 100644 arch/x86/kvm/svm/vmcb.h
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 6919dee69f18..cc651a3310b1 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -26,6 +26,7 @@
#include "x86.h"
#include "irq.h"
#include "svm.h"
+#include "vmcb.h"
/* AVIC GATAG is encoded using VM and VCPU IDs */
#define AVIC_VCPU_ID_BITS 8
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index b258d6988f5d..365f5ef55b53 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -29,6 +29,7 @@
#include "cpuid.h"
#include "lapic.h"
#include "svm.h"
+#include "vmcb.h"
#include "hyperv.h"
#define CC KVM_NESTED_VMENTER_CONSISTENCY_CHECK
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index c0c9ed5e279c..549f35ded880 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -25,6 +25,7 @@
#include "mmu.h"
#include "x86.h"
#include "svm.h"
+#include "vmcb.h"
#include "svm_ops.h"
#include "cpuid.h"
#include "trace.h"
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d22a809d9233..b793cfdce68d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -44,6 +44,7 @@
#include "trace.h"
#include "svm.h"
+#include "vmcb.h"
#include "svm_ops.h"
#include "kvm_onhyperv.h"
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6a7686bf6900..222856788153 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -22,8 +22,6 @@
#include <asm/svm.h>
#include <asm/sev-common.h>
-#include "kvm_cache_regs.h"
-
#define __sme_page_pa(x) __sme_set(page_to_pfn(x) << PAGE_SHIFT)
#define IOPM_SIZE PAGE_SIZE * 3
@@ -327,27 +325,6 @@ static __always_inline bool sev_es_guest(struct kvm *kvm)
#endif
}
-static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
-{
- vmcb->control.clean = 0;
-}
-
-static inline void vmcb_mark_all_clean(struct vmcb *vmcb)
-{
- vmcb->control.clean = VMCB_ALL_CLEAN_MASK
- & ~VMCB_ALWAYS_DIRTY_MASK;
-}
-
-static inline void vmcb_mark_dirty(struct vmcb *vmcb, int bit)
-{
- vmcb->control.clean &= ~(1 << bit);
-}
-
-static inline bool vmcb_is_dirty(struct vmcb *vmcb, int bit)
-{
- return !test_bit(bit, (unsigned long *)&vmcb->control.clean);
-}
-
static __always_inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
{
return container_of(vcpu, struct vcpu_svm, vcpu);
@@ -363,161 +340,6 @@ static __always_inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
*/
#define SVM_REGS_LAZY_LOAD_SET (1 << VCPU_EXREG_PDPTR)
-static inline void vmcb_set_intercept(struct vmcb_control_area *control, u32 bit)
-{
- WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
- __set_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline void vmcb_clr_intercept(struct vmcb_control_area *control, u32 bit)
-{
- WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
- __clear_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline bool vmcb_is_intercept(struct vmcb_control_area *control, u32 bit)
-{
- WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
- return test_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u32 bit)
-{
- WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
- return test_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline void set_dr_intercepts(struct vcpu_svm *svm)
-{
- struct vmcb *vmcb = svm->vmcb01.ptr;
-
- if (!sev_es_guest(svm->vcpu.kvm)) {
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE);
- }
-
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
-
- recalc_intercepts(svm);
-}
-
-static inline void clr_dr_intercepts(struct vcpu_svm *svm)
-{
- struct vmcb *vmcb = svm->vmcb01.ptr;
-
- vmcb->control.intercepts[INTERCEPT_DR] = 0;
-
- /* DR7 access must remain intercepted for an SEV-ES guest */
- if (sev_es_guest(svm->vcpu.kvm)) {
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
- }
-
- recalc_intercepts(svm);
-}
-
-static inline void set_exception_intercept(struct vcpu_svm *svm, u32 bit)
-{
- struct vmcb *vmcb = svm->vmcb01.ptr;
-
- WARN_ON_ONCE(bit >= 32);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
-
- recalc_intercepts(svm);
-}
-
-static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
-{
- struct vmcb *vmcb = svm->vmcb01.ptr;
-
- WARN_ON_ONCE(bit >= 32);
- vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
-
- recalc_intercepts(svm);
-}
-
-static inline void svm_set_intercept(struct vcpu_svm *svm, int bit)
-{
- struct vmcb *vmcb = svm->vmcb01.ptr;
-
- vmcb_set_intercept(&vmcb->control, bit);
-
- recalc_intercepts(svm);
-}
-
-static inline void svm_clr_intercept(struct vcpu_svm *svm, int bit)
-{
- struct vmcb *vmcb = svm->vmcb01.ptr;
-
- vmcb_clr_intercept(&vmcb->control, bit);
-
- recalc_intercepts(svm);
-}
-
-static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
-{
- return vmcb_is_intercept(&svm->vmcb->control, bit);
-}
-
-static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
-{
- return svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
-}
-
-static inline struct vmcb *get_vgif_vmcb(struct vcpu_svm *svm)
-{
- if (!vgif)
- return NULL;
-
- if (is_guest_mode(&svm->vcpu) && !nested_vgif_enabled(svm))
- return svm->nested.vmcb02.ptr;
- else
- return svm->vmcb01.ptr;
-}
-
-static inline void enable_gif(struct vcpu_svm *svm)
-{
- struct vmcb *vmcb = get_vgif_vmcb(svm);
-
- if (vmcb)
- vmcb->control.int_ctl |= V_GIF_MASK;
- else
- svm->vcpu.arch.hflags |= HF_GIF_MASK;
-}
-
-static inline void disable_gif(struct vcpu_svm *svm)
-{
- struct vmcb *vmcb = get_vgif_vmcb(svm);
-
- if (vmcb)
- vmcb->control.int_ctl &= ~V_GIF_MASK;
- else
- svm->vcpu.arch.hflags &= ~HF_GIF_MASK;
-}
-
-static inline bool gif_set(struct vcpu_svm *svm)
-{
- struct vmcb *vmcb = get_vgif_vmcb(svm);
-
- if (vmcb)
- return !!(vmcb->control.int_ctl & V_GIF_MASK);
- else
- return !!(svm->vcpu.arch.hflags & HF_GIF_MASK);
-}
-
static inline bool nested_npt_enabled(struct vcpu_svm *svm)
{
return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE;
@@ -567,28 +389,6 @@ void svm_complete_interrupt_delivery(struct kvm_vcpu *vcpu, int delivery_mode,
#define NESTED_EXIT_DONE 1 /* Exit caused nested vmexit */
#define NESTED_EXIT_CONTINUE 2 /* Further checks needed */
-static inline bool nested_svm_virtualize_tpr(struct kvm_vcpu *vcpu)
-{
- struct vcpu_svm *svm = to_svm(vcpu);
-
- return is_guest_mode(vcpu) && (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK);
-}
-
-static inline bool nested_exit_on_smi(struct vcpu_svm *svm)
-{
- return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_SMI);
-}
-
-static inline bool nested_exit_on_intr(struct vcpu_svm *svm)
-{
- return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_INTR);
-}
-
-static inline bool nested_exit_on_nmi(struct vcpu_svm *svm)
-{
- return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_NMI);
-}
-
int enter_svm_guest_mode(struct kvm_vcpu *vcpu,
u64 vmcb_gpa, struct vmcb *vmcb12, bool from_vmrun);
void svm_leave_nested(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm/svm_onhyperv.c b/arch/x86/kvm/svm/svm_onhyperv.c
index 8cdc62c74a96..ae0a101329e6 100644
--- a/arch/x86/kvm/svm/svm_onhyperv.c
+++ b/arch/x86/kvm/svm/svm_onhyperv.c
@@ -8,6 +8,7 @@
#include <asm/mshyperv.h>
#include "svm.h"
+#include "vmcb.h"
#include "svm_ops.h"
#include "hyperv.h"
diff --git a/arch/x86/kvm/svm/vmcb.h b/arch/x86/kvm/svm/vmcb.h
new file mode 100644
index 000000000000..8757cda27e3a
--- /dev/null
+++ b/arch/x86/kvm/svm/vmcb.h
@@ -0,0 +1,211 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Kernel-based Virtual Machine driver for Linux
+ *
+ * AMD SVM support - VMCB accessors
+ */
+
+#ifndef __SVM_VMCB_H
+#define __SVM_VMCB_H
+
+#include "kvm_cache_regs.h"
+
+static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
+{
+ vmcb->control.clean = 0;
+}
+
+static inline void vmcb_mark_all_clean(struct vmcb *vmcb)
+{
+ vmcb->control.clean = VMCB_ALL_CLEAN_MASK
+ & ~VMCB_ALWAYS_DIRTY_MASK;
+}
+
+static inline void vmcb_mark_dirty(struct vmcb *vmcb, int bit)
+{
+ vmcb->control.clean &= ~(1 << bit);
+}
+
+static inline bool vmcb_is_dirty(struct vmcb *vmcb, int bit)
+{
+ return !test_bit(bit, (unsigned long *)&vmcb->control.clean);
+}
+
+static inline void vmcb_set_intercept(struct vmcb_control_area *control, u32 bit)
+{
+ WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+ __set_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline void vmcb_clr_intercept(struct vmcb_control_area *control, u32 bit)
+{
+ WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+ __clear_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline bool vmcb_is_intercept(struct vmcb_control_area *control, u32 bit)
+{
+ WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+ return test_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u32 bit)
+{
+ WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+ return test_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline void set_dr_intercepts(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ if (!sev_es_guest(svm->vcpu.kvm)) {
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE);
+ }
+
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
+
+ recalc_intercepts(svm);
+}
+
+static inline void clr_dr_intercepts(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb->control.intercepts[INTERCEPT_DR] = 0;
+
+ /* DR7 access must remain intercepted for an SEV-ES guest */
+ if (sev_es_guest(svm->vcpu.kvm)) {
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
+ }
+
+ recalc_intercepts(svm);
+}
+
+static inline void set_exception_intercept(struct vcpu_svm *svm, u32 bit)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ WARN_ON_ONCE(bit >= 32);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
+
+ recalc_intercepts(svm);
+}
+
+static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ WARN_ON_ONCE(bit >= 32);
+ vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
+
+ recalc_intercepts(svm);
+}
+
+static inline void svm_set_intercept(struct vcpu_svm *svm, int bit)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb_set_intercept(&vmcb->control, bit);
+
+ recalc_intercepts(svm);
+}
+
+static inline void svm_clr_intercept(struct vcpu_svm *svm, int bit)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb_clr_intercept(&vmcb->control, bit);
+
+ recalc_intercepts(svm);
+}
+
+static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
+{
+ return vmcb_is_intercept(&svm->vmcb->control, bit);
+}
+
+static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
+{
+ return svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
+}
+
+static inline struct vmcb *get_vgif_vmcb(struct vcpu_svm *svm)
+{
+ if (!vgif)
+ return NULL;
+
+ if (is_guest_mode(&svm->vcpu) && !nested_vgif_enabled(svm))
+ return svm->nested.vmcb02.ptr;
+ else
+ return svm->vmcb01.ptr;
+}
+
+static inline void enable_gif(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = get_vgif_vmcb(svm);
+
+ if (vmcb)
+ vmcb->control.int_ctl |= V_GIF_MASK;
+ else
+ svm->vcpu.arch.hflags |= HF_GIF_MASK;
+}
+
+static inline void disable_gif(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = get_vgif_vmcb(svm);
+
+ if (vmcb)
+ vmcb->control.int_ctl &= ~V_GIF_MASK;
+ else
+ svm->vcpu.arch.hflags &= ~HF_GIF_MASK;
+}
+
+static inline bool gif_set(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = get_vgif_vmcb(svm);
+
+ if (vmcb)
+ return !!(vmcb->control.int_ctl & V_GIF_MASK);
+ else
+ return !!(svm->vcpu.arch.hflags & HF_GIF_MASK);
+}
+
+static inline bool nested_svm_virtualize_tpr(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ return is_guest_mode(vcpu) && (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK);
+}
+
+static inline bool nested_exit_on_smi(struct vcpu_svm *svm)
+{
+ return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_SMI);
+}
+
+static inline bool nested_exit_on_intr(struct vcpu_svm *svm)
+{
+ return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_INTR);
+}
+
+static inline bool nested_exit_on_nmi(struct vcpu_svm *svm)
+{
+ return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_NMI);
+}
+
+#endif
--
2.31.1
Since registers are reachable through vcpu_svm, and we will
need to access more fields of that struct, pass it instead
of the regs[] array.
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kernel/asm-offsets.c | 6 ++++++
arch/x86/kvm/svm/svm.c | 2 +-
arch/x86/kvm/svm/svm.h | 2 +-
arch/x86/kvm/svm/vmenter.S | 36 +++++++++++++++++------------------
4 files changed, 26 insertions(+), 20 deletions(-)
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 90da275ad223..7f1dd1138117 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -20,6 +20,7 @@
#include <asm/tlbflush.h>
#include <asm/tdx.h>
#include "../kvm/vmx/vmx.h"
+#include "../kvm/svm/svm.h"
#ifdef CONFIG_XEN
#include <xen/interface/xen.h>
@@ -109,6 +110,11 @@ static void __used common(void)
OFFSET(TSS_sp1, tss_struct, x86_tss.sp1);
OFFSET(TSS_sp2, tss_struct, x86_tss.sp2);
+ if (IS_ENABLED(CONFIG_KVM_AMD)) {
+ BLANK();
+ OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
+ }
+
if (IS_ENABLED(CONFIG_KVM_INTEL)) {
BLANK();
OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b793cfdce68d..64f5f0544b4f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3932,7 +3932,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
* vmcb02 when switching vmcbs for nested virtualization.
*/
vmload(svm->vmcb01.pa);
- __svm_vcpu_run(vmcb_pa, (unsigned long *)&vcpu->arch.regs);
+ __svm_vcpu_run(vmcb_pa, svm);
vmsave(svm->vmcb01.pa);
vmload(__sme_page_pa(sd->save_area));
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 222856788153..5f8dfc9cd9a7 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -484,6 +484,6 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
/* vmenter.S */
void __svm_sev_es_vcpu_run(unsigned long vmcb_pa);
-void __svm_vcpu_run(unsigned long vmcb_pa, unsigned long *regs);
+void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm);
#endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 723f8534986c..8fac744361e5 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -8,23 +8,23 @@
#define WORD_SIZE (BITS_PER_LONG / 8)
/* Intentionally omit RAX as it's context switched by hardware */
-#define VCPU_RCX __VCPU_REGS_RCX * WORD_SIZE
-#define VCPU_RDX __VCPU_REGS_RDX * WORD_SIZE
-#define VCPU_RBX __VCPU_REGS_RBX * WORD_SIZE
+#define VCPU_RCX (SVM_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
+#define VCPU_RDX (SVM_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
+#define VCPU_RBX (SVM_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
/* Intentionally omit RSP as it's context switched by hardware */
-#define VCPU_RBP __VCPU_REGS_RBP * WORD_SIZE
-#define VCPU_RSI __VCPU_REGS_RSI * WORD_SIZE
-#define VCPU_RDI __VCPU_REGS_RDI * WORD_SIZE
+#define VCPU_RBP (SVM_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
+#define VCPU_RSI (SVM_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
+#define VCPU_RDI (SVM_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)
#ifdef CONFIG_X86_64
-#define VCPU_R8 __VCPU_REGS_R8 * WORD_SIZE
-#define VCPU_R9 __VCPU_REGS_R9 * WORD_SIZE
-#define VCPU_R10 __VCPU_REGS_R10 * WORD_SIZE
-#define VCPU_R11 __VCPU_REGS_R11 * WORD_SIZE
-#define VCPU_R12 __VCPU_REGS_R12 * WORD_SIZE
-#define VCPU_R13 __VCPU_REGS_R13 * WORD_SIZE
-#define VCPU_R14 __VCPU_REGS_R14 * WORD_SIZE
-#define VCPU_R15 __VCPU_REGS_R15 * WORD_SIZE
+#define VCPU_R8 (SVM_vcpu_arch_regs + __VCPU_REGS_R8 * WORD_SIZE)
+#define VCPU_R9 (SVM_vcpu_arch_regs + __VCPU_REGS_R9 * WORD_SIZE)
+#define VCPU_R10 (SVM_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
+#define VCPU_R11 (SVM_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
+#define VCPU_R12 (SVM_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
+#define VCPU_R13 (SVM_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
+#define VCPU_R14 (SVM_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
+#define VCPU_R15 (SVM_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
#endif
.section .noinstr.text, "ax"
@@ -32,7 +32,7 @@
/**
* __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
* @vmcb_pa: unsigned long
- * @regs: unsigned long * (to guest registers)
+ * @svm: struct vcpu_svm *
*/
SYM_FUNC_START(__svm_vcpu_run)
push %_ASM_BP
@@ -47,13 +47,13 @@ SYM_FUNC_START(__svm_vcpu_run)
#endif
push %_ASM_BX
- /* Save @regs. */
+ /* Save @svm. */
push %_ASM_ARG2
/* Save @vmcb. */
push %_ASM_ARG1
- /* Move @regs to RAX. */
+ /* Move @svm to RAX. */
mov %_ASM_ARG2, %_ASM_AX
/* Load guest registers. */
@@ -89,7 +89,7 @@ SYM_FUNC_START(__svm_vcpu_run)
FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
#endif
- /* "POP" @regs to RAX. */
+ /* "POP" @svm to RAX. */
pop %_ASM_AX
/* Save all guest registers. */
--
2.31.1
Hi Paolo,
I love your patch! Yet something to improve:
[auto build test ERROR on kvm/queue]
[also build test ERROR on mst-vhost/linux-next linus/master v6.1-rc2]
[cannot apply to tip/x86/core next-20221028]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Paolo-Bonzini/KVM-SVM-move-MSR_IA32_SPEC_CTRL-save-restore-to-assembly/20221029-071449
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git queue
patch link: https://lore.kernel.org/r/20221028230723.3254250-5-pbonzini%40redhat.com
patch subject: [PATCH 4/7] KVM: SVM: replace argument of __svm_vcpu_run with vcpu_svm
config: x86_64-allyesconfig
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/2c7e6ae97f17bc625add1e9f85880f7bde7eecc9
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Paolo-Bonzini/KVM-SVM-move-MSR_IA32_SPEC_CTRL-save-restore-to-assembly/20221029-071449
git checkout 2c7e6ae97f17bc625add1e9f85880f7bde7eecc9
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
ld: kernel image bigger than KERNEL_IMAGE_SIZE
ld: vmlinux.o: in function `__svm_vcpu_run':
>> (.noinstr.text+0x1ee6): undefined reference to `SVM_vcpu_arch_regs'
>> ld: (.noinstr.text+0x1eed): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1ef4): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1efb): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1f02): undefined reference to `SVM_vcpu_arch_regs'
ld: vmlinux.o:(.noinstr.text+0x1f09): more undefined references to `SVM_vcpu_arch_regs' follow
--
0-DAY CI Kernel Test Service
https://01.org/lkp
Hi Paolo,
I love your patch! Yet something to improve:
[auto build test ERROR on kvm/queue]
[also build test ERROR on mst-vhost/linux-next linus/master v6.1-rc2]
[cannot apply to tip/x86/core next-20221028]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Paolo-Bonzini/KVM-SVM-move-MSR_IA32_SPEC_CTRL-save-restore-to-assembly/20221029-071449
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git queue
patch link: https://lore.kernel.org/r/20221028230723.3254250-7-pbonzini%40redhat.com
patch subject: [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
config: x86_64-allyesconfig
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/63e749bdae3bd901e8710a6b15ed42d0c08853b6
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Paolo-Bonzini/KVM-SVM-move-MSR_IA32_SPEC_CTRL-save-restore-to-assembly/20221029-071449
git checkout 63e749bdae3bd901e8710a6b15ed42d0c08853b6
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All error/warnings (new ones prefixed by >>):
vmlinux.o: warning: objtool: kasan_report+0x12: call to stackleak_track_stack() with UACCESS enabled
>> vmlinux.o: warning: objtool: svm_vcpu_enter_exit+0x1e: call to msr_write_intercepted.constprop.0() leaves .noinstr.text section
vmlinux.o: warning: objtool: check_stackleak_irqoff+0x309: call to _printk() leaves .noinstr.text section
--
ld: kernel image bigger than KERNEL_IMAGE_SIZE
ld: vmlinux.o: in function `__svm_vcpu_run':
>> (.noinstr.text+0x1f0b): undefined reference to `SVM_spec_ctrl'
ld: (.noinstr.text+0x1f26): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1f2d): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1f34): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1f3b): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1f42): undefined reference to `SVM_vcpu_arch_regs'
ld: vmlinux.o:(.noinstr.text+0x1f49): more undefined references to `SVM_vcpu_arch_regs' follow
ld: vmlinux.o: in function `__svm_vcpu_run':
(.noinstr.text+0x2023): undefined reference to `SVM_spec_ctrl'
>> ld: (.noinstr.text+0x2031): undefined reference to `SVM_spec_ctrl'
ld: vmlinux.o: in function `__svm_sev_es_vcpu_run':
(.noinstr.text+0x20ab): undefined reference to `SVM_spec_ctrl'
ld: (.noinstr.text+0x20fc): undefined reference to `SVM_spec_ctrl'
ld: (.noinstr.text+0x210a): undefined reference to `SVM_spec_ctrl'
--
0-DAY CI Kernel Test Service
https://01.org/lkp
On Fri, Oct 28, 2022, Paolo Bonzini wrote:
> Registers are reachable through vcpu_vmx, no need to pass
> a separate pointer to the regs[] array.
>
> No functional change intended.
>
> Signed-off-by: Paolo Bonzini <[email protected]>
> ---
> arch/x86/kernel/asm-offsets.c | 1 +
> arch/x86/kvm/vmx/nested.c | 3 +-
> arch/x86/kvm/vmx/vmenter.S | 58 +++++++++++++++--------------------
> arch/x86/kvm/vmx/vmx.c | 3 +-
> arch/x86/kvm/vmx/vmx.h | 3 +-
> 5 files changed, 29 insertions(+), 39 deletions(-)
>
> diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> index cb50589a7102..90da275ad223 100644
> --- a/arch/x86/kernel/asm-offsets.c
> +++ b/arch/x86/kernel/asm-offsets.c
> @@ -111,6 +111,7 @@ static void __used common(void)
>
> if (IS_ENABLED(CONFIG_KVM_INTEL)) {
> BLANK();
> + OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
outside of KVM? We (Google) want to explore loading multiple instances of KVM,
i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
migration between versions of KVM to upgrade/rollback KVM without changing the
kernel (RFC coming soon-ish). IIRC, asm-offsets is the only place where I haven't
been able to figure out a simple way to avoid exposing KVM's internal structures
outside of KVM (so that the structures can change across KVM instances without
breaking kernel code).
On Mon, Oct 31, 2022 at 05:37:46PM +0000, Sean Christopherson wrote:
> On Fri, Oct 28, 2022, Paolo Bonzini wrote:
> > Registers are reachable through vcpu_vmx, no need to pass
> > a separate pointer to the regs[] array.
> >
> > No functional change intended.
> >
> > Signed-off-by: Paolo Bonzini <[email protected]>
> > ---
> > arch/x86/kernel/asm-offsets.c | 1 +
> > arch/x86/kvm/vmx/nested.c | 3 +-
> > arch/x86/kvm/vmx/vmenter.S | 58 +++++++++++++++--------------------
> > arch/x86/kvm/vmx/vmx.c | 3 +-
> > arch/x86/kvm/vmx/vmx.h | 3 +-
> > 5 files changed, 29 insertions(+), 39 deletions(-)
> >
> > diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> > index cb50589a7102..90da275ad223 100644
> > --- a/arch/x86/kernel/asm-offsets.c
> > +++ b/arch/x86/kernel/asm-offsets.c
> > @@ -111,6 +111,7 @@ static void __used common(void)
> >
> > if (IS_ENABLED(CONFIG_KVM_INTEL)) {
> > BLANK();
> > + OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
>
> Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
> outside of KVM? We (Google) want to explore loading multiple instances of KVM,
> i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
> migration between versions of KVM to upgrade/rollback KVM without changing the
> kernel (RFC coming soon-ish). IIRC, asm-offsets is the only place where I haven't
> been able to figure out a simple way to avoid exposing KVM's internal structures
> outside of KVM (so that the structures can change across KVM instances without
> breaking kernel code).
Is that really a problem? Would it even make sense for non-KVM kernel
code to use 'vcpu_vmx' anyway? It already seems to be private.
asm-offsets.c has to "cheat" to get access to it by including
"../kvm/vmx/vmx.h".
So the only concern is exposing the asm offsets, right? But it seems
highly unlikely any non-KVM code would be using those either.
And, that would be a bug anyway: module code is subject to change and
could always get recompiled. The kernel proper shouldn't be making any
assumptions about the layouts of module-owned structs.
--
Josh
On Tue, Nov 01, 2022, Josh Poimboeuf wrote:
> On Mon, Oct 31, 2022 at 05:37:46PM +0000, Sean Christopherson wrote:
> > > diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> > > index cb50589a7102..90da275ad223 100644
> > > --- a/arch/x86/kernel/asm-offsets.c
> > > +++ b/arch/x86/kernel/asm-offsets.c
> > > @@ -111,6 +111,7 @@ static void __used common(void)
> > >
> > > if (IS_ENABLED(CONFIG_KVM_INTEL)) {
> > > BLANK();
> > > + OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
> >
> > Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
> > outside of KVM? We (Google) want to explore loading multiple instances of KVM,
> > i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
> > migration between versions of KVM to upgrade/rollback KVM without changing the
> > kernel (RFC coming soon-ish). IIRC, asm-offsets is the only place where I haven't
> > been able to figure out a simple way to avoid exposing KVM's internal structures
> > outside of KVM (so that the structures can change across KVM instances without
> > breaking kernel code).
>
> Is that really a problem? Would it even make sense for non-KVM kernel
> code to use 'vcpu_vmx' anyway?
vcpu_vmx itself isn't a problem as non-KVM kernel code _shouldn't_ be using
vcpu_vmx, but I want to go beyond "shouldn't" and make it all-but-impossible for
non-KVM code to reference internal KVM structures/state, e.g. I want to bury all
kvm_host.h headers in kvm/ code instead of exposing them in include/asm/.
On Fri, Oct 28, 2022 at 07:07:22PM -0400, Paolo Bonzini wrote:
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3918,10 +3918,21 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
> struct vcpu_svm *svm = to_svm(vcpu);
> unsigned long vmcb_pa = svm->current_vmcb->pa;
>
> + /*
> + * For non-nested case:
> + * If the L01 MSR bitmap does not intercept the MSR, then we need to
> + * save it.
> + *
> + * For nested case:
> + * If the L02 MSR bitmap does not intercept the MSR, then we need to
> + * save it.
> + */
> + bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);
This triggers a warning:
vmlinux.o: warning: objtool: svm_vcpu_enter_exit+0x3d: call to svm_msrpm_offset() leaves .noinstr.text section
svm_vcpu_enter_exit() is noinstr, but it's calling
msr_write_intercepted() which is not.
That's why in the VMX code I did the call to msr_write_intercepted() (in
__vmx_vcpu_run_flags) before calling vmx_vcpu_enter_exit().
--
Josh
On 11/2/22 16:28, Josh Poimboeuf wrote:
> On Fri, Oct 28, 2022 at 07:07:22PM -0400, Paolo Bonzini wrote:
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -3918,10 +3918,21 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
>> struct vcpu_svm *svm = to_svm(vcpu);
>> unsigned long vmcb_pa = svm->current_vmcb->pa;
>>
>> + /*
>> + * For non-nested case:
>> + * If the L01 MSR bitmap does not intercept the MSR, then we need to
>> + * save it.
>> + *
>> + * For nested case:
>> + * If the L02 MSR bitmap does not intercept the MSR, then we need to
>> + * save it.
>> + */
>> + bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);
>
> This triggers a warning:
>
> vmlinux.o: warning: objtool: svm_vcpu_enter_exit+0x3d: call to svm_msrpm_offset() leaves .noinstr.text section
>
> svm_vcpu_enter_exit() is noinstr, but it's calling
> msr_write_intercepted() which is not.
I suspect I didn't see it because it's inlined here, but it has to be
fixed indeed.
> That's why in the VMX code I did the call to msr_write_intercepted() (in
> __vmx_vcpu_run_flags) before calling vmx_vcpu_enter_exit().
Yes, it's easy to do the same and do it in svm_vcpu_run().
Paolo
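[A minimal sketch of the fix described above, for illustration only:
compute the interception state in svm_vcpu_run(), which is
instrumentable, and pass it into the noinstr inner function. The
extra parameter on svm_vcpu_enter_exit() is an assumption of this
sketch, and the vmload/vmsave bookkeeping of the real function is
elided.

  static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
                                          bool spec_ctrl_intercepted)
  {
          struct vcpu_svm *svm = to_svm(vcpu);
          unsigned long vmcb_pa = svm->current_vmcb->pa;

          guest_state_enter_irqoff();
          if (sev_es_guest(vcpu->kvm))
                  __svm_sev_es_vcpu_run(vmcb_pa, svm, spec_ctrl_intercepted);
          else
                  __svm_vcpu_run(vmcb_pa, svm, spec_ctrl_intercepted);
          guest_state_exit_irqoff();
  }

  /* In svm_vcpu_run(), outside the noinstr section: */
  svm_vcpu_enter_exit(vcpu,
                      msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL));
]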
On 10/31/22 18:37, Sean Christopherson wrote:
>> if (IS_ENABLED(CONFIG_KVM_INTEL)) {
>> BLANK();
>> + OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
> Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
> outside of KVM? We (Google) want to explore loading multiple instances of KVM,
> i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
> migration between versions of KVM to upgrade/rollback KVM without changing the
> kernel (RFC coming soon-ish). IIRC, asm-offsets is the only place where I haven't
> been able to figure out a simple way to avoid exposing KVM's internal structures
> outside of KVM (so that the structures can change across KVM instances without
> breaking kernel code).
>
It's possible to create our own asm-offsets.h file using something
similar to the logic in Kbuild:
#####
# Generate asm-offsets.h
offsets-file := include/generated/asm-offsets.h
always-y += $(offsets-file)
targets += arch/$(SRCARCH)/kernel/asm-offsets.s
arch/$(SRCARCH)/kernel/asm-offsets.s: $(timeconst-file) $(bounds-file)
$(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
$(call filechk,offsets,__ASM_OFFSETS_H__)
Paolo