2022-10-28 23:11:54

by Paolo Bonzini

Subject: [RFC PATCH 0/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly

On the Intel side, restoration of the guest's IA32_SPEC_CTRL is done
as late as possible before vmentry, with the comment:

* IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
* there must not be any returns or indirect branches between this code
* and vmentry.

On AMD, there is no need to avoid returns or indirect branches between
wrmsr and vmrun because Linux doesn't use IBRS; however, restoration
of the host IA32_SPEC_CTRL value is definitely way too late. With
respect to the user/kernel boundary, AMD says, "If software chooses to
toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." Assuming the same requirements apply to the guest/host
boundary, KVM does not respect this recommendation: the return thunk
training sequence is in vmenter.S, quite close to the VM-exit, while
the host's IA32_SPEC_CTRL value is only restored much later for hosts
without V_SPEC_CTRL.

In the absence of clarification from AMD, move all the SPEC_CTRL
handling to assembly code and, in passing, also make the Intel and AMD
code a bit more similar to each other.

Patches 1-2 are the Intel side, which is just a cleanup.

Patch 3 prepares arch/x86/kvm/svm/svm.h for the addition of asm-offsets.c
entries, and patches 4-5 apply to SVM a cleanup similar to the earlier
VMX ones.

Patch 6 is the bulk of the change, and finally patch 7 removes now-dead
code in asm/spec-ctrl.h and arch/x86/kernel/.

This is RFC because I haven't tested SEV-ES or 32-bit yet.

Paolo

Paolo Bonzini (7):
KVM: VMX: remove regs argument of __vmx_vcpu_run
KVM: VMX: more cleanups to __vmx_vcpu_run
KVM: SVM: extract VMCB accessors to a new file
KVM: SVM: replace argument of __svm_vcpu_run with vcpu_svm
KVM: SVM: adjust register allocation for __svm_vcpu_run
KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and
callers

arch/x86/include/asm/spec-ctrl.h | 10 +-
arch/x86/kernel/asm-offsets.c | 8 ++
arch/x86/kernel/cpu/bugs.c | 15 +--
arch/x86/kvm/svm/avic.c | 1 +
arch/x86/kvm/svm/nested.c | 1 +
arch/x86/kvm/svm/sev.c | 1 +
arch/x86/kvm/svm/svm.c | 39 +++---
arch/x86/kvm/svm/svm.h | 204 +-----------------------------
arch/x86/kvm/svm/svm_onhyperv.c | 1 +
arch/x86/kvm/svm/vmcb.h | 211 +++++++++++++++++++++++++++++++
arch/x86/kvm/svm/vmenter.S | 164 ++++++++++++++++++------
arch/x86/kvm/vmx/nested.c | 3 +-
arch/x86/kvm/vmx/vmenter.S | 92 ++++++--------
arch/x86/kvm/vmx/vmx.c | 3 +-
arch/x86/kvm/vmx/vmx.h | 3 +-
15 files changed, 419 insertions(+), 337 deletions(-)
create mode 100644 arch/x86/kvm/svm/vmcb.h

--
2.31.1



2022-10-28 23:12:45

by Paolo Bonzini

Subject: [PATCH 2/7] KVM: VMX: more cleanups to __vmx_vcpu_run

Slightly improve register allocation by loading the @vmx pointer only
once before vmlaunch/vmresume, instead of reloading it from the stack.

Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kvm/vmx/vmenter.S | 40 +++++++++++++++++---------------------
1 file changed, 18 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 1362fe5859f9..0aea6b348a96 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -81,13 +81,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
* there must not be any returns or indirect branches between this code
* and vmentry.
*/
- movl VMX_spec_ctrl(%_ASM_DI), %edi
+ movl VMX_spec_ctrl(%_ASM_DI), %eax
movl PER_CPU_VAR(x86_spec_ctrl_current), %esi
- cmp %edi, %esi
+ cmp %eax, %esi
je .Lspec_ctrl_done
mov $MSR_IA32_SPEC_CTRL, %ecx
xor %edx, %edx
- mov %edi, %eax
wrmsr

.Lspec_ctrl_done:
@@ -97,31 +96,28 @@ SYM_FUNC_START(__vmx_vcpu_run)
* an LFENCE to stop speculation from skipping the wrmsr.
*/

- /* Load @vmx to RAX. */
- mov WORD_SIZE(%_ASM_SP), %_ASM_AX
-
/* Check if vmlaunch or vmresume is needed */
testb $VMX_RUN_VMRESUME, %bl

/* Load guest registers. Don't clobber flags. */
- mov VCPU_RCX(%_ASM_AX), %_ASM_CX
- mov VCPU_RDX(%_ASM_AX), %_ASM_DX
- mov VCPU_RBX(%_ASM_AX), %_ASM_BX
- mov VCPU_RBP(%_ASM_AX), %_ASM_BP
- mov VCPU_RSI(%_ASM_AX), %_ASM_SI
- mov VCPU_RDI(%_ASM_AX), %_ASM_DI
+ mov VCPU_RAX(%_ASM_DI), %_ASM_AX
+ mov VCPU_RCX(%_ASM_DI), %_ASM_CX
+ mov VCPU_RDX(%_ASM_DI), %_ASM_DX
+ mov VCPU_RBX(%_ASM_DI), %_ASM_BX
+ mov VCPU_RBP(%_ASM_DI), %_ASM_BP
+ mov VCPU_RSI(%_ASM_DI), %_ASM_SI
#ifdef CONFIG_X86_64
- mov VCPU_R8 (%_ASM_AX), %r8
- mov VCPU_R9 (%_ASM_AX), %r9
- mov VCPU_R10(%_ASM_AX), %r10
- mov VCPU_R11(%_ASM_AX), %r11
- mov VCPU_R12(%_ASM_AX), %r12
- mov VCPU_R13(%_ASM_AX), %r13
- mov VCPU_R14(%_ASM_AX), %r14
- mov VCPU_R15(%_ASM_AX), %r15
+ mov VCPU_R8 (%_ASM_DI), %r8
+ mov VCPU_R9 (%_ASM_DI), %r9
+ mov VCPU_R10(%_ASM_DI), %r10
+ mov VCPU_R11(%_ASM_DI), %r11
+ mov VCPU_R12(%_ASM_DI), %r12
+ mov VCPU_R13(%_ASM_DI), %r13
+ mov VCPU_R14(%_ASM_DI), %r14
+ mov VCPU_R15(%_ASM_DI), %r15
#endif
- /* Load guest RAX. This kills the @vmx pointer! */
- mov VCPU_RAX(%_ASM_AX), %_ASM_AX
+ /* Load guest RDI. This kills the @vmx pointer! */
+ mov VCPU_RDI(%_ASM_DI), %_ASM_DI

/* Check EFLAGS.ZF from 'testb' above */
jz .Lvmlaunch
--
2.31.1



2022-10-28 23:13:04

by Paolo Bonzini

Subject: [PATCH 7/7] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers

x86_virt_spec_ctrl only deals with the paravirtualized
MSR_IA32_VIRT_SPEC_CTRL now and does not handle MSR_IA32_SPEC_CTRL
anymore; remove the corresponding, unused argument.

Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/include/asm/spec-ctrl.h | 10 +++++-----
arch/x86/kernel/cpu/bugs.c | 2 +-
arch/x86/kvm/svm/svm.c | 4 ++--
3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index 5393babc0598..cb0386fc4dc3 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -13,7 +13,7 @@
* Takes the guest view of SPEC_CTRL MSR as a parameter and also
* the guest's version of VIRT_SPEC_CTRL, if emulated.
*/
-extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool guest);
+extern void x86_virt_spec_ctrl(u64 guest_virt_spec_ctrl, bool guest);

/**
* x86_spec_ctrl_set_guest - Set speculation control registers for the guest
@@ -24,9 +24,9 @@ extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bo
* Avoids writing to the MSR if the content/bits are the same
*/
static inline
-void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
+void x86_spec_ctrl_set_guest(u64 guest_virt_spec_ctrl)
{
- x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, true);
+ x86_virt_spec_ctrl(guest_virt_spec_ctrl, true);
}

/**
@@ -38,9 +38,9 @@ void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
* Avoids writing to the MSR if the content/bits are the same
*/
static inline
-void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
+void x86_spec_ctrl_restore_host(u64 guest_virt_spec_ctrl)
{
- x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, false);
+ x86_virt_spec_ctrl(guest_virt_spec_ctrl, false);
}

/* AMD specific Speculative Store Bypass MSR data */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6ec0b7ce7453..3e3230cccaa7 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -200,7 +200,7 @@ void __init check_bugs(void)
* MSR_IA32_SPEC_CTRL for SSBD.
*/
void
-x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
+x86_virt_spec_ctrl(u64 guest_virt_spec_ctrl, bool setguest)
{
u64 guestval, hostval;
struct thread_info *ti = current_thread_info();
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a79cdeebc181..7b87a4e43708 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4011,7 +4011,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
* being speculatively taken.
*/
if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
- x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
+ x86_spec_ctrl_set_guest(svm->virt_spec_ctrl);

svm_vcpu_enter_exit(vcpu);

@@ -4019,7 +4019,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
reload_tss(vcpu);

if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
- x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
+ x86_spec_ctrl_restore_host(svm->virt_spec_ctrl);

if (!sev_es_guest(vcpu->kvm)) {
vcpu->arch.cr2 = svm->vmcb->save.cr2;
--
2.31.1


2022-10-28 23:20:36

by Paolo Bonzini

Subject: [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly

Restoration of the host IA32_SPEC_CTRL value is probably too late
with respect to the return thunk training sequence.

With respect to the user/kernel boundary, AMD says, "If software chooses
to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." I assume the same requirements apply to the guest/host
boundary. The return thunk training sequence is in vmenter.S, quite close
to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
IA32_SPEC_CTRL value is not restored until much later.

To avoid this, move the restoration of host SPEC_CTRL to assembly and,
for consistency, move the restoration of the guest SPEC_CTRL as well.
This is not particularly difficult, apart from some care to cover both
32- and 64-bit, and to share code between SEV-ES and normal vmentry.

Suggested-by: Jim Mattson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kernel/asm-offsets.c | 1 +
arch/x86/kernel/cpu/bugs.c | 13 ++---
arch/x86/kvm/svm/svm.c | 34 +++++--------
arch/x86/kvm/svm/svm.h | 4 +-
arch/x86/kvm/svm/vmenter.S | 93 +++++++++++++++++++++++++++++++++--
5 files changed, 109 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 7f1dd1138117..9dad8849c1ef 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -113,6 +113,7 @@ static void __used common(void)
if (IS_ENABLED(CONFIG_KVM_AMD)) {
BLANK();
OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
+ OFFSET(SVM_spec_ctrl, vcpu_svm, spec_ctrl);
}

if (IS_ENABLED(CONFIG_KVM_INTEL)) {
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index da7c361f47e0..6ec0b7ce7453 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -196,22 +196,15 @@ void __init check_bugs(void)
}

/*
- * NOTE: This function is *only* called for SVM. VMX spec_ctrl handling is
- * done in vmenter.S.
+ * NOTE: This function is *only* called for SVM, since Intel uses
+ * MSR_IA32_SPEC_CTRL for SSBD.
*/
void
x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
{
- u64 msrval, guestval = guest_spec_ctrl, hostval = spec_ctrl_current();
+ u64 guestval, hostval;
struct thread_info *ti = current_thread_info();

- if (static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
- if (hostval != guestval) {
- msrval = setguest ? guestval : hostval;
- wrmsrl(MSR_IA32_SPEC_CTRL, msrval);
- }
- }
-
/*
* If SSBD is not handled in MSR_SPEC_CTRL on AMD, update
* MSR_AMD64_L2_CFG or MSR_VIRT_SPEC_CTRL if supported.
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 64f5f0544b4f..a79cdeebc181 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3918,10 +3918,21 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
unsigned long vmcb_pa = svm->current_vmcb->pa;

+ /*
+ * For non-nested case:
+ * If the L01 MSR bitmap does not intercept the MSR, then we need to
+ * save it.
+ *
+ * For nested case:
+ * If the L02 MSR bitmap does not intercept the MSR, then we need to
+ * save it.
+ */
+ bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);
+
guest_state_enter_irqoff();

if (sev_es_guest(vcpu->kvm)) {
- __svm_sev_es_vcpu_run(vmcb_pa);
+ __svm_sev_es_vcpu_run(vmcb_pa, svm, spec_ctrl_intercepted);
} else {
struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);

@@ -3932,7 +3943,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
* vmcb02 when switching vmcbs for nested virtualization.
*/
vmload(svm->vmcb01.pa);
- __svm_vcpu_run(vmcb_pa, svm);
+ __svm_vcpu_run(vmcb_pa, svm, spec_ctrl_intercepted);
vmsave(svm->vmcb01.pa);

vmload(__sme_page_pa(sd->save_area));
@@ -4004,25 +4015,6 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)

svm_vcpu_enter_exit(vcpu);

- /*
- * We do not use IBRS in the kernel. If this vCPU has used the
- * SPEC_CTRL MSR it may have left it on; save the value and
- * turn it off. This is much more efficient than blindly adding
- * it to the atomic save/restore list. Especially as the former
- * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
- *
- * For non-nested case:
- * If the L01 MSR bitmap does not intercept the MSR, then we need to
- * save it.
- *
- * For nested case:
- * If the L02 MSR bitmap does not intercept the MSR, then we need to
- * save it.
- */
- if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL) &&
- unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
- svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
-
if (!sev_es_guest(vcpu->kvm))
reload_tss(vcpu);

diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 5f8dfc9cd9a7..f61c05116ea5 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -483,7 +483,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);

/* vmenter.S */

-void __svm_sev_es_vcpu_run(unsigned long vmcb_pa);
-void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm);
+void __svm_sev_es_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm, bool spec_ctrl_intercepted);
+void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm, bool spec_ctrl_intercepted);

#endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 51a63a47d74f..89402e450071 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -29,10 +29,65 @@

.section .noinstr.text, "ax"

+.macro RESTORE_GUEST_SPEC_CTRL
+ /* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
+ ALTERNATIVE_2 "jmp 999f", \
+ "", X86_FEATURE_MSR_SPEC_CTRL, \
+ "jmp 999f", X86_FEATURE_V_SPEC_CTRL
+
+ /*
+ * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
+ * host's, write the MSR.
+ *
+ * IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
+ * there must not be any returns or indirect branches between this code
+ * and vmentry.
+ */
+ movl SVM_spec_ctrl(%_ASM_DI), %eax
+ cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
+ je 999f
+ mov $MSR_IA32_SPEC_CTRL, %ecx
+ xor %edx, %edx
+ wrmsr
+999:
+
+.endm
+
+.macro RESTORE_HOST_SPEC_CTRL
+ /* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
+ ALTERNATIVE_2 "jmp 999f", \
+ "", X86_FEATURE_MSR_SPEC_CTRL, \
+ "jmp 999f", X86_FEATURE_V_SPEC_CTRL
+
+ mov $MSR_IA32_SPEC_CTRL, %ecx
+
+ /*
+ * Load the value that the guest had written into MSR_IA32_SPEC_CTRL,
+ * if it was not intercepted during guest execution.
+ */
+ cmpb $0, (%_ASM_SP)
+ jnz 998f
+ rdmsr
+ movl %eax, SVM_spec_ctrl(%_ASM_DI)
+998:
+
+ /* Now restore the host value of the MSR if different from the guest's. */
+ movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
+ cmp SVM_spec_ctrl(%_ASM_DI), %eax
+ je 999f
+ xor %edx, %edx
+ wrmsr
+999:
+
+.endm
+
+
+
/**
* __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
* @vmcb_pa: unsigned long
* @svm: struct vcpu_svm *
+ * @spec_ctrl_intercepted: bool
*/
SYM_FUNC_START(__svm_vcpu_run)
push %_ASM_BP
@@ -47,6 +102,9 @@ SYM_FUNC_START(__svm_vcpu_run)
#endif
push %_ASM_BX

+ /* Save @spec_ctrl_intercepted. */
+ push %_ASM_ARG3
+
/* Save @svm. */
push %_ASM_ARG2

@@ -56,6 +114,7 @@ SYM_FUNC_START(__svm_vcpu_run)
/* Move @svm to RDI. */
mov %_ASM_ARG2, %_ASM_DI

+ RESTORE_GUEST_SPEC_CTRL

/* "POP" @vmcb to RAX. */
pop %_ASM_AX
@@ -90,7 +149,7 @@ SYM_FUNC_START(__svm_vcpu_run)
FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
#endif

- /* "POP" @svm to RAX. */
+ /* "POP" @svm to RAX while it's the only available register. */
pop %_ASM_AX

/* Save all guest registers. */
@@ -111,6 +170,9 @@ SYM_FUNC_START(__svm_vcpu_run)
mov %r15, VCPU_R15(%_ASM_AX)
#endif

+ mov %_ASM_AX, %_ASM_DI
+ RESTORE_HOST_SPEC_CTRL
+
/*
* Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
* untrained as soon as we exit the VM and are back to the
@@ -146,6 +208,9 @@ SYM_FUNC_START(__svm_vcpu_run)
xor %r15d, %r15d
#endif

+ /* "Pop" @spec_ctrl_intercepted. */
+ pop %_ASM_BX
+
pop %_ASM_BX

#ifdef CONFIG_X86_64
@@ -171,6 +236,8 @@ SYM_FUNC_END(__svm_vcpu_run)
/**
* __svm_sev_es_vcpu_run - Run a SEV-ES vCPU via a transition to SVM guest mode
* @vmcb_pa: unsigned long
+ * @svm: struct vcpu_svm *
+ * @spec_ctrl_intercepted: bool
*/
SYM_FUNC_START(__svm_sev_es_vcpu_run)
push %_ASM_BP
@@ -185,8 +252,22 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
#endif
push %_ASM_BX

- /* Move @vmcb to RAX. */
- mov %_ASM_ARG1, %_ASM_AX
+ /* Save @spec_ctrl_intercepted. */
+ push %_ASM_ARG3
+
+ /* Save @svm. */
+ push %_ASM_ARG2
+
+ /* Save @vmcb. */
+ push %_ASM_ARG1
+
+ /* Move @svm to RDI for RESTORE_GUEST_SPEC_CTRL. */
+ mov %_ASM_ARG2, %_ASM_DI
+
+ RESTORE_GUEST_SPEC_CTRL
+
+ /* Pop @vmcb to RAX. */
+ pop %_ASM_AX

/* Enter guest mode */
sti
@@ -200,6 +281,9 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
#endif

+ pop %_ASM_DI
+ RESTORE_HOST_SPEC_CTRL
+
/*
* Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
* untrained as soon as we exit the VM and are back to the
@@ -209,6 +293,9 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
*/
UNTRAIN_RET

+ /* "Pop" @spec_ctrl_intercepted. */
+ pop %_ASM_BX
+
pop %_ASM_BX

#ifdef CONFIG_X86_64
--
2.31.1



2022-10-28 23:21:50

by Paolo Bonzini

Subject: [PATCH 5/7] KVM: SVM: adjust register allocation for __svm_vcpu_run

In preparation for moving SPEC_CTRL access to __svm_vcpu_run,
keep the pointer to the struct vcpu_svm in %rdi, which is not
clobbered by rdmsr/wrmsr.

Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kvm/svm/vmenter.S | 39 +++++++++++++++++++-------------------
1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 8fac744361e5..51a63a47d74f 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -53,30 +53,31 @@ SYM_FUNC_START(__svm_vcpu_run)
/* Save @vmcb. */
push %_ASM_ARG1

- /* Move @svm to RAX. */
- mov %_ASM_ARG2, %_ASM_AX
+ /* Move @svm to RDI. */
+ mov %_ASM_ARG2, %_ASM_DI

- /* Load guest registers. */
- mov VCPU_RCX(%_ASM_AX), %_ASM_CX
- mov VCPU_RDX(%_ASM_AX), %_ASM_DX
- mov VCPU_RBX(%_ASM_AX), %_ASM_BX
- mov VCPU_RBP(%_ASM_AX), %_ASM_BP
- mov VCPU_RSI(%_ASM_AX), %_ASM_SI
- mov VCPU_RDI(%_ASM_AX), %_ASM_DI
-#ifdef CONFIG_X86_64
- mov VCPU_R8 (%_ASM_AX), %r8
- mov VCPU_R9 (%_ASM_AX), %r9
- mov VCPU_R10(%_ASM_AX), %r10
- mov VCPU_R11(%_ASM_AX), %r11
- mov VCPU_R12(%_ASM_AX), %r12
- mov VCPU_R13(%_ASM_AX), %r13
- mov VCPU_R14(%_ASM_AX), %r14
- mov VCPU_R15(%_ASM_AX), %r15
-#endif

/* "POP" @vmcb to RAX. */
pop %_ASM_AX

+ /* Load guest registers. */
+ mov VCPU_RCX(%_ASM_DI), %_ASM_CX
+ mov VCPU_RDX(%_ASM_DI), %_ASM_DX
+ mov VCPU_RBX(%_ASM_DI), %_ASM_BX
+ mov VCPU_RBP(%_ASM_DI), %_ASM_BP
+ mov VCPU_RSI(%_ASM_DI), %_ASM_SI
+#ifdef CONFIG_X86_64
+ mov VCPU_R8 (%_ASM_DI), %r8
+ mov VCPU_R9 (%_ASM_DI), %r9
+ mov VCPU_R10(%_ASM_DI), %r10
+ mov VCPU_R11(%_ASM_DI), %r11
+ mov VCPU_R12(%_ASM_DI), %r12
+ mov VCPU_R13(%_ASM_DI), %r13
+ mov VCPU_R14(%_ASM_DI), %r14
+ mov VCPU_R15(%_ASM_DI), %r15
+#endif
+ mov VCPU_RDI(%_ASM_DI), %_ASM_DI
+
/* Enter guest mode */
sti

--
2.31.1



2022-10-28 23:24:30

by Paolo Bonzini

Subject: [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run

Registers are reachable through vcpu_vmx, no need to pass
a separate pointer to the regs[] array.

No functional change intended.

Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kernel/asm-offsets.c | 1 +
arch/x86/kvm/vmx/nested.c | 3 +-
arch/x86/kvm/vmx/vmenter.S | 58 +++++++++++++++--------------------
arch/x86/kvm/vmx/vmx.c | 3 +-
arch/x86/kvm/vmx/vmx.h | 3 +-
5 files changed, 29 insertions(+), 39 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index cb50589a7102..90da275ad223 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -111,6 +111,7 @@ static void __used common(void)

if (IS_ENABLED(CONFIG_KVM_INTEL)) {
BLANK();
+ OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
}
}
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 61a2e551640a..3f62bdaffb0b 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3094,8 +3094,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
vmx->loaded_vmcs->host_state.cr4 = cr4;
}

- vm_fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
- __vmx_vcpu_run_flags(vmx));
+ vm_fail = __vmx_vcpu_run(vmx, __vmx_vcpu_run_flags(vmx));

if (vmx->msr_autoload.host.nr)
vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 63b4ad54331b..1362fe5859f9 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -11,24 +11,24 @@

#define WORD_SIZE (BITS_PER_LONG / 8)

-#define VCPU_RAX __VCPU_REGS_RAX * WORD_SIZE
-#define VCPU_RCX __VCPU_REGS_RCX * WORD_SIZE
-#define VCPU_RDX __VCPU_REGS_RDX * WORD_SIZE
-#define VCPU_RBX __VCPU_REGS_RBX * WORD_SIZE
+#define VCPU_RAX (VMX_vcpu_arch_regs + __VCPU_REGS_RAX * WORD_SIZE)
+#define VCPU_RCX (VMX_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
+#define VCPU_RDX (VMX_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
+#define VCPU_RBX (VMX_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
/* Intentionally omit RSP as it's context switched by hardware */
-#define VCPU_RBP __VCPU_REGS_RBP * WORD_SIZE
-#define VCPU_RSI __VCPU_REGS_RSI * WORD_SIZE
-#define VCPU_RDI __VCPU_REGS_RDI * WORD_SIZE
+#define VCPU_RBP (VMX_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
+#define VCPU_RSI (VMX_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
+#define VCPU_RDI (VMX_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)

#ifdef CONFIG_X86_64
-#define VCPU_R8 __VCPU_REGS_R8 * WORD_SIZE
-#define VCPU_R9 __VCPU_REGS_R9 * WORD_SIZE
-#define VCPU_R10 __VCPU_REGS_R10 * WORD_SIZE
-#define VCPU_R11 __VCPU_REGS_R11 * WORD_SIZE
-#define VCPU_R12 __VCPU_REGS_R12 * WORD_SIZE
-#define VCPU_R13 __VCPU_REGS_R13 * WORD_SIZE
-#define VCPU_R14 __VCPU_REGS_R14 * WORD_SIZE
-#define VCPU_R15 __VCPU_REGS_R15 * WORD_SIZE
+#define VCPU_R8 (VMX_vcpu_arch_regs + __VCPU_REGS_R8 * WORD_SIZE)
+#define VCPU_R9 (VMX_vcpu_arch_regs + __VCPU_REGS_R9 * WORD_SIZE)
+#define VCPU_R10 (VMX_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
+#define VCPU_R11 (VMX_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
+#define VCPU_R12 (VMX_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
+#define VCPU_R13 (VMX_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
+#define VCPU_R14 (VMX_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
+#define VCPU_R15 (VMX_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
#endif

.section .noinstr.text, "ax"
@@ -36,7 +36,6 @@
/**
* __vmx_vcpu_run - Run a vCPU via a transition to VMX guest mode
* @vmx: struct vcpu_vmx *
- * @regs: unsigned long * (to guest registers)
* @flags: VMX_RUN_VMRESUME: use VMRESUME instead of VMLAUNCH
* VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
*
@@ -61,22 +60,19 @@ SYM_FUNC_START(__vmx_vcpu_run)
push %_ASM_ARG1

/* Save @flags for SPEC_CTRL handling */
- push %_ASM_ARG3
-
- /*
- * Save @regs, _ASM_ARG2 may be modified by vmx_update_host_rsp() and
- * @regs is needed after VM-Exit to save the guest's register values.
- */
push %_ASM_ARG2

- /* Copy @flags to BL, _ASM_ARG3 is volatile. */
- mov %_ASM_ARG3B, %bl
+ /* Copy @flags to BL, _ASM_ARG2 is volatile. */
+ mov %_ASM_ARG2B, %bl

lea (%_ASM_SP), %_ASM_ARG2
call vmx_update_host_rsp

ALTERNATIVE "jmp .Lspec_ctrl_done", "", X86_FEATURE_MSR_SPEC_CTRL

+ /* Reload @vmx, _ASM_ARG1 may be modified by vmx_update_host_rsp(). */
+ mov WORD_SIZE(%_ASM_SP), %_ASM_DI
+
/*
* SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
* host's, write the MSR.
@@ -85,7 +81,6 @@ SYM_FUNC_START(__vmx_vcpu_run)
* there must not be any returns or indirect branches between this code
* and vmentry.
*/
- mov 2*WORD_SIZE(%_ASM_SP), %_ASM_DI
movl VMX_spec_ctrl(%_ASM_DI), %edi
movl PER_CPU_VAR(x86_spec_ctrl_current), %esi
cmp %edi, %esi
@@ -102,8 +97,8 @@ SYM_FUNC_START(__vmx_vcpu_run)
* an LFENCE to stop speculation from skipping the wrmsr.
*/

- /* Load @regs to RAX. */
- mov (%_ASM_SP), %_ASM_AX
+ /* Load @vmx to RAX. */
+ mov WORD_SIZE(%_ASM_SP), %_ASM_AX

/* Check if vmlaunch or vmresume is needed */
testb $VMX_RUN_VMRESUME, %bl
@@ -125,7 +120,7 @@ SYM_FUNC_START(__vmx_vcpu_run)
mov VCPU_R14(%_ASM_AX), %r14
mov VCPU_R15(%_ASM_AX), %r15
#endif
- /* Load guest RAX. This kills the @regs pointer! */
+ /* Load guest RAX. This kills the @vmx pointer! */
mov VCPU_RAX(%_ASM_AX), %_ASM_AX

/* Check EFLAGS.ZF from 'testb' above */
@@ -163,8 +158,8 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
/* Temporarily save guest's RAX. */
push %_ASM_AX

- /* Reload @regs to RAX. */
- mov WORD_SIZE(%_ASM_SP), %_ASM_AX
+ /* Reload @vmx to RAX. */
+ mov 2*WORD_SIZE(%_ASM_SP), %_ASM_AX

/* Save all guest registers, including RAX from the stack */
pop VCPU_RAX(%_ASM_AX)
@@ -189,9 +184,6 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
xor %ebx, %ebx

.Lclear_regs:
- /* Discard @regs. The register is irrelevant, it just can't be RBX. */
- pop %_ASM_AX
-
/*
* Clear all general purpose registers except RSP and RBX to prevent
* speculative use of the guest's values, even those that are reloaded
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 05a747c9a9ff..42cda7a5c009 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7084,8 +7084,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
if (vcpu->arch.cr2 != native_read_cr2())
native_write_cr2(vcpu->arch.cr2);

- vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
- flags);
+ vmx->fail = __vmx_vcpu_run(vmx, flags);

vcpu->arch.cr2 = native_read_cr2();

diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index a3da84f4ea45..d90cdbea0e4c 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -422,8 +422,7 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu);
void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
void vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx, unsigned int flags);
unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx);
-bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs,
- unsigned int flags);
+bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned int flags);
int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr);
void vmx_ept_load_pdptrs(struct kvm_vcpu *vcpu);

--
2.31.1



2022-10-28 23:59:51

by Paolo Bonzini

Subject: [PATCH 3/7] KVM: SVM: extract VMCB accessors to a new file

Having inline functions confuses the compilation of asm-offsets.c,
which cannot find kvm_cache_regs.h because arch/x86/kvm is not in
asm-offsets.c's include path. Just extract the functions to a
new file.

Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kvm/svm/avic.c | 1 +
arch/x86/kvm/svm/nested.c | 1 +
arch/x86/kvm/svm/sev.c | 1 +
arch/x86/kvm/svm/svm.c | 1 +
arch/x86/kvm/svm/svm.h | 200 ------------------------------
arch/x86/kvm/svm/svm_onhyperv.c | 1 +
arch/x86/kvm/svm/vmcb.h | 211 ++++++++++++++++++++++++++++++++
7 files changed, 216 insertions(+), 200 deletions(-)
create mode 100644 arch/x86/kvm/svm/vmcb.h

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 6919dee69f18..cc651a3310b1 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -26,6 +26,7 @@
#include "x86.h"
#include "irq.h"
#include "svm.h"
+#include "vmcb.h"

/* AVIC GATAG is encoded using VM and VCPU IDs */
#define AVIC_VCPU_ID_BITS 8
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index b258d6988f5d..365f5ef55b53 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -29,6 +29,7 @@
#include "cpuid.h"
#include "lapic.h"
#include "svm.h"
+#include "vmcb.h"
#include "hyperv.h"

#define CC KVM_NESTED_VMENTER_CONSISTENCY_CHECK
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index c0c9ed5e279c..549f35ded880 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -25,6 +25,7 @@
#include "mmu.h"
#include "x86.h"
#include "svm.h"
+#include "vmcb.h"
#include "svm_ops.h"
#include "cpuid.h"
#include "trace.h"
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d22a809d9233..b793cfdce68d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -44,6 +44,7 @@
#include "trace.h"

#include "svm.h"
+#include "vmcb.h"
#include "svm_ops.h"

#include "kvm_onhyperv.h"
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6a7686bf6900..222856788153 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -22,8 +22,6 @@
#include <asm/svm.h>
#include <asm/sev-common.h>

-#include "kvm_cache_regs.h"
-
#define __sme_page_pa(x) __sme_set(page_to_pfn(x) << PAGE_SHIFT)

#define IOPM_SIZE PAGE_SIZE * 3
@@ -327,27 +325,6 @@ static __always_inline bool sev_es_guest(struct kvm *kvm)
#endif
}

-static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
-{
- vmcb->control.clean = 0;
-}
-
-static inline void vmcb_mark_all_clean(struct vmcb *vmcb)
-{
- vmcb->control.clean = VMCB_ALL_CLEAN_MASK
- & ~VMCB_ALWAYS_DIRTY_MASK;
-}
-
-static inline void vmcb_mark_dirty(struct vmcb *vmcb, int bit)
-{
- vmcb->control.clean &= ~(1 << bit);
-}
-
-static inline bool vmcb_is_dirty(struct vmcb *vmcb, int bit)
-{
- return !test_bit(bit, (unsigned long *)&vmcb->control.clean);
-}
-
static __always_inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
{
return container_of(vcpu, struct vcpu_svm, vcpu);
@@ -363,161 +340,6 @@ static __always_inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
*/
#define SVM_REGS_LAZY_LOAD_SET (1 << VCPU_EXREG_PDPTR)

-static inline void vmcb_set_intercept(struct vmcb_control_area *control, u32 bit)
-{
- WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
- __set_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline void vmcb_clr_intercept(struct vmcb_control_area *control, u32 bit)
-{
- WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
- __clear_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline bool vmcb_is_intercept(struct vmcb_control_area *control, u32 bit)
-{
- WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
- return test_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u32 bit)
-{
- WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
- return test_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline void set_dr_intercepts(struct vcpu_svm *svm)
-{
- struct vmcb *vmcb = svm->vmcb01.ptr;
-
- if (!sev_es_guest(svm->vcpu.kvm)) {
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE);
- }
-
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
-
- recalc_intercepts(svm);
-}
-
-static inline void clr_dr_intercepts(struct vcpu_svm *svm)
-{
- struct vmcb *vmcb = svm->vmcb01.ptr;
-
- vmcb->control.intercepts[INTERCEPT_DR] = 0;
-
- /* DR7 access must remain intercepted for an SEV-ES guest */
- if (sev_es_guest(svm->vcpu.kvm)) {
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
- }
-
- recalc_intercepts(svm);
-}
-
-static inline void set_exception_intercept(struct vcpu_svm *svm, u32 bit)
-{
- struct vmcb *vmcb = svm->vmcb01.ptr;
-
- WARN_ON_ONCE(bit >= 32);
- vmcb_set_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
-
- recalc_intercepts(svm);
-}
-
-static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
-{
- struct vmcb *vmcb = svm->vmcb01.ptr;
-
- WARN_ON_ONCE(bit >= 32);
- vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
-
- recalc_intercepts(svm);
-}
-
-static inline void svm_set_intercept(struct vcpu_svm *svm, int bit)
-{
- struct vmcb *vmcb = svm->vmcb01.ptr;
-
- vmcb_set_intercept(&vmcb->control, bit);
-
- recalc_intercepts(svm);
-}
-
-static inline void svm_clr_intercept(struct vcpu_svm *svm, int bit)
-{
- struct vmcb *vmcb = svm->vmcb01.ptr;
-
- vmcb_clr_intercept(&vmcb->control, bit);
-
- recalc_intercepts(svm);
-}
-
-static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
-{
- return vmcb_is_intercept(&svm->vmcb->control, bit);
-}
-
-static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
-{
- return svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
-}
-
-static inline struct vmcb *get_vgif_vmcb(struct vcpu_svm *svm)
-{
- if (!vgif)
- return NULL;
-
- if (is_guest_mode(&svm->vcpu) && !nested_vgif_enabled(svm))
- return svm->nested.vmcb02.ptr;
- else
- return svm->vmcb01.ptr;
-}
-
-static inline void enable_gif(struct vcpu_svm *svm)
-{
- struct vmcb *vmcb = get_vgif_vmcb(svm);
-
- if (vmcb)
- vmcb->control.int_ctl |= V_GIF_MASK;
- else
- svm->vcpu.arch.hflags |= HF_GIF_MASK;
-}
-
-static inline void disable_gif(struct vcpu_svm *svm)
-{
- struct vmcb *vmcb = get_vgif_vmcb(svm);
-
- if (vmcb)
- vmcb->control.int_ctl &= ~V_GIF_MASK;
- else
- svm->vcpu.arch.hflags &= ~HF_GIF_MASK;
-}
-
-static inline bool gif_set(struct vcpu_svm *svm)
-{
- struct vmcb *vmcb = get_vgif_vmcb(svm);
-
- if (vmcb)
- return !!(vmcb->control.int_ctl & V_GIF_MASK);
- else
- return !!(svm->vcpu.arch.hflags & HF_GIF_MASK);
-}
-
static inline bool nested_npt_enabled(struct vcpu_svm *svm)
{
return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE;
@@ -567,28 +389,6 @@ void svm_complete_interrupt_delivery(struct kvm_vcpu *vcpu, int delivery_mode,
#define NESTED_EXIT_DONE 1 /* Exit caused nested vmexit */
#define NESTED_EXIT_CONTINUE 2 /* Further checks needed */

-static inline bool nested_svm_virtualize_tpr(struct kvm_vcpu *vcpu)
-{
- struct vcpu_svm *svm = to_svm(vcpu);
-
- return is_guest_mode(vcpu) && (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK);
-}
-
-static inline bool nested_exit_on_smi(struct vcpu_svm *svm)
-{
- return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_SMI);
-}
-
-static inline bool nested_exit_on_intr(struct vcpu_svm *svm)
-{
- return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_INTR);
-}
-
-static inline bool nested_exit_on_nmi(struct vcpu_svm *svm)
-{
- return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_NMI);
-}
-
int enter_svm_guest_mode(struct kvm_vcpu *vcpu,
u64 vmcb_gpa, struct vmcb *vmcb12, bool from_vmrun);
void svm_leave_nested(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm/svm_onhyperv.c b/arch/x86/kvm/svm/svm_onhyperv.c
index 8cdc62c74a96..ae0a101329e6 100644
--- a/arch/x86/kvm/svm/svm_onhyperv.c
+++ b/arch/x86/kvm/svm/svm_onhyperv.c
@@ -8,6 +8,7 @@
#include <asm/mshyperv.h>

#include "svm.h"
+#include "vmcb.h"
#include "svm_ops.h"

#include "hyperv.h"
diff --git a/arch/x86/kvm/svm/vmcb.h b/arch/x86/kvm/svm/vmcb.h
new file mode 100644
index 000000000000..8757cda27e3a
--- /dev/null
+++ b/arch/x86/kvm/svm/vmcb.h
@@ -0,0 +1,211 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Kernel-based Virtual Machine driver for Linux
+ *
+ * AMD SVM support - VMCB accessors
+ */
+
+#ifndef __SVM_VMCB_H
+#define __SVM_VMCB_H
+
+#include "kvm_cache_regs.h"
+
+static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
+{
+ vmcb->control.clean = 0;
+}
+
+static inline void vmcb_mark_all_clean(struct vmcb *vmcb)
+{
+ vmcb->control.clean = VMCB_ALL_CLEAN_MASK
+ & ~VMCB_ALWAYS_DIRTY_MASK;
+}
+
+static inline void vmcb_mark_dirty(struct vmcb *vmcb, int bit)
+{
+ vmcb->control.clean &= ~(1 << bit);
+}
+
+static inline bool vmcb_is_dirty(struct vmcb *vmcb, int bit)
+{
+ return !test_bit(bit, (unsigned long *)&vmcb->control.clean);
+}
+
+static inline void vmcb_set_intercept(struct vmcb_control_area *control, u32 bit)
+{
+ WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+ __set_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline void vmcb_clr_intercept(struct vmcb_control_area *control, u32 bit)
+{
+ WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+ __clear_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline bool vmcb_is_intercept(struct vmcb_control_area *control, u32 bit)
+{
+ WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+ return test_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u32 bit)
+{
+ WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+ return test_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline void set_dr_intercepts(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ if (!sev_es_guest(svm->vcpu.kvm)) {
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE);
+ }
+
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
+
+ recalc_intercepts(svm);
+}
+
+static inline void clr_dr_intercepts(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb->control.intercepts[INTERCEPT_DR] = 0;
+
+ /* DR7 access must remain intercepted for an SEV-ES guest */
+ if (sev_es_guest(svm->vcpu.kvm)) {
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
+ }
+
+ recalc_intercepts(svm);
+}
+
+static inline void set_exception_intercept(struct vcpu_svm *svm, u32 bit)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ WARN_ON_ONCE(bit >= 32);
+ vmcb_set_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
+
+ recalc_intercepts(svm);
+}
+
+static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ WARN_ON_ONCE(bit >= 32);
+ vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
+
+ recalc_intercepts(svm);
+}
+
+static inline void svm_set_intercept(struct vcpu_svm *svm, int bit)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb_set_intercept(&vmcb->control, bit);
+
+ recalc_intercepts(svm);
+}
+
+static inline void svm_clr_intercept(struct vcpu_svm *svm, int bit)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb_clr_intercept(&vmcb->control, bit);
+
+ recalc_intercepts(svm);
+}
+
+static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
+{
+ return vmcb_is_intercept(&svm->vmcb->control, bit);
+}
+
+static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
+{
+ return svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
+}
+
+static inline struct vmcb *get_vgif_vmcb(struct vcpu_svm *svm)
+{
+ if (!vgif)
+ return NULL;
+
+ if (is_guest_mode(&svm->vcpu) && !nested_vgif_enabled(svm))
+ return svm->nested.vmcb02.ptr;
+ else
+ return svm->vmcb01.ptr;
+}
+
+static inline void enable_gif(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = get_vgif_vmcb(svm);
+
+ if (vmcb)
+ vmcb->control.int_ctl |= V_GIF_MASK;
+ else
+ svm->vcpu.arch.hflags |= HF_GIF_MASK;
+}
+
+static inline void disable_gif(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = get_vgif_vmcb(svm);
+
+ if (vmcb)
+ vmcb->control.int_ctl &= ~V_GIF_MASK;
+ else
+ svm->vcpu.arch.hflags &= ~HF_GIF_MASK;
+}
+
+static inline bool gif_set(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = get_vgif_vmcb(svm);
+
+ if (vmcb)
+ return !!(vmcb->control.int_ctl & V_GIF_MASK);
+ else
+ return !!(svm->vcpu.arch.hflags & HF_GIF_MASK);
+}
+
+static inline bool nested_svm_virtualize_tpr(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ return is_guest_mode(vcpu) && (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK);
+}
+
+static inline bool nested_exit_on_smi(struct vcpu_svm *svm)
+{
+ return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_SMI);
+}
+
+static inline bool nested_exit_on_intr(struct vcpu_svm *svm)
+{
+ return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_INTR);
+}
+
+static inline bool nested_exit_on_nmi(struct vcpu_svm *svm)
+{
+ return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_NMI);
+}
+
+#endif
--
2.31.1



2022-10-29 00:00:20

by Paolo Bonzini

Subject: [PATCH 4/7] KVM: SVM: replace argument of __svm_vcpu_run with vcpu_svm

Since registers are reachable through vcpu_svm, and we will
need to access more fields of that struct, pass it instead
of the regs[] array.

Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kernel/asm-offsets.c | 6 ++++++
arch/x86/kvm/svm/svm.c | 2 +-
arch/x86/kvm/svm/svm.h | 2 +-
arch/x86/kvm/svm/vmenter.S | 36 +++++++++++++++++------------------
4 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 90da275ad223..7f1dd1138117 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -20,6 +20,7 @@
#include <asm/tlbflush.h>
#include <asm/tdx.h>
#include "../kvm/vmx/vmx.h"
+#include "../kvm/svm/svm.h"

#ifdef CONFIG_XEN
#include <xen/interface/xen.h>
@@ -109,6 +110,11 @@ static void __used common(void)
OFFSET(TSS_sp1, tss_struct, x86_tss.sp1);
OFFSET(TSS_sp2, tss_struct, x86_tss.sp2);

+ if (IS_ENABLED(CONFIG_KVM_AMD)) {
+ BLANK();
+ OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
+ }
+
if (IS_ENABLED(CONFIG_KVM_INTEL)) {
BLANK();
OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b793cfdce68d..64f5f0544b4f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3932,7 +3932,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
* vmcb02 when switching vmcbs for nested virtualization.
*/
vmload(svm->vmcb01.pa);
- __svm_vcpu_run(vmcb_pa, (unsigned long *)&vcpu->arch.regs);
+ __svm_vcpu_run(vmcb_pa, svm);
vmsave(svm->vmcb01.pa);

vmload(__sme_page_pa(sd->save_area));
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 222856788153..5f8dfc9cd9a7 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -484,6 +484,6 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
/* vmenter.S */

void __svm_sev_es_vcpu_run(unsigned long vmcb_pa);
-void __svm_vcpu_run(unsigned long vmcb_pa, unsigned long *regs);
+void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm);

#endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 723f8534986c..8fac744361e5 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -8,23 +8,23 @@
#define WORD_SIZE (BITS_PER_LONG / 8)

/* Intentionally omit RAX as it's context switched by hardware */
-#define VCPU_RCX __VCPU_REGS_RCX * WORD_SIZE
-#define VCPU_RDX __VCPU_REGS_RDX * WORD_SIZE
-#define VCPU_RBX __VCPU_REGS_RBX * WORD_SIZE
+#define VCPU_RCX (SVM_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
+#define VCPU_RDX (SVM_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
+#define VCPU_RBX (SVM_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
/* Intentionally omit RSP as it's context switched by hardware */
-#define VCPU_RBP __VCPU_REGS_RBP * WORD_SIZE
-#define VCPU_RSI __VCPU_REGS_RSI * WORD_SIZE
-#define VCPU_RDI __VCPU_REGS_RDI * WORD_SIZE
+#define VCPU_RBP (SVM_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
+#define VCPU_RSI (SVM_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
+#define VCPU_RDI (SVM_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)

#ifdef CONFIG_X86_64
-#define VCPU_R8 __VCPU_REGS_R8 * WORD_SIZE
-#define VCPU_R9 __VCPU_REGS_R9 * WORD_SIZE
-#define VCPU_R10 __VCPU_REGS_R10 * WORD_SIZE
-#define VCPU_R11 __VCPU_REGS_R11 * WORD_SIZE
-#define VCPU_R12 __VCPU_REGS_R12 * WORD_SIZE
-#define VCPU_R13 __VCPU_REGS_R13 * WORD_SIZE
-#define VCPU_R14 __VCPU_REGS_R14 * WORD_SIZE
-#define VCPU_R15 __VCPU_REGS_R15 * WORD_SIZE
+#define VCPU_R8 (SVM_vcpu_arch_regs + __VCPU_REGS_R8 * WORD_SIZE)
+#define VCPU_R9 (SVM_vcpu_arch_regs + __VCPU_REGS_R9 * WORD_SIZE)
+#define VCPU_R10 (SVM_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
+#define VCPU_R11 (SVM_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
+#define VCPU_R12 (SVM_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
+#define VCPU_R13 (SVM_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
+#define VCPU_R14 (SVM_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
+#define VCPU_R15 (SVM_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
#endif

.section .noinstr.text, "ax"
@@ -32,7 +32,7 @@
/**
* __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
* @vmcb_pa: unsigned long
- * @regs: unsigned long * (to guest registers)
+ * @svm: struct vcpu_svm *
*/
SYM_FUNC_START(__svm_vcpu_run)
push %_ASM_BP
@@ -47,13 +47,13 @@ SYM_FUNC_START(__svm_vcpu_run)
#endif
push %_ASM_BX

- /* Save @regs. */
+ /* Save @svm. */
push %_ASM_ARG2

/* Save @vmcb. */
push %_ASM_ARG1

- /* Move @regs to RAX. */
+ /* Move @svm to RAX. */
mov %_ASM_ARG2, %_ASM_AX

/* Load guest registers. */
@@ -89,7 +89,7 @@ SYM_FUNC_START(__svm_vcpu_run)
FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
#endif

- /* "POP" @regs to RAX. */
+ /* "POP" @svm to RAX. */
pop %_ASM_AX

/* Save all guest registers. */
--
2.31.1



2022-10-29 22:29:39

by kernel test robot

Subject: Re: [PATCH 4/7] KVM: SVM: replace argument of __svm_vcpu_run with vcpu_svm

Hi Paolo,

I love your patch! Yet something to improve:

[auto build test ERROR on kvm/queue]
[also build test ERROR on mst-vhost/linux-next linus/master v6.1-rc2]
[cannot apply to tip/x86/core next-20221028]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Paolo-Bonzini/KVM-SVM-move-MSR_IA32_SPEC_CTRL-save-restore-to-assembly/20221029-071449
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git queue
patch link: https://lore.kernel.org/r/20221028230723.3254250-5-pbonzini%40redhat.com
patch subject: [PATCH 4/7] KVM: SVM: replace argument of __svm_vcpu_run with vcpu_svm
config: x86_64-allyesconfig
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/2c7e6ae97f17bc625add1e9f85880f7bde7eecc9
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Paolo-Bonzini/KVM-SVM-move-MSR_IA32_SPEC_CTRL-save-restore-to-assembly/20221029-071449
git checkout 2c7e6ae97f17bc625add1e9f85880f7bde7eecc9
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

ld: kernel image bigger than KERNEL_IMAGE_SIZE
ld: vmlinux.o: in function `__svm_vcpu_run':
>> (.noinstr.text+0x1ee6): undefined reference to `SVM_vcpu_arch_regs'
>> ld: (.noinstr.text+0x1eed): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1ef4): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1efb): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1f02): undefined reference to `SVM_vcpu_arch_regs'
ld: vmlinux.o:(.noinstr.text+0x1f09): more undefined references to `SVM_vcpu_arch_regs' follow

--
0-DAY CI Kernel Test Service
https://01.org/lkp



2022-10-30 08:55:07

by kernel test robot

Subject: Re: [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly

Hi Paolo,

I love your patch! Yet something to improve:

[auto build test ERROR on kvm/queue]
[also build test ERROR on mst-vhost/linux-next linus/master v6.1-rc2]
[cannot apply to tip/x86/core next-20221028]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Paolo-Bonzini/KVM-SVM-move-MSR_IA32_SPEC_CTRL-save-restore-to-assembly/20221029-071449
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git queue
patch link: https://lore.kernel.org/r/20221028230723.3254250-7-pbonzini%40redhat.com
patch subject: [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
config: x86_64-allyesconfig
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/63e749bdae3bd901e8710a6b15ed42d0c08853b6
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Paolo-Bonzini/KVM-SVM-move-MSR_IA32_SPEC_CTRL-save-restore-to-assembly/20221029-071449
git checkout 63e749bdae3bd901e8710a6b15ed42d0c08853b6
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All error/warnings (new ones prefixed by >>):

vmlinux.o: warning: objtool: kasan_report+0x12: call to stackleak_track_stack() with UACCESS enabled
>> vmlinux.o: warning: objtool: svm_vcpu_enter_exit+0x1e: call to msr_write_intercepted.constprop.0() leaves .noinstr.text section
vmlinux.o: warning: objtool: check_stackleak_irqoff+0x309: call to _printk() leaves .noinstr.text section
--
ld: kernel image bigger than KERNEL_IMAGE_SIZE
ld: vmlinux.o: in function `__svm_vcpu_run':
>> (.noinstr.text+0x1f0b): undefined reference to `SVM_spec_ctrl'
ld: (.noinstr.text+0x1f26): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1f2d): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1f34): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1f3b): undefined reference to `SVM_vcpu_arch_regs'
ld: (.noinstr.text+0x1f42): undefined reference to `SVM_vcpu_arch_regs'
ld: vmlinux.o:(.noinstr.text+0x1f49): more undefined references to `SVM_vcpu_arch_regs' follow
ld: vmlinux.o: in function `__svm_vcpu_run':
(.noinstr.text+0x2023): undefined reference to `SVM_spec_ctrl'
>> ld: (.noinstr.text+0x2031): undefined reference to `SVM_spec_ctrl'
ld: vmlinux.o: in function `__svm_sev_es_vcpu_run':
(.noinstr.text+0x20ab): undefined reference to `SVM_spec_ctrl'
ld: (.noinstr.text+0x20fc): undefined reference to `SVM_spec_ctrl'
ld: (.noinstr.text+0x210a): undefined reference to `SVM_spec_ctrl'

--
0-DAY CI Kernel Test Service
https://01.org/lkp



2022-10-31 17:52:50

by Sean Christopherson

Subject: Re: [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run

On Fri, Oct 28, 2022, Paolo Bonzini wrote:
> Registers are reachable through vcpu_vmx, no need to pass
> a separate pointer to the regs[] array.
>
> No functional change intended.
>
> Signed-off-by: Paolo Bonzini <[email protected]>
> ---
> arch/x86/kernel/asm-offsets.c | 1 +
> arch/x86/kvm/vmx/nested.c | 3 +-
> arch/x86/kvm/vmx/vmenter.S | 58 +++++++++++++++--------------------
> arch/x86/kvm/vmx/vmx.c | 3 +-
> arch/x86/kvm/vmx/vmx.h | 3 +-
> 5 files changed, 29 insertions(+), 39 deletions(-)
>
> diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> index cb50589a7102..90da275ad223 100644
> --- a/arch/x86/kernel/asm-offsets.c
> +++ b/arch/x86/kernel/asm-offsets.c
> @@ -111,6 +111,7 @@ static void __used common(void)
>
> if (IS_ENABLED(CONFIG_KVM_INTEL)) {
> BLANK();
> + OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);

Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
outside of KVM? We (Google) want to explore loading multiple instances of KVM,
i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
migration between versions of KVM to upgrade/rollback KVM without changing the
kernel (RFC coming soon-ish). IIRC, asm-offsets is the only place where I haven't
been able to figure out a simple way to avoid exposing KVM's internal structures
outside of KVM (so that the structures can change across KVM instances without
breaking kernel code).

2022-11-01 17:54:13

by Josh Poimboeuf

Subject: Re: [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run

On Mon, Oct 31, 2022 at 05:37:46PM +0000, Sean Christopherson wrote:
> On Fri, Oct 28, 2022, Paolo Bonzini wrote:
> > Registers are reachable through vcpu_vmx, no need to pass
> > a separate pointer to the regs[] array.
> >
> > No functional change intended.
> >
> > Signed-off-by: Paolo Bonzini <[email protected]>
> > ---
> > arch/x86/kernel/asm-offsets.c | 1 +
> > arch/x86/kvm/vmx/nested.c | 3 +-
> > arch/x86/kvm/vmx/vmenter.S | 58 +++++++++++++++--------------------
> > arch/x86/kvm/vmx/vmx.c | 3 +-
> > arch/x86/kvm/vmx/vmx.h | 3 +-
> > 5 files changed, 29 insertions(+), 39 deletions(-)
> >
> > diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> > index cb50589a7102..90da275ad223 100644
> > --- a/arch/x86/kernel/asm-offsets.c
> > +++ b/arch/x86/kernel/asm-offsets.c
> > @@ -111,6 +111,7 @@ static void __used common(void)
> >
> > if (IS_ENABLED(CONFIG_KVM_INTEL)) {
> > BLANK();
> > + OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
>
> Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
> outside of KVM? We (Google) want to explore loading multiple instances of KVM,
> i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
> migration between versions of KVM to upgrade/rollback KVM without changing the
> kernel (RFC coming soon-ish). IIRC, asm-offsets is the only place where I haven't
> been able to figure out a simple way to avoid exposing KVM's internal structures
> outside of KVM (so that the structures can change across KVM instances without
> breaking kernel code).

Is that really a problem? Would it even make sense for non-KVM kernel
code to use 'vcpu_vmx' anyway? It already seems to be private.
asm-offsets.c has to "cheat" to get access to it by including
"../kvm/vmx/vmx.h".

So the only concern is exposing the asm offsets, right? But it seems
highly unlikely any non-KVM code would be using those either.

And, that would be a bug anyway: module code is subject to change and
could always get recompiled. The kernel proper shouldn't be making any
assumptions about the layouts of module-owned structs.

--
Josh

2022-11-01 18:12:37

by Sean Christopherson

Subject: Re: [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run

On Tue, Nov 01, 2022, Josh Poimboeuf wrote:
> On Mon, Oct 31, 2022 at 05:37:46PM +0000, Sean Christopherson wrote:
> > > diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> > > index cb50589a7102..90da275ad223 100644
> > > --- a/arch/x86/kernel/asm-offsets.c
> > > +++ b/arch/x86/kernel/asm-offsets.c
> > > @@ -111,6 +111,7 @@ static void __used common(void)
> > >
> > > if (IS_ENABLED(CONFIG_KVM_INTEL)) {
> > > BLANK();
> > > + OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
> >
> > Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
> > outside of KVM? We (Google) want to explore loading multiple instances of KVM,
> > i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
> > migration between versions of KVM to upgrade/rollback KVM without changing the
> > kernel (RFC coming soon-ish). IIRC, asm-offsets is the only place where I haven't
> > been able to figure out a simple way to avoid exposing KVM's internal structures
> > outside of KVM (so that the structures can change across KVM instances without
> > breaking kernel code).
>
> Is that really a problem? Would it even make sense for non-KVM kernel
> code to use 'vcpu_vmx' anyway?

vcpu_vmx itself isn't a problem as non-KVM kernel code _shouldn't_ be using
vcpu_vmx, but I want to go beyond "shouldn't" and make it all-but-impossible for
non-KVM code to reference internal KVM structures/state, e.g. I want to bury all
kvm_host.h headers in kvm/ code instead of exposing them in include/asm/.

2022-11-02 15:32:39

by Josh Poimboeuf

Subject: Re: [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly

On Fri, Oct 28, 2022 at 07:07:22PM -0400, Paolo Bonzini wrote:
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3918,10 +3918,21 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
> struct vcpu_svm *svm = to_svm(vcpu);
> unsigned long vmcb_pa = svm->current_vmcb->pa;
>
> + /*
> + * For non-nested case:
> + * If the L01 MSR bitmap does not intercept the MSR, then we need to
> + * save it.
> + *
> + * For nested case:
> + * If the L02 MSR bitmap does not intercept the MSR, then we need to
> + * save it.
> + */
> + bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);

This triggers a warning:

vmlinux.o: warning: objtool: svm_vcpu_enter_exit+0x3d: call to svm_msrpm_offset() leaves .noinstr.text section

svm_vcpu_enter_exit() is noinstr, but it's calling
msr_write_intercepted() which is not.

That's why in the VMX code I did the call to msr_write_intercepted() (in
__vmx_vcpu_run_flags) before calling vmx_vcpu_enter_exit().

--
Josh

2022-11-02 16:32:24

by Paolo Bonzini

Subject: Re: [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly

On 11/2/22 16:28, Josh Poimboeuf wrote:
> On Fri, Oct 28, 2022 at 07:07:22PM -0400, Paolo Bonzini wrote:
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -3918,10 +3918,21 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
>> struct vcpu_svm *svm = to_svm(vcpu);
>> unsigned long vmcb_pa = svm->current_vmcb->pa;
>>
>> + /*
>> + * For non-nested case:
>> + * If the L01 MSR bitmap does not intercept the MSR, then we need to
>> + * save it.
>> + *
>> + * For nested case:
>> + * If the L02 MSR bitmap does not intercept the MSR, then we need to
>> + * save it.
>> + */
>> + bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);
>
> This triggers a warning:
>
> vmlinux.o: warning: objtool: svm_vcpu_enter_exit+0x3d: call to svm_msrpm_offset() leaves .noinstr.text section
>
> svm_vcpu_enter_exit() is noinstr, but it's calling
> msr_write_intercepted() which is not.

I suspect I didn't see it because it's inlined here, but it has to be
fixed indeed.

> That's why in the VMX code I did the call to msr_write_intercepted() (in
> __vmx_vcpu_run_flags) before calling vmx_vcpu_enter_exit().

Yes, it's easy to do the same and do it in svm_vcpu_run().
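Roughly like this (untested sketch; the elided parts stay as in the
patch, this just moves the msr_write_intercepted() call out of the
noinstr region, mirroring what __vmx_vcpu_run_flags() does for VMX):

static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
					bool spec_ctrl_intercepted)
{
	/* ... unchanged, except msr_write_intercepted() is gone ... */
}

static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
{
	/* Instrumentation is fine here, outside the noinstr section. */
	bool spec_ctrl_intercepted =
		msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);

	/* ... */
	svm_vcpu_enter_exit(vcpu, spec_ctrl_intercepted);
	/* ... */
}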

Paolo


2022-11-02 18:07:04

by Paolo Bonzini

Subject: Re: [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run

On 10/31/22 18:37, Sean Christopherson wrote:
>> if (IS_ENABLED(CONFIG_KVM_INTEL)) {
>> BLANK();
>> + OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
> Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
> outside of KVM? We (Google) want to explore loading multiple instances of KVM,
> i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
> migration between versions of KVM to upgrade/rollback KVM without changing the
> kernel (RFC coming soon-ish). IIRC, asm-offsets is the only place where I haven't
> been able to figure out a simple way to avoid exposing KVM's internal structures
> outside of KVM (so that the structures can change across KVM instances without
> breaking kernel code).
>

It's possible to create our own asm-offsets.h file using something
similar to the logic in Kbuild:

#####
# Generate asm-offsets.h

offsets-file := include/generated/asm-offsets.h

always-y += $(offsets-file)
targets += arch/$(SRCARCH)/kernel/asm-offsets.s

arch/$(SRCARCH)/kernel/asm-offsets.s: $(timeconst-file) $(bounds-file)

$(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
	$(call filechk,offsets,__ASM_OFFSETS_H__)
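
For KVM that could generate a private header that never leaves
arch/x86/kvm; a rough sketch, with hypothetical file and symbol names:

/*
 * arch/x86/kvm/kvm-asm-offsets.c (hypothetical): built with the same
 * filechk_offsets rule as above, producing a kvm-asm-offsets.h that only
 * the vmenter.S files include, so vcpu_vmx/vcpu_svm need not be visible
 * to arch/x86/kernel/asm-offsets.c at all.
 */
#include <linux/kbuild.h>

#include "vmx/vmx.h"
#include "svm/svm.h"

static void __used common(void)
{
	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
		BLANK();
		OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
	}

	if (IS_ENABLED(CONFIG_KVM_AMD)) {
		BLANK();
		OFFSET(SVM_spec_ctrl, vcpu_svm, spec_ctrl);
		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
	}
}

vmenter.S would then pull its offsets from that header instead of the
global asm-offsets.h, and the entries added to arch/x86/kernel/asm-offsets.c
in this series could be dropped again.
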

Paolo