2021-05-05 00:29:10

by Sean Christopherson

Subject: [PATCH v4 0/8] KVM: Fix tick-based accounting for x86 guests

Fix tick-based accounting for x86 guests, and do additional cleanups to
further disentangle guest time accounting and to deduplicate code.

v4:
- Add R-b's (dropped one due to code change). [Christian]
- Drop instrumentation annotation shuffling since s390 may be gaining
support. [Christian]
- Drop "irqs_off" from context_tracking_guest_exit(). [Frederic]
- Account guest time after enabling IRQs, even when using context
tracking to precisely account time. [Frederic]

v3 (delta from Wanpeng's v2):
- https://lkml.kernel.org/r/[email protected]
- s/context_guest/context_tracking_guest, purely to match the existing
functions. I have no strong opinion either way.
- Split only the "exit" functions.
- Partially open code vcpu_account_guest_exit() and
__vtime_account_guest_exit() in x86 to avoid churn when segueing into
my cleanups (see above).

older:
- https://lkml.kernel.org/r/[email protected]
- https://lkml.kernel.org/r/[email protected]


Sean Christopherson (5):
sched/vtime: Move vtime accounting external declarations above inlines
sched/vtime: Move guest enter/exit vtime accounting to vtime.h
context_tracking: Consolidate guest enter/exit wrappers
context_tracking: KVM: Move guest enter/exit wrappers to KVM's domain
KVM: x86: Consolidate guest enter/exit logic to common helpers

Wanpeng Li (3):
context_tracking: Move guest exit context tracking to separate helpers
context_tracking: Move guest exit vtime accounting to separate helpers
KVM: x86: Defer vtime accounting 'til after IRQ handling

arch/x86/kvm/svm/svm.c | 39 +--------
arch/x86/kvm/vmx/vmx.c | 39 +--------
arch/x86/kvm/x86.c | 9 ++
arch/x86/kvm/x86.h | 45 ++++++++++
include/linux/context_tracking.h | 92 ++++-----------------
include/linux/kvm_host.h | 45 ++++++++++
include/linux/vtime.h | 138 +++++++++++++++++++------------
7 files changed, 205 insertions(+), 202 deletions(-)

--
2.31.1.527.g47e6f16901-goog


2021-05-05 00:29:10

by Sean Christopherson

Subject: [PATCH v4 1/8] context_tracking: Move guest exit context tracking to separate helpers

From: Wanpeng Li <[email protected]>

Provide separate context tracking helpers for guest exit. The standalone
helpers will be called separately by KVM x86 in later patches to fix
tick-based accounting.

Suggested-by: Thomas Gleixner <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Michael Tokarev <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
include/linux/context_tracking.h | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index bceb06498521..b8c7313495a7 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -131,10 +131,15 @@ static __always_inline void guest_enter_irqoff(void)
}
}

-static __always_inline void guest_exit_irqoff(void)
+static __always_inline void context_tracking_guest_exit(void)
{
if (context_tracking_enabled())
__context_tracking_exit(CONTEXT_GUEST);
+}
+
+static __always_inline void guest_exit_irqoff(void)
+{
+ context_tracking_guest_exit();

instrumentation_begin();
if (vtime_accounting_enabled_this_cpu())
@@ -159,6 +164,8 @@ static __always_inline void guest_enter_irqoff(void)
instrumentation_end();
}

+static __always_inline void context_tracking_guest_exit(void) { }
+
static __always_inline void guest_exit_irqoff(void)
{
instrumentation_begin();
--
2.31.1.527.g47e6f16901-goog

2021-05-05 00:29:15

by Sean Christopherson

Subject: [PATCH v4 2/8] context_tracking: Move guest exit vtime accounting to separate helpers

From: Wanpeng Li <[email protected]>

Provide separate vtime accounting functions for guest exit instead of
open coding the logic within the context tracking code. This will allow
KVM x86 to handle vtime accounting slightly differently when using
tick-based accounting.

Suggested-by: Thomas Gleixner <[email protected]>
Cc: Michael Tokarev <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
include/linux/context_tracking.h | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index b8c7313495a7..4f4556232dcf 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -137,15 +137,20 @@ static __always_inline void context_tracking_guest_exit(void)
__context_tracking_exit(CONTEXT_GUEST);
}

-static __always_inline void guest_exit_irqoff(void)
+static __always_inline void vtime_account_guest_exit(void)
{
- context_tracking_guest_exit();
-
- instrumentation_begin();
if (vtime_accounting_enabled_this_cpu())
vtime_guest_exit(current);
else
current->flags &= ~PF_VCPU;
+}
+
+static __always_inline void guest_exit_irqoff(void)
+{
+ context_tracking_guest_exit();
+
+ instrumentation_begin();
+ vtime_account_guest_exit();
instrumentation_end();
}

@@ -166,12 +171,17 @@ static __always_inline void guest_enter_irqoff(void)

static __always_inline void context_tracking_guest_exit(void) { }

-static __always_inline void guest_exit_irqoff(void)
+static __always_inline void vtime_account_guest_exit(void)
{
- instrumentation_begin();
- /* Flush the guest cputime we spent on the guest */
vtime_account_kernel(current);
current->flags &= ~PF_VCPU;
+}
+
+static __always_inline void guest_exit_irqoff(void)
+{
+ instrumentation_begin();
+ /* Flush the guest cputime we spent on the guest */
+ vtime_account_guest_exit();
instrumentation_end();
}
#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
--
2.31.1.527.g47e6f16901-goog

2021-05-05 00:29:19

by Sean Christopherson

Subject: [PATCH v4 3/8] KVM: x86: Defer vtime accounting 'til after IRQ handling

From: Wanpeng Li <[email protected]>

Defer the call to account guest time until after servicing any IRQ(s)
that happened in the guest or immediately after VM-Exit. Tick-based
accounting of vCPU time relies on PF_VCPU being set when the tick IRQ
handler runs, and IRQs are blocked throughout the main sequence of
vcpu_enter_guest(), including the call into vendor code to actually
enter and exit the guest.

This fixes a bug[*] where reported guest time remains '0', even when
running an infinite loop in the guest.

[*] https://bugzilla.kernel.org/show_bug.cgi?id=209831

Fixes: 87fa7f3e98a131 ("x86/kvm: Move context tracking where it belongs")
Cc: Thomas Gleixner <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Michael Tokarev <[email protected]>
Cc: [email protected]#v5.9-rc1+
Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/svm.c | 6 +++---
arch/x86/kvm/vmx/vmx.c | 6 +++---
arch/x86/kvm/x86.c | 9 +++++++++
3 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a7271f31df47..7dd63545526b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3753,15 +3753,15 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
* have them in state 'on' as recorded before entering guest mode.
* Same as enter_from_user_mode().
*
- * guest_exit_irqoff() restores host context and reinstates RCU if
- * enabled and required.
+ * context_tracking_guest_exit() restores host context and reinstates
+ * RCU if enabled and required.
*
* This needs to be done before the below as native_read_msr()
* contains a tracepoint and x86_spec_ctrl_restore_host() calls
* into world and some more.
*/
lockdep_hardirqs_off(CALLER_ADDR0);
- guest_exit_irqoff();
+ context_tracking_guest_exit();

instrumentation_begin();
trace_hardirqs_off_finish();
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 10b610fc7bbc..8425827068c3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6701,15 +6701,15 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
* have them in state 'on' as recorded before entering guest mode.
* Same as enter_from_user_mode().
*
- * guest_exit_irqoff() restores host context and reinstates RCU if
- * enabled and required.
+ * context_tracking_guest_exit() restores host context and reinstates
+ * RCU if enabled and required.
*
* This needs to be done before the below as native_read_msr()
* contains a tracepoint and x86_spec_ctrl_restore_host() calls
* into world and some more.
*/
lockdep_hardirqs_off(CALLER_ADDR0);
- guest_exit_irqoff();
+ context_tracking_guest_exit();

instrumentation_begin();
trace_hardirqs_off_finish();
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3bf52ba5f2bb..40e958617405 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9367,6 +9367,15 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
local_irq_disable();
kvm_after_interrupt(vcpu);

+ /*
+ * Wait until after servicing IRQs to account guest time so that any
+ * ticks that occurred while running the guest are properly accounted
+ * to the guest. Waiting until IRQs are enabled degrades the accuracy
+ * of accounting via context tracking, but the loss of accuracy is
+ * acceptable for all known use cases.
+ */
+ vtime_account_guest_exit();
+
if (lapic_in_kernel(vcpu)) {
s64 delta = vcpu->arch.apic->lapic_timer.advance_expire_delta;
if (delta != S64_MIN) {
--
2.31.1.527.g47e6f16901-goog
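
To illustrate why the ordering in the patch above matters, here is a minimal userspace sketch, not kernel code. It assumes a simplified model in which a tick that fires while the guest runs stays pending until KVM opens its IRQ window after VM-Exit. All `sim_*` names and `SIM_PF_VCPU` are hypothetical stand-ins and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_PF_VCPU 0x1u	/* hypothetical stand-in for PF_VCPU */

struct sim_task {
	unsigned int flags;
	uint64_t guest_ticks;	/* ticks charged to guest time */
	uint64_t system_ticks;	/* ticks charged to host kernel time */
};

/* Tick handler: the tick counts as guest time only if the flag is set. */
static void sim_tick_irq(struct sim_task *t)
{
	if (t->flags & SIM_PF_VCPU)
		t->guest_ticks++;
	else
		t->system_ticks++;
}

/*
 * One guest run during which exactly one tick fires (assumption).
 * defer_accounting == false models clearing the flag before IRQs are
 * enabled; true models clearing it after the IRQ window.
 */
static void sim_run_guest(struct sim_task *t, bool defer_accounting)
{
	assert(t != NULL);

	t->flags |= SIM_PF_VCPU;		/* guest enter */

	if (!defer_accounting)
		t->flags &= ~SIM_PF_VCPU;	/* cleared with IRQs still off */

	sim_tick_irq(t);			/* IRQ window: pending tick runs */

	if (defer_accounting)
		t->flags &= ~SIM_PF_VCPU;	/* cleared after IRQ handling */
}
```

In this model, clearing the flag before the IRQ window charges the tick to system time, which is consistent with the '0' guest time symptom described in the bug report. Deferring the clear charges the same tick to the guest.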

2021-05-05 00:29:24

by Sean Christopherson

Subject: [PATCH v4 4/8] sched/vtime: Move vtime accounting external declarations above inlines

Move the blob of external declarations (and their stubs) above the set of
inline definitions (and their stubs) for vtime accounting. This will
allow a future patch to bring in more inline definitions without also
having to shuffle large chunks of code.

No functional change intended.

Reviewed-by: Christian Borntraeger <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
include/linux/vtime.h | 94 +++++++++++++++++++++----------------------
1 file changed, 47 insertions(+), 47 deletions(-)

diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 041d6524d144..6a4317560539 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -10,53 +10,6 @@

struct task_struct;

-/*
- * vtime_accounting_enabled_this_cpu() definitions/declarations
- */
-#if defined(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE)
-
-static inline bool vtime_accounting_enabled_this_cpu(void) { return true; }
-extern void vtime_task_switch(struct task_struct *prev);
-
-#elif defined(CONFIG_VIRT_CPU_ACCOUNTING_GEN)
-
-/*
- * Checks if vtime is enabled on some CPU. Cputime readers want to be careful
- * in that case and compute the tickless cputime.
- * For now vtime state is tied to context tracking. We might want to decouple
- * those later if necessary.
- */
-static inline bool vtime_accounting_enabled(void)
-{
- return context_tracking_enabled();
-}
-
-static inline bool vtime_accounting_enabled_cpu(int cpu)
-{
- return context_tracking_enabled_cpu(cpu);
-}
-
-static inline bool vtime_accounting_enabled_this_cpu(void)
-{
- return context_tracking_enabled_this_cpu();
-}
-
-extern void vtime_task_switch_generic(struct task_struct *prev);
-
-static inline void vtime_task_switch(struct task_struct *prev)
-{
- if (vtime_accounting_enabled_this_cpu())
- vtime_task_switch_generic(prev);
-}
-
-#else /* !CONFIG_VIRT_CPU_ACCOUNTING */
-
-static inline bool vtime_accounting_enabled_cpu(int cpu) {return false; }
-static inline bool vtime_accounting_enabled_this_cpu(void) { return false; }
-static inline void vtime_task_switch(struct task_struct *prev) { }
-
-#endif
-
/*
* Common vtime APIs
*/
@@ -94,6 +47,53 @@ static inline void vtime_account_hardirq(struct task_struct *tsk) { }
static inline void vtime_flush(struct task_struct *tsk) { }
#endif

+/*
+ * vtime_accounting_enabled_this_cpu() definitions/declarations
+ */
+#if defined(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE)
+
+static inline bool vtime_accounting_enabled_this_cpu(void) { return true; }
+extern void vtime_task_switch(struct task_struct *prev);
+
+#elif defined(CONFIG_VIRT_CPU_ACCOUNTING_GEN)
+
+/*
+ * Checks if vtime is enabled on some CPU. Cputime readers want to be careful
+ * in that case and compute the tickless cputime.
+ * For now vtime state is tied to context tracking. We might want to decouple
+ * those later if necessary.
+ */
+static inline bool vtime_accounting_enabled(void)
+{
+ return context_tracking_enabled();
+}
+
+static inline bool vtime_accounting_enabled_cpu(int cpu)
+{
+ return context_tracking_enabled_cpu(cpu);
+}
+
+static inline bool vtime_accounting_enabled_this_cpu(void)
+{
+ return context_tracking_enabled_this_cpu();
+}
+
+extern void vtime_task_switch_generic(struct task_struct *prev);
+
+static inline void vtime_task_switch(struct task_struct *prev)
+{
+ if (vtime_accounting_enabled_this_cpu())
+ vtime_task_switch_generic(prev);
+}
+
+#else /* !CONFIG_VIRT_CPU_ACCOUNTING */
+
+static inline bool vtime_accounting_enabled_cpu(int cpu) {return false; }
+static inline bool vtime_accounting_enabled_this_cpu(void) { return false; }
+static inline void vtime_task_switch(struct task_struct *prev) { }
+
+#endif
+

#ifdef CONFIG_IRQ_TIME_ACCOUNTING
extern void irqtime_account_irq(struct task_struct *tsk, unsigned int offset);
--
2.31.1.527.g47e6f16901-goog

2021-05-05 00:30:45

by Sean Christopherson

Subject: [PATCH v4 6/8] context_tracking: Consolidate guest enter/exit wrappers

Consolidate the guest enter/exit wrappers, providing and tweaking stubs
as needed. This will allow moving the wrappers under KVM without having
to bleed #ifdefs into the soon-to-be KVM code.

No functional change intended.

Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
include/linux/context_tracking.h | 65 ++++++++++++--------------------
1 file changed, 24 insertions(+), 41 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 56c648bdbde8..aa58c2ac67ca 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -71,6 +71,19 @@ static inline void exception_exit(enum ctx_state prev_ctx)
}
}

+static __always_inline bool context_tracking_guest_enter(void)
+{
+ if (context_tracking_enabled())
+ __context_tracking_enter(CONTEXT_GUEST);
+
+ return context_tracking_enabled_this_cpu();
+}
+
+static __always_inline void context_tracking_guest_exit(void)
+{
+ if (context_tracking_enabled())
+ __context_tracking_exit(CONTEXT_GUEST);
+}

/**
* ct_state() - return the current context tracking state if known
@@ -92,6 +105,9 @@ static inline void user_exit_irqoff(void) { }
static inline enum ctx_state exception_enter(void) { return 0; }
static inline void exception_exit(enum ctx_state prev_ctx) { }
static inline enum ctx_state ct_state(void) { return CONTEXT_DISABLED; }
+static inline bool context_tracking_guest_enter(void) { return false; }
+static inline void context_tracking_guest_exit(void) { }
+
#endif /* !CONFIG_CONTEXT_TRACKING */

#define CT_WARN_ON(cond) WARN_ON(context_tracking_enabled() && (cond))
@@ -102,74 +118,41 @@ extern void context_tracking_init(void);
static inline void context_tracking_init(void) { }
#endif /* CONFIG_CONTEXT_TRACKING_FORCE */

-
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
/* must be called with irqs disabled */
static __always_inline void guest_enter_irqoff(void)
{
+ /*
+ * This is running in ioctl context so its safe to assume that it's the
+ * stime pending cputime to flush.
+ */
instrumentation_begin();
- if (vtime_accounting_enabled_this_cpu())
- vtime_guest_enter(current);
- else
- current->flags |= PF_VCPU;
+ vtime_account_guest_enter();
instrumentation_end();

- if (context_tracking_enabled())
- __context_tracking_enter(CONTEXT_GUEST);
-
- /* KVM does not hold any references to rcu protected data when it
+ /*
+ * KVM does not hold any references to rcu protected data when it
* switches CPU into a guest mode. In fact switching to a guest mode
* is very similar to exiting to userspace from rcu point of view. In
* addition CPU may stay in a guest mode for quite a long time (up to
* one time slice). Lets treat guest mode as quiescent state, just like
* we do with user-mode execution.
*/
- if (!context_tracking_enabled_this_cpu()) {
+ if (!context_tracking_guest_enter()) {
instrumentation_begin();
rcu_virt_note_context_switch(smp_processor_id());
instrumentation_end();
}
}

-static __always_inline void context_tracking_guest_exit(void)
-{
- if (context_tracking_enabled())
- __context_tracking_exit(CONTEXT_GUEST);
-}
-
static __always_inline void guest_exit_irqoff(void)
{
context_tracking_guest_exit();

- instrumentation_begin();
- vtime_account_guest_exit();
- instrumentation_end();
-}
-
-#else
-static __always_inline void guest_enter_irqoff(void)
-{
- /*
- * This is running in ioctl context so its safe
- * to assume that it's the stime pending cputime
- * to flush.
- */
- instrumentation_begin();
- vtime_account_guest_enter();
- rcu_virt_note_context_switch(smp_processor_id());
- instrumentation_end();
-}
-
-static __always_inline void context_tracking_guest_exit(void) { }
-
-static __always_inline void guest_exit_irqoff(void)
-{
instrumentation_begin();
/* Flush the guest cputime we spent on the guest */
vtime_account_guest_exit();
instrumentation_end();
}
-#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */

static inline void guest_exit(void)
{
--
2.31.1.527.g47e6f16901-goog
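
The consolidated wrapper in the patch above relies on context_tracking_guest_enter() returning whether context tracking is active on the current CPU, so the caller can fall back to an explicit RCU note when it is not. A minimal userspace sketch of that control flow follows. The `sim_*` variables and functions are hypothetical stand-ins: the counters merely record which path ran, and nothing here calls real kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the context tracking state checks. */
static bool sim_ct_enabled;		/* tracking enabled on some CPU */
static bool sim_ct_enabled_this_cpu;	/* tracking enabled on this CPU */

static int sim_ct_enter_calls;		/* models __context_tracking_enter() */
static int sim_rcu_notes;		/* models rcu_virt_note_context_switch() */

/* Mirrors the shape of the new helper: enter if enabled, report per-CPU state. */
static bool sim_context_tracking_guest_enter(void)
{
	if (sim_ct_enabled)
		sim_ct_enter_calls++;

	return sim_ct_enabled_this_cpu;
}

/* Caller: note the quiescent state only when this CPU is not tracked. */
static void sim_guest_enter(void)
{
	assert(sim_ct_enabled || !sim_ct_enabled_this_cpu);

	if (!sim_context_tracking_guest_enter())
		sim_rcu_notes++;
}
```

With the stub variant returning false, the same caller body works whether or not context tracking is compiled in, which is what lets the two #ifdef'd copies of the wrapper collapse into one.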

2021-05-05 00:32:39

by Sean Christopherson

Subject: [PATCH v4 7/8] context_tracking: KVM: Move guest enter/exit wrappers to KVM's domain

Move the guest enter/exit wrappers to kvm_host.h so that KVM can manage
its context tracking vs. vtime accounting without bleeding too many KVM
details into the context tracking code.

No functional change intended.

Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
include/linux/context_tracking.h | 45 --------------------------------
include/linux/kvm_host.h | 45 ++++++++++++++++++++++++++++++++
2 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index aa58c2ac67ca..4d7fced3a39f 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -118,49 +118,4 @@ extern void context_tracking_init(void);
static inline void context_tracking_init(void) { }
#endif /* CONFIG_CONTEXT_TRACKING_FORCE */

-/* must be called with irqs disabled */
-static __always_inline void guest_enter_irqoff(void)
-{
- /*
- * This is running in ioctl context so its safe to assume that it's the
- * stime pending cputime to flush.
- */
- instrumentation_begin();
- vtime_account_guest_enter();
- instrumentation_end();
-
- /*
- * KVM does not hold any references to rcu protected data when it
- * switches CPU into a guest mode. In fact switching to a guest mode
- * is very similar to exiting to userspace from rcu point of view. In
- * addition CPU may stay in a guest mode for quite a long time (up to
- * one time slice). Lets treat guest mode as quiescent state, just like
- * we do with user-mode execution.
- */
- if (!context_tracking_guest_enter()) {
- instrumentation_begin();
- rcu_virt_note_context_switch(smp_processor_id());
- instrumentation_end();
- }
-}
-
-static __always_inline void guest_exit_irqoff(void)
-{
- context_tracking_guest_exit();
-
- instrumentation_begin();
- /* Flush the guest cputime we spent on the guest */
- vtime_account_guest_exit();
- instrumentation_end();
-}
-
-static inline void guest_exit(void)
-{
- unsigned long flags;
-
- local_irq_save(flags);
- guest_exit_irqoff();
- local_irq_restore(flags);
-}
-
#endif
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a9a7bcf6ebee..a6f47ed8b1e6 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -338,6 +338,51 @@ struct kvm_vcpu {
struct kvm_dirty_ring dirty_ring;
};

+/* must be called with irqs disabled */
+static __always_inline void guest_enter_irqoff(void)
+{
+ /*
+ * This is running in ioctl context so its safe to assume that it's the
+ * stime pending cputime to flush.
+ */
+ instrumentation_begin();
+ vtime_account_guest_enter();
+ instrumentation_end();
+
+ /*
+ * KVM does not hold any references to rcu protected data when it
+ * switches CPU into a guest mode. In fact switching to a guest mode
+ * is very similar to exiting to userspace from rcu point of view. In
+ * addition CPU may stay in a guest mode for quite a long time (up to
+ * one time slice). Lets treat guest mode as quiescent state, just like
+ * we do with user-mode execution.
+ */
+ if (!context_tracking_guest_enter()) {
+ instrumentation_begin();
+ rcu_virt_note_context_switch(smp_processor_id());
+ instrumentation_end();
+ }
+}
+
+static __always_inline void guest_exit_irqoff(void)
+{
+ context_tracking_guest_exit();
+
+ instrumentation_begin();
+ /* Flush the guest cputime we spent on the guest */
+ vtime_account_guest_exit();
+ instrumentation_end();
+}
+
+static inline void guest_exit(void)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ guest_exit_irqoff();
+ local_irq_restore(flags);
+}
+
static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
{
/*
--
2.31.1.527.g47e6f16901-goog

2021-05-05 00:33:08

by Sean Christopherson

Subject: [PATCH v4 8/8] KVM: x86: Consolidate guest enter/exit logic to common helpers

Move the enter/exit logic in {svm,vmx}_vcpu_enter_exit() to common
helpers. Opportunistically update the somewhat stale comment about the
updates needing to occur immediately after VM-Exit.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/svm.c | 39 ++----------------------------------
arch/x86/kvm/vmx/vmx.c | 39 ++----------------------------------
arch/x86/kvm/x86.h | 45 ++++++++++++++++++++++++++++++++++++++++++
3 files changed, 49 insertions(+), 74 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7dd63545526b..8abaf4ec4f22 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3710,25 +3710,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
unsigned long vmcb_pa = svm->current_vmcb->pa;

- /*
- * VMENTER enables interrupts (host state), but the kernel state is
- * interrupts disabled when this is invoked. Also tell RCU about
- * it. This is the same logic as for exit_to_user_mode().
- *
- * This ensures that e.g. latency analysis on the host observes
- * guest mode as interrupt enabled.
- *
- * guest_enter_irqoff() informs context tracking about the
- * transition to guest mode and if enabled adjusts RCU state
- * accordingly.
- */
- instrumentation_begin();
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare(CALLER_ADDR0);
- instrumentation_end();
-
- guest_enter_irqoff();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ kvm_guest_enter_irqoff();

if (sev_es_guest(vcpu->kvm)) {
__svm_sev_es_vcpu_run(vmcb_pa);
@@ -3748,24 +3730,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
vmload(__sme_page_pa(sd->save_area));
}

- /*
- * VMEXIT disables interrupts (host state), but tracing and lockdep
- * have them in state 'on' as recorded before entering guest mode.
- * Same as enter_from_user_mode().
- *
- * context_tracking_guest_exit() restores host context and reinstates
- * RCU if enabled and required.
- *
- * This needs to be done before the below as native_read_msr()
- * contains a tracepoint and x86_spec_ctrl_restore_host() calls
- * into world and some more.
- */
- lockdep_hardirqs_off(CALLER_ADDR0);
- context_tracking_guest_exit();
-
- instrumentation_begin();
- trace_hardirqs_off_finish();
- instrumentation_end();
+ kvm_guest_exit_irqoff();
}

static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8425827068c3..dd6fae37b139 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6662,25 +6662,7 @@ static fastpath_t vmx_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
struct vcpu_vmx *vmx)
{
- /*
- * VMENTER enables interrupts (host state), but the kernel state is
- * interrupts disabled when this is invoked. Also tell RCU about
- * it. This is the same logic as for exit_to_user_mode().
- *
- * This ensures that e.g. latency analysis on the host observes
- * guest mode as interrupt enabled.
- *
- * guest_enter_irqoff() informs context tracking about the
- * transition to guest mode and if enabled adjusts RCU state
- * accordingly.
- */
- instrumentation_begin();
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare(CALLER_ADDR0);
- instrumentation_end();
-
- guest_enter_irqoff();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ kvm_guest_enter_irqoff();

/* L1D Flush includes CPU buffer clear to mitigate MDS */
if (static_branch_unlikely(&vmx_l1d_should_flush))
@@ -6696,24 +6678,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,

vcpu->arch.cr2 = native_read_cr2();

- /*
- * VMEXIT disables interrupts (host state), but tracing and lockdep
- * have them in state 'on' as recorded before entering guest mode.
- * Same as enter_from_user_mode().
- *
- * context_tracking_guest_exit() restores host context and reinstates
- * RCU if enabled and required.
- *
- * This needs to be done before the below as native_read_msr()
- * contains a tracepoint and x86_spec_ctrl_restore_host() calls
- * into world and some more.
- */
- lockdep_hardirqs_off(CALLER_ADDR0);
- context_tracking_guest_exit();
-
- instrumentation_begin();
- trace_hardirqs_off_finish();
- instrumentation_end();
+ kvm_guest_exit_irqoff();
}

static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 8ddd38146525..521f74e5bbf2 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -8,6 +8,51 @@
#include "kvm_cache_regs.h"
#include "kvm_emulate.h"

+static __always_inline void kvm_guest_enter_irqoff(void)
+{
+ /*
+ * VMENTER enables interrupts (host state), but the kernel state is
+ * interrupts disabled when this is invoked. Also tell RCU about
+ * it. This is the same logic as for exit_to_user_mode().
+ *
+ * This ensures that e.g. latency analysis on the host observes
+ * guest mode as interrupt enabled.
+ *
+ * guest_enter_irqoff() informs context tracking about the
+ * transition to guest mode and if enabled adjusts RCU state
+ * accordingly.
+ */
+ instrumentation_begin();
+ trace_hardirqs_on_prepare();
+ lockdep_hardirqs_on_prepare(CALLER_ADDR0);
+ instrumentation_end();
+
+ guest_enter_irqoff();
+ lockdep_hardirqs_on(CALLER_ADDR0);
+}
+
+static __always_inline void kvm_guest_exit_irqoff(void)
+{
+ /*
+ * VMEXIT disables interrupts (host state), but tracing and lockdep
+ * have them in state 'on' as recorded before entering guest mode.
+ * Same as enter_from_user_mode().
+ *
+ * context_tracking_guest_exit() restores host context and reinstates
+ * RCU if enabled and required.
+ *
+ * This needs to be done immediately after VM-Exit, before any code
+ * that might contain tracepoints or call out to the greater world,
+ * e.g. before x86_spec_ctrl_restore_host().
+ */
+ lockdep_hardirqs_off(CALLER_ADDR0);
+ context_tracking_guest_exit();
+
+ instrumentation_begin();
+ trace_hardirqs_off_finish();
+ instrumentation_end();
+}
+
#define KVM_NESTED_VMENTER_CONSISTENCY_CHECK(consistency_check) \
({ \
bool failed = (consistency_check); \
--
2.31.1.527.g47e6f16901-goog

2021-05-05 01:56:58

by Sean Christopherson

Subject: [PATCH v4 5/8] sched/vtime: Move guest enter/exit vtime accounting to vtime.h

Provide separate helpers for guest enter vtime accounting (in addition to
the existing guest exit helpers), and move all vtime accounting helpers
to vtime.h where the existing #ifdef infrastructure can be leveraged to
better delineate the different types of accounting. This will also allow
future cleanups via deduplication of context tracking code.

Opportunistically delete the vtime_account_kernel() stub now that all
callers are wrapped with CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
---
include/linux/context_tracking.h | 17 +-----------
include/linux/vtime.h | 46 +++++++++++++++++++++++++++-----
2 files changed, 41 insertions(+), 22 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 4f4556232dcf..56c648bdbde8 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -137,14 +137,6 @@ static __always_inline void context_tracking_guest_exit(void)
__context_tracking_exit(CONTEXT_GUEST);
}

-static __always_inline void vtime_account_guest_exit(void)
-{
- if (vtime_accounting_enabled_this_cpu())
- vtime_guest_exit(current);
- else
- current->flags &= ~PF_VCPU;
-}
-
static __always_inline void guest_exit_irqoff(void)
{
context_tracking_guest_exit();
@@ -163,20 +155,13 @@ static __always_inline void guest_enter_irqoff(void)
* to flush.
*/
instrumentation_begin();
- vtime_account_kernel(current);
- current->flags |= PF_VCPU;
+ vtime_account_guest_enter();
rcu_virt_note_context_switch(smp_processor_id());
instrumentation_end();
}

static __always_inline void context_tracking_guest_exit(void) { }

-static __always_inline void vtime_account_guest_exit(void)
-{
- vtime_account_kernel(current);
- current->flags &= ~PF_VCPU;
-}
-
static __always_inline void guest_exit_irqoff(void)
{
instrumentation_begin();
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 6a4317560539..3684487d01e1 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -3,21 +3,18 @@
#define _LINUX_KERNEL_VTIME_H

#include <linux/context_tracking_state.h>
+#include <linux/sched.h>
+
#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
#include <asm/vtime.h>
#endif

-
-struct task_struct;
-
/*
* Common vtime APIs
*/
#ifdef CONFIG_VIRT_CPU_ACCOUNTING
extern void vtime_account_kernel(struct task_struct *tsk);
extern void vtime_account_idle(struct task_struct *tsk);
-#else /* !CONFIG_VIRT_CPU_ACCOUNTING */
-static inline void vtime_account_kernel(struct task_struct *tsk) { }
#endif /* !CONFIG_VIRT_CPU_ACCOUNTING */

#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
@@ -55,6 +52,18 @@ static inline void vtime_flush(struct task_struct *tsk) { }
static inline bool vtime_accounting_enabled_this_cpu(void) { return true; }
extern void vtime_task_switch(struct task_struct *prev);

+static __always_inline void vtime_account_guest_enter(void)
+{
+ vtime_account_kernel(current);
+ current->flags |= PF_VCPU;
+}
+
+static __always_inline void vtime_account_guest_exit(void)
+{
+ vtime_account_kernel(current);
+ current->flags &= ~PF_VCPU;
+}
+
#elif defined(CONFIG_VIRT_CPU_ACCOUNTING_GEN)

/*
@@ -86,12 +95,37 @@ static inline void vtime_task_switch(struct task_struct *prev)
vtime_task_switch_generic(prev);
}

+static __always_inline void vtime_account_guest_enter(void)
+{
+ if (vtime_accounting_enabled_this_cpu())
+ vtime_guest_enter(current);
+ else
+ current->flags |= PF_VCPU;
+}
+
+static __always_inline void vtime_account_guest_exit(void)
+{
+ if (vtime_accounting_enabled_this_cpu())
+ vtime_guest_exit(current);
+ else
+ current->flags &= ~PF_VCPU;
+}
+
#else /* !CONFIG_VIRT_CPU_ACCOUNTING */

-static inline bool vtime_accounting_enabled_cpu(int cpu) {return false; }
static inline bool vtime_accounting_enabled_this_cpu(void) { return false; }
static inline void vtime_task_switch(struct task_struct *prev) { }

+static __always_inline void vtime_account_guest_enter(void)
+{
+ current->flags |= PF_VCPU;
+}
+
+static __always_inline void vtime_account_guest_exit(void)
+{
+ current->flags &= ~PF_VCPU;
+}
+
#endif


--
2.31.1.527.g47e6f16901-goog

2021-05-05 21:08:29

by Thomas Gleixner

Subject: Re: [PATCH v4 3/8] KVM: x86: Defer vtime accounting 'til after IRQ handling

On Tue, May 04 2021 at 17:27, Sean Christopherson wrote:
> Fixes: 87fa7f3e98a131 ("x86/kvm: Move context tracking where it belongs")
...
> Cc: [email protected]#v5.9-rc1+

Bah. Breaks scripts as this is really not a valid email address and
aside from that the Fixes tag already identifies clearly which kernel
versions this affects.

Thanks,

tglx

Subject: [tip: x86/urgent] context_tracking: Consolidate guest enter/exit wrappers

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 14296e0c447885d6c7b326e059fb528eb00526ed
Gitweb: https://git.kernel.org/tip/14296e0c447885d6c7b326e059fb528eb00526ed
Author: Sean Christopherson <[email protected]>
AuthorDate: Tue, 04 May 2021 17:27:33 -07:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Wed, 05 May 2021 22:54:11 +02:00

context_tracking: Consolidate guest enter/exit wrappers

Consolidate the guest enter/exit wrappers, providing and tweaking stubs
as needed. This will allow moving the wrappers under KVM without having
to bleed #ifdefs into the soon-to-be KVM code.

No functional change intended.
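
As a rough illustration of the stub approach described above (not the kernel
code itself), the sketch below shows how a helper that reports whether context
tracking handled the transition lets the caller stay free of #ifdefs. All
toy_* names and the TOY_CONTEXT_TRACKING switch are hypothetical stand-ins for
CONFIG_CONTEXT_TRACKING and the real helpers, and the enabled variant is
simplified to a single flag.

```c
#include <stdbool.h>

/* Hypothetical build-time switch standing in for CONFIG_CONTEXT_TRACKING. */
#ifndef TOY_CONTEXT_TRACKING
#define TOY_CONTEXT_TRACKING 0
#endif

static int toy_ct_enters;	/* times context tracking handled guest entry */
static int toy_rcu_notes;	/* times the RCU fallback path was taken */

#if TOY_CONTEXT_TRACKING
static bool toy_ct_enabled_this_cpu = true;

/* Real variant: do the tracking work and report whether it was active. */
static inline bool toy_context_tracking_guest_enter(void)
{
	if (toy_ct_enabled_this_cpu)
		toy_ct_enters++;

	return toy_ct_enabled_this_cpu;
}
#else
/* Stub variant: nothing to track, so tell the caller to use the fallback. */
static inline bool toy_context_tracking_guest_enter(void) { return false; }
#endif

/* The caller is identical for both configurations: no #ifdef needed here. */
static inline void toy_guest_enter(void)
{
	if (!toy_context_tracking_guest_enter())
		toy_rcu_notes++;
}
```

With the switch left at its default of 0, a call to toy_guest_enter() takes the
fallback path; building with -DTOY_CONTEXT_TRACKING=1 routes the same caller
through the tracking variant instead.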

Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
include/linux/context_tracking.h | 65 +++++++++++--------------------
1 file changed, 24 insertions(+), 41 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 56c648b..aa58c2a 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -71,6 +71,19 @@ static inline void exception_exit(enum ctx_state prev_ctx)
}
}

+static __always_inline bool context_tracking_guest_enter(void)
+{
+ if (context_tracking_enabled())
+ __context_tracking_enter(CONTEXT_GUEST);
+
+ return context_tracking_enabled_this_cpu();
+}
+
+static __always_inline void context_tracking_guest_exit(void)
+{
+ if (context_tracking_enabled())
+ __context_tracking_exit(CONTEXT_GUEST);
+}

/**
* ct_state() - return the current context tracking state if known
@@ -92,6 +105,9 @@ static inline void user_exit_irqoff(void) { }
static inline enum ctx_state exception_enter(void) { return 0; }
static inline void exception_exit(enum ctx_state prev_ctx) { }
static inline enum ctx_state ct_state(void) { return CONTEXT_DISABLED; }
+static inline bool context_tracking_guest_enter(void) { return false; }
+static inline void context_tracking_guest_exit(void) { }
+
#endif /* !CONFIG_CONTEXT_TRACKING */

#define CT_WARN_ON(cond) WARN_ON(context_tracking_enabled() && (cond))
@@ -102,74 +118,41 @@ extern void context_tracking_init(void);
static inline void context_tracking_init(void) { }
#endif /* CONFIG_CONTEXT_TRACKING_FORCE */

-
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
/* must be called with irqs disabled */
static __always_inline void guest_enter_irqoff(void)
{
+ /*
+ * This is running in ioctl context so its safe to assume that it's the
+ * stime pending cputime to flush.
+ */
instrumentation_begin();
- if (vtime_accounting_enabled_this_cpu())
- vtime_guest_enter(current);
- else
- current->flags |= PF_VCPU;
+ vtime_account_guest_enter();
instrumentation_end();

- if (context_tracking_enabled())
- __context_tracking_enter(CONTEXT_GUEST);
-
- /* KVM does not hold any references to rcu protected data when it
+ /*
+ * KVM does not hold any references to rcu protected data when it
* switches CPU into a guest mode. In fact switching to a guest mode
* is very similar to exiting to userspace from rcu point of view. In
* addition CPU may stay in a guest mode for quite a long time (up to
* one time slice). Lets treat guest mode as quiescent state, just like
* we do with user-mode execution.
*/
- if (!context_tracking_enabled_this_cpu()) {
+ if (!context_tracking_guest_enter()) {
instrumentation_begin();
rcu_virt_note_context_switch(smp_processor_id());
instrumentation_end();
}
}

-static __always_inline void context_tracking_guest_exit(void)
-{
- if (context_tracking_enabled())
- __context_tracking_exit(CONTEXT_GUEST);
-}
-
static __always_inline void guest_exit_irqoff(void)
{
context_tracking_guest_exit();

instrumentation_begin();
- vtime_account_guest_exit();
- instrumentation_end();
-}
-
-#else
-static __always_inline void guest_enter_irqoff(void)
-{
- /*
- * This is running in ioctl context so its safe
- * to assume that it's the stime pending cputime
- * to flush.
- */
- instrumentation_begin();
- vtime_account_guest_enter();
- rcu_virt_note_context_switch(smp_processor_id());
- instrumentation_end();
-}
-
-static __always_inline void context_tracking_guest_exit(void) { }
-
-static __always_inline void guest_exit_irqoff(void)
-{
- instrumentation_begin();
/* Flush the guest cputime we spent on the guest */
vtime_account_guest_exit();
instrumentation_end();
}
-#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */

static inline void guest_exit(void)
{

Subject: [tip: x86/urgent] sched/vtime: Move vtime accounting external declarations above inlines

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: b41c723b203e19480c26f2ec8f04eedc03d34b34
Gitweb: https://git.kernel.org/tip/b41c723b203e19480c26f2ec8f04eedc03d34b34
Author: Sean Christopherson <[email protected]>
AuthorDate: Tue, 04 May 2021 17:27:31 -07:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Wed, 05 May 2021 22:54:11 +02:00

sched/vtime: Move vtime accounting external declarations above inlines

Move the blob of external declarations (and their stubs) above the set of
inline definitions (and their stubs) for vtime accounting. This will
allow a future patch to bring in more inline definitions without also
having to shuffle large chunks of code.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
include/linux/vtime.h | 74 +++++++++++++++++++++---------------------
1 file changed, 37 insertions(+), 37 deletions(-)

diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 041d652..6a43175 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -11,6 +11,43 @@
struct task_struct;

/*
+ * Common vtime APIs
+ */
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING
+extern void vtime_account_kernel(struct task_struct *tsk);
+extern void vtime_account_idle(struct task_struct *tsk);
+#else /* !CONFIG_VIRT_CPU_ACCOUNTING */
+static inline void vtime_account_kernel(struct task_struct *tsk) { }
+#endif /* !CONFIG_VIRT_CPU_ACCOUNTING */
+
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+extern void arch_vtime_task_switch(struct task_struct *tsk);
+extern void vtime_user_enter(struct task_struct *tsk);
+extern void vtime_user_exit(struct task_struct *tsk);
+extern void vtime_guest_enter(struct task_struct *tsk);
+extern void vtime_guest_exit(struct task_struct *tsk);
+extern void vtime_init_idle(struct task_struct *tsk, int cpu);
+#else /* !CONFIG_VIRT_CPU_ACCOUNTING_GEN */
+static inline void vtime_user_enter(struct task_struct *tsk) { }
+static inline void vtime_user_exit(struct task_struct *tsk) { }
+static inline void vtime_guest_enter(struct task_struct *tsk) { }
+static inline void vtime_guest_exit(struct task_struct *tsk) { }
+static inline void vtime_init_idle(struct task_struct *tsk, int cpu) { }
+#endif
+
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+extern void vtime_account_irq(struct task_struct *tsk, unsigned int offset);
+extern void vtime_account_softirq(struct task_struct *tsk);
+extern void vtime_account_hardirq(struct task_struct *tsk);
+extern void vtime_flush(struct task_struct *tsk);
+#else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
+static inline void vtime_account_irq(struct task_struct *tsk, unsigned int offset) { }
+static inline void vtime_account_softirq(struct task_struct *tsk) { }
+static inline void vtime_account_hardirq(struct task_struct *tsk) { }
+static inline void vtime_flush(struct task_struct *tsk) { }
+#endif
+
+/*
* vtime_accounting_enabled_this_cpu() definitions/declarations
*/
#if defined(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE)
@@ -57,43 +94,6 @@ static inline void vtime_task_switch(struct task_struct *prev) { }

#endif

-/*
- * Common vtime APIs
- */
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING
-extern void vtime_account_kernel(struct task_struct *tsk);
-extern void vtime_account_idle(struct task_struct *tsk);
-#else /* !CONFIG_VIRT_CPU_ACCOUNTING */
-static inline void vtime_account_kernel(struct task_struct *tsk) { }
-#endif /* !CONFIG_VIRT_CPU_ACCOUNTING */
-
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
-extern void arch_vtime_task_switch(struct task_struct *tsk);
-extern void vtime_user_enter(struct task_struct *tsk);
-extern void vtime_user_exit(struct task_struct *tsk);
-extern void vtime_guest_enter(struct task_struct *tsk);
-extern void vtime_guest_exit(struct task_struct *tsk);
-extern void vtime_init_idle(struct task_struct *tsk, int cpu);
-#else /* !CONFIG_VIRT_CPU_ACCOUNTING_GEN */
-static inline void vtime_user_enter(struct task_struct *tsk) { }
-static inline void vtime_user_exit(struct task_struct *tsk) { }
-static inline void vtime_guest_enter(struct task_struct *tsk) { }
-static inline void vtime_guest_exit(struct task_struct *tsk) { }
-static inline void vtime_init_idle(struct task_struct *tsk, int cpu) { }
-#endif
-
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
-extern void vtime_account_irq(struct task_struct *tsk, unsigned int offset);
-extern void vtime_account_softirq(struct task_struct *tsk);
-extern void vtime_account_hardirq(struct task_struct *tsk);
-extern void vtime_flush(struct task_struct *tsk);
-#else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
-static inline void vtime_account_irq(struct task_struct *tsk, unsigned int offset) { }
-static inline void vtime_account_softirq(struct task_struct *tsk) { }
-static inline void vtime_account_hardirq(struct task_struct *tsk) { }
-static inline void vtime_flush(struct task_struct *tsk) { }
-#endif
-

#ifdef CONFIG_IRQ_TIME_ACCOUNTING
extern void irqtime_account_irq(struct task_struct *tsk, unsigned int offset);

Subject: [tip: x86/urgent] KVM: x86: Consolidate guest enter/exit logic to common helpers

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: bc908e091b3264672889162733020048901021fb
Gitweb: https://git.kernel.org/tip/bc908e091b3264672889162733020048901021fb
Author: Sean Christopherson <[email protected]>
AuthorDate: Tue, 04 May 2021 17:27:35 -07:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Wed, 05 May 2021 22:54:12 +02:00

KVM: x86: Consolidate guest enter/exit logic to common helpers

Move the enter/exit logic in {svm,vmx}_vcpu_enter_exit() to common
helpers. Opportunistically update the somewhat stale comment about the
updates needing to occur immediately after VM-Exit.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
arch/x86/kvm/svm/svm.c | 39 +-----------------------------------
arch/x86/kvm/vmx/vmx.c | 39 +-----------------------------------
arch/x86/kvm/x86.h | 45 +++++++++++++++++++++++++++++++++++++++++-
3 files changed, 49 insertions(+), 74 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index c400def..b649f92 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3710,25 +3710,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
unsigned long vmcb_pa = svm->current_vmcb->pa;

- /*
- * VMENTER enables interrupts (host state), but the kernel state is
- * interrupts disabled when this is invoked. Also tell RCU about
- * it. This is the same logic as for exit_to_user_mode().
- *
- * This ensures that e.g. latency analysis on the host observes
- * guest mode as interrupt enabled.
- *
- * guest_enter_irqoff() informs context tracking about the
- * transition to guest mode and if enabled adjusts RCU state
- * accordingly.
- */
- instrumentation_begin();
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare(CALLER_ADDR0);
- instrumentation_end();
-
- guest_enter_irqoff();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ kvm_guest_enter_irqoff();

if (sev_es_guest(vcpu->kvm)) {
__svm_sev_es_vcpu_run(vmcb_pa);
@@ -3748,24 +3730,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
vmload(__sme_page_pa(sd->save_area));
}

- /*
- * VMEXIT disables interrupts (host state), but tracing and lockdep
- * have them in state 'on' as recorded before entering guest mode.
- * Same as enter_from_user_mode().
- *
- * context_tracking_guest_exit() restores host context and reinstates
- * RCU if enabled and required.
- *
- * This needs to be done before the below as native_read_msr()
- * contains a tracepoint and x86_spec_ctrl_restore_host() calls
- * into world and some more.
- */
- lockdep_hardirqs_off(CALLER_ADDR0);
- context_tracking_guest_exit();
-
- instrumentation_begin();
- trace_hardirqs_off_finish();
- instrumentation_end();
+ kvm_guest_exit_irqoff();
}

static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e108fb4..d000cdd 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6664,25 +6664,7 @@ static fastpath_t vmx_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
struct vcpu_vmx *vmx)
{
- /*
- * VMENTER enables interrupts (host state), but the kernel state is
- * interrupts disabled when this is invoked. Also tell RCU about
- * it. This is the same logic as for exit_to_user_mode().
- *
- * This ensures that e.g. latency analysis on the host observes
- * guest mode as interrupt enabled.
- *
- * guest_enter_irqoff() informs context tracking about the
- * transition to guest mode and if enabled adjusts RCU state
- * accordingly.
- */
- instrumentation_begin();
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare(CALLER_ADDR0);
- instrumentation_end();
-
- guest_enter_irqoff();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ kvm_guest_enter_irqoff();

/* L1D Flush includes CPU buffer clear to mitigate MDS */
if (static_branch_unlikely(&vmx_l1d_should_flush))
@@ -6698,24 +6680,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,

vcpu->arch.cr2 = native_read_cr2();

- /*
- * VMEXIT disables interrupts (host state), but tracing and lockdep
- * have them in state 'on' as recorded before entering guest mode.
- * Same as enter_from_user_mode().
- *
- * context_tracking_guest_exit() restores host context and reinstates
- * RCU if enabled and required.
- *
- * This needs to be done before the below as native_read_msr()
- * contains a tracepoint and x86_spec_ctrl_restore_host() calls
- * into world and some more.
- */
- lockdep_hardirqs_off(CALLER_ADDR0);
- context_tracking_guest_exit();
-
- instrumentation_begin();
- trace_hardirqs_off_finish();
- instrumentation_end();
+ kvm_guest_exit_irqoff();
}

static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 8ddd381..521f74e 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -8,6 +8,51 @@
#include "kvm_cache_regs.h"
#include "kvm_emulate.h"

+static __always_inline void kvm_guest_enter_irqoff(void)
+{
+ /*
+ * VMENTER enables interrupts (host state), but the kernel state is
+ * interrupts disabled when this is invoked. Also tell RCU about
+ * it. This is the same logic as for exit_to_user_mode().
+ *
+ * This ensures that e.g. latency analysis on the host observes
+ * guest mode as interrupt enabled.
+ *
+ * guest_enter_irqoff() informs context tracking about the
+ * transition to guest mode and if enabled adjusts RCU state
+ * accordingly.
+ */
+ instrumentation_begin();
+ trace_hardirqs_on_prepare();
+ lockdep_hardirqs_on_prepare(CALLER_ADDR0);
+ instrumentation_end();
+
+ guest_enter_irqoff();
+ lockdep_hardirqs_on(CALLER_ADDR0);
+}
+
+static __always_inline void kvm_guest_exit_irqoff(void)
+{
+ /*
+ * VMEXIT disables interrupts (host state), but tracing and lockdep
+ * have them in state 'on' as recorded before entering guest mode.
+ * Same as enter_from_user_mode().
+ *
+ * context_tracking_guest_exit() restores host context and reinstates
+ * RCU if enabled and required.
+ *
+ * This needs to be done immediately after VM-Exit, before any code
+ * that might contain tracepoints or call out to the greater world,
+ * e.g. before x86_spec_ctrl_restore_host().
+ */
+ lockdep_hardirqs_off(CALLER_ADDR0);
+ context_tracking_guest_exit();
+
+ instrumentation_begin();
+ trace_hardirqs_off_finish();
+ instrumentation_end();
+}
+
#define KVM_NESTED_VMENTER_CONSISTENCY_CHECK(consistency_check) \
({ \
bool failed = (consistency_check); \

Subject: [tip: x86/urgent] sched/vtime: Move guest enter/exit vtime accounting to vtime.h

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 6f922b89e5518143920b10e3643e556d9df58d94
Gitweb: https://git.kernel.org/tip/6f922b89e5518143920b10e3643e556d9df58d94
Author: Sean Christopherson <[email protected]>
AuthorDate: Tue, 04 May 2021 17:27:32 -07:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Wed, 05 May 2021 22:54:11 +02:00

sched/vtime: Move guest enter/exit vtime accounting to vtime.h

Provide separate helpers for guest enter vtime accounting (in addition to
the existing guest exit helpers), and move all vtime accounting helpers
to vtime.h where the existing #ifdef infrastructure can be leveraged to
better delineate the different types of accounting. This will also allow
future cleanups via deduplication of context tracking code.

Opportunistically delete the vtime_account_kernel() stub now that all
callers are wrapped with CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
include/linux/context_tracking.h | 17 +-----------
include/linux/vtime.h | 46 ++++++++++++++++++++++++++-----
2 files changed, 41 insertions(+), 22 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 4f45562..56c648b 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -137,14 +137,6 @@ static __always_inline void context_tracking_guest_exit(void)
__context_tracking_exit(CONTEXT_GUEST);
}

-static __always_inline void vtime_account_guest_exit(void)
-{
- if (vtime_accounting_enabled_this_cpu())
- vtime_guest_exit(current);
- else
- current->flags &= ~PF_VCPU;
-}
-
static __always_inline void guest_exit_irqoff(void)
{
context_tracking_guest_exit();
@@ -163,20 +155,13 @@ static __always_inline void guest_enter_irqoff(void)
* to flush.
*/
instrumentation_begin();
- vtime_account_kernel(current);
- current->flags |= PF_VCPU;
+ vtime_account_guest_enter();
rcu_virt_note_context_switch(smp_processor_id());
instrumentation_end();
}

static __always_inline void context_tracking_guest_exit(void) { }

-static __always_inline void vtime_account_guest_exit(void)
-{
- vtime_account_kernel(current);
- current->flags &= ~PF_VCPU;
-}
-
static __always_inline void guest_exit_irqoff(void)
{
instrumentation_begin();
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 6a43175..3684487 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -3,21 +3,18 @@
#define _LINUX_KERNEL_VTIME_H

#include <linux/context_tracking_state.h>
+#include <linux/sched.h>
+
#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
#include <asm/vtime.h>
#endif

-
-struct task_struct;
-
/*
* Common vtime APIs
*/
#ifdef CONFIG_VIRT_CPU_ACCOUNTING
extern void vtime_account_kernel(struct task_struct *tsk);
extern void vtime_account_idle(struct task_struct *tsk);
-#else /* !CONFIG_VIRT_CPU_ACCOUNTING */
-static inline void vtime_account_kernel(struct task_struct *tsk) { }
#endif /* !CONFIG_VIRT_CPU_ACCOUNTING */

#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
@@ -55,6 +52,18 @@ static inline void vtime_flush(struct task_struct *tsk) { }
static inline bool vtime_accounting_enabled_this_cpu(void) { return true; }
extern void vtime_task_switch(struct task_struct *prev);

+static __always_inline void vtime_account_guest_enter(void)
+{
+ vtime_account_kernel(current);
+ current->flags |= PF_VCPU;
+}
+
+static __always_inline void vtime_account_guest_exit(void)
+{
+ vtime_account_kernel(current);
+ current->flags &= ~PF_VCPU;
+}
+
#elif defined(CONFIG_VIRT_CPU_ACCOUNTING_GEN)

/*
@@ -86,12 +95,37 @@ static inline void vtime_task_switch(struct task_struct *prev)
vtime_task_switch_generic(prev);
}

+static __always_inline void vtime_account_guest_enter(void)
+{
+ if (vtime_accounting_enabled_this_cpu())
+ vtime_guest_enter(current);
+ else
+ current->flags |= PF_VCPU;
+}
+
+static __always_inline void vtime_account_guest_exit(void)
+{
+ if (vtime_accounting_enabled_this_cpu())
+ vtime_guest_exit(current);
+ else
+ current->flags &= ~PF_VCPU;
+}
+
#else /* !CONFIG_VIRT_CPU_ACCOUNTING */

-static inline bool vtime_accounting_enabled_cpu(int cpu) {return false; }
static inline bool vtime_accounting_enabled_this_cpu(void) { return false; }
static inline void vtime_task_switch(struct task_struct *prev) { }

+static __always_inline void vtime_account_guest_enter(void)
+{
+ current->flags |= PF_VCPU;
+}
+
+static __always_inline void vtime_account_guest_exit(void)
+{
+ current->flags &= ~PF_VCPU;
+}
+
#endif


Subject: [tip: x86/urgent] KVM: x86: Defer vtime accounting 'til after IRQ handling

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 160457140187c5fb127b844e5a85f87f00a01b14
Gitweb: https://git.kernel.org/tip/160457140187c5fb127b844e5a85f87f00a01b14
Author: Wanpeng Li <[email protected]>
AuthorDate: Tue, 04 May 2021 17:27:30 -07:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Wed, 05 May 2021 22:54:11 +02:00

KVM: x86: Defer vtime accounting 'til after IRQ handling

Defer the call to account guest time until after servicing any IRQ(s)
that happened in the guest or immediately after VM-Exit. Tick-based
accounting of vCPU time relies on PF_VCPU being set when the tick IRQ
handler runs, and IRQs are blocked throughout the main sequence of
vcpu_enter_guest(), including the call into vendor code to actually
enter and exit the guest.

This fixes a bug where reported guest time remains '0', even when
running an infinite loop in the guest:

https://bugzilla.kernel.org/show_bug.cgi?id=209831
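
The ordering problem can be modelled in user space with a toy sketch. This is
an assumption-laden illustration, not KVM code: struct toy_task, toy_tick_irq()
and toy_run_guest() are hypothetical names, PF_VCPU is given an arbitrary
value, and "servicing the pending tick" is reduced to a direct function call.

```c
#include <stdbool.h>

#define PF_VCPU 0x1u	/* stand-in for the kernel flag; value is arbitrary */

struct toy_task {
	unsigned int flags;
	unsigned long guest_ticks;
	unsigned long system_ticks;
};

/* Toy tick handler: charge the tick to the guest only if PF_VCPU is set. */
static void toy_tick_irq(struct toy_task *t)
{
	if (t->flags & PF_VCPU)
		t->guest_ticks++;
	else
		t->system_ticks++;
}

/*
 * One toy guest run. A tick becomes pending while the guest runs, but
 * because IRQs are blocked it is only serviced after VM-Exit, once IRQs
 * are enabled again.
 */
static void toy_run_guest(struct toy_task *t, bool defer_accounting)
{
	bool tick_pending;

	t->flags |= PF_VCPU;		/* guest enter */
	tick_pending = true;		/* tick fires while in the guest */

	if (!defer_accounting)
		t->flags &= ~PF_VCPU;	/* old ordering: clear before IRQs */

	if (tick_pending)
		toy_tick_irq(t);	/* IRQs enabled: pending tick runs */

	if (defer_accounting)
		t->flags &= ~PF_VCPU;	/* new ordering: clear after IRQs */
}
```

In this model, toy_run_guest(&t, false) charges the tick to system time because
PF_VCPU is already clear when the handler runs, while toy_run_guest(&t, true)
charges it to guest time.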

Fixes: 87fa7f3e98a131 ("x86/kvm: Move context tracking where it belongs")
Suggested-by: Thomas Gleixner <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/kvm/svm/svm.c | 6 +++---
arch/x86/kvm/vmx/vmx.c | 6 +++---
arch/x86/kvm/x86.c | 9 +++++++++
3 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9790c73..c400def 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3753,15 +3753,15 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
* have them in state 'on' as recorded before entering guest mode.
* Same as enter_from_user_mode().
*
- * guest_exit_irqoff() restores host context and reinstates RCU if
- * enabled and required.
+ * context_tracking_guest_exit() restores host context and reinstates
+ * RCU if enabled and required.
*
* This needs to be done before the below as native_read_msr()
* contains a tracepoint and x86_spec_ctrl_restore_host() calls
* into world and some more.
*/
lockdep_hardirqs_off(CALLER_ADDR0);
- guest_exit_irqoff();
+ context_tracking_guest_exit();

instrumentation_begin();
trace_hardirqs_off_finish();
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b21d751..e108fb4 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6703,15 +6703,15 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
* have them in state 'on' as recorded before entering guest mode.
* Same as enter_from_user_mode().
*
- * guest_exit_irqoff() restores host context and reinstates RCU if
- * enabled and required.
+ * context_tracking_guest_exit() restores host context and reinstates
+ * RCU if enabled and required.
*
* This needs to be done before the below as native_read_msr()
* contains a tracepoint and x86_spec_ctrl_restore_host() calls
* into world and some more.
*/
lockdep_hardirqs_off(CALLER_ADDR0);
- guest_exit_irqoff();
+ context_tracking_guest_exit();

instrumentation_begin();
trace_hardirqs_off_finish();
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cebdaa1..6eda283 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9315,6 +9315,15 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
local_irq_disable();
kvm_after_interrupt(vcpu);

+ /*
+ * Wait until after servicing IRQs to account guest time so that any
+ * ticks that occurred while running the guest are properly accounted
+ * to the guest. Waiting until IRQs are enabled degrades the accuracy
+ * of accounting via context tracking, but the loss of accuracy is
+ * acceptable for all known use cases.
+ */
+ vtime_account_guest_exit();
+
if (lapic_in_kernel(vcpu)) {
s64 delta = vcpu->arch.apic->lapic_timer.advance_expire_delta;
if (delta != S64_MIN) {

Subject: [tip: x86/urgent] context_tracking: KVM: Move guest enter/exit wrappers to KVM's domain

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 1ca0016c149be35fe19a6b75fce95c25807b7159
Gitweb: https://git.kernel.org/tip/1ca0016c149be35fe19a6b75fce95c25807b7159
Author: Sean Christopherson <[email protected]>
AuthorDate: Tue, 04 May 2021 17:27:34 -07:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Wed, 05 May 2021 22:54:12 +02:00

context_tracking: KVM: Move guest enter/exit wrappers to KVM's domain

Move the guest enter/exit wrappers to kvm_host.h so that KVM can manage
its context tracking vs. vtime accounting without bleeding too many KVM
details into the context tracking code.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
include/linux/context_tracking.h | 45 +-------------------------------
include/linux/kvm_host.h | 45 +++++++++++++++++++++++++++++++-
2 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index aa58c2a..4d7fced 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -118,49 +118,4 @@ extern void context_tracking_init(void);
static inline void context_tracking_init(void) { }
#endif /* CONFIG_CONTEXT_TRACKING_FORCE */

-/* must be called with irqs disabled */
-static __always_inline void guest_enter_irqoff(void)
-{
- /*
- * This is running in ioctl context so its safe to assume that it's the
- * stime pending cputime to flush.
- */
- instrumentation_begin();
- vtime_account_guest_enter();
- instrumentation_end();
-
- /*
- * KVM does not hold any references to rcu protected data when it
- * switches CPU into a guest mode. In fact switching to a guest mode
- * is very similar to exiting to userspace from rcu point of view. In
- * addition CPU may stay in a guest mode for quite a long time (up to
- * one time slice). Lets treat guest mode as quiescent state, just like
- * we do with user-mode execution.
- */
- if (!context_tracking_guest_enter()) {
- instrumentation_begin();
- rcu_virt_note_context_switch(smp_processor_id());
- instrumentation_end();
- }
-}
-
-static __always_inline void guest_exit_irqoff(void)
-{
- context_tracking_guest_exit();
-
- instrumentation_begin();
- /* Flush the guest cputime we spent on the guest */
- vtime_account_guest_exit();
- instrumentation_end();
-}
-
-static inline void guest_exit(void)
-{
- unsigned long flags;
-
- local_irq_save(flags);
- guest_exit_irqoff();
- local_irq_restore(flags);
-}
-
#endif
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 8895b95..2f34487 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -338,6 +338,51 @@ struct kvm_vcpu {
struct kvm_dirty_ring dirty_ring;
};

+/* must be called with irqs disabled */
+static __always_inline void guest_enter_irqoff(void)
+{
+ /*
+ * This is running in ioctl context so its safe to assume that it's the
+ * stime pending cputime to flush.
+ */
+ instrumentation_begin();
+ vtime_account_guest_enter();
+ instrumentation_end();
+
+ /*
+ * KVM does not hold any references to rcu protected data when it
+ * switches CPU into a guest mode. In fact switching to a guest mode
+ * is very similar to exiting to userspace from rcu point of view. In
+ * addition CPU may stay in a guest mode for quite a long time (up to
+ * one time slice). Lets treat guest mode as quiescent state, just like
+ * we do with user-mode execution.
+ */
+ if (!context_tracking_guest_enter()) {
+ instrumentation_begin();
+ rcu_virt_note_context_switch(smp_processor_id());
+ instrumentation_end();
+ }
+}
+
+static __always_inline void guest_exit_irqoff(void)
+{
+ context_tracking_guest_exit();
+
+ instrumentation_begin();
+ /* Flush the guest cputime we spent on the guest */
+ vtime_account_guest_exit();
+ instrumentation_end();
+}
+
+static inline void guest_exit(void)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ guest_exit_irqoff();
+ local_irq_restore(flags);
+}
+
static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
{
/*

Subject: [tip: x86/urgent] context_tracking: Move guest exit context tracking to separate helpers

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 866a6dadbb027b2955a7ae00bab9705d382def12
Gitweb: https://git.kernel.org/tip/866a6dadbb027b2955a7ae00bab9705d382def12
Author: Wanpeng Li <[email protected]>
AuthorDate: Tue, 04 May 2021 17:27:28 -07:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Wed, 05 May 2021 22:54:10 +02:00

context_tracking: Move guest exit context tracking to separate helpers

Provide separate context tracking helpers for guest exit. The standalone
helpers will be called separately by KVM x86 in later patches to fix
tick-based accounting.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
include/linux/context_tracking.h | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index bceb064..b8c7313 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -131,10 +131,15 @@ static __always_inline void guest_enter_irqoff(void)
}
}

-static __always_inline void guest_exit_irqoff(void)
+static __always_inline void context_tracking_guest_exit(void)
{
if (context_tracking_enabled())
__context_tracking_exit(CONTEXT_GUEST);
+}
+
+static __always_inline void guest_exit_irqoff(void)
+{
+ context_tracking_guest_exit();

instrumentation_begin();
if (vtime_accounting_enabled_this_cpu())
@@ -159,6 +164,8 @@ static __always_inline void guest_enter_irqoff(void)
instrumentation_end();
}

+static __always_inline void context_tracking_guest_exit(void) { }
+
static __always_inline void guest_exit_irqoff(void)
{
instrumentation_begin();
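
The refactoring in the commit above can be modelled in plain userspace C.
This is only a sketch, not kernel code: every model_* name is
hypothetical, and the context state, the "enabled" switch and the flush
counter are stand-ins for context_tracking_enabled(),
__context_tracking_exit(CONTEXT_GUEST) and the vtime flush. It shows the
context tracking step pulled into its own helper while the combined
helper keeps its behavior by calling that helper first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of this is a kernel API. */
enum model_ctx_state { MODEL_CTX_KERNEL, MODEL_CTX_GUEST };

static enum model_ctx_state model_ctx = MODEL_CTX_KERNEL;
static bool model_tracking_enabled = true;
static int model_vtime_flushes;

static void model_context_tracking_guest_enter(void)
{
	if (model_tracking_enabled)
		model_ctx = MODEL_CTX_GUEST;
}

/* Standalone helper: performs only the context tracking transition. */
static void model_context_tracking_guest_exit(void)
{
	if (model_tracking_enabled)
		model_ctx = MODEL_CTX_KERNEL;
}

/* Stand-in for the vtime work that stays in the combined helper. */
static void model_vtime_flush(void)
{
	model_vtime_flushes++;
}

/* Combined helper: same observable behavior as before the split. */
static void model_guest_exit_irqoff(void)
{
	model_context_tracking_guest_exit();
	model_vtime_flush();
}

/* Exercise both helpers and return the number of vtime flushes seen. */
static int model_demo(void)
{
	model_vtime_flushes = 0;

	/* The standalone helper leaves vtime accounting untouched. */
	model_context_tracking_guest_enter();
	model_context_tracking_guest_exit();
	assert(model_ctx == MODEL_CTX_KERNEL);
	assert(model_vtime_flushes == 0);

	/* The combined helper does both steps. */
	model_context_tracking_guest_enter();
	model_guest_exit_irqoff();
	assert(model_ctx == MODEL_CTX_KERNEL);
	return model_vtime_flushes;
}
```

Calling model_demo() from a test harness exercises both paths. A caller
that wants to run other work between the two steps can call the
standalone helper directly instead of the combined one.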

Subject: [tip: x86/urgent] context_tracking: Move guest exit vtime accounting to separate helpers

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 88d8220bbf06dd8045b2ac4be1046290eaa7773a
Gitweb: https://git.kernel.org/tip/88d8220bbf06dd8045b2ac4be1046290eaa7773a
Author: Wanpeng Li <[email protected]>
AuthorDate: Tue, 04 May 2021 17:27:29 -07:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Wed, 05 May 2021 22:54:11 +02:00

context_tracking: Move guest exit vtime accounting to separate helpers

Provide separate vtime accounting functions for guest exit instead of
open coding the logic within the context tracking code. This will allow
KVM x86 to handle vtime accounting slightly differently when using
tick-based accounting.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
include/linux/context_tracking.h | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index b8c7313..4f45562 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -137,15 +137,20 @@ static __always_inline void context_tracking_guest_exit(void)
__context_tracking_exit(CONTEXT_GUEST);
}

-static __always_inline void guest_exit_irqoff(void)
+static __always_inline void vtime_account_guest_exit(void)
{
- context_tracking_guest_exit();
-
- instrumentation_begin();
if (vtime_accounting_enabled_this_cpu())
vtime_guest_exit(current);
else
current->flags &= ~PF_VCPU;
+}
+
+static __always_inline void guest_exit_irqoff(void)
+{
+ context_tracking_guest_exit();
+
+ instrumentation_begin();
+ vtime_account_guest_exit();
instrumentation_end();
}

@@ -166,12 +171,17 @@ static __always_inline void guest_enter_irqoff(void)

static __always_inline void context_tracking_guest_exit(void) { }

+static __always_inline void vtime_account_guest_exit(void)
+{
+ vtime_account_kernel(current);
+ current->flags &= ~PF_VCPU;
+}
+
static __always_inline void guest_exit_irqoff(void)
{
instrumentation_begin();
/* Flush the guest cputime we spent on the guest */
- vtime_account_kernel(current);
- current->flags &= ~PF_VCPU;
+ vtime_account_guest_exit();
instrumentation_end();
}
#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
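
Why a separate vtime_account_guest_exit() helps with tick-based
accounting can be illustrated with a small userspace model. This is a
sketch under stated assumptions, not the kernel implementation: the
model_* names and the flag value are hypothetical, and the model assumes
that the timer tick charges time to the guest only while PF_VCPU is
still set on the task. It covers the tick-based path only and omits the
vtime_guest_exit() branch.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PF_VCPU 0x1u	/* hypothetical value, stands in for PF_VCPU */

struct model_task {
	unsigned int flags;
	unsigned long guest_ticks;
	unsigned long system_ticks;
};

/* Assumed tick behavior: charge the guest only while the flag is set. */
static void model_tick(struct model_task *t)
{
	if (t->flags & MODEL_PF_VCPU)
		t->guest_ticks++;
	else
		t->system_ticks++;
}

/* Tick-based flavor of the exit helper: just clear the flag. */
static void model_vtime_account_guest_exit(struct model_task *t)
{
	t->flags &= ~MODEL_PF_VCPU;
}

/*
 * Model one guest exit with a tick pending at the time of the exit.
 * With defer == true the flag is cleared only after the pending tick
 * has been handled, so the tick is charged to the guest.
 */
static unsigned long model_exit_with_pending_tick(bool defer)
{
	struct model_task t = { .flags = MODEL_PF_VCPU };

	if (!defer)
		model_vtime_account_guest_exit(&t);

	model_tick(&t);	/* IRQs enabled: the pending tick fires here */

	if (defer)
		model_vtime_account_guest_exit(&t);

	assert(!(t.flags & MODEL_PF_VCPU));
	return t.guest_ticks;
}
```

In this model, clearing the flag before the tick is handled charges the
tick to system time, while deferring the helper charges it to the guest.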