2020-07-22 22:16:24

by Thomas Gleixner

[permalink] [raw]
Subject: [patch V5 15/15] x86/kvm: Use generic xfer to guest work function

From: Thomas Gleixner <[email protected]>

Use the generic infrastructure to check for and handle pending work before
transitioning into guest mode.

This now handles TIF_NOTIFY_RESUME as well which was ignored so
far. Handling it is important as this covers task work and task work will
be used to offload the heavy lifting of POSIX CPU timers to thread context.

Signed-off-by: Thomas Gleixner <[email protected]>
---
V5: Rename exit -> xfer
---
arch/x86/kvm/Kconfig | 1 +
arch/x86/kvm/vmx/vmx.c | 11 +++++------
arch/x86/kvm/x86.c | 15 ++++++---------
3 files changed, 12 insertions(+), 15 deletions(-)

--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -42,6 +42,7 @@ config KVM
select HAVE_KVM_MSI
select HAVE_KVM_CPU_RELAX_INTERCEPT
select HAVE_KVM_NO_POLL
+ select KVM_XFER_TO_GUEST_WORK
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select KVM_VFIO
select SRCU
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -27,6 +27,7 @@
#include <linux/slab.h>
#include <linux/tboot.h>
#include <linux/trace_events.h>
+#include <linux/entry-kvm.h>

#include <asm/apic.h>
#include <asm/asm.h>
@@ -5376,14 +5377,12 @@ static int handle_invalid_guest_state(st
}

/*
- * Note, return 1 and not 0, vcpu_run() is responsible for
- * morphing the pending signal into the proper return code.
+ * Note, return 1 and not 0, vcpu_run() will invoke
+ * xfer_to_guest_mode() which will create a proper return
+ * code.
*/
- if (signal_pending(current))
+ if (__xfer_to_guest_mode_work_pending())
return 1;
-
- if (need_resched())
- schedule();
}

return 1;
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -56,6 +56,7 @@
#include <linux/sched/stat.h>
#include <linux/sched/isolation.h>
#include <linux/mem_encrypt.h>
+#include <linux/entry-kvm.h>

#include <trace/events/kvm.h>

@@ -1585,7 +1586,7 @@ EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr);
bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu)
{
return vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu) ||
- need_resched() || signal_pending(current);
+ xfer_to_guest_mode_work_pending();
}
EXPORT_SYMBOL_GPL(kvm_vcpu_exit_request);

@@ -8676,15 +8677,11 @@ static int vcpu_run(struct kvm_vcpu *vcp
break;
}

- if (signal_pending(current)) {
- r = -EINTR;
- vcpu->run->exit_reason = KVM_EXIT_INTR;
- ++vcpu->stat.signal_exits;
- break;
- }
- if (need_resched()) {
+ if (xfer_to_guest_mode_work_pending()) {
srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
- cond_resched();
+ r = xfer_to_guest_mode(vcpu);
+ if (r)
+ return r;
vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
}
}


2020-07-24 00:20:25

by Sean Christopherson

[permalink] [raw]
Subject: Re: [patch V5 15/15] x86/kvm: Use generic xfer to guest work function

On Thu, Jul 23, 2020 at 12:00:09AM +0200, Thomas Gleixner wrote:
> From: Thomas Gleixner <[email protected]>
>
> Use the generic infrastructure to check for and handle pending work before
> transitioning into guest mode.
>
> This now handles TIF_NOTIFY_RESUME as well which was ignored so
> far. Handling it is important as this covers task work and task work will
> be used to offload the heavy lifting of POSIX CPU timers to thread context.
>
> Signed-off-by: Thomas Gleixner <[email protected]>
> ---
> V5: Rename exit -> xfer
> ---

One nit/question below (though it's really about patch 5).

Reviewed-and-tested-by: Sean Christopherson <[email protected]>

> @@ -8676,15 +8677,11 @@ static int vcpu_run(struct kvm_vcpu *vcp
> break;
> }
>
> - if (signal_pending(current)) {
> - r = -EINTR;
> - vcpu->run->exit_reason = KVM_EXIT_INTR;
> - ++vcpu->stat.signal_exits;
> - break;
> - }
> - if (need_resched()) {
> + if (xfer_to_guest_mode_work_pending()) {
> srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> - cond_resched();
> + r = xfer_to_guest_mode(vcpu);

Any reason not to call this xfer_to_guest_mode_work()? Or handle_work(),
do_work(), etc... Without the "work" part, it looks like a function that
should be invoked unconditionally. It's obvious that's not the case if
one looks at the implementation, but it's a bit confusing on the KVM side
of things.

> + if (r)
> + return r;
> vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
> }
> }
>

2020-07-24 00:47:01

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [patch V5 15/15] x86/kvm: Use generic xfer to guest work function

Sean,

Sean Christopherson <[email protected]> writes:
> On Thu, Jul 23, 2020 at 12:00:09AM +0200, Thomas Gleixner wrote:
>> + if (xfer_to_guest_mode_work_pending()) {
>> srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
>> - cond_resched();
>> + r = xfer_to_guest_mode(vcpu);
>
> Any reason not to call this xfer_to_guest_mode_work()? Or handle_work(),
> do_work(), etc... Without the "work" part, it looks like a function that
> should be invoked unconditionally. It's obvious that's not the case if
> one looks at the implementation, but it's a bit confusing on the KVM side
> of things.

The reason is probably lazyness. The original approach was to have this
as close as possible to user entry/exit but with the recent changes
vs. instrumentation and RCU this is not longer the case.

I really want to keep the notion of transitioning in the function name,
so xfer_to_guest_mode_handle_work() makes a lot of sense.

I'll change that before merging the lot into the tip tree if your
Reviewed-by still stands with that change made w/o reposting.

Thanks,

tglx

2020-07-24 00:56:19

by Sean Christopherson

[permalink] [raw]
Subject: Re: [patch V5 15/15] x86/kvm: Use generic xfer to guest work function

On Fri, Jul 24, 2020 at 02:46:26AM +0200, Thomas Gleixner wrote:
> Sean,
>
> Sean Christopherson <[email protected]> writes:
> > On Thu, Jul 23, 2020 at 12:00:09AM +0200, Thomas Gleixner wrote:
> >> + if (xfer_to_guest_mode_work_pending()) {
> >> srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> >> - cond_resched();
> >> + r = xfer_to_guest_mode(vcpu);
> >
> > Any reason not to call this xfer_to_guest_mode_work()? Or handle_work(),
> > do_work(), etc... Without the "work" part, it looks like a function that
> > should be invoked unconditionally. It's obvious that's not the case if
> > one looks at the implementation, but it's a bit confusing on the KVM side
> > of things.
>
> The reason is probably lazyness. The original approach was to have this
> as close as possible to user entry/exit but with the recent changes
> vs. instrumentation and RCU this is not longer the case.
>
> I really want to keep the notion of transitioning in the function name,
> so xfer_to_guest_mode_handle_work() makes a lot of sense.
>
> I'll change that before merging the lot into the tip tree if your
> Reviewed-by still stands with that change made w/o reposting.

Ya, works for me.

2020-07-24 14:25:28

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch V5 15/15] x86/kvm: Use generic xfer to guest work function


* Thomas Gleixner <[email protected]> wrote:

> From: Thomas Gleixner <[email protected]>
>
> Use the generic infrastructure to check for and handle pending work before
> transitioning into guest mode.
>
> This now handles TIF_NOTIFY_RESUME as well which was ignored so
> far. Handling it is important as this covers task work and task work will
> be used to offload the heavy lifting of POSIX CPU timers to thread context.
>
> Signed-off-by: Thomas Gleixner <[email protected]>
> ---
> V5: Rename exit -> xfer
> ---
> arch/x86/kvm/Kconfig | 1 +
> arch/x86/kvm/vmx/vmx.c | 11 +++++------
> arch/x86/kvm/x86.c | 15 ++++++---------
> 3 files changed, 12 insertions(+), 15 deletions(-)
>
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -42,6 +42,7 @@ config KVM
> select HAVE_KVM_MSI
> select HAVE_KVM_CPU_RELAX_INTERCEPT
> select HAVE_KVM_NO_POLL
> + select KVM_XFER_TO_GUEST_WORK
> select KVM_GENERIC_DIRTYLOG_READ_PROTECT
> select KVM_VFIO
> select SRCU
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -27,6 +27,7 @@
> #include <linux/slab.h>
> #include <linux/tboot.h>
> #include <linux/trace_events.h>
> +#include <linux/entry-kvm.h>
>
> #include <asm/apic.h>
> #include <asm/asm.h>
> @@ -5376,14 +5377,12 @@ static int handle_invalid_guest_state(st

>
> return 1;
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -56,6 +56,7 @@
> #include <linux/sched/stat.h>
> #include <linux/sched/isolation.h>
> #include <linux/mem_encrypt.h>
> +#include <linux/entry-kvm.h>
>
> #include <trace/events/kvm.h>
>
> @@ -1585,7 +1586,7 @@ EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr);
> bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu)
> {
> return vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu) ||
> - need_resched() || signal_pending(current);
> + xfer_to_guest_mode_work_pending();
> }
> EXPORT_SYMBOL_GPL(kvm_vcpu_exit_request);
>
> @@ -8676,15 +8677,11 @@ static int vcpu_run(struct kvm_vcpu *vcp
> break;
> }
>
> - if (signal_pending(current)) {
> - r = -EINTR;
> - vcpu->run->exit_reason = KVM_EXIT_INTR;
> - ++vcpu->stat.signal_exits;
> - break;
> - }
> - if (need_resched()) {
> + if (xfer_to_guest_mode_work_pending()) {
> srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> - cond_resched();
> + r = xfer_to_guest_mode(vcpu);
> + if (r)
> + return r;
> vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
> }
> }

So this chunk replaces vcpu_run()'s cond_resched() call with a call to
xfer_to_guest_mode(), which checks NEED_RESCHED & acts upon it, among
other things.

But:

> }
>
> /*
> - * Note, return 1 and not 0, vcpu_run() is responsible for
> - * morphing the pending signal into the proper return code.
> + * Note, return 1 and not 0, vcpu_run() will invoke
> + * xfer_to_guest_mode() which will create a proper return
> + * code.
> */
> - if (signal_pending(current))
> + if (__xfer_to_guest_mode_work_pending())
> return 1;
> -
> - if (need_resched())
> - schedule();
> }

AFAICS this chunk removes a conditional reschedule point from
handle_invalid_guest_state() and replaces it with
__xfer_to_guest_mode_work_pending().

But __xfer_to_guest_mode_work_pending() doesn't do the cond-resched of
the full xfer_to_guest_mode_work() function - so we essentially lose a
cond_resched() here.

Is this side effect intended, was the cond_resched() superfluous?

The logic is quite a maze, and the KVM code was missing a few of the
state checks, so maybe this is all for the better - just wanted to
mention it, in case it matters.

Thanks,

Ingo

2020-07-24 19:09:28

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [patch V5 15/15] x86/kvm: Use generic xfer to guest work function

Ingo Molnar <[email protected]> writes:
> * Thomas Gleixner <[email protected]> wrote:
>> /*
>> - * Note, return 1 and not 0, vcpu_run() is responsible for
>> - * morphing the pending signal into the proper return code.
>> + * Note, return 1 and not 0, vcpu_run() will invoke
>> + * xfer_to_guest_mode() which will create a proper return
>> + * code.
>> */
>> - if (signal_pending(current))
>> + if (__xfer_to_guest_mode_work_pending())
>> return 1;
>> -
>> - if (need_resched())
>> - schedule();
>> }
>
> AFAICS this chunk removes a conditional reschedule point from
> handle_invalid_guest_state() and replaces it with
> __xfer_to_guest_mode_work_pending().
>
> But __xfer_to_guest_mode_work_pending() doesn't do the cond-resched of
> the full xfer_to_guest_mode_work() function - so we essentially lose a
> cond_resched() here.
>
> Is this side effect intended, was the cond_resched() superfluous?

It makes the thing drop back to the outer loop for any pending work not
only for signals. That avoids having yet another thing to worry about.

Thanks,

tglx

Subject: [tip: x86/entry] x86/kvm: Use generic xfer to guest work function

The following commit has been merged into the x86/entry branch of tip:

Commit-ID: 72c3c0fe54a3f3ddea8f5ca468ddf9deaf2100b7
Gitweb: https://git.kernel.org/tip/72c3c0fe54a3f3ddea8f5ca468ddf9deaf2100b7
Author: Thomas Gleixner <[email protected]>
AuthorDate: Thu, 23 Jul 2020 00:00:09 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Fri, 24 Jul 2020 15:05:01 +02:00

x86/kvm: Use generic xfer to guest work function

Use the generic infrastructure to check for and handle pending work before
transitioning into guest mode.

This now handles TIF_NOTIFY_RESUME as well which was ignored so
far. Handling it is important as this covers task work and task work will
be used to offload the heavy lifting of POSIX CPU timers to thread context.

Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

---
arch/x86/kvm/Kconfig | 1 +
arch/x86/kvm/vmx/vmx.c | 11 +++++------
arch/x86/kvm/x86.c | 15 ++++++---------
3 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index b277a2d..fbd5bd7 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -42,6 +42,7 @@ config KVM
select HAVE_KVM_MSI
select HAVE_KVM_CPU_RELAX_INTERCEPT
select HAVE_KVM_NO_POLL
+ select KVM_XFER_TO_GUEST_WORK
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select KVM_VFIO
select SRCU
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 13745f2..9909375 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -27,6 +27,7 @@
#include <linux/slab.h>
#include <linux/tboot.h>
#include <linux/trace_events.h>
+#include <linux/entry-kvm.h>

#include <asm/apic.h>
#include <asm/asm.h>
@@ -5373,14 +5374,12 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
}

/*
- * Note, return 1 and not 0, vcpu_run() is responsible for
- * morphing the pending signal into the proper return code.
+ * Note, return 1 and not 0, vcpu_run() will invoke
+ * xfer_to_guest_mode() which will create a proper return
+ * code.
*/
- if (signal_pending(current))
+ if (__xfer_to_guest_mode_work_pending())
return 1;
-
- if (need_resched())
- schedule();
}

return 1;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 88c593f..82d4a9e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -56,6 +56,7 @@
#include <linux/sched/stat.h>
#include <linux/sched/isolation.h>
#include <linux/mem_encrypt.h>
+#include <linux/entry-kvm.h>

#include <trace/events/kvm.h>

@@ -1587,7 +1588,7 @@ EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr);
bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu)
{
return vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu) ||
- need_resched() || signal_pending(current);
+ xfer_to_guest_mode_work_pending();
}
EXPORT_SYMBOL_GPL(kvm_vcpu_exit_request);

@@ -8681,15 +8682,11 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
break;
}

- if (signal_pending(current)) {
- r = -EINTR;
- vcpu->run->exit_reason = KVM_EXIT_INTR;
- ++vcpu->stat.signal_exits;
- break;
- }
- if (need_resched()) {
+ if (xfer_to_guest_mode_work_pending()) {
srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
- cond_resched();
+ r = xfer_to_guest_mode_handle_work(vcpu);
+ if (r)
+ return r;
vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
}
}