Subject: [PATCH] entry/kvm: Make vCPU tasks exit to userspace when a livepatch is pending

A livepatch migration for a task can only happen when the task is
sleeping or it exits to userspace. This may happen infrequently for a
heavily loaded vCPU task, leading to livepatch transition failures.

Fake signals will be sent to tasks which fail to migrate via stack
checking. This will cause running vCPU tasks to exit guest mode, but
since no signal is pending they return to guest execution without
exiting to userspace. Fix this by treating a pending livepatch migration
like a pending signal, exiting to userspace with EINTR. This allows the
migration to complete, and userspace should re-execute KVM_RUN to
resume guest execution.

In my testing, systems where livepatching would time out after 60 seconds
were able to load livepatches within a couple of seconds with this
change.

Signed-off-by: Seth Forshee <[email protected]>
---
kernel/entry/kvm.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
index 9d09f489b60e..efe4b791c253 100644
--- a/kernel/entry/kvm.c
+++ b/kernel/entry/kvm.c
@@ -14,7 +14,12 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
 				task_work_run();
 		}
 
-		if (ti_work & _TIF_SIGPENDING) {
+		/*
+		 * When a livepatch migration is pending, force an exit to
+		 * userspace as though a signal is pending to allow the
+		 * migration to complete.
+		 */
+		if (ti_work & (_TIF_SIGPENDING | _TIF_PATCH_PENDING)) {
 			kvm_handle_signal_exit(vcpu);
 			return -EINTR;
 		}
--
2.32.0
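
The changelog above says userspace should re-execute KVM_RUN after the EINTR exit. A minimal sketch of what that retry could look like on the VMM side follows. It is an illustration only: vcpu_run_fn, vcpu_run_retry_eintr() and fake_kvm_run() are hypothetical names that do not appear in the patch, and the callback stands in for a real ioctl(vcpu_fd, KVM_RUN, 0) call so the example runs without /dev/kvm.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical callback type: one attempt to enter the guest.
 * In a real VMM this would wrap ioctl(vcpu_fd, KVM_RUN, 0).
 */
typedef int (*vcpu_run_fn)(void *ctx);

/*
 * Call run() until it returns something other than -1 with errno == EINTR.
 * Returns run()'s final return value. If eintr_count is non-NULL, it
 * receives the number of attempts that were interrupted.
 */
static int vcpu_run_retry_eintr(vcpu_run_fn run, void *ctx,
				unsigned int *eintr_count)
{
	unsigned int interrupted = 0;
	int ret;

	for (;;) {
		ret = run(ctx);
		if (ret == -1 && errno == EINTR) {
			/* Interrupted exit to userspace: just run again. */
			interrupted++;
			continue;
		}
		break;
	}

	if (eintr_count)
		*eintr_count = interrupted;
	return ret;
}

/*
 * Stand-in for KVM_RUN, for illustration only: fails with EINTR a
 * fixed number of times, then succeeds.
 */
struct fake_vcpu {
	unsigned int pending_interrupts;
};

static int fake_kvm_run(void *ctx)
{
	struct fake_vcpu *v = ctx;

	if (v->pending_interrupts > 0) {
		v->pending_interrupts--;
		errno = EINTR;
		return -1;
	}
	return 0;
}
```

A real VMM would typically also check its own pending requests before re-entering the guest; the loop here only models the plain retry described in the changelog.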


Subject: Re: [PATCH] entry/kvm: Make vCPU tasks exit to userspace when a livepatch is pending

On Tue, May 03, 2022 at 02:17:53PM +0000, Sean Christopherson wrote:
> On Tue, May 03, 2022, Seth Forshee wrote:
> > A livepatch migration for a task can only happen when the task is
> > sleeping or it exits to userspace. This may happen infrequently for a
> > heavily loaded vCPU task, leading to livepatch transition failures.
> >
> > Fake signals will be sent to tasks which fail to migrate via stack
> > checking. This will cause running vCPU tasks to exit guest mode, but
> > since no signal is pending they return to guest execution without
> > exiting to userspace. Fix this by treating a pending livepatch migration
> > like a pending signal, exiting to userspace with EINTR. This allows the
> > migration to complete, and userspace should re-execute KVM_RUN to
> > resume guest execution.
> >
> > In my testing, systems where livepatching would time out after 60 seconds
> > were able to load livepatches within a couple of seconds with this
> > change.
> >
> > Signed-off-by: Seth Forshee <[email protected]>
> > ---
> > kernel/entry/kvm.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
> > index 9d09f489b60e..efe4b791c253 100644
> > --- a/kernel/entry/kvm.c
> > +++ b/kernel/entry/kvm.c
> > @@ -14,7 +14,12 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
> > task_work_run();
> > }
> >
> > - if (ti_work & _TIF_SIGPENDING) {
> > + /*
> > + * When a livepatch migration is pending, force an exit to
>
> Can the changelog and comment use terminology other than migration? Maybe "transition"?
> That seems to be prevalent through the livepatch code and docs. There are already
> too many meanings for "migration" in KVM, e.g. live migration, page migration, task/vCPU
> migration, etc...

"Transition" is used a lot, but afaict it refers to the overall state of
the livepatch. "Migrate" is used a lot less, but it always seems to
refer to patching a single task, which is why I used that term. But I
can see the opportunity for confusion, so I'll reword it.

>
> > + * userspace as though a signal is pending to allow the
> > + * migration to complete.
> > + */
> > + if (ti_work & (_TIF_SIGPENDING | _TIF_PATCH_PENDING)) {
>
> _TIF_PATCH_PENDING needs to be in XFER_TO_GUEST_MODE_WORK too, otherwise there's
> no guarantee KVM will see the flag and invoke xfer_to_guest_mode_handle_work().

Yes, you are right. I was relying on the livepatch code setting
_TIF_NOTIFY_SIGNAL for vCPU tasks which were running, but it would be
better to have _TIF_PATCH_PENDING in XFER_TO_GUEST_MODE_WORK too.

Thanks,
Seth

>
> > kvm_handle_signal_exit(vcpu);
> > return -EINTR;
> > }
> > --
> > 2.32.0
> >
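
The review point above, that the flag must also be part of XFER_TO_GUEST_MODE_WORK, can be modelled in a few lines. This is a simplified sketch rather than the kernel code: the DEMO_* names and bit values are made up for illustration, and the mask is reduced to the flags discussed in this thread.

```c
#include <errno.h>

/* Illustrative bit values only; the real TIF bits are arch-specific. */
#define DEMO_TIF_SIGPENDING	(1UL << 0)
#define DEMO_TIF_NOTIFY_SIGNAL	(1UL << 1)
#define DEMO_TIF_PATCH_PENDING	(1UL << 2)

/* Simplified work mask as the posted patch leaves it. */
#define DEMO_WORK_MASK_OLD	(DEMO_TIF_SIGPENDING | DEMO_TIF_NOTIFY_SIGNAL)
/* Simplified work mask with the reviewer's suggestion applied. */
#define DEMO_WORK_MASK_NEW	(DEMO_WORK_MASK_OLD | DEMO_TIF_PATCH_PENDING)

/* Models the patched check inside the work handler. */
static int demo_handle_work(unsigned long ti_work)
{
	if (ti_work & (DEMO_TIF_SIGPENDING | DEMO_TIF_PATCH_PENDING))
		return -EINTR;
	return 0;
}

/* Models the caller: the handler only runs if a masked bit is set. */
static int demo_enter_guest(unsigned long ti_work, unsigned long work_mask)
{
	if (!(ti_work & work_mask))
		return 0;	/* no work seen, guest entry proceeds */
	return demo_handle_work(ti_work);
}
```

With only DEMO_TIF_PATCH_PENDING set, demo_enter_guest() returns 0 under the old mask and -EINTR under the new one. If DEMO_TIF_NOTIFY_SIGNAL is set as well, the old mask also yields -EINTR, which corresponds to the case the original patch was relying on.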

2022-05-03 21:36:29

by Sean Christopherson

Subject: Re: [PATCH] entry/kvm: Make vCPU tasks exit to userspace when a livepatch is pending

On Tue, May 03, 2022, Seth Forshee wrote:
> A livepatch migration for a task can only happen when the task is
> sleeping or it exits to userspace. This may happen infrequently for a
> heavily loaded vCPU task, leading to livepatch transition failures.
>
> Fake signals will be sent to tasks which fail to migrate via stack
> checking. This will cause running vCPU tasks to exit guest mode, but
> since no signal is pending they return to guest execution without
> exiting to userspace. Fix this by treating a pending livepatch migration
> like a pending signal, exiting to userspace with EINTR. This allows the
> migration to complete, and userspace should re-execute KVM_RUN to
> resume guest execution.
>
> In my testing, systems where livepatching would time out after 60 seconds
> were able to load livepatches within a couple of seconds with this
> change.
>
> Signed-off-by: Seth Forshee <[email protected]>
> ---
> kernel/entry/kvm.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
> index 9d09f489b60e..efe4b791c253 100644
> --- a/kernel/entry/kvm.c
> +++ b/kernel/entry/kvm.c
> @@ -14,7 +14,12 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
> task_work_run();
> }
>
> - if (ti_work & _TIF_SIGPENDING) {
> + /*
> + * When a livepatch migration is pending, force an exit to

Can the changelog and comment use terminology other than migration? Maybe "transition"?
That seems to be prevalent through the livepatch code and docs. There are already
too many meanings for "migration" in KVM, e.g. live migration, page migration, task/vCPU
migration, etc...

> + * userspace as though a signal is pending to allow the
> + * migration to complete.
> + */
> + if (ti_work & (_TIF_SIGPENDING | _TIF_PATCH_PENDING)) {

_TIF_PATCH_PENDING needs to be in XFER_TO_GUEST_MODE_WORK too, otherwise there's
no guarantee KVM will see the flag and invoke xfer_to_guest_mode_handle_work().

> kvm_handle_signal_exit(vcpu);
> return -EINTR;
> }
> --
> 2.32.0
>

2022-05-04 00:50:04

by Josh Poimboeuf

Subject: Re: [PATCH] entry/kvm: Make vCPU tasks exit to userspace when a livepatch is pending

On Tue, May 03, 2022 at 11:18:02AM -0500, Seth Forshee wrote:
> > Can the changelog and comment use terminology other than migration? Maybe "transition"?
> > That seems to be prevalent through the livepatch code and docs. There are already
> > too many meanings for "migration" in KVM, e.g. live migration, page migration, task/vCPU
> > migration, etc...
>
> "Transition" is used a lot, but afaict it refers to the overall state of
> the livepatch. "Migrate" is used a lot less, but it always seems to
> refer to patching a single task, which is why I used that term. But I
> can see the opportunity for confusion, so I'll reword it.

The livepatch code does seem to be guilty of using both terms
interchangeably. I agree "transition" is preferable.

--
Josh