2021-10-19 17:56:25

by Halil Pasic

[permalink] [raw]
Subject: [PATCH 1/3] KVM: s390: clear kicked_mask before sleeping again

The idea behind kicked mask is that we should not re-kick a vcpu that
is already in the "kick" process, i.e. that was kicked and is
is about to be dispatched if certain conditions are met.

The problem with the current implementation is, that it assumes the
kicked vcpu is going to enter SIE shortly. But under certain
circumstances, the vcpu we just kicked will be deemed non-runnable and
will remain in wait state. This can happen, if the interrupt(s) this
vcpu got kicked to deal with got already cleared (because the interrupts
got delivered to another vcpu). In this case kvm_arch_vcpu_runnable()
would return false, and the vcpu would remain in kvm_vcpu_block(),
but this time with its kicked_mask bit set. So next time around we
wouldn't kick the vcpu form __airqs_kick_single_vcpu(), but would assume
that we just kicked it.

Let us make sure the kicked_mask is cleared before we give up on
re-dispatching the vcpu.

Signed-off-by: Halil Pasic <[email protected]>
Reported-by: Matthew Rosato <[email protected]>
Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
---
arch/s390/kvm/kvm-s390.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 6a6dd5e1daf6..1c97493d21e1 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -3363,6 +3363,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)

int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
{
+ clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
return kvm_s390_vcpu_has_irq(vcpu, 0);
}

--
2.25.1


2021-10-19 21:25:28

by Christian Borntraeger

[permalink] [raw]
Subject: Re: [PATCH 1/3] KVM: s390: clear kicked_mask before sleeping again



Am 19.10.21 um 19:53 schrieb Halil Pasic:
> The idea behind kicked mask is that we should not re-kick a vcpu that
> is already in the "kick" process, i.e. that was kicked and is
> is about to be dispatched if certain conditions are met.
>
> The problem with the current implementation is, that it assumes the
> kicked vcpu is going to enter SIE shortly. But under certain
> circumstances, the vcpu we just kicked will be deemed non-runnable and
> will remain in wait state. This can happen, if the interrupt(s) this
> vcpu got kicked to deal with got already cleared (because the interrupts
> got delivered to another vcpu). In this case kvm_arch_vcpu_runnable()
> would return false, and the vcpu would remain in kvm_vcpu_block(),
> but this time with its kicked_mask bit set. So next time around we
> wouldn't kick the vcpu form __airqs_kick_single_vcpu(), but would assume
> that we just kicked it.
>
> Let us make sure the kicked_mask is cleared before we give up on
> re-dispatching the vcpu.
>
> Signed-off-by: Halil Pasic <[email protected]>
> Reported-by: Matthew Rosato <[email protected]>
> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")

Reviewed-by: Christian Borntraeger <[email protected]>

> ---
> arch/s390/kvm/kvm-s390.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 6a6dd5e1daf6..1c97493d21e1 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -3363,6 +3363,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>
> int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> {
> + clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
> return kvm_s390_vcpu_has_irq(vcpu, 0);
> }
>
>

2021-10-20 05:45:43

by Claudio Imbrenda

[permalink] [raw]
Subject: Re: [PATCH 1/3] KVM: s390: clear kicked_mask before sleeping again

On Tue, 19 Oct 2021 19:53:59 +0200
Halil Pasic <[email protected]> wrote:

> The idea behind kicked mask is that we should not re-kick a vcpu that
> is already in the "kick" process, i.e. that was kicked and is
> is about to be dispatched if certain conditions are met.
>
> The problem with the current implementation is, that it assumes the
> kicked vcpu is going to enter SIE shortly. But under certain
> circumstances, the vcpu we just kicked will be deemed non-runnable and
> will remain in wait state. This can happen, if the interrupt(s) this
> vcpu got kicked to deal with got already cleared (because the interrupts
> got delivered to another vcpu). In this case kvm_arch_vcpu_runnable()
> would return false, and the vcpu would remain in kvm_vcpu_block(),
> but this time with its kicked_mask bit set. So next time around we
> wouldn't kick the vcpu form __airqs_kick_single_vcpu(), but would assume
> that we just kicked it.
>
> Let us make sure the kicked_mask is cleared before we give up on
> re-dispatching the vcpu.
>
> Signed-off-by: Halil Pasic <[email protected]>
> Reported-by: Matthew Rosato <[email protected]>
> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
> ---
> arch/s390/kvm/kvm-s390.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 6a6dd5e1daf6..1c97493d21e1 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -3363,6 +3363,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>
> int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> {
> + clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);

so, you unconditionally clear the flag, before knowing if the vCPU is
runnable?

from your description I would have expected to only clear the bit if
the vCPU is not runnable.

would things break if we were to try to kick the vCPU again after
clearing the bit, but before dispatching it?

> return kvm_s390_vcpu_has_irq(vcpu, 0);
> }
>

2021-10-20 06:07:04

by Christian Borntraeger

[permalink] [raw]
Subject: Re: [PATCH 1/3] KVM: s390: clear kicked_mask before sleeping again



Am 20.10.21 um 07:35 schrieb Claudio Imbrenda:
> On Tue, 19 Oct 2021 19:53:59 +0200
> Halil Pasic <[email protected]> wrote:
>
>> The idea behind kicked mask is that we should not re-kick a vcpu that
>> is already in the "kick" process, i.e. that was kicked and is
>> is about to be dispatched if certain conditions are met.
>>
>> The problem with the current implementation is, that it assumes the
>> kicked vcpu is going to enter SIE shortly. But under certain
>> circumstances, the vcpu we just kicked will be deemed non-runnable and
>> will remain in wait state. This can happen, if the interrupt(s) this
>> vcpu got kicked to deal with got already cleared (because the interrupts
>> got delivered to another vcpu). In this case kvm_arch_vcpu_runnable()
>> would return false, and the vcpu would remain in kvm_vcpu_block(),
>> but this time with its kicked_mask bit set. So next time around we
>> wouldn't kick the vcpu form __airqs_kick_single_vcpu(), but would assume
>> that we just kicked it.
>>
>> Let us make sure the kicked_mask is cleared before we give up on
>> re-dispatching the vcpu.
>>
>> Signed-off-by: Halil Pasic <[email protected]>
>> Reported-by: Matthew Rosato <[email protected]>
>> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
>> ---
>> arch/s390/kvm/kvm-s390.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 6a6dd5e1daf6..1c97493d21e1 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -3363,6 +3363,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>>
>> int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
>> {
>> + clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
>
> so, you unconditionally clear the flag, before knowing if the vCPU is
> runnable?
>
> from your description I would have expected to only clear the bit if
> the vCPU is not runnable.
>
> would things break if we were to try to kick the vCPU again after
> clearing the bit, but before dispatching it?

The whole logic is just an optimization to avoid unnecessary wakeups.
When the bit is set a wakup might be omitted.
I prefer to do an unneeded wakeup over not doing a wakeup so I think
over-clearing is safer.
In fact, getting rid of this micro-optimization would be a valid
alternative.
>
>> return kvm_s390_vcpu_has_irq(vcpu, 0);
>> }
>>
>

2021-10-20 06:11:09

by Claudio Imbrenda

[permalink] [raw]
Subject: Re: [PATCH 1/3] KVM: s390: clear kicked_mask before sleeping again

On Wed, 20 Oct 2021 08:03:40 +0200
Christian Borntraeger <[email protected]> wrote:

> Am 20.10.21 um 07:35 schrieb Claudio Imbrenda:
> > On Tue, 19 Oct 2021 19:53:59 +0200
> > Halil Pasic <[email protected]> wrote:
> >
> >> The idea behind kicked mask is that we should not re-kick a vcpu that
> >> is already in the "kick" process, i.e. that was kicked and is
> >> is about to be dispatched if certain conditions are met.
> >>
> >> The problem with the current implementation is, that it assumes the
> >> kicked vcpu is going to enter SIE shortly. But under certain
> >> circumstances, the vcpu we just kicked will be deemed non-runnable and
> >> will remain in wait state. This can happen, if the interrupt(s) this
> >> vcpu got kicked to deal with got already cleared (because the interrupts
> >> got delivered to another vcpu). In this case kvm_arch_vcpu_runnable()
> >> would return false, and the vcpu would remain in kvm_vcpu_block(),
> >> but this time with its kicked_mask bit set. So next time around we
> >> wouldn't kick the vcpu form __airqs_kick_single_vcpu(), but would assume
> >> that we just kicked it.
> >>
> >> Let us make sure the kicked_mask is cleared before we give up on
> >> re-dispatching the vcpu.
> >>
> >> Signed-off-by: Halil Pasic <[email protected]>
> >> Reported-by: Matthew Rosato <[email protected]>
> >> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
> >> ---
> >> arch/s390/kvm/kvm-s390.c | 1 +
> >> 1 file changed, 1 insertion(+)
> >>
> >> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> >> index 6a6dd5e1daf6..1c97493d21e1 100644
> >> --- a/arch/s390/kvm/kvm-s390.c
> >> +++ b/arch/s390/kvm/kvm-s390.c
> >> @@ -3363,6 +3363,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> >>
> >> int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> >> {
> >> + clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
> >
> > so, you unconditionally clear the flag, before knowing if the vCPU is
> > runnable?
> >
> > from your description I would have expected to only clear the bit if
> > the vCPU is not runnable.
> >
> > would things break if we were to try to kick the vCPU again after
> > clearing the bit, but before dispatching it?
>
> The whole logic is just an optimization to avoid unnecessary wakeups.
> When the bit is set a wakup might be omitted.
> I prefer to do an unneeded wakeup over not doing a wakeup so I think
> over-clearing is safer.
> In fact, getting rid of this micro-optimization would be a valid
> alternative.

my only concern was if things would break in case we kick the vCPU
again after clearing the bit; it seems nothing breaks, so I'm ok with it

> >
> >> return kvm_s390_vcpu_has_irq(vcpu, 0);
> >> }
> >>
> >

2021-10-20 08:11:39

by Michael Mueller

[permalink] [raw]
Subject: Re: [PATCH 1/3] KVM: s390: clear kicked_mask before sleeping again



On 19.10.21 19:53, Halil Pasic wrote:
> The idea behind kicked mask is that we should not re-kick a vcpu that
> is already in the "kick" process, i.e. that was kicked and is
> is about to be dispatched if certain conditions are met.
>
> The problem with the current implementation is, that it assumes the
> kicked vcpu is going to enter SIE shortly. But under certain
> circumstances, the vcpu we just kicked will be deemed non-runnable and
> will remain in wait state. This can happen, if the interrupt(s) this
> vcpu got kicked to deal with got already cleared (because the interrupts
> got delivered to another vcpu). In this case kvm_arch_vcpu_runnable()
> would return false, and the vcpu would remain in kvm_vcpu_block(),
> but this time with its kicked_mask bit set. So next time around we
> wouldn't kick the vcpu form __airqs_kick_single_vcpu(), but would assume
> that we just kicked it.
>
> Let us make sure the kicked_mask is cleared before we give up on
> re-dispatching the vcpu.
>
> Signed-off-by: Halil Pasic <[email protected]>
> Reported-by: Matthew Rosato <[email protected]>
> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")

Reviewed-by: Michael Mueller <[email protected]>

> ---
> arch/s390/kvm/kvm-s390.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 6a6dd5e1daf6..1c97493d21e1 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -3363,6 +3363,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>
> int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> {
> + clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
> return kvm_s390_vcpu_has_irq(vcpu, 0);
> }
>
>

2021-10-20 08:19:58

by Halil Pasic

[permalink] [raw]
Subject: Re: [PATCH 1/3] KVM: s390: clear kicked_mask before sleeping again

On Wed, 20 Oct 2021 08:08:16 +0200
Claudio Imbrenda <[email protected]> wrote:

> > >> + clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
> > >
> > > so, you unconditionally clear the flag, before knowing if the vCPU is
> > > runnable?

Right. I talked about this with Mimu. It would extend the section
guarded by the bit, and than may be a good thing. Maybe we should
measure that alternative as well.

> > >
> > > from your description I would have expected to only clear the bit if
> > > the vCPU is not runnable.
> > >
> > > would things break if we were to try to kick the vCPU again after
> > > clearing the bit, but before dispatching it?
> >
> > The whole logic is just an optimization to avoid unnecessary wakeups.
> > When the bit is set a wakup might be omitted.
> > I prefer to do an unneeded wakeup over not doing a wakeup so I think
> > over-clearing is safer.
> > In fact, getting rid of this micro-optimization would be a valid
> > alternative.
>
> my only concern was if things would break in case we kick the vCPU
> again after clearing the bit; it seems nothing breaks, so I'm ok with it

I'm not sure about the exact impact of over-waking.
kvm_s390_vcpu_wakeup() sets vcpu->valid_wakeup which is I believe used
for some halt poll heuristics. We unset that in
kvm_arch_vcpu_block_finish(). If we cleared only conditionally the
protection would extend for that as well. Which would be a good thing.
The statistics stuff in kvm_vcpu_wake_up() does account for already
running, so I see no correctness issues there.

Regards,
Halil




2021-10-20 10:46:37

by Claudio Imbrenda

[permalink] [raw]
Subject: Re: [PATCH 1/3] KVM: s390: clear kicked_mask before sleeping again

On Wed, 20 Oct 2021 10:14:50 +0200
Halil Pasic <[email protected]> wrote:

[...]

> running, so I see no correctness issues there.

fair enough :)


2021-10-20 10:47:31

by Claudio Imbrenda

[permalink] [raw]
Subject: Re: [PATCH 1/3] KVM: s390: clear kicked_mask before sleeping again

On Tue, 19 Oct 2021 19:53:59 +0200
Halil Pasic <[email protected]> wrote:

> The idea behind kicked mask is that we should not re-kick a vcpu that
> is already in the "kick" process, i.e. that was kicked and is
> is about to be dispatched if certain conditions are met.
>
> The problem with the current implementation is, that it assumes the
> kicked vcpu is going to enter SIE shortly. But under certain
> circumstances, the vcpu we just kicked will be deemed non-runnable and
> will remain in wait state. This can happen, if the interrupt(s) this
> vcpu got kicked to deal with got already cleared (because the interrupts
> got delivered to another vcpu). In this case kvm_arch_vcpu_runnable()
> would return false, and the vcpu would remain in kvm_vcpu_block(),
> but this time with its kicked_mask bit set. So next time around we
> wouldn't kick the vcpu form __airqs_kick_single_vcpu(), but would assume
> that we just kicked it.
>
> Let us make sure the kicked_mask is cleared before we give up on
> re-dispatching the vcpu.
>
> Signed-off-by: Halil Pasic <[email protected]>
> Reported-by: Matthew Rosato <[email protected]>
> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")

Reviewed-by: Claudio Imbrenda <[email protected]>

> ---
> arch/s390/kvm/kvm-s390.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 6a6dd5e1daf6..1c97493d21e1 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -3363,6 +3363,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>
> int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> {
> + clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
> return kvm_s390_vcpu_has_irq(vcpu, 0);
> }
>