2021-08-10 06:29:29

by Lai Jiangshan

[permalink] [raw]
Subject: [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT

From: Lai Jiangshan <[email protected]>

Commit c77fb5fe6f03 ("KVM: x86: Allow the guest to run with dirty debug
registers") allows the guest accessing to DRs without exiting when
KVM_DEBUGREG_WONT_EXIT and we need to ensure that they are synchronized
on entry to the guest---including DR6 that was not synced before the commit.

But the commit sets the hardware DR6 not only when KVM_DEBUGREG_WONT_EXIT,
but also when KVM_DEBUGREG_BP_ENABLED. The second case is unnecessary
and just leads to a more case which leaks stale DR6 to the host which has
to be resolved by unconditionally reseting DR6 in kvm_arch_vcpu_put().

We'd better to set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT,
so that we can fine-grain control the cases when we need to reset it
which is done in later patch.

Signed-off-by: Lai Jiangshan <[email protected]>
---
arch/x86/kvm/x86.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ad47a09ce307..d2aa49722064 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9598,7 +9598,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
set_debugreg(vcpu->arch.eff_db[1], 1);
set_debugreg(vcpu->arch.eff_db[2], 2);
set_debugreg(vcpu->arch.eff_db[3], 3);
- set_debugreg(vcpu->arch.dr6, 6);
+ /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
+ if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
+ set_debugreg(vcpu->arch.dr6, 6);
} else if (unlikely(hw_breakpoint_active())) {
set_debugreg(0, 7);
}
--
2.19.1.6.gb485710b


2021-08-10 13:26:46

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT

On 09/08/21 19:43, Lai Jiangshan wrote:
> From: Lai Jiangshan <[email protected]>
>
> Commit c77fb5fe6f03 ("KVM: x86: Allow the guest to run with dirty debug
> registers") allows the guest accessing to DRs without exiting when
> KVM_DEBUGREG_WONT_EXIT and we need to ensure that they are synchronized
> on entry to the guest---including DR6 that was not synced before the commit.
>
> But the commit sets the hardware DR6 not only when KVM_DEBUGREG_WONT_EXIT,
> but also when KVM_DEBUGREG_BP_ENABLED. The second case is unnecessary
> and just leads to a more case which leaks stale DR6 to the host which has
> to be resolved by unconditionally reseting DR6 in kvm_arch_vcpu_put().
>
> We'd better to set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT,
> so that we can fine-grain control the cases when we need to reset it
> which is done in later patch.
>
> Signed-off-by: Lai Jiangshan <[email protected]>
> ---
> arch/x86/kvm/x86.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ad47a09ce307..d2aa49722064 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9598,7 +9598,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> set_debugreg(vcpu->arch.eff_db[1], 1);
> set_debugreg(vcpu->arch.eff_db[2], 2);
> set_debugreg(vcpu->arch.eff_db[3], 3);
> - set_debugreg(vcpu->arch.dr6, 6);
> + /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
> + if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
> + set_debugreg(vcpu->arch.dr6, 6);
> } else if (unlikely(hw_breakpoint_active())) {
> set_debugreg(0, 7);
> }
>

Even better, this should be moved to vmx.c's vcpu_enter_guest. This
matches the handling in svm.c:

/*
* Run with all-zero DR6 unless needed, so that we can get the exact cause
* of a #DB.
*/
if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))
svm_set_dr6(svm, vcpu->arch.dr6);
else
svm_set_dr6(svm, DR6_ACTIVE_LOW);

That is,

KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT

Commit c77fb5fe6f03 ("KVM: x86: Allow the guest to run with dirty debug
registers") allows the guest accessing to DRs without exiting when
KVM_DEBUGREG_WONT_EXIT and we need to ensure that they are synchronized
on entry to the guest---including DR6 that was not synced before the commit.

But the commit sets the hardware DR6 not only when KVM_DEBUGREG_WONT_EXIT,
but also when KVM_DEBUGREG_BP_ENABLED. The second case is unnecessary
and just leads to a more case which leaks stale DR6 to the host which has
to be resolved by unconditionally reseting DR6 in kvm_arch_vcpu_put().

Even if KVM_DEBUGREG_WONT_EXIT, however, setting the host DR6 only matters
on VMX because SVM always uses the DR6 value from the VMCB. So move this
line to vmx.c and make it conditional on KVM_DEBUGREG_WONT_EXIT.

Signed-off-by: Paolo Bonzini <[email protected]>

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ae8e62df16dd..21a3ef3012cf 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6625,6 +6625,10 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
vmx->loaded_vmcs->host_state.cr4 = cr4;
}

+ /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
+ if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
+ set_debugreg(vcpu->arch.dr6, 6);
+
/* When single-stepping over STI and MOV SS, we must clear the
* corresponding interruptibility bits in the guest state. Otherwise
* vmentry fails as it then expects bit 14 (BS) in pending debug
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a111899ab2b4..fbc536b21585 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9597,7 +9597,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
set_debugreg(vcpu->arch.eff_db[1], 1);
set_debugreg(vcpu->arch.eff_db[2], 2);
set_debugreg(vcpu->arch.eff_db[3], 3);
- set_debugreg(vcpu->arch.dr6, 6);
} else if (unlikely(hw_breakpoint_active())) {
set_debugreg(0, 7);
}

Paolo

2021-08-10 13:29:55

by Lai Jiangshan

[permalink] [raw]
Subject: Re: [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT



On 2021/8/10 18:07, Paolo Bonzini wrote:
> On 09/08/21 19:43, Lai Jiangshan wrote:
>> From: Lai Jiangshan <[email protected]>
>>
>> Commit c77fb5fe6f03 ("KVM: x86: Allow the guest to run with dirty debug
>> registers") allows the guest accessing to DRs without exiting when
>> KVM_DEBUGREG_WONT_EXIT and we need to ensure that they are synchronized
>> on entry to the guest---including DR6 that was not synced before the commit.
>>
>> But the commit sets the hardware DR6 not only when KVM_DEBUGREG_WONT_EXIT,
>> but also when KVM_DEBUGREG_BP_ENABLED.  The second case is unnecessary
>> and just leads to a more case which leaks stale DR6 to the host which has
>> to be resolved by unconditionally reseting DR6 in kvm_arch_vcpu_put().
>>
>> We'd better to set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT,
>> so that we can fine-grain control the cases when we need to reset it
>> which is done in later patch.
>>
>> Signed-off-by: Lai Jiangshan <[email protected]>
>> ---
>>   arch/x86/kvm/x86.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index ad47a09ce307..d2aa49722064 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -9598,7 +9598,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>           set_debugreg(vcpu->arch.eff_db[1], 1);
>>           set_debugreg(vcpu->arch.eff_db[2], 2);
>>           set_debugreg(vcpu->arch.eff_db[3], 3);
>> -        set_debugreg(vcpu->arch.dr6, 6);
>> +        /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
>> +        if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
>> +            set_debugreg(vcpu->arch.dr6, 6);
>>       } else if (unlikely(hw_breakpoint_active())) {
>>           set_debugreg(0, 7);
>>       }
>>
>
> Even better, this should be moved to vmx.c's vcpu_enter_guest.  This
> matches the handling in svm.c:
>
>         /*
>          * Run with all-zero DR6 unless needed, so that we can get the exact cause
>          * of a #DB.
>          */
>         if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))
>                 svm_set_dr6(svm, vcpu->arch.dr6);
>         else
>                 svm_set_dr6(svm, DR6_ACTIVE_LOW);
>
> That is,
>
>     KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT
>     Commit c77fb5fe6f03 ("KVM: x86: Allow the guest to run with dirty debug
>     registers") allows the guest accessing to DRs without exiting when
>     KVM_DEBUGREG_WONT_EXIT and we need to ensure that they are synchronized
>     on entry to the guest---including DR6 that was not synced before the commit.
>     But the commit sets the hardware DR6 not only when KVM_DEBUGREG_WONT_EXIT,
>     but also when KVM_DEBUGREG_BP_ENABLED.  The second case is unnecessary
>     and just leads to a more case which leaks stale DR6 to the host which has
>     to be resolved by unconditionally reseting DR6 in kvm_arch_vcpu_put().
>     Even if KVM_DEBUGREG_WONT_EXIT, however, setting the host DR6 only matters
>     on VMX because SVM always uses the DR6 value from the VMCB.  So move this
>     line to vmx.c and make it conditional on KVM_DEBUGREG_WONT_EXIT.
>     Signed-off-by: Paolo Bonzini <[email protected]>
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index ae8e62df16dd..21a3ef3012cf 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6625,6 +6625,10 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>          vmx->loaded_vmcs->host_state.cr4 = cr4;
>      }
>
> +    /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
> +    if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
> +        set_debugreg(vcpu->arch.dr6, 6);


I also noticed the related code in svm.c, but I refrained myself
to add a new branch in vmx_vcpu_run(). But after I see you put
the code of resetting dr6 in vmx_sync_dirty_debug_regs(), the whole
solution is much clean and better.

And if any chance you are also concern about the additional branch,
could you add a new callback to set dr6 and call the callback from
x86.c when KVM_DEBUGREG_WONT_EXIT.

The possible implementation of the callback:
for vmx: set_debugreg(vcpu->arch.dr6, 6);
for svm: svm_set_dr6(svm, vcpu->arch.dr6);
and always do svm_set_dr6(svm, DR6_ACTIVE_LOW); at the end of the
svm_handle_exit().

Thanks
Lai

> +
>      /* When single-stepping over STI and MOV SS, we must clear the
>       * corresponding interruptibility bits in the guest state. Otherwise
>       * vmentry fails as it then expects bit 14 (BS) in pending debug
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a111899ab2b4..fbc536b21585 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9597,7 +9597,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>          set_debugreg(vcpu->arch.eff_db[1], 1);
>          set_debugreg(vcpu->arch.eff_db[2], 2);
>          set_debugreg(vcpu->arch.eff_db[3], 3);
> -        set_debugreg(vcpu->arch.dr6, 6);
>      } else if (unlikely(hw_breakpoint_active())) {
>          set_debugreg(0, 7);
>      }
>
> Paolo

2021-08-10 13:32:48

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT

On 10/08/21 12:30, Lai Jiangshan wrote:
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index ae8e62df16dd..21a3ef3012cf 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -6625,6 +6625,10 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu
>> *vcpu)
>>           vmx->loaded_vmcs->host_state.cr4 = cr4;
>>       }
>>
>> +    /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
>> +    if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
>> +        set_debugreg(vcpu->arch.dr6, 6);
>
>
> I also noticed the related code in svm.c, but I refrained myself
> to add a new branch in vmx_vcpu_run().  But after I see you put
> the code of resetting dr6 in vmx_sync_dirty_debug_regs(), the whole
> solution is much clean and better.
>
> And if any chance you are also concern about the additional branch,
> could you add a new callback to set dr6 and call the callback from
> x86.c when KVM_DEBUGREG_WONT_EXIT.

The extra branch should be well predicted, and the idea you sketched
below would cause DR6 to be marked uselessly as dirty in SVM, so I think
this is cleaner. Let's add an "unlikely" around it too.

Paolo

> The possible implementation of the callback:
> for vmx: set_debugreg(vcpu->arch.dr6, 6);
> for svm: svm_set_dr6(svm, vcpu->arch.dr6);
>          and always do svm_set_dr6(svm, DR6_ACTIVE_LOW); at the end of the
>          svm_handle_exit().
>
> Thanks
> Lai
>
>> +
>>       /* When single-stepping over STI and MOV SS, we must clear the
>>        * corresponding interruptibility bits in the guest state.
>> Otherwise
>>        * vmentry fails as it then expects bit 14 (BS) in pending debug
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index a111899ab2b4..fbc536b21585 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -9597,7 +9597,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>           set_debugreg(vcpu->arch.eff_db[1], 1);
>>           set_debugreg(vcpu->arch.eff_db[2], 2);
>>           set_debugreg(vcpu->arch.eff_db[3], 3);
>> -        set_debugreg(vcpu->arch.dr6, 6);
>>       } else if (unlikely(hw_breakpoint_active())) {
>>           set_debugreg(0, 7);
>>       }
>>
>> Paolo
>

2021-08-10 14:17:25

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT

On 10/08/21 12:46, Lai Jiangshan wrote:
> I'm OK with it. But I don't think the sketched idea would cause DR6 to
> be marked uselessly as dirty in SVM. It doesn't mark it dirty if the
> value is unchanged, and the value is always DR6_ACTIVE_LOW except when
> it just clears KVM_DEBUGREG_WONT_EXIT.

It would be marked dirty if it is not DR6_ACTIVE_LOW, because it would
be set first to DR6_ACTIVE_LOW in svm_handle_exit and then set back to
the guest value on the next entry.

Paolo

2021-08-10 16:23:08

by Lai Jiangshan

[permalink] [raw]
Subject: Re: [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT



On 2021/8/10 18:35, Paolo Bonzini wrote:
> On 10/08/21 12:30, Lai Jiangshan wrote:
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>> index ae8e62df16dd..21a3ef3012cf 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -6625,6 +6625,10 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>>           vmx->loaded_vmcs->host_state.cr4 = cr4;
>>>       }
>>>
>>> +    /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
>>> +    if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
>>> +        set_debugreg(vcpu->arch.dr6, 6);
>>
>>
>> I also noticed the related code in svm.c, but I refrained myself
>> to add a new branch in vmx_vcpu_run().  But after I see you put
>> the code of resetting dr6 in vmx_sync_dirty_debug_regs(), the whole
>> solution is much clean and better.
>>
>> And if any chance you are also concern about the additional branch,
>> could you add a new callback to set dr6 and call the callback from
>> x86.c when KVM_DEBUGREG_WONT_EXIT.
>
> The extra branch should be well predicted, and the idea you sketched below would cause DR6 to be marked uselessly as
> dirty in SVM, so I think this is cleaner.  Let's add an "unlikely" around it too.

I'm OK with it. But I don't think the sketched idea would cause DR6 to be marked uselessly as dirty in SVM. It doesn't
mark it dirty if the value is unchanged, and the value is always DR6_ACTIVE_LOW except when it just clears
KVM_DEBUGREG_WONT_EXIT.

>
> Paolo
>
>> The possible implementation of the callback:
>> for vmx: set_debugreg(vcpu->arch.dr6, 6);
>> for svm: svm_set_dr6(svm, vcpu->arch.dr6);
>>           and always do svm_set_dr6(svm, DR6_ACTIVE_LOW); at the end of the
>>           svm_handle_exit().
>>
>> Thanks
>> Lai
>>
>>> +
>>>       /* When single-stepping over STI and MOV SS, we must clear the
>>>        * corresponding interruptibility bits in the guest state. Otherwise
>>>        * vmentry fails as it then expects bit 14 (BS) in pending debug
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index a111899ab2b4..fbc536b21585 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -9597,7 +9597,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>           set_debugreg(vcpu->arch.eff_db[1], 1);
>>>           set_debugreg(vcpu->arch.eff_db[2], 2);
>>>           set_debugreg(vcpu->arch.eff_db[3], 3);
>>> -        set_debugreg(vcpu->arch.dr6, 6);
>>>       } else if (unlikely(hw_breakpoint_active())) {
>>>           set_debugreg(0, 7);
>>>       }
>>>
>>> Paolo
>>