From: Julien Thierry <[email protected]>
kvm_vcpu_kick() is not NMI safe. When the overflow handler is called from
NMI context, defer waking the vcpu to an irq_work queue.
Cc: Julien Thierry <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: James Morse <[email protected]>
Cc: Suzuki K Pouloze <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Julien Thierry <[email protected]>
Signed-off-by: Alexandru Elisei <[email protected]>
---
arch/arm64/kvm/pmu-emul.c | 25 ++++++++++++++++++++++++-
include/kvm/arm_pmu.h | 1 +
2 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index f0d0312c0a55..30268397ed06 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -433,6 +433,22 @@ void kvm_pmu_sync_hwstate(struct kvm_vcpu *vcpu)
kvm_pmu_update_state(vcpu);
}
+/**
+ * When perf interrupt is an NMI, we cannot safely notify the vcpu corresponding
+ * to the event.
+ * This is why we need a callback to do it once outside of the NMI context.
+ */
+static void kvm_pmu_perf_overflow_notify_vcpu(struct irq_work *work)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_pmu *pmu;
+
+ pmu = container_of(work, struct kvm_pmu, overflow_work);
+ vcpu = kvm_pmc_to_vcpu(&pmu->pmc[0]);
+
+ kvm_vcpu_kick(vcpu);
+}
+
/**
* When the perf event overflows, set the overflow status and inform the vcpu.
*/
@@ -465,7 +481,11 @@ static void kvm_pmu_perf_overflow(struct perf_event *perf_event,
if (kvm_pmu_overflow_status(vcpu)) {
kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu);
- kvm_vcpu_kick(vcpu);
+
+ if (!in_nmi())
+ kvm_vcpu_kick(vcpu);
+ else
+ irq_work_queue(&vcpu->arch.pmu.overflow_work);
}
cpu_pmu->pmu.start(perf_event, PERF_EF_RELOAD);
@@ -764,6 +784,9 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
return ret;
}
+ init_irq_work(&vcpu->arch.pmu.overflow_work,
+ kvm_pmu_perf_overflow_notify_vcpu);
+
vcpu->arch.pmu.created = true;
return 0;
}
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 6db030439e29..dbf4f08d42e5 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -27,6 +27,7 @@ struct kvm_pmu {
bool ready;
bool created;
bool irq_level;
+ struct irq_work overflow_work;
};
#define kvm_arm_pmu_v3_ready(v) ((v)->arch.pmu.ready)
--
2.28.0
On Wed, Aug 19, 2020 at 02:34:17PM +0100, Alexandru Elisei wrote:
> From: Julien Thierry <[email protected]>
>
> kvm_vcpu_kick() is not NMI safe. When the overflow handler is called from
> NMI context, defer waking the vcpu to an irq_work queue.
>
> Cc: Julien Thierry <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: James Morse <[email protected]>
> Cc: Suzuki K Pouloze <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Julien Thierry <[email protected]>
> Signed-off-by: Alexandru Elisei <[email protected]>
> ---
> arch/arm64/kvm/pmu-emul.c | 25 ++++++++++++++++++++++++-
> include/kvm/arm_pmu.h | 1 +
> 2 files changed, 25 insertions(+), 1 deletion(-)
I'd like an Ack from the KVM side on this one, but some minor comments
inline.
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index f0d0312c0a55..30268397ed06 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -433,6 +433,22 @@ void kvm_pmu_sync_hwstate(struct kvm_vcpu *vcpu)
> kvm_pmu_update_state(vcpu);
> }
>
> +/**
> + * When perf interrupt is an NMI, we cannot safely notify the vcpu corresponding
> + * to the event.
> + * This is why we need a callback to do it once outside of the NMI context.
> + */
> +static void kvm_pmu_perf_overflow_notify_vcpu(struct irq_work *work)
> +{
> + struct kvm_vcpu *vcpu;
> + struct kvm_pmu *pmu;
> +
> + pmu = container_of(work, struct kvm_pmu, overflow_work);
> + vcpu = kvm_pmc_to_vcpu(&pmu->pmc[0]);
Can you spell this kvm_pmc_to_vcpu(pmu->pmc); ?
> +
> + kvm_vcpu_kick(vcpu);
How do we guarantee that the vCPU is still around by the time this runs?
Sorry to ask such a horrible question, but I don't see anything associating
the workqueue with the lifetime of the vCPU.
Will
Hi Will,
On 9/21/20 2:43 PM, Will Deacon wrote:
> On Wed, Aug 19, 2020 at 02:34:17PM +0100, Alexandru Elisei wrote:
>> From: Julien Thierry <[email protected]>
>>
>> kvm_vcpu_kick() is not NMI safe. When the overflow handler is called from
>> NMI context, defer waking the vcpu to an irq_work queue.
>>
>> Cc: Julien Thierry <[email protected]>
>> Cc: Marc Zyngier <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> Cc: Mark Rutland <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Cc: James Morse <[email protected]>
>> Cc: Suzuki K Pouloze <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Signed-off-by: Julien Thierry <[email protected]>
>> Signed-off-by: Alexandru Elisei <[email protected]>
>> ---
>> arch/arm64/kvm/pmu-emul.c | 25 ++++++++++++++++++++++++-
>> include/kvm/arm_pmu.h | 1 +
>> 2 files changed, 25 insertions(+), 1 deletion(-)
> I'd like an Ack from the KVM side on this one, but some minor comments
> inline.
>
>> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
>> index f0d0312c0a55..30268397ed06 100644
>> --- a/arch/arm64/kvm/pmu-emul.c
>> +++ b/arch/arm64/kvm/pmu-emul.c
>> @@ -433,6 +433,22 @@ void kvm_pmu_sync_hwstate(struct kvm_vcpu *vcpu)
>> kvm_pmu_update_state(vcpu);
>> }
>>
>> +/**
>> + * When perf interrupt is an NMI, we cannot safely notify the vcpu corresponding
>> + * to the event.
>> + * This is why we need a callback to do it once outside of the NMI context.
>> + */
>> +static void kvm_pmu_perf_overflow_notify_vcpu(struct irq_work *work)
>> +{
>> + struct kvm_vcpu *vcpu;
>> + struct kvm_pmu *pmu;
>> +
>> + pmu = container_of(work, struct kvm_pmu, overflow_work);
>> + vcpu = kvm_pmc_to_vcpu(&pmu->pmc[0]);
> Can you spell this kvm_pmc_to_vcpu(pmu->pmc); ?
Of course, that is much better.
>
>> +
>> + kvm_vcpu_kick(vcpu);
> How do we guarantee that the vCPU is still around by the time this runs?
> Sorry to ask such a horrible question, but I don't see anything associating
> the workqueue with the lifetime of the vCPU.
That's a very nice catch, indeed the code doesn't guarantee that the VM is still
around when the work is executed. I will add an irq_work_sync() call to
kvm_pmu_vcpu_destroy() (which is called by kvm_vcpu_destroy() ->
kvm_arch_vcpu_destroy()), and to kvm_pmu_vcpu_reset(), similar to how x86 handles it.
Thanks,
Alex