2024-01-26 13:01:47

by Zhang, Xiong Y

Subject: [RFC PATCH 03/41] perf: Set exclude_guest onto nmi_watchdog

From: Xiong Zhang <[email protected]>

The perf event for the NMI watchdog is a per-CPU, pinned, system-wide
event. If such an event does not have the exclude_guest flag set, it is
put into the error state once a guest with a passthrough PMU starts,
which breaks the NMI watchdog completely.

This commit sets the exclude_guest flag on this perf event, so the event
is stopped while the VM is running and resumes after VM exit. As a
result, the NMI watchdog cannot detect hard lockups while the VM is
running, so NMI watchdog functionality is still partially degraded.
However, host perf events must be stopped while a VM with a passthrough
PMU is running, and there is currently no other reliable mechanism that
could replace the perf event for the NMI watchdog.

Signed-off-by: Xiong Zhang <[email protected]>
Signed-off-by: Mingwei Zhang <[email protected]>
---
kernel/watchdog_perf.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/watchdog_perf.c b/kernel/watchdog_perf.c
index 8ea00c4a24b2..c8ba656ff674 100644
--- a/kernel/watchdog_perf.c
+++ b/kernel/watchdog_perf.c
@@ -88,6 +88,7 @@ static struct perf_event_attr wd_hw_attr = {
 	.size		= sizeof(struct perf_event_attr),
 	.pinned		= 1,
 	.disabled	= 1,
+	.exclude_guest	= 1,
 };

/* Callback function for perf event subsystem */
--
2.34.1
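
The behavior described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every name below (toy_event, toy_guest_enter, and so on) is hypothetical and does not exist in the kernel, and the three states are a simplification of the real perf event state machine. It shows why an event without exclude_guest ends up stuck in the error state, while an event with exclude_guest is only paused for the duration of the guest run.

```c
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
enum toy_event_state { TOY_ACTIVE, TOY_PAUSED, TOY_ERROR };

struct toy_event {
	bool exclude_guest;		/* mirrors perf_event_attr.exclude_guest */
	enum toy_event_state state;
};

/* A guest with a passthrough PMU starts running on this CPU. */
static void toy_guest_enter(struct toy_event *ev)
{
	if (ev->state != TOY_ACTIVE)
		return;
	/* Host events that may not count in guest mode cannot keep the PMU. */
	ev->state = ev->exclude_guest ? TOY_PAUSED : TOY_ERROR;
}

/* The guest stops running and the host owns the PMU again. */
static void toy_guest_exit(struct toy_event *ev)
{
	/* Only a paused event resumes; an event in error state stays there. */
	if (ev->state == TOY_PAUSED)
		ev->state = TOY_ACTIVE;
}

/* The watchdog can only detect a hard lockup while its event is counting. */
static bool toy_watchdog_can_fire(const struct toy_event *ev)
{
	return ev->state == TOY_ACTIVE;
}
```

In this model the watchdog is blind between toy_guest_enter() and toy_guest_exit() even with exclude_guest set, which corresponds to the partial degradation the commit message acknowledges.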



2024-04-11 18:57:10

by Sean Christopherson

Subject: Re: [RFC PATCH 03/41] perf: Set exclude_guest onto nmi_watchdog

On Fri, Jan 26, 2024, Xiong Zhang wrote:
> From: Xiong Zhang <[email protected]>
>
> The perf event for the NMI watchdog is a per-CPU, pinned, system-wide
> event. If such an event does not have the exclude_guest flag set, it is
> put into the error state once a guest with a passthrough PMU starts,
> which breaks the NMI watchdog completely.
>
> This commit sets the exclude_guest flag on this perf event, so the event
> is stopped while the VM is running and resumes after VM exit. As a
> result, the NMI watchdog cannot detect hard lockups while the VM is
> running, so NMI watchdog functionality is still partially degraded.
> However, host perf events must be stopped while a VM with a passthrough
> PMU is running, and there is currently no other reliable mechanism that
> could replace the perf event for the NMI watchdog.

As mentioned in the cover letter, I think this is backwards, and mediated PMU
support should be disallowed if kernel-priority things like the watchdog are in
use.

Doubly so because this patch affects _everything_, not just systems with VMs
that have a mediated PMU.

> Signed-off-by: Xiong Zhang <[email protected]>
> Signed-off-by: Mingwei Zhang <[email protected]>
> ---
> kernel/watchdog_perf.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/kernel/watchdog_perf.c b/kernel/watchdog_perf.c
> index 8ea00c4a24b2..c8ba656ff674 100644
> --- a/kernel/watchdog_perf.c
> +++ b/kernel/watchdog_perf.c
> @@ -88,6 +88,7 @@ static struct perf_event_attr wd_hw_attr = {
>  	.size		= sizeof(struct perf_event_attr),
>  	.pinned		= 1,
>  	.disabled	= 1,
> +	.exclude_guest	= 1,
>  };
>
> /* Callback function for perf event subsystem */
> --
> 2.34.1
>
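
The alternative raised in the reply, refusing mediated PMU support while a kernel-priority user such as the watchdog holds the PMU, could look roughly like the check below. This is a hedged userspace sketch, not a proposed kernel interface: struct toy_host and toy_enable_mediated_pmu() are hypothetical names, and the choice of -EBUSY as the error code is an assumption made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical host-wide state; not a real kernel structure. */
struct toy_host {
	bool perf_watchdog_enabled;	/* perf-based NMI watchdog in use */
	bool mediated_pmu_enabled;	/* guests may take over the PMU */
};

/*
 * Refuse to hand the PMU to guests while the watchdog depends on it,
 * instead of changing the watchdog event for every system.
 */
static int toy_enable_mediated_pmu(struct toy_host *host)
{
	if (host->perf_watchdog_enabled)
		return -EBUSY;	/* assumed error code, for illustration */

	host->mediated_pmu_enabled = true;
	return 0;
}
```

With this ordering the watchdog attribute stays untouched on hosts that never run a mediated-PMU guest, which is the concern raised about the patch affecting every system.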