2018-11-26 15:50:24

by Vitaly Kuznetsov

Subject: [PATCH v2 0/4] x86/kvm/hyper-v: Implement Direct Mode for synthetic timers

Changes since v1:
- avoid open-coding stimer_mark_pending() in kvm_hv_notify_acked_sint() and
kvm_hv_synic_send_eoi() [Paolo Bonzini]

Turns out Hyper-V on KVM (as of 2016) will only use synthetic timers
if direct mode is available. With direct mode we notify the guest by
asserting APIC irq instead of sending a SynIC message.
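
For illustration only (not part of the series): with the hv_stimer_config
layout added in patch 1, arming stimer0 in direct mode from the guest side
would look roughly like the sketch below. The helper name is made up; the
MSR constants are the usual hyperv-tlfs.h ones.

#include <linux/types.h>
#include <asm/hyperv-tlfs.h>
#include <asm/msr.h>

/* Sketch only: program synthetic timer 0 to fire periodically and notify
 * the guest directly on 'vector' instead of via a SynIC message. */
static void example_arm_stimer0_direct(u8 vector, u64 period_100ns)
{
	union hv_stimer_config cfg = { .as_uint64 = 0 };

	cfg.enable = 1;
	cfg.periodic = 1;
	cfg.direct_mode = 1;        /* expiry asserts an APIC irq ...       */
	cfg.apic_vector = vector;   /* ... on this vector, no SynIC message */

	wrmsrl(HV_X64_MSR_STIMER0_COUNT, period_100ns);
	wrmsrl(HV_X64_MSR_STIMER0_CONFIG, cfg.as_uint64);
}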

Qemu and kvm-unit-test patches for testing this series can be found in
v1 submission:
https://lkml.org/lkml/2018/11/13/972

Vitaly Kuznetsov (4):
x86/hyper-v: move synic/stimer control structures definitions to
hyperv-tlfs.h
x86/kvm/hyper-v: use stimer config definition from hyperv-tlfs.h
x86/kvm/hyper-v: direct mode for synthetic timers
x86/kvm/hyper-v: avoid open-coding stimer_mark_pending() in
kvm_hv_notify_acked_sint()

arch/x86/include/asm/hyperv-tlfs.h | 73 ++++++++++++++++++--
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/hyperv.c | 106 +++++++++++++++++++++--------
arch/x86/kvm/trace.h | 10 +--
arch/x86/kvm/x86.c | 1 +
drivers/hv/hv.c | 2 +-
drivers/hv/hyperv_vmbus.h | 68 ------------------
include/uapi/linux/kvm.h | 1 +
8 files changed, 154 insertions(+), 109 deletions(-)

--
2.19.1



2018-11-26 15:49:52

by Vitaly Kuznetsov

Subject: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

We implement Hyper-V SynIC and synthetic timers in KVM too so there's some
room for code sharing.
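
To illustrate (hypothetical helper, not part of the patch): the as_uint64
overlay lets MSR emulation pass the raw 64-bit value around while the rest
of the code reads named fields, which is exactly what both KVM and the
VMBus driver want from a shared definition.

#include <linux/types.h>
#include <asm/hyperv-tlfs.h>

/* Sketch only: does a guest-written stimer config value describe a timer
 * that will fire via the APIC rather than via a SynIC message? */
static bool example_stimer_is_direct(u64 msr_val)
{
	union hv_stimer_config cfg = { .as_uint64 = msr_val };

	return cfg.enable && cfg.direct_mode;
}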

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/include/asm/hyperv-tlfs.h | 69 ++++++++++++++++++++++++++++++
drivers/hv/hv.c | 2 +-
drivers/hv/hyperv_vmbus.h | 68 -----------------------------
3 files changed, 70 insertions(+), 69 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 4139f7650fe5..b032c05cd3ee 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -731,6 +731,75 @@ struct hv_enlightened_vmcs {
#define HV_STIMER_AUTOENABLE (1ULL << 3)
#define HV_STIMER_SINT(config) (__u8)(((config) >> 16) & 0x0F)

+/* Define synthetic interrupt controller flag constants. */
+#define HV_EVENT_FLAGS_COUNT (256 * 8)
+#define HV_EVENT_FLAGS_LONG_COUNT (256 / sizeof(unsigned long))
+
+/*
+ * Synthetic timer configuration.
+ */
+union hv_stimer_config {
+ u64 as_uint64;
+ struct {
+ u64 enable:1;
+ u64 periodic:1;
+ u64 lazy:1;
+ u64 auto_enable:1;
+ u64 apic_vector:8;
+ u64 direct_mode:1;
+ u64 reserved_z0:3;
+ u64 sintx:4;
+ u64 reserved_z1:44;
+ };
+};
+
+
+/* Define the synthetic interrupt controller event flags format. */
+union hv_synic_event_flags {
+ unsigned long flags[HV_EVENT_FLAGS_LONG_COUNT];
+};
+
+/* Define SynIC control register. */
+union hv_synic_scontrol {
+ u64 as_uint64;
+ struct {
+ u64 enable:1;
+ u64 reserved:63;
+ };
+};
+
+/* Define synthetic interrupt source. */
+union hv_synic_sint {
+ u64 as_uint64;
+ struct {
+ u64 vector:8;
+ u64 reserved1:8;
+ u64 masked:1;
+ u64 auto_eoi:1;
+ u64 reserved2:46;
+ };
+};
+
+/* Define the format of the SIMP register */
+union hv_synic_simp {
+ u64 as_uint64;
+ struct {
+ u64 simp_enabled:1;
+ u64 preserved:11;
+ u64 base_simp_gpa:52;
+ };
+};
+
+/* Define the format of the SIEFP register */
+union hv_synic_siefp {
+ u64 as_uint64;
+ struct {
+ u64 siefp_enabled:1;
+ u64 preserved:11;
+ u64 base_siefp_gpa:52;
+ };
+};
+
struct hv_vpset {
u64 format;
u64 valid_bank_mask;
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 332d7c34be5c..11273cd384d6 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -143,7 +143,7 @@ static int hv_ce_shutdown(struct clock_event_device *evt)

static int hv_ce_set_oneshot(struct clock_event_device *evt)
{
- union hv_timer_config timer_cfg;
+ union hv_stimer_config timer_cfg;

timer_cfg.as_uint64 = 0;
timer_cfg.enable = 1;
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 72eaba3d50fc..8df4f45f4b6d 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -44,74 +44,6 @@
*/
#define HV_UTIL_NEGO_TIMEOUT 55

-/* Define synthetic interrupt controller flag constants. */
-#define HV_EVENT_FLAGS_COUNT (256 * 8)
-#define HV_EVENT_FLAGS_LONG_COUNT (256 / sizeof(unsigned long))
-
-/*
- * Timer configuration register.
- */
-union hv_timer_config {
- u64 as_uint64;
- struct {
- u64 enable:1;
- u64 periodic:1;
- u64 lazy:1;
- u64 auto_enable:1;
- u64 apic_vector:8;
- u64 direct_mode:1;
- u64 reserved_z0:3;
- u64 sintx:4;
- u64 reserved_z1:44;
- };
-};
-
-
-/* Define the synthetic interrupt controller event flags format. */
-union hv_synic_event_flags {
- unsigned long flags[HV_EVENT_FLAGS_LONG_COUNT];
-};
-
-/* Define SynIC control register. */
-union hv_synic_scontrol {
- u64 as_uint64;
- struct {
- u64 enable:1;
- u64 reserved:63;
- };
-};
-
-/* Define synthetic interrupt source. */
-union hv_synic_sint {
- u64 as_uint64;
- struct {
- u64 vector:8;
- u64 reserved1:8;
- u64 masked:1;
- u64 auto_eoi:1;
- u64 reserved2:46;
- };
-};
-
-/* Define the format of the SIMP register */
-union hv_synic_simp {
- u64 as_uint64;
- struct {
- u64 simp_enabled:1;
- u64 preserved:11;
- u64 base_simp_gpa:52;
- };
-};
-
-/* Define the format of the SIEFP register */
-union hv_synic_siefp {
- u64 as_uint64;
- struct {
- u64 siefp_enabled:1;
- u64 preserved:11;
- u64 base_siefp_gpa:52;
- };
-};

/* Definitions for the monitored notification facility */
union hv_monitor_trigger_group {
--
2.19.1


2018-11-26 15:49:59

by Vitaly Kuznetsov

Subject: [PATCH v2 4/4] x86/kvm/hyper-v: avoid open-coding stimer_mark_pending() in kvm_hv_notify_acked_sint()

The stimers_pending optimization only helps us to avoid multiple
kvm_make_request() calls. This doesn't happen very often and these
calls are very cheap in the first place, so remove the open-coded version
of stimer_mark_pending() from kvm_hv_notify_acked_sint().

Suggested-by: Paolo Bonzini <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/hyperv.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 9533133be566..e6a2a085644a 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -206,7 +206,7 @@ static void kvm_hv_notify_acked_sint(struct kvm_vcpu *vcpu, u32 sint)
struct kvm_vcpu_hv_synic *synic = vcpu_to_synic(vcpu);
struct kvm_vcpu_hv *hv_vcpu = vcpu_to_hv_vcpu(vcpu);
struct kvm_vcpu_hv_stimer *stimer;
- int gsi, idx, stimers_pending;
+ int gsi, idx;

trace_kvm_hv_notify_acked_sint(vcpu->vcpu_id, sint);

@@ -214,19 +214,13 @@ static void kvm_hv_notify_acked_sint(struct kvm_vcpu *vcpu, u32 sint)
synic_clear_sint_msg_pending(synic, sint);

/* Try to deliver pending Hyper-V SynIC timers messages */
- stimers_pending = 0;
for (idx = 0; idx < ARRAY_SIZE(hv_vcpu->stimer); idx++) {
stimer = &hv_vcpu->stimer[idx];
if (stimer->msg_pending && stimer->config.enable &&
!stimer->config.direct_mode &&
- stimer->config.sintx == sint) {
- set_bit(stimer->index,
- hv_vcpu->stimer_pending_bitmap);
- stimers_pending++;
- }
+ stimer->config.sintx == sint)
+ stimer_mark_pending(stimer, false);
}
- if (stimers_pending)
- kvm_make_request(KVM_REQ_HV_STIMER, vcpu);

idx = srcu_read_lock(&kvm->irq_srcu);
gsi = atomic_read(&synic->sint_to_gsi[sint]);
--
2.19.1


2018-11-26 15:52:22

by Vitaly Kuznetsov

Subject: [PATCH v2 2/4] x86/kvm/hyper-v: use stimer config definition from hyperv-tlfs.h

As preparation for implementing Direct Mode for Hyper-V synthetic
timers, switch to using the stimer config definition from hyperv-tlfs.h.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/include/asm/hyperv-tlfs.h | 6 ------
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/hyperv.c | 33 +++++++++++++++---------------
3 files changed, 18 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index b032c05cd3ee..ebfed56976d2 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -725,12 +725,6 @@ struct hv_enlightened_vmcs {

#define HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL 0xFFFF

-#define HV_STIMER_ENABLE (1ULL << 0)
-#define HV_STIMER_PERIODIC (1ULL << 1)
-#define HV_STIMER_LAZY (1ULL << 2)
-#define HV_STIMER_AUTOENABLE (1ULL << 3)
-#define HV_STIMER_SINT(config) (__u8)(((config) >> 16) & 0x0F)
-
/* Define synthetic interrupt controller flag constants. */
#define HV_EVENT_FLAGS_COUNT (256 * 8)
#define HV_EVENT_FLAGS_LONG_COUNT (256 / sizeof(unsigned long))
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 55e51ff7e421..e1a40e649cdc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -497,7 +497,7 @@ struct kvm_mtrr {
struct kvm_vcpu_hv_stimer {
struct hrtimer timer;
int index;
- u64 config;
+ union hv_stimer_config config;
u64 count;
u64 exp_time;
struct hv_message msg;
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 4e80080f277a..eaec15c738df 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -201,9 +201,8 @@ static void kvm_hv_notify_acked_sint(struct kvm_vcpu *vcpu, u32 sint)
stimers_pending = 0;
for (idx = 0; idx < ARRAY_SIZE(hv_vcpu->stimer); idx++) {
stimer = &hv_vcpu->stimer[idx];
- if (stimer->msg_pending &&
- (stimer->config & HV_STIMER_ENABLE) &&
- HV_STIMER_SINT(stimer->config) == sint) {
+ if (stimer->msg_pending && stimer->config.enable &&
+ stimer->config.sintx == sint) {
set_bit(stimer->index,
hv_vcpu->stimer_pending_bitmap);
stimers_pending++;
@@ -497,7 +496,7 @@ static int stimer_start(struct kvm_vcpu_hv_stimer *stimer)
time_now = get_time_ref_counter(stimer_to_vcpu(stimer)->kvm);
ktime_now = ktime_get();

- if (stimer->config & HV_STIMER_PERIODIC) {
+ if (stimer->config.periodic) {
if (stimer->exp_time) {
if (time_now >= stimer->exp_time) {
u64 remainder;
@@ -546,13 +545,15 @@ static int stimer_start(struct kvm_vcpu_hv_stimer *stimer)
static int stimer_set_config(struct kvm_vcpu_hv_stimer *stimer, u64 config,
bool host)
{
+ union hv_stimer_config new_config = {.as_uint64 = config};
+
trace_kvm_hv_stimer_set_config(stimer_to_vcpu(stimer)->vcpu_id,
stimer->index, config, host);

stimer_cleanup(stimer);
- if ((stimer->config & HV_STIMER_ENABLE) && HV_STIMER_SINT(config) == 0)
- config &= ~HV_STIMER_ENABLE;
- stimer->config = config;
+ if (stimer->config.enable && new_config.sintx == 0)
+ new_config.enable = 0;
+ stimer->config.as_uint64 = new_config.as_uint64;
stimer_mark_pending(stimer, false);
return 0;
}
@@ -566,16 +567,16 @@ static int stimer_set_count(struct kvm_vcpu_hv_stimer *stimer, u64 count,
stimer_cleanup(stimer);
stimer->count = count;
if (stimer->count == 0)
- stimer->config &= ~HV_STIMER_ENABLE;
- else if (stimer->config & HV_STIMER_AUTOENABLE)
- stimer->config |= HV_STIMER_ENABLE;
+ stimer->config.enable = 0;
+ else if (stimer->config.auto_enable)
+ stimer->config.enable = 1;
stimer_mark_pending(stimer, false);
return 0;
}

static int stimer_get_config(struct kvm_vcpu_hv_stimer *stimer, u64 *pconfig)
{
- *pconfig = stimer->config;
+ *pconfig = stimer->config.as_uint64;
return 0;
}

@@ -636,7 +637,7 @@ static int stimer_send_msg(struct kvm_vcpu_hv_stimer *stimer)
payload->expiration_time = stimer->exp_time;
payload->delivery_time = get_time_ref_counter(vcpu->kvm);
return synic_deliver_msg(vcpu_to_synic(vcpu),
- HV_STIMER_SINT(stimer->config), msg);
+ stimer->config.sintx, msg);
}

static void stimer_expiration(struct kvm_vcpu_hv_stimer *stimer)
@@ -649,8 +650,8 @@ static void stimer_expiration(struct kvm_vcpu_hv_stimer *stimer)
stimer->index, r);
if (!r) {
stimer->msg_pending = false;
- if (!(stimer->config & HV_STIMER_PERIODIC))
- stimer->config &= ~HV_STIMER_ENABLE;
+ if (!(stimer->config.periodic))
+ stimer->config.enable = 0;
}
}

@@ -664,7 +665,7 @@ void kvm_hv_process_stimers(struct kvm_vcpu *vcpu)
for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++)
if (test_and_clear_bit(i, hv_vcpu->stimer_pending_bitmap)) {
stimer = &hv_vcpu->stimer[i];
- if (stimer->config & HV_STIMER_ENABLE) {
+ if (stimer->config.enable) {
exp_time = stimer->exp_time;

if (exp_time) {
@@ -674,7 +675,7 @@ void kvm_hv_process_stimers(struct kvm_vcpu *vcpu)
stimer_expiration(stimer);
}

- if ((stimer->config & HV_STIMER_ENABLE) &&
+ if ((stimer->config.enable) &&
stimer->count) {
if (!stimer->msg_pending)
stimer_start(stimer);
--
2.19.1


2018-11-26 15:53:02

by Vitaly Kuznetsov

Subject: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

Turns out Hyper-V on KVM (as of 2016) will only use synthetic timers
if direct mode is available. With direct mode we notify the guest by
asserting APIC irq instead of sending a SynIC message.

The implementation uses existing vec_bitmap for letting lapic code
know that we're interested in the particular IRQ's EOI request. We assume
that the same APIC irq won't be used by the guest for both direct mode
stimer and as sint source (especially with AutoEOI semantics). It is
unclear how things should be handled if that's not true.

Direct mode is also somewhat less expensive; in my testing
stimer_send_msg() takes not less than 1500 cpu cycles and
stimer_notify_direct() can usually be done in 300-400. WS2016 without
Hyper-V, however, always sticks to non-direct version.
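
For completeness (illustrative only, not part of the patch): userspace can
probe the new capability with the standard KVM_CHECK_EXTENSION ioctl, e.g.:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#ifndef KVM_CAP_HYPERV_STIMER_DIRECT
#define KVM_CAP_HYPERV_STIMER_DIRECT 166	/* value added by this patch */
#endif

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	/* KVM_CHECK_EXTENSION returns 0 if unsupported, >0 if supported */
	int has = kvm < 0 ? -1 :
		ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_HYPERV_STIMER_DIRECT);

	printf("KVM_CAP_HYPERV_STIMER_DIRECT: %d\n", has);
	return 0;
}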

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
- Changes since v1: avoid open-coding stimer_mark_pending() in
kvm_hv_synic_send_eoi() [Paolo Bonzini]
---
arch/x86/kvm/hyperv.c | 67 +++++++++++++++++++++++++++++++++++-----
arch/x86/kvm/trace.h | 10 +++---
arch/x86/kvm/x86.c | 1 +
include/uapi/linux/kvm.h | 1 +
4 files changed, 67 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index eaec15c738df..9533133be566 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -38,6 +38,9 @@

#define KVM_HV_MAX_SPARSE_VCPU_SET_BITS DIV_ROUND_UP(KVM_MAX_VCPUS, 64)

+static void stimer_mark_pending(struct kvm_vcpu_hv_stimer *stimer,
+ bool vcpu_kick);
+
static inline u64 synic_read_sint(struct kvm_vcpu_hv_synic *synic, int sint)
{
return atomic64_read(&synic->sint[sint]);
@@ -53,8 +56,21 @@ static inline int synic_get_sint_vector(u64 sint_value)
static bool synic_has_vector_connected(struct kvm_vcpu_hv_synic *synic,
int vector)
{
+ struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
+ struct kvm_vcpu_hv *hv_vcpu = vcpu_to_hv_vcpu(vcpu);
+ struct kvm_vcpu_hv_stimer *stimer;
int i;

+ for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++) {
+ stimer = &hv_vcpu->stimer[i];
+ if (stimer->config.enable && stimer->config.direct_mode &&
+ stimer->config.apic_vector == vector)
+ return true;
+ }
+
+ if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
+ return false;
+
for (i = 0; i < ARRAY_SIZE(synic->sint); i++) {
if (synic_get_sint_vector(synic_read_sint(synic, i)) == vector)
return true;
@@ -80,14 +96,14 @@ static bool synic_has_vector_auto_eoi(struct kvm_vcpu_hv_synic *synic,
static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
int vector)
{
- if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
- return;
-
if (synic_has_vector_connected(synic, vector))
__set_bit(vector, synic->vec_bitmap);
else
__clear_bit(vector, synic->vec_bitmap);

+ if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
+ return;
+
if (synic_has_vector_auto_eoi(synic, vector))
__set_bit(vector, synic->auto_eoi_bitmap);
else
@@ -202,6 +218,7 @@ static void kvm_hv_notify_acked_sint(struct kvm_vcpu *vcpu, u32 sint)
for (idx = 0; idx < ARRAY_SIZE(hv_vcpu->stimer); idx++) {
stimer = &hv_vcpu->stimer[idx];
if (stimer->msg_pending && stimer->config.enable &&
+ !stimer->config.direct_mode &&
stimer->config.sintx == sint) {
set_bit(stimer->index,
hv_vcpu->stimer_pending_bitmap);
@@ -371,7 +388,9 @@ int kvm_hv_synic_set_irq(struct kvm *kvm, u32 vpidx, u32 sint)

void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector)
{
+ struct kvm_vcpu_hv *hv_vcpu = vcpu_to_hv_vcpu(vcpu);
struct kvm_vcpu_hv_synic *synic = vcpu_to_synic(vcpu);
+ struct kvm_vcpu_hv_stimer *stimer;
int i;

trace_kvm_hv_synic_send_eoi(vcpu->vcpu_id, vector);
@@ -379,6 +398,14 @@ void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector)
for (i = 0; i < ARRAY_SIZE(synic->sint); i++)
if (synic_get_sint_vector(synic_read_sint(synic, i)) == vector)
kvm_hv_notify_acked_sint(vcpu, i);
+
+ for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++) {
+ stimer = &hv_vcpu->stimer[i];
+ if (stimer->msg_pending && stimer->config.enable &&
+ stimer->config.direct_mode &&
+ stimer->config.apic_vector == vector)
+ stimer_mark_pending(stimer, false);
+ }
}

static int kvm_hv_set_sint_gsi(struct kvm *kvm, u32 vpidx, u32 sint, int gsi)
@@ -545,15 +572,25 @@ static int stimer_start(struct kvm_vcpu_hv_stimer *stimer)
static int stimer_set_config(struct kvm_vcpu_hv_stimer *stimer, u64 config,
bool host)
{
- union hv_stimer_config new_config = {.as_uint64 = config};
+ struct kvm_vcpu *vcpu = stimer_to_vcpu(stimer);
+ struct kvm_vcpu_hv *hv_vcpu = vcpu_to_hv_vcpu(vcpu);
+ union hv_stimer_config new_config = {.as_uint64 = config},
+ old_config = {.as_uint64 = stimer->config.as_uint64};

trace_kvm_hv_stimer_set_config(stimer_to_vcpu(stimer)->vcpu_id,
stimer->index, config, host);

stimer_cleanup(stimer);
- if (stimer->config.enable && new_config.sintx == 0)
+ if (old_config.enable &&
+ !new_config.direct_mode && new_config.sintx == 0)
new_config.enable = 0;
stimer->config.as_uint64 = new_config.as_uint64;
+
+ if (old_config.direct_mode)
+ synic_update_vector(&hv_vcpu->synic, old_config.apic_vector);
+ if (new_config.direct_mode)
+ synic_update_vector(&hv_vcpu->synic, new_config.apic_vector);
+
stimer_mark_pending(stimer, false);
return 0;
}
@@ -640,14 +677,28 @@ static int stimer_send_msg(struct kvm_vcpu_hv_stimer *stimer)
stimer->config.sintx, msg);
}

+static int stimer_notify_direct(struct kvm_vcpu_hv_stimer *stimer)
+{
+ struct kvm_vcpu *vcpu = stimer_to_vcpu(stimer);
+ struct kvm_lapic_irq irq = {
+ .delivery_mode = APIC_DM_FIXED,
+ .vector = stimer->config.apic_vector
+ };
+
+ return !kvm_apic_set_irq(vcpu, &irq, NULL);
+}
+
static void stimer_expiration(struct kvm_vcpu_hv_stimer *stimer)
{
- int r;
+ int r, direct = stimer->config.direct_mode;

stimer->msg_pending = true;
- r = stimer_send_msg(stimer);
+ if (!direct)
+ r = stimer_send_msg(stimer);
+ else
+ r = stimer_notify_direct(stimer);
trace_kvm_hv_stimer_expiration(stimer_to_vcpu(stimer)->vcpu_id,
- stimer->index, r);
+ stimer->index, direct, r);
if (!r) {
stimer->msg_pending = false;
if (!(stimer->config.periodic))
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 0659465a745c..705f40ae2532 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1254,24 +1254,26 @@ TRACE_EVENT(kvm_hv_stimer_callback,
* Tracepoint for stimer_expiration.
*/
TRACE_EVENT(kvm_hv_stimer_expiration,
- TP_PROTO(int vcpu_id, int timer_index, int msg_send_result),
- TP_ARGS(vcpu_id, timer_index, msg_send_result),
+ TP_PROTO(int vcpu_id, int timer_index, int direct, int msg_send_result),
+ TP_ARGS(vcpu_id, timer_index, direct, msg_send_result),

TP_STRUCT__entry(
__field(int, vcpu_id)
__field(int, timer_index)
+ __field(int, direct)
__field(int, msg_send_result)
),

TP_fast_assign(
__entry->vcpu_id = vcpu_id;
__entry->timer_index = timer_index;
+ __entry->direct = direct;
__entry->msg_send_result = msg_send_result;
),

- TP_printk("vcpu_id %d timer %d msg send result %d",
+ TP_printk("vcpu_id %d timer %d direct %d send result %d",
__entry->vcpu_id, __entry->timer_index,
- __entry->msg_send_result)
+ __entry->direct, __entry->msg_send_result)
);

/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5cd5647120f2..b21b5ceb8d26 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2997,6 +2997,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_HYPERV_TLBFLUSH:
case KVM_CAP_HYPERV_SEND_IPI:
case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
+ case KVM_CAP_HYPERV_STIMER_DIRECT:
case KVM_CAP_PCI_SEGMENT:
case KVM_CAP_DEBUGREGS:
case KVM_CAP_X86_ROBUST_SINGLESTEP:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2b7a652c9fa4..b8da14cee8e5 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -975,6 +975,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_HYPERV_ENLIGHTENED_VMCS 163
#define KVM_CAP_EXCEPTION_PAYLOAD 164
#define KVM_CAP_ARM_VM_IPA_SIZE 165
+#define KVM_CAP_HYPERV_STIMER_DIRECT 166

#ifdef KVM_CAP_IRQ_ROUTING

--
2.19.1


2018-11-26 16:48:02

by Paolo Bonzini

Subject: Re: [PATCH v2 0/4] x86/kvm/hyper-v: Implement Direct Mode for synthetic timers

On 26/11/18 16:47, Vitaly Kuznetsov wrote:
> Changes since v1:
> - avoid open-coding stimer_mark_pending() in kvm_hv_notify_acked_sint() and
> kvm_hv_synic_send_eoi() [Paolo Bonzini]
>
> Turns out Hyper-V on KVM (as of 2016) will only use synthetic timers
> if direct mode is available. With direct mode we notify the guest by
> asserting APIC irq instead of sending a SynIC message.
>
> Qemu and kvm-unit-test patches for testing this series can be found in
> v1 submission:
> https://lkml.org/lkml/2018/11/13/972
>
> Vitaly Kuznetsov (4):
> x86/hyper-v: move synic/stimer control structures definitions to
> hyperv-tlfs.h
> x86/kvm/hyper-v: use stimer config definition from hyperv-tlfs.h
> x86/kvm/hyper-v: direct mode for synthetic timers
> x86/kvm/hyper-v: avoid open-coding stimer_mark_pending() in
> kvm_hv_notify_acked_sint()
>
> arch/x86/include/asm/hyperv-tlfs.h | 73 ++++++++++++++++++--
> arch/x86/include/asm/kvm_host.h | 2 +-
> arch/x86/kvm/hyperv.c | 106 +++++++++++++++++++++--------
> arch/x86/kvm/trace.h | 10 +--
> arch/x86/kvm/x86.c | 1 +
> drivers/hv/hv.c | 2 +-
> drivers/hv/hyperv_vmbus.h | 68 ------------------
> include/uapi/linux/kvm.h | 1 +
> 8 files changed, 154 insertions(+), 109 deletions(-)
>

Queued, thanks.

Paolo

2018-11-26 16:49:41

by Paolo Bonzini

Subject: Re: [PATCH v2 4/4] x86/kvm/hyper-v: avoid open-coding stimer_mark_pending() in kvm_hv_notify_acked_sint()

On 26/11/18 16:47, Vitaly Kuznetsov wrote:
> The stimers_pending optimization only helps us to avoid multiple
> kvm_make_request() calls. This doesn't happen very often and these
> calls are very cheap in the first place, so remove the open-coded version
> of stimer_mark_pending() from kvm_hv_notify_acked_sint().

I wouldn't say very cheap, but still relatively cheap compared to a vmexit.

Paolo

2018-11-26 16:50:51

by Paolo Bonzini

Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

On 26/11/18 16:47, Vitaly Kuznetsov wrote:
> Turns out Hyper-V on KVM (as of 2016) will only use synthetic timers
> if direct mode is available. With direct mode we notify the guest by
> asserting APIC irq instead of sending a SynIC message.
>
> The implementation uses existing vec_bitmap for letting lapic code
> know that we're interested in the particular IRQ's EOI request. We assume
> that the same APIC irq won't be used by the guest for both direct mode
> stimer and as sint source (especially with AutoEOI semantics). It is
> unclear how things should be handled if that's not true.
>
> Direct mode is also somewhat less expensive; in my testing
> stimer_send_msg() takes not less than 1500 cpu cycles and
> stimer_notify_direct() can usually be done in 300-400. WS2016 without
> Hyper-V, however, always sticks to non-direct version.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> - Changes since v1: avoid open-coding stimer_mark_pending() in
> kvm_hv_synic_send_eoi() [Paolo Bonzini]
> ---
> arch/x86/kvm/hyperv.c | 67 +++++++++++++++++++++++++++++++++++-----
> arch/x86/kvm/trace.h | 10 +++---
> arch/x86/kvm/x86.c | 1 +
> include/uapi/linux/kvm.h | 1 +
> 4 files changed, 67 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index eaec15c738df..9533133be566 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -38,6 +38,9 @@
>
> #define KVM_HV_MAX_SPARSE_VCPU_SET_BITS DIV_ROUND_UP(KVM_MAX_VCPUS, 64)
>
> +static void stimer_mark_pending(struct kvm_vcpu_hv_stimer *stimer,
> + bool vcpu_kick);
> +
> static inline u64 synic_read_sint(struct kvm_vcpu_hv_synic *synic, int sint)
> {
> return atomic64_read(&synic->sint[sint]);
> @@ -53,8 +56,21 @@ static inline int synic_get_sint_vector(u64 sint_value)
> static bool synic_has_vector_connected(struct kvm_vcpu_hv_synic *synic,
> int vector)
> {
> + struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
> + struct kvm_vcpu_hv *hv_vcpu = vcpu_to_hv_vcpu(vcpu);
> + struct kvm_vcpu_hv_stimer *stimer;
> int i;
>
> + for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++) {
> + stimer = &hv_vcpu->stimer[i];
> + if (stimer->config.enable && stimer->config.direct_mode &&
> + stimer->config.apic_vector == vector)
> + return true;
> + }
> +
> + if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
> + return false;
> +
> for (i = 0; i < ARRAY_SIZE(synic->sint); i++) {
> if (synic_get_sint_vector(synic_read_sint(synic, i)) == vector)
> return true;
> @@ -80,14 +96,14 @@ static bool synic_has_vector_auto_eoi(struct kvm_vcpu_hv_synic *synic,
> static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
> int vector)
> {
> - if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
> - return;
> -
> if (synic_has_vector_connected(synic, vector))
> __set_bit(vector, synic->vec_bitmap);
> else
> __clear_bit(vector, synic->vec_bitmap);
>
> + if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
> + return;
> +
> if (synic_has_vector_auto_eoi(synic, vector))
> __set_bit(vector, synic->auto_eoi_bitmap);
> else
> @@ -202,6 +218,7 @@ static void kvm_hv_notify_acked_sint(struct kvm_vcpu *vcpu, u32 sint)
> for (idx = 0; idx < ARRAY_SIZE(hv_vcpu->stimer); idx++) {
> stimer = &hv_vcpu->stimer[idx];
> if (stimer->msg_pending && stimer->config.enable &&
> + !stimer->config.direct_mode &&
> stimer->config.sintx == sint) {
> set_bit(stimer->index,
> hv_vcpu->stimer_pending_bitmap);
> @@ -371,7 +388,9 @@ int kvm_hv_synic_set_irq(struct kvm *kvm, u32 vpidx, u32 sint)
>
> void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector)
> {
> + struct kvm_vcpu_hv *hv_vcpu = vcpu_to_hv_vcpu(vcpu);
> struct kvm_vcpu_hv_synic *synic = vcpu_to_synic(vcpu);
> + struct kvm_vcpu_hv_stimer *stimer;
> int i;
>
> trace_kvm_hv_synic_send_eoi(vcpu->vcpu_id, vector);
> @@ -379,6 +398,14 @@ void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector)
> for (i = 0; i < ARRAY_SIZE(synic->sint); i++)
> if (synic_get_sint_vector(synic_read_sint(synic, i)) == vector)
> kvm_hv_notify_acked_sint(vcpu, i);
> +
> + for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++) {
> + stimer = &hv_vcpu->stimer[i];
> + if (stimer->msg_pending && stimer->config.enable &&
> + stimer->config.direct_mode &&
> + stimer->config.apic_vector == vector)
> + stimer_mark_pending(stimer, false);
> + }
> }
>
> static int kvm_hv_set_sint_gsi(struct kvm *kvm, u32 vpidx, u32 sint, int gsi)
> @@ -545,15 +572,25 @@ static int stimer_start(struct kvm_vcpu_hv_stimer *stimer)
> static int stimer_set_config(struct kvm_vcpu_hv_stimer *stimer, u64 config,
> bool host)
> {
> - union hv_stimer_config new_config = {.as_uint64 = config};
> + struct kvm_vcpu *vcpu = stimer_to_vcpu(stimer);
> + struct kvm_vcpu_hv *hv_vcpu = vcpu_to_hv_vcpu(vcpu);
> + union hv_stimer_config new_config = {.as_uint64 = config},
> + old_config = {.as_uint64 = stimer->config.as_uint64};
>
> trace_kvm_hv_stimer_set_config(stimer_to_vcpu(stimer)->vcpu_id,
> stimer->index, config, host);
>
> stimer_cleanup(stimer);
> - if (stimer->config.enable && new_config.sintx == 0)
> + if (old_config.enable &&
> + !new_config.direct_mode && new_config.sintx == 0)
> new_config.enable = 0;
> stimer->config.as_uint64 = new_config.as_uint64;
> +
> + if (old_config.direct_mode)
> + synic_update_vector(&hv_vcpu->synic, old_config.apic_vector);
> + if (new_config.direct_mode)
> + synic_update_vector(&hv_vcpu->synic, new_config.apic_vector);
> +
> stimer_mark_pending(stimer, false);
> return 0;
> }
> @@ -640,14 +677,28 @@ static int stimer_send_msg(struct kvm_vcpu_hv_stimer *stimer)
> stimer->config.sintx, msg);
> }
>
> +static int stimer_notify_direct(struct kvm_vcpu_hv_stimer *stimer)
> +{
> + struct kvm_vcpu *vcpu = stimer_to_vcpu(stimer);
> + struct kvm_lapic_irq irq = {
> + .delivery_mode = APIC_DM_FIXED,
> + .vector = stimer->config.apic_vector
> + };
> +
> + return !kvm_apic_set_irq(vcpu, &irq, NULL);
> +}
> +
> static void stimer_expiration(struct kvm_vcpu_hv_stimer *stimer)
> {
> - int r;
> + int r, direct = stimer->config.direct_mode;
>
> stimer->msg_pending = true;
> - r = stimer_send_msg(stimer);
> + if (!direct)
> + r = stimer_send_msg(stimer);
> + else
> + r = stimer_notify_direct(stimer);
> trace_kvm_hv_stimer_expiration(stimer_to_vcpu(stimer)->vcpu_id,
> - stimer->index, r);
> + stimer->index, direct, r);
> if (!r) {
> stimer->msg_pending = false;
> if (!(stimer->config.periodic))
> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> index 0659465a745c..705f40ae2532 100644
> --- a/arch/x86/kvm/trace.h
> +++ b/arch/x86/kvm/trace.h
> @@ -1254,24 +1254,26 @@ TRACE_EVENT(kvm_hv_stimer_callback,
> * Tracepoint for stimer_expiration.
> */
> TRACE_EVENT(kvm_hv_stimer_expiration,
> - TP_PROTO(int vcpu_id, int timer_index, int msg_send_result),
> - TP_ARGS(vcpu_id, timer_index, msg_send_result),
> + TP_PROTO(int vcpu_id, int timer_index, int direct, int msg_send_result),
> + TP_ARGS(vcpu_id, timer_index, direct, msg_send_result),
>
> TP_STRUCT__entry(
> __field(int, vcpu_id)
> __field(int, timer_index)
> + __field(int, direct)
> __field(int, msg_send_result)
> ),
>
> TP_fast_assign(
> __entry->vcpu_id = vcpu_id;
> __entry->timer_index = timer_index;
> + __entry->direct = direct;
> __entry->msg_send_result = msg_send_result;
> ),
>
> - TP_printk("vcpu_id %d timer %d msg send result %d",
> + TP_printk("vcpu_id %d timer %d direct %d send result %d",
> __entry->vcpu_id, __entry->timer_index,
> - __entry->msg_send_result)
> + __entry->direct, __entry->msg_send_result)
> );
>
> /*
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5cd5647120f2..b21b5ceb8d26 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2997,6 +2997,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> case KVM_CAP_HYPERV_TLBFLUSH:
> case KVM_CAP_HYPERV_SEND_IPI:
> case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
> + case KVM_CAP_HYPERV_STIMER_DIRECT:
> case KVM_CAP_PCI_SEGMENT:
> case KVM_CAP_DEBUGREGS:
> case KVM_CAP_X86_ROBUST_SINGLESTEP:
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 2b7a652c9fa4..b8da14cee8e5 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -975,6 +975,7 @@ struct kvm_ppc_resize_hpt {
> #define KVM_CAP_HYPERV_ENLIGHTENED_VMCS 163
> #define KVM_CAP_EXCEPTION_PAYLOAD 164
> #define KVM_CAP_ARM_VM_IPA_SIZE 165
> +#define KVM_CAP_HYPERV_STIMER_DIRECT 166

I wonder if all these capabilities shouldn't be replaced by a single
KVM_GET_HYPERV_SUPPORTED_CPUID ioctl, or something like that. If you
can do it for 4.21, before this one cap is crystallized into userspace
API, that would be great. :)

Paolo

2018-11-26 17:08:38

by Michael Kelley (LINUX)

Subject: RE: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

From: Vitaly Kuznetsov <[email protected]> Sent: Monday, November 26, 2018 7:47 AM
>
> We implement Hyper-V SynIC and synthetic timers in KVM too so there's some
> room for code sharing.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/include/asm/hyperv-tlfs.h | 69 ++++++++++++++++++++++++++++++
> drivers/hv/hv.c | 2 +-
> drivers/hv/hyperv_vmbus.h | 68 -----------------------------
> 3 files changed, 70 insertions(+), 69 deletions(-)
>

Reviewed-by: Michael Kelley <[email protected]>

2018-11-26 17:20:52

by Vitaly Kuznetsov

Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

Paolo Bonzini <[email protected]> writes:

>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 2b7a652c9fa4..b8da14cee8e5 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -975,6 +975,7 @@ struct kvm_ppc_resize_hpt {
>> #define KVM_CAP_HYPERV_ENLIGHTENED_VMCS 163
>> #define KVM_CAP_EXCEPTION_PAYLOAD 164
>> #define KVM_CAP_ARM_VM_IPA_SIZE 165
>> +#define KVM_CAP_HYPERV_STIMER_DIRECT 166
>
> I wonder if all these capabilities shouldn't be replaced by a single
> KVM_GET_HYPERV_SUPPORTED_CPUID ioctl, or something like that. If you
> can do it for 4.21, before this one cap is crystallized into userspace
> API, that would be great. :)

Oh, so the suggestion is to get all these features in CPUID format
(leafs 0x40000001-0x4000000A at this moment - as Hyper-V encodes them)
and let userspace parse them. Could work. Will take a look.

Alternatively, we can go with 'something like that' and add a
generalized KVM_GET_HYPERV_SUPPORTED_CAPS ioctl (returning something
like u64 feature, u64 parameter pair). Doing that, however, wouldn't
relieve us from adding a new KVM_CAP_HYPERV_* constant for every new
feature.
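
Roughly something like this (purely hypothetical sketch, nothing of the
sort exists today; it just mirrors the "u64 feature, u64 parameter" pairs
mentioned above):

#include <linux/types.h>

/* Hypothetical uapi layout for a KVM_GET_HYPERV_SUPPORTED_CAPS ioctl */
struct kvm_hyperv_cap_entry {
	__u64 feature;		/* a KVM_CAP_HYPERV_*-style identifier */
	__u64 parameter;	/* feature-specific data, 0 if unused */
};

struct kvm_hyperv_caps {
	__u32 nent;
	__u32 padding;
	struct kvm_hyperv_cap_entry entries[];
};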

--
Vitaly

2018-11-26 20:05:50

by Roman Kagan

Subject: Re: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

[ Sorry for having missed v1 ]

On Mon, Nov 26, 2018 at 04:47:29PM +0100, Vitaly Kuznetsov wrote:
> We implement Hyper-V SynIC and synthetic timers in KVM too so there's some
> room for code sharing.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/include/asm/hyperv-tlfs.h | 69 ++++++++++++++++++++++++++++++
> drivers/hv/hv.c | 2 +-
> drivers/hv/hyperv_vmbus.h | 68 -----------------------------
> 3 files changed, 70 insertions(+), 69 deletions(-)
>
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
> index 4139f7650fe5..b032c05cd3ee 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -731,6 +731,75 @@ struct hv_enlightened_vmcs {
> #define HV_STIMER_AUTOENABLE (1ULL << 3)
> #define HV_STIMER_SINT(config) (__u8)(((config) >> 16) & 0x0F)
>
> +/* Define synthetic interrupt controller flag constants. */
> +#define HV_EVENT_FLAGS_COUNT (256 * 8)
> +#define HV_EVENT_FLAGS_LONG_COUNT (256 / sizeof(unsigned long))
> +
> +/*
> + * Synthetic timer configuration.
> + */
> +union hv_stimer_config {
> + u64 as_uint64;
> + struct {
> + u64 enable:1;
> + u64 periodic:1;
> + u64 lazy:1;
> + u64 auto_enable:1;
> + u64 apic_vector:8;
> + u64 direct_mode:1;
> + u64 reserved_z0:3;
> + u64 sintx:4;
> + u64 reserved_z1:44;
> + };
> +};
> +
> +
> +/* Define the synthetic interrupt controller event flags format. */
> +union hv_synic_event_flags {
> + unsigned long flags[HV_EVENT_FLAGS_LONG_COUNT];
> +};
> +
> +/* Define SynIC control register. */
> +union hv_synic_scontrol {
> + u64 as_uint64;
> + struct {
> + u64 enable:1;
> + u64 reserved:63;
> + };
> +};
> +
> +/* Define synthetic interrupt source. */
> +union hv_synic_sint {
> + u64 as_uint64;
> + struct {
> + u64 vector:8;
> + u64 reserved1:8;
> + u64 masked:1;
> + u64 auto_eoi:1;
> + u64 reserved2:46;
> + };
> +};
> +
> +/* Define the format of the SIMP register */
> +union hv_synic_simp {
> + u64 as_uint64;
> + struct {
> + u64 simp_enabled:1;
> + u64 preserved:11;
> + u64 base_simp_gpa:52;
> + };
> +};
> +
> +/* Define the format of the SIEFP register */
> +union hv_synic_siefp {
> + u64 as_uint64;
> + struct {
> + u64 siefp_enabled:1;
> + u64 preserved:11;
> + u64 base_siefp_gpa:52;
> + };
> +};
> +
> struct hv_vpset {
> u64 format;
> u64 valid_bank_mask;

I personally tend to prefer masks over bitfields, so I'd rather do the
consolidation in the opposite direction: use the definitions in
hyperv-tlfs.h and replace those unions/bitfields elsewhere. (I vaguely
remember posting such a patchset a couple of years ago but I lacked the
motivation to complete it).

Roman.

2018-11-27 08:39:42

by Roman Kagan

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

On Mon, Nov 26, 2018 at 05:44:24PM +0100, Paolo Bonzini wrote:
> On 26/11/18 16:47, Vitaly Kuznetsov wrote:
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 5cd5647120f2..b21b5ceb8d26 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -2997,6 +2997,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> > case KVM_CAP_HYPERV_TLBFLUSH:
> > case KVM_CAP_HYPERV_SEND_IPI:
> > case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
> > + case KVM_CAP_HYPERV_STIMER_DIRECT:
> > case KVM_CAP_PCI_SEGMENT:
> > case KVM_CAP_DEBUGREGS:
> > case KVM_CAP_X86_ROBUST_SINGLESTEP:
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index 2b7a652c9fa4..b8da14cee8e5 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -975,6 +975,7 @@ struct kvm_ppc_resize_hpt {
> > #define KVM_CAP_HYPERV_ENLIGHTENED_VMCS 163
> > #define KVM_CAP_EXCEPTION_PAYLOAD 164
> > #define KVM_CAP_ARM_VM_IPA_SIZE 165
> > +#define KVM_CAP_HYPERV_STIMER_DIRECT 166
>
> I wonder if all these capabilities shouldn't be replaced by a single
> KVM_GET_HYPERV_SUPPORTED_CPUID ioctl, or something like that.

Hmm, why? Are we running short of cap numbers?

Capabilities are a well-established and unambiguous negotiation
mechanism, why invent another one? Besides, not all features map
conveniently onto cpuid bits, e.g. currently we have two versions of
SynIC support, which differ in the way the userspace deals with it, but
not in the cpuid bits we expose in the guest. IMO such an ioctl would
bring more complexity rather than less.

Roman.

2018-11-27 08:50:50

by Roman Kagan

Subject: Re: [PATCH v2 4/4] x86/kvm/hyper-v: avoid open-coding stimer_mark_pending() in kvm_hv_notify_acked_sint()

On Mon, Nov 26, 2018 at 04:47:32PM +0100, Vitaly Kuznetsov wrote:
> The stimers_pending optimization only helps us to avoid multiple
> kvm_make_request() calls. This doesn't happen very often and these
> calls are very cheap in the first place, so remove the open-coded version
> of stimer_mark_pending() from kvm_hv_notify_acked_sint().

Frankly speaking, I've yet to see a guest that configures more than one
SynIC timer. So it was indeed a bit of overengineering in the first
place.

> Suggested-by: Paolo Bonzini <[email protected]>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/kvm/hyperv.c | 12 +++---------
> 1 file changed, 3 insertions(+), 9 deletions(-)

Reviewed-by: Roman Kagan <[email protected]>

2018-11-27 11:00:54

by Roman Kagan

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

On Mon, Nov 26, 2018 at 04:47:31PM +0100, Vitaly Kuznetsov wrote:
> Turns out Hyper-V on KVM (as of 2016) will only use synthetic timers
> if direct mode is available. With direct mode we notify the guest by
> asserting APIC irq instead of sending a SynIC message.
>
> The implementation uses existing vec_bitmap for letting lapic code
> know that we're interested in the particular IRQ's EOI request. We assume
> that the same APIC irq won't be used by the guest for both direct mode
> stimer and as sint source (especially with AutoEOI semantics). It is
> unclear how things should be handled if that's not true.
>
> Direct mode is also somewhat less expensive; in my testing
> stimer_send_msg() takes not less than 1500 cpu cycles and
> stimer_notify_direct() can usually be done in 300-400. WS2016 without
> Hyper-V, however, always sticks to non-direct version.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> - Changes since v1: avoid open-coding stimer_mark_pending() in
> kvm_hv_synic_send_eoi() [Paolo Bonzini]
> ---
> arch/x86/kvm/hyperv.c | 67 +++++++++++++++++++++++++++++++++++-----
> arch/x86/kvm/trace.h | 10 +++---
> arch/x86/kvm/x86.c | 1 +
> include/uapi/linux/kvm.h | 1 +
> 4 files changed, 67 insertions(+), 12 deletions(-)

Reviewed-by: Roman Kagan <[email protected]>

2018-11-27 14:28:36

by Paolo Bonzini

Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

On 27/11/18 09:37, Roman Kagan wrote:
> On Mon, Nov 26, 2018 at 05:44:24PM +0100, Paolo Bonzini wrote:
>> On 26/11/18 16:47, Vitaly Kuznetsov wrote:
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 5cd5647120f2..b21b5ceb8d26 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -2997,6 +2997,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>> case KVM_CAP_HYPERV_TLBFLUSH:
>>> case KVM_CAP_HYPERV_SEND_IPI:
>>> case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
>>> + case KVM_CAP_HYPERV_STIMER_DIRECT:
>>> case KVM_CAP_PCI_SEGMENT:
>>> case KVM_CAP_DEBUGREGS:
>>> case KVM_CAP_X86_ROBUST_SINGLESTEP:
>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>> index 2b7a652c9fa4..b8da14cee8e5 100644
>>> --- a/include/uapi/linux/kvm.h
>>> +++ b/include/uapi/linux/kvm.h
>>> @@ -975,6 +975,7 @@ struct kvm_ppc_resize_hpt {
>>> #define KVM_CAP_HYPERV_ENLIGHTENED_VMCS 163
>>> #define KVM_CAP_EXCEPTION_PAYLOAD 164
>>> #define KVM_CAP_ARM_VM_IPA_SIZE 165
>>> +#define KVM_CAP_HYPERV_STIMER_DIRECT 166
>>
>> I wonder if all these capabilities shouldn't be replaced by a single
>> KVM_GET_HYPERV_SUPPORTED_CPUID ioctl, or something like that.
>
> Hmm, why? Are we running short of cap numbers?
>
> Capabilities are a well-established and unambiguous negotiation
> mechanism, why invent another one? Besides, not all features map
> conveniently onto cpuid bits, e.g. currently we have two versions of
> SynIC support, which differ in the way the userspace deals with it, but
> not in the cpuid bits we expose in the guest. IMO such an ioctl would
> bring more complexity rather than less.

Yes, in that case (for bugfixes basically) you'd have to use
capabilities. But if the feature is completely hidden from userspace
except for the CPUID bit, it seems like an ioctl would be more consistent
with the rest of the KVM API.

Paolo

2018-11-27 15:56:00

by Michael Kelley (LINUX)

Subject: RE: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

From: Vitaly Kuznetsov <[email protected]> Tuesday, November 27, 2018 5:11 AM

> > I personally tend to prefer masks over bitfields, so I'd rather do the
> > consolidation in the opposite direction: use the definitions in
> > hyperv-tlfs.h and replace those unions/bitfields elsewhere. (I vaguely
> > remember posting such a patchset a couple of years ago but I lacked the
> > motivation to complete it).
>
> Are there any known advantages of using masks over bitfields or the
> resulting binary code is the same? Assuming there are no I'd personally
> prefer bitfields - not sure why but to me e.g.
> (stimer->config.enabled && !stimer->config.direct_mode)
> looks nicer than
> (stimer->config & HV_STIMER_ENABLED && !(stimer->config & HV_STIMER_DIRECT))
>
> + there's no need to do shifts, e.g.
>
> vector = stimer->config.apic_vector
>
> looks cleaner that
>
> vector = (stimer->config & HV_STIMER_APICVECTOR_MASK) >>
> HV_STIMER_APICVECTOR_SHIFT
>
> ... and we already use a few in hyperv-tlfs.h. I'm, however, flexible :-)
>
> K. Y., Michael, Haiyang, Stephen - would you prefer to get rid of all
> bitfields from hyperv-tlfs.h? If so I can work on a patchset. As to this
> series, my understanding is that Paolo already queued it so in any case
> this is going to be a separate effort (unless there are strong
> objections of course).
>

I prefer to keep the bit fields. I concur that it makes the code more
readable. Even if there is a modest performance benefit, at least
within a Linux guest most of the manipulation of the fields is during
initialization when performance doesn't matter. But I can't speak to
KVM's implementation of the hypervisor side.

Michael

2018-11-27 16:36:33

by Vitaly Kuznetsov

Subject: RE: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

Out of pure curiosity I decided to check what 'gcc -O3' produces when we
use bitfields and masks. As of 'gcc version 8.2.1 20181105 (Red Hat 8.2.1-5) (GCC)'

1) bitfields:

struct abc {
int enabled:1;
int _pad:7;
int vec:8;
};

int is_good(struct abc *s) {
if (s->enabled)
return s->vec;
else
return 0;
}

results in

is_good:
xorl %eax, %eax
testb $1, (%rdi)
je .L1
movsbl 1(%rdi), %eax
.L1:
ret

2) masks

#include <stdint.h>

#define S_ENABLED 1
#define S_VEC_MASK 0xff00
#define S_VEC_SHIFT 8

int is_good(uint16_t *s) {
if (*s & S_ENABLED)
return (*s & S_VEC_MASK) >> S_VEC_SHIFT;
else
return 0;
}

results in

is_good:
movzwl (%rdi), %edx
movzbl %dh, %eax
andl $1, %edx
movl $0, %edx
cmove %edx, %eax
ret

so the bitfields version looks somewhat more efficient. I'm not sure if my
example is too synthetic though.

--
Vitaly

2018-11-27 19:21:15

by Vitaly Kuznetsov

Subject: Re: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

Roman Kagan <[email protected]> writes:

> [ Sorry for having missed v1 ]
>
> On Mon, Nov 26, 2018 at 04:47:29PM +0100, Vitaly Kuznetsov wrote:
>> We implement Hyper-V SynIC and synthetic timers in KVM too so there's some
>> room for code sharing.
>>
>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
>> ---
>> arch/x86/include/asm/hyperv-tlfs.h | 69 ++++++++++++++++++++++++++++++
>> drivers/hv/hv.c | 2 +-
>> drivers/hv/hyperv_vmbus.h | 68 -----------------------------
>> 3 files changed, 70 insertions(+), 69 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
>> index 4139f7650fe5..b032c05cd3ee 100644
>> --- a/arch/x86/include/asm/hyperv-tlfs.h
>> +++ b/arch/x86/include/asm/hyperv-tlfs.h
>> @@ -731,6 +731,75 @@ struct hv_enlightened_vmcs {
>> #define HV_STIMER_AUTOENABLE (1ULL << 3)
>> #define HV_STIMER_SINT(config) (__u8)(((config) >> 16) & 0x0F)
>>
>> +/* Define synthetic interrupt controller flag constants. */
>> +#define HV_EVENT_FLAGS_COUNT (256 * 8)
>> +#define HV_EVENT_FLAGS_LONG_COUNT (256 / sizeof(unsigned long))
>> +
>> +/*
>> + * Synthetic timer configuration.
>> + */
>> +union hv_stimer_config {
>> + u64 as_uint64;
>> + struct {
>> + u64 enable:1;
>> + u64 periodic:1;
>> + u64 lazy:1;
>> + u64 auto_enable:1;
>> + u64 apic_vector:8;
>> + u64 direct_mode:1;
>> + u64 reserved_z0:3;
>> + u64 sintx:4;
>> + u64 reserved_z1:44;
>> + };
>> +};
>> +
>> +
>> +/* Define the synthetic interrupt controller event flags format. */
>> +union hv_synic_event_flags {
>> + unsigned long flags[HV_EVENT_FLAGS_LONG_COUNT];
>> +};
>> +
>> +/* Define SynIC control register. */
>> +union hv_synic_scontrol {
>> + u64 as_uint64;
>> + struct {
>> + u64 enable:1;
>> + u64 reserved:63;
>> + };
>> +};
>> +
>> +/* Define synthetic interrupt source. */
>> +union hv_synic_sint {
>> + u64 as_uint64;
>> + struct {
>> + u64 vector:8;
>> + u64 reserved1:8;
>> + u64 masked:1;
>> + u64 auto_eoi:1;
>> + u64 reserved2:46;
>> + };
>> +};
>> +
>> +/* Define the format of the SIMP register */
>> +union hv_synic_simp {
>> + u64 as_uint64;
>> + struct {
>> + u64 simp_enabled:1;
>> + u64 preserved:11;
>> + u64 base_simp_gpa:52;
>> + };
>> +};
>> +
>> +/* Define the format of the SIEFP register */
>> +union hv_synic_siefp {
>> + u64 as_uint64;
>> + struct {
>> + u64 siefp_enabled:1;
>> + u64 preserved:11;
>> + u64 base_siefp_gpa:52;
>> + };
>> +};
>> +
>> struct hv_vpset {
>> u64 format;
>> u64 valid_bank_mask;
>
> I personally tend to prefer masks over bitfields, so I'd rather do the
> consolidation in the opposite direction: use the definitions in
> hyperv-tlfs.h and replace those unions/bitfields elsewhere. (I vaguely
> remember posting such a patchset a couple of years ago but I lacked the
> motivation to complete it).

Are there any known advantages of using masks over bitfields, or is the
resulting binary code the same? Assuming there are none, I'd personally
prefer bitfields - not sure why but to me e.g.
(stimer->config.enabled && !stimer->config.direct_mode)
looks nicer than
(stimer->config & HV_STIMER_ENABLED && !(stimer->config & HV_STIMER_DIRECT))

+ there's no need to do shifts, e.g.

vector = stimer->config.apic_vector

looks cleaner that

vector = (stimer->config & HV_STIMER_APICVECTOR_MASK) >> HV_STIMER_APICVECTOR_SHIFT

... and we already use a few in hyperv-tlfs.h. I'm, however, flexible :-)

K. Y., Michael, Haiyang, Stephen - would you prefer to get rid of all
bitfields from hyperv-tlfs.h? If so I can work on a patchset. As to this
series, my understanding is that Paolo already queued it so in any case
this is going to be a separate effort (unless there are strong
objections of course).

Thanks!

--
Vitaly

2018-11-27 19:30:09

by Roman Kagan

Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

On Tue, Nov 27, 2018 at 02:54:21PM +0100, Paolo Bonzini wrote:
> On 27/11/18 09:37, Roman Kagan wrote:
> > On Mon, Nov 26, 2018 at 05:44:24PM +0100, Paolo Bonzini wrote:
> >> On 26/11/18 16:47, Vitaly Kuznetsov wrote:
> >>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >>> index 5cd5647120f2..b21b5ceb8d26 100644
> >>> --- a/arch/x86/kvm/x86.c
> >>> +++ b/arch/x86/kvm/x86.c
> >>> @@ -2997,6 +2997,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >>> case KVM_CAP_HYPERV_TLBFLUSH:
> >>> case KVM_CAP_HYPERV_SEND_IPI:
> >>> case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
> >>> + case KVM_CAP_HYPERV_STIMER_DIRECT:
> >>> case KVM_CAP_PCI_SEGMENT:
> >>> case KVM_CAP_DEBUGREGS:
> >>> case KVM_CAP_X86_ROBUST_SINGLESTEP:
> >>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> >>> index 2b7a652c9fa4..b8da14cee8e5 100644
> >>> --- a/include/uapi/linux/kvm.h
> >>> +++ b/include/uapi/linux/kvm.h
> >>> @@ -975,6 +975,7 @@ struct kvm_ppc_resize_hpt {
> >>> #define KVM_CAP_HYPERV_ENLIGHTENED_VMCS 163
> >>> #define KVM_CAP_EXCEPTION_PAYLOAD 164
> >>> #define KVM_CAP_ARM_VM_IPA_SIZE 165
> >>> +#define KVM_CAP_HYPERV_STIMER_DIRECT 166
> >>
> >> I wonder if all these capabilities shouldn't be replaced by a single
> >> KVM_GET_HYPERV_SUPPORTED_CPUID ioctl, or something like that.
> >
> > Hmm, why? Are we running short of cap numbers?
> >
> > Capabilities are a well-established and unambiguous negotiation
> > mechanism, why invent another one? Besides, not all features map
> > conveniently onto cpuid bits, e.g. currently we have two versions of
> > SynIC support, which differ in the way the userspace deals with it, but
> > not in the cpuid bits we expose in the guest. IMO such an ioctl would
> > bring more complexity rather than less.
>
> Yes, in that case (for bugfixes basically) you'd have to use
> capabilities. But if the feature is completely hidden from userspace
> except for the CPUID bit, it seems like an ioctl would be more consistent
> with the rest of the KVM API.

So there will be some features that are more equal than others?

I think that it's the current scheme which is more consistent: there are
features that are implemented in KVM, and there are caps for negotiating
them with userspace, and then -- separately -- there's a mix of bits to
show to the guest in response to CPUID, which only the userspace has to
bother with.

Thanks,
Roman.

2018-11-27 19:30:49

by Roman Kagan

Subject: Re: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

On Tue, Nov 27, 2018 at 02:10:49PM +0100, Vitaly Kuznetsov wrote:
> Roman Kagan <[email protected]> writes:
> > On Mon, Nov 26, 2018 at 04:47:29PM +0100, Vitaly Kuznetsov wrote:
> > I personally tend to prefer masks over bitfields, so I'd rather do the
> > consolidation in the opposite direction: use the definitions in
> > hyperv-tlfs.h and replace those unions/bitfields elsewhere. (I vaguely
> > remember posting such a patchset a couple of years ago but I lacked the
> > motivation to complete it).
>
> Are there any known advantages of using masks over bitfields, or is the
> resulting binary code the same?

Strictly speaking bitwise ops are portable while bitfields are not, but
I guess this is not much of an issue with gcc which is dependable to
produce the right thing.

I came to dislike the bitfields for the false feeling of atomicity of
assignment while most of the time they are read-modify-write operations.

And no, I don't feel strong about it, so if nobody backs me on this I
give up :)

Roman.

2018-11-28 01:50:25

by Nadav Amit

Subject: Re: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

> On Nov 27, 2018, at 10:48 AM, Roman Kagan <[email protected]> wrote:
>
> On Tue, Nov 27, 2018 at 02:10:49PM +0100, Vitaly Kuznetsov wrote:
>> Roman Kagan <[email protected]> writes:
>>> On Mon, Nov 26, 2018 at 04:47:29PM +0100, Vitaly Kuznetsov wrote:
>>> I personally tend to prefer masks over bitfields, so I'd rather do the
>>> consolidation in the opposite direction: use the definitions in
>>> hyperv-tlfs.h and replace those unions/bitfields elsewhere. (I vaguely
>>> remember posting such a patchset a couple of years ago but I lacked the
>>> motivation to complete it).
>>
>> Are there any known advantages of using masks over bitfields, or is the
>> resulting binary code the same?
>
> Strictly speaking bitwise ops are portable while bitfields are not, but
> I guess this is not much of an issue with gcc which is dependable to
> produce the right thing.
>
> I came to dislike the bitfields for the false feeling of atomicity of
> assignment while most of the time they are read-modify-write operations.
>
> And no, I don't feel strong about it, so if nobody backs me on this I
> give up :)

Last time I tried to push a change from bitmasks to bitfields I failed:
https://lkml.org/lkml/2014/9/16/245

On a different note: how come all of the hyper-v structs are not marked
with the “packed" attribute?

2018-11-28 08:41:31

by Paolo Bonzini

Subject: Re: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

On 27/11/18 19:48, Roman Kagan wrote:
> On Tue, Nov 27, 2018 at 02:10:49PM +0100, Vitaly Kuznetsov wrote:
>> Roman Kagan <[email protected]> writes:
>>> On Mon, Nov 26, 2018 at 04:47:29PM +0100, Vitaly Kuznetsov wrote:
>>> I personally tend to prefer masks over bitfields, so I'd rather do the
>>> consolidation in the opposite direction: use the definitions in
>>> hyperv-tlfs.h and replace those unions/bitfields elsewhere. (I vaguely
>>> remember posting such a patchset a couple of years ago but I lacked the
>>> motivation to complete it).
>>
>> Are there any known advantages of using masks over bitfields, or is the
>> resulting binary code the same?
>
> Strictly speaking bitwise ops are portable while bitfields are not, but
> I guess this is not much of an issue with gcc which is dependable to
> produce the right thing.
>
> I came to dislike the bitfields for the false feeling of atomicity of
> assignment while most of the time they are read-modify-write operations.
>
> And no, I don't feel strong about it, so if nobody backs me on this I
> give up :)

I agree, but I am deferring to the Hyper-V maintainers. KVM is mostly
reading these structs.

Paolo


2018-11-28 08:44:39

by Paolo Bonzini

Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

On 27/11/18 20:05, Roman Kagan wrote:
>>> Capabilities are a well-established and unambiguous negotiation
>>> mechanism, why invent another one? Besides, not all features map
>>> conveniently onto cpuid bits, e.g. currently we have two versions of
>>> SynIC support, which differ in the way the userspace deals with it, but
>>> not in the cpuid bits we expose in the guest. IMO such an ioctl would
>>> bring more complexity rather than less.
>>
>> Yes, in that case (for bugfixes basically) you'd have to use
>> capabilities. But if the feature is completely hidden from userspace
>> except for the CPUID bit, it seems like an ioctl would be more consistent
>> with the rest of the KVM API.
>
> So there will be some features that are more equal than others?

Well, it's already like that. Some features have to be explicitly
KVM_ENABLE_CAP'd because userspace has to be aware of them. But in many
cases they can be enabled blindly, and in that case a CPUID-based API
can help.

The CPUID-based API already works well for processor features, MSRs, KVM
paravirt features, etc. Unfortunately we cannot just reuse
KVM_GET_SUPPORTED_CPUID because userspace expects KVM features to be at
0x40000000 rather than Hyper-V ones.

> I think that it's the current scheme which is more consistent: there are
> features that are implemented in KVM, and there are caps for negotiating
> them with userspace, and then -- separately -- there's a mix of bits to
> show to the guest in response to CPUID, which only the userspace has to
> bother with.

The only issue is how to present the "features that are implemented in
KVM". Since they _are_ expressed as CPUID bits, it makes sense if
userspace only has to know about those instead of having both
capabilities and CPUID bits.

Paolo

2018-11-28 10:39:27

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

Nadav Amit <[email protected]> writes:

>
> On a different note: how come all of the hyper-v structs are not marked
> with the "packed" attribute?

"packed" should not be needed with proper padding; I vaguely remember
someone (from x86@?) arguing _against_ "packed".

--
Vitaly

2018-11-28 13:09:42

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

On Wed, 28 Nov 2018, Vitaly Kuznetsov wrote:

> Nadav Amit <[email protected]> writes:
>
> >
> > On a different note: how come all of the hyper-v structs are not marked
> > with the "packed" attribute?
>
> "packed" should not be needed with proper padding; I vaguely remember
> someone (from x86@?) arguing _against_ "packed".

Packed needs to be used when describing fixed-format data structures in
hardware or other ABIs, so the compiler cannot put alignment holes into
them.

Using packed for generic data structures might result in suboptimal layouts
and prevents layout randomization.
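
A trivial stand-alone illustration (not kernel code): the compiler pads
the natural layout, packed does not. With explicit reserved fields in the
definitions, as Vitaly notes, the two end up identical anyway.

#include <stdio.h>
#include <stdint.h>

/* Natural layout: 3 bytes of padding are inserted after 'flag' so that
 * 'gpa' is 4-byte aligned; sizeof() is 8 on common ABIs. */
struct natural {
	uint8_t  flag;
	uint32_t gpa;
};

/* Packed layout: no padding, members sit at fixed offsets; sizeof() is 5.
 * This is what a fixed hardware/ABI format needs if it has such holes. */
struct __attribute__((packed)) fixed {
	uint8_t  flag;
	uint32_t gpa;
};

int main(void)
{
	printf("natural: %zu bytes, packed: %zu bytes\n",
	       sizeof(struct natural), sizeof(struct fixed));
	return 0;
}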

Thanks,

tglx

2018-11-28 17:58:34

by Nadav Amit

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

> On Nov 28, 2018, at 5:07 AM, Thomas Gleixner <[email protected]> wrote:
>
> On Wed, 28 Nov 2018, Vitaly Kuznetsov wrote:
>
>> Nadav Amit <[email protected]> writes:
>>
>>> On a different note: how come all of the hyper-v structs are not marked
>>> with the "packed" attribute?
>>
>> "packed" should not be needed with proper padding; I vaguely remember
>> someone (from x86@?) arguing _against_ "packed".
>
> Packed needs to be used when describing fixed-format data structures in
> hardware or other ABIs, so the compiler cannot put alignment holes into
> them.
>
> Using packed for generic data structures might result in suboptimal layouts
> and prevents layout randomization.

Right, I forgot about struct layout randomization. So at least for that,
the attribute is needed.

To prevent conflicts, I think that this series should also add the
attribute in a first patch, which would be tagged for stable.


2018-11-29 07:54:34

by Roman Kagan

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

On Wed, Nov 28, 2018 at 02:07:42PM +0100, Thomas Gleixner wrote:
> On Wed, 28 Nov 2018, Vitaly Kuznetsov wrote:
>
> > Nadav Amit <[email protected]> writes:
> >
> > >
> > > On a different note: how come all of the hyper-v structs are not marked
> > > with the "packed" attribute?
> >
> > "packed" should not be needed with proper padding; I vaguely remember
> > someone (from x86@?) arguing _against_ "packed".
>
> Packed needs to be used when describing fixed-format data structures in
> hardware or other ABIs, so the compiler cannot put alignment holes into
> them.
>
> Using packed for generic data structures might result in suboptimal layouts
> and prevents layout randomization.

Sorry for my illiteracy; I haven't watched this field closely, so I had
to google layout randomization. Is my understanding correct that one can
no longer rely on the compiler to naturally align the struct members with
minimal padding? My life will never be the same...

Roman.

2018-11-29 11:38:59

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

Nadav Amit <[email protected]> writes:

>> On Nov 28, 2018, at 5:07 AM, Thomas Gleixner <[email protected]> wrote:
>>
>> On Wed, 28 Nov 2018, Vitaly Kuznetsov wrote:
>>
>>> Nadav Amit <[email protected]> writes:
>>>
>>>> On a different note: how come all of the hyper-v structs are not marked
>>>> with the "packed" attribute?
>>>
>>> "packed" should not be needed with proper padding; I vaguely remember
>>> someone (from x86@?) arguing _against_ "packed".
>>
>> Packed needs to be used when describing fixed-format data structures in
>> hardware or other ABIs, so the compiler cannot put alignment holes into
>> them.
>>
>> Using packed for generic data structures might result in suboptimal layouts
>> and prevents layout randomization.
>
> Right, I forgot about struct layout randomization. So at least for that,
> the attribute is needed.
>

Not sure when randomization is used, but Hyper-V drivers will of course be
utterly broken with it.

> To prevent conflicts, I think that this series should also add the
> attribute in a first patch, which would be tagged for stable.

As the patchset doesn't add new definitions and as Paolo already queued
it, I'd go with a follow-up patch adding "packed" to all hyperv-tlfs.h
structures. The question is how to avoid conflicts when Linus merges
this. We can do:
- Topic branch in kvm
- Send the patch to x86, make a topic branch and rebase kvm
- Send the patch to kvm
- ... ?

Paolo/Thomas, what would be your preference?

--
Vitaly

2018-11-29 19:23:45

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h

On Thu, 29 Nov 2018, Vitaly Kuznetsov wrote:
> Nadav Amit <[email protected]> writes:
>
> >> On Nov 28, 2018, at 5:07 AM, Thomas Gleixner <[email protected]> wrote:
> >>
> >> On Wed, 28 Nov 2018, Vitaly Kuznetsov wrote:
> >>
> >>> Nadav Amit <[email protected]> writes:
> >>>
> >>>> On a different note: how come all of the hyper-v structs are not marked
> >>>> with the "packed" attribute?
> >>>
> >>> "packed" should not be needed with proper padding; I vaguely remember
> >>> someone (from x86@?) arguing _against_ "packed".
> >>
> >> Packed needs to be used when describing fixed-format data structures in
> >> hardware or other ABIs, so the compiler cannot put alignment holes into
> >> them.
> >>
> >> Using packed for generic data structures might result in suboptimal layouts
> >> and prevents layout randomization.
> >
> > Right, I forgot about struct layout randomization. So at least for that,
> > the attribute is needed.
> >
>
> Not sure when randomization is used, but Hyper-V drivers will of course be
> utterly broken with it.
>
> > To prevent conflicts, I think that this series should also add the
> > attribute in a first patch, which would be tagged for stable.
>
> As the patchset doesn't add new definitions and as Paolo already queued
> it, I'd go with a follow-up patch adding "packed" to all hyperv-tlfs.h
> structures. The question is how to avoid conflicts when Linus merges
> this. We can do:
> - Topic branch in kvm
> - Send the patch to x86, make a topic branch and rebase kvm
> - Send the patch to kvm
> - ... ?
>
> Paolo/Thomas, what would be your preference?

As Paolo already has it, just route it through his tree please.

Thanks,

tglx

2018-12-03 17:14:09

by Roman Kagan

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

On Mon, Nov 26, 2018 at 04:47:31PM +0100, Vitaly Kuznetsov wrote:
> @@ -379,6 +398,14 @@ void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector)
> for (i = 0; i < ARRAY_SIZE(synic->sint); i++)
> if (synic_get_sint_vector(synic_read_sint(synic, i)) == vector)
> kvm_hv_notify_acked_sint(vcpu, i);
> +
> + for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++) {
> + stimer = &hv_vcpu->stimer[i];
> + if (stimer->msg_pending && stimer->config.enable &&
> + stimer->config.direct_mode &&
> + stimer->config.apic_vector == vector)
> + stimer_mark_pending(stimer, false);
> + }
> }

While debugging another issue with synic timers, it just occurred to me
that with direct timers no extra processing is necessary on EOI: unlike
traditional synic timers which may have failed to deliver a message and
want to be notified when they can retry, direct timers just set the irq
directly in the apic.

So this hunk shouldn't be needed, should it?
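
(For context, the direct-mode notification path in the series is
essentially just an APIC injection; the snippet below is my
reconstruction from the posting, not a verbatim quote, so details may
differ:)

/* Direct mode: assert the configured APIC vector, no SynIC message. */
static int stimer_notify_direct(struct kvm_vcpu_hv_stimer *stimer)
{
	struct kvm_vcpu *vcpu = stimer_to_vcpu(stimer);
	struct kvm_lapic_irq irq = {
		.delivery_mode = APIC_DM_FIXED,
		.vector = stimer->config.apic_vector,
	};

	/* Non-zero return only if the local APIC could not take the irq. */
	return !kvm_apic_set_irq(vcpu, &irq, NULL);
}

There is no message slot that can be busy here, hence nothing to re-arm
on EOI.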

Roman.

2018-12-04 12:37:19

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

Roman Kagan <[email protected]> writes:

> On Mon, Nov 26, 2018 at 04:47:31PM +0100, Vitaly Kuznetsov wrote:
>> @@ -379,6 +398,14 @@ void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector)
>> for (i = 0; i < ARRAY_SIZE(synic->sint); i++)
>> if (synic_get_sint_vector(synic_read_sint(synic, i)) == vector)
>> kvm_hv_notify_acked_sint(vcpu, i);
>> +
>> + for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++) {
>> + stimer = &hv_vcpu->stimer[i];
>> + if (stimer->msg_pending && stimer->config.enable &&
>> + stimer->config.direct_mode &&
>> + stimer->config.apic_vector == vector)
>> + stimer_mark_pending(stimer, false);
>> + }
>> }
>
> While debugging another issue with synic timers, it just occurred to me
> that with direct timers no extra processing is necessary on EOI: unlike
> traditional synic timers which may have failed to deliver a message and
> want to be notified when they can retry, direct timers just set the irq
> directly in the apic.
>
> So this hunk shouldn't be needed, should it?

Hm, you're probably right: kvm_apic_set_irq() fails only when apic is
disabled (see APIC_DM_FIXED case in __apic_accept_irq()) and I'm not
convinced we should re-try in this synthetic case.

Let me test the hypothesis with Hyper-V on KVM; I'll come back with
either a patch removing this over-engineered part or a reason for it to
stay. Will do later this week.

Thanks!

--
Vitaly

2018-12-10 12:58:44

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

Roman Kagan <[email protected]> writes:

> On Mon, Nov 26, 2018 at 04:47:31PM +0100, Vitaly Kuznetsov wrote:
>> Turns out Hyper-V on KVM (as of 2016) will only use synthetic timers
>> if direct mode is available. With direct mode we notify the guest by
>> asserting APIC irq instead of sending a SynIC message.
>>
>> The implementation uses existing vec_bitmap for letting lapic code
>> know that we're interested in the particular IRQ's EOI request. We assume
>> that the same APIC irq won't be used by the guest for both direct mode
>> stimer and as sint source (especially with AutoEOI semantics). It is
>> unclear how things should be handled if that's not true.
>>
>> Direct mode is also somewhat less expensive; in my testing
>> stimer_send_msg() takes not less than 1500 cpu cycles and
>> stimer_notify_direct() can usually be done in 300-400. WS2016 without
>> Hyper-V, however, always sticks to the non-direct version.
>>
>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
>> ---
>> - Changes since v1: avoid open-coding stimer_mark_pending() in
>> kvm_hv_synic_send_eoi() [Paolo Bonzini]
>> ---
>> arch/x86/kvm/hyperv.c | 67 +++++++++++++++++++++++++++++++++++-----
>> arch/x86/kvm/trace.h | 10 +++---
>> arch/x86/kvm/x86.c | 1 +
>> include/uapi/linux/kvm.h | 1 +
>> 4 files changed, 67 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index eaec15c738df..9533133be566 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -38,6 +38,9 @@
>>
>> #define KVM_HV_MAX_SPARSE_VCPU_SET_BITS DIV_ROUND_UP(KVM_MAX_VCPUS, 64)
>>
>> +static void stimer_mark_pending(struct kvm_vcpu_hv_stimer *stimer,
>> + bool vcpu_kick);
>> +
>> static inline u64 synic_read_sint(struct kvm_vcpu_hv_synic *synic, int sint)
>> {
>> return atomic64_read(&synic->sint[sint]);
>> @@ -53,8 +56,21 @@ static inline int synic_get_sint_vector(u64 sint_value)
>> static bool synic_has_vector_connected(struct kvm_vcpu_hv_synic *synic,
>> int vector)
>> {
>> + struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
>> + struct kvm_vcpu_hv *hv_vcpu = vcpu_to_hv_vcpu(vcpu);
>> + struct kvm_vcpu_hv_stimer *stimer;
>> int i;
>>
>> + for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++) {
>> + stimer = &hv_vcpu->stimer[i];
>> + if (stimer->config.enable && stimer->config.direct_mode &&
>> + stimer->config.apic_vector == vector)
>> + return true;
>> + }
>> +
>> + if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
>> + return false;
>> +
>> for (i = 0; i < ARRAY_SIZE(synic->sint); i++) {
>> if (synic_get_sint_vector(synic_read_sint(synic, i)) == vector)
>> return true;
>> @@ -80,14 +96,14 @@ static bool synic_has_vector_auto_eoi(struct kvm_vcpu_hv_synic *synic,
>> static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
>> int vector)
>> {
>> - if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
>> - return;
>> -
>> if (synic_has_vector_connected(synic, vector))
>> __set_bit(vector, synic->vec_bitmap);
>> else
>> __clear_bit(vector, synic->vec_bitmap);
>>
>> + if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
>> + return;
>> +
>
> Just noticed that the patch seems to assume that "direct" timers are
> allowed to use any vectors including 0-15. I guess this is incorrect,
> and instead stimer_set_config should error out on direct mode with a
> vector less than HV_SYNIC_FIRST_VALID_VECTOR.

The spec is really vague about this and I'm not sure that this has
anything to do with HV_SYNIC_FIRST_VALID_VECTOR (as these are actually
not "synic" vectors, I *think* that SynIC doesn't even need to be
enabled to make them work).

I checked and Hyper-V 2016 uses vector '0xff', not sure if it proves
your point :-)

Do you envision any issues in KVM if we keep allowing vectors <
HV_SYNIC_FIRST_VALID_VECTOR?

--
Vitaly

2018-12-10 13:22:24

by Roman Kagan

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

On Mon, Nov 26, 2018 at 04:47:31PM +0100, Vitaly Kuznetsov wrote:
> Turns out Hyper-V on KVM (as of 2016) will only use synthetic timers
> if direct mode is available. With direct mode we notify the guest by
> asserting APIC irq instead of sending a SynIC message.
>
> The implementation uses existing vec_bitmap for letting lapic code
> know that we're interested in the particular IRQ's EOI request. We assume
> that the same APIC irq won't be used by the guest for both direct mode
> stimer and as sint source (especially with AutoEOI semantics). It is
> unclear how things should be handled if that's not true.
>
> Direct mode is also somewhat less expensive; in my testing
> stimer_send_msg() takes not less than 1500 cpu cycles and
> stimer_notify_direct() can usually be done in 300-400. WS2016 without
> Hyper-V, however, always sticks to the non-direct version.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> - Changes since v1: avoid open-coding stimer_mark_pending() in
> kvm_hv_synic_send_eoi() [Paolo Bonzini]
> ---
> arch/x86/kvm/hyperv.c | 67 +++++++++++++++++++++++++++++++++++-----
> arch/x86/kvm/trace.h | 10 +++---
> arch/x86/kvm/x86.c | 1 +
> include/uapi/linux/kvm.h | 1 +
> 4 files changed, 67 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index eaec15c738df..9533133be566 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -38,6 +38,9 @@
>
> #define KVM_HV_MAX_SPARSE_VCPU_SET_BITS DIV_ROUND_UP(KVM_MAX_VCPUS, 64)
>
> +static void stimer_mark_pending(struct kvm_vcpu_hv_stimer *stimer,
> + bool vcpu_kick);
> +
> static inline u64 synic_read_sint(struct kvm_vcpu_hv_synic *synic, int sint)
> {
> return atomic64_read(&synic->sint[sint]);
> @@ -53,8 +56,21 @@ static inline int synic_get_sint_vector(u64 sint_value)
> static bool synic_has_vector_connected(struct kvm_vcpu_hv_synic *synic,
> int vector)
> {
> + struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
> + struct kvm_vcpu_hv *hv_vcpu = vcpu_to_hv_vcpu(vcpu);
> + struct kvm_vcpu_hv_stimer *stimer;
> int i;
>
> + for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++) {
> + stimer = &hv_vcpu->stimer[i];
> + if (stimer->config.enable && stimer->config.direct_mode &&
> + stimer->config.apic_vector == vector)
> + return true;
> + }
> +
> + if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
> + return false;
> +
> for (i = 0; i < ARRAY_SIZE(synic->sint); i++) {
> if (synic_get_sint_vector(synic_read_sint(synic, i)) == vector)
> return true;
> @@ -80,14 +96,14 @@ static bool synic_has_vector_auto_eoi(struct kvm_vcpu_hv_synic *synic,
> static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
> int vector)
> {
> - if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
> - return;
> -
> if (synic_has_vector_connected(synic, vector))
> __set_bit(vector, synic->vec_bitmap);
> else
> __clear_bit(vector, synic->vec_bitmap);
>
> + if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
> + return;
> +

Just noticed that the patch seems to assume that "direct" timers are
allowed to use any vectors including 0-15. I guess this is incorrect,
and instead stimer_set_config should error out on direct mode with a
vector less than HV_SYNIC_FIRST_VALID_VECTOR.

Roman.

2018-12-10 15:06:24

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

Roman Kagan <[email protected]> writes:

> On Mon, Dec 10, 2018 at 01:54:18PM +0100, Vitaly Kuznetsov wrote:
>> Roman Kagan <[email protected]> writes:
>> > Just noticed that the patch seems to assume that "direct" timers are
>> > allowed to use any vectors including 0-15. I guess this is incorrect,
>> > and instead stimer_set_config should error out on direct mode with a
>> > vector less than HV_SYNIC_FIRST_VALID_VECTOR.
>>
>> The spec is really vague about this and I'm not sure that this has
>> anything to do with HV_SYNIC_FIRST_VALID_VECTOR (as these are actually
>> not "synic" vectors, I *think* that SynIC doesn't even need to be
>> enabled to make them work).
>>
>> I checked and Hyper-V 2016 uses vector '0xff', not sure if it proves
>> your point :-)
>>
>> Do you envision any issues in KVM if we keep allowing vectors <
>> HV_SYNIC_FIRST_VALID_VECTOR?
>
> It's actually lapic that treats vectors 0..15 as illegal. Nothing
> Hyper-V specific here.

Oh, right you are,

Intel SDM 10.5.2 "Valid Interrupt Vectors" says:

"The Intel 64 and IA-32 architectures define 256 vector numbers, ranging
from 0 through 255 (see Section 6.2, “Exception and Interrupt
Vectors”). Local and I/O APICs support 240 of these vectors (in the
range of 16 to 255) as valid interrupts.

When an interrupt vector in the range of 0 to 15 is sent or received
through the local APIC, the APIC indicates an illegal vector in its
Error Status Register (see Section 10.5.3, “Error Handling”). The Intel
64 and IA-32 architectures reserve vectors 16 through 31 for predefined
interrupts, exceptions, and Intel-reserved encodings (see Table
6-1). However, the local APIC does not treat vectors in this range as
illegal."

Out of pure curiosity I checked what Hyper-V does by hacking up Linux,
and I got "unchecked MSR access error: WRMSR to 0x400000b0", so we know
they follow the spec.

I'll send a patch to fix this, thanks!
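
Something along these lines, i.e. failing the WRMSR in stimer_set_config()
(sketch only; the helper name and exact placement are made up, not the
patch I'll actually send):

/*
 * Sketch: reject a direct-mode config whose vector the local APIC treats
 * as illegal (0..15, see SDM 10.5.2). HV_SYNIC_FIRST_VALID_VECTOR happens
 * to be 16, so it can double as the bound here.
 */
static bool stimer_config_vector_ok(union hv_stimer_config config)
{
	return !config.direct_mode ||
	       config.apic_vector >= HV_SYNIC_FIRST_VALID_VECTOR;
}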

--
Vitaly

2018-12-10 16:05:38

by Roman Kagan

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] x86/kvm/hyper-v: direct mode for synthetic timers

On Mon, Dec 10, 2018 at 01:54:18PM +0100, Vitaly Kuznetsov wrote:
> Roman Kagan <[email protected]> writes:
> > Just noticed that the patch seems to assume that "direct" timers are
> > allowed to use any vectors including 0-15. I guess this is incorrect,
> > and instead stimer_set_config should error out on direct mode with a
> > vector less than HV_SYNIC_FIRST_VALID_VECTOR.
>
> The spec is really vague about this and I'm not sure that this has
> anything to do with HV_SYNIC_FIRST_VALID_VECTOR (as these are actually
> not "synic" vectors, I *think* that SynIC doesn't even need to be
> enabled to make them work).
>
> I checked and Hyper-V 2016 uses vector '0xff', not sure if it proves
> your point :-)
>
> Do you envision any issues in KVM if we keep allowing vectors <
> HV_SYNIC_FIRST_VALID_VECTOR?

It's actually lapic that treats vectors 0..15 as illegal. Nothing
Hyper-V specific here.

Roman.