kvm ptp targets to provide high precision time sync between guest
and host in virtualization environment. This patch enable kvm ptp
for arm64.
This patch set base on [1][2][3]
change log:
from v2 to v3:
(1) fix some issues in commit log.
(2) add some receivers in send list.
from v1 to v2:
(1) move arch-specific code from arch/ to driver/ptp/
(2) offer mechanism to inform userspace if ptp_kvm service is
available.
(3) separate ptp_kvm code for arm64 into hypervisor part and
guest part.
(4) add API to expose monotonic clock and counter value.
(5) refine code: remove no necessary part and reconsitution.
[1]https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/
commit/?h=kvm/hvc&id=125ea89e4a21e2fc5235410f966a996a1a7148bf
[2]https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/
commit/?h=kvm/hvc&id=464f5a1741e5959c3e4d2be1966ae0093b4dce06
[3]https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/
commit/?h=kvm/hvc&id=6597490e005d0eeca8ed8c1c1d7b4318ee014681
Jianyong Wu (6):
psci: Export psci_ops.conduit symbol as modules will use it.
ptp: Reorganize ptp_kvm modules to make it arch-independent.
timekeeping: Expose API allowing retrival of current clocksource and
counter value
psci: Add hvc call service for ptp_kvm.
ptp: arm64: Enable ptp_kvm for arm64
kvm: arm64: Add capability check extension for ptp_kvm
drivers/firmware/psci/psci.c | 6 ++
drivers/ptp/Kconfig | 2 +-
drivers/ptp/Makefile | 1 +
drivers/ptp/{ptp_kvm.c => kvm_ptp.c} | 77 ++++++------------------
drivers/ptp/ptp_kvm_arm64.c | 82 ++++++++++++++++++++++++++
drivers/ptp/ptp_kvm_x86.c | 87 ++++++++++++++++++++++++++++
include/asm-generic/ptp_kvm.h | 12 ++++
include/linux/arm-smccc.h | 14 ++++-
include/linux/psci.h | 1 +
include/linux/timekeeping.h | 3 +
include/uapi/linux/kvm.h | 1 +
kernel/time/timekeeping.c | 13 +++++
virt/kvm/arm/arm.c | 1 +
virt/kvm/arm/psci.c | 17 ++++++
14 files changed, 256 insertions(+), 61 deletions(-)
rename drivers/ptp/{ptp_kvm.c => kvm_ptp.c} (63%)
create mode 100644 drivers/ptp/ptp_kvm_arm64.c
create mode 100644 drivers/ptp/ptp_kvm_x86.c
create mode 100644 include/asm-generic/ptp_kvm.h
--
2.17.1
If arm_smccc_1_1_invoke used in modules, psci_ops.conduit should
be export.
Signed-off-by: Jianyong Wu <[email protected]>
---
drivers/firmware/psci/psci.c | 6 ++++++
include/linux/arm-smccc.h | 2 +-
include/linux/psci.h | 1 +
3 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
index f82ccd39a913..35c4eaab1451 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -212,6 +212,12 @@ static unsigned long psci_migrate_info_up_cpu(void)
0, 0, 0);
}
+enum psci_conduit psci_get_conduit(void)
+{
+ return psci_ops.conduit;
+}
+EXPORT_SYMBOL(psci_get_conduit);
+
static void set_conduit(enum psci_conduit conduit)
{
switch (conduit) {
diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index 552cbd49abe8..a6e4d3e3d10a 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -357,7 +357,7 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1,
* The return value also provides the conduit that was used.
*/
#define arm_smccc_1_1_invoke(...) ({ \
- int method = psci_ops.conduit; \
+ int method = psci_get_conduit(); \
switch (method) { \
case PSCI_CONDUIT_HVC: \
arm_smccc_1_1_hvc(__VA_ARGS__); \
diff --git a/include/linux/psci.h b/include/linux/psci.h
index a8a15613c157..e5cedc986049 100644
--- a/include/linux/psci.h
+++ b/include/linux/psci.h
@@ -42,6 +42,7 @@ struct psci_operations {
enum smccc_version smccc_version;
};
+extern enum psci_conduit psci_get_conduit(void);
extern struct psci_operations psci_ops;
#if defined(CONFIG_ARM_PSCI_FW)
--
2.17.1
This patch is the base of ptp_kvm for arm64.
ptp_kvm modules will call hvc to get this service.
The service offers real time and counter cycle of host for guest.
Signed-off-by: Jianyong Wu <[email protected]>
---
include/linux/arm-smccc.h | 12 ++++++++++++
virt/kvm/arm/psci.c | 17 +++++++++++++++++
2 files changed, 29 insertions(+)
diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index a6e4d3e3d10a..bc0cdad10f35 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -94,6 +94,7 @@
/* KVM "vendor specific" services */
#define ARM_SMCCC_KVM_FUNC_FEATURES 0
+#define ARM_SMCCC_KVM_PTP 1
#define ARM_SMCCC_KVM_FUNC_FEATURES_2 127
#define ARM_SMCCC_KVM_NUM_FUNCS 128
@@ -103,6 +104,17 @@
ARM_SMCCC_OWNER_VENDOR_HYP, \
ARM_SMCCC_KVM_FUNC_FEATURES)
+/*
+ * This ID used for virtual ptp kvm clock and it will pass second value
+ * and nanosecond value of host real time and system counter by vcpu
+ * register to guest.
+ */
+#define ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID \
+ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+ ARM_SMCCC_SMC_32, \
+ ARM_SMCCC_OWNER_VENDOR_HYP, \
+ ARM_SMCCC_KVM_PTP)
+
#ifndef __ASSEMBLY__
#include <linux/linkage.h>
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index 0debf49bf259..2c5d53817a28 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -392,6 +392,8 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
u32 func_id = smccc_get_function(vcpu);
u32 val[4] = {};
u32 option;
+ struct timespec *ts;
+ struct system_counterval_t sc;
val[0] = SMCCC_RET_NOT_SUPPORTED;
@@ -431,6 +433,21 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
val[0] = BIT(ARM_SMCCC_KVM_FUNC_FEATURES);
break;
+ /*
+ * This will used for virtual ptp kvm clock. three
+ * values will be passed back.
+ * reg0 stores seconds of host real time;
+ * reg1 stores nanoseconds of host real time;
+ * reg2 stores system counter cycle value.
+ */
+ case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
+ getnstimeofday(ts);
+ get_current_counterval(&sc);
+ val[0] = ts->tv_sec;
+ val[1] = ts->tv_nsec;
+ val[2] = sc.cycles;
+ val[3] = 0;
+ break;
default:
return kvm_psci_call(vcpu);
}
--
2.17.1
Currently in arm64 virtualization environment, there is no mechanism to
keep time sync between guest and host. Time in guest will drift compared
with host after boot up as they may both use third party time sources
to correct their time respectively. The time deviation will be in order
of milliseconds but some scenarios ask for higher time precision, like
in cloud envirenment, we want all the VMs running in the host aquire the
same level accuracy from host clock.
Use of kvm ptp clock, which choose the host clock source clock as a
reference clock to sync time clock between guest and host has been adopted
by x86 which makes the time sync order from milliseconds to nanoseconds.
This patch enable kvm ptp on arm64 and we get the similar clock drift as
found with x86 with kvm ptp.
Test result comparison between with kvm ptp and without it in arm64 are
as follows. This test derived from the result of command 'chronyc
sources'. we should take more cure of the last sample column which shows
the offset between the local clock and the source at the last measurement.
no kvm ptp in guest:
MS Name/IP address Stratum Poll Reach LastRx Last sample
========================================================================
^* dns1.synet.edu.cn 2 6 377 13 +1040us[+1581us] +/- 21ms
^* dns1.synet.edu.cn 2 6 377 21 +1040us[+1581us] +/- 21ms
^* dns1.synet.edu.cn 2 6 377 29 +1040us[+1581us] +/- 21ms
^* dns1.synet.edu.cn 2 6 377 37 +1040us[+1581us] +/- 21ms
^* dns1.synet.edu.cn 2 6 377 45 +1040us[+1581us] +/- 21ms
^* dns1.synet.edu.cn 2 6 377 53 +1040us[+1581us] +/- 21ms
^* dns1.synet.edu.cn 2 6 377 61 +1040us[+1581us] +/- 21ms
^* dns1.synet.edu.cn 2 6 377 4 -130us[ +796us] +/- 21ms
^* dns1.synet.edu.cn 2 6 377 12 -130us[ +796us] +/- 21ms
^* dns1.synet.edu.cn 2 6 377 20 -130us[ +796us] +/- 21ms
in host:
MS Name/IP address Stratum Poll Reach LastRx Last sample
========================================================================
^* 120.25.115.20 2 7 377 72 -470us[ -603us] +/- 18ms
^* 120.25.115.20 2 7 377 92 -470us[ -603us] +/- 18ms
^* 120.25.115.20 2 7 377 112 -470us[ -603us] +/- 18ms
^* 120.25.115.20 2 7 377 2 +872ns[-6808ns] +/- 17ms
^* 120.25.115.20 2 7 377 22 +872ns[-6808ns] +/- 17ms
^* 120.25.115.20 2 7 377 43 +872ns[-6808ns] +/- 17ms
^* 120.25.115.20 2 7 377 63 +872ns[-6808ns] +/- 17ms
^* 120.25.115.20 2 7 377 83 +872ns[-6808ns] +/- 17ms
^* 120.25.115.20 2 7 377 103 +872ns[-6808ns] +/- 17ms
^* 120.25.115.20 2 7 377 123 +872ns[-6808ns] +/- 17ms
The dns1.synet.edu.cn is the network reference clock for guest and
120.25.115.20 is the network reference clock for host. we can't get the
clock error between guest and host directly, but a roughly estimated value
will be in order of hundreds of us to ms.
with kvm ptp in guest:
chrony has been disabled in host to remove the disturb by network clock.
MS Name/IP address Stratum Poll Reach LastRx Last sample
========================================================================
* PHC0 0 3 377 8 -7ns[ +1ns] +/- 3ns
* PHC0 0 3 377 8 +1ns[ +16ns] +/- 3ns
* PHC0 0 3 377 6 -4ns[ -0ns] +/- 6ns
* PHC0 0 3 377 6 -8ns[ -12ns] +/- 5ns
* PHC0 0 3 377 5 +2ns[ +4ns] +/- 4ns
* PHC0 0 3 377 13 +2ns[ +4ns] +/- 4ns
* PHC0 0 3 377 12 -4ns[ -6ns] +/- 4ns
* PHC0 0 3 377 11 -8ns[ -11ns] +/- 6ns
* PHC0 0 3 377 10 -14ns[ -20ns] +/- 4ns
* PHC0 0 3 377 8 +4ns[ +5ns] +/- 4ns
The PHC0 is the ptp clock which choose the host clock as its source
clock. So we can be sure to say that the clock error between host and guest
is in order of ns.
Signed-off-by: Jianyong Wu <[email protected]>
---
drivers/ptp/Kconfig | 2 +-
drivers/ptp/kvm_ptp.c | 2 +-
drivers/ptp/ptp_kvm_arm64.c | 82 +++++++++++++++++++++++++++++++++++++
3 files changed, 84 insertions(+), 2 deletions(-)
create mode 100644 drivers/ptp/ptp_kvm_arm64.c
diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig
index 9b8fee5178e8..e032fafdafa7 100644
--- a/drivers/ptp/Kconfig
+++ b/drivers/ptp/Kconfig
@@ -110,7 +110,7 @@ config PTP_1588_CLOCK_PCH
config PTP_1588_CLOCK_KVM
tristate "KVM virtual PTP clock"
depends on PTP_1588_CLOCK
- depends on KVM_GUEST && X86
+ depends on KVM_GUEST && X86 || ARM64
default y
help
This driver adds support for using kvm infrastructure as a PTP
diff --git a/drivers/ptp/kvm_ptp.c b/drivers/ptp/kvm_ptp.c
index d8f215186904..c0b445fa6144 100644
--- a/drivers/ptp/kvm_ptp.c
+++ b/drivers/ptp/kvm_ptp.c
@@ -138,7 +138,7 @@ static int __init ptp_kvm_init(void)
int ret;
ret = kvm_arch_ptp_init();
- if (!ret)
+ if (ret)
return -EOPNOTSUPP;
kvm_ptp_clock.caps = ptp_kvm_caps;
diff --git a/drivers/ptp/ptp_kvm_arm64.c b/drivers/ptp/ptp_kvm_arm64.c
new file mode 100644
index 000000000000..630144186c08
--- /dev/null
+++ b/drivers/ptp/ptp_kvm_arm64.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Virtual PTP 1588 clock for use with KVM guests
+ * Copyright (C) 2019 ARM Ltd.
+ * All Rights Reserved
+ */
+
+#include <linux/kernel.h>
+#include <linux/err.h>
+#include <asm/hypervisor.h>
+#include <linux/module.h>
+#include <linux/psci.h>
+#include <linux/arm-smccc.h>
+#include <linux/timecounter.h>
+#include <linux/sched/clock.h>
+#include <asm/arch_timer.h>
+
+struct system_counterval_t ptp_sc;
+
+/*
+ * as trap call cause delay, this function will return the delay in nanosecond
+ */
+static u64 arm_smccc_1_1_invoke_delay(u32 id, struct arm_smccc_res *res)
+{
+ u64 t1, t2;
+
+ t1 = sched_clock();
+ arm_smccc_1_1_invoke(id, res);
+ t2 = sched_clock();
+ t2 -= t1;
+
+ return t2;
+}
+
+int kvm_arch_ptp_init(void)
+{
+ if (!kvm_arm_hyp_service_available(
+ ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID)) {
+ return -EOPNOTSUPP;
+ }
+ get_current_counterval(&ptp_sc);
+
+ return 0;
+}
+
+int kvm_arch_ptp_get_clock_generic(struct timespec64 *ts,
+ struct arm_smccc_res *hvc_res)
+{
+ u64 ns;
+
+ ns = arm_smccc_1_1_invoke_delay(ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID,
+ hvc_res);
+ if ((long)(hvc_res->a0) < 0)
+ return -EOPNOTSUPP;
+
+ ts->tv_sec = hvc_res->a0;
+ ts->tv_nsec = hvc_res->a1;
+ timespec64_add_ns(ts, ns);
+
+ return 0;
+}
+
+int kvm_arch_ptp_get_clock(struct timespec64 *ts)
+{
+ struct arm_smccc_res hvc_res;
+
+ kvm_arch_ptp_get_clock_generic(ts, &hvc_res);
+
+ return 0;
+}
+
+int kvm_arch_ptp_get_clock_fn(long *cycle, struct timespec64 *ts,
+ struct clocksource **cs)
+{
+ struct arm_smccc_res hvc_res;
+
+ kvm_arch_ptp_get_clock_generic(ts, &hvc_res);
+ *cycle = hvc_res.a2;
+ *cs = ptp_sc.cs;
+
+ return 0;
+}
--
2.17.1
Let userspace check if there is kvm ptp service in host.
before VMs migrate to a another host, VMM may check if this
cap is available to determine the migration behaviour.
Signed-off-by: Jianyong Wu <[email protected]>
Suggested-by: Marc Zyngier <[email protected]>
---
include/uapi/linux/kvm.h | 1 +
virt/kvm/arm/arm.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2fe12b40d503..a0bff6002bd9 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -993,6 +993,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_ARM_SVE 170
#define KVM_CAP_ARM_PTRAUTH_ADDRESS 171
#define KVM_CAP_ARM_PTRAUTH_GENERIC 172
+#define KVM_CAP_ARM_KVM_PTP 173
#ifdef KVM_CAP_IRQ_ROUTING
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index bd5c55916d0d..80999985160b 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -201,6 +201,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_MP_STATE:
case KVM_CAP_IMMEDIATE_EXIT:
case KVM_CAP_VCPU_EVENTS:
+ case KVM_CAP_ARM_KVM_PTP:
r = 1;
break;
case KVM_CAP_ARM_SET_DEVICE_ADDR:
--
2.17.1
On 18/09/19 10:07, Jianyong Wu wrote:
> + case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
> + getnstimeofday(ts);
This is not Y2038-safe. Please use ktime_get_real_ts64 instead, and
split the 64-bit seconds value between val[0] and val[1].
However, it seems to me that the new function is not needed and you can
just use ktime_get_snapshot. You'll get the time in
systime_snapshot->real and the cycles value in systime_snapshot->cycles.
> + get_current_counterval(&sc);
> + val[0] = ts->tv_sec;
> + val[1] = ts->tv_nsec;
> + val[2] = sc.cycles;
> + val[3] = 0;
> + break;
This should return a guest-cycles value. If the cycles values always
the same between the host and the guest on ARM, then okay. If not, you
have to apply whatever offset exists.
Thanks,
Paolo
Hi Paolo,
> -----Original Message-----
> From: Paolo Bonzini <[email protected]>
> Sent: Wednesday, September 18, 2019 4:26 PM
> To: Jianyong Wu (Arm Technology China) <[email protected]>;
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Mark Rutland <[email protected]>; Will
> Deacon <[email protected]>; Suzuki Poulose
> <[email protected]>
> Cc: [email protected]; [email protected]; Steve Capper
> <[email protected]>; Kaly Xin (Arm Technology China)
> <[email protected]>; Justin He (Arm Technology China)
> <[email protected]>; nd <[email protected]>; linux-arm-
> [email protected]
> Subject: Re: [RFC PATCH v3 4/6] psci: Add hvc call service for ptp_kvm.
>
> On 18/09/19 10:07, Jianyong Wu wrote:
> > + case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
> > + getnstimeofday(ts);
>
> This is not Y2038-safe. Please use ktime_get_real_ts64 instead, and split the
> 64-bit seconds value between val[0] and val[1].
>
As far as I know, y2038-safe will only affect signed 32-bit integer, how does it affect 64-bit integer?
And why split 64-bit number into two blocks is necessary?
Also, split number will lead to shortage of return value.
> However, it seems to me that the new function is not needed and you can
> just use ktime_get_snapshot. You'll get the time in systime_snapshot->real
> and the cycles value in systime_snapshot->cycles.
See patch 5/6, I need both counter cycle and clocksource, ktime_get_snapshot seems only offer cycles.
>
> > + get_current_counterval(&sc);
> > + val[0] = ts->tv_sec;
> > + val[1] = ts->tv_nsec;
> > + val[2] = sc.cycles;
> > + val[3] = 0;
> > + break;
>
> This should return a guest-cycles value. If the cycles values always the same
> between the host and the guest on ARM, then okay. If not, you have to
> apply whatever offset exists.
>
In my opinion, when use ptp_kvm as clock sources to sync time between host and guest, user should promise the guest and host has no
clock offset. So we can be sure that the cycle between guest and host should be keep consistent. But I need check it.
I think host cycle should be returned to guest as we should promise we get clock and counter in the same time.
Thanks
Jianyong Wu
> Thanks,
>
> Paolo
On 18/09/19 11:57, Jianyong Wu (Arm Technology China) wrote:
> Hi Paolo,
>
>> On 18/09/19 10:07, Jianyong Wu wrote:
>>> + case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
>>> + getnstimeofday(ts);
>>
>> This is not Y2038-safe. Please use ktime_get_real_ts64 instead, and split the
>> 64-bit seconds value between val[0] and val[1].
>>
> As far as I know, y2038-safe will only affect signed 32-bit integer,
> how does it affect 64-bit integer?
> And why split 64-bit number into two blocks is necessary?
val is an u32, not an u64. (And val[0], where you store the seconds, is
best treated as signed, since val[0] == -1 is returned for
SMCCC_RET_NOT_SUPPORTED).
>> However, it seems to me that the new function is not needed and you can
>> just use ktime_get_snapshot. You'll get the time in systime_snapshot->real
>> and the cycles value in systime_snapshot->cycles.
>
> See patch 5/6, I need both counter cycle and clocksource, ktime_get_snapshot seems only offer cycles.
No, patch 5/6 only needs the current clock (ptp_sc.cycles is never
accessed). So you could just use READ_ONCE(tk->tkr_mono.clock).
However, even then I don't think it is correct to use ptp_sc.cs blindly
in patch 5. I think there is a misunderstanding on the meaning of
system_counterval.cs as passed to get_device_system_crosststamp.
system_counterval.cs is not the active clocksource; it's the clocksource
on which system_counterval.cycles is based.
Hypothetically, the clocksource could be one for which ptp_sc.cycles is
_not_ a cycle value. If you set system_counterval.cs to the system
clocksource, get_device_system_crosststamp will return a bogus value.
So system_counterval.cs should be set to something like
&clocksource_counter (from drivers/clocksource/arm_arch_timer.c).
Perhaps the right place to define kvm_arch_ptp_get_clock_fn is in that file?
>>> + get_current_counterval(&sc);
>>> + val[0] = ts->tv_sec;
>>> + val[1] = ts->tv_nsec;
>>> + val[2] = sc.cycles;
>>> + val[3] = 0;
>>> + break;
>>
>> This should return a guest-cycles value. If the cycles values always the same
>> between the host and the guest on ARM, then okay. If not, you have to
>> apply whatever offset exists.
>>
> In my opinion, when use ptp_kvm as clock sources to sync time
> between host and guest, user should promise the guest and host has no
> clock offset.
What would be the adverse effect of having a fixed offset between guest
and host? If there were one, you'd have to check that and fail the
hypercall if there is an offset. But again, I think it's enough to
subtract vcpu_vtimer(vcpu)->cntvoff or something like that.
You also have to check here that the clocksource is based on the ARM
architectural timer. Again, maybe you could place the implementation in
drivers/clocksource/arm_arch_timer.c, and make it return -ENODEV if the
active clocksource is not clocksource_counter. Then KVM can look for
errors and return SMCCC_RET_NOT_SUPPORTED in that case.
Thanks,
Paolo
> So we can be sure that the cycle between guest and
> host should be keep consistent. But I need check it.
> I think host cycle should be returned to guest as we should promise
> we get clock and counter in the same time.
From Marc Zyngier <[email protected]>
A number of PTP drivers (such as ptp-kvm) are assuming what the
current clock source is, which could lead to interesting effects on
systems where the clocksource can change depending on external events.
For this purpose, add a new API that retrives both the current
monotonic clock as well as its counter value.
From Jianyong Wu: export this API then modules can use it.
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Jianyong Wu <[email protected]>
---
include/linux/timekeeping.h | 3 +++
kernel/time/timekeeping.c | 13 +++++++++++++
2 files changed, 16 insertions(+)
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index a8ab0f143ac4..a5389adaa8bc 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -247,6 +247,9 @@ extern int get_device_system_crosststamp(
struct system_time_snapshot *history,
struct system_device_crosststamp *xtstamp);
+/* Obtain current monotonic clock and its counter value */
+extern void get_current_counterval(struct system_counterval_t *sc);
+
/*
* Simultaneously snapshot realtime and monotonic raw clocks
*/
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 44b726bab4bd..07a0969625b1 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1098,6 +1098,19 @@ static bool cycle_between(u64 before, u64 test, u64 after)
return false;
}
+/**
+ * get_current_counterval - Snapshot the current clocksource and counter value
+ * @sc: Pointer to a struct containing the current clocksource and its value
+ */
+void get_current_counterval(struct system_counterval_t *sc)
+{
+ struct timekeeper *tk = &tk_core.timekeeper;
+
+ sc->cs = READ_ONCE(tk->tkr_mono.clock);
+ sc->cycles = sc->cs->read(sc->cs);
+}
+EXPORT_SYMBOL_GPL(get_current_counterval);
+
/**
* get_device_system_crosststamp - Synchronously capture system/device timestamp
* @get_time_fn: Callback to get simultaneous device time and
--
2.17.1
Currently, ptp_kvm modules implementation is only for x86 which includs
large part of arch-specific code. This patch move all of those code
into sole arch related file in the same directory.
Signed-off-by: Jianyong Wu <[email protected]>
---
drivers/ptp/Makefile | 1 +
drivers/ptp/{ptp_kvm.c => kvm_ptp.c} | 77 ++++++------------------
drivers/ptp/ptp_kvm_x86.c | 87 ++++++++++++++++++++++++++++
include/asm-generic/ptp_kvm.h | 12 ++++
4 files changed, 118 insertions(+), 59 deletions(-)
rename drivers/ptp/{ptp_kvm.c => kvm_ptp.c} (63%)
create mode 100644 drivers/ptp/ptp_kvm_x86.c
create mode 100644 include/asm-generic/ptp_kvm.h
diff --git a/drivers/ptp/Makefile b/drivers/ptp/Makefile
index 677d1d178a3e..8f27ba302e31 100644
--- a/drivers/ptp/Makefile
+++ b/drivers/ptp/Makefile
@@ -4,6 +4,7 @@
#
ptp-y := ptp_clock.o ptp_chardev.o ptp_sysfs.o
+ptp_kvm-y := ptp_kvm_$(ARCH).o kvm_ptp.o
obj-$(CONFIG_PTP_1588_CLOCK) += ptp.o
obj-$(CONFIG_PTP_1588_CLOCK_DTE) += ptp_dte.o
obj-$(CONFIG_PTP_1588_CLOCK_IXP46X) += ptp_ixp46x.o
diff --git a/drivers/ptp/ptp_kvm.c b/drivers/ptp/kvm_ptp.c
similarity index 63%
rename from drivers/ptp/ptp_kvm.c
rename to drivers/ptp/kvm_ptp.c
index fc7d0b77e118..d8f215186904 100644
--- a/drivers/ptp/ptp_kvm.c
+++ b/drivers/ptp/kvm_ptp.c
@@ -8,12 +8,12 @@
#include <linux/err.h>
#include <linux/init.h>
#include <linux/kernel.h>
+#include <linux/slab.h>
#include <linux/module.h>
#include <uapi/linux/kvm_para.h>
#include <asm/kvm_para.h>
-#include <asm/pvclock.h>
-#include <asm/kvmclock.h>
#include <uapi/asm/kvm_para.h>
+#include <asm-generic/ptp_kvm.h>
#include <linux/ptp_clock_kernel.h>
@@ -24,56 +24,29 @@ struct kvm_ptp_clock {
DEFINE_SPINLOCK(kvm_ptp_lock);
-static struct pvclock_vsyscall_time_info *hv_clock;
-
-static struct kvm_clock_pairing clock_pair;
-static phys_addr_t clock_pair_gpa;
-
static int ptp_kvm_get_time_fn(ktime_t *device_time,
struct system_counterval_t *system_counter,
void *ctx)
{
- unsigned long ret;
+ unsigned long ret, cycle;
struct timespec64 tspec;
- unsigned version;
- int cpu;
- struct pvclock_vcpu_time_info *src;
+ struct clocksource *cs;
spin_lock(&kvm_ptp_lock);
preempt_disable_notrace();
- cpu = smp_processor_id();
- src = &hv_clock[cpu].pvti;
-
- do {
- /*
- * We are using a TSC value read in the hosts
- * kvm_hc_clock_pairing handling.
- * So any changes to tsc_to_system_mul
- * and tsc_shift or any other pvclock
- * data invalidate that measurement.
- */
- version = pvclock_read_begin(src);
-
- ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING,
- clock_pair_gpa,
- KVM_CLOCK_PAIRING_WALLCLOCK);
- if (ret != 0) {
- pr_err_ratelimited("clock pairing hypercall ret %lu\n", ret);
- spin_unlock(&kvm_ptp_lock);
- preempt_enable_notrace();
- return -EOPNOTSUPP;
- }
-
- tspec.tv_sec = clock_pair.sec;
- tspec.tv_nsec = clock_pair.nsec;
- ret = __pvclock_read_cycles(src, clock_pair.tsc);
- } while (pvclock_read_retry(src, version));
+ ret = kvm_arch_ptp_get_clock_fn(&cycle, &tspec, &cs);
+ if (ret != 0) {
+ pr_err_ratelimited("clock pairing hypercall ret %lu\n", ret);
+ spin_unlock(&kvm_ptp_lock);
+ preempt_enable_notrace();
+ return -EOPNOTSUPP;
+ }
preempt_enable_notrace();
- system_counter->cycles = ret;
- system_counter->cs = &kvm_clock;
+ system_counter->cycles = cycle;
+ system_counter->cs = cs;
*device_time = timespec64_to_ktime(tspec);
@@ -116,17 +89,13 @@ static int ptp_kvm_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
spin_lock(&kvm_ptp_lock);
- ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING,
- clock_pair_gpa,
- KVM_CLOCK_PAIRING_WALLCLOCK);
+ ret = kvm_arch_ptp_get_clock(&tspec);
if (ret != 0) {
pr_err_ratelimited("clock offset hypercall ret %lu\n", ret);
spin_unlock(&kvm_ptp_lock);
return -EOPNOTSUPP;
}
- tspec.tv_sec = clock_pair.sec;
- tspec.tv_nsec = clock_pair.nsec;
spin_unlock(&kvm_ptp_lock);
memcpy(ts, &tspec, sizeof(struct timespec64));
@@ -166,21 +135,11 @@ static void __exit ptp_kvm_exit(void)
static int __init ptp_kvm_init(void)
{
- long ret;
-
- if (!kvm_para_available())
- return -ENODEV;
-
- clock_pair_gpa = slow_virt_to_phys(&clock_pair);
- hv_clock = pvclock_get_pvti_cpu0_va();
+ int ret;
- if (!hv_clock)
- return -ENODEV;
-
- ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING, clock_pair_gpa,
- KVM_CLOCK_PAIRING_WALLCLOCK);
- if (ret == -KVM_ENOSYS || ret == -KVM_EOPNOTSUPP)
- return -ENODEV;
+ ret = kvm_arch_ptp_init();
+ if (!ret)
+ return -EOPNOTSUPP;
kvm_ptp_clock.caps = ptp_kvm_caps;
diff --git a/drivers/ptp/ptp_kvm_x86.c b/drivers/ptp/ptp_kvm_x86.c
new file mode 100644
index 000000000000..632733989d59
--- /dev/null
+++ b/drivers/ptp/ptp_kvm_x86.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Virtual PTP 1588 clock for use with KVM guests
+ *
+ * Copyright (C) 2017 Red Hat Inc.
+ */
+
+#include <asm/pvclock.h>
+#include <asm/kvmclock.h>
+#include <linux/module.h>
+#include <uapi/asm/kvm_para.h>
+#include <uapi/linux/kvm_para.h>
+#include <linux/ptp_clock_kernel.h>
+
+phys_addr_t clock_pair_gpa;
+struct kvm_clock_pairing clock_pair;
+struct pvclock_vsyscall_time_info *hv_clock;
+
+int kvm_arch_ptp_init(void)
+{
+ int ret;
+
+ if (!kvm_para_available())
+ return -ENODEV;
+
+ clock_pair_gpa = slow_virt_to_phys(&clock_pair);
+ hv_clock = pvclock_get_pvti_cpu0_va();
+ if (!hv_clock)
+ return -ENODEV;
+
+ ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING, clock_pair_gpa,
+ KVM_CLOCK_PAIRING_WALLCLOCK);
+ if (ret == -KVM_ENOSYS || ret == -KVM_EOPNOTSUPP)
+ return -ENODEV;
+
+ return 0;
+}
+
+int kvm_arch_ptp_get_clock(struct timespec64 *ts)
+{
+ long ret;
+
+ ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING,
+ clock_pair_gpa,
+ KVM_CLOCK_PAIRING_WALLCLOCK);
+ if (ret != 0)
+ return -EOPNOTSUPP;
+
+ ts->tv_sec = clock_pair.sec;
+ ts->tv_nsec = clock_pair.nsec;
+
+ return 0;
+}
+
+int kvm_arch_ptp_get_clock_fn(long *cycle, struct timespec64 *tspec,
+ struct clocksource **cs)
+{
+ unsigned long ret;
+ unsigned int version;
+ int cpu;
+ struct pvclock_vcpu_time_info *src;
+
+ cpu = smp_processor_id();
+ src = &hv_clock[cpu].pvti;
+
+ do {
+ /*
+ * We are using a TSC value read in the hosts
+ * kvm_hc_clock_pairing handling.
+ * So any changes to tsc_to_system_mul
+ * and tsc_shift or any other pvclock
+ * data invalidate that measurement.
+ */
+ version = pvclock_read_begin(src);
+
+ ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING,
+ clock_pair_gpa,
+ KVM_CLOCK_PAIRING_WALLCLOCK);
+ tspec->tv_sec = clock_pair.sec;
+ tspec->tv_nsec = clock_pair.nsec;
+ *cycle = __pvclock_read_cycles(src, clock_pair.tsc);
+ } while (pvclock_read_retry(src, version));
+
+ *cs = &kvm_clock;
+
+ return 0;
+}
diff --git a/include/asm-generic/ptp_kvm.h b/include/asm-generic/ptp_kvm.h
new file mode 100644
index 000000000000..208e842bfa64
--- /dev/null
+++ b/include/asm-generic/ptp_kvm.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Virtual PTP 1588 clock for use with KVM guests
+ *
+ * Copyright (C) 2019 ARM Ltd.
+ * All Rights Reserved
+ */
+
+int kvm_arch_ptp_init(void);
+int kvm_arch_ptp_get_clock(struct timespec64 *ts);
+int kvm_arch_ptp_get_clock_fn(long *cycle,
+ struct timespec64 *tspec, void *cs);
--
2.17.1
Hi Paolo,
> -----Original Message-----
> From: Paolo Bonzini <[email protected]>
> Sent: Wednesday, September 18, 2019 6:24 PM
> To: Jianyong Wu (Arm Technology China) <[email protected]>;
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Mark Rutland <[email protected]>; Will
> Deacon <[email protected]>; Suzuki Poulose
> <[email protected]>
> Cc: [email protected]; [email protected]; Steve Capper
> <[email protected]>; Kaly Xin (Arm Technology China)
> <[email protected]>; Justin He (Arm Technology China)
> <[email protected]>; nd <[email protected]>; linux-arm-
> [email protected]
> Subject: Re: [RFC PATCH v3 4/6] psci: Add hvc call service for ptp_kvm.
>
> On 18/09/19 11:57, Jianyong Wu (Arm Technology China) wrote:
> > Hi Paolo,
> >
> >> On 18/09/19 10:07, Jianyong Wu wrote:
> >>> + case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
> >>> + getnstimeofday(ts);
> >>
> >> This is not Y2038-safe. Please use ktime_get_real_ts64 instead, and
> >> split the 64-bit seconds value between val[0] and val[1].
> >>
> > As far as I know, y2038-safe will only affect signed 32-bit integer,
> > how does it affect 64-bit integer?
> > And why split 64-bit number into two blocks is necessary?
>
> val is an u32, not an u64. (And val[0], where you store the seconds, is best
> treated as signed, since val[0] == -1 is returned for
> SMCCC_RET_NOT_SUPPORTED).
>
Yeah, need consider twice.
Val[] should be long not u32 I think, so in arm64 I can avoid that Y2038_safe, but
also need rewrite for arm32.
> >> However, it seems to me that the new function is not needed and you
> >> can just use ktime_get_snapshot. You'll get the time in
> >> systime_snapshot->real and the cycles value in systime_snapshot->cycles.
> >
> > See patch 5/6, I need both counter cycle and clocksource,
> ktime_get_snapshot seems only offer cycles.
>
> No, patch 5/6 only needs the current clock (ptp_sc.cycles is never accessed).
> So you could just use READ_ONCE(tk->tkr_mono.clock).
>
Yeah, patch 5/6 just need clocksource, but I think tk->tkr_mono.clock can't read in external like module,
So I need an API to expose clocksource.
> However, even then I don't think it is correct to use ptp_sc.cs blindly in patch
> 5. I think there is a misunderstanding on the meaning of
> system_counterval.cs as passed to get_device_system_crosststamp.
> system_counterval.cs is not the active clocksource; it's the clocksource on
> which system_counterval.cycles is based.
>
I think we can use system_counterval_t as pass current clocksource to system_counterval_t.cs and its
corresponding cycles to system_counterval_t.cycles. is it a big problem?
> Hypothetically, the clocksource could be one for which ptp_sc.cycles is _not_
> a cycle value. If you set system_counterval.cs to the system clocksource,
> get_device_system_crosststamp will return a bogus value.
Yeah, but in patch 3/6, we have a corresponding pair of clock source and cycle value. So I think there will be no
that problem in this patch set.
In the implementation of get_device_system_crosststamp:
"
...
if (tk->tkr_mono.clock != system_counterval.cs)
return -ENODEV;
...
"
We need tk->tkr_mono.clock passed to get_device_system_crosststamp, just like patch 3/6 do, otherwise will return error.
> So system_counterval.cs should be set to something like
> &clocksource_counter (from drivers/clocksource/arm_arch_timer.c).
> Perhaps the right place to define kvm_arch_ptp_get_clock_fn is in that file?
>
I have checked that ptp_sc.cs is arch_sys_counter.
Also move the module API to arm_arch_timer.c will looks a little ugly and it's not easy to be accept by arm side I think.
> >>> + get_current_counterval(&sc);
> >>> + val[0] = ts->tv_sec;
> >>> + val[1] = ts->tv_nsec;
> >>> + val[2] = sc.cycles;
> >>> + val[3] = 0;
> >>> + break;
> >>
> >> This should return a guest-cycles value. If the cycles values always
> >> the same between the host and the guest on ARM, then okay. If not,
> >> you have to apply whatever offset exists.
> >>
> > In my opinion, when use ptp_kvm as clock sources to sync time between
> > host and guest, user should promise the guest and host has no clock
> > offset.
>
> What would be the adverse effect of having a fixed offset between guest
> and host? If there were one, you'd have to check that and fail the hypercall if
> there is an offset. But again, I think it's enough to subtract
> vcpu_vtimer(vcpu)->cntvoff or something like that.
>
Sure, counter offset should be considered.
> You also have to check here that the clocksource is based on the ARM
> architectural timer. Again, maybe you could place the implementation in
> drivers/clocksource/arm_arch_timer.c, and make it return -ENODEV if the
> active clocksource is not clocksource_counter. Then KVM can look for errors
> and return SMCCC_RET_NOT_SUPPORTED in that case.
I have checked it. The clock source is arch_sys_counter which is arm arch timer.
I can try to do that but I'm not sure arm side will be happy for that change.
Thanks
Jianyong Wu
>
> Thanks,
>
> Paolo
>
> > So we can be sure that the cycle between guest and host should be keep
> > consistent. But I need check it.
> > I think host cycle should be returned to guest as we should promise we
> > get clock and counter in the same time.
On 19/09/19 11:46, Jianyong Wu (Arm Technology China) wrote:
>> On 18/09/19 11:57, Jianyong Wu (Arm Technology China) wrote:
>>> Paolo Bonzini wrote:
>>>> This is not Y2038-safe. Please use ktime_get_real_ts64 instead, and
>>>> split the 64-bit seconds value between val[0] and val[1].
>
> Val[] should be long not u32 I think, so in arm64 I can avoid that Y2038_safe, but
> also need rewrite for arm32.
I don't think there's anything inherently wrong with u32 val[], and as
you notice it lets you reuse code between arm and arm64. It's up to you
and Marc to decide.
>>>> However, it seems to me that the new function is not needed and you
>>>> can just use ktime_get_snapshot. You'll get the time in
>>>> systime_snapshot->real and the cycles value in systime_snapshot->cycles.
>>>
>>> See patch 5/6, I need both counter cycle and clocksource,
>> ktime_get_snapshot seems only offer cycles.
>>
>> No, patch 5/6 only needs the current clock (ptp_sc.cycles is never accessed).
>> So you could just use READ_ONCE(tk->tkr_mono.clock).
>>
> Yeah, patch 5/6 just need clocksource, but I think tk->tkr_mono.clock can't read in external like module,
> So I need an API to expose clocksource.
>
>> However, even then I don't think it is correct to use ptp_sc.cs blindly in patch
>> 5. I think there is a misunderstanding on the meaning of
>> system_counterval.cs as passed to get_device_system_crosststamp.
>> system_counterval.cs is not the active clocksource; it's the clocksource on
>> which system_counterval.cycles is based.
>>
>
> I think we can use system_counterval_t as pass current clocksource to system_counterval_t.cs and its
> corresponding cycles to system_counterval_t.cycles. is it a big problem?
Yes, it is. Because...
>> Hypothetically, the clocksource could be one for which ptp_sc.cycles is _not_
>> a cycle value. If you set system_counterval.cs to the system clocksource,
>> get_device_system_crosststamp will return a bogus value.
>
> Yeah, but in patch 3/6, we have a corresponding pair of clock source and cycle value. So I think there will be no
> that problem in this patch set.
> In the implementation of get_device_system_crosststamp:
> "
> ...
> if (tk->tkr_mono.clock != system_counterval.cs)
> return -ENODEV;
> ...
> "
> We need tk->tkr_mono.clock passed to get_device_system_crosststamp, just like patch 3/6 do, otherwise will return error.
... if the hypercall returns an architectural timer value, you must not
pass tk->tkr.mono.clock to get_device_system_crosststamp: you must pass
&clocksource_counter. This way, PTP is disabled when using any other
clocksource.
>> So system_counterval.cs should be set to something like
>> &clocksource_counter (from drivers/clocksource/arm_arch_timer.c).
>> Perhaps the right place to define kvm_arch_ptp_get_clock_fn is in that file?
>>
> I have checked that ptp_sc.cs is arch_sys_counter.
> Also move the module API to arm_arch_timer.c will looks a little
> ugly and it's not easy to be accept by arm side I think.
I don't think it's ugly but more important, using tk->tkr_mono.clock is
incorrect. See how the x86 code hardcodes &kvm_clock, it's the same for
ARM.
>> You also have to check here that the clocksource is based on the ARM
>> architectural timer. Again, maybe you could place the implementation in
>> drivers/clocksource/arm_arch_timer.c, and make it return -ENODEV if the
>> active clocksource is not clocksource_counter. Then KVM can look for errors
>> and return SMCCC_RET_NOT_SUPPORTED in that case.
>
> I have checked it. The clock source is arch_sys_counter which is arm arch timer.
> I can try to do that but I'm not sure arm side will be happy for that change.
Just try. For my taste, it's nice to include both sides of the
hypercall in drivers/clocksource/arm_arch_timer.c, possibly
conditionalizing them on #ifdef CONFIG_KVM and #ifdef
CONFIG_PTP_1588_CLOCK_KVM. But there is an alternative which is simply
to export the clocksource struct. Both choices are easy to implement so
you can just ask the ARM people what they prefer and they can judge from
the code.
Paolo
On 19/09/19 13:39, Marc Zyngier wrote:
>> I don't think it's ugly but more important, using tk->tkr_mono.clock is
>> incorrect. See how the x86 code hardcodes &kvm_clock, it's the same for
>> ARM.
> Not really. The guest kernel is free to use any clocksource it wishes.
Understood, in fact it's the same on x86.
However, for PTP to work, the cycles value returned by the clocksource
must match the one returned by the hypercall. So for ARM
get_device_system_crosststamp must receive the arch timer clocksource,
so that it will return -ENODEV if the active clocksource is anything else.
Paolo
> In some cases, it is actually desirable (like these broken systems that
> cannot use an in-kernel irqchip...). Maybe it is that on x86 the guest
> only uses the kvm_clock, but that's a much harder sell on ARM. The fact
> that ptp_kvm assumes that the clocksource is fixed doesn't seem correct
> in that case.
On 19/09/2019 12:07, Paolo Bonzini wrote:
> On 19/09/19 11:46, Jianyong Wu (Arm Technology China) wrote:
>>> On 18/09/19 11:57, Jianyong Wu (Arm Technology China) wrote:
>>>> Paolo Bonzini wrote:
>>>>> This is not Y2038-safe. Please use ktime_get_real_ts64 instead, and
>>>>> split the 64-bit seconds value between val[0] and val[1].
>>
>> Val[] should be long not u32 I think, so in arm64 I can avoid that Y2038_safe, but
>> also need rewrite for arm32.
>
> I don't think there's anything inherently wrong with u32 val[], and as
> you notice it lets you reuse code between arm and arm64. It's up to you
> and Marc to decide.
>
>>>>> However, it seems to me that the new function is not needed and you
>>>>> can just use ktime_get_snapshot. You'll get the time in
>>>>> systime_snapshot->real and the cycles value in systime_snapshot->cycles.
>>>>
>>>> See patch 5/6, I need both counter cycle and clocksource,
>>> ktime_get_snapshot seems only offer cycles.
>>>
>>> No, patch 5/6 only needs the current clock (ptp_sc.cycles is never accessed).
>>> So you could just use READ_ONCE(tk->tkr_mono.clock).
>>>
>> Yeah, patch 5/6 just need clocksource, but I think tk->tkr_mono.clock can't read in external like module,
>> So I need an API to expose clocksource.
>>
>>> However, even then I don't think it is correct to use ptp_sc.cs blindly in patch
>>> 5. I think there is a misunderstanding on the meaning of
>>> system_counterval.cs as passed to get_device_system_crosststamp.
>>> system_counterval.cs is not the active clocksource; it's the clocksource on
>>> which system_counterval.cycles is based.
>>>
>>
>> I think we can use system_counterval_t as pass current clocksource to system_counterval_t.cs and its
>> corresponding cycles to system_counterval_t.cycles. is it a big problem?
>
> Yes, it is. Because...
>
>>> Hypothetically, the clocksource could be one for which ptp_sc.cycles is _not_
>>> a cycle value. If you set system_counterval.cs to the system clocksource,
>>> get_device_system_crosststamp will return a bogus value.
>>
>> Yeah, but in patch 3/6, we have a corresponding pair of clock source and cycle value. So I think there will be no
>> that problem in this patch set.
>> In the implementation of get_device_system_crosststamp:
>> "
>> ...
>> if (tk->tkr_mono.clock != system_counterval.cs)
>> return -ENODEV;
>> ...
>> "
>> We need tk->tkr_mono.clock passed to get_device_system_crosststamp, just like patch 3/6 do, otherwise will return error.
>
> ... if the hypercall returns an architectural timer value, you must not
> pass tk->tkr.mono.clock to get_device_system_crosststamp: you must pass
> &clocksource_counter. This way, PTP is disabled when using any other
> clocksource.
>
>>> So system_counterval.cs should be set to something like
>>> &clocksource_counter (from drivers/clocksource/arm_arch_timer.c).
>>> Perhaps the right place to define kvm_arch_ptp_get_clock_fn is in that file?
>>>
>> I have checked that ptp_sc.cs is arch_sys_counter.
>> Also move the module API to arm_arch_timer.c will looks a little
>> ugly and it's not easy to be accept by arm side I think.
>
> I don't think it's ugly but more important, using tk->tkr_mono.clock is
> incorrect. See how the x86 code hardcodes &kvm_clock, it's the same for
> ARM.
Not really. The guest kernel is free to use any clocksource it wishes.
In some cases, it is actually desirable (like these broken systems that
cannot use an in-kernel irqchip...). Maybe it is that on x86 the guest
only uses the kvm_clock, but that's a much harder sell on ARM. The fact
that ptp_kvm assumes that the clocksource is fixed doesn't seem correct
in that case.
M.
--
Jazz is not dead, it just smells funny...
Hi Paolo, Marc
> -----Original Message-----
> From: Paolo Bonzini <[email protected]>
> Sent: Thursday, September 19, 2019 7:07 PM
> To: Jianyong Wu (Arm Technology China) <[email protected]>;
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Mark Rutland <[email protected]>; Will
> Deacon <[email protected]>; Suzuki Poulose
> <[email protected]>
> Cc: [email protected]; [email protected]; Steve Capper
> <[email protected]>; Kaly Xin (Arm Technology China)
> <[email protected]>; Justin He (Arm Technology China)
> <[email protected]>; nd <[email protected]>; linux-arm-
> [email protected]
> Subject: Re: [RFC PATCH v3 4/6] psci: Add hvc call service for ptp_kvm.
>
> On 19/09/19 11:46, Jianyong Wu (Arm Technology China) wrote:
> >> On 18/09/19 11:57, Jianyong Wu (Arm Technology China) wrote:
> >>> Paolo Bonzini wrote:
> >>>> This is not Y2038-safe. Please use ktime_get_real_ts64 instead,
> >>>> and split the 64-bit seconds value between val[0] and val[1].
> >
> > Val[] should be long not u32 I think, so in arm64 I can avoid that
> > Y2038_safe, but also need rewrite for arm32.
>
> I don't think there's anything inherently wrong with u32 val[], and as you
> notice it lets you reuse code between arm and arm64. It's up to you and
> Marc to decide.
>
To compatible 32-bit, Integrates second value and nanosecond value as a nanosecond value then split it into val[0] and val[1] and split cycle value into val[2] and val[3],
In this way, time will overflow at Y2262.
WDYT?
Thanks
Jianyong Wu
Hi Marc, Paolo,
> -----Original Message-----
> From: Paolo Bonzini <[email protected]>
> Sent: Thursday, September 19, 2019 8:13 PM
> To: Marc Zyngier <[email protected]>; Jianyong Wu (Arm Technology China)
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Mark Rutland <[email protected]>; Will
> Deacon <[email protected]>; Suzuki Poulose
> <[email protected]>
> Cc: [email protected]; [email protected]; Steve Capper
> <[email protected]>; Kaly Xin (Arm Technology China)
> <[email protected]>; Justin He (Arm Technology China)
> <[email protected]>; nd <[email protected]>; linux-arm-
> [email protected]
> Subject: Re: [RFC PATCH v3 4/6] psci: Add hvc call service for ptp_kvm.
>
> On 19/09/19 13:39, Marc Zyngier wrote:
> >> I don't think it's ugly but more important, using tk->tkr_mono.clock
> >> is incorrect. See how the x86 code hardcodes &kvm_clock, it's the
> >> same for ARM.
> > Not really. The guest kernel is free to use any clocksource it wishes.
>
> Understood, in fact it's the same on x86.
>
> However, for PTP to work, the cycles value returned by the clocksource must
> match the one returned by the hypercall. So for ARM
> get_device_system_crosststamp must receive the arch timer clocksource, so
> that it will return -ENODEV if the active clocksource is anything else.
>
After day's reflection on this, I'm a little clear about this issue, let me clarify it.
I think there is three principles for this issue:
(1): guest and host can use different clocksouces as their current clocksouce at the same time
and we can't or it's not easy to probe that if they use the same one.
(2): the cycle value and the clocksouce which passed to get_device_system_crosststamp must be match.
(3): ptp_kvm target to increase the time sync precision so we should choose a high rate clocksource as ptp_kvm clocksource.
Base on (1) and (2) we can deduce that the ptp_kvm clocksource should be better a fixed one. So we can test if the cycle and
clocksource is match.
Base on (3) the arch_sys_timer should be chosen, as it's the best clocksource by far for arm.
Thanks
Jianyong Wu
> Paolo
>
> > In some cases, it is actually desirable (like these broken systems
> > that cannot use an in-kernel irqchip...). Maybe it is that on x86 the
> > guest only uses the kvm_clock, but that's a much harder sell on ARM.
> > The fact that ptp_kvm assumes that the clocksource is fixed doesn't
> > seem correct in that case.
On 23/09/19 06:57, Jianyong Wu (Arm Technology China) wrote:
>> On 19/09/19 11:46, Jianyong Wu (Arm Technology China) wrote:
>>>> On 18/09/19 11:57, Jianyong Wu (Arm Technology China) wrote:
>>>>> Paolo Bonzini wrote:
>>>>>> This is not Y2038-safe. Please use ktime_get_real_ts64 instead,
>>>>>> and split the 64-bit seconds value between val[0] and val[1].
>>>
>>> Val[] should be long not u32 I think, so in arm64 I can avoid that
>>> Y2038_safe, but also need rewrite for arm32.
>>
>> I don't think there's anything inherently wrong with u32 val[], and as you
>> notice it lets you reuse code between arm and arm64. It's up to you and
>> Marc to decide.
>>
> To compatible 32-bit, Integrates second value and nanosecond value as a nanosecond value then split it into val[0] and val[1] and split cycle value into val[2] and val[3],
> In this way, time will overflow at Y2262.
> WDYT?
So if I understand correctly you'd multiply by 10^9 (or better shift by
30) the nanoseconds.
That works, but why not provide 5 output registers? Alternatively, take
an address as input and write there.
Finally, on x86 we added an argument for the CLOCK_* that is being read
(currently only CLOCK_REALTIME, but having room for extensibility in the
API is always nice).
Paolo
Hi Paolo,
> -----Original Message-----
> From: Paolo Bonzini <[email protected]>
> Sent: Tuesday, September 24, 2019 10:20 PM
> To: Jianyong Wu (Arm Technology China) <[email protected]>;
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Mark Rutland <[email protected]>; Will
> Deacon <[email protected]>; Suzuki Poulose
> <[email protected]>
> Cc: [email protected]; [email protected]; Steve Capper
> <[email protected]>; Kaly Xin (Arm Technology China)
> <[email protected]>; Justin He (Arm Technology China)
> <[email protected]>; nd <[email protected]>; linux-arm-
> [email protected]
> Subject: Re: [RFC PATCH v3 4/6] psci: Add hvc call service for ptp_kvm.
>
> On 23/09/19 06:57, Jianyong Wu (Arm Technology China) wrote:
> >> On 19/09/19 11:46, Jianyong Wu (Arm Technology China) wrote:
> >>>> On 18/09/19 11:57, Jianyong Wu (Arm Technology China) wrote:
> >>>>> Paolo Bonzini wrote:
> >>>>>> This is not Y2038-safe. Please use ktime_get_real_ts64 instead,
> >>>>>> and split the 64-bit seconds value between val[0] and val[1].
> >>>
> >>> Val[] should be long not u32 I think, so in arm64 I can avoid that
> >>> Y2038_safe, but also need rewrite for arm32.
> >>
> >> I don't think there's anything inherently wrong with u32 val[], and
> >> as you notice it lets you reuse code between arm and arm64. It's up
> >> to you and Marc to decide.
> >>
> > To compatible 32-bit, Integrates second value and nanosecond value as
> > a nanosecond value then split it into val[0] and val[1] and split cycle value
> into val[2] and val[3], In this way, time will overflow at Y2262.
> > WDYT?
>
> So if I understand correctly you'd multiply by 10^9 (or better shift by
> 30) the nanoseconds.
>
Yeah,
> That works, but why not provide 5 output registers? Alternatively, take an
> address as input and write there.
It will be easy, if I could have expanded the store room. But these code is the infrastructure for hypercall, I can't change them at my will.
I think only value but pointer can delivered by smccc_set_retval call.
>
> Finally, on x86 we added an argument for the CLOCK_* that is being read
> (currently only CLOCK_REALTIME, but having room for extensibility in the API
> is always nice).
>
IMO, I will be limited by the design of hypercall on arm64, I can only design my code under it. maybe it will be better sometime but for now I could just obey it.
Thanks
Jianyong Wu
> Paolo
Hi Paolo,
> -----Original Message-----
> From: Paolo Bonzini <[email protected]>
> Sent: Thursday, September 19, 2019 8:13 PM
> To: Marc Zyngier <[email protected]>; Jianyong Wu (Arm Technology China)
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Mark Rutland <[email protected]>; Will
> Deacon <[email protected]>; Suzuki Poulose
> <[email protected]>
> Cc: [email protected]; [email protected]; Steve Capper
> <[email protected]>; Kaly Xin (Arm Technology China)
> <[email protected]>; Justin He (Arm Technology China)
> <[email protected]>; nd <[email protected]>; linux-arm-
> [email protected]
> Subject: Re: [RFC PATCH v3 4/6] psci: Add hvc call service for ptp_kvm.
>
> On 19/09/19 13:39, Marc Zyngier wrote:
> >> I don't think it's ugly but more important, using tk->tkr_mono.clock
> >> is incorrect. See how the x86 code hardcodes &kvm_clock, it's the
> >> same for ARM.
> > Not really. The guest kernel is free to use any clocksource it wishes.
>
> Understood, in fact it's the same on x86.
>
> However, for PTP to work, the cycles value returned by the clocksource must
> match the one returned by the hypercall. So for ARM
> get_device_system_crosststamp must receive the arch timer clocksource, so
> that it will return -ENODEV if the active clocksource is anything else.
>
As ptp_kvm clock has fixed to arm arch system counter in patch set v4, we need check if the current clocksource is system counter when return clock cycle in host,
so a helper needed to return the current clocksource.
Could I add this helper in next patch set?
Thanks
Jianyong wu
> Paolo
>
> > In some cases, it is actually desirable (like these broken systems
> > that cannot use an in-kernel irqchip...). Maybe it is that on x86 the
> > guest only uses the kvm_clock, but that's a much harder sell on ARM.
> > The fact that ptp_kvm assumes that the clocksource is fixed doesn't
> > seem correct in that case.
On 09/10/19 07:21, Jianyong Wu (Arm Technology China) wrote:
> As ptp_kvm clock has fixed to arm arch system counter in patch set
> v4, we need check if the current clocksource is system counter when
> return clock cycle in host, so a helper needed to return the current
> clocksource. Could I add this helper in next patch set?
You don't need a helper. You need to return the ARM arch counter
clocksource in the struct system_counterval_t that you return.
get_device_system_crosststamp will then check that the clocksource
matches the active one.
Paolo
Hi Paolo,
> -----Original Message-----
> From: Paolo Bonzini <[email protected]>
> Sent: Wednesday, October 9, 2019 2:36 PM
> To: Jianyong Wu (Arm Technology China) <[email protected]>; Marc
> Zyngier <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Mark Rutland <[email protected]>; Will
> Deacon <[email protected]>; Suzuki Poulose
> <[email protected]>
> Cc: [email protected]; [email protected]; Steve Capper
> <[email protected]>; Kaly Xin (Arm Technology China)
> <[email protected]>; Justin He (Arm Technology China)
> <[email protected]>; nd <[email protected]>; linux-arm-
> [email protected]
> Subject: Re: [RFC PATCH v3 4/6] psci: Add hvc call service for ptp_kvm.
>
> On 09/10/19 07:21, Jianyong Wu (Arm Technology China) wrote:
> > As ptp_kvm clock has fixed to arm arch system counter in patch set v4,
> > we need check if the current clocksource is system counter when return
> > clock cycle in host, so a helper needed to return the current
> > clocksource. Could I add this helper in next patch set?
>
> You don't need a helper. You need to return the ARM arch counter
> clocksource in the struct system_counterval_t that you return.
> get_device_system_crosststamp will then check that the clocksource
> matches the active one.
>
We must ensure both of the host and guest using the same clocksource.
get_device_system_crosststamp will check the clocksource of guest and we also need check
the clocksource in host, and struct type can't be transferred from host to guest using arm hypercall.
now we lack of a mechanism to check the current clocksource. I think this will be useful if we add one.
Thanks
Jianyong Wu
> Paolo
On 09/10/19 10:18, Jianyong Wu (Arm Technology China) wrote:
> Hi Paolo,
>
>> -----Original Message-----
>> From: Paolo Bonzini <[email protected]>
>> Sent: Wednesday, October 9, 2019 2:36 PM
>> To: Jianyong Wu (Arm Technology China) <[email protected]>; Marc
>> Zyngier <[email protected]>; [email protected]; [email protected];
>> [email protected]; [email protected]; [email protected];
>> [email protected]; Mark Rutland <[email protected]>; Will
>> Deacon <[email protected]>; Suzuki Poulose
>> <[email protected]>
>> Cc: [email protected]; [email protected]; Steve Capper
>> <[email protected]>; Kaly Xin (Arm Technology China)
>> <[email protected]>; Justin He (Arm Technology China)
>> <[email protected]>; nd <[email protected]>; linux-arm-
>> [email protected]
>> Subject: Re: [RFC PATCH v3 4/6] psci: Add hvc call service for ptp_kvm.
>>
>> On 09/10/19 07:21, Jianyong Wu (Arm Technology China) wrote:
>>> As ptp_kvm clock has fixed to arm arch system counter in patch set v4,
>>> we need check if the current clocksource is system counter when return
>>> clock cycle in host, so a helper needed to return the current
>>> clocksource. Could I add this helper in next patch set?
>>
>> You don't need a helper. You need to return the ARM arch counter
>> clocksource in the struct system_counterval_t that you return.
>> get_device_system_crosststamp will then check that the clocksource
>> matches the active one.
>
> We must ensure both of the host and guest using the same clocksource.
> get_device_system_crosststamp will check the clocksource of guest and we also need check
> the clocksource in host, and struct type can't be transferred from host to guest using arm hypercall.
> now we lack of a mechanism to check the current clocksource. I think this will be useful if we add one.
Got it---yes, I think adding a struct clocksource to struct
system_time_snapshot would make sense. Then the hypercall can just use
ktime_get_snapshot and fail if the clocksource is not the ARM arch counter.
John (Stultz), does that sound good to you? The context is that
Jianyong would like to add a hypercall that returns a (cycles,
nanoseconds) pair to the guest. On x86 we're relying on the vclock_mode
field that is already there for the vDSO, but being able to just use
ktime_get_snapshot would be much nicer.
Paolo
On Wed, Oct 9, 2019 at 2:13 AM Paolo Bonzini <[email protected]> wrote:
> On 09/10/19 10:18, Jianyong Wu (Arm Technology China) wrote:
> >
> > We must ensure both of the host and guest using the same clocksource.
> > get_device_system_crosststamp will check the clocksource of guest and we also need check
> > the clocksource in host, and struct type can't be transferred from host to guest using arm hypercall.
> > now we lack of a mechanism to check the current clocksource. I think this will be useful if we add one.
>
> Got it---yes, I think adding a struct clocksource to struct
> system_time_snapshot would make sense. Then the hypercall can just use
> ktime_get_snapshot and fail if the clocksource is not the ARM arch counter.
>
> John (Stultz), does that sound good to you? The context is that
> Jianyong would like to add a hypercall that returns a (cycles,
> nanoseconds) pair to the guest. On x86 we're relying on the vclock_mode
> field that is already there for the vDSO, but being able to just use
> ktime_get_snapshot would be much nicer.
I've not really looked at the code closely in awhile, so I'm not sure
my suggestions will be too useful.
My only instinct is maybe to not include the clocksource pointer in
the system_time_snapshot, as I worry that structure will then be
abused by the interface users. If you're just wanting to make sure
the clocksource is what you're expecting, would instead putting only
the clocksource name in the structure suffice?
thanks
-john
On 09/10/19 18:05, John Stultz wrote:
> On Wed, Oct 9, 2019 at 2:13 AM Paolo Bonzini <[email protected]> wrote:
>> John (Stultz), does that sound good to you? The context is that
>> Jianyong would like to add a hypercall that returns a (cycles,
>> nanoseconds) pair to the guest. On x86 we're relying on the vclock_mode
>> field that is already there for the vDSO, but being able to just use
>> ktime_get_snapshot would be much nicer.
>
> I've not really looked at the code closely in awhile, so I'm not sure
> my suggestions will be too useful.
>
> My only instinct is maybe to not include the clocksource pointer in
> the system_time_snapshot, as I worry that structure will then be
> abused by the interface users. If you're just wanting to make sure
> the clocksource is what you're expecting, would instead putting only
> the clocksource name in the structure suffice?
Well, it would suffice but it would be quite ugly to do a string
comparison later.
What kind of abuse are you thinking of? We already have struct
system_counterval_t for a clocksource+cycles tuple, so it seemed obvious
to me to make system_time_snapshot a superset of it... In fact,
system_time_snapshot's cycles member is even unused currently, so it
could even be easily replaced by a struct system_counterval_t, instead
of adding an extra field.
Paolo
Hi Paolo,
> -----Original Message-----
> From: Paolo Bonzini <[email protected]>
> Sent: Wednesday, October 9, 2019 5:13 PM
> To: Jianyong Wu (Arm Technology China) <[email protected]>; Marc
> Zyngier <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Mark Rutland <[email protected]>; Will
> Deacon <[email protected]>; Suzuki Poulose
> <[email protected]>
> Cc: [email protected]; [email protected]; Steve Capper
> <[email protected]>; Kaly Xin (Arm Technology China)
> <[email protected]>; Justin He (Arm Technology China)
> <[email protected]>; nd <[email protected]>; linux-arm-
> [email protected]
> Subject: Re: [RFC PATCH v3 4/6] psci: Add hvc call service for ptp_kvm.
>
> On 09/10/19 10:18, Jianyong Wu (Arm Technology China) wrote:
> > Hi Paolo,
> >
> >> -----Original Message-----
> >> From: Paolo Bonzini <[email protected]>
> >> Sent: Wednesday, October 9, 2019 2:36 PM
> >> To: Jianyong Wu (Arm Technology China) <[email protected]>; Marc
> >> Zyngier <[email protected]>; [email protected];
> [email protected];
> >> [email protected]; [email protected];
> >> [email protected]; [email protected]; Mark
> >> Rutland <[email protected]>; Will Deacon
> <[email protected]>;
> >> Suzuki Poulose <[email protected]>
> >> Cc: [email protected]; [email protected]; Steve Capper
> >> <[email protected]>; Kaly Xin (Arm Technology China)
> >> <[email protected]>; Justin He (Arm Technology China)
> >> <[email protected]>; nd <[email protected]>; linux-arm-
> >> [email protected]
> >> Subject: Re: [RFC PATCH v3 4/6] psci: Add hvc call service for ptp_kvm.
> >>
> >> On 09/10/19 07:21, Jianyong Wu (Arm Technology China) wrote:
> >>> As ptp_kvm clock has fixed to arm arch system counter in patch set
> >>> v4, we need check if the current clocksource is system counter when
> >>> return clock cycle in host, so a helper needed to return the current
> >>> clocksource. Could I add this helper in next patch set?
> >>
> >> You don't need a helper. You need to return the ARM arch counter
> >> clocksource in the struct system_counterval_t that you return.
> >> get_device_system_crosststamp will then check that the clocksource
> >> matches the active one.
> >
> > We must ensure both of the host and guest using the same clocksource.
> > get_device_system_crosststamp will check the clocksource of guest and
> > we also need check the clocksource in host, and struct type can't be
> transferred from host to guest using arm hypercall.
> > now we lack of a mechanism to check the current clocksource. I think this
> will be useful if we add one.
>
> Got it---yes, I think adding a struct clocksource to struct
> system_time_snapshot would make sense. Then the hypercall can just use
> ktime_get_snapshot and fail if the clocksource is not the ARM arch counter.
>
> John (Stultz), does that sound good to you? The context is that Jianyong
> would like to add a hypercall that returns a (cycles,
> nanoseconds) pair to the guest. On x86 we're relying on the vclock_mode
> field that is already there for the vDSO, but being able to just use
> ktime_get_snapshot would be much nicer.
>
Could I add struct clocksource to system_time_snapshot struct in next version of my patch set?
Jianyong Wu
Thanks
> Paolo
On 14/10/19 07:50, Jianyong Wu (Arm Technology China) wrote:
>>
>> John (Stultz), does that sound good to you? The context is that Jianyong
>> would like to add a hypercall that returns a (cycles,
>> nanoseconds) pair to the guest. On x86 we're relying on the vclock_mode
>> field that is already there for the vDSO, but being able to just use
>> ktime_get_snapshot would be much nicer.
>>
> Could I add struct clocksource to system_time_snapshot struct in next version of my patch set?
Yes, I say go ahead. At least it will get closer to a final design.
Paolo