Subject: Re: [PATCH v4 00/10] make L2's kvm-clock stable, get rid of pvclock_gtod_copy in KVM
To: Paolo Bonzini, Thomas Gleixner
Cc: John Stultz, Radim Krcmar, kvm list, Ingo Molnar, "H. Peter Anvin", lkml, x86@kernel.org, rkagan@virtuozzo.com, den@virtuozzo.com, Marcelo Tosatti, Peter Zijlstra
References: <1501684690-211093-1-git-send-email-dplotnikov@virtuozzo.com> <894362115.582988.1503435653874.JavaMail.zimbra@redhat.com> <43d87f0d-62c3-878e-108a-aaf7fb68fcb3@redhat.com>
In-Reply-To: <43d87f0d-62c3-878e-108a-aaf7fb68fcb3@redhat.com>
From: Denis Plotnikov
Date: Mon, 28 Aug 2017 10:28:06 +0300
X-Mailing-List: linux-kernel@vger.kernel.org

On 24.08.2017 11:00, Paolo Bonzini wrote:
> On 23/08/2017 18:02, Paolo Bonzini wrote:
>>
>> More duct tape would have been just:
>>
>> -	if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
>> +	mode = READ_ONCE(pvclock_gtod_data.clock.vclock_mode);
>> +	if (mode != VCLOCK_TSC &&
>> +	    (mode != VCLOCK_PVCLOCK || !pvclock_nested_virt_magic())
>>  		return false;
>>
>> -	return do_realtime(ts, cycle_now) == VCLOCK_TSC;
>> +	switch (mode) {
>> +	case VCLOCK_TSC:
>> +		return do_realtime_tsc(ts, cycle_now);
>> +	case VCLOCK_PVCLOCK:
>> +		return do_realtime_pvclock(ts, cycle_now);
>> +	}
>>
>> Nested virtualization does need a clocksource change notifier on top,
>> but we can cross that bridge later.  Maybe Denis can post just those
>> patches to begin with.
>
> For what it's worth, this is all that's needed (with patches 1-2-3-4-5-7)
> to support kvmclock on top of Hyper-V clock.  It's trivial.
>
> Even if we could add paravirtualization magic to KVM live migration, we
> certainly couldn't do that for other hypervisors.
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 5b882cc0c0e9..3bab935b021a 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -46,10 +46,24 @@ static u64 read_hv_clock_tsc(struct clocksource *arg)
>  	return current_tick;
>  }
>
> +static bool read_hv_clock_tsc_with_stamp(struct clocksource *arg,
> +					 u64 *cycles, u64 *cycles_stamp)
> +{
> +	*cycles = __hv_read_tsc_page(tsc_pg, &cycles_stamp);
> +
> +	if (*cycles == U64_MAX) {
> +		*cycles = rdmsrl(HV_X64_MSR_TIME_REF_COUNT);
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
>  static struct clocksource hyperv_cs_tsc = {
>  	.name = "hyperv_clocksource_tsc_page",
>  	.rating = 400,
>  	.read = read_hv_clock_tsc,
> +	.read_with_stamp = read_hv_clock_tsc_with_stamp,
>  	.mask = CLOCKSOURCE_MASK(64),
>  	.flags = CLOCK_SOURCE_IS_CONTINUOUS,
>  };
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index 2b58c8c1eeaa..5aff66e9fff7 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -176,9 +176,9 @@ void hyperv_cleanup(void);
>  #endif
>  #ifdef CONFIG_HYPERV_TSCPAGE
>  struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
> -static inline u64 hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
> +static inline u64 __hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg, u64 *cur_tsc)
>  {
> -	u64 scale, offset, cur_tsc;
> +	u64 scale, offset;
>  	u32 sequence;
>
>  	/*
> @@ -209,7 +209,7 @@ static inline u64 hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
>
>  		scale = READ_ONCE(tsc_pg->tsc_scale);
>  		offset = READ_ONCE(tsc_pg->tsc_offset);
> -		cur_tsc = rdtsc_ordered();
> +		*cur_tsc = rdtsc_ordered();
>
>  		/*
>  		 * Make sure we read sequence after we read all other values
> @@ -219,9 +219,14 @@ static inline u64 hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
>
>  	} while (READ_ONCE(tsc_pg->tsc_sequence) != sequence);
>
> -	return mul_u64_u64_shr(cur_tsc, scale, 64) + offset;
> +	return mul_u64_u64_shr(*cur_tsc, scale, 64) + offset;
>  }
>
> +static inline u64 hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
> +{
> +	u64 cur_tsc;
> +	return __hv_read_tsc_page(tsc_pg, &cur_tsc);
> +}
>  #else
>  static inline struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
>  {
>
>
> Denis, could you try redoing patch 7 to use the pvclock_gtod_notifier
> instead of the new one you're adding, and only send that first part?  I
> think it's a worthwhile cleanup anyway, so let's start with that.
>
> Paolo
>

Ok, I'll do that

Denis
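
For readers following the thread, here is a minimal userspace sketch of the read-with-stamp idea in the quoted patch: one sequence-checked read yields both the converted clock value and the raw TSC value it was derived from, and the caller learns when it had to fall back to a slower source. This is an illustration under stated assumptions, not the kernel code. `struct tsc_page`, `fake_tsc`, `fake_ref_counter`, `read_page()`, `read_with_stamp()` and `demo()` are hypothetical stand-ins for the kernel structures, `rdtsc_ordered()` and the reference-counter MSR read, and the sketch omits the `READ_ONCE()` and barrier handling that the real reader relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* GCC/Clang extension, used here to emulate mul_u64_u64_shr(a, b, 64). */
__extension__ typedef unsigned __int128 u128;

/* Hypothetical stand-in for struct ms_hyperv_tsc_page. */
struct tsc_page {
	volatile uint32_t sequence;	/* 0: page contents are not valid */
	volatile uint64_t scale;
	volatile uint64_t offset;
};

/* Hypothetical stand-ins for rdtsc_ordered() and the reference-counter MSR. */
static uint64_t fake_tsc = 1000;
static uint64_t fake_ref_counter = 42;

static uint64_t mul_shr64(uint64_t a, uint64_t b)
{
	return (uint64_t)(((u128)a * b) >> 64);
}

/*
 * Retry until the sequence is unchanged across the reads, and hand the raw
 * TSC value used for the conversion back through *cur_tsc.
 */
static uint64_t read_page(const struct tsc_page *pg, uint64_t *cur_tsc)
{
	uint64_t scale, offset;
	uint32_t seq;

	do {
		seq = pg->sequence;
		if (!seq)
			return UINT64_MAX;	/* caller must fall back */
		scale = pg->scale;
		offset = pg->offset;
		*cur_tsc = fake_tsc;
	} while (pg->sequence != seq);

	return mul_shr64(*cur_tsc, scale) + offset;
}

/* Returns true only when *cycles was derived from the TSC value in *stamp. */
static bool read_with_stamp(const struct tsc_page *pg,
			    uint64_t *cycles, uint64_t *stamp)
{
	assert(cycles && stamp);

	*cycles = read_page(pg, stamp);	/* stamp is already a pointer */
	if (*cycles == UINT64_MAX) {
		*cycles = fake_ref_counter;	/* slow path, no usable stamp */
		return false;
	}
	return true;
}

/* Prints one fast-path read; call it from main() to try the sketch. */
static void demo(void)
{
	struct tsc_page pg = { .sequence = 2, .scale = 1ULL << 63, .offset = 7 };
	uint64_t cycles, stamp;

	if (read_with_stamp(&pg, &cycles, &stamp))
		printf("cycles=%llu stamp=%llu\n",
		       (unsigned long long)cycles, (unsigned long long)stamp);
}
```

With the values in `demo()`, the scale of 2^63 halves the TSC value, so the fast path reports cycles=507 for stamp=1000. Setting `sequence` to 0 exercises the fallback, where the function returns false and the stamp should not be trusted.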