Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752101AbdI0U6Y (ORCPT ); Wed, 27 Sep 2017 16:58:24 -0400
Received: from aserp1040.oracle.com ([141.146.126.69]:21008 "EHLO
	aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751952AbdI0U6W (ORCPT ); Wed, 27 Sep 2017 16:58:22 -0400
Subject: Re: [PATCH v4 2/3] x86/xen/time: setup vcpu 0 time info page
To: Boris Ostrovsky
References: <20170927134623.3147-1-joao.m.martins@oracle.com>
	<20170927134623.3147-3-joao.m.martins@oracle.com>
	<0f58594c-ee42-7813-a0c2-1cbbc1e2a576@oracle.com>
Cc: linux-kernel@vger.kernel.org, xen-devel@lists.xenproject.org,
	Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org,
	Juergen Gross , Andy Lutomirski
From: Joao Martins
Message-ID: <2fb49000-5f07-8cb9-f16d-1701c36b6c49@oracle.com>
Date: Wed, 27 Sep 2017 21:57:25 +0100
MIME-Version: 1.0
In-Reply-To: <0f58594c-ee42-7813-a0c2-1cbbc1e2a576@oracle.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-Source-IP: userv0021.oracle.com [156.151.31.71]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3510
Lines: 88

On 09/27/2017 09:22 PM, Boris Ostrovsky wrote:
> On 09/27/2017 11:26 AM, Joao Martins wrote:
>> On 09/27/2017 03:40 PM, Boris Ostrovsky wrote:
>>>> +static void xen_setup_vsyscall_time_info(void)
>>>> +{
>>>> +	struct vcpu_register_time_memory_area t;
>>>> +	struct pvclock_vsyscall_time_info *ti;
>>>> +	struct pvclock_vcpu_time_info *pvti;
>>>> +	int ret;
>>>> +
>>>> +	pvti = &__this_cpu_read(xen_vcpu)->time;
>>>> +
>>>> +	/*
>>>> +	 * We check ahead on the primary time info if this
>>>> +	 * bit is supported hence speeding up Xen clocksource.
>>>> +	 */
>>>> +	if (!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))
>>>> +		return;
>>>> +
>>>> +	pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
>>> Is it OK to have this flag set if anything below fails?
>>>
>> Yes - if anything below fails it will only affect userspace mapped page.
>
> Then should it be set somewhere else, like in xen_time_init()?
>
Hm, I could move it if you think it's better - but given the importance of
the bit we are checking, and its direct correlation to whether or not we can
set up VCLOCK_PVCLOCK, I find it cleaner to have it here in the same routine.

One thing I failed to mention before is that checking ahead like above also
lets us avoid allocating a page, plus a hypercall to register the pvti, just
to check the one bit of info we need for using VCLOCK_PVCLOCK.

It is very unlikely with current Xen code that 1) registering the secondary
copy below fails, or 2) master and secondary don't have the same bits set. So
in case you're reconsidering the "shortcut" check above, I can move it like we
had in v1 and have pvclock_set_flags() right before
pvclock_set_pvti_cpu0_va().

>> What I
>> do above is just allowing xen clocksource to use/check that bit (consequently
>> speeding up sched_clock) given the necessary support is there in the master
>> copy. The secondary copy (i.e. what's being set up below, mapped/used in vdso)
>> has the same data from the master copy, just separate memory regions. The checks
>> below are just for the unlikely cases of failing to register the secondary copy
>> or if its content were to differ from master copy in future releases - and
>> therefore we handle those more gracefully.
>>
>>> (I can see in the changelog that apparently at some point I've asked
>>> about this at v1 but I can't remember/find what exactly it was)
>>>
>>>> +
>>>> +	ti = (struct pvclock_vsyscall_time_info *)get_zeroed_page(GFP_KERNEL);
>>>> +	if (!ti)
>>>> +		return;
>>>> +
>>>> +	t.addr.v = &ti->pvti;
>>>> +
>>>> +	ret = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_time_memory_area, 0, &t);
>>>> +	if (ret) {
>>>> +		pr_notice("xen: VCLOCK_PVCLOCK not supported (err %d)\n", ret);
>>>> +		free_page((unsigned long)ti);
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * If the check above succeeded this one should too since it's the
>>>> +	 * same data on both primary and secondary time infos just different
>>>> +	 * memory regions. But we still check it in case hypervisor is buggy.
>>>> +	 */
>>>> +	pvti = &ti->pvti;
>>>> +	if (!(pvti->flags & PVCLOCK_TSC_STABLE_BIT)) {
>>>> +		t.addr.v = NULL;
>>>> +		ret = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_time_memory_area,
>>>> +					 0, &t);
>>>> +		if (!ret)
>>>> +			free_page((unsigned long)ti);
>>>> +
>>>> +		pr_notice("xen: VCLOCK_PVCLOCK not supported (tsc unstable)\n");
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	xen_clock = ti;
>>>> +	pvclock_set_pvti_cpu0_va(xen_clock);
>>>> +
>>>> +	xen_clocksource.archdata.vclock_mode = VCLOCK_PVCLOCK;
>>>> +}
>>>> +
>
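
To make the two orderings discussed above easier to compare, here is a rough
userspace mock. It is only a sketch and not kernel code: every identifier in
it (mock_env, mock_register(), setup_check_ahead(), setup_check_late() and so
on) is made up for illustration, the fake hypervisor is just a struct with
counters, and the "late" variant is only my reading of "set the flag right
before pvclock_set_pvti_cpu0_va()", not a copy of v1.

```c
/* Userspace mock, NOT kernel code: every identifier here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MOCK_TSC_STABLE_BIT 0x1u

struct mock_pvti {
	unsigned int flags;
};

struct mock_env {
	struct mock_pvti primary;	/* stands in for the per-vcpu master copy */
	unsigned int secondary_flags;	/* flags the fake hypervisor reports */
	int register_err;		/* nonzero makes registration fail */
	int pages;			/* allocations performed */
	int hypercalls;			/* fake hypercalls issued */
	bool clock_flag_set;		/* stands in for pvclock_set_flags() */
	bool vclock_enabled;		/* stands in for VCLOCK_PVCLOCK */
};

/* Fake "register time memory area" call; area == NULL unregisters. */
int mock_register(struct mock_env *env, struct mock_pvti *area)
{
	env->hypercalls++;
	if (env->register_err)
		return env->register_err;
	if (area)
		area->flags = env->secondary_flags;
	return 0;
}

/* Common tail: allocate, register, re-check. Returns true on success. */
bool mock_setup_secondary(struct mock_env *env)
{
	struct mock_pvti *ti = calloc(1, sizeof(*ti));
	bool ok;

	if (!ti)
		return false;
	env->pages++;

	if (mock_register(env, ti)) {
		free(ti);
		return false;
	}

	ok = ti->flags & MOCK_TSC_STABLE_BIT;
	if (!ok)
		mock_register(env, NULL);	/* unregister before freeing */

	free(ti);	/* the mock always frees; real code keeps the page on success */
	return ok;
}

/* Ordering as in v4: check the primary copy first, set the flag early. */
void setup_check_ahead(struct mock_env *env)
{
	if (!(env->primary.flags & MOCK_TSC_STABLE_BIT))
		return;		/* no page allocated, no hypercall issued */

	env->clock_flag_set = true;
	env->vclock_enabled = mock_setup_secondary(env);
}

/* Alternative: no shortcut, flag set only once the secondary copy is good. */
void setup_check_late(struct mock_env *env)
{
	if (!mock_setup_secondary(env))
		return;

	env->clock_flag_set = true;
	env->vclock_enabled = true;
}

/* Runs both orderings on a host without the stable bit and compares cost. */
void compare_orderings(void)
{
	struct mock_env a = { 0 }, b = { 0 };

	setup_check_ahead(&a);
	setup_check_late(&b);

	assert(a.pages == 0 && a.hypercalls == 0);
	assert(b.pages == 1 && b.hypercalls == 2);
	assert(!a.vclock_enabled && !b.vclock_enabled);
}
```

Calling compare_orderings() from a small main() exercises the case where the
stable bit is missing: the check-ahead variant returns without touching the
allocator or the fake hypervisor, while the late variant pays for one page
and two calls before giving up. With register_err set and the bit present on
the primary copy, setup_check_ahead() leaves clock_flag_set true and
vclock_enabled false, which is the "only affects the userspace mapped page"
outcome described above.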