Received: by 2002:a05:6a10:f347:0:0:0:0 with SMTP id d7csp3673926pxu; Mon, 30 Nov 2020 08:02:49 -0800 (PST) X-Google-Smtp-Source: ABdhPJyJwjIPCPgvRB7ygbmpTAglrjVMvEPEw16XNBOTtEf0ZYdLzJfJcY6wL6PkxFeT0lBrAtZz X-Received: by 2002:aa7:d1c2:: with SMTP id g2mr16835139edp.8.1606752169527; Mon, 30 Nov 2020 08:02:49 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1606752169; cv=none; d=google.com; s=arc-20160816; b=XBy/pGuuVD2NLXWV5G+pxEPMpmAOk6D/9MEjKyVy54sZK8laJcAhSba1wMcSNJg1h/ YlZprxRF0nsUr1u9t7xF7h1ySgshqXqG45wBwUW9DS6ti/fbbq3iRs0oZzGCSmls3BcO 0KFcF2XihXnOWKI85T8TKEWyAMaU8XP8EJADtZQUciJLGrvvwQQlb8YrWUZ7ZB3WCi8X YdVnDE+M9pDDbOGkiGZF0kDsXHXNlZ2e+elqsKPe4oM7vth+fp/3Xr3paXSrl+L2P7Cy XGokRNkKSK66+ClOqrpcSioy9R48yIWdsQ9fAbIxNkRVOVktTaKs1+1eEZ+Zz377IId8 e47A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:date:cc:to:from:subject :message-id:dkim-signature; bh=WYyaNMWXIzphPUXACrUhTp271jhjXQRzrhohRKC0mcw=; b=KhVLFGSq9NYzpefQuTzD6L87FpyOfv6uTi4qi6o9lvpoMF7tj16soEliKUeOyzQ+RB HS9XYcml/ovw3wDVCIfgz8csIc58P3JqcbHgW05iVc3ZtjSKUzIt1I8Ys0RSKrrsZMQO sCHOYuD32UIk4NTpsRgQA5LqDFMgPQnZZXNB2XxuJ2dKhskUYN41Whura9EN7l66G/4X Dct3bT0hekCBrsrv8lyjnRxBZIa8hLOsPPQ/aG9WYQzwEYtVKydRDgJzvQqwlSzSq8mq xUIFGQvLhw6F6B/VdeWBjG6YB6/NeVHhmbxAzUos+3OFDY0gI9tETXXECVG8BQg+24ip UAOg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=ZnGlVYlK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id y23si10803739eju.563.2020.11.30.08.02.25; Mon, 30 Nov 2020 08:02:49 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=ZnGlVYlK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727392AbgK3QAe (ORCPT + 99 others); Mon, 30 Nov 2020 11:00:34 -0500 Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:45569 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726860AbgK3QAe (ORCPT ); Mon, 30 Nov 2020 11:00:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1606751947; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WYyaNMWXIzphPUXACrUhTp271jhjXQRzrhohRKC0mcw=; b=ZnGlVYlKXrvHMy/0o2lzr55LfGCu3NWWmnjtF45eFF9aV+X5t4tD3VUnAePVy6vZPowhyH kZ8pIkfC7BEQUxDutfCUpn9aLvb1PljUBmqTF8Xxcvbispq7mhMiQnKWB0w2w6kylZWn+c OXFasvHOFUzopG++XuhfFBdW2YRfLcI= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-444-PEEJD8uZNkCkGJpbLxvTaw-1; Mon, 30 Nov 2020 10:59:03 -0500 X-MC-Unique: PEEJD8uZNkCkGJpbLxvTaw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id B30025708B; Mon, 30 Nov 2020 15:59:01 +0000 (UTC) Received: from starship (unknown [10.35.206.90]) by smtp.corp.redhat.com (Postfix) with ESMTP id F15B05D9C0; Mon, 30 Nov 2020 15:58:55 +0000 (UTC) Message-ID: Subject: Re: [PATCH 1/2] KVM: x86: implement KVM_SET_TSC_PRECISE/KVM_GET_TSC_PRECISE From: Maxim Levitsky To: Paolo Bonzini , kvm@vger.kernel.org Cc: Oliver Upton , Ingo Molnar , Sean Christopherson , Thomas Gleixner , open list , Marcelo Tosatti , Jonathan Corbet , Wanpeng Li , Borislav Petkov , Jim Mattson , "H. Peter Anvin" , "open list:DOCUMENTATION" , Joerg Roedel , "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" , Vitaly Kuznetsov Date: Mon, 30 Nov 2020 17:58:54 +0200 In-Reply-To: <38602ef4-7ecf-a5fd-6db9-db86e8e974e4@redhat.com> References: <20201130133559.233242-1-mlevitsk@redhat.com> <20201130133559.233242-2-mlevitsk@redhat.com> <38602ef4-7ecf-a5fd-6db9-db86e8e974e4@redhat.com> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.36.3 (3.36.3-1.fc32) MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 2020-11-30 at 15:33 +0100, Paolo Bonzini wrote: > On 30/11/20 14:35, Maxim Levitsky wrote: > > + if (guest_cpuid_has(vcpu, X86_FEATURE_TSC_ADJUST)) { > > + tsc_state.tsc_adjust = vcpu->arch.ia32_tsc_adjust_msr; > > + tsc_state.flags |= KVM_TSC_STATE_TSC_ADJUST_VALID; > > + } > > This is mostly useful for userspace that doesn't disable the quirk, right? Isn't this the opposite? If I understand the original proposal correctly, the reason that we include the TSC_ADJUST in the new ioctl, is that we would like to disable the special kvm behavior (that is disable the quirk), which would mean that tsc will jump on regular host initiated TSC_ADJUST write. To avoid this, userspace would set TSC_ADJUST through this new interface. Note that I haven't yet disabled the quirk in the patches I posted to the qemu, because we need some infrastructure to manage which quirks we want to disable in qemu (That is, KVM_ENABLE_CAP is as I understand write only, so I can't just disable KVM_X86_QUIRK_TSC_HOST_ACCESS, in the code that enables x-precise-tsc in qemu). > > > + kvm_get_walltime(&wall_nsec, &host_tsc); > > + diff = wall_nsec - tsc_state.nsec; > > + > > + if (diff < 0 || tsc_state.nsec == 0) > > + diff = 0; > > + > > diff < 0 should be okay. Also why the nsec==0 special case? What about > using a flag instead? In theory diff < 0 should indeed be okay (though this would mean that target, has unsynchronized clock or time travel happened). However for example nsec_to_cycles takes unsigned number, and then pvclock_scale_delta also takes unsigned number, and so on, so I was thinking why bother with this case. There is still (mostly?) theoretical issue, if on some vcpus 'diff' is positive and on some is negative (this can happen if the migration was really fast, and target has the clock A. that is only slightly ahead of the source). Do you think that this is an issue? If so I can make the code work with signed numbers. About nsec == 0, this is to allow to use this API for VM initialization. (That is to call KVM_SET_TSC_PRECISE prior to doing KVM_GET_TSC_PRECISE) This simplifies qemu code, and I don't think that this makes the API much worse. Best regards, Maxim Levitsky > > Paolo >