Received: by 2002:ac0:a582:0:0:0:0:0 with SMTP id m2-v6csp2145521imm; Tue, 2 Oct 2018 22:17:03 -0700 (PDT) X-Google-Smtp-Source: ACcGV62dbsDNyvdHYn1nT76Vq3NcFNfSfO5URcZQWmG4oD4G9JcLDIzCkIBr+/bmMsvD6xTTHoVL X-Received: by 2002:a62:384c:: with SMTP id f73-v6mr19801242pfa.242.1538543823755; Tue, 02 Oct 2018 22:17:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1538543823; cv=none; d=google.com; s=arc-20160816; b=e8W9E7zle0zwJhOlAtiihMhdxVQtLNYSzs1k6ARCZpYmKkeOG8Y/1FgQdLyMyblfaT UawTgKRgm637gTyzOBZWmZmUe57FtXb4ypZNB09RIm3FWvjFnEuwLe/0cdrooHHgKR7i 4dxNWzspvYpW1B5a/rFsxuoKuhBVNK16OKea6pGcFudwPTZOVrCw7/IuplNg9xC5n06G PHAwgpJeP55WMCbrLuGTxkDGsHg9k4ht2CZPqYX2Que/Pt+LIu1TUca5SO1qym9CnHLC o/eV44spXfbtX2Re6LV5ZPTb4aR5UIsxxWZwUiL5EAwIi6RuspkYcsClF27x4dm4vfME YU2w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=nHI4hF6WwjNPDx6SrSjOczWvSqcNbjUEK6s2jFgvQAo=; b=gfSh1VryjTURd2Eb4yhR598jIDMzbI0A+/RbLvTz5cidnThcsROVa2UsPuAXRYwDJJ zjvkK5CRMIPgOfoG4Tg85Q+UoFj2f4zJLdo52+nZMwyPY0kBOzHK2rVMvlaoPjFFbhPr EeRn5ed4mRrlu2/HHlgYDvc7vXJq1tf+PWhZY76Nyvln/HxgmrLn+bF/jUcRmBllk+qo L/si0cDtz32tqPQod+ileiZ5dVhNwaEuLHgUguls6eKKYSOPusJicXrYsN0KDN71TSAp /QU/0Ke07nmETVUEixK2DC8nlLmkXowgZ47RMiCN9Lc5u/3p3FnLVLndQwIHv4wqUy6o PrFA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=YEvw3Jos; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id e127-v6si363420pfe.8.2018.10.02.22.16.46; Tue, 02 Oct 2018 22:17:03 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=YEvw3Jos; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726726AbeJCMCt (ORCPT + 99 others); Wed, 3 Oct 2018 08:02:49 -0400 Received: from mail.kernel.org ([198.145.29.99]:43276 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726619AbeJCMCs (ORCPT ); Wed, 3 Oct 2018 08:02:48 -0400 Received: from mail-wm1-f52.google.com (mail-wm1-f52.google.com [209.85.128.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id AA1F921470 for ; Wed, 3 Oct 2018 05:16:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1538543762; bh=mD1CO+IjZXv0CDDY/wEkUp89zH7oIbyw2Bo8Z/BiOTw=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=YEvw3JosT3LdZOBWn57qhG09yCz9gZFSvZ8sMAnzy8o71JrV7atlFkPbAoQGfUGgP eb2V5qO8yVxPZJNtc2PJB9TwCj7qCJDkH+/QHRupTYk/scUQhT/6k9KPmZsW9iY42g 0AZ3gZf6e/J98ZMuywLrT1AUc+EhoSmNKEnMPvMY= Received: by mail-wm1-f52.google.com with SMTP id 185-v6so4239773wmt.2 for ; Tue, 02 Oct 2018 22:16:01 -0700 (PDT) X-Gm-Message-State: ABuFfoh1YAVUtVc0atx2M7UUH+Dt3hug9qAmDHVZJCo0F+5rwd67l2kl TU0K0r/GnH0e4o4oqQEdpQhDrJAwtOiok5WWpO1bGg== X-Received: by 2002:a1c:f312:: with SMTP id q18-v6mr168502wmq.14.1538543760007; Tue, 02 Oct 2018 22:16:00 -0700 (PDT) MIME-Version: 1.0 References: <20180914125006.349747096@linutronix.de> In-Reply-To: <20180914125006.349747096@linutronix.de> From: Andy Lutomirski Date: Tue, 2 Oct 2018 22:15:49 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [patch 00/11] x86/vdso: Cleanups, simmplifications and CLOCK_TAI support To: Thomas Gleixner , Marcelo Tosatti , Paolo Bonzini , Radim Krcmar , Wanpeng Li Cc: LKML , Andrew Lutomirski , X86 ML , Peter Zijlstra , Matt Rickard , Stephen Boyd , John Stultz , Florian Weimer , KY Srinivasan , Vitaly Kuznetsov , devel@linuxdriverproject.org, Linux Virtualization , Arnd Bergmann , Juergen Gross Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Vitaly, Paolo, Radim, etc., On Fri, Sep 14, 2018 at 5:52 AM Thomas Gleixner wrote: > > Matt attempted to add CLOCK_TAI support to the VDSO clock_gettime() > implementation, which extended the clockid switch case and added yet > another slightly different copy of the same code. > > Especially the extended switch case is problematic as the compiler tends to > generate a jump table which then requires to use retpolines. If jump tables > are disabled it adds yet another conditional to the existing maze. > > This series takes a different approach by consolidating the almost > identical functions into one implementation for high resolution clocks and > one for the coarse grained clock ids by storing the base data for each > clock id in an array which is indexed by the clock id. > I was trying to understand more of the implications of this patch series, and I was again reminded that there is an entire extra copy of the vclock reading code in arch/x86/kvm/x86.c. And the purpose of that code is very, very opaque. Can one of you explain what the code is even doing? From a couple of attempts to read through it, it's a whole bunch of probably-extremely-buggy code that, drumroll please, tries to atomically read the TSC value and the time. And decide whether the result is "based on the TSC". And then synthesizes a TSC-to-ns multiplier and shift, based on *something other than the actual multiply and shift used*. IOW, unless I'm totally misunderstanding it, the code digs into the private arch clocksource data intended for the vDSO, uses a poorly maintained copy of the vDSO code to read the time (instead of doing the sane thing and using the kernel interfaces for this), and propagates a totally made up copy to the guest. And gets it entirely wrong when doing nested virt, since, unless there's some secret in this maze, it doesn't acutlaly use the scaling factor from the host when it tells the guest what to do. I am really, seriously tempted to send a patch to simply delete all this code. The correct way to do it is to hook And I don't see how it's even possible to pass kvmclock correctly to the L2 guest when L0 is hyperv. KVM could pass *hyperv's* clock, but L1 isn't notified when the data structure changes, so how the heck is it supposed to update the kvmclock structure? --Andy