Date: Wed, 16 Oct 2019 12:50:57 -0400
From: Andrea Arcangeli
To: Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vitaly Kuznetsov, Sean Christopherson
Subject: Re: [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers
Message-ID: <20191016165057.GJ6487@redhat.com>
References: <20190928172323.14663-1-aarcange@redhat.com>
 <20190928172323.14663-13-aarcange@redhat.com>
 <933ca564-973d-645e-fe9c-9afb64edba5b@redhat.com>
 <20191015164952.GE331@redhat.com>
 <870aaaf3-7a52-f91a-c5f3-fd3c7276a5d9@redhat.com>
 <20191015203516.GF331@redhat.com>
 <20191015234229.GC6487@redhat.com>
 <27cc0d6b-6bd7-fcaf-10b4-37bb566871f8@redhat.com>
In-Reply-To: <27cc0d6b-6bd7-fcaf-10b4-37bb566871f8@redhat.com>

On Wed, Oct 16, 2019 at 09:07:39AM +0200, Paolo Bonzini wrote:
> Yet you would add CPUID to the list even though it is not even there in
> your benchmarks, and is *never* invoked in a hot path by *any* sane

I justified CPUID as a "short term" benchmark gadget; it's one of those
that wouldn't be a problem at all to remove, and I couldn't possibly be
against removing it. I only pointed out that cpuid on any modern Linux
guest is going to run more frequently than any inb/outb, so if I had to
pick a benchmark gadget, that remains my favorite one.

> program? Some OSes have never gotten virtio 1.0 drivers. OpenBSD only
> got it earlier this year.

If the target is an optimization for a cranky OS that can't upgrade
virtio to also obtain the full performance benefit of the retpoline
removal (I don't know the specifics just from reading the above), then
that's a better argument. At least it sounds fair enough not to
unfairly penalize a cranky OS forced to run obsolete protocols that
nobody can update or has the time to update.

I mean, until you said there are some OSes that cannot upgrade to
virtio 1.0, I thought it was perfectly fine to say "if you want to run
a guest with the full benefit of virtio 1.0 on KVM, you should upgrade
to virtio 1.0 and not stick to whatever 3-year-old protocol; then the
inb/outb retpoline will also go away if you upgrade the host, because
the inb/outb will go away in the first place".

> It still doesn't add up. 0.3ms / 5 is 1/15000th of a second; 43us is
> 1/25000th of a second. Do you have multiple vCPU perhaps?

Why would I run any test on UP guests? Rather than spending time doing
the math on my results, it's probably quicker if you run it yourself:

https://lkml.kernel.org/r/20190109034941.28759-1-aarcange@redhat.com/

Marcelo should have better reproducers for frequent HLT, which is a
real workload we have to pass; I reported the first two random things I
had around that showed fairly frequent HLT. The pipe loop load is
similar to local network I/O.

> The number of vmexits doesn't count (for HLT). What counts is how long
> they take to be serviced, and as long as it's 1us or more the
> optimization is pointless.
>
> Consider these pictures
>
>          w/o optimization                 with optimization
>      ----------------------          -------------------------
>      0us      vmexit                      vmexit
>      500ns    retpoline                   call vmexit handler directly
>      600ns    retpoline                   kvm_vcpu_check_block()
>      700ns    retpoline                   kvm_vcpu_check_block()
>      800ns    kvm_vcpu_check_block()      kvm_vcpu_check_block()
>      900ns    kvm_vcpu_check_block()      kvm_vcpu_check_block()
>      ...
>      39900ns  kvm_vcpu_check_block()      kvm_vcpu_check_block()
>
>
>
>      40000ns  kvm_vcpu_check_block()      kvm_vcpu_check_block()
>
>
> Unless the interrupt arrives exactly in the few nanoseconds that it
> takes to execute the retpoline, a direct handling of HLT vmexits makes
> *absolutely no difference*.

You keep focusing on what happens if the host is completely idle (in
which case guest HLT is a slow path) and you keep ignoring the case
where the host isn't completely idle (in which case guest HLT is not a
slow path).

Please note the single_task_running() check, which immediately breaks
the kvm_vcpu_check_block() loop if there's even a single other task
that can be scheduled in the runqueue of the host CPU (a simplified
sketch of that loop is appended below).

What happens when the host is not idle is shown below:

         w/o optimization                 with optimization
     ----------------------          -------------------------
     0us      vmexit                      vmexit
     500ns    retpoline                   call vmexit handler directly
     600ns    retpoline                   kvm_vcpu_check_block()
     700ns    retpoline                   schedule()
     800ns    kvm_vcpu_check_block()
     900ns    schedule()
     ...

Disclaimer: the numbers on the left are arbitrary; I just cut and
pasted them from yours, and I have no idea how far off they are.

To be clear, I would find it very reasonable to be asked to prove the
benefit of the HLT optimization with benchmarks specific to that single
one-liner, but until then, the idea that we can drop the retpoline
optimization from the HLT vmexit just by thinking about it still
doesn't make sense to me, because by thinking about it I come to the
opposite conclusion.

The lack of a single_task_running() check in the guest driver is also
why guest cpuidle haltpoll risks wasting some CPU with host overcommit
or with the host loaded at full capacity, and why we may not assume it
to be universally enabled.

Thanks,
Andrea
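
For reference, below is a minimal, self-contained sketch of what the
"retpoline" vs. "call vmexit handler directly" rows in the pictures
above refer to. It is not the actual vmx.c code: the exit-reason
constants and handler names are made-up stand-ins, and the real series
checks its own set of hot exit reasons before falling back to the
indirect call through the kvm_vmx_exit_handlers[] table.

#include <stdio.h>

/*
 * Made-up exit reasons and handlers, standing in for vmx.c's real
 * ones (the real fallback table is kvm_vmx_exit_handlers[]).
 */
enum exit_reason { EXIT_HLT, EXIT_CPUID, EXIT_IO, NR_EXIT_REASONS };

static int handle_hlt(void)   { return puts("handle HLT"); }
static int handle_cpuid(void) { return puts("handle CPUID"); }
static int handle_io(void)    { return puts("handle IO"); }

static int (*const exit_handlers[NR_EXIT_REASONS])(void) = {
    [EXIT_HLT]   = handle_hlt,
    [EXIT_CPUID] = handle_cpuid,
    [EXIT_IO]    = handle_io,
};

static int handle_exit(enum exit_reason reason)
{
    /*
     * The optimization under discussion: test the most frequent exit
     * reasons first and call their handlers directly, so the hot
     * paths never take the indirect branch below (and therefore never
     * pay the retpoline cost when CONFIG_RETPOLINE is enabled).
     */
    if (reason == EXIT_HLT)
        return handle_hlt();
    if (reason == EXIT_CPUID)
        return handle_cpuid();

    /* Cold paths still dispatch through the indirect, retpolined call. */
    return exit_handlers[reason]();
}

int main(void)
{
    handle_exit(EXIT_HLT);  /* direct call, no retpoline */
    handle_exit(EXIT_IO);   /* indirect call through the table */
    return 0;
}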
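
And a minimal toy model of the wait loop the argument is about. The
real logic lives in kvm_vcpu_block()/kvm_vcpu_check_block() in
virt/kvm/kvm_main.c; every variable and helper below is a simplified
stand-in, used only to show where a single_task_running()-style check
short-circuits the polling on a loaded host.

#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Simplified stand-ins for state the kernel tracks elsewhere. */
static bool interrupt_pending;   /* would be set when an IRQ is injected */
static bool other_task_runnable; /* models single_task_running() == false */

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

/*
 * Poll for a wakeup after a HLT vmexit for at most poll_ns before
 * giving up the CPU.  Returns true if the vCPU woke up while polling,
 * false if the caller would fall through to schedule().
 */
static bool poll_after_hlt_vmexit(uint64_t poll_ns)
{
    uint64_t deadline = now_ns() + poll_ns;

    while (now_ns() < deadline) {
        if (interrupt_pending)
            return true;        /* fast wakeup, no context switch */

        /*
         * The check highlighted in the mail: if any other task is
         * runnable on this host CPU, abandon the poll immediately, so
         * on a loaded host the time between vmexit and schedule() is
         * dominated by fixed costs (including the retpoline), not by
         * polling.
         */
        if (other_task_runnable)
            return false;
    }
    return false;               /* poll window expired */
}

int main(void)
{
    other_task_runnable = true;                  /* model a loaded host */
    return poll_after_hlt_vmexit(40000) ? 0 : 1; /* bails out at once */
}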