Subject: Re: [PATCH v3 4/7] KVM: x86: nSVM: support PAUSE filter threshold and count when cpu_pm=on
From: Maxim Levitsky
To: Jim Mattson, Paolo Bonzini
Cc: kvm@vger.kernel.org, Ingo Molnar, Dave Hansen, Sean Christopherson, Borislav Petkov, "H. Peter Anvin", Thomas Gleixner, x86@kernel.org, Vitaly Kuznetsov, Joerg Roedel, linux-kernel@vger.kernel.org, Wanpeng Li
Date: Mon, 21 Mar 2022 23:36:01 +0200
References: <20220301143650.143749-1-mlevitsk@redhat.com> <20220301143650.143749-5-mlevitsk@redhat.com> <6a7f13d1-ed00-b4a6-c39b-dd8ba189d639@redhat.com>

On Wed, 2022-03-09 at 11:07 -0800, Jim Mattson wrote:
> On Wed, Mar 9, 2022 at 10:47 AM Paolo Bonzini wrote:
> > On 3/9/22 19:35, Jim Mattson wrote:
> > > I didn't think pause filtering was virtualizable, since the value of
> > > the internal counter isn't exposed on VM-exit.
> > >
> > > On bare metal, for instance, assuming the hypervisor doesn't intercept
> > > CPUID, the following code would quickly trigger a PAUSE #VMEXIT with
> > > the filter count set to 2.
> > >
> > > 1:
> > >     pause
> > >     cpuid
> > >     jmp 1
> > >
> > > Since L0 intercepts CPUID, however, L2 will exit to L0 on each loop
> > > iteration, and when L0 resumes L2, the internal counter will be set to
> > > 2 again. L1 will never see a PAUSE #VMEXIT.
> > >
> > > How do you handle this?
> > >
> >
> > I would expect that the same would happen on an SMI or a host interrupt.
> >
> > 1:
> >     pause
> >     outl al, 0xb2
> >     jmp 1
> >
> > In general a PAUSE vmexit will mostly benefit the VM that is pausing, so
> > having a partial implementation would be better than disabling it
> > altogether.
>
> Indeed, the APM does say, "Certain events, including SMI, can cause
> the internal count to be reloaded from the VMCB." However, expanding
> that set of events so much that some pause loops will *never* trigger
> a #VMEXIT seems problematic. If the hypervisor knew that the PAUSE
> filter may not be triggered, it could always choose to exit on every
> PAUSE.
>
> Having a partial implementation is only better than disabling it
> altogether if the L2 pause loop doesn't contain a hidden #VMEXIT to
> L0.
>

Hi!

You bring up a very valid point, which I hadn't thought about.

However, after thinking about it, I believe that in practice this isn't a
show-stopper for exposing this feature to the guest.

This is what I am thinking:

First, let's assume that L2 is malicious. In that case it can no doubt craft
a loop that never causes a PAUSE VM exit. But that isn't a problem: the guest
could just as well have used NOP, which cannot be intercepted anyway, so no
harm is done.

Now let's assume a non-malicious L2:

First of all, the problem can only happen when a VM exit is intercepted by L0
and not by L1. Neither of the above cases usually meets this criterion, since
L1 is highly likely to intercept both CPUID and I/O port access. It is also
highly unlikely to allow L2 direct access to L1's MMIO ranges.

Overall, there are very few cases of a deterministic VM exit that is
intercepted by L0 but not by L1. If that happens, L1 will not catch the PAUSE
loop, which is not much different from failing to catch it because of
unsuitable thresholds.

Also note that this is only an optimization. Because of the count and
threshold, it is not guaranteed to catch all pause loops; in fact the
hypervisor has to guess these values and update them in an attempt to catch
as many such loops as it can.

I think that overall it is OK to expose this feature to the guest, and it
should even improve performance in some cases: currently nested KVM, at
least, intercepts every PAUSE without it.

Best regards,
	Maxim Levitsky
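
For readers following the thread, the counter behaviour under discussion can be sketched with a toy model. This is not KVM or hardware code. The function `run_loop` and its parameters are hypothetical names, the PAUSE filter threshold is ignored, and the model assumes that each PAUSE decrements an internal counter, that a PAUSE exit fires once the counter drops below zero, and that the counter is reloaded from the VMCB value whenever L0 resumes L2.

```python
def run_loop(filter_count, iterations, l0_exit_each_iteration):
    """Toy model of a 'pause; <other insn>; jmp' loop running in L2.

    Returns the 1-based iteration at which a PAUSE exit would fire,
    or None if it never fires within `iterations`.
    Assumption: the counter is reloaded on every resume of L2 by L0.
    """
    counter = filter_count  # loaded from the VMCB when L2 is entered
    for i in range(1, iterations + 1):
        counter -= 1  # PAUSE decrements the internal counter
        if counter < 0:
            return i  # PAUSE exit that L1 would get to see
        if l0_exit_each_iteration:
            # The second instruction (e.g. CPUID) exits to L0 only;
            # resuming L2 reloads the counter, so progress is lost.
            counter = filter_count
    return None


if __name__ == "__main__":
    # No hidden exit: the filter trips on the third PAUSE.
    print(run_loop(2, 1000, False))  # 3
    # Hidden exit to L0 on every iteration: the filter never trips.
    print(run_loop(2, 1000, True))   # None
```

In this model the filter is only defeated when every iteration contains an exit that L0 handles without involving L1, which is the narrow case discussed above.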