Subject: Re: Nested AVIC design (was:Re: [RFC PATCH v3 04/19] KVM: x86: mmu: allow to enable write tracking externally)
From: Maxim Levitsky
To: Sean Christopherson
Cc: kvm@vger.kernel.org, Wanpeng Li, Vitaly Kuznetsov, Jani Nikula,
    Paolo Bonzini, Tvrtko Ursulin, Rodrigo Vivi, Zhenyu Wang, Joonas Lahtinen,
    Tom Lendacky, Ingo Molnar, David Airlie, Thomas Gleixner, Dave Hansen,
    x86@kernel.org, intel-gfx@lists.freedesktop.org, Daniel Vetter,
    Borislav Petkov, Joerg Roedel, linux-kernel@vger.kernel.org, Jim Mattson,
    Zhi Wang, Brijesh Singh, "H. Peter Anvin",
    intel-gvt-dev@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Date: Mon, 03 Oct 2022 10:27:53 +0300
References: <20220427200314.276673-1-mlevitsk@redhat.com>
    <20220427200314.276673-5-mlevitsk@redhat.com>
    <5ed0d0e5a88bbee2f95d794dbbeb1ad16789f319.camel@redhat.com>
    <7c4cf32dca42ab84bdb427a9e4862dbf5509f961.camel@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2022-09-29 at 22:38 +0000, Sean Christopherson wrote:
> On Mon, Aug 08, 2022, Maxim Levitsky wrote:
> > Hi Sean, Paolo, and everyone else who wants to review my nested AVIC work.
>
> Before we dive deep into design details, I think we should first decide whether
> or not nested AVIC is worth pursing/supporting.
>
>   - Rome has a ucode/silicon bug with no known workaround and no anticipated fix[*];
>     AMD's recommended "workaround" is to disable AVIC.
>   - AVIC is not available in Milan, which may or may not be related to the
>     aforementioned bug.
>   - AVIC is making a comeback on Zen4, but Zen4 comes with x2AVIC.
>   - x2APIC is likely going to become ubiquitous, e.g. Intel is effectively
>     requiring x2APIC to fudge around xAPIC bugs.
>   - It's actually quite realistic to effectively force the guest to use x2APIC,
>     at least if it's a Linux guest.  E.g. turn x2APIC on in BIOS, which is often
>     (always?) controlled by the host, and Linux will use x2APIC.
>
> In other words, given that AVIC is well on its way to becoming a "legacy" feature,
> IMO there needs to be a fairly strong use case to justify taking on this much code
> and complexity.  ~1500 lines of code to support a feature that has historically
> been buggy _without_ nested support is going to require a non-trivial amount of
> effort to review, stabilize, and maintain.
>
> [*] 1235 "Guest With AVIC (Advanced Virtual Interrupt Controller) Enabled May Fail
>     to Process IPI (Inter-Processor Interrupt) Until Guest Is Re-Scheduled" in
>     https://www.amd.com/system/files/TechDocs/56323-PUB_1.00.pdf
>

I am afraid that you mixed things up: what you are missing is that x2AVIC is just
a minor addition to AVIC. It is still, for all practical purposes, the same feature.

1. AVIC is indeed kind of broken on Zen2. AFAIK it works fine for all practical
   purposes, including nested: the erratum only shows up in a unit test and/or
   under very specific workloads (most of the time a delayed wakeup doesn't cause
   a hang). Yet I agree that AVIC should not be enabled on Zen2 in production.

2. Zen3 does indeed have AVIC soft-disabled in CPUID. AFAIK it works just fine,
   but I understand that customers won't use it against AMD's guidance.

3. On Zen4, AVIC is fully enabled and also extended to support x2APIC mode.
   The fact that AVIC was extended to support x2APIC mode also shows that AMD
   is committed to supporting it.

My nested AVIC code technically doesn't expose x2AVIC to the guest, but that is
pretty much trivial to add (I am only waiting to get my hands on a Zen4 machine
to do it). Even in its current form it would work just fine if the host uses
normal AVIC, or even if the host doesn't use AVIC at all: the nested AVIC code
works just fine even if the host has its AVIC inhibited for some reason.

Adding nested x2AVIC support is literally about not passing through the APIC MMIO
address, enabling the x2AVIC bit in int_ctl, and opening up access to the x2APIC MSRs.
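
To make those three steps concrete, here is a rough, self-contained sketch. It is
not the actual patch: the struct, the function names and the SKETCH_* constants
are made up for illustration, the int_ctl bit positions are an assumption that
should be checked against the APM, and a real implementation would likely be more
selective about which x2APIC MSRs it leaves unintercepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed int_ctl bit positions (hypothetical names, verify against the APM). */
#define SKETCH_AVIC_ENABLE (1u << 31)
#define SKETCH_X2APIC_MODE (1u << 30)

/* Architectural x2APIC MSR range. */
#define SKETCH_X2APIC_MSR_FIRST 0x800u
#define SKETCH_X2APIC_MSR_LAST  0x8ffu

/* Hypothetical, heavily simplified view of what the nested guest is given. */
struct nested_avic_sketch {
	uint32_t int_ctl;          /* int_ctl value used while running the nested guest */
	bool     map_apic_mmio;    /* pass the APIC MMIO address through? */
	bool     x2apic_msrs_open; /* allow direct access to the x2APIC MSRs? */
};

/* Configure the sketch for plain AVIC (x2apic == false) or x2AVIC (true). */
static void sketch_setup(struct nested_avic_sketch *cfg, bool x2apic)
{
	assert(cfg != NULL);

	cfg->int_ctl = SKETCH_AVIC_ENABLE;
	cfg->map_apic_mmio = !x2apic;      /* step 1: no MMIO pass-through in x2 mode */
	if (x2apic)
		cfg->int_ctl |= SKETCH_X2APIC_MODE; /* step 2: x2AVIC bit in int_ctl */
	cfg->x2apic_msrs_open = x2apic;    /* step 3: open up the x2APIC MSRs */
}

/* Would an access to this MSR be left unintercepted in this configuration? */
static bool sketch_msr_passthrough(const struct nested_avic_sketch *cfg,
				   uint32_t msr)
{
	assert(cfg != NULL);

	return cfg->x2apic_msrs_open &&
	       msr >= SKETCH_X2APIC_MSR_FIRST && msr <= SKETCH_X2APIC_MSR_LAST;
}
```

Calling sketch_setup(&cfg, true) flips exactly the three things listed above and
leaves everything else in the structure alone.
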
Plus I need to make some minor changes in the unaccelerated IPI handler, dealing
with the read-only logical ID and such.

Physid tables, APIC backing pages, doorbell emulation: everything is pretty much
unchanged.

So AVIC is anything but a legacy feature, and my nested AVIC code will support
both nested AVIC and nested x2AVIC.

Best regards,
	Maxim Levitsky