Date: Wed, 9 Aug 2023 17:42:41 +0900
From: Ake Koomsin
To: Sean Christopherson
Cc: Maxim Levitsky, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, "H. Peter Anvin"
Subject: Re: [RFC PATCH] KVM: x86: inhibit APICv upon detecting direct APIC access from L2
Message-ID: <20230809174241.36dd569d@ake-x260>
References: <20230807062611.12596-1-ake@igel.co.jp> <43c18a3d57305cf52a1c3643fa8f714ae3769551.camel@redhat.com> <20230808164532.09337d49@ake-x260>
Organization: igel

On Tue, 8 Aug 2023 16:48:19 -0700
Sean Christopherson wrote:

> > The idea from step 6 to step 10 is to start BitVisor first, and
> > start Linux on top of it. You can adjust the step as you like. Feel
> > free to ask me anything regarding reproducing the problem with
> > BitVisor if the giving steps are not sufficient.
>
> Thank you for the detailed repro steps! However, it's likely going
> to be O(weeks) before anyone is able to look at this in detail given
> the extensive repro steps. If you have bandwidth, it's probably worth
> trying to reproduce the problem in a KVM selftest (or a
> KVM-Unit-Test), e.g. create a nested VM, send an IPI from L2, and see
> if it gets routed correctly. This purely a suggestion to try and get
> a faster fix, it's by no means necessary.
>
> Actually, typing that out raises a question (or two). What APICv
> VMCS control settings does BitVisor use? E.g. is BitVisor enabling
> APICv for its VM (L2)? If so, what values for the APIC access page
> and vAPIC page are shoved into BitVisor's VMCS?

BitVisor does not set up APICv at all. It does not set up an APIC
access page either, and it does not try to emulate the APIC.
It monitors for the APIC INIT event through the EPT_VIOLATION mechanism
only during its AP bringup, and stops monitoring after that. As I
mentioned in the previous mail, when BitVisor runs on real hardware, it
lets the guest control the real APIC directly. As it is a micro
hypervisor, it runs only one guest OS. Its main focus is on device
access monitoring/manipulation, depending on the configuration. It
tries to avoid anything to do with interrupts as much as possible.

In the meantime, I will try to dig deeper into KVM internals. Thank you
very much for suggesting KVM-Unit-Test.

> > The problem does not happen when enable_apicv=N. Note that SMP
> > bringup with enable_apicv=N can fail. This is another problem. We
> > don't have to worry about this for now. Linux seems to have no
> > delay between INIT DEASSERT and SIPI during its SMP bringup. This
> > can easily makes INIT and SIPI pending together resultling in
> > signal lost.
> >
> > I admit that my knowledge on KVM and APICv is very limited. I may
> > misunderstand the problem. If you don't mind, would it be possible
> > for you to guide me which code path should I pay attention to? I
> > would love to learn to find out the actual cause of the problem.
>
> KVM *should* emulate the APIC MMIO access from L2. The call stack
> should reach apic_mmio_write(), and assuming it's an ICR write, KVM
> should send an IPI.

When enable_apicv=N, interrupts work properly. This is why I wrote this
RFC patch.

Regarding the SMP bringup failure: when the L2 Linux guest runs on top
of L1 BitVisor, it does not rely on KVM-specific features at all. In
this case, it seems to me that the vCPUs possibly cannot change their
state to wait-for-SIPI in time once INIT is issued (might be due to
scheduling?). This does not happen when BitVisor runs on real hardware.

Once you have time to try BitVisor, please let me know if you can
reproduce the problem with the default configuration.
Trying with -smp 8 or higher on a machine with many cores may make the
problem easier to reproduce. I tested on an i5-13600K.
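
For reference, the ICR writes discussed above (the INIT / INIT DEASSERT /
SIPI sequence that L2 issues directly to the APIC) can be sketched in C
roughly as follows. This is only an illustration, assuming xAPIC (MMIO)
mode with physical destination mode. The helper names (xapic_write,
xapic_send_ipi, icr_*) are made up for this mail and come from neither
BitVisor nor KVM. The field encodings follow the Intel SDM, and the
INIT/SIPI values mirror what I understand Linux to write during AP
bringup.

```c
#include <stdint.h>

/* xAPIC MMIO register offsets relative to the APIC base (Intel SDM). */
#define APIC_ICR_LOW     0x300u
#define APIC_ICR_HIGH    0x310u

/* ICR low-dword fields. */
#define ICR_DM_INIT      (5u << 8)   /* delivery mode 101b: INIT */
#define ICR_DM_STARTUP   (6u << 8)   /* delivery mode 110b: Start-up (SIPI) */
#define ICR_LEVEL_ASSERT (1u << 14)  /* level: assert */
#define ICR_TRIG_LEVEL   (1u << 15)  /* trigger mode: level */

/* ICR high dword: physical destination APIC ID lives in bits 31:24. */
static inline uint32_t icr_high(uint8_t dest_apic_id)
{
    return (uint32_t)dest_apic_id << 24;
}

/* INIT, level-triggered, asserted. */
static inline uint32_t icr_init_assert(void)
{
    return ICR_TRIG_LEVEL | ICR_LEVEL_ASSERT | ICR_DM_INIT;
}

/* INIT, level-triggered, de-asserted ("INIT DEASSERT"). */
static inline uint32_t icr_init_deassert(void)
{
    return ICR_TRIG_LEVEL | ICR_DM_INIT;
}

/* SIPI: the vector field carries the 4 KiB page number of the AP entry point. */
static inline uint32_t icr_sipi(uint32_t start_addr)
{
    return ICR_DM_STARTUP | ((start_addr >> 12) & 0xffu);
}

/* Hypothetical helper: one 32-bit MMIO store into the APIC page. */
static inline void xapic_write(volatile void *apic_base, uint32_t off,
                               uint32_t val)
{
    *(volatile uint32_t *)((volatile uint8_t *)apic_base + off) = val;
}

/*
 * Hypothetical helper: program the destination first, then the low dword.
 * The write to the low dword is what actually sends the IPI, so under KVM
 * that is the access expected to end up in the ICR emulation path.
 */
static inline void xapic_send_ipi(volatile void *apic_base, uint8_t dest,
                                  uint32_t icr_low)
{
    xapic_write(apic_base, APIC_ICR_HIGH, icr_high(dest));
    xapic_write(apic_base, APIC_ICR_LOW, icr_low);
}
```

With these helpers, bringing up the AP with APIC ID 1 at 0x9a000 would be
xapic_send_ipi(base, 1, icr_init_assert()), then icr_init_deassert(),
then icr_sipi(0x9a000). With no delay between the last two writes, INIT
and SIPI can be pending at the same time, which is the situation
described above.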