Date: Tue, 8 Aug 2023 16:48:19 -0700
In-Reply-To: <20230808164532.09337d49@ake-x260>
References: <20230807062611.12596-1-ake@igel.co.jp> <43c18a3d57305cf52a1c3643fa8f714ae3769551.camel@redhat.com> <20230808164532.09337d49@ake-x260>
Subject: Re: [RFC PATCH] KVM: x86: inhibit APICv upon detecting direct APIC access from L2
From: Sean Christopherson
To: Ake Koomsin
Cc: Maxim Levitsky, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H. Peter Anvin"
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Aug 08, 2023, Ake Koomsin wrote:
> On Mon, 07 Aug 2023 17:00:58 +0300
> Maxim Levitsky wrote:
>
> > Is there a good reason why KVM doesn't expose APIC memslot to a
> > nested guest? While nested guest runs, the L1's APICv is "inhibited"
> > effectively anyway, so writes to this memslot should update APIC
> > registers and be picked up by APICv hardware when L1 resumes
> > execution.
> >
> > Since APICv allows itself to be inhibited due to other reasons, it
> > means that just like AVIC, it should be able to pick up arbitrary
> > changes to APIC registers which happened while it was inhibited, just
> > like AVIC does.
> >
> > I'll take a look at the code to see if APICv does this (I know AVIC's
> > code much better than APICv's)
> >
> > Is there a reproducer for this bug?
>
> The idea from step 6 to step 10 is to start BitVisor first, and start Linux on
> top of it. You can adjust the step as you like. Feel free to ask me anything
> regarding reproducing the problem with BitVisor if the given steps are not
> sufficient.

Thank you for the detailed repro steps!  However, it's likely going to be
O(weeks) before anyone is able to look at this in detail given the extensive
repro steps.  If you have bandwidth, it's probably worth trying to reproduce the
problem in a KVM selftest (or a KVM-Unit-Test), e.g. create a nested VM, send an
IPI from L2, and see if it gets routed correctly.  This is purely a suggestion to
try and get a faster fix; it's by no means necessary.

Actually, typing that out raises a question (or two).
What APICv VMCS control settings does BitVisor use?  E.g. is BitVisor enabling
APICv for its VM (L2)?  If so, what values for the APIC access page and vAPIC
page are shoved into BitVisor's VMCS?

> The problem does not happen when enable_apicv=N. Note that SMP bringup with
> enable_apicv=N can fail. This is another problem. We don't have to worry about
> this for now. Linux seems to have no delay between INIT DEASSERT and SIPI during
> its SMP bringup. This can easily make INIT and SIPI pending together, resulting
> in signal lost.
>
> I admit that my knowledge on KVM and APICv is very limited. I may misunderstand
> the problem. If you don't mind, would it be possible for you to guide me which
> code path should I pay attention to? I would love to learn to find out the
> actual cause of the problem.

KVM *should* emulate the APIC MMIO access from L2.  The call stack should reach
apic_mmio_write(), and assuming it's an ICR write, KVM should send an IPI.
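
As a rough illustration of the "ICR write" case, here is a minimal userspace
sketch of how an emulated xAPIC MMIO write might be decoded into an IPI request.
It is not KVM code: fake_apic, ipi_request and fake_apic_mmio_write() are
hypothetical names, and the sketch assumes xAPIC mode with the APIC at its
default base (0xFEE00000).  Only the register offsets and ICR bit layout are
taken from the architecture; KVM's real apic_mmio_write() path handles much more.

```c
/* Userspace sketch only, NOT KVM code.  All type and function names here are
 * hypothetical; only the xAPIC register offsets and ICR bit layout are
 * architectural. */
#include <stdbool.h>
#include <stdint.h>

#define XAPIC_DEFAULT_BASE 0xFEE00000ull /* default APIC MMIO base (assumed not relocated) */
#define XAPIC_MMIO_SIZE    0x1000ull     /* the xAPIC register page is 4 KiB */
#define XAPIC_ICR_LOW      0x300u        /* ICR[31:0]; writing it sends the IPI */
#define XAPIC_ICR_HIGH     0x310u        /* ICR[63:32]; destination in bits 31:24 */

struct ipi_request {
	uint8_t vector;        /* ICR[7:0] */
	uint8_t delivery_mode; /* ICR[10:8]: 0 = Fixed, 5 = INIT, 6 = Start-up */
	bool    logical_dest;  /* ICR[11]: 0 = physical, 1 = logical */
	uint8_t shorthand;     /* ICR[19:18]: 0 = none, 1 = self, 2/3 = all incl./excl. self */
	uint8_t dest;          /* ICR[63:56], i.e. bits 31:24 of ICR_HIGH */
};

struct fake_apic {
	uint32_t icr_high;     /* latched until the next ICR_LOW write */
};

/* Returns true and fills *out if the write should result in an IPI. */
static bool fake_apic_mmio_write(struct fake_apic *apic, uint64_t gpa,
				 uint32_t val, struct ipi_request *out)
{
	uint64_t offset;

	/* Ignore accesses outside the APIC page. */
	if (gpa < XAPIC_DEFAULT_BASE ||
	    gpa - XAPIC_DEFAULT_BASE >= XAPIC_MMIO_SIZE)
		return false;

	offset = gpa - XAPIC_DEFAULT_BASE;

	switch (offset) {
	case XAPIC_ICR_HIGH:
		/* Only records the destination; no IPI is sent yet. */
		apic->icr_high = val;
		return false;
	case XAPIC_ICR_LOW:
		/* The write to the low half is what triggers the IPI. */
		out->vector        = val & 0xff;
		out->delivery_mode = (val >> 8) & 0x7;
		out->logical_dest  = (val >> 11) & 0x1;
		out->shorthand     = (val >> 18) & 0x3;
		out->dest          = apic->icr_high >> 24;
		return true;
	default:
		/* Every other register is out of scope for this sketch. */
		return false;
	}
}
```

With this model, a guest that writes the destination to offset 0x310 and then
the command to offset 0x300 produces exactly one IPI request, on the second
write.  An INIT immediately followed by a Start-up IPI shows up as two
back-to-back ICR_LOW writes with delivery modes 5 and 6.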