Received: by 2002:a05:6358:11c7:b0:104:8066:f915 with SMTP id i7csp5722414rwl; Tue, 11 Apr 2023 09:06:29 -0700 (PDT) X-Google-Smtp-Source: AKy350aOekKbg/ALdkEQzlhiaFOguUWe4FUgKOVJKnHJ+zPT0B32bozpCnLXuy/RBdS61THvNf4v X-Received: by 2002:aa7:d483:0:b0:504:92aa:12ee with SMTP id b3-20020aa7d483000000b0050492aa12eemr8781701edr.15.1681229189604; Tue, 11 Apr 2023 09:06:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1681229189; cv=none; d=google.com; s=arc-20160816; b=iVTsdiVKTLPL9egEpvXOyAmMamg6n3bSdgMnpBPrkMWQ+93NtelAuI2P1Wss1tKkZx rzFTXOcBZvJPF4r9GuEbQlSIgc+9uSR4q7QTZwNy8PBwOW6geY82/LVb16ezhTvE+YNG rMdFtHQOKNiPxsnzo8fdiTyFLaQEGuxYOTd1zki+9NbG7kOaSYPT0uxNub1ffM2Lc8Uw +sP+eE7rB+YD1ueIq/ocF4BDl08YBiuig92ueob/+0Fa5bb7oHP/M9K+1Z5Bv2kiiwZo tklM18o+Z0M+HZwaauCiSetuw88LASFbMTra7rS8Mr01jt4RiRWBX3183L4y3tlQOAUw qiLg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=dAeE7rBx2LrEXGBOeJAa11rA6LzPuJMSc3yrCGL+cOo=; b=DLHC2EkiKo4/vvzVCHTF8KCuvxNQ0zR6DZ7nFWiSY5Z7a2cKIvz4whKgCZdcO3+lvA S6UkLO7yTutD9wuej0VpGj0JKo769H4kzdrhnCpmQCW2ZTnnfs45uNv8X/tGM/PfCNvY NM8HAx9lrefWupC/2PfH0XF9hNedSxZT94CC6PllfQhDxX7QZwWwjjChfb+kKegaYzei dtJDkxdrrnT2tIzq0/ncQ0N5aqum9VIEQdhomjloEBu499oHl9UJxAqBjp+CJsbVhuNG nj6fmOSd31ij+kemGACNp6MB2SXvc0N5J76wMSEdLc3gFpEBR6Y3waU94n777ez2Z6+t +27w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=yHKE0ADk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id r25-20020aa7da19000000b004fb16259824si358025eds.339.2023.04.11.09.06.03; Tue, 11 Apr 2023 09:06:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=yHKE0ADk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229698AbjDKQDR (ORCPT + 99 others); Tue, 11 Apr 2023 12:03:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57128 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229593AbjDKQC4 (ORCPT ); Tue, 11 Apr 2023 12:02:56 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1DEA1E76 for ; Tue, 11 Apr 2023 09:02:55 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-54ef5a44bd9so82637287b3.8 for ; Tue, 11 Apr 2023 09:02:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1681228974; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=dAeE7rBx2LrEXGBOeJAa11rA6LzPuJMSc3yrCGL+cOo=; b=yHKE0ADk4mlLhhfRhtgCadpVkMpvaHdD8O+iQcvjrzCzWiWCoMxJ5B6WSy8oSIrtU+ 86IzlzFKFuCL/splmCmz4Y+fXbPQeHRrrVeQ+fp1Zod8GOX1yf6SIb2XwpKJF2DaXbiH addi8Cj3zaUQD9mEDa5w9UMN9fbhWcC0M8q3n01RY8du83aaL27mCvHpFnmkP4CPdio+ RG5//irHEyT9DFgwyTszL4Au6tMwBD4ko1UI3T5Ti+vyGLSb50Fzz/L9e+R+6H3pfWE4 cjjomGR/siYxw8Zw1Vi7iFRkAD6mV5w3EPJjtgz1yxif0YQZAHJlSKHB8MoMvYIQ8xCs i6Vw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1681228974; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=dAeE7rBx2LrEXGBOeJAa11rA6LzPuJMSc3yrCGL+cOo=; b=JYFNH6jBpQ8LXLBG7nPrtNarwKagHMseQgZ2LesGd9bqxZPcSBo1MY+pNHueuGSUay 7thWEaOUupJkrSMvPnjZKBmktaG/KJNVcvOEV4Wt+IfisBqVNHdtFG7bjC4oahswbuDW hsZ6Vgx7d/aIH7jLRu3L0qNhYjSS3f39esNwNtELoaHrlkGNIZUobqNl+WPG1aCxidzP a9ONEV0CFmip1Jbb0HI7K8jSUWPzu/E5cHVDLnx+dqf/PyG+Z+zrCL67qX2vPBOEtTy/ l70xT15MJ08l+YIJ5QYnxd2P67zkkelV5rAhAhFZr7bqZkBe8+HvO+7UTHj0xuVg+6QS UPpw== X-Gm-Message-State: AAQBX9feRmeoJglk/1Oeaf4nDgnkHufqVlw/jF1pHZQoct1/csZqIXIg K4mCBliGiwC5a/nTYodgyht9NTWmSdM= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a05:690c:c9a:b0:533:a15a:d33e with SMTP id cm26-20020a05690c0c9a00b00533a15ad33emr9641481ywb.5.1681228974368; Tue, 11 Apr 2023 09:02:54 -0700 (PDT) Date: Tue, 11 Apr 2023 09:02:52 -0700 In-Reply-To: Mime-Version: 1.0 References: <20230227171751.1211786-1-jpiotrowski@linux.microsoft.com> <959c5bce-beb5-b463-7158-33fc4a4f910c@linux.microsoft.com> Message-ID: Subject: Re: [PATCH] KVM: SVM: Disable TDP MMU when running on Hyper-V From: Sean Christopherson To: Jeremi Piotrowski Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Vitaly Kuznetsov , Paolo Bonzini , Tianyu Lan , Michael Kelley Content-Type: text/plain; charset="us-ascii" X-Spam-Status: No, score=-7.7 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Apr 11, 2023, Jeremi Piotrowski wrote: > On 4/11/2023 1:25 AM, Sean Christopherson wrote: > > On Wed, Apr 05, 2023, Jeremi Piotrowski wrote: > >> On 3/7/2023 6:36 PM, Sean Christopherson wrote: > >>> Thinking about this more, I would rather revert commit 1e0c7d40758b ("KVM: SVM: > >>> hyper-v: Remote TLB flush for SVM") or fix the thing properly straitaway. KVM > >>> doesn't magically handle the flushes correctly for the shadow/legacy MMU, KVM just > >>> happens to get lucky and not run afoul of the underlying bugs. The revert appears > >>> to be reasonably straightforward (see bottom). > >> > >> Hi Sean, > >> > >> I'm back, and I don't have good news. The fix for the missing hyperv TLB flushes has > >> landed in Linus' tree and I now had the chance to test things outside Azure, in WSL on my > >> AMD laptop. > >> > >> There is some seriously weird interaction going on between TDP MMU and Hyper-V, with > >> or without enlightened TLB. My laptop has 16 vCPUs, so the WSL VM also has 16 vCPUs. > >> I have hardcoded the kernel to disable enlightened TLB (so we know that is not interfering). > >> I'm running a Flatcar Linux VM inside the WSL VM using legacy BIOS, a single CPU > >> and 4GB of RAM. > >> > >> If I run with `kvm.tdp_mmu=0`, I can boot and shutdown my VM consistently in 20 seconds. > >> > >> If I run with TDP MMU, the VM boot stalls for seconds at a time in various spots > >> (loading grub, decompressing kernel, during kernel boot), the boot output feels like > >> it's happening in slow motion. The fastest I see it finish the same cycle is 2 minutes, > >> I have also seen it take 4 minutes, sometimes even not finish at all. Same everything, > >> the only difference is the value of `kvm.tdp_mmu`. > > > > When a stall occurs, can you tell where the time is lost? E.g. is the CPU stuck > > in L0, L1, or L2? L2 being a single vCPU rules out quite a few scenarios, e.g. > > lock contention and whatnot. > > It shows up as around 90% L2 time, 10% L1 time. Are those numbers coming from /proc//stat? Something else? > I don't have great visibility into L0 time right now, I'm trying to find > someone who might be able to help with that. > > > > > If you can run perf in WSL, that might be the easiest way to suss out what's going > > on. > > I can run perf, what trace would help? Good question. I'm not exactly a perf expert and almost never do anything beyond `perf top`. That's probably sufficient for now, I really just want to confirm that L1 doesn't appear to be stuck, e.g. in KVM's page fault handler. > The results are the same for both branches, and it does look like this affects AMD and > Intel equally. Nice. I have Intel hardware at home that I'll try to repro on, though it will be several weeks until I can dive into this. > So seeing as this will likely take a while to figure out (and I know I won't be able to > spend too many cycles on this in the next few weeks), what do you think of a patch to > disable tdp_mmu in this configuration (for the time being?). I don't particularly love the idea of disabling the TDP MMU without having the slightest clue what's going wrong, but I'm not totally opposed to it. Paolo, any thoughts? You have far more experience with supporting downstream consumers of KVM. > Something else I've been wondering: in a KVM-on-KVM setup, is tdp_mmu used in both L0 > and L1 hypervisors right now? By default, yes. I double checked that L2 has similar boot times for KVM-on-KVM with and without the TDP MMU. Certainly nothing remotely close to 2 minutes.