Received: by 2002:a05:6358:11c7:b0:104:8066:f915 with SMTP id i7csp4745119rwl; Mon, 10 Apr 2023 16:27:14 -0700 (PDT) X-Google-Smtp-Source: AKy350YRX2r0IMEh7YPkc4JKw0A5BwXYLIXQOSJXeBfQNc7y4d4dIu9p1TftyP6+WnjnXqMFYx+V X-Received: by 2002:a17:906:19c4:b0:931:85f8:6d00 with SMTP id h4-20020a17090619c400b0093185f86d00mr8325206ejd.47.1681169233753; Mon, 10 Apr 2023 16:27:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1681169233; cv=none; d=google.com; s=arc-20160816; b=F/2qUpuYJxE64qGqSpuLzrW1x1VPt8nPRSpYapK9hWhau+r+ZmIooFWCTKUfMTuTE1 mmpMl41Phhm4c0HErZBvTjh7CdHMRLdTloxYvNdXc1OLYyHV2B4+VV0BAMU3QwLF2L0x nJPhb4MZo7gM8a5HUze4PBR1Jr75GRW/Gt4hvWfEaGUAgnV8mJHeYYO+JSE48TRZMtbE PS6agAzu+3eWG2qPzsmJwwTADkllErgmIBB5ITNEOkJyW8TprAwFjMY1eW4IX5hxsYWr oj8qNRE4bGWOFtoyjoOwc4pzjljiYGi1sBpT75xw5sNXT5VKzY98EzkLBAuUqS0YX9nr LiZQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=XDQcwM0Ljq47Kk+Cx8Zo9PwBtwHuWqUYBxNgZio/RfY=; b=1ApqLbDNdA9hay0k3/6mRrPeaAvI4naKV/iI4k/ZeFluueU0X3g5KnPfCSDnJbqeOs 2/vUNO4LpTi+JjUgPB/BEX11x0Lu9Lh/4IVhF4jIV78c1oLAmQD7Kx2tbbAzTNYwat5d XVUpBGiGjKN/0Nr1Kr0d8McAcp3ZeAofx5XllXRYCudLWsi9+5XXXNpdDqhgPKD1zUnY 9vFGdphnebk0mSPWyqv3KOb6bUaInUJmsQ4Dxf4ai96WNf3AAGjllKO5uI6WWGSUEkKI vV9rQ9DnapLwNpeKKFEEFWqE2/82OlERjx7Cf4oR4GUF7WxVX1XKG1GF1V8PzTWkJc8l Jgig== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=bkyyOP2B; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id c22-20020a170906529600b0094a819713e7si3133326ejm.603.2023.04.10.16.26.49; Mon, 10 Apr 2023 16:27:13 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=bkyyOP2B; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229789AbjDJXZp (ORCPT + 99 others); Mon, 10 Apr 2023 19:25:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33442 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229523AbjDJXZo (ORCPT ); Mon, 10 Apr 2023 19:25:44 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F21161BCA for ; Mon, 10 Apr 2023 16:25:42 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-54c0b8ca2d1so135847017b3.17 for ; Mon, 10 Apr 2023 16:25:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1681169142; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=XDQcwM0Ljq47Kk+Cx8Zo9PwBtwHuWqUYBxNgZio/RfY=; b=bkyyOP2BDUJ+Th0lHoGO7vJXGV+88YNyum6G/YIxmMfueZJtcBv9odQCjB+mTj5U6u iuqMnEWS/RBWfoENX3U6XkNTInEgC5Zeujj2Y7olklOMNDLxmIvsOrXs3kYbI3SEVHk2 bQ9nZzL2dL+oYUBfs3cqncHkNY+uly0GeA8kIaxIisGXszxC5bzNUapXOTnvwgIXND9e dZvDamirs/JT+T5ouQYIR79vZd4yRA4vqAFVoW/Ref+lT+oenzXYdJ1IVboJzX/DEp8h oNmH3sxCZydRo6YIUzUvqqsF1XswzjT8qYth3Yz+sUCWNqOymLDjrRQNiYrYCgzjRZ6k MYRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1681169142; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=XDQcwM0Ljq47Kk+Cx8Zo9PwBtwHuWqUYBxNgZio/RfY=; b=WpV/M3XlTrHEVj3OISmcgSvZeFH6rtHik9mmwntraHKhpAhAiMlXxr1QPXtJjb0Y83 5bglCS09RQ77Z7C6O83zjLGnLqo91oyTPLpnI2WwHEdDdO3HxOsnEWhZvlsdqtDa+IOP A0KzvzuD1ez4g0HmQpDe7ODL/A69kG61tTdNgGc38HJ9FXBFpqsco0P0RfuxxP/uKo91 7QgyxylYtvk7u0PKtasRJ5enHT8iITBI6JrNWExww5IoWiuCY87TLqIPVHUR+J7dY3pY 3UyY08ZCvmpFB53u89Y7q4HMrPmwIm85z+x/CflqOU8jNJn9x8FlgsUXbRYGiGWxgaOx w6fg== X-Gm-Message-State: AAQBX9chtjATKcjjHAdkVtVARg+FHuIAfdrKYp4V3wltxbGVSFDvJoY3 UtGl2mzXtr2P4Vj+nDFrQJGzIIJH0KU= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a25:3157:0:b0:b8e:db4a:a366 with SMTP id x84-20020a253157000000b00b8edb4aa366mr4567666ybx.11.1681169142239; Mon, 10 Apr 2023 16:25:42 -0700 (PDT) Date: Mon, 10 Apr 2023 16:25:40 -0700 In-Reply-To: <959c5bce-beb5-b463-7158-33fc4a4f910c@linux.microsoft.com> Mime-Version: 1.0 References: <20230227171751.1211786-1-jpiotrowski@linux.microsoft.com> <959c5bce-beb5-b463-7158-33fc4a4f910c@linux.microsoft.com> Message-ID: Subject: Re: [PATCH] KVM: SVM: Disable TDP MMU when running on Hyper-V From: Sean Christopherson To: Jeremi Piotrowski Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Vitaly Kuznetsov , Paolo Bonzini , Tianyu Lan , Michael Kelley Content-Type: text/plain; charset="us-ascii" X-Spam-Status: No, score=-7.7 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Apr 05, 2023, Jeremi Piotrowski wrote: > On 3/7/2023 6:36 PM, Sean Christopherson wrote: > > Thinking about this more, I would rather revert commit 1e0c7d40758b ("KVM: SVM: > > hyper-v: Remote TLB flush for SVM") or fix the thing properly straitaway. KVM > > doesn't magically handle the flushes correctly for the shadow/legacy MMU, KVM just > > happens to get lucky and not run afoul of the underlying bugs. The revert appears > > to be reasonably straightforward (see bottom). > > Hi Sean, > > I'm back, and I don't have good news. The fix for the missing hyperv TLB flushes has > landed in Linus' tree and I now had the chance to test things outside Azure, in WSL on my > AMD laptop. > > There is some seriously weird interaction going on between TDP MMU and Hyper-V, with > or without enlightened TLB. My laptop has 16 vCPUs, so the WSL VM also has 16 vCPUs. > I have hardcoded the kernel to disable enlightened TLB (so we know that is not interfering). > I'm running a Flatcar Linux VM inside the WSL VM using legacy BIOS, a single CPU > and 4GB of RAM. > > If I run with `kvm.tdp_mmu=0`, I can boot and shutdown my VM consistently in 20 seconds. > > If I run with TDP MMU, the VM boot stalls for seconds at a time in various spots > (loading grub, decompressing kernel, during kernel boot), the boot output feels like > it's happening in slow motion. The fastest I see it finish the same cycle is 2 minutes, > I have also seen it take 4 minutes, sometimes even not finish at all. Same everything, > the only difference is the value of `kvm.tdp_mmu`. When a stall occurs, can you tell where the time is lost? E.g. is the CPU stuck in L0, L1, or L2? L2 being a single vCPU rules out quite a few scenarios, e.g. lock contention and whatnot. If you can run perf in WSL, that might be the easiest way to suss out what's going on. > So I would like to revisit disabling tdp_mmu on hyperv altogether for the time being but it > should probably be with the following condition: > > tdp_mmu_enabled = tdp_mmu_allowed && tdp_enabled && !hypervisor_is_type(X86_HYPER_MS_HYPERV) > > Do you have an environment where you would be able to reproduce this? A Windows server perhaps > or an AMD laptop? Hrm, not easily, no. Can you try two things? 1. Linus' tree on Intel hardware 2. kvm-x86/next[*] on Intel hardware Don't bother with #2 if #1 (Linus' tree) does NOT suffer the same stalls as AMD. #2 is interesting iff Intel is also affected as kvm-x86/next has an optimization for CR0.WP toggling, which was the achilles heel of the TDP MMU. If Intel isn't affected, then something other than CR0.WP is to blame. I fully expect both experiments to show the same behavior as AMD, but if for some reason they don't, the results should help narrow the search. [*] https://github.com/kvm-x86/linux/tree/next