From: Vitaly Kuznetsov
To: Maxim Levitsky
Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Cathy Avery,
    Emanuele Giuseppe Esposito, linux-kernel@vger.kernel.org,
    Paolo Bonzini, kvm@vger.kernel.org
Subject: Re: [PATCH RFC] KVM: nSVM: Fix L1 state corruption upon return from SMM
References: <20210623074427.152266-1-vkuznets@redhat.com> <53a9f893cb895f4b52e16c374cbe988607925cdf.camel@redhat.com>
Date: Wed, 23 Jun 2021 15:32:07 +0200
Message-ID: <87pmwc4sh4.fsf@vitty.brq.redhat.com>

Maxim Levitsky writes:

> On Wed, 2021-06-23 at 16:01 +0300, Maxim Levitsky wrote:
>> On Wed, 2021-06-23 at 11:39 +0200, Paolo Bonzini wrote:
>> > On 23/06/21 09:44, Vitaly Kuznetsov wrote:
>> > > - RFC: I'm not 100% sure my 'smart' idea to use the currently-unused
>> > > HSAVE area is that smart. Also, we don't even seem to check that L1 set
>> > > it up upon nested VMRUN, so hypervisors which don't do that may remain
>> > > broken. A very much needed selftest is also missing.
>> >
>> > It's certainly a bit weird, but I guess it counts as smart too. It
>> > needs a few more comments, but I think it's a good solution.
>> >
>> > One could delay the backwards memcpy until vmexit time, but that would
>> > require a new flag, so it's not worth it for what is a pretty rare and
>> > already expensive case.
>> >
>> > Paolo
>>
>> Hi!
>>
>> I did some homework on this and would like to share a few thoughts:
>>
>> First of all, the way we intercept #SMI caught my attention
>> (this isn't 100% related to the bug, but it's still worth discussing IMHO).
>>
>> A. Bare metal: SVM allows intercepting SMI with SVM_EXIT_SMI, with the
>> intention of then entering the BIOS SMM handler manually using the
>> SMM_CTL MSR. On bare metal we do set INTERCEPT_SMI, but we emulate the
>> exit as a nop. I guess on bare metal there are some undocumented bits
>> the BIOS sets which make the CPU ignore that SMI intercept and still
>> take the #SMI handler normally, but I wonder if we could still break
>> some motherboard code because of that.
>>
>> B. Nested: if #SMI is intercepted, it causes a nested VMEXIT. Since KVM
>> does enable the SMI intercept, when it runs nested all SMIs that nested
>> KVM gets are emulated as NOPs, and L1's SMI handler is not run.
>>
>> About the issue that was fixed in this patch, let me try to understand
>> how it works on bare metal:
>>
>> 1. A guest is entered. Host state is saved to the VM_HSAVE_PA area (or
>> stashed somewhere in the CPU).
>>
>> 2. #SMI (without intercept) happens.
>>
>> 3. The CPU has to exit SVM and start running the host SMI handler. It
>> loads the SMM state without touching the VM_HSAVE_PA area, runs the SMI
>> handler, and then, once it RSMs, it restores the guest state from the
>> SMM save area and continues the guest.
>>
>> 4. Once a normal VMEXIT happens, the host state is restored from
>> VM_HSAVE_PA.
>>
>> So host state indeed can't be saved to VMCB01.
>>
>> To be honest, I think I would prefer not to use L1's HSAVE area but
>> rather add back our own 'hsave' in KVM and always store the L1 host
>> state there on nested entry.
>>
>> This way we would avoid touching vmcb01 at all, which both solves the
>> issue and reduces code complexity.
>> (Copying L1 host state to what is basically the L1 guest state area and
>> back even has a comment explaining why it (was) possible to do so,
>> before you discovered that this doesn't work with SMM.)
>
> I need more coffee today. The comment is somewhat wrong, actually.
> When L1 switches to L2, its HSAVE area holds L1 guest state, but
> L1 is the "host" vs. L2, so it is host state.
> The copying is really between KVM's register cache and the VMCB.
>
> So maybe backing it up as this patch does is the best solution yet.
> I will take a more in-depth look at this soon.

We can indeed resurrect 'hsave' and keep it internally, but to make this
migratable, we'd have to add it to the nested state acquired through
svm_get_nested_state(). Using L1's HSAVE area (pointed to by
MSR_VM_HSAVE_PA) avoids that, as we have everything in L1's memory. And,
as far as I understand, we comply with the spec, since 1) L1 has to set
it up and 2) L1 is not supposed to expect any particular format there;
it's completely volatile.

-- 
Vitaly