Date: Wed, 28 Sep 2022 20:41:40 +0000
From: Sean Christopherson
To: Paolo Bonzini
Cc: Emanuele Giuseppe Esposito, David Hildenbrand, Maxim Levitsky,
    kvm@vger.kernel.org, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
    x86@kernel.org, "H. Peter Anvin", linux-kernel@vger.kernel.org, Like Xu
Subject: Re: [RFC PATCH 0/9] kvm: implement atomic memslot updates

On Wed, Sep 28, 2022, Paolo Bonzini wrote:
> On 9/28/22 17:58, Sean Christopherson wrote:
> > I don't disagree that the memslots API is lacking, but IMO that is somewhat
> > orthogonal to fixing KVM x86's "code fetch to MMIO" mess. Such a massive new API
> > should be viewed and prioritized as a new feature, not as a bug fix, e.g. I'd
> > like to have the luxury of being able to explore ideas beyond "let userspace
> > batch memslot updates", and I really don't want to feel pressured to get this
> > code reviewed and merged.
>
> I absolutely agree that this is not a bugfix. Most new features for KVM can
> be seen as bug fixes if you squint hard enough, but they're still features.

I guess I'm complaining that there isn't sufficient justification for this new
feature. The cover letter provides a bug that would be fixed by having batched
updates, but as above, that's really due to deficiencies in a different KVM ABI.

Beyond that, there's no explanation of why this exact API is necessary, i.e.
there are no requirements given.

  - Why can't this be solved in userspace?

  - Is performance a concern? I.e. are updates that need to be batched going to
    be high frequency operations?

  - What operations does userspace truly need? E.g. if the only use case is to
    split/truncate/hole punch an existing memslot, can KVM instead provide a
    memslot flag and exit reason that allows kicking vCPUs to userspace if the
    memslot is accessed? E.g. KVM_MEM_DISABLED that acts like an invalid
    memslot, but KVM exits with a dedicated exit reason instead of generating
    MMIO semantics.

  - What precisely needs to be atomic? If updates only need to be "atomic" for
    an address space, does the API allow mixing non-SMM and SMM memslots?

  - When is KVM required to invalidate and flush? E.g. if a memslot is deleted
    and recreated with a different HVA, how does KVM ensure that there are no
    outstanding references to the old HVA without introducing non-deterministic
    behavior? The current API works by forcing userspace to fully delete the
    memslot, i.e. KVM can ensure all references are gone in all TLBs before
    allowing userspace to create new, conflicting entries. I don't see how this
    can work with batched updates. The update needs to be "atomic", i.e. vCPUs
    must never see an invalid/deleted memslot, but if the memslot is writable,
    how does KVM prevent some writes from hitting the old HVA and some from
    hitting the new HVA without a quiescent period?

  - If a memslot is truncated while dirty logging is enabled, what happens to
    the bitmap? Is it preserved? Dropped?

Again, I completely agree that the current memslots API is far from perfect,
but I'm not convinced that simply extending the existing API to batch updates
is the best solution from a KVM perspective.

> > E.g. why do a batch update and not provide KVM_SET_ALL_USER_MEMORY_REGIONS to
> > do wholesale replacement? That seems like it would be vastly simpler to handle
> > on KVM's end. Or maybe there's a solution in the opposite direction, e.g.
> > an API that allows 1->N or N->1 conversions but not arbitrary batching.
>
> Wholesale replacement was my first idea when I looked at the issue, I think
> at the end of 2020. I never got to a full implementation, but my impression
> was that allocating/deallocating dirty bitmaps, rmaps etc. would not make it
> any easier than arbitrary batch updates.

It's not obvious to me that the memslot metadata is going to be easy to handle
regardless of what we do. E.g. I'm pretty sure that batching updates will
"corrupt" the dirty bitmap if a hole is punched in a memslot that's being dirty
logged.
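
To make the dirty bitmap concern a bit more concrete, below is a toy userspace
model in Python. It is not KVM code and makes no claim about what the RFC
actually does. `Memslot`, `punch_hole()` and `carry_bitmap` are hypothetical
names invented for the illustration, and the model assumes one dirty flag per
page, indexed by (gfn - base_gfn). Punching a hole turns one slot into two, so
whoever performs the split has to decide whether the old flags are dropped or
carried over, and carrying them over means re-basing the high half.

```python
from dataclasses import dataclass, field


@dataclass
class Memslot:
    """Toy stand-in for a memslot: a GFN range plus one dirty flag per page."""
    base_gfn: int
    npages: int
    dirty: list = field(default_factory=list)

    def __post_init__(self):
        # A new slot starts with an all-clean bitmap.
        if not self.dirty:
            self.dirty = [False] * self.npages

    def mark_dirty(self, gfn):
        self.dirty[gfn - self.base_gfn] = True

    def dirty_gfns(self):
        return {self.base_gfn + i for i, d in enumerate(self.dirty) if d}


def punch_hole(slot, hole_gfn, hole_npages, carry_bitmap):
    """Split `slot` around [hole_gfn, hole_gfn + hole_npages).

    Returns (low, high, lost), where `lost` is the set of GFNs that were
    dirty in `slot` but are not reported dirty by either new slot.
    """
    hole_end = hole_gfn + hole_npages
    slot_end = slot.base_gfn + slot.npages
    assert slot.base_gfn < hole_gfn and hole_end < slot_end

    low = Memslot(slot.base_gfn, hole_gfn - slot.base_gfn)
    high = Memslot(hole_end, slot_end - hole_end)
    if carry_bitmap:
        # Flag i of the old bitmap describes GFN base_gfn + i, so the high
        # half has to be sliced starting at the hole's end offset.
        low.dirty = slot.dirty[:low.npages]
        high.dirty = slot.dirty[hole_end - slot.base_gfn:]
    lost = slot.dirty_gfns() - low.dirty_gfns() - high.dirty_gfns()
    return low, high, lost


# One 8-page slot with three dirty pages, then a 2-page hole at GFN 0x103.
slot = Memslot(base_gfn=0x100, npages=8)
for gfn in (0x101, 0x104, 0x106):
    slot.mark_dirty(gfn)

_, _, lost_fresh = punch_hole(slot, 0x103, 2, carry_bitmap=False)
low, high, lost_carry = punch_hole(slot, 0x103, 2, carry_bitmap=True)
```

In this model, starting the new slots with fresh bitmaps silently forgets every
page dirtied before the split, while carrying the flags over loses only the
page inside the hole. Which of those behaviors userspace can rely on is exactly
the kind of requirement I'd want spelled out.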