Received: by 2002:a05:6358:1087:b0:cb:c9d3:cd90 with SMTP id j7csp8643228rwi; Tue, 25 Oct 2022 09:05:12 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6gDFaChH/0C8mF65biBuXopZeeCAYKIUlrTjXfOFmgBr+qTRKLO2xpW86kT4soLg1m4TiD X-Received: by 2002:a17:906:2f87:b0:78a:c0d5:d019 with SMTP id w7-20020a1709062f8700b0078ac0d5d019mr33126390eji.158.1666713912336; Tue, 25 Oct 2022 09:05:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666713912; cv=none; d=google.com; s=arc-20160816; b=VPbWlghx65VQ94Pqat2EWXJG4gJcOGAzF+nDp6ZA463kU/WNWJGFuHAY6l0r1wU1ed 08FzHuSS7NbHS0TKYudusSHCh59ULI92pH7QZ4l79Z//K2quGtQdvACTFex6zF8R8HQR lkDVMnItKdGSG/DousPayqT4+nGV/0slDuuyb4SLZIASdfpsiYbGY8DIvIfN2LVBrkAl q2qbNn6GDCApZYr2xmL3KjgfO4sedDHyZ5GF8XDsd+CIp+K8yJMXFBA4VF+1Lr7KhsCZ 6VA3V/r49GRUFElTUgKMOOXPHTdIoJ8JGAaVMHSmeAF63/f+WZuieBr4d/VsVi//gG+7 V8AA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=V0ERmtJH8Y8ohiV20nUYLWYgRYkI81zx7wLU2s7JOh8=; b=kIfhQu7FAACyQGHYjGU5c62dC36bd2U/RmiilVyHek7759P0ErDrRIbNAJDfttVQlH x2BDN7s++/x0LQUcFhnhJLjAN9uQrnkpQcD+j9ba4vvgHIwJlOxNEMYi5xEcLx9ysLqN uW22o4ddajysCDPydv6e6VOJaj/KRxiZ57Xpl7AfndcmQvlHg3cv7Rmvjk5iL7cU6ivu MsErfSyqLMYZs8G6iC3IwBrbLtqAKtm+KD2FYOUWWS3ubRY4WPLSsgs0Q9PyjGh3FYWO aLHqgtt4y/GqyGODgqX/b+5GIfMrYncL6/3uwn9NiKc1PxjMLUTydzdsf4/LC2/wghvm MxTw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=L903+97r; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id r7-20020a05640251c700b0045c10a55f2fsi3711756edd.322.2022.10.25.09.04.44; Tue, 25 Oct 2022 09:05:12 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=L903+97r; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231745AbiJYPzt (ORCPT + 99 others); Tue, 25 Oct 2022 11:55:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48830 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230107AbiJYPzs (ORCPT ); Tue, 25 Oct 2022 11:55:48 -0400 Received: from mail-pf1-x434.google.com (mail-pf1-x434.google.com [IPv6:2607:f8b0:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 17346114DE8 for ; Tue, 25 Oct 2022 08:55:47 -0700 (PDT) Received: by mail-pf1-x434.google.com with SMTP id f140so12256220pfa.1 for ; Tue, 25 Oct 2022 08:55:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=V0ERmtJH8Y8ohiV20nUYLWYgRYkI81zx7wLU2s7JOh8=; b=L903+97rjrcROLjG979Ms+HuOfnSZrQdCTDdbZl0Hi1jUoGMfLZPFvAAOKpj0xzsYH RejbrZr1bARGt7nRG7PDkRibySyxcOiQI6+E0KYfbTXo/3gZf/CMaYKJ8wrxG2ecJprm ZX2YMJXPgOVzyyDpUMGsk9NCIvx5g0iCLvQxOjVDZp0Mb2jn/djRVRpMRG/+2iZhbbw6 Pthp5wAHHLpVz0WfZkrLT/tkX+l5tb8Sv+HnFa5ArzK3+3WRJph+IR/Oh50/VMfOYonp 0+iF3HcjxWeHoIFK62f8nZ8T3g8ifmnLdUUgUt3XgVaXZBayr9dk1Fka9Ae3gQ7O5UWH 6zpg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=V0ERmtJH8Y8ohiV20nUYLWYgRYkI81zx7wLU2s7JOh8=; b=BQFTWF05BEPTqvgceEso1yqQtI+7nnjacCMumao8NltBJIIbdTEja3UgY7OTC6mLIN qOuLI8UbeJ/HzqADEy58GFeLX9OO5D3vbgaYXLenD453fTzA5YzPWcvGDFfPafztecVX mmysRH5ffXHvGTeFPB4Ep884TK/xlbjU1p+aCeGyDa7CLswCLuS0Vt3AJN6Hqj6+zVSN KIOnRi7tHIa+1IB06I0PjC+CYHZFr6tiEtrfqtDtWcwzuimR8izl4r2gUqEYeiuaKILl jzb39VgAajo6v5LHbbvX3TSazu1D0mt/9ti+eEN8s9RsNxkLODHb4BcLMTIutpHsFSxa erLg== X-Gm-Message-State: ACrzQf0HWwZ2aTgBVtKeZkP6/GYoU8/IwbqAsxkgVLpklatdN8BYjGfA DTmjTiDw26Pa7xXlLd2VCczWNA== X-Received: by 2002:aa7:83cd:0:b0:563:5f54:d78c with SMTP id j13-20020aa783cd000000b005635f54d78cmr40623601pfn.66.1666713346386; Tue, 25 Oct 2022 08:55:46 -0700 (PDT) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id i20-20020aa796f4000000b0056c063dd4cfsm1545927pfq.66.2022.10.25.08.55.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Oct 2022 08:55:45 -0700 (PDT) Date: Tue, 25 Oct 2022 15:55:41 +0000 From: Sean Christopherson To: Paolo Bonzini Cc: Christian Borntraeger , Emanuele Giuseppe Esposito , kvm@vger.kernel.org, Jonathan Corbet , Maxim Levitsky , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , David Hildenbrand , x86@kernel.org, "H. Peter Anvin" , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 0/4] KVM: API to block and resume all running vcpus in a vm Message-ID: References: <20221022154819.1823133-1-eesposit@redhat.com> <2701ce67-bfff-8c0c-4450-7c4a281419de@redhat.com> <384b2622-8d7f-ce02-1452-84a86e3a5697@linux.ibm.com> <5a26c107-9ab5-60ee-0e9c-a9955dfe313d@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5a26c107-9ab5-60ee-0e9c-a9955dfe313d@redhat.com> X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Oct 25, 2022, Paolo Bonzini wrote: > On 10/25/22 00:45, Sean Christopherson wrote: > > > Yes that helps and should be part of the cover letter for the next iterations. > > But that doesn't explain why KVM needs to get involved, it only explains why QEMU > > can't use its existing pause_all_vcpus(). I do not understand why this is a > > problem QEMU needs KVM's help to solve. > > I agree that it's not KVM's problem that QEMU cannot use pause_all_vcpus(). > Having a ioctl in KVM, rather than coding the same in QEMU, is *mostly* a > matter of programmer and computer efficiency because the code is pretty > simple. > > That said, I believe the limited memslot API makes it more than just a QEMU > problem. Because KVM_GET_DIRTY_LOG cannot be combined atomically with > KVM_SET_USER_MEMORY_REGION(MR_DELETE), any VMM that uses dirty-log regions > while the VM is running is liable to losing the dirty status of some pages. ... and doesn't already do the sane thing and pause vCPUs _and anything else that can touch guest memory_ before modifying memslots. I honestly think QEMU is the only VMM that would ever use this API. > That's also a reason to provide this API in KVM. It's frankly a terrible API though. Providing a way to force vCPUs out of KVM_RUN is at best half of the solution. Userspace still needs: - a refcounting scheme to track the number of "holds" put on the system - serialization to ensure KVM_RESUME_ALL_KICKED_VCPUS completes before a new KVM_KICK_ALL_RUNNING_VCPUS is initiated - to prevent _all_ ioctls() because it's not just KVM_RUN that consumes memslots - to stop anything else in the system that consumes KVM memslots, e.g. KVM GT - to signal vCPU tasks so that the system doesn't livelock if a vCPU is stuck outside of KVM, e.g. in get_user_pages_unlocked() (Peter Xu's series) And because of the nature of KVM, to support this API on all architectures, KVM needs to make change on all architectures, whereas userspace should be able to implement a generic solution.