Date: Mon, 18 Dec 2023 07:08:51 -0800
X-Mailing-List: linux-kernel@vger.kernel.org
References: 
<20231214103520.7198-1-yan.y.zhao@intel.com>
Subject: Re: [RFC PATCH] KVM: Introduce KVM VIRTIO device
From: Sean Christopherson
To: Kevin Tian
Cc: Yan Y Zhao, "kvm@vger.kernel.org", "linux-kernel@vger.kernel.org",
 "dri-devel@lists.freedesktop.org", "pbonzini@redhat.com", "olvaffe@gmail.com",
 Zhiyuan Lv, Zhenyu Z Wang, Yongwei Ma, "vkuznets@redhat.com",
 "wanpengli@tencent.com", "jmattson@google.com", "joro@8bytes.org",
 "gurchetansingh@chromium.org", "kraxel@redhat.com", Yiwei Zhang
Content-Type: text/plain; charset="us-ascii"

+Yiwei

On Fri, Dec 15, 2023, Kevin Tian wrote:
> > From: Zhao, Yan Y
> > Sent: Thursday, December 14, 2023 6:35 PM
> >
> > - For host non-MMIO pages,
> >   * virtio guest frontend and host backend driver should be synced to use
> >     the same memory type to map a buffer. Otherwise, there will be
> >     potential problem for incorrect memory data. But this will only impact
> >     the buggy guest alone.
> >   * for live migration,
> >     as QEMU will read all guest memory during live migration, page aliasing
> >     could happen.
> >     Current thinking is to disable live migration if a virtio device has
> >     indicated its noncoherent state.
> >     As a follow-up, we can discuss other solutions. e.g.
> >     (a) switching back to coherent path before starting live migration.
>
> both guest/host switching to coherent or host-only?
>
> host-only certainly is problematic if guest is still using non-coherent.
>
> on the other hand I'm not sure whether the host/guest gfx stack is
> capable of switching between coherent and non-coherent path in-fly
> when the buffer is right being rendered.
>
> >     (b) read/write of guest memory with clflush during live migration.
>
> write is irrelevant as it's only done in the resume path where the
> guest is not running.
>
> >
> > Implementation Consideration
> > ===
> > There is a previous series [1] from google to serve the same purpose to
> > let KVM be aware of virtio GPU's noncoherent DMA status.
> > That series requires a new memslot flag, and special memslots in user space.
> >
> > We don't choose to use memslot flag to request honoring guest memory
> > type.
>
> memslot flag has the potential to restrict the impact e.g. when using
> clflush-before-read in migration?

Yep, exactly. E.g. if KVM needs to ensure coherency when freeing memory back to
the host kernel, then the memslot flag will allow for a much more targeted
operation.

> Of course the implication is to honor guest type only for the selected slot
> in KVM instead of applying to the entire guest memory as in previous series
> (which selects this way because vmx_get_mt_mask() is in perf-critical path
> hence not good to check memslot flag?)

Checking a memslot flag won't impact performance. KVM already has the memslot
when creating SPTEs, e.g. the sole caller of vmx_get_mt_mask(), make_spte(), has
access to the memslot.

That isn't coincidental, KVM _must_ have the memslot to construct the SPTE, e.g.
to retrieve the associated PFN, update write-tracking for shadow pages, etc.

I added Yiwei, who I think is planning on posting another RFC for the memslot
idea (I actually completely forgot that the memslot idea had been thought of and
posted a few years back).

> > Instead we hope to make the honoring request to be explicit (not tied to a
> > memslot flag). This is because once guest memory type is honored, not only
> > memory used by guest virtio device, but all guest memory is facing page
> > aliasing issue potentially. KVM needs a generic solution to take care of
> > page aliasing issue rather than counting on memory type of a special
> > memslot being aligned in host and guest.
> > (we can discuss what a generic solution to handle page aliasing issue will
> > look like in later follow-up series).
> >
> > On the other hand, we choose to introduce a KVM virtio device rather than
> > just provide an ioctl to wrap kvm_arch_[un]register_noncoherent_dma()
> > directly, which is based on considerations that
>
> I wonder it's over-engineered for the purpose.
>
> why not just introducing a KVM_CAP and allowing the VMM to enable?
> KVM doesn't need to know the exact source of requiring it...

Agreed. If we end up needing to grant the whole VM access for some reason, just
give userspace a direct toggle.
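
To illustrate the point about the memslot flag check being cheap, here is a
rough userspace model, not KVM code. Everything in it is made up for
illustration: struct slot, SLOT_HONOR_GUEST_MEMTYPE, find_slot() and
pick_memtype() are hypothetical names, and the flag is not an existing
KVM_MEM_* flag. The only thing it is meant to show is that once the slot
lookup has been done for other reasons, choosing the memory type is a single
bit test on data already in hand.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-slot flag; NOT an existing KVM_MEM_* flag. */
#define SLOT_HONOR_GUEST_MEMTYPE (1u << 0)

/* Simplified outcome: force write-back, or honor the guest's memory type. */
enum memtype { MEMTYPE_WB, MEMTYPE_GUEST };

/* Toy stand-in for a memslot: a range of guest frame numbers plus flags. */
struct slot {
	uint64_t base_gfn;
	uint64_t npages;
	uint32_t flags;
};

/* Return the slot covering gfn, or NULL if no slot covers it. */
const struct slot *find_slot(const struct slot *slots, size_t n, uint64_t gfn)
{
	for (size_t i = 0; i < n; i++) {
		if (gfn >= slots[i].base_gfn &&
		    gfn - slots[i].base_gfn < slots[i].npages)
			return &slots[i];
	}
	return NULL;
}

/*
 * The caller already holds the slot (it needed it to build the mapping),
 * so the memory type decision costs one AND and one branch.
 */
enum memtype pick_memtype(const struct slot *slot)
{
	return (slot->flags & SLOT_HONOR_GUEST_MEMTYPE) ? MEMTYPE_GUEST
							: MEMTYPE_WB;
}
```

In this model only frames in a flagged slot get MEMTYPE_GUEST, all other
slots stay MEMTYPE_WB, which is the "targeted" behavior discussed above as
opposed to a VM-wide toggle.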