Message-ID: <8994f43d26ebf6040b9d5d5e3866ee81abcf1a1c.camel@redhat.com>
Subject: Re: [PATCH 0/9] RFC: NVME VFIO mediated device
From: Maxim Levitsky
To: Bart Van Assche, linux-nvme@lists.infradead.org
Cc: Fam Zheng, Jens Axboe, Alex Williamson, Sagi Grimberg,
    kvm@vger.kernel.org, Wolfram Sang, Greg Kroah-Hartman, Liang Cunming,
    Nicolas Ferre, linux-kernel@vger.kernel.org, Liu Changpeng,
    Keith Busch, Kirti Wankhede, Christoph Hellwig,
    Paolo Bonzini, Mauro Carvalho Chehab, John Ferlan, Paul E. McKenney,
    Amnon Ilan, David S. Miller
Date: Wed, 20 Mar 2019 18:42:02 +0200
In-Reply-To: <1553095686.65329.36.camel@acm.org>
References: <20190319144116.400-1-mlevitsk@redhat.com>
    <1553095686.65329.36.camel@acm.org>

On Wed, 2019-03-20 at 08:28 -0700, Bart Van Assche wrote:
> On Tue, 2019-03-19 at 16:41 +0200, Maxim Levitsky wrote:
> > * All guest memory is mapped into the physical NVMe device, but not
> >   1:1 as vfio-pci would do. This allows very efficient DMA. To
> >   support this, patch 2 adds the ability for an mdev device to
> >   listen for the guest's memory map events. Any such memory is
> >   immediately pinned and then DMA mapped. (Support for fabric
> >   drivers where this is not possible exists too; in that case the
> >   fabric driver does its own DMA mapping.)
>
> Does this mean that all guest memory is pinned all the time? If so,
> are you sure that's acceptable?

I think so. VFIO PCI passthrough also pins all of the guest memory,
and SPDK does the same (it pins and DMA-maps all of it). I agree that
this is not an ideal solution, but it is the fastest and simplest one
possible.

> Additionally, what is the performance overhead of the IOMMU notifier
> added by patch 8/9? How often was that notifier called per second in
> your tests, and how much time was spent per call in the notifier
> callbacks?

To be honest I haven't optimized my IOMMU notifier at all, so when it
is called, it stops the I/O thread, does its work and then restarts
it, which is very slow. Fortunately it is not called at all during
normal operation, since VFIO DMA map/unmap events are really rare and
happen only at guest boot.

The same is even true for nested guests: nested guest startup causes a
wave of map/unmap events while the shadow IOMMU is updated, but after
that the guest just uses those mappings without changing them.

The only case where performance is really bad is booting a guest with
iommu=on intel_iommu=on and then using the nvme driver there. In that
case the driver in the guest does its own IOMMU maps/unmaps (on the
virtual IOMMU), and for each such event my VFIO map/unmap callback is
called. This could be made much faster, for instance with some kind of
queued invalidation in my driver. Meanwhile, iommu=pt in the guest
avoids the issue.

Best regards,
	Maxim Levitsky

> Thanks,
>
> Bart.
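
As context for the pinning discussion above, here is a minimal sketch
of what the "pinned and then DMA mapped" step could look like in an
mdev driver of that era. vfio_pin_pages()/vfio_unpin_pages() and the
DMA-mapping calls are real kernel interfaces of the time; the
nvme_mdev_* name, the single-page granularity, and the error handling
are invented for illustration and are not taken from the actual patch
series.

#include <linux/vfio.h>
#include <linux/dma-mapping.h>
#include <linux/iommu.h>
#include <linux/mm.h>

/*
 * Hypothetical sketch: pin one page of guest memory through the vfio
 * core, then DMA-map it for the *parent* NVMe device, which is the
 * device that actually performs the I/O.
 */
static int nvme_mdev_pin_and_map_page(struct device *mdev_dev,
				      struct device *parent_nvme_dev,
				      u64 gpa, dma_addr_t *dma)
{
	unsigned long user_pfn = gpa >> PAGE_SHIFT;
	unsigned long host_pfn;
	int ret;

	/* Pin the guest page; vfio resolves the guest PFN to a host PFN. */
	ret = vfio_pin_pages(mdev_dev, &user_pfn, 1,
			     IOMMU_READ | IOMMU_WRITE, &host_pfn);
	if (ret != 1)
		return ret < 0 ? ret : -EFAULT;

	/* Map the pinned host page for DMA by the physical controller. */
	*dma = dma_map_page(parent_nvme_dev, pfn_to_page(host_pfn),
			    0, PAGE_SIZE, DMA_BIDIRECTIONAL);
	if (dma_mapping_error(parent_nvme_dev, *dma)) {
		vfio_unpin_pages(mdev_dev, &user_pfn, 1);
		return -ENOMEM;
	}
	return 0;
}

A real driver would presumably batch this over whole memory-map
regions rather than go page by page, which is what makes mapping all
guest memory up front at boot cheap relative to per-I/O mapping.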
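
Similarly, a minimal sketch of the unmap-notifier flow described above
("it stops the IO thread, does its work and then restarts it"),
assuming the upstream mdev notifier API of the time, which only
delivers VFIO_IOMMU_NOTIFY_DMA_UNMAP events; the map-event hook added
by patch 2 of the series is not shown. The nvme_mdev_vm structure and
all nvme_mdev_* helpers are hypothetical.

#include <linux/kernel.h>
#include <linux/notifier.h>
#include <linux/vfio.h>

struct nvme_mdev_vm {
	struct notifier_block nb;	/* registered with the vfio core */
	struct device *mdev_dev;	/* the mediated device */
	/* ... I/O thread, queues, pinned-range tracking, ... */
};

/* Hypothetical helpers: quiesce/resume the polling I/O thread and
 * drop the pinning + DMA mapping of a guest IOVA range. */
void nvme_mdev_pause_io_thread(struct nvme_mdev_vm *vm);
void nvme_mdev_resume_io_thread(struct nvme_mdev_vm *vm);
void nvme_mdev_unpin_range(struct nvme_mdev_vm *vm, u64 iova, u64 size);

static int nvme_mdev_iommu_notify(struct notifier_block *nb,
				  unsigned long action, void *data)
{
	struct nvme_mdev_vm *vm = container_of(nb, struct nvme_mdev_vm, nb);

	if (action == VFIO_IOMMU_NOTIFY_DMA_UNMAP) {
		struct vfio_iommu_type1_dma_unmap *unmap = data;

		/* The unoptimized behaviour described in the reply:
		 * stop the I/O thread, drop the range, restart. */
		nvme_mdev_pause_io_thread(vm);
		nvme_mdev_unpin_range(vm, unmap->iova, unmap->size);
		nvme_mdev_resume_io_thread(vm);
	}
	return NOTIFY_OK;
}

static int nvme_mdev_register_notifier(struct nvme_mdev_vm *vm)
{
	unsigned long events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;

	vm->nb.notifier_call = nvme_mdev_iommu_notify;
	return vfio_register_notifier(vm->mdev_dev, VFIO_IOMMU_NOTIFY,
				      &events, &vm->nb);
}

The queued invalidation suggested in the reply would presumably batch
these unpin operations instead of quiescing the I/O thread on every
callback, which matters only when a guest-side IOMMU driver generates
a constant stream of map/unmap events.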