Subject: Re: [PATCH v2 00/10] RFC: NVME MDEV
From: Maxim Levitsky
To: Christoph Hellwig
Cc: Fam Zheng, Keith Busch, Sagi Grimberg, kvm@vger.kernel.org,
    Wolfram Sang, Greg Kroah-Hartman, Liang Cunming, Nicolas Ferre,
    linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
    "David S . Miller", Jens Axboe, Alex Williamson, Kirti Wankhede,
    Mauro Carvalho Chehab, Paolo Bonzini, Liu Changpeng, "Paul E .
    McKenney", Amnon Ilan, John Ferlan
Date: Mon, 06 May 2019 12:04:06 +0300
In-Reply-To: <20190503121838.GA21041@lst.de>
References: <20190502114801.23116-1-mlevitsk@redhat.com> <20190503121838.GA21041@lst.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2019-05-03 at 14:18 +0200, Christoph Hellwig wrote:
> I simply don't get the point of this series.
>
> MDEV is an interface for exposing parts of a device to a userspace
> program / VM. But that this series appears to do is to expose a
> purely software defined nvme controller to userspace. Which in
> principle is a good idea, but we have a much better framework for that,
> which is called vhost.

Let me explain the reasons for choosing the IO interfaces as I did:

1. Frontend interface (the interface that faces the guest/userspace/etc):

VFIO/mdev is just a way to expose a (partially) software defined PCIe
device to a guest.

Vhost, on the other hand, is an interface that is hardcoded and optimized
for virtio. It can be extended to be PCI generic, but why do so if we
already have VFIO?

So the biggest advantage of using VFIO _currently_ is that I don't add any
new API/ABI to the kernel, and userspace (qemu) doesn't need to learn to
use a new API either.

It is also worth noting that VFIO supports nesting out of the box, so I
don't need to worry about it (vhost has to deal with that on the protocol
level using its IOTLB facility).

On top of that, it is expected that newer hardware will support PASID
based device subdivision, which will allow us to _directly_ pass through
the submission queues of the device and _force_ us to use the NVME
protocol for the frontend.

2. Backend interface (the connection to the real nvme device):

Currently the backend interface _doesn't have_ to allocate a dedicated
queue and bypass the block layer. It can use the block layer's
submit_bio/blk_poll, as I demonstrate in the last patch in the series.
It's 2x slower though.

However, similar to (1), when the driver supports devices with hardware
based passthrough, it will have to dedicate a bunch of queues to the
guest, configure them with the appropriate PASID, and then let the guest
use these queues directly.

Best regards,
	Maxim Levitsky