Subject: Re: [PATCH] vhost: introduce vDPA based backend
From: Jason Wang
To: Michael S. Tsirkin, Shahaf Shuler
Cc: Tiwei Bie, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
    Jason Gunthorpe, rob.miller@broadcom.com, haotian.wang@sifive.com,
    eperezma@redhat.com, lulu@redhat.com, Parav Pandit,
    rdunlap@infradead.org, hch@infradead.org, Jiri Pirko,
    hanand@xilinx.com, mhabets@solarflare.com,
    maxime.coquelin@redhat.com, lingshan.zhu@intel.com,
    dan.daly@intel.com, cunming.liang@intel.com, zhihong.wang@intel.com
Date: Thu, 6 Feb 2020 11:09:42 +0800
Message-ID: <80b4a5f9-8cc0-326a-a133-07a0ae3c7909@redhat.com>
In-Reply-To: <20200205053129-mutt-send-email-mst@kernel.org>
References: <20200131033651.103534-1-tiwei.bie@intel.com>
    <7aab2892-bb19-a06a-a6d3-9c28bc4c3400@redhat.com>
    <20200205020247.GA368700@___>
    <112858a4-1a01-f4d7-e41a-1afaaa1cad45@redhat.com>
    <20200205053129-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/2/5 6:33 PM, Michael S.
Tsirkin wrote:
> On Wed, Feb 05, 2020 at 09:30:14AM +0000, Shahaf Shuler wrote:
>> Wednesday, February 5, 2020 9:50 AM, Jason Wang:
>>> Subject: Re: [PATCH] vhost: introduce vDPA based backend
>>> On 2020/2/5 3:15 PM, Shahaf Shuler wrote:
>>>> Wednesday, February 5, 2020 4:03 AM, Tiwei Bie:
>>>>> Subject: Re: [PATCH] vhost: introduce vDPA based backend
>>>>>
>>>>> On Tue, Feb 04, 2020 at 11:30:11AM +0800, Jason Wang wrote:
>>>>>> On 2020/1/31 11:36 AM, Tiwei Bie wrote:
>>>>>>> This patch introduces a vDPA based vhost backend. This backend is
>>>>>>> built on top of the same interface defined in virtio-vDPA and
>>>>>>> provides a generic vhost interface for userspace to accelerate the
>>>>>>> virtio devices in guest.
>>>>>>>
>>>>>>> This backend is implemented as a vDPA device driver on top of the
>>>>>>> same ops used in virtio-vDPA. It will create char device entry
>>>>>>> named vhost-vdpa/$vdpa_device_index for userspace to use. Userspace
>>>>>>> can use vhost ioctls on top of this char device to setup the backend.
>>>>>>>
>>>>>>> Signed-off-by: Tiwei Bie
>>>> [...]
>>>>
>>>>>>> +static long vhost_vdpa_do_dma_mapping(struct vhost_vdpa *v) {
>>>>>>> +	/* TODO: fix this */
>>>>>> Before trying to do this it looks to me we need the following during
>>>>>> the probe
>>>>>>
>>>>>> 1) if set_map() is not supported by the vDPA device, probe the IOMMU
>>>>>> that is supported by the vDPA device
>>>>>> 2) allocate IOMMU domain
>>>>>>
>>>>>> And then:
>>>>>>
>>>>>> 3) pin pages through GUP and do proper accounting
>>>>>> 4) store GPA->HPA mapping in the umem
>>>>>> 5) generate diffs of the memory table and use the IOMMU API to set up
>>>>>> the DMA mapping in this method
>>>>>>
>>>>>> For 1), I'm not sure the parent is sufficient for doing this, or
>>>>>> whether we need to introduce a new API like iommu_device in mdev.
>>>>> Agree. We may also need to introduce something like the iommu_device.
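To make steps 4) and 5) above a bit more concrete, here is a rough userspace sketch of the "generate diffs of the memory table" part. It is not taken from the patch: struct mem_region, struct dma_op and diff_mem_tables() are hypothetical names made up for illustration, and the sketch assumes a region only counts as unchanged when GPA, size and HPA all match. In the kernel, the resulting operations would be handed to the IOMMU API (for example iommu_map()/iommu_unmap()) rather than collected in an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: one guest physical range and its host backing. */
struct mem_region {
	uint64_t gpa;   /* guest physical address */
	uint64_t size;  /* length in bytes */
	uint64_t hpa;   /* host physical address backing the range */
};

enum dma_op_type { DMA_OP_UNMAP, DMA_OP_MAP };

/* One pending change to the DMA mapping. */
struct dma_op {
	enum dma_op_type type;
	struct mem_region region;
};

static int region_equal(const struct mem_region *a, const struct mem_region *b)
{
	return a->gpa == b->gpa && a->size == b->size && a->hpa == b->hpa;
}

static int table_contains(const struct mem_region *tbl, size_t n,
			  const struct mem_region *r)
{
	for (size_t i = 0; i < n; i++)
		if (region_equal(&tbl[i], r))
			return 1;
	return 0;
}

/*
 * Compare the old and new memory tables. Regions found only in the old
 * table become unmap operations (emitted first); regions found only in the
 * new table become map operations. Unchanged regions produce nothing.
 * Returns the number of operations written, or -1 if ops[] is too small.
 */
static int diff_mem_tables(const struct mem_region *old_tbl, size_t n_old,
			   const struct mem_region *new_tbl, size_t n_new,
			   struct dma_op *ops, size_t max_ops)
{
	size_t n = 0;

	assert(ops != NULL || max_ops == 0);

	for (size_t i = 0; i < n_old; i++) {
		if (table_contains(new_tbl, n_new, &old_tbl[i]))
			continue;
		if (n == max_ops)
			return -1;
		ops[n].type = DMA_OP_UNMAP;
		ops[n].region = old_tbl[i];
		n++;
	}

	for (size_t i = 0; i < n_new; i++) {
		if (table_contains(old_tbl, n_old, &new_tbl[i]))
			continue;
		if (n == max_ops)
			return -1;
		ops[n].type = DMA_OP_MAP;
		ops[n].region = new_tbl[i];
		n++;
	}

	return (int)n;
}
```

The quadratic lookup is only there to keep the sketch short; with a large number of memory segments a sorted table or an interval tree would be the more likely choice.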
>>>>>
>>>> Would it be better for the map/unmap logic to happen inside each device?
>>>> Devices that need the IOMMU will call iommu APIs from inside the driver
>>>> callback.
>>>
>>> Technically, this can work. But if it can be done by vhost-vdpa it will
>>> make the vDPA driver more compact and easier to implement.
>> Need to see the layering of such a proposal, but I am not sure.
>> Vhost-vdpa is a generic framework, while the DMA mapping is vendor specific.
>> Maybe vhost-vdpa can have some shared code needed to operate on the iommu,
>> so drivers can re-use it. To me it seems simpler than exposing a new
>> iommu device.
>>
>>>> Devices that have other ways to do the DMA mapping will call the
>>>> proprietary APIs.
>>>
>>> To confirm, do you prefer:
>>>
>>> 1) map/unmap
>> It is not only that. AFAIR there are also flush and invalidate calls, right?
>>
>>> or
>>>
>>> 2) pass all maps at one time?
>> To me this seems more straightforward.
>> It is correct that under hotplug and a large number of memory segments
>> the driver will need to understand the diff (or not, and just reload
>> the new configuration).
>> However, my assumption here is that memory
>> hotplug is a heavy flow anyway, and the driver's extra cycles will not be
>> that visible.
> I think we can just allow both, after all vhost already has both interfaces ...
> We just need a flag that tells userspace whether it needs to
> update all maps aggressively or can wait for a fault.

It looks to me that such a flag is not a must; we can introduce it later,
when devices support page faults.

Thanks
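
For reference, "allow both" could look roughly like the sketch below: the generic layer prefers a whole-table callback when the driver provides one and falls back to a per-range callback otherwise. This is only an illustration under my own assumptions. struct demo_dma_ops, struct demo_dev, demo_update_mappings() and the two counting callbacks are hypothetical and are not the vDPA ops from the patch; only the set_map name is borrowed from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical DMA range: I/O virtual address -> physical address. */
struct dma_range {
	uint64_t iova;
	uint64_t size;
	uint64_t pa;
};

struct demo_dev;

/* Hypothetical driver callbacks; either one may be left NULL. */
struct demo_dma_ops {
	/* Option 2): the driver receives the whole table at one time. */
	int (*set_map)(struct demo_dev *dev, const struct dma_range *tbl,
		       size_t n);
	/* Option 1): the driver receives one range per call. */
	int (*dma_map)(struct demo_dev *dev, const struct dma_range *r);
};

struct demo_dev {
	const struct demo_dma_ops *ops;
	size_t set_map_calls;  /* bookkeeping for the example callbacks */
	size_t dma_map_calls;
};

/* Example callback: counts whole-table updates. */
static int demo_count_set_map(struct demo_dev *dev,
			      const struct dma_range *tbl, size_t n)
{
	(void)tbl;
	(void)n;
	dev->set_map_calls++;
	return 0;
}

/* Example callback: counts per-range updates. */
static int demo_count_dma_map(struct demo_dev *dev, const struct dma_range *r)
{
	(void)r;
	dev->dma_map_calls++;
	return 0;
}

/*
 * Push a mapping table to the device. Prefer the whole-table callback;
 * otherwise map range by range. Returns 0 on success, the first non-zero
 * callback result on failure, or -1 if the driver provides neither callback.
 */
static int demo_update_mappings(struct demo_dev *dev,
				const struct dma_range *tbl, size_t n)
{
	assert(dev != NULL && dev->ops != NULL);

	if (dev->ops->set_map)
		return dev->ops->set_map(dev, tbl, n);

	if (!dev->ops->dma_map)
		return -1;

	for (size_t i = 0; i < n; i++) {
		int ret = dev->ops->dma_map(dev, &tbl[i]);

		if (ret)
			return ret;
	}
	return 0;
}
```

A driver that can simply reload its configuration would fill in only set_map, while one sitting behind a platform IOMMU would fill in only the per-range callback. Unmap, flush and invalidate are left out to keep the sketch small.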