Subject: Re: [PATCH] vhost: introduce mdev based hardware backend
From: Jason Wang
To: Michael S. Tsirkin
Cc: Tiwei Bie, alex.williamson@redhat.com, maxime.coquelin@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, dan.daly@intel.com, cunming.liang@intel.com, zhihong.wang@intel.com, lingshan.zhu@intel.com
Date: Fri, 27 Sep 2019 20:15:41 +0800
Message-ID: <552fc91c-2eb6-8870-3077-e808e7e0917b@redhat.com>
In-Reply-To: <20190927053829-mutt-send-email-mst@kernel.org>
References: <20190926045427.4973-1-tiwei.bie@intel.com> <1b4b8891-8c14-1c85-1d6a-2eed1c90bcde@redhat.com> <20190927045438.GA17152@___> <05ab395e-6677-e8c3-becf-57bc1529921f@redhat.com> <20190927053829-mutt-send-email-mst@kernel.org>

On 2019/9/27 5:38 PM, Michael S.
Tsirkin wrote:
> On Fri, Sep 27, 2019 at 04:47:43PM +0800, Jason Wang wrote:
>> On 2019/9/27 12:54 PM, Tiwei Bie wrote:
>>> On Fri, Sep 27, 2019 at 11:46:06AM +0800, Jason Wang wrote:
>>>> On 2019/9/26 12:54 PM, Tiwei Bie wrote:
>>>>> +
>>>>> +static long vhost_mdev_start(struct vhost_mdev *m)
>>>>> +{
>>>>> +	struct mdev_device *mdev = m->mdev;
>>>>> +	const struct virtio_mdev_device_ops *ops = mdev_get_dev_ops(mdev);
>>>>> +	struct virtio_mdev_callback cb;
>>>>> +	struct vhost_virtqueue *vq;
>>>>> +	int idx;
>>>>> +
>>>>> +	ops->set_features(mdev, m->acked_features);
>>>>> +
>>>>> +	mdev_add_status(mdev, VIRTIO_CONFIG_S_FEATURES_OK);
>>>>> +	if (!(mdev_get_status(mdev) & VIRTIO_CONFIG_S_FEATURES_OK))
>>>>> +		goto reset;
>>>>> +
>>>>> +	for (idx = 0; idx < m->nvqs; idx++) {
>>>>> +		vq = &m->vqs[idx];
>>>>> +
>>>>> +		if (!vq->desc || !vq->avail || !vq->used)
>>>>> +			break;
>>>>> +
>>>>> +		if (ops->set_vq_state(mdev, idx, vq->last_avail_idx))
>>>>> +			goto reset;
>>>> If we do set_vq_state() in SET_VRING_BASE, we won't need this step here.
>>> Yeah, I plan to do it in the next version.
>>>
>>>>> +
>>>>> +		/*
>>>>> +		 * In vhost-mdev, userspace should pass ring addresses
>>>>> +		 * in guest physical addresses when IOMMU is disabled or
>>>>> +		 * IOVAs when IOMMU is enabled.
>>>>> +		 */
>>>> A question here: consider we're using noiommu mode. If a guest physical
>>>> address is passed here, how can a device use that?
>>>>
>>>> I believe you meant "host physical address" here? And it also has the
>>>> implication that the HPA should be contiguous (e.g. using hugetlbfs).
>>> The comment is talking about the virtual IOMMU (i.e. iotlb in vhost).
>>> It should be rephrased to cover the noiommu case as well. Thanks for
>>> spotting this.
>>>
>>>
>>>>> +
>>>>> +	switch (cmd) {
>>>>> +	case VHOST_MDEV_SET_STATE:
>>>>> +		r = vhost_set_state(m, argp);
>>>>> +		break;
>>>>> +	case VHOST_GET_FEATURES:
>>>>> +		r = vhost_get_features(m, argp);
>>>>> +		break;
>>>>> +	case VHOST_SET_FEATURES:
>>>>> +		r = vhost_set_features(m, argp);
>>>>> +		break;
>>>>> +	case VHOST_GET_VRING_BASE:
>>>>> +		r = vhost_get_vring_base(m, argp);
>>>>> +		break;
>>>> Does it mean the SET_VRING_BASE may only take effect after
>>>> VHOST_MDEV_SET_STATE?
>>> Yeah, in this version, SET_VRING_BASE won't set the base to the
>>> device directly. But I plan to not delay this anymore in the next
>>> version to support SET_STATUS.
>>>
>>>>> +	default:
>>>>> +		r = vhost_dev_ioctl(&m->dev, cmd, argp);
>>>>> +		if (r == -ENOIOCTLCMD)
>>>>> +			r = vhost_vring_ioctl(&m->dev, cmd, argp);
>>>>> +	}
>>>>> +
>>>>> +	mutex_unlock(&m->mutex);
>>>>> +	return r;
>>>>> +}
>>>>> +
>>>>> +static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = {
>>>>> +	.name		= "vfio-vhost-mdev",
>>>>> +	.open		= vhost_mdev_open,
>>>>> +	.release	= vhost_mdev_release,
>>>>> +	.ioctl		= vhost_mdev_unlocked_ioctl,
>>>>> +};
>>>>> +
>>>>> +static int vhost_mdev_probe(struct device *dev)
>>>>> +{
>>>>> +	struct mdev_device *mdev = mdev_from_dev(dev);
>>>>> +	const struct virtio_mdev_device_ops *ops = mdev_get_dev_ops(mdev);
>>>>> +	struct vhost_mdev *m;
>>>>> +	int nvqs, r;
>>>>> +
>>>>> +	m = kzalloc(sizeof(*m), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
>>>>> +	if (!m)
>>>>> +		return -ENOMEM;
>>>>> +
>>>>> +	mutex_init(&m->mutex);
>>>>> +
>>>>> +	nvqs = ops->get_queue_max(mdev);
>>>>> +	m->nvqs = nvqs;
>>>> The name could be confusing: get_queue_max() is to get the maximum number of
>>>> entries for a virtqueue supported by this device.
>>> OK. It might be better to rename it to something like:
>>>
>>> get_vq_num_max()
>>>
>>> which is more consistent with set_vq_num().
>>>
>>>> It looks to me that we need another API to query the maximum number of
>>>> virtqueues supported by the device.
>>> Yeah.
>>>
>>> Thanks,
>>> Tiwei
>>
>> One problem here:
>>
>> Consider if we want to support multiqueue, how does userspace know about
>> this?
> There's a feature bit for this, isn't there?

Yes, but it needs to know how many queue pairs are available.

Thanks

>
>> Note this information could be fetched from get_config() via a device
>> specific way. Do we want an ioctl for accessing that area?
>>
>> Thanks
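[For virtio-net specifically, the device-specific path mentioned above works because max_virtqueue_pairs lives in the device config space (offset 8 in struct virtio_net_config per the virtio spec, after the 6-byte MAC and 2-byte status fields), so a get_config-style op suffices. A userspace sketch with mocked names:]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Subset of the virtio-net config space layout from the virtio spec. */
struct virtio_net_config_mock {
	uint8_t  mac[6];
	uint16_t status;
	uint16_t max_virtqueue_pairs;
};

/* Mock of an mdev get_config op: copy out a range of config space. */
static void mock_get_config(const void *cfg, unsigned int offset,
			    void *buf, unsigned int len)
{
	memcpy(buf, (const uint8_t *)cfg + offset, len);
}

/* Userspace would learn the queue-pair count with a targeted read. */
static uint16_t read_max_vq_pairs(const struct virtio_net_config_mock *cfg)
{
	uint16_t pairs;

	mock_get_config(cfg,
			offsetof(struct virtio_net_config_mock,
				 max_virtqueue_pairs),
			&pairs, sizeof(pairs));
	return pairs;
}
```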