From: Jason Wang
Date: Mon, 13 Sep 2021 11:13:12 +0800
Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops
To: Wu Zongyong
Cc: Cindy Lu, virtualization, linux-kernel, mst, wei.yang1@linux.alibaba.com
In-Reply-To: <20210913025920.GB17820@L-PF27918B-1352.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Sep 13, 2021 at 10:59 AM Wu Zongyong wrote:
>
> On Mon, Sep 13, 2021 at 09:43:40AM +0800, Jason Wang wrote:
> > On Fri, Sep 10, 2021 at 11:11 PM Cindy Lu wrote:
> > >
> > > On Fri, Sep 10, 2021 at 5:20 PM Wu Zongyong wrote:
> > > >
> > > > On Fri, Sep 10, 2021 at 04:25:18PM +0800, Cindy Lu wrote:
> > > > >
> > > > > On Fri, Sep 10, 2021 at 3:33 PM Wu Zongyong wrote:
> > > > > >
> > > > > > On Fri, Sep 10, 2021 at 09:45:53AM +0800, Jason Wang wrote:
> > > > > > > On Thu, Sep 9, 2021 at 5:57 PM Wu Zongyong wrote:
> > > > > > > >
> > > > > > > > On Thu, Sep 09, 2021 at 05:28:26PM +0800, Jason Wang wrote:
> > > > > > > > > On Thu, Sep 9, 2021 at 4:02 PM Wu Zongyong wrote:
> > > > > > > > > >
> > > > > > > > > > On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > > > > > > > > > > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > This new callback is used to indicate whether the vring size can be
> > > > > > > > > > > > changed or not. It is useful when we have a legacy virtio pci device as
> > > > > > > > > > > > the vdpa device for there is no way to negotiate the vring num by the
> > > > > > > > > > > > specification.
> > > > > > > > > > > >
> > > > > > > > > > > So I'm not sure it's worth bothering. E.g. what if we just fail
> > > > > > > > > > > VHOST_SET_VRING_NUM if the value doesn't match what hardware has?
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > I think we should not call VHOST_SET_VRING_NUM in that case.
> > > > > > > > > >
> > > > > > > > > > If the hardware reports that the virtqueue size cannot be changed, we
> > > > > > > > > > should call VHOST_GET_VRING_NUM to get the static virtqueue size
> > > > > > > > > > firstly, then allocate the same size memory for the virtqueues and write
> > > > > > > > > > the address to hardware finally.
> > > > > > > > > >
> > > > > > > > > > For QEMU, we will ignore the properties rx/tx_queue_size and just get it
> > > > > > > > > > from the hardware if this new callback returns true.
> > > > > > > > > >
> > > > > > > > > This will break live migration.
> > > > > > > > > My understanding is that we can
> > > > > > > > > advertise those capabilities/limitations via the netlink management
> > > > > > > > > protocol, then the management layer can choose to use the correct queue
> > > > > > > > > size.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > I agree, it is a good idea.
> > > > > > > > BTW, can we also advertise the mac address of the network device? I found the
> > > > > > > > mac address generated by libvirt or qemu will break the network datapath
> > > > > > > > if I don't specify the right mac explicitly in the XML or qemu
> > > > > > > > commandline.
> > > > > > > >
> > > > > > > We never saw this before, AFAIK when vhost-vdpa is used, currently
> > > > > > > qemu will probably ignore the mac address set via command line since
> > > > > > > the config space is read from the device instead of qemu itself?
> > > > > > >
> > > > > > >
> > > > > > I saw the code below in qemu:
> > > > > >
> > > > > > static void virtio_net_device_realize(DeviceState *dev, Error **errp)
> > > > > > {
> > > > > >     ...
> > > > > >     if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> > > > > >         struct virtio_net_config netcfg = {};
> > > > > >         memcpy(&netcfg.mac, &n->nic_conf.macaddr, ETH_ALEN);
> > > > > >         vhost_net_set_config(get_vhost_net(nc->peer),
> > > > > >             (uint8_t *)&netcfg, 0, ETH_ALEN, VHOST_SET_CONFIG_TYPE_MASTER);
> > > > > >     }
> > > > > >     ...
> > > > > > }
> > > > > >
> > > > > > This writes the mac address set via cmdline into the vdpa device config, and
> > > > > > then the guest will read it back.
> > > > > > If I remove this code, it behaves like you said.
> > > > > >
> > > > > >
> > > > > Hi Zongyong
> > > > > I think this code only works when qemu gets an all-zero mac address from
> > > > > hardware, you can get more information from the function
> > > > > virtio_net_get_config.
> > > >
> > > > It depends on how vdpa_config_ops->set_config is implemented.
> > > > For mlx5, the set_config callback does nothing.
> > > > But for virtio-pci, the set_config callback
> > > > will write the config register of the vdpa device, so qemu
> > > > will write the mac set via cmdline to hardware, and the mac the guest reads
> > > > back is the value just written by qemu.
> > > >
> > > So here comes a question: which MAC address has higher priority,
> > > the MAC address in hardware or the MAC address from the cmdline?
> > > If both of these two MAC addresses exist, which should we use?
> > > I have checked the spec, not sure if the bit VIRTIO_NET_F_MAC is the right one?
> >
> > I think so, if VIRTIO_NET_F_MAC is set, qemu can override the mac, otherwise not.
> >
> The spec says:
> "driver SHOULD negotiate VIRTIO_NET_F_MAC if the device offers it. If the driver
> negotiates the VIRTIO_NET_F_MAC feature, the driver MUST set the physical address
> of the NIC to mac. Otherwise, it SHOULD use a locally-administered MAC address."
>
> To my understanding, I guess you mean qemu CANNOT override the mac
> the device provides, actually?

Seems not, if VIRTIO_NET_F_MAC is not negotiated, mac is not valid in
the config space:

"The mac address field always exists (though is only valid if
VIRTIO_NET_F_MAC is set)"

So I think the right approach is:

- if mac is not specified in the cli, Qemu doesn't need to override the mac
- if mac is specified in the cli and VIRTIO_NET_F_MAC is supported,
  Qemu can override the mac
- if mac is specified in the cli and VIRTIO_NET_F_MAC is not supported,
  we need to fail the launch

Note that we're working on extending the netlink management API to set
the mac address during vDPA instance provisioning. The management layer
can then get the correct mac address and set it via cli. AFAIK, Cindy's
patch is a workaround for when netlink doesn't support the mac address.
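
The three rules above can be written down as a small decision function.
This is only a sketch in plain C, not QEMU code: enum mac_action,
decide_mac() and mac_policy_selfcheck() are hypothetical names made up
for illustration, and the two booleans stand for "a mac was given on the
cli" and "the device offers VIRTIO_NET_F_MAC".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type for this sketch; it is not a QEMU type. */
enum mac_action {
	MAC_KEEP_DEVICE,	/* leave the mac provided by the device alone */
	MAC_OVERRIDE,		/* write the cli mac into the device config */
	MAC_FAIL_LAUNCH		/* refuse to start the guest */
};

/*
 * cli_mac_given: a mac address was specified on the command line
 * dev_has_f_mac: the device offers VIRTIO_NET_F_MAC
 */
enum mac_action decide_mac(bool cli_mac_given, bool dev_has_f_mac)
{
	if (!cli_mac_given)
		return MAC_KEEP_DEVICE;		/* rule 1: nothing to override */
	if (dev_has_f_mac)
		return MAC_OVERRIDE;		/* rule 2: override is allowed */
	return MAC_FAIL_LAUNCH;			/* rule 3: cannot honour the cli mac */
}

/* Walks through the three cases listed in the mail. */
void mac_policy_selfcheck(void)
{
	assert(decide_mac(false, false) == MAC_KEEP_DEVICE);
	assert(decide_mac(false, true) == MAC_KEEP_DEVICE);
	assert(decide_mac(true, true) == MAC_OVERRIDE);
	assert(decide_mac(true, false) == MAC_FAIL_LAUNCH);
}
```

Where such a check would live in QEMU (realize time or later) is left
open here; the sketch only fixes the outcome for each combination.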

Thanks

> > Thanks
> >
> > > if yes, I will post a patch in qemu and add check for this bit before
> > > we set the mac to hardware
> > > https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html
> > >
> > > Thanks
> > > cindy
> > >
> > > > Thanks
> > > >
> > > > > > > > > > What do you think?
> > > > > > > > > >
> > > > > > > > > > > > Signed-off-by: Wu Zongyong
> > > > > > > > > > > > ---
> > > > > > > > > > > >  drivers/vhost/vdpa.c         | 19 +++++++++++++++++++
> > > > > > > > > > > >  drivers/virtio/virtio_vdpa.c |  5 ++++-
> > > > > > > > > > > >  include/linux/vdpa.h         |  4 ++++
> > > > > > > > > > > >  include/uapi/linux/vhost.h   |  2 ++
> > > > > > > > > > > >  4 files changed, 29 insertions(+), 1 deletion(-)
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > > > > index 9479f7f79217..2204d27d1e5d 100644
> > > > > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > > > > @@ -350,6 +350,22 @@ static long vhost_vdpa_get_iova_range(struct vhost_vdpa *v, u32 __user *argp)
> > > > > > > > > > > >  	return 0;
> > > > > > > > > > > >  }
> > > > > > > > > > > >
> > > > > > > > > > > > +static long vhost_vdpa_get_vring_num_unchangeable(struct vhost_vdpa *v,
> > > > > > > > > > > > +		u32 __user *argp)
> > > > > > > > > > > > +{
> > > > > > > > > > > > +	struct vdpa_device *vdpa = v->vdpa;
> > > > > > > > > > > > +	const struct vdpa_config_ops *ops = vdpa->config;
> > > > > > > > > > > > +	bool unchangeable = false;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	if (ops->get_vq_num_unchangeable)
> > > > > > > > > > > > +		unchangeable = ops->get_vq_num_unchangeable(vdpa);
> > > > > > > > > > > > +
> > > > > > > > > > > > +	if (copy_to_user(argp, &unchangeable, sizeof(unchangeable)))
> > > > > > > > > > > > +		return -EFAULT;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	return 0;
> > > > > > > > > > > > +}
> > > > > > > > > > > > +
> > > > > > > > > > > >  static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> > > > > > > > > > > >  		void __user *argp)
> > > > > > > > > > > >  {
> > > > > > > > > > > > @@ -487,6 +503,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > > > > >  	case VHOST_VDPA_GET_IOVA_RANGE:
> > > > > > > > > > > >  		r = vhost_vdpa_get_iova_range(v, argp);
> > > > > > > > > > > >  		break;
> > > > > > > > > > > > +	case VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE:
> > > > > > > > > > > > +		r = vhost_vdpa_get_vring_num_unchangeable(v, argp);
> > > > > > > > > > > > +		break;
> > > > > > > > > > > >  	default:
> > > > > > > > > > > >  		r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> > > > > > > > > > > >  		if (r == -ENOIOCTLCMD)
> > > > > > > > > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > > > > > > > > index 72eaef2caeb1..afb47465307a 100644
> > > > > > > > > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > > > > > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > > > > > > > > @@ -146,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > > > > > > >  	struct vdpa_vq_state state = {0};
> > > > > > > > > > > >  	unsigned long flags;
> > > > > > > > > > > >  	u32 align, num;
> > > > > > > > > > > > +	bool may_reduce_num = true;
> > > > > > > > > > > >  	int err;
> > > > > > > > > > > >
> > > > > > > > > > > >  	if (!name)
> > > > > > > > > > > > @@ -171,8 +172,10 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > > > > > > >
> > > > > > > > > > > >  	/* Create the vring */
> > > > > > > > > > > >  	align = ops->get_vq_align(vdpa);
> > > > > > > > > > > > +	if (ops->get_vq_num_unchangeable)
> > > > > > > > > > > > +		may_reduce_num = !ops->get_vq_num_unchangeable(vdpa);
> > > > > > > > > > > >  	vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > > > > > > > > -			true, true, ctx,
> > > > > > > > > > > > +			true, may_reduce_num, ctx,
> > > > > > > > > > > >  			virtio_vdpa_notify, callback, name);
> > > > > > > > > > > >  	if (!vq) {
> > > > > > > > > > > >  		err = -ENOMEM;
> > > > > > > > > > > > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > > > > > > > > > > > index 35648c11e312..f809b7ada00d 100644
> > > > > > > > > > > > --- a/include/linux/vdpa.h
> > > > > > > > > > > > +++ b/include/linux/vdpa.h
> > > > > > > > > > > > @@ -195,6 +195,9 @@ struct vdpa_iova_range {
> > > > > > > > > > > >   *				@vdev: vdpa device
> > > > > > > > > > > >   *				Returns the iova range supported by
> > > > > > > > > > > >   *				the device.
> > > > > > > > > > > > + * @get_vq_num_unchangeable	Check if size of virtqueue is unchangeable (optional)
> > > > > > > > > > > > + *				@vdev: vdpa device
> > > > > > > > > > > > + *				Returns boolean: unchangeable (true) or not (false)
> > > > > > > > > > > >   * @set_map:			Set device memory mapping (optional)
> > > > > > > > > > > >   *				Needed for device that using device
> > > > > > > > > > > >   *				specific DMA translation (on-chip IOMMU)
> > > > > > > > > > > > @@ -262,6 +265,7 @@ struct vdpa_config_ops {
> > > > > > > > > > > >  			const void *buf, unsigned int len);
> > > > > > > > > > > >  	u32 (*get_generation)(struct vdpa_device *vdev);
> > > > > > > > > > > >  	struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
> > > > > > > > > > > > +	bool (*get_vq_num_unchangeable)(struct vdpa_device *vdev);
> > > > > > > > > > > >
> > > > > > > > > > > >  	/* DMA ops */
> > > > > > > > > > > >  	int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> > > > > > > > > > > > diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> > > > > > > > > > > > index c998860d7bbc..184f1f7f8498 100644
> > > > > > > > > > > > --- a/include/uapi/linux/vhost.h
> > > > > > > > > > > > +++ b/include/uapi/linux/vhost.h
> > > > > > > > > > > > @@ -150,4 +150,6 @@
> > > > > > > > > > > >  /* Get the valid iova range */
> > > > > > > > > > > >  #define VHOST_VDPA_GET_IOVA_RANGE	_IOR(VHOST_VIRTIO, 0x78, \
> > > > > > > > > > > >  					     struct vhost_vdpa_iova_range)
> > > > > > > > > > > > +/* Check if the vring size can be change */
> > > > > > > > > > > > +#define VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE	_IOR(VHOST_VIRTIO, 0X79, bool)
> > > > > > > > > > > >  #endif
> > > > > > > > > > > > --
> > > > > > > > > > > > 2.31.1
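
For comparison with the patch quoted above, the alternative raised at
the start of the thread (just fail VHOST_SET_VRING_NUM if the value
doesn't match what the hardware has) can be sketched in plain C. This is
a userspace illustration under stated assumptions, not vhost code:
struct demo_vq and demo_set_vring_num() are hypothetical names, and the
power-of-two check reflects the split-ring queue size rule.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-virtqueue state; not the kernel's struct vhost_virtqueue. */
struct demo_vq {
	uint32_t hw_num;	/* ring size the hardware exposes */
	bool num_fixed;		/* true when the hardware size cannot be changed */
	uint32_t num;		/* ring size currently in use */
};

/* Accept a requested ring size, or reject it with -EINVAL. */
static int demo_set_vring_num(struct demo_vq *vq, uint32_t requested)
{
	/* split-ring sizes are a non-zero power of two */
	if (requested == 0 || (requested & (requested - 1)))
		return -EINVAL;

	/* fixed-size hardware: only the hardware's own size is acceptable */
	if (vq->num_fixed && requested != vq->hw_num)
		return -EINVAL;

	vq->num = requested;
	return 0;
}
```

With num_fixed set and hw_num = 256, a request for 128 returns -EINVAL
and a request for 256 succeeds, so userspace learns about the mismatch
from the existing ioctl instead of from a new one.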