Subject: Re: [PATCH 4/6] vhost_vdpa: support doorbell mapping via mmap
To: Mika Penttilä, mst@redhat.com
Cc: kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, rob.miller@broadcom.com, lingshan.zhu@intel.com, eperezma@redhat.com, lulu@redhat.com, shahafs@mellanox.com, hanand@xilinx.com, mhabets@solarflare.com, gdawar@xilinx.com, saugatm@xilinx.com, vmireyno@marvell.com, zhangweining@ruijie.com.cn, eli@mellanox.com
References: <20200529080303.15449-1-jasowang@redhat.com> <20200529080303.15449-5-jasowang@redhat.com>
From: Jason Wang
Date: Fri, 29 May 2020 17:24:33 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/5/29 5:16 PM, Mika Penttilä wrote:
> Hi,
>
> On 29.5.2020 11.03, Jason Wang wrote:
>> Currently the doorbell is relayed via eventfd which may have
>> significant overhead because of the cost of vmexits or syscall. This
>> patch introduces mmap() based doorbell mapping which can eliminate the
>> overhead caused by vmexit or syscall.
>
> Just wondering. I know very little about vdpa. But how is such a "sw
> doorbell" monitored or observed, if no fault or vmexit etc.
> Is there some kind of polling used?

Hi Mika:

It's not a software doorbell. It just allows userspace to map the page of the hardware doorbell directly into userspace.
Without this, KVM needs to trap the guest's MMIO access and write to the eventfd, and any other userspace driver needs to write to the eventfd as well. vhost-vDPA's eventfd wakeup function then has the driver touch the doorbell.

With this, since the doorbell page is mapped into the userspace address space, the guest or another userspace driver can write directly to the hardware doorbell register.

Thanks

>
>> To ease the userspace modeling of the doorbell layout (usually
>> virtio-pci), this patch starts from a doorbell per page
>> model. Vhost-vdpa only supports a hardware doorbell that sits at the
>> boundary of a page and does not share the page with other registers.
>>
>> Doorbell of each virtqueue must be mapped separately, pgoff is the
>> index of the virtqueue. This allows userspace to map a subset of the
>> doorbell which may be useful for the implementation of software
>> assisted virtqueue (control vq) in the future.
>>
>> Signed-off-by: Jason Wang
>> ---
>>   drivers/vhost/vdpa.c | 59 ++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 59 insertions(+)
>>
>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>> index 6ff72289f488..bbe23cea139a 100644
>> --- a/drivers/vhost/vdpa.c
>> +++ b/drivers/vhost/vdpa.c
>> @@ -15,6 +15,7 @@
>>   #include
>>   #include
>>   #include
>> +#include
>>   #include
>>   #include
>>   #include
>> @@ -741,12 +742,70 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
>>       return 0;
>>   }
>>
>> +static vm_fault_t vhost_vdpa_fault(struct vm_fault *vmf)
>> +{
>> +    struct vhost_vdpa *v = vmf->vma->vm_file->private_data;
>> +    struct vdpa_device *vdpa = v->vdpa;
>> +    const struct vdpa_config_ops *ops = vdpa->config;
>> +    struct vdpa_notification_area notify;
>> +    struct vm_area_struct *vma = vmf->vma;
>> +    u16 index = vma->vm_pgoff;
>> +
>> +    notify = ops->get_vq_notification(vdpa, index);
>> +
>> +    vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>> +    if (remap_pfn_range(vma, vmf->address & PAGE_MASK,
>> +                notify.addr >> PAGE_SHIFT, PAGE_SIZE,
>> +                vma->vm_page_prot))
>> +        return VM_FAULT_SIGBUS;
>> +
>> +    return VM_FAULT_NOPAGE;
>> +}
>> +
>> +static const struct vm_operations_struct vhost_vdpa_vm_ops = {
>> +    .fault = vhost_vdpa_fault,
>> +};
>> +
>> +static int vhost_vdpa_mmap(struct file *file, struct vm_area_struct *vma)
>> +{
>> +    struct vhost_vdpa *v = vma->vm_file->private_data;
>> +    struct vdpa_device *vdpa = v->vdpa;
>> +    const struct vdpa_config_ops *ops = vdpa->config;
>> +    struct vdpa_notification_area notify;
>> +    int index = vma->vm_pgoff;
>> +
>> +    if (vma->vm_end - vma->vm_start != PAGE_SIZE)
>> +        return -EINVAL;
>> +    if ((vma->vm_flags & VM_SHARED) == 0)
>> +        return -EINVAL;
>> +    if (vma->vm_flags & VM_READ)
>> +        return -EINVAL;
>> +    if (index > 65535)
>> +        return -EINVAL;
>> +    if (!ops->get_vq_notification)
>> +        return -ENOTSUPP;
>> +
>> +    /* To be safe and easily modelled by userspace, We only
>> +     * support the doorbell which sits on the page boundary and
>> +     * does not share the page with other registers.
>> +     */
>> +    notify = ops->get_vq_notification(vdpa, index);
>> +    if (notify.addr & (PAGE_SIZE - 1))
>> +        return -EINVAL;
>> +    if (vma->vm_end - vma->vm_start != notify.size)
>> +        return -ENOTSUPP;
>> +
>> +    vma->vm_ops = &vhost_vdpa_vm_ops;
>> +    return 0;
>> +}
>> +
>>   static const struct file_operations vhost_vdpa_fops = {
>>       .owner        = THIS_MODULE,
>>       .open        = vhost_vdpa_open,
>>       .release    = vhost_vdpa_release,
>>       .write_iter    = vhost_vdpa_chr_write_iter,
>>       .unlocked_ioctl    = vhost_vdpa_unlocked_ioctl,
>> +    .mmap        = vhost_vdpa_mmap,
>>       .compat_ioctl    = compat_ptr_ioctl,
>>   };
>
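
For illustration only, here is a rough sketch of how userspace might request such a mapping under the rules in the quoted patch: exactly one page, a shared mapping, no read permission, and the virtqueue index passed as the page offset. The helper names doorbell_mmap_offset() and map_doorbell() are hypothetical and not part of the patch, and fd is assumed to be an already opened vhost-vdpa character device. What value is then written to the mapped page depends on the device and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: byte offset to pass to mmap() for virtqueue `index`.
 * The patch reads the virtqueue index from vma->vm_pgoff, which the kernel
 * derives from the mmap() byte offset divided by the page size. */
static off_t doorbell_mmap_offset(uint16_t index, long page_size)
{
    assert(page_size > 0);
    return (off_t)index * (off_t)page_size;
}

/* Hypothetical helper: map the doorbell page of one virtqueue.
 * `fd` is assumed to be an open vhost-vdpa character device.
 * Returns NULL on failure. */
static volatile void *map_doorbell(int fd, uint16_t index)
{
    long page_size = sysconf(_SC_PAGESIZE);
    void *p;

    if (page_size <= 0)
        return NULL;

    /* The quoted vhost_vdpa_mmap() rejects mappings that are not exactly
     * one page, that are private, or that request read access, so ask
     * for a single shared, write-only page. */
    p = mmap(NULL, (size_t)page_size, PROT_WRITE, MAP_SHARED, fd,
             doorbell_mmap_offset(index, page_size));

    return p == MAP_FAILED ? NULL : p;
}
```

With 4096-byte pages, virtqueue 3 would be requested at byte offset 12288, which the kernel turns into pgoff 3.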