Received: by 2002:a05:6a10:a841:0:0:0:0 with SMTP id d1csp773244pxy; Wed, 28 Apr 2021 14:00:33 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzLe9EeCBK9WL2n5z/eHyCcpAWCJkO1LR7bamRFg04P8I5BXDwi9p3LlXZM3qugeh5EusG3 X-Received: by 2002:a17:906:5052:: with SMTP id e18mr31219634ejk.112.1619643633133; Wed, 28 Apr 2021 14:00:33 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1619643633; cv=none; d=google.com; s=arc-20160816; b=tdQQBFLX9XRjann4DnRMb/DAEN7TbR7O8TPpnye4Jg0o9ntUwGifU72m87M5CypyHS CJWM1J05dee3BBMIc+O17qC9lpGie6GhxC7QDJzeHyrfFQKVb/qeKqiaA3XhQQ59+eHo IYJqFg///rP07ya66iAXE/fhY9ttTrbj57YKCKHCulKE/y9IM5ApZ1W69O9L80+6X0qx Up9ENku93RLCk/D40ds18AOTA4gM2J7+zTZ+MKIIS8L0h4AsLZ7kxORyszgqvMWxJZmP MmpnI2JtSKuU3AFBYUJB6WqmqFAbm4nFzNNz4dr2+g5a3Dw7Gq+1TRWwzxXXRkPQWPrU 33xg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=F01dWG387l+l2oQAv3AlnSccDc9oKMZDHg8elg+ZIjs=; b=NyIoqz5K/Jn2PasWx6Fmea3vkR4lyHPcVSAAatiEDUNumkZP7MvTknPXEkojF8Tzw5 LQXmBpMRExqSGR2y5XyfeS1q4lguSxw+OiBN64cmVoc4toy1a8y5LGqz4xzHvbp+gM9L cAVpqKO0KuchRylp9HPP4Vl0I76EPdsmALWOV39+itIgf2oDyUhq1P26wOgfjT8bvs7p Y9jJJZzWX3O3dRqR3iXKT0joLYO2iyOOAant+BOjZlkginzW1bIfpKjRUaKn9XOHsRj1 P1UMp4wfzHgZ8KoeIa4Pxfq54qS8GwmpTiRdDS/+OxJTntZyMzBqlR5+AsJU+Gqrnj7d y3Yg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=DRhtAVvq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id dd18si1055620ejb.264.2021.04.28.14.00.09; Wed, 28 Apr 2021 14:00:33 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=DRhtAVvq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230413AbhD1Uy3 (ORCPT + 99 others); Wed, 28 Apr 2021 16:54:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34698 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229995AbhD1Uy2 (ORCPT ); Wed, 28 Apr 2021 16:54:28 -0400 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C71C3C06138A for ; Wed, 28 Apr 2021 13:53:41 -0700 (PDT) Received: by mail-wr1-x443.google.com with SMTP id e5so35850795wrg.7 for ; Wed, 28 Apr 2021 13:53:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=F01dWG387l+l2oQAv3AlnSccDc9oKMZDHg8elg+ZIjs=; b=DRhtAVvq3xQyEQerqYxxjMKa0Ei7risYYXOBYMoHUUmytmcpk/oBooBnRkyuZ5fqcl 5xuIeBmfZU4bXAAAkkbsxlSExcWKtBeFHzbuTOxCcuAB8g7y6VL5lJDqUkSSvoZlJ/Rm R/JNDI2Fge6T1DQy/YP5SRYQMyLcx/Lw2HBLoYFnsOw+3tnx7WFsZCsOQ4hr14IsmDys /pARt2qA90/CFSWH2s2m1DJcKCiS8WkcDXdvJ3Q1atz2LudVMULkMCO7YCeqQkKnPMqV MnzH848h1IF4YjG8mbqLIpHRcXFMtSVrevHJJdwz8TjquLp4kXRLv9RIEQPlT9SqSSrb 8e4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=F01dWG387l+l2oQAv3AlnSccDc9oKMZDHg8elg+ZIjs=; b=ncby+IbMLfjK8Ys4iebJOHuvBIlC2VKBAMKTG6P2CM8vlm3yUfVNC80SynigPE+My9 wDUILdpqwMZVD6Jf7FM3IXdWbHMrfG3UbipWQ9nQomLinAnv/VmixV9aKCv1YB60/0af TR4iVz9+ESh2V9Ast4/3DWh1PyBWavL+xE77f9GVOu1q+8rLVU4GSA7DYaVfVnw3WZfo Z+qS0LA0LqgqkPLKEDhxPCty2BxZWcI0W+UiAuKRFuxRT49qHI/kdKfOfbYExNLDMwKT qk7JWOSXbHf6HsBb4MzYeXt96D2+c8zyaFWv7gEixqNkM6hbq6dkQRc9ZuOmSgLSBKzS in5Q== X-Gm-Message-State: AOAM530/2v6cWeEEYtYC+dBkBlGRAZXEnirRWw1+UxDjkteN7V2yNaVW vOh+qWwyZhb44dsyJ0kajqZsZY9xmMPqGEEki9I= X-Received: by 2002:a5d:51d2:: with SMTP id n18mr38089280wrv.69.1619643220521; Wed, 28 Apr 2021 13:53:40 -0700 (PDT) MIME-Version: 1.0 References: <20210421104145.115907-1-elic@nvidia.com> <20210422060358.GA140698@mtl-vdi-166.wap.labs.mlnx> <20210422080725.GB140698@mtl-vdi-166.wap.labs.mlnx> <9d3d8976-800d-bb14-0a4a-c4b008f6872c@redhat.com> <20210422083902.GA146406@mtl-vdi-166.wap.labs.mlnx> <20210425132523.GA43506@mtl-vdi-166.wap.labs.mlnx> <086936c8-adff-d4c2-469f-2df58c4db858@redhat.com> In-Reply-To: <086936c8-adff-d4c2-469f-2df58c4db858@redhat.com> From: Si-Wei Liu Date: Wed, 28 Apr 2021 13:53:28 -0700 Message-ID: Subject: Re: [PATCH] vdpa/mlx5: Add support for doorbell bypassing To: Jason Wang Cc: Eli Cohen , mst@redhat.com, virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org, "Zhu, Lingshan" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, Apr 25, 2021 at 7:38 PM Jason Wang wrote: > > > =E5=9C=A8 2021/4/25 =E4=B8=8B=E5=8D=889:25, Eli Cohen =E5=86=99=E9=81=93: > > On Thu, Apr 22, 2021 at 04:59:11PM +0800, Jason Wang wrote: > >> =E5=9C=A8 2021/4/22 =E4=B8=8B=E5=8D=884:39, Eli Cohen =E5=86=99=E9=81= =93: > >>> On Thu, Apr 22, 2021 at 04:21:45PM +0800, Jason Wang wrote: > >>>> =E5=9C=A8 2021/4/22 =E4=B8=8B=E5=8D=884:07, Eli Cohen =E5=86=99=E9= =81=93: > >>>>> On Thu, Apr 22, 2021 at 09:03:58AM +0300, Eli Cohen wrote: > >>>>>> On Thu, Apr 22, 2021 at 10:37:38AM +0800, Jason Wang wrote: > >>>>>>> =E5=9C=A8 2021/4/21 =E4=B8=8B=E5=8D=886:41, Eli Cohen =E5=86=99= =E9=81=93: > >>>>>>>> Implement mlx5_get_vq_notification() to return the doorbell addr= ess. > >>>>>>>> Size is set to one system page as required. > >>>>>>>> > >>>>>>>> Signed-off-by: Eli Cohen > >>>>>>>> --- > >>>>>>>> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + > >>>>>>>> drivers/vdpa/mlx5/core/resources.c | 1 + > >>>>>>>> drivers/vdpa/mlx5/net/mlx5_vnet.c | 6 ++++++ > >>>>>>>> 3 files changed, 8 insertions(+) > >>>>>>>> > >>>>>>>> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/m= lx5/core/mlx5_vdpa.h > >>>>>>>> index b6cc53ba980c..49de62cda598 100644 > >>>>>>>> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >>>>>>>> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > >>>>>>>> @@ -41,6 +41,7 @@ struct mlx5_vdpa_resources { > >>>>>>>> u32 pdn; > >>>>>>>> struct mlx5_uars_page *uar; > >>>>>>>> void __iomem *kick_addr; > >>>>>>>> + u64 phys_kick_addr; > >>>>>>>> u16 uid; > >>>>>>>> u32 null_mkey; > >>>>>>>> bool valid; > >>>>>>>> diff --git a/drivers/vdpa/mlx5/core/resources.c b/drivers/vdpa/m= lx5/core/resources.c > >>>>>>>> index 6521cbd0f5c2..665f8fc1710f 100644 > >>>>>>>> --- a/drivers/vdpa/mlx5/core/resources.c > >>>>>>>> +++ b/drivers/vdpa/mlx5/core/resources.c > >>>>>>>> @@ -247,6 +247,7 @@ int mlx5_vdpa_alloc_resources(struct mlx5_vd= pa_dev *mvdev) > >>>>>>>> goto err_key; > >>>>>>>> kick_addr =3D mdev->bar_addr + offset; > >>>>>>>> + res->phys_kick_addr =3D kick_addr; > >>>>>>>> res->kick_addr =3D ioremap(kick_addr, PAGE_SIZE); > >>>>>>>> if (!res->kick_addr) { > >>>>>>>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/ml= x5/net/mlx5_vnet.c > >>>>>>>> index 10c5fef3c020..680751074d2a 100644 > >>>>>>>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c > >>>>>>>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c > >>>>>>>> @@ -1865,8 +1865,14 @@ static void mlx5_vdpa_free(struct vdpa_de= vice *vdev) > >>>>>>>> static struct vdpa_notification_area mlx5_get_vq_notificati= on(struct vdpa_device *vdev, u16 idx) > >>>>>>>> { > >>>>>>>> + struct mlx5_vdpa_dev *mvdev =3D to_mvdev(vdev); > >>>>>>>> struct vdpa_notification_area ret =3D {}; > >>>>>>>> + struct mlx5_vdpa_net *ndev; > >>>>>>>> + > >>>>>>>> + ndev =3D to_mlx5_vdpa_ndev(mvdev); > >>>>>>>> + ret.addr =3D (phys_addr_t)ndev->mvdev.res.phys_kick_addr; > >>>>>>>> + ret.size =3D PAGE_SIZE; > >>>>>>> Note that the page will be mapped in to guest, so it's only safe = if the > >>>>>>> doorbeel exclusively own the page. This means if there're other r= egisters in > >>>>>>> the page, we can not let the doorbell bypass to work. > >>>>>>> > >>>>>>> So this is suspicious at least in the case of subfunction where w= e calculate > >>>>>>> the bar length in mlx5_sf_dev_table_create() as: > >>>>>>> > >>>>>>> table->sf_bar_length =3D 1 << (MLX5_CAP_GEN(dev, log_min_sf_size)= + 12); > >>>>>>> > >>>>>>> It looks to me this can only work for the arch with PAGE_SIZE =3D= 4096, > >>>>>>> otherwise we can map more into the userspace(guest). > >>>>>>> > >>>>>> Correct, so I guess I should return here 4096. > >>>> I'm not quite sure but since the calculation of the sf_bar_length is= doen > >>>> via a shift of 12, it might be correct. > >>>> > >>>> And please double check if the doorbell own the page exclusively. > >>> I am checking if it is safe to map the any part of the SF's BAR to > >>> userspace without harming other functions. If this is true, I will ch= eck > >>> if I can return PAGE_SIZE without compromising security. > >> > >> It's usally not safe and a layer violation if other registers are plac= ed at > >> the same page. > >> > >> > >>> I think we may > >>> need to extend struct vdpa_notification_area to contain another field > >>> offset which indicates the offset from addr where the actual doorbell > >>> resides. > >> > >> The movitiaton of the current design is to be fit seamless into how Qe= mu > >> model doorbell layouts currently: > >> > >> 1) page-per-vq, each vq has its own page aligned doorbell > >> 2) 2 bytes doorbell, each vq has its own 2 byte aligend doorbell > >> > >> Only 1) is support in vhost-vDPA (and vhost-user) since it's rather si= mple > >> and secure (page aligned) to be modelled and implemented via mmap(). > >> > >> Exporting a complex layout is possbile but requires careful design. > >> > >> Actually, we had antoher option > >> > >> 3) shared doorbell: all virtqueue shares a single page aligned doorbel= l > >> > > This nearly matches we have in ConnectX devices. All the doorbells are > > located at the same place. For 4K page size atchitectures it is aligned > > to the start of the page. For larger page sizes it is not aligned. > > If we don't allow to some offset within the page, it means that direct > > doorbells will not work for 64K page size archs over ConnectX. > > > Right, just to clarify. This can still be model by the current > page-per-vq model. It means the doorbell will be mapped into different > pages for each virtqueue by Qemu. So from the view of Qemu or guest, > each virtqueue has its own doorbell in this case. So this is the proposed model for mlx5 vdpa with doorbell per-instance (rather than per-vq), assuming the exclusive ownership of mapped doorbell page? > > > > > >> This is not yet supported by Qemu. > > > For "not supported" I meant present this (doorbells sharing) layout to > guest. So it means this new layout perhaps will have to introduce new virtio features to guest thus not compatible with eixsting driver? -Siwei > > Thanks > > > >> > >> Thanks > >> > >> > >>>>>> I also think that the check in vhost_vdpa_mmap() should verify tha= t the > >>>>>> returned size is not smaller than PAGE_SIZE because the returned a= ddress > >>>>> Actually I think it's ok since you verify the size equals vma->vm_e= nd - > >>>>> vma->vm_start which must be at least PAGE_SIZE. > >>>> Yes. > >>>> > >>>> Thanks > >>>> > >>>> > >>>>>> might just be aligned to PAGE_SIZE. I think this should be enoght = but > >>>>>> maybe also use the same logic in vhost_vdpa_fault(). >