From: Max Gurtovoy
Subject: [PATCH 3/3] mlx5-vfio-pci: add new vfio_pci driver for mlx5 devices
Date: Sun, 17 Jan 2021 18:15:34 +0000
Message-ID: <20210117181534.65724-4-mgurtovoy@nvidia.com>
In-Reply-To: <20210117181534.65724-1-mgurtovoy@nvidia.com>
References: <20210117181534.65724-1-mgurtovoy@nvidia.com>
X-Mailer: git-send-email 2.25.4
X-Mailing-List: linux-kernel@vger.kernel.org

This driver registers with both the PCI bus and the auxiliary bus. If
both probes succeed, we get a vendor-specific VFIO PCI device.
mlx5_vfio_pci uses vfio_pci_core to register and create the VFIO device,
and uses an auxiliary_device to get the needed extensions from the
vendor device driver. If either probe() function fails, the VFIO char
device is not created.

For now, the driver only registers and binds the auxiliary_device to the
pci_device when the auxiliary_device id matches the pci_device BDF.
Later, vendor-specific features such as live migration will be added and
made available to the virtualization software.

Note: although this patch creates mlx5-vfio-pci.ko, binding to
vfio-pci.ko still works as before, so the change is fully backward
compatible. The extended vendor functionality is not available when the
device is bound to the generic vfio_pci.ko.

Signed-off-by: Max Gurtovoy
---
 drivers/vfio/pci/Kconfig         |  10 ++
 drivers/vfio/pci/Makefile        |   3 +
 drivers/vfio/pci/mlx5_vfio_pci.c | 253 +++++++++++++++++++++++++++++++
 include/linux/mlx5/vfio_pci.h    |  36 +++++
 4 files changed, 302 insertions(+)
 create mode 100644 drivers/vfio/pci/mlx5_vfio_pci.c
 create mode 100644 include/linux/mlx5/vfio_pci.h

diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 5f90be27fba0..2133cd2f9c92 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -65,3 +65,13 @@ config VFIO_PCI_ZDEV
 	  for zPCI devices passed through via VFIO on s390.
 
 	  Say Y here.
+
+config MLX5_VFIO_PCI
+	tristate "VFIO support for MLX5 PCI devices"
+	depends on VFIO_PCI_CORE && MLX5_CORE
+	select AUXILIARY_BUS
+	help
+	  This provides a generic PCI support for MLX5 devices using the VFIO
+	  framework.
+
+	  If you don't know what to do here, say N.
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 3f2a27e222cd..9f67edca31c5 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -2,6 +2,7 @@
 
 obj-$(CONFIG_VFIO_PCI_CORE) += vfio-pci-core.o
 obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
+obj-$(CONFIG_MLX5_VFIO_PCI) += mlx5-vfio-pci.o
 
 vfio-pci-core-y := vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
 vfio-pci-core-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
@@ -9,3 +10,5 @@ vfio-pci-core-$(CONFIG_VFIO_PCI_NVLINK2) += vfio_pci_nvlink2.o
 vfio-pci-core-$(CONFIG_VFIO_PCI_ZDEV) += vfio_pci_zdev.o
 
 vfio-pci-y := vfio_pci.o
+
+mlx5-vfio-pci-y := mlx5_vfio_pci.o
diff --git a/drivers/vfio/pci/mlx5_vfio_pci.c b/drivers/vfio/pci/mlx5_vfio_pci.c
new file mode 100644
index 000000000000..98cc2d037b0d
--- /dev/null
+++ b/drivers/vfio/pci/mlx5_vfio_pci.c
@@ -0,0 +1,253 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
+ *     Author: Max Gurtovoy
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "vfio_pci_private.h"
+
+#define DRIVER_VERSION  "0.1"
+#define DRIVER_AUTHOR   "Max Gurtovoy "
+#define DRIVER_DESC     "MLX5 VFIO PCI - User Level meta-driver for NVIDIA MLX5 device family"
+
+static LIST_HEAD(aux_devs_list);
+static DEFINE_MUTEX(aux_devs_lock);
+
+static struct mlx5_vfio_pci_adev *mlx5_vfio_pci_find_adev(struct pci_dev *pdev)
+{
+	struct mlx5_vfio_pci_adev *mvadev, *found = NULL;
+
+	mutex_lock(&aux_devs_lock);
+	list_for_each_entry(mvadev, &aux_devs_list, entry) {
+		if (mvadev->madev.adev.id == pci_dev_id(pdev)) {
+			found = mvadev;
+			break;
+		}
+	}
+	mutex_unlock(&aux_devs_lock);
+
+	return found;
+}
+
+static int mlx5_vfio_pci_aux_probe(struct auxiliary_device *adev,
+		const struct auxiliary_device_id *id)
+{
+	struct mlx5_vfio_pci_adev *mvadev;
+
+	mvadev = adev_to_mvadev(adev);
+
+	pr_info("%s aux probing bdf %02x:%02x.%d mdev is %s\n",
+		adev->name,
+		PCI_BUS_NUM(adev->id & 0xffff),
+		PCI_SLOT(adev->id & 0xff),
+		PCI_FUNC(adev->id & 0xff), dev_name(mvadev->madev.mdev->device));
+
+	mutex_lock(&aux_devs_lock);
+	list_add(&mvadev->entry, &aux_devs_list);
+	mutex_unlock(&aux_devs_lock);
+
+	return 0;
+}
+
+static void mlx5_vfio_pci_aux_remove(struct auxiliary_device *adev)
+{
+	struct mlx5_vfio_pci_adev *mvadev = adev_to_mvadev(adev);
+	struct vfio_pci_device *vdev = dev_get_drvdata(&adev->dev);
+
+	/* TODO: is this the right thing to do ? maybe FLR ? */
+	if (vdev)
+		pci_reset_function(vdev->pdev);
+
+	mutex_lock(&aux_devs_lock);
+	list_del(&mvadev->entry);
+	mutex_unlock(&aux_devs_lock);
+}
+
+static const struct auxiliary_device_id mlx5_vfio_pci_aux_id_table[] = {
+	{ .name = MLX5_ADEV_NAME ".vfio_pci", },
+	{},
+};
+
+MODULE_DEVICE_TABLE(auxiliary, mlx5_vfio_pci_aux_id_table);
+
+static struct auxiliary_driver mlx5_vfio_pci_aux_driver = {
+	.name = "vfio_pci_ex",
+	.probe = mlx5_vfio_pci_aux_probe,
+	.remove = mlx5_vfio_pci_aux_remove,
+	.id_table = mlx5_vfio_pci_aux_id_table,
+};
+
+static struct pci_driver mlx5_vfio_pci_driver;
+
+static ssize_t mlx5_vfio_pci_write(void *device_data,
+		const char __user *buf, size_t count, loff_t *ppos)
+{
+	/* DO vendor specific stuff here ? */
+
+	return vfio_pci_core_write(device_data, buf, count, ppos);
+}
+
+static ssize_t mlx5_vfio_pci_read(void *device_data, char __user *buf,
+		size_t count, loff_t *ppos)
+{
+	/* DO vendor specific stuff here ? */
+
+	return vfio_pci_core_read(device_data, buf, count, ppos);
+}
+
+static long mlx5_vfio_pci_ioctl(void *device_data, unsigned int cmd,
+		unsigned long arg)
+{
+	/* DO vendor specific stuff here ? */
+
+	return vfio_pci_core_ioctl(device_data, cmd, arg);
+}
+
+static void mlx5_vfio_pci_release(void *device_data)
+{
+	/* DO vendor specific stuff here ? */
+
+	vfio_pci_core_release(device_data);
+}
+
+static int mlx5_vfio_pci_open(void *device_data)
+{
+	/* DO vendor specific stuff here ? */
+
+	return vfio_pci_core_open(device_data);
+}
+
+static const struct vfio_device_ops mlx5_vfio_pci_ops = {
+	.name		= "mlx5-vfio-pci",
+	.open		= mlx5_vfio_pci_open,
+	.release	= mlx5_vfio_pci_release,
+	.ioctl		= mlx5_vfio_pci_ioctl,
+	.read		= mlx5_vfio_pci_read,
+	.write		= mlx5_vfio_pci_write,
+	.mmap		= vfio_pci_core_mmap,
+	.request	= vfio_pci_core_request,
+	.match		= vfio_pci_core_match,
+};
+
+static int mlx5_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	struct vfio_pci_device *vdev;
+	struct mlx5_vfio_pci_adev *mvadev;
+
+	mvadev = mlx5_vfio_pci_find_adev(pdev);
+	if (!mvadev) {
+		pr_err("failed to find aux device for %s\n",
+		       dev_name(&pdev->dev));
+		return -ENODEV;
+	}
+
+	vdev = vfio_create_pci_device(pdev, &mlx5_vfio_pci_ops, mvadev);
+	if (IS_ERR(vdev))
+		return PTR_ERR(vdev);
+
+	dev_set_drvdata(&mvadev->madev.adev.dev, vdev);
+	return 0;
+}
+
+static void mlx5_vfio_pci_remove(struct pci_dev *pdev)
+{
+	struct mlx5_vfio_pci_adev *mvadev;
+
+	mvadev = mlx5_vfio_pci_find_adev(pdev);
+	if (mvadev)
+		dev_set_drvdata(&mvadev->madev.adev.dev, NULL);
+
+	vfio_destroy_pci_device(pdev);
+}
+
+static pci_ers_result_t mlx5_vfio_pci_aer_err_detected(struct pci_dev *pdev,
+		pci_channel_state_t state)
+{
+	/* DO vendor specific stuff here ? */
+
+	return vfio_pci_core_aer_err_detected(pdev, state);
+}
+
+#ifdef CONFIG_PCI_IOV
+static int mlx5_vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
+{
+	might_sleep();
+
+	/* DO vendor specific stuff here */
+
+	return vfio_pci_core_sriov_configure(pdev, nr_virtfn);
+}
+#endif
+
+static const struct pci_error_handlers mlx5_vfio_err_handlers = {
+	.error_detected = mlx5_vfio_pci_aer_err_detected,
+};
+
+static const struct pci_device_id mlx5_vfio_pci_table[] = {
+	{ PCI_VDEVICE(MELLANOX, 0x6001) }, /* NVMe SNAP controllers */
+	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_REDHAT_QUMRANET, 0x1042,
+			 PCI_VENDOR_ID_MELLANOX, PCI_ANY_ID) }, /* Virtio SNAP controllers */
+	{ 0, }
+};
+
+static struct pci_driver mlx5_vfio_pci_driver = {
+	.name			= "mlx5-vfio-pci",
+	.id_table		= mlx5_vfio_pci_table,
+	.probe			= mlx5_vfio_pci_probe,
+	.remove			= mlx5_vfio_pci_remove,
+#ifdef CONFIG_PCI_IOV
+	.sriov_configure	= mlx5_vfio_pci_sriov_configure,
+#endif
+	.err_handler		= &mlx5_vfio_err_handlers,
+};
+
+static void __exit mlx5_vfio_pci_cleanup(void)
+{
+	auxiliary_driver_unregister(&mlx5_vfio_pci_aux_driver);
+	pci_unregister_driver(&mlx5_vfio_pci_driver);
+}
+
+static int __init mlx5_vfio_pci_init(void)
+{
+	int ret;
+
+	ret = pci_register_driver(&mlx5_vfio_pci_driver);
+	if (ret)
+		return ret;
+
+	ret = auxiliary_driver_register(&mlx5_vfio_pci_aux_driver);
+	if (ret)
+		goto out_unregister;
+
+	return 0;
+
+out_unregister:
+	pci_unregister_driver(&mlx5_vfio_pci_driver);
+	return ret;
+}
+
+module_init(mlx5_vfio_pci_init);
+module_exit(mlx5_vfio_pci_cleanup);
+
+MODULE_VERSION(DRIVER_VERSION);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
diff --git a/include/linux/mlx5/vfio_pci.h b/include/linux/mlx5/vfio_pci.h
new file mode 100644
index 000000000000..c1e7b4d6da30
--- /dev/null
+++ b/include/linux/mlx5/vfio_pci.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/*
+ * Copyright (c) 2020 NVIDIA Corporation
+ */
+
+#ifndef _VFIO_PCI_H
+#define _VFIO_PCI_H
+
+#include
+#include
+#include
+#include
+#include
+
+struct mlx5_vfio_pci_adev {
+	struct mlx5_adev	madev;
+
+	/* These fields should not be used outside mlx5_vfio_pci.ko */
+	struct list_head	entry;
+};
+
+static inline struct mlx5_vfio_pci_adev*
+madev_to_mvadev(struct mlx5_adev *madev)
+{
+	return container_of(madev, struct mlx5_vfio_pci_adev, madev);
+}
+
+static inline struct mlx5_vfio_pci_adev*
+adev_to_mvadev(struct auxiliary_device *adev)
+{
+	struct mlx5_adev *madev = container_of(adev, struct mlx5_adev, adev);
+
+	return madev_to_mvadev(madev);
+}
+
+#endif
-- 
2.25.4
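
For readers following the BDF match in mlx5_vfio_pci_find_adev() and the
adev_to_mvadev() helper, here is a minimal userspace sketch. It is not
part of the patch. It assumes the auxiliary device id holds the 16-bit
value that pci_dev_id() returns (bus number in the high byte, devfn in
the low byte), which is what the comparison in the patch implies. The
demo_* names and structures are hypothetical stand-ins for the kernel
types, and the macros are local copies of the kernel's PCI helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace copies of the kernel's BDF helpers, for illustration only. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)
#define PCI_DEVID(bus, devfn)	((((uint16_t)(bus)) << 8) | (devfn))
#define PCI_BUS_NUM(x)		(((x) >> 8) & 0xff)

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins: only the fields this sketch needs. */
struct demo_adev {
	uint32_t id;		/* plays the role of auxiliary_device.id */
};

struct demo_mvadev {
	int cookie;		/* placeholder for vendor state */
	struct demo_adev adev;	/* embedded, like madev.adev in the patch */
};

/* Return the wrapper whose embedded adev carries the given BDF id, or NULL. */
static struct demo_mvadev *demo_find_adev(struct demo_adev *const *adevs,
					  size_t n, uint16_t devid)
{
	assert(adevs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (adevs[i]->id == devid)
			return container_of(adevs[i], struct demo_mvadev, adev);
	}
	return NULL;
}
```

With a device at 5e:01.0, PCI_DEVID(0x5e, PCI_DEVFN(1, 0)) gives 0x5e08,
and demo_find_adev() returns the wrapper whose embedded adev carries
that id. The kernel code also holds aux_devs_lock while walking the
list; the sketch leaves locking out.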