Dear kernel maintainers,
This submission adds a kernel driver for the Intel(R) Gaussian & Neural
Accelerator (Intel(R) GNA). Intel(R) GNA is a PCI-based neural co-processor
available on multiple Intel platforms. AI developers and users can offload
continuous inference workloads to an Intel(R) GNA device in order to free
processor resources and save power. Noise reduction and speech recognition
are two examples of such workloads, but usage is not limited to them.

For a list of processors equipped with an Intel(R) GNA device, please refer
to:
https://docs.openvinotoolkit.org/latest/openvino_docs_IE_DG_supported_plugins_GNA.html

We think contributing this driver upstream is the best way for developers
and users to get the latest Intel(R) GNA support in the Linux kernel,
through mainline and on to any Linux distribution installed on their
systems. Upstreaming also lets developers around the world contribute to
the driver once it is merged.

The driver works with Intel(R) libraries in user space and exposes a few
IOCTL interfaces for their use. The libraries are open source and
available at:
https://github.com/intel/gna
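
For illustration only, here is a minimal user-space sketch of how a library
might query one driver parameter through the IOCTL interface. The union, the
parameter id and the ioctl number are copied from include/uapi/misc/intel/gna.h
in this series, with uint64_t standing in for __u64. The helper name
gna_query_param() is hypothetical and is not part of the driver or of the
Intel(R) libraries.

```c
#include <stdint.h>
#include <sys/ioctl.h>

/*
 * Local copies of definitions from include/uapi/misc/intel/gna.h in this
 * series, repeated only so the sketch builds without the header installed.
 */
union gna_parameter {
	struct {
		uint64_t id;
	} in;

	struct {
		uint64_t value;
	} out;
};

#define GNA_PARAM_DEVICE_TYPE	3
#define GNA_GET_PARAMETER	_IOWR('C', 0x01, union gna_parameter)

/*
 * Hypothetical helper: ask the driver for parameter 'id' on an already
 * opened device node. Returns 0 and stores the result in *value on
 * success; returns -1 with errno set by ioctl() on failure.
 */
int gna_query_param(int fd, uint64_t id, uint64_t *value)
{
	union gna_parameter param;

	param.in.id = id;
	if (ioctl(fd, GNA_GET_PARAMETER, &param) < 0)
		return -1;

	*value = param.out.value;
	return 0;
}
```

A caller would first open a device node such as /dev/intel_gna0 (the
/dev/intel_gnaN naming comes from the v2 changelog) and pass the resulting
descriptor together with GNA_PARAM_DEVICE_TYPE.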
---
Changelogs:
v1->v2:
- driver's new layout:
  - driver name: gna -> intel_gna
  - module name: gna -> intel_gna
  - device file name: /dev/gnaN -> /dev/intel_gnaN
  - driver's source directory: drivers/misc/gna/ -> drivers/misc/intel/gna/
  - UAPI: include/uapi/misc/gna.h -> include/uapi/misc/intel/gna.h
  - DOC: Documentation/misc-devices/gna.rst ->
         Documentation/misc-devices/intel/gna.rst
- 'MISC' device framework used
- fixes throughout GNA device's PCI management
- header files' includes and forward declarations cleanup
- ISR made static
- unused comments cleanup
- "_priv_" segment removed from function names
- tested: v5.11-rc3 -> v5.11
- a number of other minor fixes
---
Maciej Kwapulinski (1):
  intel_gna: add a 'misc' device

Tomasz Jankowski (12):
  intel_gna: add driver module
  intel_gna: add component of hardware operation
  intel_gna: read hardware info in the driver
  intel_gna: add memory handling
  intel_gna: initialize mmu
  intel_gna: add hardware ids
  intel_gna: add request component
  intel_gna: implement scoring
  intel_gna: add a work queue to process scoring requests
  intel_gna: add interrupt handler
  intel_gna: add ioctl handler
  intel_gna: add file operations to a 'misc' device

Documentation/misc-devices/index.rst | 1 +
Documentation/misc-devices/intel/gna.rst | 48 ++
.../userspace-api/ioctl/ioctl-number.rst | 1 +
MAINTAINERS | 7 +
drivers/misc/Kconfig | 1 +
drivers/misc/Makefile | 1 +
drivers/misc/intel/gna/Kbuild | 5 +
drivers/misc/intel/gna/Kconfig | 13 +
drivers/misc/intel/gna/gna_device.c | 429 ++++++++++++++++
drivers/misc/intel/gna/gna_device.h | 89 ++++
drivers/misc/intel/gna/gna_driver.c | 43 ++
drivers/misc/intel/gna/gna_driver.h | 33 ++
drivers/misc/intel/gna/gna_hw.c | 125 +++++
drivers/misc/intel/gna/gna_hw.h | 61 +++
drivers/misc/intel/gna/gna_ioctl.c | 249 +++++++++
drivers/misc/intel/gna/gna_ioctl.h | 11 +
drivers/misc/intel/gna/gna_mem.c | 473 ++++++++++++++++++
drivers/misc/intel/gna/gna_mem.h | 107 ++++
drivers/misc/intel/gna/gna_request.c | 463 +++++++++++++++++
drivers/misc/intel/gna/gna_request.h | 62 +++
drivers/misc/intel/gna/gna_score.c | 298 +++++++++++
drivers/misc/intel/gna/gna_score.h | 18 +
include/uapi/misc/intel/gna.h | 155 ++++++
23 files changed, 2693 insertions(+)
create mode 100644 Documentation/misc-devices/intel/gna.rst
create mode 100644 drivers/misc/intel/gna/Kbuild
create mode 100644 drivers/misc/intel/gna/Kconfig
create mode 100644 drivers/misc/intel/gna/gna_device.c
create mode 100644 drivers/misc/intel/gna/gna_device.h
create mode 100644 drivers/misc/intel/gna/gna_driver.c
create mode 100644 drivers/misc/intel/gna/gna_driver.h
create mode 100644 drivers/misc/intel/gna/gna_hw.c
create mode 100644 drivers/misc/intel/gna/gna_hw.h
create mode 100644 drivers/misc/intel/gna/gna_ioctl.c
create mode 100644 drivers/misc/intel/gna/gna_ioctl.h
create mode 100644 drivers/misc/intel/gna/gna_mem.c
create mode 100644 drivers/misc/intel/gna/gna_mem.h
create mode 100644 drivers/misc/intel/gna/gna_request.c
create mode 100644 drivers/misc/intel/gna/gna_request.h
create mode 100644 drivers/misc/intel/gna/gna_score.c
create mode 100644 drivers/misc/intel/gna/gna_score.h
create mode 100644 include/uapi/misc/intel/gna.h
--
2.28.0
From: Tomasz Jankowski <[email protected]>
Add a new PCI driver for the Intel(R) Gaussian & Neural Accelerator
with basic support such as module loading and unloading. Full
functionality is added by subsequent patches.
Signed-off-by: Tomasz Jankowski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
Co-developed-by: Jianxun Zhang <[email protected]>
Signed-off-by: Jianxun Zhang <[email protected]>
Co-developed-by: Maciej Kwapulinski <[email protected]>
Signed-off-by: Maciej Kwapulinski <[email protected]>
---
Documentation/misc-devices/index.rst | 1 +
Documentation/misc-devices/intel/gna.rst | 48 ++++++
.../userspace-api/ioctl/ioctl-number.rst | 1 +
MAINTAINERS | 7 +
drivers/misc/Kconfig | 1 +
drivers/misc/Makefile | 1 +
drivers/misc/intel/gna/Kbuild | 5 +
drivers/misc/intel/gna/Kconfig | 13 ++
drivers/misc/intel/gna/gna_device.c | 74 +++++++++
drivers/misc/intel/gna/gna_device.h | 36 ++++
drivers/misc/intel/gna/gna_driver.c | 39 +++++
drivers/misc/intel/gna/gna_driver.h | 15 ++
include/uapi/misc/intel/gna.h | 155 ++++++++++++++++++
13 files changed, 396 insertions(+)
create mode 100644 Documentation/misc-devices/intel/gna.rst
create mode 100644 drivers/misc/intel/gna/Kbuild
create mode 100644 drivers/misc/intel/gna/Kconfig
create mode 100644 drivers/misc/intel/gna/gna_device.c
create mode 100644 drivers/misc/intel/gna/gna_device.h
create mode 100644 drivers/misc/intel/gna/gna_driver.c
create mode 100644 drivers/misc/intel/gna/gna_driver.h
create mode 100644 include/uapi/misc/intel/gna.h
diff --git a/Documentation/misc-devices/index.rst b/Documentation/misc-devices/index.rst
index 64420b3314fe..1b187ee121b0 100644
--- a/Documentation/misc-devices/index.rst
+++ b/Documentation/misc-devices/index.rst
@@ -19,6 +19,7 @@ fit into other categories.
bh1770glc
eeprom
c2port
+ intel/gna
ibmvmc
ics932s401
isl29003
diff --git a/Documentation/misc-devices/intel/gna.rst b/Documentation/misc-devices/intel/gna.rst
new file mode 100644
index 000000000000..9baeec5ceb5c
--- /dev/null
+++ b/Documentation/misc-devices/intel/gna.rst
@@ -0,0 +1,48 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+=====================================================
+Intel(R) Gaussian & Neural Accelerator (Intel(R) GNA)
+=====================================================
+
+Acronyms
+--------
+GNA - Gaussian & Neural Accelerator
+GMM - Gaussian Mixture Model
+CNN - Convolutional Neural Network
+RNN - Recurrent Neural Network
+DNN - Deep Neural Network
+
+Introduction
+------------
+The Intel(R) GNA is an internal PCI fixed device available on several Intel platforms/SoCs.
+The feature set depends on the Intel chipset SKU.
+
+Intel(R) GNA provides hardware accelerated computation for GMMs and Neural Networks.
+It supports several layer types: affine, recurrent, and convolutional among others.
+Hardware also provides helper layer types for copying and transposing matrices.
+
+Linux Driver
+------------
+The driver registers a character device to expose file operations via a device node.
+
+The driver probes/removes PCI device, implements file operations, handles runtime
+power management, and interacts with hardware through MMIO registers.
+
+Multiple processes can independently submit many requests to the driver. The driver places
+them in a FIFO queue and processes them in order, because the hardware can process only one
+request at a time.
+
+IOCTL
+-----
+User space controls the device through the IOCTL interfaces of the Intel(R) GNA driver.
+The following IOCTL commands are supported:
+
+GNA_GET_PARAMETER gets driver and device capabilities.
+
+GNA_MAP_MEMORY locks user pages and sets up the GNA MMU for DMA transfer.
+
+GNA_UNMAP_MEMORY unlocks user pages and releases GNA MMU structures.
+
+GNA_COMPUTE submits a request to the device queue.
+
+GNA_WAIT blocks and waits on the submitted request.
diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index a4c75a28c839..9ec2b32f656a 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -115,6 +115,7 @@ Code Seq# Include File Comments
'B' C0-FF advanced bbus <mailto:[email protected]>
'C' all linux/soundcard.h conflict!
'C' 01-2F linux/capi.h conflict!
+'C' 01-5F uapi/misc/intel/gna.h conflict!
'C' F0-FF drivers/net/wan/cosa.h conflict!
'D' all arch/s390/include/asm/dasd.h
'D' 40-5F drivers/scsi/dpt/dtpi_ioctl.h
diff --git a/MAINTAINERS b/MAINTAINERS
index bfc1b86e3e73..da926aa4523c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8928,6 +8928,13 @@ S: Maintained
F: Documentation/fb/intelfb.rst
F: drivers/video/fbdev/intelfb/
+INTEL GNA PCI DRIVER
+M: Maciej Kwapulinski <[email protected]>
+S: Maintained
+F: Documentation/misc-devices/intel/gna.rst
+F: drivers/misc/intel/gna/*
+F: include/uapi/misc/intel/gna.h
+
INTEL GPIO DRIVERS
M: Andy Shevchenko <[email protected]>
L: [email protected]
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index fafa8b0d8099..ce3dc5b9f821 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -481,4 +481,5 @@ source "drivers/misc/ocxl/Kconfig"
source "drivers/misc/cardreader/Kconfig"
source "drivers/misc/habanalabs/Kconfig"
source "drivers/misc/uacce/Kconfig"
+source "drivers/misc/intel/gna/Kconfig"
endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index d23231e73330..5fca2e730d96 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -57,3 +57,4 @@ obj-$(CONFIG_HABANA_AI) += habanalabs/
obj-$(CONFIG_UACCE) += uacce/
obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o
obj-$(CONFIG_HISI_HIKEY_USB) += hisi_hikey_usb.o
+obj-$(CONFIG_INTEL_GNA) += intel/gna/
diff --git a/drivers/misc/intel/gna/Kbuild b/drivers/misc/intel/gna/Kbuild
new file mode 100644
index 000000000000..5d3becc71683
--- /dev/null
+++ b/drivers/misc/intel/gna/Kbuild
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+intel_gna-y := gna_device.o gna_driver.o
+
+obj-$(CONFIG_INTEL_GNA) += intel_gna.o
diff --git a/drivers/misc/intel/gna/Kconfig b/drivers/misc/intel/gna/Kconfig
new file mode 100644
index 000000000000..c3b768a40684
--- /dev/null
+++ b/drivers/misc/intel/gna/Kconfig
@@ -0,0 +1,13 @@
+#
+# Intel(R) Gaussian & Neural Accelerator (Intel(R) GNA)
+#
+
+config INTEL_GNA
+ tristate "Intel(R) Gaussian & Neural Accelerator (Intel(R) GNA)"
+ depends on X86_64 && PCI
+ help
+ This option enables the Intel(R) Gaussian & Neural Accelerator
+ (Intel(R) GNA) driver: intel_gna.
+ User space interface is defined in include/uapi/misc/intel/gna.h, while
+ information about functionality is in
+ Documentation/misc-devices/intel/gna.rst
diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
new file mode 100644
index 000000000000..431113297879
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2017-2021 Intel Corporation
+
+#include <linux/module.h>
+#include <linux/pci.h>
+
+#include "gna_device.h"
+#include "gna_driver.h"
+
+#define GNA_BAR0 0
+
+static void gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
+ const struct pci_device_id *pci_id)
+{
+ pci_set_drvdata(pcidev, gna_priv);
+
+ gna_priv->parent = &pcidev->dev;
+ gna_priv->pdev = pcidev;
+ gna_priv->info = *(struct gna_drv_info *)pci_id->driver_data;
+ gna_priv->drv_priv = &gna_drv_priv;
+}
+
+int gna_probe(struct pci_dev *pcidev, const struct pci_device_id *pci_id)
+{
+ struct gna_private *gna_priv;
+ void __iomem *const *iomap;
+ unsigned long phys_len;
+ phys_addr_t phys;
+ int ret;
+
+ ret = pcim_enable_device(pcidev);
+ if (ret) {
+ dev_err(&pcidev->dev, "pci device can't be enabled\n");
+ return ret;
+ }
+
+ ret = pcim_iomap_regions(pcidev, 1 << GNA_BAR0, GNA_DV_NAME);
+ if (ret) {
+ dev_err(&pcidev->dev, "cannot iomap regions\n");
+ return ret;
+ }
+
+ phys = pci_resource_start(pcidev, GNA_BAR0);
+ phys_len = pci_resource_len(pcidev, GNA_BAR0);
+
+ dev_info(&pcidev->dev, "physical base address %pap, %lu bytes\n",
+ &phys, phys_len);
+
+ iomap = pcim_iomap_table(pcidev);
+ if (!iomap) {
+ dev_err(&pcidev->dev, "failed to iomap table\n");
+ return -ENODEV;
+ }
+
+ gna_priv = devm_kzalloc(&pcidev->dev, sizeof(*gna_priv), GFP_KERNEL);
+ if (!gna_priv)
+ return -ENOMEM;
+
+ gna_priv->bar0_base = iomap[GNA_BAR0];
+
+ dev_dbg(&pcidev->dev, "bar0 memory address: %p\n", gna_priv->bar0_base);
+
+ ret = dma_set_mask(&pcidev->dev, DMA_BIT_MASK(64));
+ if (ret) {
+ dev_err(&pcidev->dev, "dma_set_mask returned error %d\n", ret);
+ return ret;
+ }
+
+ pci_set_master(pcidev);
+
+ gna_dev_init(gna_priv, pcidev, pci_id);
+
+ return 0;
+}
diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
new file mode 100644
index 000000000000..d0b47f75f47f
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_device.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2017-2021 Intel Corporation */
+
+#ifndef __GNA_DEVICE_H__
+#define __GNA_DEVICE_H__
+
+#include <linux/types.h>
+
+struct gna_driver_private;
+struct pci_device_id;
+struct pci_dev;
+struct device;
+
+struct gna_drv_info {
+ u32 hwid;
+ u32 num_pagetables;
+ u32 num_page_entries;
+ u32 max_layer_count;
+ u64 max_hw_mem;
+};
+
+struct gna_private {
+ struct gna_driver_private *drv_priv;
+
+ struct pci_dev *pdev;
+ /* pdev->dev */
+ struct device *parent;
+
+ /* device related resources */
+ void __iomem *bar0_base;
+ struct gna_drv_info info;
+};
+
+int gna_probe(struct pci_dev *dev, const struct pci_device_id *id);
+
+#endif /* __GNA_DEVICE_H__ */
diff --git a/drivers/misc/intel/gna/gna_driver.c b/drivers/misc/intel/gna/gna_driver.c
new file mode 100644
index 000000000000..f4922a388be7
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_driver.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2017-2021 Intel Corporation
+
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+
+#include "gna_device.h"
+#include "gna_driver.h"
+
+static int recovery_timeout = 60;
+module_param(recovery_timeout, int, 0644);
+MODULE_PARM_DESC(recovery_timeout, "Recovery timeout in seconds");
+
+struct gna_driver_private gna_drv_priv;
+
+static struct pci_driver gna_driver = {
+ .name = GNA_DV_NAME,
+ .probe = gna_probe,
+};
+
+static int __init gna_drv_init(void)
+{
+ gna_drv_priv.recovery_timeout_jiffies = msecs_to_jiffies(recovery_timeout * 1000);
+
+ return pci_register_driver(&gna_driver);
+}
+
+static void __exit gna_drv_exit(void)
+{
+ pci_unregister_driver(&gna_driver);
+}
+
+module_init(gna_drv_init);
+module_exit(gna_drv_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) Gaussian & Neural Accelerator (Intel(R) GNA) Driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/misc/intel/gna/gna_driver.h b/drivers/misc/intel/gna/gna_driver.h
new file mode 100644
index 000000000000..ed507ea10866
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_driver.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2017-2021 Intel Corporation */
+
+#ifndef __GNA_DRIVER_H__
+#define __GNA_DRIVER_H__
+
+#define GNA_DV_NAME "intel_gna"
+
+struct gna_driver_private {
+ int recovery_timeout_jiffies;
+};
+
+extern struct gna_driver_private gna_drv_priv;
+
+#endif /* __GNA_DRIVER_H__ */
diff --git a/include/uapi/misc/intel/gna.h b/include/uapi/misc/intel/gna.h
new file mode 100644
index 000000000000..a7e435b74a0a
--- /dev/null
+++ b/include/uapi/misc/intel/gna.h
@@ -0,0 +1,155 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+/* Copyright(c) 2017-2021 Intel Corporation */
+
+#ifndef _UAPI_GNA_H_
+#define _UAPI_GNA_H_
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+#include <linux/types.h>
+#include <linux/ioctl.h>
+#include <linux/const.h>
+
+#ifndef __user
+#define __user
+#endif
+
+/* Operation modes */
+#define GNA_MODE_GMM 0
+#define GNA_MODE_XNN 1
+
+#define GNA_PARAM_DEVICE_ID 1
+#define GNA_PARAM_RECOVERY_TIMEOUT 2
+#define GNA_PARAM_DEVICE_TYPE 3
+#define GNA_PARAM_INPUT_BUFFER_S 4
+
+#define GNA_STS_SCORE_COMPLETED _BITUL(0)
+#define GNA_STS_STATISTICS_VALID _BITUL(3)
+#define GNA_STS_PCI_MMU_ERR _BITUL(4)
+#define GNA_STS_PCI_DMA_ERR _BITUL(5)
+#define GNA_STS_PCI_UNEXCOMPL_ERR _BITUL(6)
+#define GNA_STS_VA_OOR _BITUL(7)
+#define GNA_STS_PARAM_OOR _BITUL(8)
+#define GNA_STS_SATURATE _BITUL(17)
+
+#define GNA_ERROR (GNA_STS_PCI_DMA_ERR | \
+ GNA_STS_PCI_MMU_ERR | \
+ GNA_STS_PCI_UNEXCOMPL_ERR | \
+ GNA_STS_PARAM_OOR | \
+ GNA_STS_VA_OOR)
+
+#define GNA_DEV_TYPE_0_9 0x09
+#define GNA_DEV_TYPE_1_0 0x10
+#define GNA_DEV_TYPE_2_0 0x20
+
+/*
+ * Structure describes part of memory to be overwritten before starting GNA
+ */
+struct gna_memory_patch {
+ /* offset from targeted memory */
+ __u64 offset;
+
+ __u64 size;
+ __u64 value;
+};
+
+struct gna_buffer {
+ __u64 memory_id;
+
+ __u64 offset;
+ __u64 size;
+
+ __u64 patch_count;
+ __u64 patches_ptr;
+};
+
+/*
+ * Driver performance timestamps in nanoseconds.
+ * Values are relative to system boot time and do not advance during suspend.
+ */
+struct gna_drv_perf {
+ __u64 pre_processing; /* driver starts pre-processing */
+ __u64 processing; /* hw starts processing */
+ __u64 hw_completed; /* hw finishes processing */
+ __u64 completion; /* driver finishes post-processing */
+};
+
+struct gna_hw_perf {
+ __u64 total;
+ __u64 stall;
+};
+
+struct gna_compute_cfg {
+ __u32 layer_base;
+ __u32 layer_count;
+
+ /* List of GNA memory buffers */
+ __u64 buffers_ptr;
+ __u64 buffer_count;
+
+ __u8 active_list_on;
+ __u8 gna_mode;
+ __u8 hw_perf_encoding;
+ __u8 pad[5];
+};
+
+union gna_parameter {
+ struct {
+ __u64 id;
+ } in;
+
+ struct {
+ __u64 value;
+ } out;
+};
+
+union gna_memory_map {
+ struct {
+ __u64 address;
+ __u32 size;
+ __u32 pad;
+ } in;
+
+ struct {
+ __u64 memory_id;
+ } out;
+};
+
+union gna_compute {
+ struct {
+ struct gna_compute_cfg config;
+ } in;
+
+ struct {
+ __u64 request_id;
+ } out;
+};
+
+union gna_wait {
+ struct {
+ __u64 request_id;
+ __u32 timeout;
+ __u32 pad;
+ } in;
+
+ struct {
+ __u32 hw_status;
+ __u32 pad;
+ struct gna_drv_perf drv_perf;
+ struct gna_hw_perf hw_perf;
+ } out;
+};
+
+#define GNA_GET_PARAMETER _IOWR('C', 0x01, union gna_parameter)
+#define GNA_MAP_MEMORY _IOWR('C', 0x02, union gna_memory_map)
+#define GNA_UNMAP_MEMORY _IOWR('C', 0x03, __u64)
+#define GNA_COMPUTE _IOWR('C', 0x04, union gna_compute)
+#define GNA_WAIT _IOWR('C', 0x05, union gna_wait)
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* _UAPI_GNA_H_ */
--
2.28.0
From: Tomasz Jankowski <[email protected]>
Read hardware information from the GNA_MMIO_IBUFFS register.
Signed-off-by: Tomasz Jankowski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
Co-developed-by: Jianxun Zhang <[email protected]>
Signed-off-by: Jianxun Zhang <[email protected]>
Co-developed-by: Maciej Kwapulinski <[email protected]>
Signed-off-by: Maciej Kwapulinski <[email protected]>
---
drivers/misc/intel/gna/gna_device.c | 5 +++++
drivers/misc/intel/gna/gna_device.h | 5 +++++
2 files changed, 10 insertions(+)
diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
index 431113297879..6bac481b2247 100644
--- a/drivers/misc/intel/gna/gna_device.c
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -12,12 +12,17 @@
static void gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
const struct pci_device_id *pci_id)
{
+ u32 bld_reg;
+
pci_set_drvdata(pcidev, gna_priv);
gna_priv->parent = &pcidev->dev;
gna_priv->pdev = pcidev;
gna_priv->info = *(struct gna_drv_info *)pci_id->driver_data;
gna_priv->drv_priv = &gna_drv_priv;
+
+ bld_reg = gna_reg_read(gna_priv->bar0_base, GNA_MMIO_IBUFFS);
+ gna_priv->hw_info.in_buf_s = bld_reg & GENMASK(7, 0);
}
int gna_probe(struct pci_dev *pcidev, const struct pci_device_id *pci_id)
diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
index 39dc03d53feb..7704eeda90f6 100644
--- a/drivers/misc/intel/gna/gna_device.h
+++ b/drivers/misc/intel/gna/gna_device.h
@@ -23,6 +23,10 @@ struct gna_drv_info {
struct gna_desc_info desc_info;
};
+struct gna_hw_info {
+ u8 in_buf_s;
+};
+
struct gna_private {
struct gna_driver_private *drv_priv;
@@ -33,6 +37,7 @@ struct gna_private {
/* device related resources */
void __iomem *bar0_base;
struct gna_drv_info info;
+ struct gna_hw_info hw_info;
};
int gna_probe(struct pci_dev *dev, const struct pci_device_id *id);
--
2.28.0
From: Tomasz Jankowski <[email protected]>
Set up the MMU in the driver using the new memory component.
Signed-off-by: Tomasz Jankowski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
Co-developed-by: Maciej Kwapulinski <[email protected]>
Signed-off-by: Maciej Kwapulinski <[email protected]>
---
drivers/misc/intel/gna/gna_device.c | 29 +++++++++++++++++++++++++++--
1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
index 25137d0ac519..3d559f22ee93 100644
--- a/drivers/misc/intel/gna/gna_device.c
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -9,10 +9,11 @@
#define GNA_BAR0 0
-static void gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
+static int gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
const struct pci_device_id *pci_id)
{
u32 bld_reg;
+ int ret;
pci_set_drvdata(pcidev, gna_priv);
@@ -24,10 +25,29 @@ static void gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
bld_reg = gna_reg_read(gna_priv->bar0_base, GNA_MMIO_IBUFFS);
gna_priv->hw_info.in_buf_s = bld_reg & GENMASK(7, 0);
+ ret = gna_mmu_alloc(gna_priv);
+ if (ret) {
+ dev_err(&pcidev->dev, "mmu allocation failed\n");
+ goto err_pci_drvdata_unset;
+ }
+
+ dev_dbg(&pcidev->dev, "maximum memory size %llu num pd %d\n",
+ gna_priv->info.max_hw_mem, gna_priv->info.num_pagetables);
+ dev_dbg(&pcidev->dev, "desc rsvd size %d mmu vamax size %d\n",
+ gna_priv->info.desc_info.rsvd_size,
+ gna_priv->info.desc_info.mmu_info.vamax_size);
+
mutex_init(&gna_priv->mmu_lock);
idr_init(&gna_priv->memory_idr);
mutex_init(&gna_priv->memidr_lock);
+
+ return 0;
+
+err_pci_drvdata_unset:
+ pci_set_drvdata(pcidev, NULL);
+
+ return ret;
}
static void gna_dev_deinit(struct gna_private *gna_priv)
@@ -84,7 +104,12 @@ int gna_probe(struct pci_dev *pcidev, const struct pci_device_id *pci_id)
pci_set_master(pcidev);
- gna_dev_init(gna_priv, pcidev, pci_id);
+ ret = gna_dev_init(gna_priv, pcidev, pci_id);
+ if (ret) {
+ dev_err(&pcidev->dev, "could not initialize %s device\n", GNA_DV_NAME);
+ return ret;
+ }
+
return 0;
}
--
2.28.0
From: Tomasz Jankowski <[email protected]>
Add PCI ids of Intel(R) Gaussian & Neural Accelerator on supported
platforms.
Signed-off-by: Tomasz Jankowski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
Co-developed-by: Jianxun Zhang <[email protected]>
Signed-off-by: Jianxun Zhang <[email protected]>
Co-developed-by: Maciej Kwapulinski <[email protected]>
Signed-off-by: Maciej Kwapulinski <[email protected]>
---
drivers/misc/intel/gna/gna_device.c | 76 +++++++++++++++++++++++++++++
drivers/misc/intel/gna/gna_device.h | 5 +-
drivers/misc/intel/gna/gna_driver.c | 1 +
3 files changed, 80 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
index 3d559f22ee93..9838d003426f 100644
--- a/drivers/misc/intel/gna/gna_device.c
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -7,8 +7,84 @@
#include "gna_device.h"
#include "gna_driver.h"
+#define GNA_DEV_HWID_CNL 0x5A11
+#define GNA_DEV_HWID_EHL 0x4511
+#define GNA_DEV_HWID_GLK 0x3190
+#define GNA_DEV_HWID_ICL 0x8A11
+#define GNA_DEV_HWID_JSL 0x4E11
+#define GNA_DEV_HWID_TGL 0x9A11
+
#define GNA_BAR0 0
+#define GNA_FEATURES \
+ .max_hw_mem = 256 * 1024 * 1024, \
+ .num_pagetables = 64, \
+ .num_page_entries = PAGE_SIZE / sizeof(u32), \
+ /* desc_info all in bytes */ \
+ .desc_info = { \
+ .rsvd_size = 256, \
+ .cfg_size = 256, \
+ .desc_size = 784, \
+ .mmu_info = { \
+ .vamax_size = 4, \
+ .rsvd_size = 12, \
+ .pd_size = 4 * 64, \
+ }, \
+ }
+
+#define GNA_GEN1_FEATURES \
+ GNA_FEATURES, \
+ .max_layer_count = 1024
+
+#define GNA_GEN2_FEATURES \
+ GNA_FEATURES, \
+ .max_layer_count = 4096
+
+static const struct gna_drv_info cnl_drv_info = {
+ .hwid = GNA_DEV_HWID_CNL,
+ GNA_GEN1_FEATURES
+};
+
+static const struct gna_drv_info glk_drv_info = {
+ .hwid = GNA_DEV_HWID_GLK,
+ GNA_GEN1_FEATURES
+};
+
+static const struct gna_drv_info ehl_drv_info = {
+ .hwid = GNA_DEV_HWID_EHL,
+ GNA_GEN1_FEATURES
+};
+
+static const struct gna_drv_info icl_drv_info = {
+ .hwid = GNA_DEV_HWID_ICL,
+ GNA_GEN1_FEATURES
+};
+
+static const struct gna_drv_info jsl_drv_info = {
+ .hwid = GNA_DEV_HWID_JSL,
+ GNA_GEN2_FEATURES
+};
+
+static const struct gna_drv_info tgl_drv_info = {
+ .hwid = GNA_DEV_HWID_TGL,
+ GNA_GEN2_FEATURES
+};
+
+#define INTEL_GNA_DEVICE(hwid, info) \
+ { PCI_VDEVICE(INTEL, hwid), (kernel_ulong_t)(info) }
+
+const struct pci_device_id gna_pci_ids[] = {
+ INTEL_GNA_DEVICE(GNA_DEV_HWID_CNL, &cnl_drv_info),
+ INTEL_GNA_DEVICE(GNA_DEV_HWID_EHL, &ehl_drv_info),
+ INTEL_GNA_DEVICE(GNA_DEV_HWID_GLK, &glk_drv_info),
+ INTEL_GNA_DEVICE(GNA_DEV_HWID_ICL, &icl_drv_info),
+ INTEL_GNA_DEVICE(GNA_DEV_HWID_JSL, &jsl_drv_info),
+ INTEL_GNA_DEVICE(GNA_DEV_HWID_TGL, &tgl_drv_info),
+ { }
+};
+
+MODULE_DEVICE_TABLE(pci, gna_pci_ids);
+
static int gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
const struct pci_device_id *pci_id)
{
diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
index a5657ab9f62a..799788d70033 100644
--- a/drivers/misc/intel/gna/gna_device.h
+++ b/drivers/misc/intel/gna/gna_device.h
@@ -7,13 +7,12 @@
#include <linux/types.h>
#include <linux/mutex.h>
#include <linux/idr.h>
+#include <linux/pci.h>
#include "gna_hw.h"
#include "gna_mem.h"
struct gna_driver_private;
-struct pci_device_id;
-struct pci_dev;
struct device;
struct gna_drv_info {
@@ -51,6 +50,8 @@ struct gna_private {
struct mutex memidr_lock;
};
+extern const struct pci_device_id gna_pci_ids[];
+
int gna_probe(struct pci_dev *dev, const struct pci_device_id *id);
void gna_remove(struct pci_dev *dev);
diff --git a/drivers/misc/intel/gna/gna_driver.c b/drivers/misc/intel/gna/gna_driver.c
index 79f4d8522bdc..a8f69a970f7a 100644
--- a/drivers/misc/intel/gna/gna_driver.c
+++ b/drivers/misc/intel/gna/gna_driver.c
@@ -16,6 +16,7 @@ struct gna_driver_private gna_drv_priv;
static struct pci_driver gna_driver = {
.name = GNA_DV_NAME,
+ .id_table = gna_pci_ids,
.probe = gna_probe,
.remove = gna_remove,
};
--
2.28.0
From: Tomasz Jankowski <[email protected]>
Add memory handling: mapping, DMA, and pinning.

The GNA driver maps and unmaps the physical pages of 64-byte-aligned
buffers allocated by a user space program. The pages of mapped memory
are locked only during actual computation.

Also add configuration of the DMA scatter-gather list for the physical
pages, and generation of the page table and page directory to be
programmed into the GNA HW when scoring is initiated.

The GNA MMU is configured based on the memory usage of each request.
As the MMU can address up to 256MB, a single scoring request is limited
to this amount of memory.

The GNA library can allocate any number of memory regions for GNA usage.
Their number and total capacity are limited by the OS's resources.
Due to GNA MMU restrictions, even when using multiple memory regions,
the sum of all the memory regions used within a single inference
request must be less than 256MB.

At least one GNA memory region must be allocated (and it can be shared
by multiple models). At the other extreme, each GNA tensor (e.g.,
weights/biases/inputs/outputs) could use its own separate GNA memory
region.
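
As a rough illustration of where the 256MB figure comes from, the sketch
below multiplies out the MMU geometry used elsewhere in this series
(64 page tables, PAGE_SIZE / sizeof(u32) entries per table). It assumes that
each 32-bit page-table entry maps exactly one page and that PAGE_SIZE is
4096; the function name is hypothetical and does not exist in the driver.

```c
#include <stdint.h>

/*
 * Bytes addressable by the GNA MMU, assuming every 32-bit page-table
 * entry maps one page of 'page_size' bytes (an assumption made for
 * this sketch, not something the driver exports).
 */
uint64_t gna_mmu_addressable_bytes(uint64_t page_size, uint64_t num_pagetables)
{
	/* One page table occupies one page and holds 32-bit entries. */
	uint64_t entries_per_table = page_size / sizeof(uint32_t);

	return num_pagetables * entries_per_table * page_size;
}
```

With page_size = 4096 and num_pagetables = 64 this gives
64 * 1024 * 4096 bytes, which matches the max_hw_mem value of
256 * 1024 * 1024 set in the hardware ids patch.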
Signed-off-by: Tomasz Jankowski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
Co-developed-by: Jianxun Zhang <[email protected]>
Signed-off-by: Jianxun Zhang <[email protected]>
Co-developed-by: Maciej Kwapulinski <[email protected]>
Signed-off-by: Maciej Kwapulinski <[email protected]>
---
drivers/misc/intel/gna/Kbuild | 2 +-
drivers/misc/intel/gna/gna_device.c | 20 ++
drivers/misc/intel/gna/gna_device.h | 13 +
drivers/misc/intel/gna/gna_driver.c | 1 +
drivers/misc/intel/gna/gna_driver.h | 16 +
drivers/misc/intel/gna/gna_mem.c | 470 ++++++++++++++++++++++++++++
drivers/misc/intel/gna/gna_mem.h | 107 +++++++
7 files changed, 628 insertions(+), 1 deletion(-)
create mode 100644 drivers/misc/intel/gna/gna_mem.c
create mode 100644 drivers/misc/intel/gna/gna_mem.h
diff --git a/drivers/misc/intel/gna/Kbuild b/drivers/misc/intel/gna/Kbuild
index 0cf083bb211a..e5cd953d83b2 100644
--- a/drivers/misc/intel/gna/Kbuild
+++ b/drivers/misc/intel/gna/Kbuild
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
-intel_gna-y := gna_device.o gna_driver.o gna_hw.o
+intel_gna-y := gna_device.o gna_driver.o gna_mem.o gna_hw.o
obj-$(CONFIG_INTEL_GNA) += intel_gna.o
diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
index 6bac481b2247..25137d0ac519 100644
--- a/drivers/misc/intel/gna/gna_device.c
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -23,6 +23,17 @@ static void gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
bld_reg = gna_reg_read(gna_priv->bar0_base, GNA_MMIO_IBUFFS);
gna_priv->hw_info.in_buf_s = bld_reg & GENMASK(7, 0);
+
+ mutex_init(&gna_priv->mmu_lock);
+
+ idr_init(&gna_priv->memory_idr);
+ mutex_init(&gna_priv->memidr_lock);
+}
+
+static void gna_dev_deinit(struct gna_private *gna_priv)
+{
+ idr_destroy(&gna_priv->memory_idr);
+ gna_mmu_free(gna_priv);
}
int gna_probe(struct pci_dev *pcidev, const struct pci_device_id *pci_id)
@@ -77,3 +88,12 @@ int gna_probe(struct pci_dev *pcidev, const struct pci_device_id *pci_id)
return 0;
}
+
+void gna_remove(struct pci_dev *pcidev)
+{
+ struct gna_private *gna_priv;
+
+ gna_priv = pci_get_drvdata(pcidev);
+
+ gna_dev_deinit(gna_priv);
+}
diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
index 7704eeda90f6..a5657ab9f62a 100644
--- a/drivers/misc/intel/gna/gna_device.h
+++ b/drivers/misc/intel/gna/gna_device.h
@@ -5,8 +5,11 @@
#define __GNA_DEVICE_H__
#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/idr.h>
#include "gna_hw.h"
+#include "gna_mem.h"
struct gna_driver_private;
struct pci_device_id;
@@ -38,8 +41,18 @@ struct gna_private {
void __iomem *bar0_base;
struct gna_drv_info info;
struct gna_hw_info hw_info;
+
+ struct gna_mmu_object mmu;
+ struct mutex mmu_lock;
+
+ /* memory objects' store */
+ struct idr memory_idr;
+ /* lock protecting memory_idr */
+ struct mutex memidr_lock;
};
int gna_probe(struct pci_dev *dev, const struct pci_device_id *id);
+void gna_remove(struct pci_dev *dev);
+
#endif /* __GNA_DEVICE_H__ */
diff --git a/drivers/misc/intel/gna/gna_driver.c b/drivers/misc/intel/gna/gna_driver.c
index f4922a388be7..79f4d8522bdc 100644
--- a/drivers/misc/intel/gna/gna_driver.c
+++ b/drivers/misc/intel/gna/gna_driver.c
@@ -17,6 +17,7 @@ struct gna_driver_private gna_drv_priv;
static struct pci_driver gna_driver = {
.name = GNA_DV_NAME,
.probe = gna_probe,
+ .remove = gna_remove,
};
static int __init gna_drv_init(void)
diff --git a/drivers/misc/intel/gna/gna_driver.h b/drivers/misc/intel/gna/gna_driver.h
index ed507ea10866..69087317d668 100644
--- a/drivers/misc/intel/gna/gna_driver.h
+++ b/drivers/misc/intel/gna/gna_driver.h
@@ -4,12 +4,28 @@
#ifndef __GNA_DRIVER_H__
#define __GNA_DRIVER_H__
+#include <linux/mutex.h>
+#include <linux/list.h>
+
#define GNA_DV_NAME "intel_gna"
+struct gna_private;
+struct file;
+
struct gna_driver_private {
int recovery_timeout_jiffies;
};
+struct gna_file_private {
+ struct file *fd;
+ struct gna_private *gna_priv;
+
+ struct list_head memory_list;
+ struct mutex memlist_lock;
+
+ struct list_head flist;
+};
+
extern struct gna_driver_private gna_drv_priv;
#endif /* __GNA_DRIVER_H__ */
diff --git a/drivers/misc/intel/gna/gna_mem.c b/drivers/misc/intel/gna/gna_mem.c
new file mode 100644
index 000000000000..f3828b503ff6
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_mem.c
@@ -0,0 +1,470 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2017-2021 Intel Corporation
+
+#include <linux/device.h>
+#include <linux/mm.h>
+#include <linux/mmap_lock.h>
+#include <linux/pagemap.h>
+#include <linux/pci.h>
+#include <linux/sched.h>
+#include <linux/sched/mm.h>
+#include <linux/sched/task.h>
+#include <linux/slab.h>
+#include <linux/swap.h>
+
+#include <uapi/misc/intel/gna.h>
+
+#include "gna_device.h"
+#include "gna_driver.h"
+#include "gna_mem.h"
+
+static void gna_mmu_init(struct gna_private *gna_priv)
+{
+ struct gna_mmu_object *mmu;
+ dma_addr_t pagetable_dma;
+ u32 *pgdirn;
+ int i;
+
+ mmu = &gna_priv->mmu;
+
+ pgdirn = mmu->hwdesc->mmu.pagedir_n;
+
+ for (i = 0; i < mmu->num_pagetables; i++) {
+ pagetable_dma = mmu->pagetables_dma[i];
+ pgdirn[i] = pagetable_dma >> PAGE_SHIFT;
+ }
+
+ for (; i < GNA_PGDIRN_LEN; i++)
+ pgdirn[i] = GNA_PGDIR_INVALID;
+}
+
+/* descriptor and page tables allocation */
+int gna_mmu_alloc(struct gna_private *gna_priv)
+{
+ struct gna_mmu_object *mmu;
+ struct pci_dev *pcidev;
+ int desc_size;
+ int i;
+
+ pcidev = gna_priv->pdev;
+
+ if (gna_priv->info.num_pagetables > GNA_PGDIRN_LEN) {
+ dev_err(&pcidev->dev, "too many page tables requested\n");
+ return -EINVAL;
+ }
+
+ mmu = &gna_priv->mmu;
+
+ desc_size = round_up(gna_priv->info.desc_info.desc_size, PAGE_SIZE);
+
+ mmu->hwdesc = dma_alloc_coherent(&pcidev->dev, desc_size, &mmu->hwdesc_dma,
+ GFP_KERNEL);
+ if (!mmu->hwdesc)
+ goto end;
+
+ mmu->num_pagetables = gna_priv->info.num_pagetables;
+
+ mmu->pagetables_dma = kmalloc_array(mmu->num_pagetables, sizeof(*mmu->pagetables_dma),
+ GFP_KERNEL);
+ if (!mmu->pagetables_dma)
+ goto err_free_descriptor;
+
+ mmu->pagetables = kmalloc_array(mmu->num_pagetables, sizeof(*mmu->pagetables), GFP_KERNEL);
+
+ if (!mmu->pagetables)
+ goto err_free_pagetables_dma;
+
+ for (i = 0; i < mmu->num_pagetables; i++) {
+ mmu->pagetables[i] = dma_alloc_coherent(&pcidev->dev, PAGE_SIZE,
+ &mmu->pagetables_dma[i], GFP_KERNEL);
+ if (!mmu->pagetables[i])
+ goto err_free_mmu;
+ }
+
+ gna_mmu_init(gna_priv);
+
+ return 0;
+
+err_free_mmu:
+ while (i--) {
+ dma_free_coherent(&pcidev->dev, PAGE_SIZE, mmu->pagetables[i],
+ mmu->pagetables_dma[i]);
+ mmu->pagetables[i] = NULL;
+ mmu->pagetables_dma[i] = 0;
+ }
+
+ kfree(mmu->pagetables);
+ mmu->pagetables = NULL;
+ mmu->num_pagetables = 0;
+
+err_free_pagetables_dma:
+ kfree(mmu->pagetables_dma);
+ mmu->pagetables_dma = NULL;
+
+err_free_descriptor:
+ dma_free_coherent(&pcidev->dev, desc_size, mmu->hwdesc, mmu->hwdesc_dma);
+ mmu->hwdesc = NULL;
+ mmu->hwdesc_dma = 0;
+
+end:
+ return -ENOMEM;
+}
+
+void gna_mmu_free(struct gna_private *gna_priv)
+{
+ struct gna_mmu_object *mmu;
+ int desc_size;
+ int i;
+
+ mmu = &gna_priv->mmu;
+ mutex_lock(&gna_priv->mmu_lock);
+
+ for (i = 0; i < mmu->num_pagetables; i++) {
+ dma_free_coherent(&gna_priv->pdev->dev, PAGE_SIZE, mmu->pagetables[i],
+ mmu->pagetables_dma[i]);
+ mmu->pagetables[i] = NULL;
+ mmu->pagetables_dma[i] = 0;
+ }
+
+ kfree(mmu->pagetables);
+ mmu->pagetables = NULL;
+
+ kfree(mmu->pagetables_dma);
+ mmu->pagetables_dma = NULL;
+
+ desc_size = round_up(gna_priv->info.desc_info.desc_size, PAGE_SIZE);
+ dma_free_coherent(&gna_priv->pdev->dev, desc_size, mmu->hwdesc, mmu->hwdesc_dma);
+ mmu->hwdesc = NULL;
+ mmu->hwdesc_dma = 0;
+
+ mutex_unlock(&gna_priv->mmu_lock);
+}
+
+void gna_mmu_add(struct gna_private *gna_priv, struct gna_memory_object *mo)
+{
+ struct gna_mmu_object *mmu;
+ struct scatterlist *sgl;
+ dma_addr_t sg_page;
+ int sg_page_len;
+ u32 *pagetable;
+ u32 mmu_page;
+ int sg_pages;
+ int i;
+ int j;
+
+ mmu = &gna_priv->mmu;
+ mutex_lock(&gna_priv->mmu_lock);
+
+ j = mmu->filled_pages;
+ sgl = mo->sgt->sgl;
+ if (!sgl) {
+ dev_warn(&gna_priv->pdev->dev, "empty scatter list in memory object\n");
+ goto warn_empty_sgl;
+ }
+ sg_page = sg_dma_address(sgl);
+ sg_page_len = round_up(sg_dma_len(sgl), PAGE_SIZE) >> PAGE_SHIFT;
+ sg_pages = 0;
+
+ for (i = mmu->filled_pts; i < mmu->num_pagetables; i++) {
+ if (!sgl)
+ break;
+
+ pagetable = mmu->pagetables[i];
+
+ for (j = mmu->filled_pages; j < GNA_PT_LENGTH; j++) {
+ mmu_page = sg_page >> PAGE_SHIFT;
+ pagetable[j] = mmu_page;
+
+ mmu->filled_pages++;
+ sg_page += PAGE_SIZE;
+ sg_pages++;
+ if (sg_pages == sg_page_len) {
+ sgl = sg_next(sgl);
+ if (!sgl)
+ break;
+
+ sg_page = sg_dma_address(sgl);
+ sg_page_len =
+ round_up(sg_dma_len(sgl), PAGE_SIZE)
+ >> PAGE_SHIFT;
+ sg_pages = 0;
+ }
+ }
+
+ if (j == GNA_PT_LENGTH) {
+ mmu->filled_pages = 0;
+ mmu->filled_pts++;
+ }
+ }
+
+ mmu->hwdesc->mmu.vamaxaddr =
+ (mmu->filled_pts * PAGE_SIZE * GNA_PGDIR_ENTRIES) +
+ (mmu->filled_pages * PAGE_SIZE) - 1;
+ dev_dbg(&gna_priv->pdev->dev, "vamaxaddr set to %u\n", mmu->hwdesc->mmu.vamaxaddr);
+
+warn_empty_sgl:
+ mutex_unlock(&gna_priv->mmu_lock);
+}
+
+void gna_mmu_clear(struct gna_private *gna_priv)
+{
+ struct gna_mmu_object *mmu;
+ int i;
+
+ mmu = &gna_priv->mmu;
+ mutex_lock(&gna_priv->mmu_lock);
+
+ for (i = 0; i < mmu->filled_pts; i++)
+ memset(mmu->pagetables[i], 0, PAGE_SIZE);
+
+ if (mmu->filled_pages > 0)
+ memset(mmu->pagetables[mmu->filled_pts], 0, mmu->filled_pages * GNA_PT_ENTRY_SIZE);
+
+ mmu->filled_pts = 0;
+ mmu->filled_pages = 0;
+ mmu->hwdesc->mmu.vamaxaddr = 0;
+
+ mutex_unlock(&gna_priv->mmu_lock);
+}
+
+int gna_buffer_get_size(u64 offset, u64 size)
+{
+ u64 page_offset;
+
+ page_offset = offset & ~PAGE_MASK;
+ return round_up(page_offset + size, PAGE_SIZE);
+}
+
+/* must be called with gna_memory_object page_lock held */
+static int gna_get_pages(struct gna_memory_object *mo, u64 offset, u64 size)
+{
+ struct gna_private *gna_priv;
+ u64 effective_address;
+ struct mm_struct *mm;
+ struct sg_table *sgt;
+ struct page **pages;
+ int effective_size;
+ int num_pinned;
+ int num_pages;
+ int skip_size;
+ int ents;
+ int ret;
+
+ ret = 0;
+ gna_priv = mo->gna_priv;
+
+ if (mo->pages) {
+ dev_warn(&gna_priv->pdev->dev, "pages are already pinned\n");
+ return -EFAULT;
+ }
+
+ skip_size = round_down(offset, PAGE_SIZE);
+ effective_address = mo->user_address + skip_size;
+ dev_dbg(&gna_priv->pdev->dev, "user address %llx\n", mo->user_address);
+ dev_dbg(&gna_priv->pdev->dev, "effective user address %llx\n", effective_address);
+
+ effective_size = gna_buffer_get_size(offset, size);
+ num_pages = effective_size >> PAGE_SHIFT;
+ dev_dbg(&gna_priv->pdev->dev, "allocating %d pages\n", num_pages);
+ /* kvmalloc_array() may fall back to vmalloc because num_pages can be large */
+ pages = kvmalloc_array(num_pages, sizeof(struct page *), GFP_KERNEL);
+ if (!pages) {
+ ret = -ENOMEM;
+ goto err_exit;
+ }
+
+ get_task_struct(mo->task);
+ mm = get_task_mm(mo->task);
+ if (!mm) {
+ ret = -ENOENT;
+ kvfree(pages);
+ goto err_put_task;
+ }
+ mmap_read_lock(mm);
+ num_pinned = get_user_pages_remote(mm, effective_address, num_pages,
+ FOLL_WRITE, pages, NULL, NULL);
+ mmap_read_unlock(mm);
+ mmput(mm);
+
+ if (num_pinned <= 0) {
+ ret = num_pinned ? num_pinned : -EFAULT;
+ dev_err(&gna_priv->pdev->dev, "get_user_pages_remote() failed\n");
+ goto err_free_pages;
+ }
+ if (num_pinned < num_pages) {
+ ret = -EFAULT;
+ dev_err(&gna_priv->pdev->dev,
+ "get_user_pages_remote() pinned fewer pages than requested\n");
+ goto err_put_pages;
+ }
+
+ sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
+ if (!sgt) {
+ ret = -ENOMEM;
+ goto err_put_pages;
+ }
+
+ ret = sg_alloc_table_from_pages(sgt, pages, num_pinned, 0, mo->memory_size, GFP_KERNEL);
+ if (ret) {
+ dev_err(&gna_priv->pdev->dev, "could not alloc scatter list\n");
+ goto err_free_sgt;
+ }
+
+ if (IS_ERR(sgt->sgl)) {
+ dev_err(&gna_priv->pdev->dev, "sgl allocation failed\n");
+ ret = PTR_ERR(sgt->sgl);
+ goto err_free_sgt;
+ }
+
+ ents = pci_map_sg(gna_priv->pdev, sgt->sgl, sgt->nents, PCI_DMA_BIDIRECTIONAL);
+ if (ents <= 0) {
+ dev_err(&gna_priv->pdev->dev, "could not map scatter gather list\n");
+ ret = -EIO;
+ goto err_free_sgl;
+ }
+
+ mo->sgt = sgt;
+ mo->pages = pages;
+ mo->num_pinned = num_pinned;
+
+ return 0;
+
+err_free_sgl:
+ sg_free_table(sgt);
+
+err_free_sgt:
+ kfree(sgt);
+
+err_put_pages:
+ release_pages(pages, num_pinned);
+
+err_free_pages:
+ kvfree(pages);
+
+err_put_task:
+ put_task_struct(mo->task);
+
+err_exit:
+ return ret;
+}
+
+/* must be called with gna_memory_object page_lock held */
+static void gna_put_pages(struct gna_memory_object *mo)
+{
+ struct gna_private *gna_priv;
+ struct sg_table *sgt;
+
+ gna_priv = mo->gna_priv;
+
+ if (!mo->pages) {
+ dev_warn(&gna_priv->pdev->dev, "memory object has no pages %llu\n", mo->memory_id);
+ return;
+ }
+
+ sgt = mo->sgt;
+
+ pci_unmap_sg(gna_priv->pdev, sgt->sgl, sgt->nents, PCI_DMA_BIDIRECTIONAL);
+ sg_free_table(sgt);
+ kfree(sgt);
+ mo->sgt = NULL;
+
+ release_pages(mo->pages, mo->num_pinned);
+ kvfree(mo->pages);
+ mo->pages = NULL;
+ mo->num_pinned = 0;
+
+ put_task_struct(mo->task);
+}
+
+void gna_memory_free(struct gna_private *gna_priv, struct gna_memory_object *mo)
+{
+ mutex_lock(&gna_priv->memidr_lock);
+ idr_remove(&gna_priv->memory_idr, mo->memory_id);
+ mutex_unlock(&gna_priv->memidr_lock);
+
+ cancel_work_sync(&mo->work);
+ kfree(mo);
+}
+
+static void gna_memory_release(struct work_struct *work)
+{
+ struct gna_memory_object *mo;
+
+ mo = container_of(work, struct gna_memory_object, work);
+
+ mo->user_ptr = NULL;
+
+ wake_up_interruptible(&mo->waitq);
+}
+
+static const struct gna_memory_operations memory_ops = {
+ .get_pages = gna_get_pages,
+ .put_pages = gna_put_pages,
+};
+
+int gna_map_memory(struct gna_file_private *file_priv, union gna_memory_map *gna_mem)
+{
+ struct gna_memory_object *mo;
+ struct gna_private *gna_priv;
+ int memory_id;
+ int ret;
+
+ ret = 0;
+
+ gna_priv = file_priv->gna_priv;
+
+ if (gna_mem->in.address & ~PAGE_MASK) {
+ dev_err(&gna_priv->pdev->dev, "user pointer not page aligned\n");
+ return -EINVAL;
+ }
+
+ if (!gna_mem->in.size) {
+ dev_err(&gna_priv->pdev->dev, "invalid user memory size\n");
+ return -EINVAL;
+ }
+
+ if (!access_ok(u64_to_user_ptr(gna_mem->in.address), gna_mem->in.size)) {
+ dev_err(&gna_priv->pdev->dev, "invalid user pointer\n");
+ return -EINVAL;
+ }
+
+ mo = kzalloc(sizeof(*mo), GFP_KERNEL);
+ if (!mo)
+ return -ENOMEM;
+
+ mo->fd = file_priv->fd;
+ mo->gna_priv = gna_priv;
+ mo->ops = &memory_ops;
+ mo->user_address = gna_mem->in.address;
+ mo->memory_size = gna_mem->in.size;
+ mo->user_ptr = u64_to_user_ptr(gna_mem->in.address);
+ mo->num_pages = round_up(gna_mem->in.size, PAGE_SIZE) >> PAGE_SHIFT;
+ mo->task = current;
+ INIT_WORK(&mo->work, gna_memory_release);
+ init_waitqueue_head(&mo->waitq);
+ mutex_init(&mo->page_lock);
+
+ mutex_lock(&gna_priv->memidr_lock);
+ memory_id = idr_alloc(&gna_priv->memory_idr, mo, 1, 0, GFP_KERNEL);
+ mutex_unlock(&gna_priv->memidr_lock);
+
+ if (memory_id < 0) {
+ dev_err(&gna_priv->pdev->dev, "idr allocation for memory failed\n");
+ ret = memory_id;
+ goto err_free_mo;
+ }
+
+ mo->memory_id = (u64)memory_id;
+
+ mutex_lock(&file_priv->memlist_lock);
+ list_add_tail(&mo->file_mem_list, &file_priv->memory_list);
+ mutex_unlock(&file_priv->memlist_lock);
+
+ gna_mem->out.memory_id = mo->memory_id;
+
+ return 0;
+
+err_free_mo:
+ kfree(mo);
+ return ret;
+}
diff --git a/drivers/misc/intel/gna/gna_mem.h b/drivers/misc/intel/gna/gna_mem.h
new file mode 100644
index 000000000000..583e5bed1d2d
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_mem.h
@@ -0,0 +1,107 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2017-2021 Intel Corporation */
+
+#ifndef __GNA_MEM_H__
+#define __GNA_MEM_H__
+
+#include <linux/mmu_notifier.h>
+
+#include "gna_hw.h"
+
+union gna_memory_map;
+
+struct gna_file_private;
+
+struct gna_xnn_descriptor {
+ u32 labase;
+ u16 lacount;
+ u16 _rsvd;
+};
+
+struct gna_mmu {
+ u32 vamaxaddr;
+ u8 __res_204[12];
+ u32 pagedir_n[GNA_PGDIRN_LEN];
+};
+
+struct gna_hw_descriptor {
+ u8 __res_0000[256];
+ struct gna_xnn_descriptor xnn_config;
+ u8 __unused[248];
+ struct gna_mmu mmu;
+};
+
+struct gna_mmu_object {
+ struct gna_hw_descriptor *hwdesc;
+
+ dma_addr_t hwdesc_dma;
+
+ u32 **pagetables;
+ dma_addr_t *pagetables_dma;
+
+ u32 num_pagetables;
+
+ u32 filled_pts;
+ u32 filled_pages;
+};
+
+struct gna_mmu_notifier {
+ struct gna_file_private *file_priv;
+ struct gna_private *gna_priv;
+ struct gna_memory_object *mo;
+ struct mmu_notifier mn;
+ struct mm_struct *mm;
+};
+
+struct gna_memory_object {
+ u64 memory_id;
+
+ const struct gna_memory_operations *ops;
+
+ struct gna_private *gna_priv;
+ struct file *fd;
+
+ void __user *user_ptr;
+ u64 user_address;
+ u64 memory_size;
+
+ struct page **pages;
+ struct sg_table *sgt;
+ int num_pages;
+ int num_pinned;
+ struct mutex page_lock; /* protects get/put pages operations */
+
+ struct task_struct *task;
+
+ struct list_head mem_list;
+
+ struct list_head file_mem_list;
+
+ struct work_struct work;
+
+ struct wait_queue_head waitq;
+};
+
+struct gna_memory_operations {
+ /* pins pages */
+ int (*get_pages)(struct gna_memory_object *mo, u64 offset, u64 size);
+
+ /* puts previously pinned pages */
+ void (*put_pages)(struct gna_memory_object *mo);
+};
+
+int gna_buffer_get_size(u64 offset, u64 size);
+
+int gna_map_memory(struct gna_file_private *file_priv, union gna_memory_map *gna_mem);
+
+int gna_mmu_alloc(struct gna_private *gna_priv);
+
+void gna_mmu_free(struct gna_private *gna_priv);
+
+void gna_mmu_add(struct gna_private *gna_priv, struct gna_memory_object *object);
+
+void gna_mmu_clear(struct gna_private *gna_priv);
+
+void gna_memory_free(struct gna_private *gna_priv, struct gna_memory_object *mo);
+
+#endif /* __GNA_MEM_H__ */
--
2.28.0
From: Tomasz Jankowski <[email protected]>
Add a new component for scoring logic such as configuring and kicking
off the hardware.
Signed-off-by: Tomasz Jankowski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
Co-developed-by: Jianxun Zhang <[email protected]>
Signed-off-by: Jianxun Zhang <[email protected]>
Co-developed-by: Maciej Kwapulinski <[email protected]>
Signed-off-by: Maciej Kwapulinski <[email protected]>
---
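Note on the GMM offset lookup: gna_find_buffer() walks the request's
buffers, rounds each one up to whole pages, and returns the buffer whose
span contains the requested MMU offset. The user-space sketch below
illustrates only that arithmetic. It is not the driver code: struct
sketch_buffer, sketch_span() and sketch_find_buffer() are hypothetical
names, and a 4 KiB page size is assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the offset/size pair of struct gna_buffer. */
struct sketch_buffer {
	uint64_t offset; /* byte offset into the user memory object */
	uint64_t size;   /* byte length of the region to score */
};

/* Page-rounded span of a buffer, counted from the start of its first page. */
static uint64_t sketch_span(const struct sketch_buffer *b)
{
	uint64_t in_page = b->offset % SKETCH_PAGE_SIZE;

	return (in_page + b->size + SKETCH_PAGE_SIZE - 1) /
	       SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;
}

/*
 * Return the index of the buffer whose span covers mmu_offset and store
 * the span's starting offset in *base, or return -1 if none covers it.
 */
static int sketch_find_buffer(const struct sketch_buffer *list, size_t count,
			      uint64_t mmu_offset, uint64_t *base)
{
	uint64_t start = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		uint64_t span = sketch_span(&list[i]);

		if (mmu_offset < start + span) {
			*base = start;
			return (int)i;
		}
		start += span;
	}
	return -1;
}
```

With two buffers of {offset 100, size 5000} and {offset 0, size 4096}, the
first one spans two pages, so MMU offset 8192 falls at the start of the
second buffer.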
drivers/misc/intel/gna/Kbuild | 2 +-
drivers/misc/intel/gna/gna_device.c | 3 +
drivers/misc/intel/gna/gna_device.h | 5 +
drivers/misc/intel/gna/gna_score.c | 298 ++++++++++++++++++++++++++++
drivers/misc/intel/gna/gna_score.h | 18 ++
5 files changed, 325 insertions(+), 1 deletion(-)
create mode 100644 drivers/misc/intel/gna/gna_score.c
create mode 100644 drivers/misc/intel/gna/gna_score.h
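Note on memory patching: gna_do_patch_memory() stores a 1, 2, 4 or 8 byte
value at an offset supplied by user space. A minimal user-space sketch of
that dispatch follows. The explicit bounds check is an assumption added
for illustration and is not part of this patch; sketch_apply_patch() is a
hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Store the low `size` bytes of `value` at base + offset.
 * Returns 0 on success, -1 for an unsupported width or a write that
 * would run past `len` (the bounds check is an illustrative assumption).
 */
static int sketch_apply_patch(uint8_t *base, size_t len, uint64_t offset,
			      uint64_t size, uint64_t value)
{
	uint8_t v8 = (uint8_t)value;
	uint16_t v16 = (uint16_t)value;
	uint32_t v32 = (uint32_t)value;

	if (offset > len || size > len - offset)
		return -1;

	/* memcpy keeps the stores valid even at unaligned offsets */
	switch (size) {
	case sizeof(v8):
		memcpy(base + offset, &v8, sizeof(v8));
		break;
	case sizeof(v16):
		memcpy(base + offset, &v16, sizeof(v16));
		break;
	case sizeof(v32):
		memcpy(base + offset, &v32, sizeof(v32));
		break;
	case sizeof(value):
		memcpy(base + offset, &value, sizeof(value));
		break;
	default:
		/* zero or any other width is rejected */
		return -1;
	}

	return 0;
}
```

Rejecting a patch whose offset plus size exceeds the mapped length keeps a
malformed request from writing outside the buffer it names.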
diff --git a/drivers/misc/intel/gna/Kbuild b/drivers/misc/intel/gna/Kbuild
index 5dbbd3f0a543..9dac467839c9 100644
--- a/drivers/misc/intel/gna/Kbuild
+++ b/drivers/misc/intel/gna/Kbuild
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
-intel_gna-y := gna_device.o gna_driver.o gna_mem.o gna_request.o gna_hw.o
+intel_gna-y := gna_device.o gna_driver.o gna_mem.o gna_request.o gna_score.o gna_hw.o
obj-$(CONFIG_INTEL_GNA) += intel_gna.o
diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
index 14ce24fd18ff..e1a1f3142684 100644
--- a/drivers/misc/intel/gna/gna_device.c
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -119,6 +119,9 @@ static int gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
idr_init(&gna_priv->memory_idr);
mutex_init(&gna_priv->memidr_lock);
+ mutex_init(&gna_priv->flist_lock);
+ INIT_LIST_HEAD(&gna_priv->file_list);
+
atomic_set(&gna_priv->request_count, 0);
mutex_init(&gna_priv->reqlist_lock);
diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
index b54d0ea9b9ef..878a972ab5b3 100644
--- a/drivers/misc/intel/gna/gna_device.h
+++ b/drivers/misc/intel/gna/gna_device.h
@@ -33,6 +33,11 @@ struct gna_hw_info {
struct gna_private {
struct gna_driver_private *drv_priv;
+ /* list of opened files */
+ struct list_head file_list;
+ /* protects file_list */
+ struct mutex flist_lock;
+
struct pci_dev *pdev;
/* pdev->dev */
struct device *parent;
diff --git a/drivers/misc/intel/gna/gna_score.c b/drivers/misc/intel/gna/gna_score.c
new file mode 100644
index 000000000000..794039d2da43
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_score.c
@@ -0,0 +1,298 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2017-2021 Intel Corporation
+
+#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/poll.h>
+#include <linux/sched.h>
+#include <linux/sched/mm.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/vmalloc.h>
+
+#include <uapi/misc/intel/gna.h>
+
+#include "gna_device.h"
+#include "gna_driver.h"
+#include "gna_request.h"
+#include "gna_score.h"
+
+int gna_validate_score_config(struct gna_compute_cfg *compute_cfg,
+ struct gna_file_private *file_priv)
+{
+ struct gna_private *gna_priv;
+ size_t buffers_size;
+
+ gna_priv = file_priv->gna_priv;
+
+ if (compute_cfg->gna_mode > GNA_MODE_XNN) {
+ dev_err(&gna_priv->pdev->dev, "invalid mode\n");
+ return -EINVAL;
+ }
+
+ if (compute_cfg->layer_count > gna_priv->info.max_layer_count) {
+ dev_err(&gna_priv->pdev->dev, "max layer count exceeded\n");
+ return -EINVAL;
+ }
+
+ if (compute_cfg->buffer_count == 0) {
+ dev_err(&gna_priv->pdev->dev, "no buffers\n");
+ return -EINVAL;
+ }
+
+ buffers_size = sizeof(struct gna_buffer) * compute_cfg->buffer_count;
+ if (!access_ok(u64_to_user_ptr(compute_cfg->buffers_ptr), buffers_size)) {
+ dev_err(&gna_priv->pdev->dev, "invalid buffers pointer\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int gna_do_patch_memory(struct gna_private *gna_priv, struct gna_memory_object *mo,
+ struct gna_memory_patch *patch, void *vaddr)
+{
+ size_t size;
+ void *dest;
+ u64 value;
+
+ value = patch->value;
+ size = patch->size;
+ dest = (u8 *)vaddr + patch->offset;
+ dev_dbg(&gna_priv->pdev->dev, "patch offset: %llu, size: %zu, value: %llu\n",
+ patch->offset, size, value);
+
+ switch (size) {
+ case 0:
+ return -EFAULT;
+ case sizeof(u8):
+ *((u8 *)dest) = (u8)value;
+ break;
+ case sizeof(u16):
+ *((u16 *)dest) = (u16)value;
+ break;
+ case sizeof(u32):
+ *((u32 *)dest) = (u32)value;
+ break;
+ case sizeof(u64):
+ *((u64 *)dest) = (u64)value;
+ break;
+ default:
+ /* unsupported patch size */
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int gna_mem_patch_memory(struct gna_private *gna_priv, struct gna_buffer *buffer)
+{
+ struct gna_memory_patch *patch;
+ struct gna_memory_object *mo;
+ void *vaddr;
+ int ret = 0;
+ u32 i;
+
+ dev_dbg(&gna_priv->pdev->dev, "memory_id: %llu, patch_count: %llu\n",
+ buffer->memory_id, buffer->patch_count);
+
+ mutex_lock(&gna_priv->memidr_lock);
+ mo = idr_find(&gna_priv->memory_idr, buffer->memory_id);
+ mutex_unlock(&gna_priv->memidr_lock);
+ if (!mo)
+ return -EINVAL;
+
+ mutex_lock(&mo->page_lock);
+ ret = mo->ops->get_pages(mo, buffer->offset, buffer->size);
+ mutex_unlock(&mo->page_lock);
+ if (ret)
+ return ret;
+
+ if (buffer->patch_count) {
+ vaddr = vm_map_ram(mo->pages, mo->num_pinned, 0);
+ if (!vaddr)
+ return -ENOMEM;
+
+ patch = (struct gna_memory_patch *)(uintptr_t)buffer->patches_ptr;
+ for (i = 0; i < buffer->patch_count; i++, patch++) {
+ ret = gna_do_patch_memory(gna_priv, mo, patch, vaddr + buffer->offset);
+ if (ret)
+ break;
+ }
+
+ kvfree((void *)(uintptr_t)buffer->patches_ptr);
+ buffer->patches_ptr = 0;
+ vm_unmap_ram(vaddr, mo->num_pinned);
+
+ if (ret)
+ return ret;
+ }
+
+ gna_mmu_add(gna_priv, mo);
+
+ return ret;
+}
+
+static struct gna_buffer *gna_find_buffer(struct gna_buffer *buffer_list, u32 buffer_count,
+ u32 mmu_offset, u32 *memory_offset)
+{
+ struct gna_buffer *buffer;
+ u32 page_offset;
+ u32 memory_size;
+ u32 offset;
+ u32 i;
+
+ offset = 0;
+ for (i = 0; i < buffer_count; i++) {
+ buffer = buffer_list + i;
+ page_offset = buffer->offset & ~PAGE_MASK;
+ memory_size = round_up(page_offset + buffer->size, PAGE_SIZE);
+ if (mmu_offset < offset + memory_size) {
+ *memory_offset = offset;
+ return buffer;
+ }
+ offset += memory_size;
+ }
+
+ return NULL;
+}
+
+static int gna_copy_gmm_config(struct gna_private *gna_priv,
+ struct gna_buffer *buffer_list,
+ u32 buffer_count, u32 mmu_offset)
+{
+ struct gna_hw_descriptor *hwdesc;
+ struct gna_memory_object *mo;
+ struct gna_mmu_object *mmu;
+ struct gna_buffer *buffer;
+ u32 memory_offset;
+ u32 skip_offset;
+ u8 *gmm_desc;
+ void *vaddr;
+
+ mmu = &gna_priv->mmu;
+ hwdesc = mmu->hwdesc;
+
+ buffer = gna_find_buffer(buffer_list, buffer_count, mmu_offset, &memory_offset);
+ if (!buffer) {
+ dev_dbg(&gna_priv->pdev->dev, "buffer not found\n");
+ return -EINVAL;
+ }
+
+ mutex_lock(&gna_priv->memidr_lock);
+ mo = idr_find(&gna_priv->memory_idr, buffer->memory_id);
+ mutex_unlock(&gna_priv->memidr_lock);
+ if (!mo) {
+ dev_dbg(&gna_priv->pdev->dev, "memory object not found\n");
+ return -EFAULT;
+ }
+
+ vaddr = vm_map_ram(mo->pages, mo->num_pinned, 0);
+ if (!vaddr) {
+ dev_dbg(&gna_priv->pdev->dev, "mapping failed\n");
+ return -EFAULT;
+ }
+
+ skip_offset = round_down(buffer->offset, PAGE_SIZE);
+ gmm_desc = (u8 *)vaddr + skip_offset + (mmu_offset - memory_offset);
+ memcpy(&hwdesc->xnn_config, gmm_desc, sizeof(struct gna_xnn_descriptor));
+ vm_unmap_ram(vaddr, mo->num_pinned);
+
+ return 0;
+}
+
+int gna_score(struct gna_request *score_request)
+{
+ struct gna_xnn_descriptor *xnn_config;
+ struct gna_compute_cfg *compute_cfg;
+ struct gna_private *gna_priv;
+ struct gna_memory_object *mo;
+ struct gna_mmu_object *mmu;
+ struct gna_buffer *buffer;
+ bool mo_valid = true;
+ void __iomem *addr;
+ u64 buffer_count;
+ u32 desc_base;
+ int ret;
+ u64 i;
+
+ ret = 0;
+
+ gna_priv = score_request->gna_priv;
+
+ mmu = &gna_priv->mmu;
+ xnn_config = &mmu->hwdesc->xnn_config;
+ compute_cfg = &score_request->compute_cfg;
+
+ buffer = score_request->buffer_list;
+ buffer_count = score_request->buffer_count;
+ dev_dbg(&gna_priv->pdev->dev, "buffer count: %llu\n", buffer_count);
+ for (i = 0; i < buffer_count; i++, buffer++) {
+ dev_dbg(&gna_priv->pdev->dev, "patch count: %llu\n", buffer->patch_count);
+ ret = gna_mem_patch_memory(gna_priv, buffer);
+ if (ret)
+ goto err_put_pages;
+ }
+
+ switch (compute_cfg->gna_mode) {
+ case GNA_MODE_XNN:
+ dev_dbg(&gna_priv->pdev->dev, "xNN mode, labase: %d, lacount: %d\n",
+ compute_cfg->layer_base, compute_cfg->layer_count);
+ xnn_config->labase = compute_cfg->layer_base;
+ xnn_config->lacount = compute_cfg->layer_count;
+ break;
+ case GNA_MODE_GMM:
+ dev_dbg(&gna_priv->pdev->dev, "GMM mode, offset: %d\n", compute_cfg->layer_base);
+ ret = gna_copy_gmm_config(gna_priv, score_request->buffer_list,
+ buffer_count, compute_cfg->layer_base);
+ if (ret)
+ goto err_put_pages_decr;
+ break;
+ default:
+ ret = -EINVAL;
+ goto err_put_pages_decr;
+ }
+
+ addr = gna_priv->bar0_base;
+ desc_base = (u32)(mmu->hwdesc_dma >> PAGE_SHIFT);
+ gna_reg_write(addr, GNA_MMIO_DESBASE, desc_base);
+
+ gna_start_scoring(gna_priv, addr, compute_cfg);
+
+ return 0;
+
+err_put_pages_decr:
+ i--;
+ buffer--;
+err_put_pages:
+ do {
+ mutex_lock(&gna_priv->memidr_lock);
+ mo = idr_find(&gna_priv->memory_idr, buffer->memory_id);
+ mutex_unlock(&gna_priv->memidr_lock);
+ if (mo) {
+ mutex_lock(&mo->page_lock);
+ mo->ops->put_pages(mo);
+ mutex_unlock(&mo->page_lock);
+ } else {
+ mo_valid = false;
+ dev_warn(&gna_priv->pdev->dev, "memory object not found %llu\n",
+ buffer->memory_id);
+ }
+ buffer--;
+ } while (i--);
+
+ if (mo_valid) {
+ i = score_request->buffer_count;
+ while (i--)
+ kvfree((void *)(uintptr_t)score_request->buffer_list[i].patches_ptr);
+ kvfree(score_request->buffer_list);
+ }
+ score_request->buffer_list = NULL;
+ score_request->buffer_count = 0;
+
+ return ret;
+}
diff --git a/drivers/misc/intel/gna/gna_score.h b/drivers/misc/intel/gna/gna_score.h
new file mode 100644
index 000000000000..056cf02586f9
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_score.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2017-2021 Intel Corporation */
+
+#ifndef __GNA_SCORE_H__
+#define __GNA_SCORE_H__
+
+#include <uapi/misc/intel/gna.h>
+
+struct gna_private;
+struct gna_file_private;
+struct gna_request;
+
+int gna_validate_score_config(struct gna_compute_cfg *compute_cfg,
+ struct gna_file_private *file_priv);
+
+int gna_score(struct gna_request *score_request);
+
+#endif /* __GNA_SCORE_H__ */
--
2.28.0
The new 'misc' device is the node for applications in user space to
interact with the driver.
Signed-off-by: Maciej Kwapulinski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
---
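Note on device naming: gna_dev_init() builds the per-device name by
appending the device index to GNA_DV_NAME and uses it for both the
workqueue and the misc device. A small user-space sketch of that
formatting, with a truncation check, is shown below. The truncation check
and the name sketch_format_name() are assumptions for illustration; the
patch itself relies on snprintf() bounding the write.

```c
#include <stdio.h>

#define SKETCH_DV_NAME "intel_gna" /* same string as GNA_DV_NAME in the patch */

/*
 * Format "<name><index>" into buf. Returns 0 on success, or -1 if the
 * full name would not fit; snprintf() reports the untruncated length.
 */
static int sketch_format_name(char *buf, size_t len, int index)
{
	int n = snprintf(buf, len, "%s%d", SKETCH_DV_NAME, index);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With a buffer sized as in gna_dev_init(), indexes 0 through 255 fit, while
a seven-digit index would be reported as truncated rather than silently
shortened.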
drivers/misc/intel/gna/gna_device.c | 69 ++++++++++++++++++++++++++--
drivers/misc/intel/gna/gna_device.h | 6 +++
drivers/misc/intel/gna/gna_driver.c | 2 +
drivers/misc/intel/gna/gna_driver.h | 2 +
drivers/misc/intel/gna/gna_hw.c | 28 +++++------
drivers/misc/intel/gna/gna_ioctl.c | 36 +++++++--------
drivers/misc/intel/gna/gna_mem.c | 32 ++++++-------
drivers/misc/intel/gna/gna_request.c | 32 ++++++-------
drivers/misc/intel/gna/gna_score.c | 28 +++++------
9 files changed, 154 insertions(+), 81 deletions(-)
diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
index d8e1d4b8a9eb..3f7f4c07d1a1 100644
--- a/drivers/misc/intel/gna/gna_device.c
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -91,9 +91,57 @@ const struct pci_device_id gna_pci_ids[] = {
MODULE_DEVICE_TABLE(pci, gna_pci_ids);
+static int gna_open(struct inode *inode, struct file *f)
+{
+ return -EPERM;
+}
+
+static const struct file_operations gna_file_ops = {
+ .owner = THIS_MODULE,
+ .open = gna_open,
+};
+
+static void gna_dev_release(struct gna_private *gna_priv)
+{
+ misc_deregister(&gna_priv->misc);
+ kfree(gna_priv->misc.name);
+ gna_priv->misc.name = NULL;
+}
+
+static int gna_dev_create(struct gna_private *gna_priv, char *gna_name)
+{
+ struct pci_dev *pcidev;
+ int ret;
+
+ pcidev = gna_priv->pdev;
+
+ gna_priv->misc.minor = MISC_DYNAMIC_MINOR;
+ gna_priv->misc.name = kasprintf(GFP_KERNEL, "%s", gna_name);
+ if (!gna_priv->misc.name)
+ return -ENOMEM;
+ gna_priv->misc.fops = &gna_file_ops;
+ gna_priv->misc.parent = &pcidev->dev;
+ gna_priv->misc.mode = 0666;
+
+ dev_dbg(&pcidev->dev, "registering device: %s\n", gna_priv->misc.name);
+
+ ret = misc_register(&gna_priv->misc);
+ if (ret) {
+ dev_err(&pcidev->dev, "misc_register %s failed: %d\n", gna_name, ret);
+ /* nothing was registered, so only the name needs freeing */
+ kfree(gna_priv->misc.name);
+ gna_priv->misc.name = NULL;
+ }
+
+ return ret;
+}
+
+
static int gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
const struct pci_device_id *pci_id)
{
+ // strlen(GNA_DV_NAME) + max minor number.
+ char gna_name[sizeof(GNA_DV_NAME) + sizeof("255") + 1];
u32 bld_reg;
int ret;
@@ -104,6 +152,8 @@ static int gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
gna_priv->info = *(struct gna_drv_info *)pci_id->driver_data;
gna_priv->drv_priv = &gna_drv_priv;
+ gna_priv->index = atomic_inc_return(&gna_priv->drv_priv->dev_last_idx);
+
bld_reg = gna_reg_read(gna_priv->bar0_base, GNA_MMIO_IBUFFS);
gna_priv->hw_info.in_buf_s = bld_reg & GENMASK(7, 0);
@@ -134,15 +184,26 @@ static int gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
init_waitqueue_head(&gna_priv->dev_busy_waitq);
- gna_priv->request_wq = create_singlethread_workqueue(GNA_DV_NAME);
+ snprintf(gna_name, sizeof(gna_name), "%s%d", GNA_DV_NAME, gna_priv->index);
+
+ gna_priv->request_wq = create_singlethread_workqueue(gna_name);
if (!gna_priv->request_wq) {
- dev_err(&pcidev->dev, "could not create %s workqueue\n", GNA_DV_NAME);
+ dev_err(&pcidev->dev, "could not create %s workqueue\n", gna_name);
ret = -EFAULT;
goto err_pci_drvdata_unset;
}
+ ret = gna_dev_create(gna_priv, gna_name);
+ if (ret) {
+ dev_err(&pcidev->dev, "could not create %s device\n", gna_name);
+ goto err_del_wq;
+ }
+
return 0;
+err_del_wq:
+ destroy_workqueue(gna_priv->request_wq);
+
err_pci_drvdata_unset:
pci_set_drvdata(pcidev, NULL);
@@ -151,6 +212,8 @@ static int gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
static void gna_dev_deinit(struct gna_private *gna_priv)
{
+ gna_dev_release(gna_priv);
+
flush_workqueue(gna_priv->request_wq);
destroy_workqueue(gna_priv->request_wq);
@@ -300,7 +363,7 @@ int gna_getparam(struct gna_private *gna_priv, union gna_parameter *param)
param->out.value = gna_device_type_by_hwid(gna_priv->info.hwid);
break;
default:
- dev_err(&gna_priv->pdev->dev, "unknown parameter id %llu\n", param->in.id);
+ dev_err(gna_priv->misc.this_device, "unknown parameter id %llu\n", param->in.id);
return -EINVAL;
}
diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
index aa7fadcf93b1..72692f5f3582 100644
--- a/drivers/misc/intel/gna/gna_device.h
+++ b/drivers/misc/intel/gna/gna_device.h
@@ -4,6 +4,7 @@
#ifndef __GNA_DEVICE_H__
#define __GNA_DEVICE_H__
+#include <linux/miscdevice.h>
#include <linux/types.h>
#include <linux/mutex.h>
#include <linux/list.h>
@@ -44,6 +45,11 @@ struct gna_private {
/* pdev->dev */
struct device *parent;
+ int index;
+
+ /* gna misc-device */
+ struct miscdevice misc;
+
int irq;
/* hardware status set by interrupt handler */
u32 hw_status;
diff --git a/drivers/misc/intel/gna/gna_driver.c b/drivers/misc/intel/gna/gna_driver.c
index a8f69a970f7a..79b735f3e492 100644
--- a/drivers/misc/intel/gna/gna_driver.c
+++ b/drivers/misc/intel/gna/gna_driver.c
@@ -23,6 +23,8 @@ static struct pci_driver gna_driver = {
static int __init gna_drv_init(void)
{
+ atomic_set(&gna_drv_priv.dev_last_idx, -1);
+
gna_drv_priv.recovery_timeout_jiffies = msecs_to_jiffies(recovery_timeout * 1000);
return pci_register_driver(&gna_driver);
diff --git a/drivers/misc/intel/gna/gna_driver.h b/drivers/misc/intel/gna/gna_driver.h
index 69087317d668..48a443d69f61 100644
--- a/drivers/misc/intel/gna/gna_driver.h
+++ b/drivers/misc/intel/gna/gna_driver.h
@@ -5,6 +5,7 @@
#define __GNA_DRIVER_H__
#include <linux/mutex.h>
+#include <linux/types.h>
#include <linux/list.h>
#define GNA_DV_NAME "intel_gna"
@@ -13,6 +14,7 @@ struct gna_private;
struct file;
struct gna_driver_private {
+ atomic_t dev_last_idx;
int recovery_timeout_jiffies;
};
diff --git a/drivers/misc/intel/gna/gna_hw.c b/drivers/misc/intel/gna/gna_hw.c
index 7d2f4ef00136..1f3f747aa88e 100644
--- a/drivers/misc/intel/gna/gna_hw.c
+++ b/drivers/misc/intel/gna/gna_hw.c
@@ -14,13 +14,13 @@ int gna_parse_hw_status(struct gna_private *gna_priv, u32 hw_status)
int status;
if (hw_status & GNA_ERROR) {
- dev_dbg(&gna_priv->pdev->dev, "GNA completed with errors: %#x\n", hw_status);
+ dev_dbg(gna_priv->misc.this_device, "GNA completed with errors: %#x\n", hw_status);
status = -EIO;
} else if (hw_status & GNA_STS_SCORE_COMPLETED) {
status = 0;
- dev_dbg(&gna_priv->pdev->dev, "GNA completed successfully: %#x\n", hw_status);
+ dev_dbg(gna_priv->misc.this_device, "GNA completed successfully: %#x\n", hw_status);
} else {
- dev_err(&gna_priv->pdev->dev, "GNA not completed, status: %#x\n", hw_status);
+ dev_err(gna_priv->misc.this_device, "GNA not completed, status: %#x\n", hw_status);
status = -ENODATA;
}
@@ -30,22 +30,22 @@ int gna_parse_hw_status(struct gna_private *gna_priv, u32 hw_status)
void gna_print_error_status(struct gna_private *gna_priv, u32 hw_status)
{
if (hw_status & GNA_STS_PARAM_OOR)
- dev_dbg(&gna_priv->pdev->dev, "GNA error: Param Out Range Error\n");
+ dev_dbg(gna_priv->misc.this_device, "GNA error: Param Out Range Error\n");
if (hw_status & GNA_STS_VA_OOR)
- dev_dbg(&gna_priv->pdev->dev, "GNA error: VA Out of Range Error\n");
+ dev_dbg(gna_priv->misc.this_device, "GNA error: VA Out of Range Error\n");
if (hw_status & GNA_STS_PCI_MMU_ERR)
- dev_dbg(&gna_priv->pdev->dev, "GNA error: PCI MMU Error\n");
+ dev_dbg(gna_priv->misc.this_device, "GNA error: PCI MMU Error\n");
if (hw_status & GNA_STS_PCI_DMA_ERR)
- dev_dbg(&gna_priv->pdev->dev, "GNA error: PCI MMU Error\n");
+ dev_dbg(gna_priv->misc.this_device, "GNA error: PCI DMA Error\n");
if (hw_status & GNA_STS_PCI_UNEXCOMPL_ERR)
- dev_dbg(&gna_priv->pdev->dev, "GNA error: PCI Unexpected Completion Error\n");
+ dev_dbg(gna_priv->misc.this_device, "GNA error: PCI Unexpected Completion Error\n");
if (hw_status & GNA_STS_SATURATE)
- dev_dbg(&gna_priv->pdev->dev, "GNA error: Saturation Reached !\n");
+ dev_dbg(gna_priv->misc.this_device, "GNA error: Saturation Reached !\n");
}
bool gna_hw_perf_enabled(struct gna_private *gna_priv)
@@ -77,7 +77,7 @@ void gna_start_scoring(struct gna_private *gna_priv, void __iomem *addr,
gna_reg_write(addr, GNA_MMIO_CTRL, ctrl);
- dev_dbg(&gna_priv->pdev->dev, "scoring started...\n");
+ dev_dbg(gna_priv->misc.this_device, "scoring started...\n");
}
static void gna_clear_saturation(struct gna_private *gna_priv)
@@ -87,8 +87,8 @@ static void gna_clear_saturation(struct gna_private *gna_priv)
val = gna_reg_read(addr, GNA_MMIO_STS);
if (val & GNA_STS_SATURATE) {
- dev_dbg(&gna_priv->pdev->dev, "saturation reached\n");
- dev_dbg(&gna_priv->pdev->dev, "status: %#x\n", val);
+ dev_dbg(gna_priv->misc.this_device, "saturation reached\n");
+ dev_dbg(gna_priv->misc.this_device, "status: %#x\n", val);
val = val & GNA_STS_SATURATE;
gna_reg_write(addr, GNA_MMIO_STS, val);
@@ -107,7 +107,7 @@ void gna_abort_hw(struct gna_private *gna_priv)
gna_clear_saturation(gna_priv);
val = gna_reg_read(addr, GNA_MMIO_STS);
- dev_dbg(&gna_priv->pdev->dev, "status before abort: %#x\n", val);
+ dev_dbg(gna_priv->misc.this_device, "status before abort: %#x\n", val);
val = gna_reg_read(addr, GNA_MMIO_CTRL);
val |= GNA_CTRL_ABORT_CLR_ACCEL;
@@ -121,5 +121,5 @@ void gna_abort_hw(struct gna_private *gna_priv)
} while (--i);
if (i == 0)
- dev_err(&gna_priv->pdev->dev, "abort did not complete\n");
+ dev_err(gna_priv->misc.this_device, "abort did not complete\n");
}
diff --git a/drivers/misc/intel/gna/gna_ioctl.c b/drivers/misc/intel/gna/gna_ioctl.c
index 79ce3aeb27cf..03d85850dcf1 100644
--- a/drivers/misc/intel/gna/gna_ioctl.c
+++ b/drivers/misc/intel/gna/gna_ioctl.c
@@ -22,25 +22,25 @@ static int gna_ioctl_score(struct gna_file_private *file_priv, void __user *argp
gna_priv = file_priv->gna_priv;
if (copy_from_user(&score_args, argptr, sizeof(score_args))) {
- dev_err(&gna_priv->pdev->dev, "could not copy score ioctl config from user\n");
+ dev_err(gna_priv->misc.this_device, "could not copy score ioctl config from user\n");
return -EFAULT;
}
ret = gna_validate_score_config(&score_args.in.config, file_priv);
if (ret) {
- dev_err(&gna_priv->pdev->dev, "request not valid\n");
+ dev_err(gna_priv->misc.this_device, "request not valid\n");
return ret;
}
ret = gna_enqueue_request(&score_args.in.config, file_priv, &request_id);
if (ret) {
- dev_err(&gna_priv->pdev->dev, "could not enqueue score request %d\n", ret);
+ dev_err(gna_priv->misc.this_device, "could not enqueue score request %d\n", ret);
return ret;
}
score_args.out.request_id = request_id;
if (copy_to_user(argptr, &score_args, sizeof(score_args))) {
- dev_err(&gna_priv->pdev->dev, "could not copy score ioctl status to user\n");
+ dev_err(gna_priv->misc.this_device, "could not copy score ioctl status to user\n");
return -EFAULT;
}
@@ -63,7 +63,7 @@ static int gna_ioctl_wait(struct file *f, void __user *argptr)
ret = 0;
if (copy_from_user(&wait_data, argptr, sizeof(wait_data))) {
- dev_err(&gna_priv->pdev->dev, "could not copy wait ioctl data from user\n");
+ dev_err(gna_priv->misc.this_device, "could not copy wait ioctl data from user\n");
return -EFAULT;
}
@@ -73,7 +73,7 @@ static int gna_ioctl_wait(struct file *f, void __user *argptr)
score_request = gna_find_request_by_id(request_id, gna_priv);
if (!score_request) {
- dev_err(&gna_priv->pdev->dev, "could not find request with id: %llu\n", request_id);
+ dev_err(gna_priv->misc.this_device, "could not find request with id: %llu\n", request_id);
return -EINVAL;
}
@@ -82,17 +82,17 @@ static int gna_ioctl_wait(struct file *f, void __user *argptr)
return -EINVAL;
}
- dev_dbg(&gna_priv->pdev->dev, "waiting for request %llu for timeout %u\n", request_id, timeout);
+ dev_dbg(gna_priv->misc.this_device, "waiting for request %llu for timeout %u\n", request_id, timeout);
ret = wait_event_interruptible_timeout(score_request->waitq, score_request->state == DONE,
msecs_to_jiffies(timeout));
if (ret == 0 || ret == -ERESTARTSYS) {
- dev_err(&gna_priv->pdev->dev, "request timed out, id: %llu\n", request_id);
+ dev_err(gna_priv->misc.this_device, "request timed out, id: %llu\n", request_id);
kref_put(&score_request->refcount, gna_request_release);
return -EBUSY;
}
- dev_dbg(&gna_priv->pdev->dev, "request wait completed with %d req id %llu\n", ret, request_id);
+ dev_dbg(gna_priv->misc.this_device, "request wait completed with %d req id %llu\n", ret, request_id);
wait_data.out.hw_perf = score_request->hw_perf;
wait_data.out.drv_perf = score_request->drv_perf;
@@ -100,14 +100,14 @@ static int gna_ioctl_wait(struct file *f, void __user *argptr)
ret = score_request->status;
- dev_dbg(&gna_priv->pdev->dev, "request status %d, hw status: %#x\n",
+ dev_dbg(gna_priv->misc.this_device, "request status %d, hw status: %#x\n",
score_request->status, score_request->hw_status);
kref_put(&score_request->refcount, gna_request_release);
gna_delete_request_by_id(request_id, gna_priv);
if (copy_to_user(argptr, &wait_data, sizeof(wait_data))) {
- dev_err(&gna_priv->pdev->dev, "could not copy wait ioctl status to user\n");
+ dev_err(gna_priv->misc.this_device, "could not copy wait ioctl status to user\n");
ret = -EFAULT;
}
@@ -123,7 +123,7 @@ static int gna_ioctl_map(struct gna_file_private *file_priv, void __user *argptr
gna_priv = file_priv->gna_priv;
if (copy_from_user(&gna_mem, argptr, sizeof(gna_mem))) {
- dev_err(&gna_priv->pdev->dev, "could not copy userptr ioctl data from user\n");
+ dev_err(gna_priv->misc.this_device, "could not copy userptr ioctl data from user\n");
return -EFAULT;
}
@@ -132,7 +132,7 @@ static int gna_ioctl_map(struct gna_file_private *file_priv, void __user *argptr
return ret;
if (copy_to_user(argptr, &gna_mem, sizeof(gna_mem))) {
- dev_err(&gna_priv->pdev->dev, "could not copy userptr ioctl status to user\n");
+ dev_err(gna_priv->misc.this_device, "could not copy userptr ioctl status to user\n");
return -EFAULT;
}
@@ -154,13 +154,13 @@ static int gna_ioctl_free(struct gna_file_private *file_priv, unsigned long arg)
mutex_unlock(&gna_priv->memidr_lock);
if (!mo) {
- dev_warn(&gna_priv->pdev->dev, "memory object not found\n");
+ dev_warn(gna_priv->misc.this_device, "memory object not found\n");
return -EINVAL;
}
queue_work(gna_priv->request_wq, &mo->work);
if (wait_event_interruptible(mo->waitq, true)) {
- dev_dbg(&gna_priv->pdev->dev, "wait interrupted\n");
+ dev_dbg(gna_priv->misc.this_device, "wait interrupted\n");
return -ETIME;
}
@@ -184,7 +184,7 @@ static int gna_ioctl_getparam(struct gna_private *gna_priv, void __user *argptr)
int ret;
if (copy_from_user(&param, argptr, sizeof(param))) {
- dev_err(&gna_priv->pdev->dev, "could not copy getparam ioctl data from user\n");
+ dev_err(gna_priv->misc.this_device, "could not copy getparam ioctl data from user\n");
return -EFAULT;
}
@@ -193,7 +193,7 @@ static int gna_ioctl_getparam(struct gna_private *gna_priv, void __user *argptr)
return ret;
if (copy_to_user(argptr, &param, sizeof(param))) {
- dev_err(&gna_priv->pdev->dev, "could not copy getparam ioctl status to user\n");
+ dev_err(gna_priv->misc.this_device, "could not copy getparam ioctl status to user\n");
return -EFAULT;
}
@@ -240,7 +240,7 @@ long gna_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
break;
default:
- dev_warn(&gna_priv->pdev->dev, "wrong ioctl %#x\n", cmd);
+ dev_warn(gna_priv->misc.this_device, "wrong ioctl %#x\n", cmd);
ret = -EINVAL;
break;
}
diff --git a/drivers/misc/intel/gna/gna_mem.c b/drivers/misc/intel/gna/gna_mem.c
index ce1691d68edb..d81dc8f7f2df 100644
--- a/drivers/misc/intel/gna/gna_mem.c
+++ b/drivers/misc/intel/gna/gna_mem.c
@@ -159,7 +159,7 @@ void gna_mmu_add(struct gna_private *gna_priv, struct gna_memory_object *mo)
j = mmu->filled_pages;
sgl = mo->sgt->sgl;
if (!sgl) {
- dev_warn(&gna_priv->pdev->dev, "empty scatter list in memory object\n");
+ dev_warn(gna_priv->misc.this_device, "empty scatter list in memory object\n");
goto warn_empty_sgl;
}
sg_page = sg_dma_address(sgl);
@@ -201,7 +201,7 @@ void gna_mmu_add(struct gna_private *gna_priv, struct gna_memory_object *mo)
mmu->hwdesc->mmu.vamaxaddr =
(mmu->filled_pts * PAGE_SIZE * GNA_PGDIR_ENTRIES) +
(mmu->filled_pages * PAGE_SIZE) - 1;
- dev_dbg(&gna_priv->pdev->dev, "vamaxaddr set to %u\n", mmu->hwdesc->mmu.vamaxaddr);
+ dev_dbg(gna_priv->misc.this_device, "vamaxaddr set to %u\n", mmu->hwdesc->mmu.vamaxaddr);
warn_empty_sgl:
mutex_unlock(&gna_priv->mmu_lock);
@@ -255,20 +255,20 @@ static int gna_get_pages(struct gna_memory_object *mo, u64 offset, u64 size)
gna_priv = mo->gna_priv;
if (mo->pages) {
- dev_warn(&gna_priv->pdev->dev, "pages are already pinned\n");
+ dev_warn(gna_priv->misc.this_device, "pages are already pinned\n");
return -EFAULT;
}
/* using vmalloc because num_pages can be large */
skip_size = round_down(offset, PAGE_SIZE);
effective_address = mo->user_address + skip_size;
- dev_dbg(&gna_priv->pdev->dev, "user address %llx\n", mo->user_address);
- dev_dbg(&gna_priv->pdev->dev, "effective user address %llx\n", effective_address);
+ dev_dbg(gna_priv->misc.this_device, "user address %llx\n", mo->user_address);
+ dev_dbg(gna_priv->misc.this_device, "effective user address %llx\n", effective_address);
effective_size = gna_buffer_get_size(offset, size);
num_pages = effective_size >> PAGE_SHIFT;
- dev_dbg(&gna_priv->pdev->dev, "allocating %d pages\n", num_pages);
+ dev_dbg(gna_priv->misc.this_device, "allocating %d pages\n", num_pages);
pages = kvmalloc_array(num_pages, sizeof(struct page *), GFP_KERNEL);
if (!pages) {
ret = -ENOMEM;
@@ -289,12 +289,12 @@ static int gna_get_pages(struct gna_memory_object *mo, u64 offset, u64 size)
if (num_pinned <= 0) {
ret = num_pinned;
- dev_err(&gna_priv->pdev->dev, "function get_user_pages_remote() failed\n");
+ dev_err(gna_priv->misc.this_device, "function get_user_pages_remote() failed\n");
goto err_free_pages;
}
if (num_pinned < num_pages) {
ret = -EFAULT;
- dev_err(&gna_priv->pdev->dev,
+ dev_err(gna_priv->misc.this_device,
"get_user_pages_remote() pinned fewer pages number than requested\n");
goto err_free_pages;
}
@@ -307,19 +307,19 @@ static int gna_get_pages(struct gna_memory_object *mo, u64 offset, u64 size)
ret = sg_alloc_table_from_pages(sgt, pages, num_pinned, 0, mo->memory_size, GFP_KERNEL);
if (ret) {
- dev_err(&gna_priv->pdev->dev, "could not alloc scatter list\n");
+ dev_err(gna_priv->misc.this_device, "could not alloc scatter list\n");
goto err_free_sgt;
}
if (IS_ERR(sgt->sgl)) {
- dev_err(&gna_priv->pdev->dev, "sgl allocation failed\n");
+ dev_err(gna_priv->misc.this_device, "sgl allocation failed\n");
ret = PTR_ERR(sgt->sgl);
goto err_free_sgt;
}
ents = pci_map_sg(gna_priv->pdev, sgt->sgl, sgt->nents, PCI_DMA_BIDIRECTIONAL);
if (ents <= 0) {
- dev_err(&gna_priv->pdev->dev, "could not map scatter gather list\n");
+ dev_err(gna_priv->misc.this_device, "could not map scatter gather list\n");
ret = -EIO;
goto err_free_sgl;
}
@@ -358,7 +358,7 @@ static void gna_put_pages(struct gna_memory_object *mo)
gna_priv = mo->gna_priv;
if (!mo->pages) {
- dev_warn(&gna_priv->pdev->dev, "memory object has no pages %llu\n", mo->memory_id);
+ dev_warn(gna_priv->misc.this_device, "memory object has no pages %llu\n", mo->memory_id);
return;
}
@@ -417,17 +417,17 @@ int gna_map_memory(struct gna_file_private *file_priv, union gna_memory_map *gna
gna_priv = file_priv->gna_priv;
if (gna_mem->in.address & ~PAGE_MASK) {
- dev_err(&gna_priv->pdev->dev, "user pointer not page aligned\n");
+ dev_err(gna_priv->misc.this_device, "user pointer not page aligned\n");
return -EINVAL;
}
if (!gna_mem->in.size) {
- dev_err(&gna_priv->pdev->dev, "invalid user memory size\n");
+ dev_err(gna_priv->misc.this_device, "invalid user memory size\n");
return -EINVAL;
}
if (!access_ok(u64_to_user_ptr(gna_mem->in.address), gna_mem->in.size)) {
- dev_err(&gna_priv->pdev->dev, "invalid user pointer\n");
+ dev_err(gna_priv->misc.this_device, "invalid user pointer\n");
return -EINVAL;
}
@@ -452,7 +452,7 @@ int gna_map_memory(struct gna_file_private *file_priv, union gna_memory_map *gna
mutex_unlock(&gna_priv->memidr_lock);
if (memory_id < 0) {
- dev_err(&gna_priv->pdev->dev, "idr allocation for memory failed\n");
+ dev_err(gna_priv->misc.this_device, "idr allocation for memory failed\n");
ret = -EFAULT;
goto err_free_mo;
}
diff --git a/drivers/misc/intel/gna/gna_request.c b/drivers/misc/intel/gna/gna_request.c
index ba9bac358270..f913a54bb1c3 100644
--- a/drivers/misc/intel/gna/gna_request.c
+++ b/drivers/misc/intel/gna/gna_request.c
@@ -39,7 +39,7 @@ static void gna_request_update_status(struct gna_request *score_request)
score_request->hw_perf.total = total_cycles;
score_request->hw_perf.stall = stall_cycles;
} else
- dev_warn(&gna_priv->pdev->dev, "GNA statistics missing\n");
+ dev_warn(gna_priv->misc.this_device, "GNA statistics missing\n");
}
if (unlikely(hw_status & GNA_ERROR))
gna_print_error_status(gna_priv, hw_status);
@@ -57,7 +57,7 @@ static void gna_request_process(struct work_struct *work)
score_request = container_of(work, struct gna_request, work);
gna_priv = score_request->gna_priv;
- dev_dbg(&gna_priv->pdev->dev, "processing request %llu\n", score_request->request_id);
+ dev_dbg(gna_priv->misc.this_device, "processing request %llu\n", score_request->request_id);
score_request->state = ACTIVE;
@@ -85,7 +85,7 @@ static void gna_request_process(struct work_struct *work)
!gna_priv->dev_busy, hw_timeout);
if (!hw_timeout)
- dev_warn(&gna_priv->pdev->dev, "hardware timeout occurred\n");
+ dev_warn(gna_priv->misc.this_device, "hardware timeout occurred\n");
gna_priv->hw_status = gna_reg_read(gna_priv->bar0_base, GNA_MMIO_STS);
@@ -102,7 +102,7 @@ static void gna_request_process(struct work_struct *work)
mo->ops->put_pages(mo);
mutex_unlock(&mo->page_lock);
} else {
- dev_warn(&gna_priv->pdev->dev, "mo not found %llu\n", buffer->memory_id);
+ dev_warn(gna_priv->misc.this_device, "mo not found %llu\n", buffer->memory_id);
}
}
@@ -115,7 +115,7 @@ static void gna_request_process(struct work_struct *work)
end:
score_request->drv_perf.completion = ktime_get_ns();
- dev_dbg(&gna_priv->pdev->dev, "request %llu done, waking processes\n",
+ dev_dbg(gna_priv->misc.this_device, "request %llu done, waking processes\n",
score_request->request_id);
score_request->state = DONE;
wake_up_interruptible_all(&score_request->waitq);
@@ -136,7 +136,7 @@ static struct gna_request *gna_request_create(struct gna_file_private *file_priv
return NULL;
kref_init(&score_request->refcount);
- dev_dbg(&gna_priv->pdev->dev, "layer_base %d layer_count %d\n",
+ dev_dbg(gna_priv->misc.this_device, "layer_base %d layer_count %d\n",
compute_cfg->layer_base, compute_cfg->layer_count);
score_request->request_id = atomic_inc_return(&gna_priv->request_count);
@@ -166,12 +166,12 @@ static int gna_validate_patches(struct gna_private *gna_priv, __u64 buffer_size,
for (idx = 0; idx < count; ++idx) {
if (patches[idx].size > 8) {
- dev_err(&gna_priv->pdev->dev, "invalid patch size: %llu\n", patches[idx].size);
+ dev_err(gna_priv->misc.this_device, "invalid patch size: %llu\n", patches[idx].size);
return -EINVAL;
}
if (!gna_validate_ranges(buffer_size, patches[idx].offset, patches[idx].size)) {
- dev_err(&gna_priv->pdev->dev,
+ dev_err(gna_priv->misc.this_device,
"patch out of bounds. buffer size: %llu, patch offset/size:%llu/%llu\n",
buffer_size, patches[idx].offset, patches[idx].size);
return -EINVAL;
@@ -204,14 +204,14 @@ static int gna_buffer_fill_patches(struct gna_buffer *buffer, struct gna_private
if (copy_from_user(patches, u64_to_user_ptr(patches_user),
sizeof(struct gna_memory_patch) * patch_count)) {
- dev_err(&gna_priv->pdev->dev, "copy %llu patches from user failed\n", patch_count);
+ dev_err(gna_priv->misc.this_device, "copy %llu patches from user failed\n", patch_count);
ret = -EFAULT;
goto err_fill_patches;
}
ret = gna_validate_patches(gna_priv, buffer->size, patches, patch_count);
if (ret) {
- dev_err(&gna_priv->pdev->dev, "patches failed validation\n");
+ dev_err(gna_priv->misc.this_device, "patches failed validation\n");
goto err_fill_patches;
}
@@ -246,7 +246,7 @@ static int gna_request_fill_buffers(struct gna_request *score_request,
if (copy_from_user(buffer_list, u64_to_user_ptr(compute_cfg->buffers_ptr),
sizeof(*buffer_list) * buffer_count)) {
- dev_err(&gna_priv->pdev->dev, "copying %llu buffers failed\n", buffer_count);
+ dev_err(gna_priv->misc.this_device, "copying %llu buffers failed\n", buffer_count);
ret = -EFAULT;
goto err_free_buffers;
}
@@ -257,7 +257,7 @@ static int gna_request_fill_buffers(struct gna_request *score_request,
for (j = 0; j < i; j++) {
if (buffer_list[j].memory_id == memory_id) {
- dev_err(&gna_priv->pdev->dev,
+ dev_err(gna_priv->misc.this_device,
"doubled memory id in score config. id:%llu\n", memory_id);
ret = -EINVAL;
goto err_zero_patch_ptr;
@@ -267,7 +267,7 @@ static int gna_request_fill_buffers(struct gna_request *score_request,
buffers_total_size +=
gna_buffer_get_size(buffer->offset, buffer->size);
if (buffers_total_size > gna_priv->info.max_hw_mem) {
- dev_err(&gna_priv->pdev->dev, "buffers' total size too big\n");
+ dev_err(gna_priv->misc.this_device, "buffers' total size too big\n");
ret = -EINVAL;
goto err_zero_patch_ptr;
}
@@ -276,14 +276,14 @@ static int gna_request_fill_buffers(struct gna_request *score_request,
mo = idr_find(&gna_priv->memory_idr, memory_id);
if (!mo) {
mutex_unlock(&gna_priv->memidr_lock);
- dev_err(&gna_priv->pdev->dev, "memory object %llu not found\n", memory_id);
+ dev_err(gna_priv->misc.this_device, "memory object %llu not found\n", memory_id);
ret = -EINVAL;
goto err_zero_patch_ptr;
}
mutex_unlock(&gna_priv->memidr_lock);
if (mo->fd != score_request->fd) {
- dev_err(&gna_priv->pdev->dev,
+ dev_err(gna_priv->misc.this_device,
"memory object from another file. %p != %p\n",
mo->fd, score_request->fd);
ret = -EINVAL;
@@ -291,7 +291,7 @@ static int gna_request_fill_buffers(struct gna_request *score_request,
}
if (!gna_validate_ranges(mo->memory_size, buffer->offset, buffer->size)) {
- dev_err(&gna_priv->pdev->dev,
+ dev_err(gna_priv->misc.this_device,
"buffer out of bounds. mo size: %llu, buffer offset/size:%llu/%llu\n",
mo->memory_size, buffer->offset, buffer->size);
ret = -EINVAL;
diff --git a/drivers/misc/intel/gna/gna_score.c b/drivers/misc/intel/gna/gna_score.c
index 794039d2da43..70ad867e215e 100644
--- a/drivers/misc/intel/gna/gna_score.c
+++ b/drivers/misc/intel/gna/gna_score.c
@@ -30,23 +30,23 @@ int gna_validate_score_config(struct gna_compute_cfg *compute_cfg,
gna_priv = file_priv->gna_priv;
if (compute_cfg->gna_mode > GNA_MODE_XNN) {
- dev_err(&gna_priv->pdev->dev, "invalid mode\n");
+ dev_err(gna_priv->misc.this_device, "invalid mode\n");
return -EINVAL;
}
if (compute_cfg->layer_count > gna_priv->info.max_layer_count) {
- dev_err(&gna_priv->pdev->dev, "max layer count exceeded\n");
+ dev_err(gna_priv->misc.this_device, "max layer count exceeded\n");
return -EINVAL;
}
if (compute_cfg->buffer_count == 0) {
- dev_err(&gna_priv->pdev->dev, "no buffers\n");
+ dev_err(gna_priv->misc.this_device, "no buffers\n");
return -EINVAL;
}
buffers_size = sizeof(struct gna_buffer) * compute_cfg->buffer_count;
if (!access_ok(u64_to_user_ptr(compute_cfg->buffers_ptr), buffers_size)) {
- dev_err(&gna_priv->pdev->dev, "invalid buffers pointer\n");
+ dev_err(gna_priv->misc.this_device, "invalid buffers pointer\n");
return -EINVAL;
}
@@ -63,7 +63,7 @@ static int gna_do_patch_memory(struct gna_private *gna_priv, struct gna_memory_o
value = patch->value;
size = patch->size;
dest = (u8 *)vaddr + patch->offset;
- dev_dbg(&gna_priv->pdev->dev, "patch offset: %llu, size: %zu, value: %llu\n",
+ dev_dbg(gna_priv->misc.this_device, "patch offset: %llu, size: %zu, value: %llu\n",
patch->offset, size, value);
switch (size) {
@@ -97,7 +97,7 @@ static int gna_mem_patch_memory(struct gna_private *gna_priv, struct gna_buffer
int ret = 0;
u32 i;
- dev_dbg(&gna_priv->pdev->dev, "memory_id: %llu, patch_count, %llu\n",
+ dev_dbg(gna_priv->misc.this_device, "memory_id: %llu, patch_count: %llu\n",
buffer->memory_id, buffer->patch_count);
mutex_lock(&gna_priv->memidr_lock);
@@ -179,7 +179,7 @@ static int gna_copy_gmm_config(struct gna_private *gna_priv,
buffer = gna_find_buffer(buffer_list, buffer_count, mmu_offset, &memory_offset);
if (!buffer) {
- dev_dbg(&gna_priv->pdev->dev, "buffer not found\n");
+ dev_dbg(gna_priv->misc.this_device, "buffer not found\n");
return -EINVAL;
}
@@ -187,13 +187,13 @@ static int gna_copy_gmm_config(struct gna_private *gna_priv,
mo = idr_find(&gna_priv->memory_idr, buffer->memory_id);
mutex_unlock(&gna_priv->memidr_lock);
if (!mo) {
- dev_dbg(&gna_priv->pdev->dev, "memory object not found\n");
+ dev_dbg(gna_priv->misc.this_device, "memory object not found\n");
return -EFAULT;
}
vaddr = vm_map_ram(mo->pages, mo->num_pinned, 0);
if (!vaddr) {
- dev_dbg(&gna_priv->pdev->dev, "mapping failed\n");
+ dev_dbg(gna_priv->misc.this_device, "mapping failed\n");
return -EFAULT;
}
@@ -230,9 +230,9 @@ int gna_score(struct gna_request *score_request)
buffer = score_request->buffer_list;
buffer_count = score_request->buffer_count;
- dev_dbg(&gna_priv->pdev->dev, "buffer count: %llu\n", buffer_count);
+ dev_dbg(gna_priv->misc.this_device, "buffer count: %llu\n", buffer_count);
for (i = 0; i < buffer_count; i++, buffer++) {
- dev_dbg(&gna_priv->pdev->dev, "patch count: %llu\n", buffer->patch_count);
+ dev_dbg(gna_priv->misc.this_device, "patch count: %llu\n", buffer->patch_count);
ret = gna_mem_patch_memory(gna_priv, buffer);
if (ret)
goto err_put_pages;
@@ -240,13 +240,13 @@ int gna_score(struct gna_request *score_request)
switch (compute_cfg->gna_mode) {
case GNA_MODE_XNN:
- dev_dbg(&gna_priv->pdev->dev, "xNN mode, labase: %d, lacount: %d\n",
+ dev_dbg(gna_priv->misc.this_device, "xNN mode, labase: %d, lacount: %d\n",
compute_cfg->layer_base, compute_cfg->layer_count);
xnn_config->labase = compute_cfg->layer_base;
xnn_config->lacount = compute_cfg->layer_count;
break;
case GNA_MODE_GMM:
- dev_dbg(&gna_priv->pdev->dev, "GMM mode, offset: %d\n", compute_cfg->layer_base);
+ dev_dbg(gna_priv->misc.this_device, "GMM mode, offset: %d\n", compute_cfg->layer_base);
ret = gna_copy_gmm_config(gna_priv, score_request->buffer_list,
buffer_count, compute_cfg->layer_base);
if (ret)
@@ -279,7 +279,7 @@ int gna_score(struct gna_request *score_request)
mutex_unlock(&mo->page_lock);
} else {
mo_valid = false;
- dev_warn(&gna_priv->pdev->dev, "memory object not found %llu\n",
+ dev_warn(gna_priv->misc.this_device, "memory object not found %llu\n",
buffer->memory_id);
}
buffer--;
--
2.28.0
From: Tomasz Jankowski <[email protected]>
Add definitions and utilities to interact with the hardware
device.
Signed-off-by: Tomasz Jankowski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
Co-developed-by: Jianxun Zhang <[email protected]>
Signed-off-by: Jianxun Zhang <[email protected]>
Co-developed-by: Maciej Kwapulinski <[email protected]>
Signed-off-by: Maciej Kwapulinski <[email protected]>
---
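Note for reviewers: the control-register update in gna_start_scoring()
follows a clear-then-set pattern for each multi-bit field. The user-space
sketch below illustrates that pattern only. The bit positions are copied
from gna_hw.h in this patch; every sketch_* and SKETCH_* name is a
hypothetical stand-in for the kernel's BIT(), GENMASK(), FIELD_PREP(),
FIELD_GET() and FIELD_MAX() helpers, and none of it is driver code.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's BIT() and GENMASK() (assumption). */
#define SKETCH_BIT(n)        (UINT32_C(1) << (n))
#define SKETCH_GENMASK(h, l) ((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/* Field layout as defined in gna_hw.h of this patch. */
#define SKETCH_CTRL_START_ACCEL    SKETCH_BIT(0)
#define SKETCH_CTRL_ACTIVE_LIST_EN SKETCH_BIT(1)
#define SKETCH_CTRL_OP_MODE        SKETCH_GENMASK(6, 5)
#define SKETCH_CTRL_COMP_INT_EN    SKETCH_BIT(8)
#define SKETCH_CTRL_ERR_INT_EN     SKETCH_BIT(10)
#define SKETCH_CTRL_COMP_STATS_EN  SKETCH_GENMASK(15, 12)

/* Position of the lowest set bit in a non-zero mask. */
static inline unsigned int sketch_shift(uint32_t mask)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Largest value that fits in the field described by mask. */
static inline uint32_t sketch_field_max(uint32_t mask)
{
	return mask >> sketch_shift(mask);
}

/* Extract the field described by mask from reg. */
static inline uint32_t sketch_field_get(uint32_t mask, uint32_t reg)
{
	return (reg & mask) >> sketch_shift(mask);
}

/* Clear the field, then insert val truncated to the field width. */
static inline uint32_t sketch_set_field(uint32_t reg, uint32_t mask, uint32_t val)
{
	reg &= ~mask;
	reg |= ((val & sketch_field_max(mask)) << sketch_shift(mask)) & mask;
	return reg;
}

/* Same order of operations as gna_start_scoring(), minus the MMIO access. */
static inline uint32_t sketch_start_ctrl(uint32_t ctrl, uint32_t gna_mode,
					 uint32_t hw_perf_encoding,
					 uint32_t active_list_on)
{
	ctrl |= SKETCH_CTRL_START_ACCEL | SKETCH_CTRL_COMP_INT_EN |
		SKETCH_CTRL_ERR_INT_EN;
	ctrl = sketch_set_field(ctrl, SKETCH_CTRL_COMP_STATS_EN, hw_perf_encoding);
	ctrl = sketch_set_field(ctrl, SKETCH_CTRL_ACTIVE_LIST_EN, active_list_on);
	ctrl = sketch_set_field(ctrl, SKETCH_CTRL_OP_MODE, gna_mode);
	return ctrl;
}
```

Masking the value with the field maximum before shifting keeps an
out-of-range user value from spilling into neighbouring control bits.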
drivers/misc/intel/gna/Kbuild | 2 +-
drivers/misc/intel/gna/gna_device.h | 4 +
drivers/misc/intel/gna/gna_hw.c | 125 ++++++++++++++++++++++++++++
drivers/misc/intel/gna/gna_hw.h | 62 ++++++++++++++
4 files changed, 192 insertions(+), 1 deletion(-)
create mode 100644 drivers/misc/intel/gna/gna_hw.c
create mode 100644 drivers/misc/intel/gna/gna_hw.h
diff --git a/drivers/misc/intel/gna/Kbuild b/drivers/misc/intel/gna/Kbuild
index 5d3becc71683..0cf083bb211a 100644
--- a/drivers/misc/intel/gna/Kbuild
+++ b/drivers/misc/intel/gna/Kbuild
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
-intel_gna-y := gna_device.o gna_driver.o
+intel_gna-y := gna_device.o gna_driver.o gna_hw.o
obj-$(CONFIG_INTEL_GNA) += intel_gna.o
diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
index d0b47f75f47f..39dc03d53feb 100644
--- a/drivers/misc/intel/gna/gna_device.h
+++ b/drivers/misc/intel/gna/gna_device.h
@@ -6,6 +6,8 @@
#include <linux/types.h>
+#include "gna_hw.h"
+
struct gna_driver_private;
struct pci_device_id;
struct pci_dev;
@@ -17,6 +19,8 @@ struct gna_drv_info {
u32 num_page_entries;
u32 max_layer_count;
u64 max_hw_mem;
+
+ struct gna_desc_info desc_info;
};
struct gna_private {
diff --git a/drivers/misc/intel/gna/gna_hw.c b/drivers/misc/intel/gna/gna_hw.c
new file mode 100644
index 000000000000..7d2f4ef00136
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_hw.c
@@ -0,0 +1,125 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2017-2021 Intel Corporation
+
+#include <linux/pci.h>
+
+#include <uapi/misc/intel/gna.h>
+
+#include "gna_device.h"
+#include "gna_driver.h"
+#include "gna_hw.h"
+
+int gna_parse_hw_status(struct gna_private *gna_priv, u32 hw_status)
+{
+ int status;
+
+ if (hw_status & GNA_ERROR) {
+ dev_dbg(&gna_priv->pdev->dev, "GNA completed with errors: %#x\n", hw_status);
+ status = -EIO;
+ } else if (hw_status & GNA_STS_SCORE_COMPLETED) {
+ status = 0;
+ dev_dbg(&gna_priv->pdev->dev, "GNA completed successfully: %#x\n", hw_status);
+ } else {
+ dev_err(&gna_priv->pdev->dev, "GNA not completed, status: %#x\n", hw_status);
+ status = -ENODATA;
+ }
+
+ return status;
+}
+
+void gna_print_error_status(struct gna_private *gna_priv, u32 hw_status)
+{
+ if (hw_status & GNA_STS_PARAM_OOR)
+ dev_dbg(&gna_priv->pdev->dev, "GNA error: Param Out Range Error\n");
+
+ if (hw_status & GNA_STS_VA_OOR)
+ dev_dbg(&gna_priv->pdev->dev, "GNA error: VA Out of Range Error\n");
+
+ if (hw_status & GNA_STS_PCI_MMU_ERR)
+ dev_dbg(&gna_priv->pdev->dev, "GNA error: PCI MMU Error\n");
+
+ if (hw_status & GNA_STS_PCI_DMA_ERR)
+ dev_dbg(&gna_priv->pdev->dev, "GNA error: PCI MMU Error\n");
+
+ if (hw_status & GNA_STS_PCI_UNEXCOMPL_ERR)
+ dev_dbg(&gna_priv->pdev->dev, "GNA error: PCI Unexpected Completion Error\n");
+
+ if (hw_status & GNA_STS_SATURATE)
+ dev_dbg(&gna_priv->pdev->dev, "GNA error: Saturation Reached !\n");
+}
+
+bool gna_hw_perf_enabled(struct gna_private *gna_priv)
+{
+ void __iomem *addr = gna_priv->bar0_base;
+ u32 ctrl = gna_reg_read(addr, GNA_MMIO_CTRL);
+
+ return FIELD_GET(GNA_CTRL_COMP_STATS_EN, ctrl) ? true : false;
+}
+
+void gna_start_scoring(struct gna_private *gna_priv, void __iomem *addr,
+ struct gna_compute_cfg *compute_cfg)
+{
+ u32 ctrl = gna_reg_read(addr, GNA_MMIO_CTRL);
+
+ ctrl |= GNA_CTRL_START_ACCEL | GNA_CTRL_COMP_INT_EN | GNA_CTRL_ERR_INT_EN;
+
+ ctrl &= ~GNA_CTRL_COMP_STATS_EN;
+ ctrl |= FIELD_PREP(GNA_CTRL_COMP_STATS_EN,
+ compute_cfg->hw_perf_encoding & FIELD_MAX(GNA_CTRL_COMP_STATS_EN));
+
+ ctrl &= ~GNA_CTRL_ACTIVE_LIST_EN;
+ ctrl |= FIELD_PREP(GNA_CTRL_ACTIVE_LIST_EN,
+ compute_cfg->active_list_on & FIELD_MAX(GNA_CTRL_ACTIVE_LIST_EN));
+
+ ctrl &= ~GNA_CTRL_OP_MODE;
+ ctrl |= FIELD_PREP(GNA_CTRL_OP_MODE,
+ compute_cfg->gna_mode & FIELD_MAX(GNA_CTRL_OP_MODE));
+
+ gna_reg_write(addr, GNA_MMIO_CTRL, ctrl);
+
+ dev_dbg(&gna_priv->pdev->dev, "scoring started...\n");
+}
+
+static void gna_clear_saturation(struct gna_private *gna_priv)
+{
+ void __iomem *addr = gna_priv->bar0_base;
+ u32 val;
+
+ val = gna_reg_read(addr, GNA_MMIO_STS);
+ if (val & GNA_STS_SATURATE) {
+ dev_dbg(&gna_priv->pdev->dev, "saturation reached\n");
+ dev_dbg(&gna_priv->pdev->dev, "status: %#x\n", val);
+
+ val = val & GNA_STS_SATURATE;
+ gna_reg_write(addr, GNA_MMIO_STS, val);
+ }
+}
+
+void gna_abort_hw(struct gna_private *gna_priv)
+{
+ void __iomem *addr = gna_priv->bar0_base;
+ u32 val;
+ int i;
+
+ /* saturation bit in the GNA status register needs
+ * to be explicitly cleared.
+ */
+ gna_clear_saturation(gna_priv);
+
+ val = gna_reg_read(addr, GNA_MMIO_STS);
+ dev_dbg(&gna_priv->pdev->dev, "status before abort: %#x\n", val);
+
+ val = gna_reg_read(addr, GNA_MMIO_CTRL);
+ val |= GNA_CTRL_ABORT_CLR_ACCEL;
+ gna_reg_write(addr, GNA_MMIO_CTRL, val);
+
+ i = 100;
+ do {
+ val = gna_reg_read(addr, GNA_MMIO_STS);
+ if ((val & 0x1) == 0)
+ break;
+ } while (--i);
+
+ if (i == 0)
+ dev_err(&gna_priv->pdev->dev, "abort did not complete\n");
+}
diff --git a/drivers/misc/intel/gna/gna_hw.h b/drivers/misc/intel/gna/gna_hw.h
new file mode 100644
index 000000000000..dd682f95094e
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_hw.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2017-2021 Intel Corporation */
+
+#ifndef __GNA_HW_H__
+#define __GNA_HW_H__
+
+#include <linux/bits.h>
+#include <linux/bitfield.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+
+/* GNA MMIO registers */
+#define GNA_MMIO_STS 0x80
+#define GNA_MMIO_CTRL 0x84
+#define GNA_MMIO_PTC 0x8C
+#define GNA_MMIO_PSC 0x90
+#define GNA_MMIO_DESBASE 0xB0
+#define GNA_MMIO_IBUFFS 0xB4
+
+#define GNA_PT_ENTRY_SIZE 4
+/* there are up to 1024 32-bit pointers in one page in Page Table (L1) */
+#define GNA_PT_LENGTH (PAGE_SIZE / GNA_PT_ENTRY_SIZE)
+
+#define GNA_PGDIRN_LEN 64
+#define GNA_PGDIR_ENTRIES 1024 /* 32-bit page addresses */
+#define GNA_PGDIR_INVALID 1
+
+#define GNA_CTRL_START_ACCEL BIT(0)
+#define GNA_CTRL_ACTIVE_LIST_EN BIT(1)
+#define GNA_CTRL_ABORT_CLR_ACCEL BIT(2)
+#define GNA_CTRL_OP_MODE GENMASK(6, 5)
+#define GNA_CTRL_COMP_INT_EN BIT(8)
+#define GNA_CTRL_ERR_INT_EN BIT(10)
+#define GNA_CTRL_COMP_STATS_EN GENMASK(15, 12)
+
+struct gna_mmu_info {
+ u32 vamax_size;
+ u32 rsvd_size;
+ u32 pd_size;
+};
+
+struct gna_desc_info {
+ u32 rsvd_size;
+ u32 cfg_size;
+ u32 desc_size;
+ struct gna_mmu_info mmu_info;
+};
+
+struct gna_private;
+struct gna_compute_cfg;
+
+void gna_abort_hw(struct gna_private *gna_priv);
+bool gna_hw_perf_enabled(struct gna_private *gna_priv);
+int gna_parse_hw_status(struct gna_private *gna_priv, u32 hw_status);
+void gna_print_error_status(struct gna_private *gna_priv, u32 hw_status);
+void gna_start_scoring(struct gna_private *gna_priv, void __iomem *addr,
+ struct gna_compute_cfg *compute_cfg);
+
+#define gna_reg_read(addr, offset) readl((addr) + (offset))
+#define gna_reg_write(addr, offset, value) writel((value), (addr) + (offset))
+
+#endif // __GNA_HW_H__
--
2.28.0
From: Tomasz Jankowski <[email protected]>
The scoring work submitted to the GNA driver is implemented as a
list of requests that will be processed by the hardware.
Signed-off-by: Tomasz Jankowski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
Co-developed-by: Anisha Dattatraya Kulkarni <[email protected]>
Signed-off-by: Anisha Dattatraya Kulkarni <[email protected]>
Co-developed-by: Jianxun Zhang <[email protected]>
Signed-off-by: Jianxun Zhang <[email protected]>
Co-developed-by: Maciej Kwapulinski <[email protected]>
Signed-off-by: Maciej Kwapulinski <[email protected]>
---
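Note for reviewers: the request bookkeeping described above can be
pictured with the simplified user-space sketch below. It is only an
illustration under stated assumptions: all sketch_* names are hypothetical,
the SKETCH_NEW state is assumed (the diffs in this series only show ACTIVE
and DONE), and the locking, reference counting and workqueue handling the
driver performs (reqlist_lock, kref, request_wq) are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum sketch_state { SKETCH_NEW, SKETCH_ACTIVE, SKETCH_DONE };

struct sketch_request {
	uint64_t request_id;
	enum sketch_state state;
	struct sketch_request *next;
};

/* Stand-in for the request_list/request_count pair in struct gna_private. */
struct sketch_device {
	struct sketch_request *head;
	uint64_t request_count;
};

static inline void sketch_device_init(struct sketch_device *dev)
{
	assert(dev != NULL);
	dev->head = NULL;
	dev->request_count = 0;
}

/* Append a request; ids start at 1, like a counter incremented before use. */
static inline struct sketch_request *sketch_enqueue(struct sketch_device *dev)
{
	struct sketch_request **link = &dev->head;
	struct sketch_request *req = calloc(1, sizeof(*req));

	if (!req)
		return NULL;

	req->request_id = ++dev->request_count;
	req->state = SKETCH_NEW;

	while (*link)
		link = &(*link)->next;
	*link = req;

	return req;
}

/* Look a request up by id; returns NULL when it is not on the list. */
static inline struct sketch_request *sketch_find(struct sketch_device *dev,
						 uint64_t id)
{
	struct sketch_request *req;

	for (req = dev->head; req; req = req->next)
		if (req->request_id == id)
			return req;

	return NULL;
}

/* Unlink and free a request by id; returns 0 on success, -1 if not found. */
static inline int sketch_delete(struct sketch_device *dev, uint64_t id)
{
	struct sketch_request **link = &dev->head;

	while (*link) {
		if ((*link)->request_id == id) {
			struct sketch_request *victim = *link;

			*link = victim->next;
			free(victim);
			return 0;
		}
		link = &(*link)->next;
	}

	return -1;
}
```

In this picture a score ioctl corresponds to sketch_enqueue(), and a wait
ioctl to sketch_find() followed by sketch_delete() once the request is done.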
drivers/misc/intel/gna/Kbuild | 2 +-
drivers/misc/intel/gna/gna_device.c | 6 +
drivers/misc/intel/gna/gna_device.h | 6 +
drivers/misc/intel/gna/gna_mem.c | 3 +
drivers/misc/intel/gna/gna_request.c | 347 +++++++++++++++++++++++++++
drivers/misc/intel/gna/gna_request.h | 61 +++++
6 files changed, 424 insertions(+), 1 deletion(-)
create mode 100644 drivers/misc/intel/gna/gna_request.c
create mode 100644 drivers/misc/intel/gna/gna_request.h
diff --git a/drivers/misc/intel/gna/Kbuild b/drivers/misc/intel/gna/Kbuild
index e5cd953d83b2..5dbbd3f0a543 100644
--- a/drivers/misc/intel/gna/Kbuild
+++ b/drivers/misc/intel/gna/Kbuild
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
-intel_gna-y := gna_device.o gna_driver.o gna_mem.o gna_hw.o
+intel_gna-y := gna_device.o gna_driver.o gna_mem.o gna_request.o gna_hw.o
obj-$(CONFIG_INTEL_GNA) += intel_gna.o
diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
index 9838d003426f..14ce24fd18ff 100644
--- a/drivers/misc/intel/gna/gna_device.c
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -6,6 +6,7 @@
#include "gna_device.h"
#include "gna_driver.h"
+#include "gna_request.h"
#define GNA_DEV_HWID_CNL 0x5A11
#define GNA_DEV_HWID_EHL 0x4511
@@ -118,6 +119,11 @@ static int gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
idr_init(&gna_priv->memory_idr);
mutex_init(&gna_priv->memidr_lock);
+ atomic_set(&gna_priv->request_count, 0);
+
+ mutex_init(&gna_priv->reqlist_lock);
+ INIT_LIST_HEAD(&gna_priv->request_list);
+
return 0;
err_pci_drvdata_unset:
diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
index 799788d70033..b54d0ea9b9ef 100644
--- a/drivers/misc/intel/gna/gna_device.h
+++ b/drivers/misc/intel/gna/gna_device.h
@@ -6,6 +6,7 @@
#include <linux/types.h>
#include <linux/mutex.h>
+#include <linux/list.h>
#include <linux/idr.h>
#include <linux/pci.h>
@@ -44,6 +45,11 @@ struct gna_private {
struct gna_mmu_object mmu;
struct mutex mmu_lock;
+ struct list_head request_list;
+ /* protects request_list */
+ struct mutex reqlist_lock;
+ atomic_t request_count;
+
/* memory objects' store */
struct idr memory_idr;
/* lock protecting memory_idr */
diff --git a/drivers/misc/intel/gna/gna_mem.c b/drivers/misc/intel/gna/gna_mem.c
index f3828b503ff6..ce1691d68edb 100644
--- a/drivers/misc/intel/gna/gna_mem.c
+++ b/drivers/misc/intel/gna/gna_mem.c
@@ -17,6 +17,7 @@
#include "gna_device.h"
#include "gna_driver.h"
#include "gna_mem.h"
+#include "gna_request.h"
static void gna_mmu_init(struct gna_private *gna_priv)
{
@@ -392,6 +393,8 @@ static void gna_memory_release(struct work_struct *work)
mo = container_of(work, struct gna_memory_object, work);
+ gna_delete_memory_requests(mo->memory_id, mo->gna_priv);
+
mo->user_ptr = NULL;
wake_up_interruptible(&mo->waitq);
diff --git a/drivers/misc/intel/gna/gna_request.c b/drivers/misc/intel/gna/gna_request.c
new file mode 100644
index 000000000000..383871eaebab
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_request.c
@@ -0,0 +1,347 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2017-2021 Intel Corporation
+
+#include <linux/device.h>
+#include <linux/kref.h>
+#include <linux/wait.h>
+#include <linux/workqueue.h>
+
+#include "gna_device.h"
+#include "gna_driver.h"
+#include "gna_request.h"
+
+static struct gna_request *gna_request_create(struct gna_file_private *file_priv,
+ struct gna_compute_cfg *compute_cfg)
+{
+ struct gna_request *score_request;
+ struct gna_private *gna_priv;
+
+ gna_priv = file_priv->gna_priv;
+ if (IS_ERR(gna_priv))
+ return NULL;
+
+ score_request = kzalloc(sizeof(*score_request), GFP_KERNEL);
+ if (!score_request)
+ return NULL;
+ kref_init(&score_request->refcount);
+
+ dev_dbg(&gna_priv->pdev->dev, "layer_base %d layer_count %d\n",
+ compute_cfg->layer_base, compute_cfg->layer_count);
+
+ score_request->request_id = atomic_inc_return(&gna_priv->request_count);
+ score_request->compute_cfg = *compute_cfg;
+ score_request->fd = file_priv->fd;
+ score_request->gna_priv = gna_priv;
+ score_request->state = NEW;
+ init_waitqueue_head(&score_request->waitq);
+
+ return score_request;
+}
+
+/*
+ * Returns true if [inner_offset, inner_offset + inner_size) lies within [0, outer_size).
+ */
+static bool gna_validate_ranges(u64 outer_size, u64 inner_offset, u64 inner_size)
+{
+ return inner_offset < outer_size &&
+ inner_size <= (outer_size - inner_offset);
+}
+
+static int gna_validate_patches(struct gna_private *gna_priv, __u64 buffer_size,
+ struct gna_memory_patch *patches, u64 count)
+{
+ u64 idx;
+
+ for (idx = 0; idx < count; ++idx) {
+ if (patches[idx].size > 8) {
+ dev_err(&gna_priv->pdev->dev, "invalid patch size: %llu\n", patches[idx].size);
+ return -EINVAL;
+ }
+
+ if (!gna_validate_ranges(buffer_size, patches[idx].offset, patches[idx].size)) {
+ dev_err(&gna_priv->pdev->dev,
+ "patch out of bounds. buffer size: %llu, patch offset/size:%llu/%llu\n",
+ buffer_size, patches[idx].offset, patches[idx].size);
+ return -EINVAL;
+ }
+ }
+
+ return 0;
+}
+
+static int gna_buffer_fill_patches(struct gna_buffer *buffer, struct gna_private *gna_priv)
+{
+ __u64 patches_user = buffer->patches_ptr;
+ struct gna_memory_patch *patches;
+ /* At this point, the buffer points to a memory region in kernel space where the copied
+ * patches_ptr also lives, but the value of it is still an address from user space. This
+ * function will set patches_ptr to either an address in kernel space or null before it
+ * exits.
+ */
+ u64 patch_count;
+ int ret;
+
+ buffer->patches_ptr = 0;
+ patch_count = buffer->patch_count;
+ if (!patch_count)
+ return 0;
+
+ patches = kvmalloc_array(patch_count, sizeof(struct gna_memory_patch), GFP_KERNEL);
+ if (!patches)
+ return -ENOMEM;
+
+ if (copy_from_user(patches, u64_to_user_ptr(patches_user),
+ sizeof(struct gna_memory_patch) * patch_count)) {
+ dev_err(&gna_priv->pdev->dev, "copy %llu patches from user failed\n", patch_count);
+ ret = -EFAULT;
+ goto err_fill_patches;
+ }
+
+ ret = gna_validate_patches(gna_priv, buffer->size, patches, patch_count);
+ if (ret) {
+ dev_err(&gna_priv->pdev->dev, "patches failed validation\n");
+ goto err_fill_patches;
+ }
+
+ buffer->patches_ptr = (uintptr_t)patches;
+
+ return 0;
+
+err_fill_patches:
+ kvfree(patches);
+ return ret;
+}
+
+static int gna_request_fill_buffers(struct gna_request *score_request,
+ struct gna_compute_cfg *compute_cfg)
+{
+ struct gna_buffer *buffer_list;
+ struct gna_memory_object *mo;
+ struct gna_private *gna_priv;
+ u64 buffers_total_size = 0;
+ struct gna_buffer *buffer;
+ u64 buffer_count;
+ u64 memory_id;
+ u64 i, j;
+ int ret;
+
+ gna_priv = score_request->gna_priv;
+
+ buffer_count = compute_cfg->buffer_count;
+ buffer_list = kvmalloc_array(buffer_count, sizeof(struct gna_buffer), GFP_KERNEL);
+ if (!buffer_list)
+ return -ENOMEM;
+
+ if (copy_from_user(buffer_list, u64_to_user_ptr(compute_cfg->buffers_ptr),
+ sizeof(*buffer_list) * buffer_count)) {
+ dev_err(&gna_priv->pdev->dev, "copying %llu buffers failed\n", buffer_count);
+ ret = -EFAULT;
+ goto err_free_buffers;
+ }
+
+ for (i = 0; i < buffer_count; i++) {
+ buffer = &buffer_list[i];
+ memory_id = buffer->memory_id;
+
+ for (j = 0; j < i; j++) {
+ if (buffer_list[j].memory_id == memory_id) {
+ dev_err(&gna_priv->pdev->dev,
+ "doubled memory id in score config. id:%llu\n", memory_id);
+ ret = -EINVAL;
+ goto err_zero_patch_ptr;
+ }
+ }
+
+ buffers_total_size +=
+ gna_buffer_get_size(buffer->offset, buffer->size);
+ if (buffers_total_size > gna_priv->info.max_hw_mem) {
+ dev_err(&gna_priv->pdev->dev, "buffers' total size too big\n");
+ ret = -EINVAL;
+ goto err_zero_patch_ptr;
+ }
+
+ mutex_lock(&gna_priv->memidr_lock);
+ mo = idr_find(&gna_priv->memory_idr, memory_id);
+ if (!mo) {
+ mutex_unlock(&gna_priv->memidr_lock);
+ dev_err(&gna_priv->pdev->dev, "memory object %llu not found\n", memory_id);
+ ret = -EINVAL;
+ goto err_zero_patch_ptr;
+ }
+ mutex_unlock(&gna_priv->memidr_lock);
+
+ if (mo->fd != score_request->fd) {
+ dev_err(&gna_priv->pdev->dev,
+ "memory object from another file. %p != %p\n",
+ mo->fd, score_request->fd);
+ ret = -EINVAL;
+ goto err_zero_patch_ptr;
+ }
+
+ if (!gna_validate_ranges(mo->memory_size, buffer->offset, buffer->size)) {
+ dev_err(&gna_priv->pdev->dev,
+ "buffer out of bounds. mo size: %llu, buffer offset/size:%llu/%llu\n",
+ mo->memory_size, buffer->offset, buffer->size);
+ ret = -EINVAL;
+ goto err_zero_patch_ptr;
+ }
+
+ ret = gna_buffer_fill_patches(buffer, gna_priv);
+ if (ret)
+ goto err_free_patches;
+ }
+
+ score_request->buffer_list = buffer_list;
+ score_request->buffer_count = buffer_count;
+
+ return 0;
+
+err_zero_patch_ptr:
+ /* patches_ptr may still hold an address in userspace.
+ * Don't pass it to kvfree().
+ */
+ buffer->patches_ptr = 0;
+
+err_free_patches:
+ /* patches_ptr of each processed buffer should be either
+ * null or pointing to an allocated memory block in the
+ * kernel at this point.
+ */
+ for (j = 0; j <= i; j++)
+ kvfree((void *)(uintptr_t)buffer_list[j].patches_ptr);
+
+err_free_buffers:
+ kvfree(buffer_list);
+ return ret;
+}
+
+int gna_enqueue_request(struct gna_compute_cfg *compute_cfg,
+ struct gna_file_private *file_priv, u64 *request_id)
+{
+ struct gna_request *score_request;
+ struct gna_private *gna_priv;
+ int ret;
+
+ if (!file_priv)
+ return -EINVAL;
+
+ gna_priv = file_priv->gna_priv;
+
+ score_request = gna_request_create(file_priv, compute_cfg);
+ if (!score_request)
+ return -ENOMEM;
+
+ ret = gna_request_fill_buffers(score_request, compute_cfg);
+ if (ret) {
+ kref_put(&score_request->refcount, gna_request_release);
+ return ret;
+ }
+
+ kref_get(&score_request->refcount);
+ mutex_lock(&gna_priv->reqlist_lock);
+ list_add_tail(&score_request->node, &gna_priv->request_list);
+ mutex_unlock(&gna_priv->reqlist_lock);
+
+ kref_put(&score_request->refcount, gna_request_release);
+
+ *request_id = score_request->request_id;
+
+ return 0;
+}
+
+void gna_request_release(struct kref *ref)
+{
+ struct gna_request *score_request =
+ container_of(ref, struct gna_request, refcount);
+ kfree(score_request);
+}
+
+struct gna_request *gna_find_request_by_id(u64 req_id, struct gna_private *gna_priv)
+{
+ struct gna_request *req, *found_req;
+ struct list_head *reqs_list;
+
+ mutex_lock(&gna_priv->reqlist_lock);
+
+ reqs_list = &gna_priv->request_list;
+ found_req = NULL;
+ if (!list_empty(reqs_list)) {
+ list_for_each_entry(req, reqs_list, node) {
+ if (req_id == req->request_id) {
+ found_req = req;
+ kref_get(&found_req->refcount);
+ break;
+ }
+ }
+ }
+
+ mutex_unlock(&gna_priv->reqlist_lock);
+
+ return found_req;
+}
+
+void gna_delete_request_by_id(u64 req_id, struct gna_private *gna_priv)
+{
+ struct gna_request *req, *temp_req;
+ struct list_head *reqs_list;
+
+ mutex_lock(&gna_priv->reqlist_lock);
+
+ reqs_list = &gna_priv->request_list;
+ if (!list_empty(reqs_list)) {
+ list_for_each_entry_safe(req, temp_req, reqs_list, node) {
+ if (req->request_id == req_id) {
+ list_del(&req->node);
+ kref_put(&req->refcount, gna_request_release);
+ break;
+ }
+ }
+ }
+
+ mutex_unlock(&gna_priv->reqlist_lock);
+}
+
+void gna_delete_file_requests(struct file *fd, struct gna_private *gna_priv)
+{
+ struct gna_request *req, *temp_req;
+ struct list_head *reqs_list;
+
+ mutex_lock(&gna_priv->reqlist_lock);
+
+ reqs_list = &gna_priv->request_list;
+ if (!list_empty(reqs_list)) {
+ list_for_each_entry_safe(req, temp_req, reqs_list, node) {
+ if (req->fd == fd) {
+ list_del(&req->node);
+ kref_put(&req->refcount, gna_request_release);
+ break;
+ }
+ }
+ }
+
+ mutex_unlock(&gna_priv->reqlist_lock);
+}
+
+void gna_delete_memory_requests(u64 memory_id, struct gna_private *gna_priv)
+{
+ struct gna_request *req, *temp_req;
+ struct list_head *reqs_list;
+ int i;
+
+ mutex_lock(&gna_priv->reqlist_lock);
+
+ reqs_list = &gna_priv->request_list;
+ if (!list_empty(reqs_list)) {
+ list_for_each_entry_safe(req, temp_req, reqs_list, node) {
+ for (i = 0; i < req->buffer_count; ++i) {
+ if (req->buffer_list[i].memory_id == memory_id) {
+ list_del(&req->node);
+ kref_put(&req->refcount, gna_request_release);
+ break;
+ }
+ }
+ }
+ }
+
+ mutex_unlock(&gna_priv->reqlist_lock);
+}
diff --git a/drivers/misc/intel/gna/gna_request.h b/drivers/misc/intel/gna/gna_request.h
new file mode 100644
index 000000000000..609e66ffb54f
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_request.h
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2017-2021 Intel Corporation */
+
+#ifndef __GNA_REQUEST_H__
+#define __GNA_REQUEST_H__
+
+#include <linux/kref.h>
+#include <linux/wait.h>
+#include <linux/workqueue.h>
+
+#include <uapi/misc/intel/gna.h>
+
+enum gna_request_state {
+ NEW,
+ ACTIVE,
+ DONE,
+};
+
+struct gna_file_private;
+
+struct gna_request {
+ u64 request_id;
+
+ struct kref refcount;
+
+ struct gna_private *gna_priv;
+ struct file *fd;
+
+ u32 hw_status;
+
+ enum gna_request_state state;
+
+ int status;
+
+ struct gna_hw_perf hw_perf;
+ struct gna_drv_perf drv_perf;
+
+ struct list_head node;
+
+ struct gna_compute_cfg compute_cfg;
+
+ struct gna_buffer *buffer_list;
+ u64 buffer_count;
+
+ struct wait_queue_head waitq;
+};
+
+int gna_enqueue_request(struct gna_compute_cfg *compute_cfg,
+ struct gna_file_private *file_priv, u64 *request_id);
+
+void gna_request_release(struct kref *ref);
+
+struct gna_request *gna_find_request_by_id(u64 req_id, struct gna_private *gna_priv);
+
+void gna_delete_request_by_id(u64 req_id, struct gna_private *gna_priv);
+
+void gna_delete_file_requests(struct file *fd, struct gna_private *gna_priv);
+
+void gna_delete_memory_requests(u64 memory_id, struct gna_private *gna_priv);
+
+#endif // __GNA_REQUEST_H__
--
2.28.0
From: Tomasz Jankowski <[email protected]>
An interrupt is generated by the hardware when a scoring job is
done. The interrupt handler wakes up the work queue to resume
processing of the current request.
Signed-off-by: Tomasz Jankowski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
Co-developed-by: Jianxun Zhang <[email protected]>
Signed-off-by: Jianxun Zhang <[email protected]>
Co-developed-by: Maciej Kwapulinski <[email protected]>
Signed-off-by: Maciej Kwapulinski <[email protected]>
---
drivers/misc/intel/gna/gna_device.c | 45 ++++++++++++++++++++++++++++-
drivers/misc/intel/gna/gna_device.h | 2 ++
drivers/misc/intel/gna/gna_hw.h | 1 -
3 files changed, 46 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
index 47f238677bc9..67917106a262 100644
--- a/drivers/misc/intel/gna/gna_device.c
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0-only
// Copyright(c) 2017-2021 Intel Corporation
+#include <linux/interrupt.h>
#include <linux/module.h>
#include <linux/pci.h>
@@ -153,6 +154,16 @@ static void gna_dev_deinit(struct gna_private *gna_priv)
gna_mmu_free(gna_priv);
}
+static irqreturn_t gna_interrupt(int irq, void *priv)
+{
+ struct gna_private *gna_priv;
+
+ gna_priv = (struct gna_private *)priv;
+ gna_priv->dev_busy = false;
+ wake_up(&gna_priv->dev_busy_waitq);
+ return IRQ_HANDLED;
+}
+
int gna_probe(struct pci_dev *pcidev, const struct pci_device_id *pci_id)
{
struct gna_private *gna_priv;
@@ -201,14 +212,42 @@ int gna_probe(struct pci_dev *pcidev, const struct pci_device_id *pci_id)
pci_set_master(pcidev);
+ ret = pci_alloc_irq_vectors(pcidev, 1, 1, PCI_IRQ_ALL_TYPES);
+ if (ret < 0)
+ return ret;
+
+ gna_priv->irq = pci_irq_vector(pcidev, 0);
+ if (unlikely(gna_priv->irq < 0)) {
+ dev_err(&pcidev->dev, "could not obtain irq number\n");
+ ret = -EIO;
+ goto err_free_irq_vector;
+ }
+
+ ret = request_irq(gna_priv->irq, gna_interrupt,
+ IRQF_SHARED, GNA_DV_NAME, gna_priv);
+
+ if (ret) {
+ dev_err(&pcidev->dev, "could not register for interrupt\n");
+ goto err_free_irq_vector;
+ }
+
+ dev_dbg(&pcidev->dev, "irq num %d\n", gna_priv->irq);
+
ret = gna_dev_init(gna_priv, pcidev, pci_id);
if (ret) {
dev_err(&pcidev->dev, "could not initialize %s device\n", GNA_DV_NAME);
- return ret;
+ goto err_free_irq;
}
return 0;
+
+err_free_irq:
+ free_irq(gna_priv->irq, gna_priv);
+err_free_irq_vector:
+ pci_free_irq_vectors(pcidev);
+
+ return ret;
}
void gna_remove(struct pci_dev *pcidev)
@@ -217,5 +256,9 @@ void gna_remove(struct pci_dev *pcidev)
gna_priv = pci_get_drvdata(pcidev);
+ free_irq(gna_priv->irq, gna_priv);
+
gna_dev_deinit(gna_priv);
+
+ pci_free_irq_vectors(pcidev);
}
diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
index 23eae806f96d..7ba25f6f8492 100644
--- a/drivers/misc/intel/gna/gna_device.h
+++ b/drivers/misc/intel/gna/gna_device.h
@@ -43,6 +43,8 @@ struct gna_private {
/* pdev->dev */
struct device *parent;
+ int irq;
+ /* hardware status set by interrupt handler */
u32 hw_status;
/* device related resources */
diff --git a/drivers/misc/intel/gna/gna_hw.h b/drivers/misc/intel/gna/gna_hw.h
index dd682f95094e..904e551f4f17 100644
--- a/drivers/misc/intel/gna/gna_hw.h
+++ b/drivers/misc/intel/gna/gna_hw.h
@@ -6,7 +6,6 @@
#include <linux/bits.h>
#include <linux/bitfield.h>
-#include <linux/interrupt.h>
#include <linux/io.h>
/* GNA MMIO registers */
--
2.28.0
From: Tomasz Jankowski <[email protected]>
The new workqueue processes the list of requests in FIFO order.
For each request it waits for the hardware to complete, and is
woken up by an interrupt. The interrupt handling is added in a
following change.
Signed-off-by: Tomasz Jankowski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
Co-developed-by: Anisha Dattatraya Kulkarni <[email protected]>
Signed-off-by: Anisha Dattatraya Kulkarni <[email protected]>
Co-developed-by: Jianxun Zhang <[email protected]>
Signed-off-by: Jianxun Zhang <[email protected]>
Co-developed-by: Maciej Kwapulinski <[email protected]>
Signed-off-by: Maciej Kwapulinski <[email protected]>
---
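Note for reviewers (not part of the commit message): the FIFO behaviour
described above can be modelled in user space as a tail-insert queue
whose entries move through the NEW -> ACTIVE -> DONE states. This is only
a sketch under simplifying assumptions. The driver gets its ordering from
a single-threaded workqueue and keeps requests on request_list until
they are waited on, whereas the model below simply pops the oldest
entry. All names (request_queue, queue_submit, queue_process_one) are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum req_state { REQ_NEW, REQ_ACTIVE, REQ_DONE };

struct request {
	uint64_t id;
	enum req_state state;
	struct request *next;
};

struct request_queue {
	struct request *head;	/* oldest request, processed first */
	struct request *tail;	/* newest request */
	uint64_t next_id;	/* stands in for the atomic request_count */
};

static void queue_init(struct request_queue *q)
{
	q->head = NULL;
	q->tail = NULL;
	q->next_id = 0;
}

/* Append a request at the tail and hand back its id (ids start at 1). */
static uint64_t queue_submit(struct request_queue *q, struct request *req)
{
	req->id = ++q->next_id;
	req->state = REQ_NEW;
	req->next = NULL;
	if (q->tail)
		q->tail->next = req;
	else
		q->head = req;
	q->tail = req;
	return req->id;
}

/* Pop the oldest request and run it to completion; NULL if the queue is empty. */
static struct request *queue_process_one(struct request_queue *q)
{
	struct request *req = q->head;

	if (!req)
		return NULL;
	q->head = req->next;
	if (!q->head)
		q->tail = NULL;
	req->next = NULL;

	req->state = REQ_ACTIVE;
	/* The driver would start the hardware here and sleep until woken. */
	req->state = REQ_DONE;
	return req;
}
```

Submitting requests a, b and c and then calling queue_process_one()
three times completes them in that same order.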
drivers/misc/intel/gna/gna_device.c | 12 +++
drivers/misc/intel/gna/gna_device.h | 8 ++
drivers/misc/intel/gna/gna_request.c | 116 +++++++++++++++++++++++++++
drivers/misc/intel/gna/gna_request.h | 1 +
4 files changed, 137 insertions(+)
diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
index e1a1f3142684..47f238677bc9 100644
--- a/drivers/misc/intel/gna/gna_device.c
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -127,6 +127,15 @@ static int gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
mutex_init(&gna_priv->reqlist_lock);
INIT_LIST_HEAD(&gna_priv->request_list);
+ init_waitqueue_head(&gna_priv->dev_busy_waitq);
+
+ gna_priv->request_wq = create_singlethread_workqueue(GNA_DV_NAME);
+ if (!gna_priv->request_wq) {
+ dev_err(&pcidev->dev, "could not create %s workqueue\n", GNA_DV_NAME);
+ ret = -ENOMEM;
+ goto err_pci_drvdata_unset;
+ }
+
return 0;
err_pci_drvdata_unset:
@@ -137,6 +146,9 @@ static int gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
static void gna_dev_deinit(struct gna_private *gna_priv)
{
+ flush_workqueue(gna_priv->request_wq);
+ destroy_workqueue(gna_priv->request_wq);
+
idr_destroy(&gna_priv->memory_idr);
gna_mmu_free(gna_priv);
}
diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
index 878a972ab5b3..23eae806f96d 100644
--- a/drivers/misc/intel/gna/gna_device.h
+++ b/drivers/misc/intel/gna/gna_device.h
@@ -14,6 +14,7 @@
#include "gna_mem.h"
struct gna_driver_private;
+struct workqueue_struct;
struct device;
struct gna_drv_info {
@@ -42,6 +43,8 @@ struct gna_private {
/* pdev->dev */
struct device *parent;
+ u32 hw_status;
+
/* device related resources */
void __iomem *bar0_base;
struct gna_drv_info info;
@@ -50,9 +53,14 @@ struct gna_private {
struct gna_mmu_object mmu;
struct mutex mmu_lock;
+ /* if true, then gna device is processing */
+ bool dev_busy;
+ struct wait_queue_head dev_busy_waitq;
+
struct list_head request_list;
/* protects request_list */
struct mutex reqlist_lock;
+ struct workqueue_struct *request_wq;
atomic_t request_count;
/* memory objects' store */
diff --git a/drivers/misc/intel/gna/gna_request.c b/drivers/misc/intel/gna/gna_request.c
index 383871eaebab..ba9bac358270 100644
--- a/drivers/misc/intel/gna/gna_request.c
+++ b/drivers/misc/intel/gna/gna_request.c
@@ -8,7 +8,118 @@
#include "gna_device.h"
#include "gna_driver.h"
+#include "gna_hw.h"
#include "gna_request.h"
+#include "gna_score.h"
+
+static void gna_request_update_status(struct gna_request *score_request)
+{
+ struct gna_private *gna_priv = score_request->gna_priv;
+ void __iomem *addr = gna_priv->bar0_base;
+ /* The gna_priv's hw_status should be updated first */
+ u32 hw_status = gna_priv->hw_status;
+ u32 stall_cycles;
+ u32 total_cycles;
+
+ /* Technically, the time stamp can be a bit later than
+ * when the hw actually completed scoring. Here we just
+ * do our best in a deferred work, unless we want to
+ * tax isr for a more accurate record.
+ */
+ score_request->drv_perf.hw_completed = ktime_get_ns();
+
+ score_request->hw_status = hw_status;
+
+ score_request->status = gna_parse_hw_status(gna_priv, hw_status);
+
+ if (gna_hw_perf_enabled(gna_priv)) {
+ if (hw_status & GNA_STS_STATISTICS_VALID) {
+ total_cycles = gna_reg_read(addr, GNA_MMIO_PTC);
+ stall_cycles = gna_reg_read(addr, GNA_MMIO_PSC);
+ score_request->hw_perf.total = total_cycles;
+ score_request->hw_perf.stall = stall_cycles;
+ } else
+ dev_warn(&gna_priv->pdev->dev, "GNA statistics missing\n");
+ }
+ if (unlikely(hw_status & GNA_ERROR))
+ gna_print_error_status(gna_priv, hw_status);
+}
+
+static void gna_request_process(struct work_struct *work)
+{
+ struct gna_request *score_request;
+ struct gna_memory_object *mo;
+ struct gna_private *gna_priv;
+ struct gna_buffer *buffer;
+ unsigned long hw_timeout;
+ int ret;
+ u64 i;
+
+ score_request = container_of(work, struct gna_request, work);
+ gna_priv = score_request->gna_priv;
+ dev_dbg(&gna_priv->pdev->dev, "processing request %llu\n", score_request->request_id);
+
+ score_request->state = ACTIVE;
+
+ score_request->drv_perf.pre_processing = ktime_get_ns();
+
+ /* Set the busy flag before kicking off the HW. The ISR clears it and wakes us up. It
+ * makes no difference if the ISR was missed when the previous request timed out: the
+ * flag is always set here, and wait_event_timeout() checks whether it was cleared.
+ * wq: X -> true
+ * isr: X -> false
+ */
+ gna_priv->dev_busy = true;
+
+ ret = gna_score(score_request);
+ if (ret) {
+ score_request->status = ret;
+ goto end;
+ }
+
+ score_request->drv_perf.processing = ktime_get_ns();
+
+ hw_timeout = gna_priv->drv_priv->recovery_timeout_jiffies;
+
+ hw_timeout = wait_event_timeout(gna_priv->dev_busy_waitq,
+ !gna_priv->dev_busy, hw_timeout);
+
+ if (!hw_timeout)
+ dev_warn(&gna_priv->pdev->dev, "hardware timeout occurred\n");
+
+ gna_priv->hw_status = gna_reg_read(gna_priv->bar0_base, GNA_MMIO_STS);
+
+ gna_request_update_status(score_request);
+ gna_abort_hw(gna_priv);
+
+ buffer = score_request->buffer_list;
+ for (i = 0; i < score_request->buffer_count; i++, buffer++) {
+ mutex_lock(&gna_priv->memidr_lock);
+ mo = idr_find(&gna_priv->memory_idr, buffer->memory_id);
+ mutex_unlock(&gna_priv->memidr_lock);
+ if (mo) {
+ mutex_lock(&mo->page_lock);
+ mo->ops->put_pages(mo);
+ mutex_unlock(&mo->page_lock);
+ } else {
+ dev_warn(&gna_priv->pdev->dev, "mo not found %llu\n", buffer->memory_id);
+ }
+ }
+
+ /* patches_ptr's are already freed by ops->score() function */
+ kvfree(score_request->buffer_list);
+ score_request->buffer_list = NULL;
+ score_request->buffer_count = 0;
+
+ gna_mmu_clear(gna_priv);
+
+end:
+ score_request->drv_perf.completion = ktime_get_ns();
+ dev_dbg(&gna_priv->pdev->dev, "request %llu done, waking processes\n",
+ score_request->request_id);
+ score_request->state = DONE;
+ wake_up_interruptible_all(&score_request->waitq);
+}
static struct gna_request *gna_request_create(struct gna_file_private *file_priv,
struct gna_compute_cfg *compute_cfg)
@@ -34,6 +145,7 @@ static struct gna_request *gna_request_create(struct gna_file_private *file_priv
score_request->gna_priv = gna_priv;
score_request->state = NEW;
init_waitqueue_head(&score_request->waitq);
+ INIT_WORK(&score_request->work, gna_request_process);
return score_request;
}
@@ -242,6 +354,7 @@ int gna_enqueue_request(struct gna_compute_cfg *compute_cfg,
list_add_tail(&score_request->node, &gna_priv->request_list);
mutex_unlock(&gna_priv->reqlist_lock);
+ queue_work(gna_priv->request_wq, &score_request->work);
kref_put(&score_request->refcount, gna_request_release);
*request_id = score_request->request_id;
@@ -292,6 +405,7 @@ void gna_delete_request_by_id(u64 req_id, struct gna_private *gna_priv)
list_for_each_entry_safe(req, temp_req, reqs_list, node) {
if (req->request_id == req_id) {
list_del(&req->node);
+ cancel_work_sync(&req->work);
kref_put(&req->refcount, gna_request_release);
break;
}
@@ -313,6 +427,7 @@ void gna_delete_file_requests(struct file *fd, struct gna_private *gna_priv)
list_for_each_entry_safe(req, temp_req, reqs_list, node) {
if (req->fd == fd) {
list_del(&req->node);
+ cancel_work_sync(&req->work);
kref_put(&req->refcount, gna_request_release);
break;
}
@@ -336,6 +451,7 @@ void gna_delete_memory_requests(u64 memory_id, struct gna_private *gna_priv)
for (i = 0; i < req->buffer_count; ++i) {
if (req->buffer_list[i].memory_id == memory_id) {
list_del(&req->node);
+ cancel_work_sync(&req->work);
kref_put(&req->refcount, gna_request_release);
break;
}
diff --git a/drivers/misc/intel/gna/gna_request.h b/drivers/misc/intel/gna/gna_request.h
index 609e66ffb54f..0d8c0f4180c8 100644
--- a/drivers/misc/intel/gna/gna_request.h
+++ b/drivers/misc/intel/gna/gna_request.h
@@ -43,6 +43,7 @@ struct gna_request {
u64 buffer_count;
struct wait_queue_head waitq;
+ struct work_struct work;
};
int gna_enqueue_request(struct gna_compute_cfg *compute_cfg,
--
2.28.0
From: Tomasz Jankowski <[email protected]>
Add an ioctl handler to the GNA driver.
The ioctl interface provides the ability to do the following:
- Map and unmap memory buffers for GNA computation requests.
- Retrieve capabilities of the underlying GNA IP.
- Submit GNA computation requests.
- Request notification of scoring completion.
Signed-off-by: Tomasz Jankowski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
Co-developed-by: Jianxun Zhang <[email protected]>
Signed-off-by: Jianxun Zhang <[email protected]>
Co-developed-by: Maciej Kwapulinski <[email protected]>
Signed-off-by: Maciej Kwapulinski <[email protected]>
---
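Note for reviewers (not part of the commit message): the wait ioctl in
this patch turns the result of wait_event_interruptible_timeout() into
the value returned to user space. The sketch below models only that
mapping, as a plain user-space function. wait_result_to_ioctl_ret() is a
hypothetical name, ERESTARTSYS is defined locally because user-space
<errno.h> does not expose it, and the later copy_to_user() failure path
(-EFAULT) is not modelled.

```c
#include <errno.h>

/* Kernel-internal value from include/linux/errno.h, defined here for the sketch. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/*
 * wait_ret models the return value of wait_event_interruptible_timeout():
 *   0             the timeout elapsed and the condition is still false
 *   -ERESTARTSYS  a signal interrupted the wait
 *   > 0           the condition became true before the timeout
 * request_status is the status recorded for the finished request.
 */
static int wait_result_to_ioctl_ret(long wait_ret, int request_status)
{
	if (wait_ret == 0 || wait_ret == -ERESTARTSYS)
		return -EBUSY;
	return request_status;
}
```

One consequence visible in the model is that a caller cannot tell a
timeout from an interrupted wait, since both are reported as -EBUSY.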
drivers/misc/intel/gna/Kbuild | 2 +-
drivers/misc/intel/gna/gna_device.c | 44 +++++
drivers/misc/intel/gna/gna_device.h | 3 +
drivers/misc/intel/gna/gna_ioctl.c | 249 ++++++++++++++++++++++++++++
drivers/misc/intel/gna/gna_ioctl.h | 11 ++
5 files changed, 308 insertions(+), 1 deletion(-)
create mode 100644 drivers/misc/intel/gna/gna_ioctl.c
create mode 100644 drivers/misc/intel/gna/gna_ioctl.h
diff --git a/drivers/misc/intel/gna/Kbuild b/drivers/misc/intel/gna/Kbuild
index 9dac467839c9..879cc76204c3 100644
--- a/drivers/misc/intel/gna/Kbuild
+++ b/drivers/misc/intel/gna/Kbuild
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
-intel_gna-y := gna_device.o gna_driver.o gna_mem.o gna_request.o gna_score.o gna_hw.o
+intel_gna-y := gna_device.o gna_driver.o gna_mem.o gna_ioctl.o gna_request.o gna_score.o gna_hw.o
obj-$(CONFIG_INTEL_GNA) += intel_gna.o
diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
index 67917106a262..d8e1d4b8a9eb 100644
--- a/drivers/misc/intel/gna/gna_device.c
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -5,8 +5,12 @@
#include <linux/module.h>
#include <linux/pci.h>
+#include <uapi/misc/intel/gna.h>
+
#include "gna_device.h"
#include "gna_driver.h"
+#include "gna_hw.h"
+#include "gna_ioctl.h"
#include "gna_request.h"
#define GNA_DEV_HWID_CNL 0x5A11
@@ -262,3 +266,43 @@ void gna_remove(struct pci_dev *pcidev)
pci_free_irq_vectors(pcidev);
}
+
+static u32 gna_device_type_by_hwid(u32 hwid)
+{
+ switch (hwid) {
+ case GNA_DEV_HWID_CNL:
+ return GNA_DEV_TYPE_0_9;
+ case GNA_DEV_HWID_GLK:
+ case GNA_DEV_HWID_EHL:
+ case GNA_DEV_HWID_ICL:
+ return GNA_DEV_TYPE_1_0;
+ case GNA_DEV_HWID_JSL:
+ case GNA_DEV_HWID_TGL:
+ return GNA_DEV_TYPE_2_0;
+ default:
+ return 0;
+ }
+}
+
+int gna_getparam(struct gna_private *gna_priv, union gna_parameter *param)
+{
+ switch (param->in.id) {
+ case GNA_PARAM_DEVICE_ID:
+ param->out.value = gna_priv->info.hwid;
+ break;
+ case GNA_PARAM_RECOVERY_TIMEOUT:
+ param->out.value = jiffies_to_msecs(gna_priv->drv_priv->recovery_timeout_jiffies) / 1000;
+ break;
+ case GNA_PARAM_INPUT_BUFFER_S:
+ param->out.value = gna_priv->hw_info.in_buf_s;
+ break;
+ case GNA_PARAM_DEVICE_TYPE:
+ param->out.value = gna_device_type_by_hwid(gna_priv->info.hwid);
+ break;
+ default:
+ dev_err(&gna_priv->pdev->dev, "unknown parameter id %llu\n", param->in.id);
+ return -EINVAL;
+ }
+
+ return 0;
+}
diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
index 7ba25f6f8492..aa7fadcf93b1 100644
--- a/drivers/misc/intel/gna/gna_device.h
+++ b/drivers/misc/intel/gna/gna_device.h
@@ -15,6 +15,7 @@
struct gna_driver_private;
struct workqueue_struct;
+union gna_parameter;
struct device;
struct gna_drv_info {
@@ -77,4 +78,6 @@ int gna_probe(struct pci_dev *dev, const struct pci_device_id *id);
void gna_remove(struct pci_dev *dev);
+int gna_getparam(struct gna_private *gna_priv, union gna_parameter *param);
+
#endif /* __GNA_DEVICE_H__ */
diff --git a/drivers/misc/intel/gna/gna_ioctl.c b/drivers/misc/intel/gna/gna_ioctl.c
new file mode 100644
index 000000000000..79ce3aeb27cf
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_ioctl.c
@@ -0,0 +1,249 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2017-2021 Intel Corporation
+
+#include <linux/uaccess.h>
+
+#include <uapi/misc/intel/gna.h>
+
+#include "gna_driver.h"
+#include "gna_device.h"
+#include "gna_ioctl.h"
+#include "gna_mem.h"
+#include "gna_request.h"
+#include "gna_score.h"
+
+static int gna_ioctl_score(struct gna_file_private *file_priv, void __user *argptr)
+{
+ union gna_compute score_args;
+ struct gna_private *gna_priv;
+ u64 request_id;
+ int ret;
+
+ gna_priv = file_priv->gna_priv;
+
+ if (copy_from_user(&score_args, argptr, sizeof(score_args))) {
+ dev_err(&gna_priv->pdev->dev, "could not copy score ioctl config from user\n");
+ return -EFAULT;
+ }
+
+ ret = gna_validate_score_config(&score_args.in.config, file_priv);
+ if (ret) {
+ dev_err(&gna_priv->pdev->dev, "request not valid\n");
+ return ret;
+ }
+
+ ret = gna_enqueue_request(&score_args.in.config, file_priv, &request_id);
+ if (ret) {
+ dev_err(&gna_priv->pdev->dev, "could not enqueue score request %d\n", ret);
+ return ret;
+ }
+
+ score_args.out.request_id = request_id;
+ if (copy_to_user(argptr, &score_args, sizeof(score_args))) {
+ dev_err(&gna_priv->pdev->dev, "could not copy score ioctl status to user\n");
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static int gna_ioctl_wait(struct file *f, void __user *argptr)
+{
+ struct gna_file_private *file_priv;
+ struct gna_request *score_request;
+ struct gna_private *gna_priv;
+ union gna_wait wait_data;
+ u64 request_id;
+ u32 timeout;
+ int ret;
+
+ file_priv = (struct gna_file_private *)f->private_data;
+ gna_priv = file_priv->gna_priv;
+
+ ret = 0;
+
+ if (copy_from_user(&wait_data, argptr, sizeof(wait_data))) {
+ dev_err(&gna_priv->pdev->dev, "could not copy wait ioctl data from user\n");
+ return -EFAULT;
+ }
+
+ request_id = wait_data.in.request_id;
+ timeout = wait_data.in.timeout;
+
+ score_request = gna_find_request_by_id(request_id, gna_priv);
+
+ if (!score_request) {
+ dev_err(&gna_priv->pdev->dev, "could not find request with id: %llu\n", request_id);
+ return -EINVAL;
+ }
+
+ if (score_request->fd != f) {
+ kref_put(&score_request->refcount, gna_request_release);
+ return -EINVAL;
+ }
+
+ dev_dbg(&gna_priv->pdev->dev, "waiting for request %llu for timeout %u\n", request_id, timeout);
+
+ ret = wait_event_interruptible_timeout(score_request->waitq, score_request->state == DONE,
+ msecs_to_jiffies(timeout));
+ if (ret == 0 || ret == -ERESTARTSYS) {
+ dev_err(&gna_priv->pdev->dev, "request timed out, id: %llu\n", request_id);
+ kref_put(&score_request->refcount, gna_request_release);
+ return -EBUSY;
+ }
+
+ dev_dbg(&gna_priv->pdev->dev, "request wait completed with %d req id %llu\n", ret, request_id);
+
+ wait_data.out.hw_perf = score_request->hw_perf;
+ wait_data.out.drv_perf = score_request->drv_perf;
+ wait_data.out.hw_status = score_request->hw_status;
+
+ ret = score_request->status;
+
+ dev_dbg(&gna_priv->pdev->dev, "request status %d, hw status: %#x\n",
+ score_request->status, score_request->hw_status);
+ kref_put(&score_request->refcount, gna_request_release);
+
+ gna_delete_request_by_id(request_id, gna_priv);
+
+ if (copy_to_user(argptr, &wait_data, sizeof(wait_data))) {
+ dev_err(&gna_priv->pdev->dev, "could not copy wait ioctl status to user\n");
+ ret = -EFAULT;
+ }
+
+ return ret;
+}
+
+static int gna_ioctl_map(struct gna_file_private *file_priv, void __user *argptr)
+{
+ struct gna_private *gna_priv;
+ union gna_memory_map gna_mem;
+ int ret;
+
+ gna_priv = file_priv->gna_priv;
+
+ if (copy_from_user(&gna_mem, argptr, sizeof(gna_mem))) {
+ dev_err(&gna_priv->pdev->dev, "could not copy userptr ioctl data from user\n");
+ return -EFAULT;
+ }
+
+ ret = gna_map_memory(file_priv, &gna_mem);
+ if (ret)
+ return ret;
+
+ if (copy_to_user(argptr, &gna_mem, sizeof(gna_mem))) {
+ dev_err(&gna_priv->pdev->dev, "could not copy userptr ioctl status to user\n");
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static int gna_ioctl_free(struct gna_file_private *file_priv, unsigned long arg)
+{
+ struct gna_memory_object *iter_mo, *temp_mo;
+ struct gna_memory_object *mo;
+ struct gna_private *gna_priv;
+
+ u64 memory_id = arg;
+
+ gna_priv = file_priv->gna_priv;
+
+ mutex_lock(&gna_priv->memidr_lock);
+ mo = idr_find(&gna_priv->memory_idr, memory_id);
+ mutex_unlock(&gna_priv->memidr_lock);
+
+ if (!mo) {
+ dev_warn(&gna_priv->pdev->dev, "memory object not found\n");
+ return -EINVAL;
+ }
+
+ queue_work(gna_priv->request_wq, &mo->work);
+ if (wait_event_interruptible(mo->waitq, true)) {
+ dev_dbg(&gna_priv->pdev->dev, "wait interrupted\n");
+ return -ETIME;
+ }
+
+ mutex_lock(&file_priv->memlist_lock);
+ list_for_each_entry_safe(iter_mo, temp_mo, &file_priv->memory_list, file_mem_list) {
+ if (iter_mo->memory_id == memory_id) {
+ list_del(&iter_mo->file_mem_list);
+ break;
+ }
+ }
+ mutex_unlock(&file_priv->memlist_lock);
+
+ gna_memory_free(gna_priv, mo);
+
+ return 0;
+}
+
+static int gna_ioctl_getparam(struct gna_private *gna_priv, void __user *argptr)
+{
+ union gna_parameter param;
+ int ret;
+
+ if (copy_from_user(&param, argptr, sizeof(param))) {
+ dev_err(&gna_priv->pdev->dev, "could not copy getparam ioctl data from user\n");
+ return -EFAULT;
+ }
+
+ ret = gna_getparam(gna_priv, &param);
+ if (ret)
+ return ret;
+
+ if (copy_to_user(argptr, &param, sizeof(param))) {
+ dev_err(&gna_priv->pdev->dev, "could not copy getparam ioctl status to user\n");
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+long gna_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+ struct gna_file_private *file_priv;
+ struct gna_private *gna_priv;
+ void __user *argptr;
+ int ret = 0;
+
+ argptr = (void __user *)arg;
+
+ file_priv = (struct gna_file_private *)f->private_data;
+ // TODO following is always false?
+ if (!file_priv)
+ return -ENODEV;
+
+ gna_priv = file_priv->gna_priv;
+ if (!gna_priv)
+ return -ENODEV;
+
+ switch (cmd) {
+ case GNA_GET_PARAMETER:
+ ret = gna_ioctl_getparam(gna_priv, argptr);
+ break;
+
+ case GNA_MAP_MEMORY:
+ ret = gna_ioctl_map(file_priv, argptr);
+ break;
+
+ case GNA_UNMAP_MEMORY:
+ ret = gna_ioctl_free(file_priv, arg);
+ break;
+
+ case GNA_COMPUTE:
+ ret = gna_ioctl_score(file_priv, argptr);
+ break;
+
+ case GNA_WAIT:
+ ret = gna_ioctl_wait(f, argptr);
+ break;
+
+ default:
+ dev_warn(&gna_priv->pdev->dev, "wrong ioctl %#x\n", cmd);
+ ret = -EINVAL;
+ break;
+ }
+
+ return ret;
+}
diff --git a/drivers/misc/intel/gna/gna_ioctl.h b/drivers/misc/intel/gna/gna_ioctl.h
new file mode 100644
index 000000000000..562f7f835f5f
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_ioctl.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2017-2021 Intel Corporation */
+
+#ifndef __GNA_IOCTL_H__
+#define __GNA_IOCTL_H__
+
+#include <linux/fs.h>
+
+long gna_ioctl(struct file *f, unsigned int cmd, unsigned long arg);
+
+#endif // __GNA_IOCTL_H__
--
2.28.0
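Note on gna_ioctl_wait() above: wait_event_interruptible_timeout()
returns 0 when the timeout elapses with the condition still false, a
negative value (-ERESTARTSYS) when a signal interrupts the wait, and a
positive value when the condition became true. The patch logs both of
the first two cases as "request timed out" and returns -EBUSY. A minimal
userspace sketch of keeping them apart follows. The helper name is
hypothetical, and whether the signal case should be propagated is a
design decision for the authors.

```c
#include <errno.h>

/*
 * Assumption: ERESTARTSYS is kernel-internal and absent from the
 * userspace <errno.h>; 512 mirrors the kernel's value so that this
 * sketch builds outside the kernel.
 */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/*
 * Hypothetical helper: map the return value of
 * wait_event_interruptible_timeout() to an ioctl result.
 *   ret == 0  timeout elapsed, condition still false
 *   ret <  0  interrupted by a signal (-ERESTARTSYS)
 *   ret >  0  condition became true
 */
int gna_wait_result_to_errno(long ret)
{
	if (ret == 0)
		return -EBUSY;		/* what the patch returns on timeout */
	if (ret < 0)
		return (int)ret;	/* pass -ERESTARTSYS through unchanged */
	return 0;
}
```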
From: Tomasz Jankowski <[email protected]>
Signed-off-by: Tomasz Jankowski <[email protected]>
Tested-by: Savo Novakovic <[email protected]>
Co-developed-by: Jianxun Zhang <[email protected]>
Signed-off-by: Jianxun Zhang <[email protected]>
Co-developed-by: Maciej Kwapulinski <[email protected]>
Signed-off-by: Maciej Kwapulinski <[email protected]>
---
drivers/misc/intel/gna/gna_device.c | 60 ++++++++++++++++++++++++++++-
1 file changed, 59 insertions(+), 1 deletion(-)
diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
index 3f7f4c07d1a1..3f74a0a3bd30 100644
--- a/drivers/misc/intel/gna/gna_device.c
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -93,12 +93,70 @@ MODULE_DEVICE_TABLE(pci, gna_pci_ids);
static int gna_open(struct inode *inode, struct file *f)
{
- return -EPERM;
+ struct gna_file_private *file_priv;
+ struct gna_private *gna_priv;
+
+ gna_priv = container_of(f->private_data, struct gna_private, misc);
+
+ file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
+ if (!file_priv)
+ return -ENOMEM;
+
+ file_priv->fd = f;
+ file_priv->gna_priv = gna_priv;
+
+ mutex_init(&file_priv->memlist_lock);
+ INIT_LIST_HEAD(&file_priv->memory_list);
+
+ mutex_lock(&gna_priv->flist_lock);
+ list_add_tail(&file_priv->flist, &gna_priv->file_list);
+ mutex_unlock(&gna_priv->flist_lock);
+
+ f->private_data = file_priv;
+
+ return 0;
+}
+
+static int gna_release(struct inode *inode, struct file *f)
+{
+ struct gna_file_private *iter_file, *temp_file;
+ struct gna_memory_object *iter_mo, *temp_mo;
+ struct gna_file_private *file_priv;
+ struct gna_private *gna_priv;
+
+ /* free all memory objects created by that file */
+ file_priv = (struct gna_file_private *)f->private_data;
+ gna_priv = file_priv->gna_priv;
+
+ mutex_lock(&file_priv->memlist_lock);
+ list_for_each_entry_safe(iter_mo, temp_mo, &file_priv->memory_list, file_mem_list) {
+ queue_work(gna_priv->request_wq, &iter_mo->work);
+ wait_event(iter_mo->waitq, true);
+ gna_memory_free(gna_priv, iter_mo);
+ }
+ mutex_unlock(&file_priv->memlist_lock);
+
+ gna_delete_file_requests(f, gna_priv);
+
+ mutex_lock(&gna_priv->flist_lock);
+ list_for_each_entry_safe(iter_file, temp_file, &gna_priv->file_list, flist) {
+ if (iter_file->fd == f) {
+ list_del(&iter_file->flist);
+ f->private_data = NULL;
+ kfree(iter_file);
+ break;
+ }
+ }
+ mutex_unlock(&gna_priv->flist_lock);
+
+ return 0;
}
static const struct file_operations gna_file_ops = {
.owner = THIS_MODULE,
.open = gna_open,
+ .release = gna_release,
+ .unlocked_ioctl = gna_ioctl,
};
static void gna_dev_release(struct gna_private *gna_priv)
--
2.28.0
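Note on gna_open() above: the misc framework stores a pointer to the
struct miscdevice in f->private_data before calling ->open(), and the
driver walks back from that embedded member to its own structure with
container_of(). A small userspace sketch of the pointer arithmetic
follows. The structures are trimmed, hypothetical stand-ins and not the
driver's real layout.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down stand-ins for the real structures */
struct miscdevice {
	int minor;
};

struct gna_private {
	int hw_id;
	struct miscdevice misc;	/* embedded member, as gna_open() implies */
};

/* Recover the enclosing gna_private from a pointer to its misc member */
struct gna_private *gna_from_misc(struct miscdevice *m)
{
	return container_of(m, struct gna_private, misc);
}
```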
On Wed, Mar 24, 2021 at 8:38 PM Maciej Kwapulinski
<[email protected]> wrote:
>
> Dear kernel maintainers,
>
> This submission is a kernel driver to support Intel(R) Gaussian & Neural
> Accelerator (Intel(R) GNA). Intel(R) GNA is a PCI-based neural co-processor
> available on multiple Intel platforms. AI developers and users can offload
> continuous inference workloads to an Intel(R) GNA device in order to free
> processor resources and save power. Noise reduction and speech recognition
> are the examples of the workloads Intel(R) GNA deals with while its usage
> is not limited to the two.
>
> For a list of processors equipped with Intel(R) GNA device, please refer to
> this link:
> https://docs.openvinotoolkit.org/latest/openvino_docs_IE_DG_supported_plugins_GNA.html
>
> We think contributing this driver to the upstream kernel project is the
> best way for developers and users to get the latest Intel(R) GNA support in
> a Linux kernel, through the mainline to any Linux distributions installed
> on their systems. Upstreaming also enables contribution from developers
> around the world to the driver once it is merged.
>
> The driver works with Intel(R) libraries in user space. The Intel(R) driver
> exposes a few IOCTL interfaces for use by libraries in user space. The
> libraries are open sourced and are available at:
> https://github.com/intel/gna
>
> ---
>
> Changelogs:
>
> v1->v2:
> - driver's new layout:
> - driver name: gna -> intel_gna
> - module name: gna -> intel_gna
> - device file name: /dev/gnaN -> /dev/intel_gnaN
Not sure we need this, but if Greg asked for that (I haven't followed)
then it's okay.
> - driver's source directory: drivers/misc/gna/ -> drivers/misc/intel/gna/
> - UAPI: include/uapi/misc/gna.h -> include/uapi/misc/intel/gna.h
> - DOC: Documentation/misc-devices/gna.rst ->
> Documentation/misc-devices/intel/gna.rst
> - 'MISC' device framework used
> - fixes throughout GNA device's PCI management
> - header files' includes and forward declarations cleanup
> - ISR made static
> - unused comments cleanup
> - "_priv_" segment removed from function names
> - tested: v5.11-rc3 -> v5.11
We are at v5.12-rc4. The rule of thumb is latest rc or release +
subsystem tree against which the driver is created.
> - number of other/minor fixes
>
> ---
>
> Maciej Kwapulinski (1):
> intel_gna: add a 'misc' device
>
> Tomasz Jankowski (12):
> intel_gna: add driver module
> intel_gna: add component of hardware operation
> intel_gna: read hardware info in the driver
> intel_gna: add memory handling
> intel_gna: initialize mmu
> intel_gna: add hardware ids
> intel_gna: add request component
> intel_gna: implement scoring
> intel_gna: add a work queue to process scoring requests
> intel_gna: add interrupt handler
> intel_gna: add ioctl handler
> intel_gna: add file operations to a 'misc' device
>
> Documentation/misc-devices/index.rst | 1 +
> Documentation/misc-devices/intel/gna.rst | 48 ++
> .../userspace-api/ioctl/ioctl-number.rst | 1 +
> MAINTAINERS | 7 +
> drivers/misc/Kconfig | 1 +
> drivers/misc/Makefile | 1 +
> drivers/misc/intel/gna/Kbuild | 5 +
> drivers/misc/intel/gna/Kconfig | 13 +
> drivers/misc/intel/gna/gna_device.c | 429 ++++++++++++++++
> drivers/misc/intel/gna/gna_device.h | 89 ++++
> drivers/misc/intel/gna/gna_driver.c | 43 ++
> drivers/misc/intel/gna/gna_driver.h | 33 ++
> drivers/misc/intel/gna/gna_hw.c | 125 +++++
> drivers/misc/intel/gna/gna_hw.h | 61 +++
> drivers/misc/intel/gna/gna_ioctl.c | 249 +++++++++
> drivers/misc/intel/gna/gna_ioctl.h | 11 +
> drivers/misc/intel/gna/gna_mem.c | 473 ++++++++++++++++++
> drivers/misc/intel/gna/gna_mem.h | 107 ++++
> drivers/misc/intel/gna/gna_request.c | 463 +++++++++++++++++
> drivers/misc/intel/gna/gna_request.h | 62 +++
> drivers/misc/intel/gna/gna_score.c | 298 +++++++++++
> drivers/misc/intel/gna/gna_score.h | 18 +
> include/uapi/misc/intel/gna.h | 155 ++++++
> 23 files changed, 2693 insertions(+)
> create mode 100644 Documentation/misc-devices/intel/gna.rst
> create mode 100644 drivers/misc/intel/gna/Kbuild
> create mode 100644 drivers/misc/intel/gna/Kconfig
> create mode 100644 drivers/misc/intel/gna/gna_device.c
> create mode 100644 drivers/misc/intel/gna/gna_device.h
> create mode 100644 drivers/misc/intel/gna/gna_driver.c
> create mode 100644 drivers/misc/intel/gna/gna_driver.h
> create mode 100644 drivers/misc/intel/gna/gna_hw.c
> create mode 100644 drivers/misc/intel/gna/gna_hw.h
> create mode 100644 drivers/misc/intel/gna/gna_ioctl.c
> create mode 100644 drivers/misc/intel/gna/gna_ioctl.h
> create mode 100644 drivers/misc/intel/gna/gna_mem.c
> create mode 100644 drivers/misc/intel/gna/gna_mem.h
> create mode 100644 drivers/misc/intel/gna/gna_request.c
> create mode 100644 drivers/misc/intel/gna/gna_request.h
> create mode 100644 drivers/misc/intel/gna/gna_score.c
> create mode 100644 drivers/misc/intel/gna/gna_score.h
> create mode 100644 include/uapi/misc/intel/gna.h
>
> --
> 2.28.0
>
--
With Best Regards,
Andy Shevchenko
On Wed, Mar 24, 2021 at 8:38 PM Maciej Kwapulinski
<[email protected]> wrote:
>
> From: Tomasz Jankowski <[email protected]>
>
> Add a new PCI driver for Intel(R) Gaussian & Neural Accelerator
> with basic support like module loading and unloading. The full
> function of the driver will be added by further changes.
>
> Signed-off-by: Tomasz Jankowski <[email protected]>
> Tested-by: Savo Novakovic <[email protected]>
> Co-developed-by: Jianxun Zhang <[email protected]>
> Signed-off-by: Jianxun Zhang <[email protected]>
> Co-developed-by: Maciej Kwapulinski <[email protected]>
> Signed-off-by: Maciej Kwapulinski <[email protected]>
> ---
> Documentation/misc-devices/index.rst | 1 +
> Documentation/misc-devices/intel/gna.rst | 48 ++++++
> .../userspace-api/ioctl/ioctl-number.rst | 1 +
> MAINTAINERS | 7 +
> drivers/misc/Kconfig | 1 +
> drivers/misc/Makefile | 1 +
> drivers/misc/intel/gna/Kbuild | 5 +
> drivers/misc/intel/gna/Kconfig | 13 ++
> drivers/misc/intel/gna/gna_device.c | 74 +++++++++
> drivers/misc/intel/gna/gna_device.h | 36 ++++
> drivers/misc/intel/gna/gna_driver.c | 39 +++++
> drivers/misc/intel/gna/gna_driver.h | 15 ++
What is the point of having gna/gna_ now? I guess the latter prefix should go.
> include/uapi/misc/intel/gna.h | 155 ++++++++++++++++++
> 13 files changed, 396 insertions(+)
> create mode 100644 Documentation/misc-devices/intel/gna.rst
> create mode 100644 drivers/misc/intel/gna/Kbuild
> create mode 100644 drivers/misc/intel/gna/Kconfig
> create mode 100644 drivers/misc/intel/gna/gna_device.c
> create mode 100644 drivers/misc/intel/gna/gna_device.h
> create mode 100644 drivers/misc/intel/gna/gna_driver.c
> create mode 100644 drivers/misc/intel/gna/gna_driver.h
> create mode 100644 include/uapi/misc/intel/gna.h
>
> diff --git a/Documentation/misc-devices/index.rst b/Documentation/misc-devices/index.rst
> index 64420b3314fe..1b187ee121b0 100644
> --- a/Documentation/misc-devices/index.rst
> +++ b/Documentation/misc-devices/index.rst
> @@ -19,6 +19,7 @@ fit into other categories.
> bh1770glc
> eeprom
> c2port
> + intel/gna
Shouldn't it preserve ordering?
> ibmvmc
> ics932s401
> isl29003
> diff --git a/Documentation/misc-devices/intel/gna.rst b/Documentation/misc-devices/intel/gna.rst
> new file mode 100644
> index 000000000000..9baeec5ceb5c
> --- /dev/null
> +++ b/Documentation/misc-devices/intel/gna.rst
> @@ -0,0 +1,48 @@
> +.. SPDX-License-Identifier: GPL-2.0-only
> +
> +=====================================================
> +Intel(R) Gaussian & Neural Accelerator (Intel(R) GNA)
> +=====================================================
> +
> +Acronyms
> +--------
> +GNA - Gaussian & Neural Accelerator
> +GMM - Gaussian Mixture Model
> +CNN - Convolutional Neural Network
> +RNN - Recurrent Neural Networks
> +DNN - Deep Neural Networks
> +
> +Introduction
> +------------
> +The Intel(R) GNA is an internal PCI fixed device available on several Intel platforms/SoCs.
> +Feature set depends on the Intel chipset SKU.
> +
> +Intel(R) GNA provides hardware accelerated computation for GMMs and Neural Networks.
> +It supports several layer types: affine, recurrent, and convolutional among others.
> +Hardware also provides helper layer types for copying and transposing matrices.
> +
> +Linux Driver
> +------------
> +The driver also registers a character device to expose file operations via dev node.
> +
> +The driver probes/removes PCI device, implements file operations, handles runtime
a PCI device
> +power management, and interacts with hardware through MMIO registers.
> +
> +Multiple processes can independently file many requests to the driver. The hardware
> +processes one request at a time, so pending requests are queued and served in FIFO
> +order.
> +
> +IOCTL
> +-----
> +Intel(R) GNA driver controls the device through IOCTL interfaces.
> +Following IOCTL commands are supported:
> +
> +GNA_IOCTL_PARAM_GET gets driver and device capabilities.
> +
> +GNA_IOCTL_MEMORY_MAP locks user pages and sets up the GNA MMU for DMA transfer.
> +
> +GNA_IOCTL_MEMORY_UNMAP unlocks user pages and releases GNA MMU structures.
> +
> +GNA_IOCTL_COMPUTE submits a request to the device queue.
> +
> +GNA_IOCTL_WAIT blocks and waits on the submitted request.
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index a4c75a28c839..9ec2b32f656a 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -115,6 +115,7 @@ Code Seq# Include File Comments
> 'B' C0-FF advanced bbus <mailto:[email protected]>
> 'C' all linux/soundcard.h conflict!
> 'C' 01-2F linux/capi.h conflict!
> +'C' 01-5F uapi/misc/intel/gna.h conflict!
> 'C' F0-FF drivers/net/wan/cosa.h conflict!
> 'D' all arch/s390/include/asm/dasd.h
> 'D' 40-5F drivers/scsi/dpt/dtpi_ioctl.h
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bfc1b86e3e73..da926aa4523c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8928,6 +8928,13 @@ S: Maintained
> F: Documentation/fb/intelfb.rst
> F: drivers/video/fbdev/intelfb/
>
> +INTEL GNA PCI DRIVER
> +M: Maciej Kwapulinski <[email protected]>
> +S: Maintained
> +F: Documentation/misc-devices/intel/gna.rst
> +F: drivers/misc/intel/gna/*
> +F: include/uapi/misc/intel/gna.h
> +
> INTEL GPIO DRIVERS
> M: Andy Shevchenko <[email protected]>
> L: [email protected]
> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> index fafa8b0d8099..ce3dc5b9f821 100644
> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -481,4 +481,5 @@ source "drivers/misc/ocxl/Kconfig"
> source "drivers/misc/cardreader/Kconfig"
> source "drivers/misc/habanalabs/Kconfig"
> source "drivers/misc/uacce/Kconfig"
> +source "drivers/misc/intel/gna/Kconfig"
> endmenu
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index d23231e73330..5fca2e730d96 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -57,3 +57,4 @@ obj-$(CONFIG_HABANA_AI) += habanalabs/
> obj-$(CONFIG_UACCE) += uacce/
> obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o
> obj-$(CONFIG_HISI_HIKEY_USB) += hisi_hikey_usb.o
> +obj-$(CONFIG_INTEL_GNA) += intel/gna/
> diff --git a/drivers/misc/intel/gna/Kbuild b/drivers/misc/intel/gna/Kbuild
> new file mode 100644
> index 000000000000..5d3becc71683
> --- /dev/null
> +++ b/drivers/misc/intel/gna/Kbuild
> @@ -0,0 +1,5 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +intel_gna-y := gna_device.o gna_driver.o
> +
> +obj-$(CONFIG_INTEL_GNA) += intel_gna.o
> diff --git a/drivers/misc/intel/gna/Kconfig b/drivers/misc/intel/gna/Kconfig
> new file mode 100644
> index 000000000000..c3b768a40684
> --- /dev/null
> +++ b/drivers/misc/intel/gna/Kconfig
> @@ -0,0 +1,13 @@
> +#
> +# Intel(R) Gaussian & Neural Accelerator (Intel(R) GNA)
> +#
> +
> +config INTEL_GNA
> + tristate "Intel(R) Gaussian & Neural Accelerator (Intel(R) GNA)"
> + depends on X86_64 && PCI
Why 64?
> + help
> + This option enables the Intel(R) Gaussian & Neural Accelerator
> + (Intel(R) GNA) driver: intel_gna.
> + User space interface is defined in include/uapi/misc/intel/gna.h, while
> + information about functionality is in
> + Documentation/misc-devices/intel/gna.rst
> diff --git a/drivers/misc/intel/gna/gna_device.c b/drivers/misc/intel/gna/gna_device.c
> new file mode 100644
> index 000000000000..431113297879
> --- /dev/null
> +++ b/drivers/misc/intel/gna/gna_device.c
> @@ -0,0 +1,74 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright(c) 2017-2021 Intel Corporation
> +
> +#include <linux/module.h>
> +#include <linux/pci.h>
> +
> +#include "gna_device.h"
> +#include "gna_driver.h"
> +#define GNA_BAR0 0
Why? Expecting something different?
> +static void gna_dev_init(struct gna_private *gna_priv, struct pci_dev *pcidev,
> + const struct pci_device_id *pci_id)
> +{
> + pci_set_drvdata(pcidev, gna_priv);
You assign this and make it visible before assigning fields. Not good.
> + gna_priv->parent = &pcidev->dev;
> + gna_priv->pdev = pcidev;
> + gna_priv->info = *(struct gna_drv_info *)pci_id->driver_data;
> + gna_priv->drv_priv = &gna_drv_priv;
> +}
Why can't this be inlined into ->probe()?
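To illustrate the ordering concern, here is a userspace sketch that
fills every field first and publishes the pointer last. All types and
names are hypothetical stand-ins; the final assignment only models what
pci_set_drvdata() does in the driver.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct pci_dev and struct gna_private */
struct fake_pci_dev {
	void *drvdata;
};

struct fake_gna_private {
	struct fake_pci_dev *pdev;
	unsigned int hwid;
};

/* Fill every field first, publish the pointer last */
void fake_gna_dev_init(struct fake_gna_private *priv,
		       struct fake_pci_dev *pdev, unsigned int hwid)
{
	priv->pdev = pdev;
	priv->hwid = hwid;

	/* models pci_set_drvdata(): from here on others may see priv */
	pdev->drvdata = priv;
}
```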
> +int gna_probe(struct pci_dev *pcidev, const struct pci_device_id *pci_id)
> +{
> + struct gna_private *gna_priv;
> + void __iomem *const *iomap;
> + unsigned long phys_len;
> + phys_addr_t phys;
> + int ret;
> +
> + ret = pcim_enable_device(pcidev);
> + if (ret) {
> + dev_err(&pcidev->dev, "pci device can't be enabled\n");
> + return ret;
> + }
> +
> + ret = pcim_iomap_regions(pcidev, 1 << GNA_BAR0, GNA_DV_NAME);
> + if (ret) {
> + dev_err(&pcidev->dev, "cannot iomap regions\n");
> + return ret;
> + }
> + phys = pci_resource_start(pcidev, GNA_BAR0);
> + phys_len = pci_resource_len(pcidev, GNA_BAR0);
> +
> + dev_info(&pcidev->dev, "physical base address %pap, %lu bytes\n",
> + &phys, phys_len);
Why is it important?
> + iomap = pcim_iomap_table(pcidev);
> + if (!iomap) {
> + dev_err(&pcidev->dev, "failed to iomap table\n");
> + return -ENODEV;
> + }
This conditional is redundant.
> + gna_priv = devm_kzalloc(&pcidev->dev, sizeof(*gna_priv), GFP_KERNEL);
> + if (!gna_priv)
> + return -ENOMEM;
> +
> + gna_priv->bar0_base = iomap[GNA_BAR0];
> +
> + dev_dbg(&pcidev->dev, "bar0 memory address: %p\n", gna_priv->bar0_base);
> +
> + ret = dma_set_mask(&pcidev->dev, DMA_BIT_MASK(64));
> + if (ret) {
> + dev_err(&pcidev->dev, "pci_set_dma_mask returned error %d\n", ret);
Typo in the message.
Why not try to fall back to 32 bit mask and support 32-bit builds?
> + return ret;
> + }
> +
> + pci_set_master(pcidev);
> +
> + gna_dev_init(gna_priv, pcidev, pci_id);
> +
> + return 0;
> +}
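Regarding the DMA mask comment in gna_probe(): a minimal userspace
sketch of "try 64-bit, fall back to 32-bit". fake_dma_set_mask() and
struct fake_dev are hypothetical models of dma_set_mask() and the
device; only the DMA_BIT_MASK() macro has the same shape as the kernel's.

```c
#include <errno.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK() */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical platform model: widest mask the "bus" accepts */
struct fake_dev {
	uint64_t max_mask;
	uint64_t dma_mask;
};

/* Stand-in for dma_set_mask(): 0 on success, -EIO if unsupported */
int fake_dma_set_mask(struct fake_dev *dev, uint64_t mask)
{
	if (mask > dev->max_mask)
		return -EIO;
	dev->dma_mask = mask;
	return 0;
}

/* Try 64-bit addressing first, then fall back to 32-bit */
int fake_gna_set_dma_mask(struct fake_dev *dev)
{
	int ret = fake_dma_set_mask(dev, DMA_BIT_MASK(64));

	if (ret)
		ret = fake_dma_set_mask(dev, DMA_BIT_MASK(32));
	return ret;
}
```

Whether the hardware can actually work with 32-bit addressing is not
stated in this thread, so treat the fallback as an option to evaluate.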
> diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
> new file mode 100644
> index 000000000000..d0b47f75f47f
> --- /dev/null
> +++ b/drivers/misc/intel/gna/gna_device.h
> @@ -0,0 +1,36 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/* Copyright(c) 2017-2021 Intel Corporation */
> +
> +#ifndef __GNA_DEVICE_H__
> +#define __GNA_DEVICE_H__
> +
> +#include <linux/types.h>
> +
> +struct gna_driver_private;
> +struct pci_device_id;
> +struct pci_dev;
> +struct device;
> +
> +struct gna_drv_info {
> + u32 hwid;
> + u32 num_pagetables;
> + u32 num_page_entries;
> + u32 max_layer_count;
> + u64 max_hw_mem;
> +};
> +
> +struct gna_private {
> + struct gna_driver_private *drv_priv;
> + struct pci_dev *pdev;
> + /* pdev->dev */
> + struct device *parent;
This is a mess. One struct device is enough.
Comment doesn't bring any value and is confusing.
> + /* device related resources */
> + void __iomem *bar0_base;
> + struct gna_drv_info info;
> +};
> +int gna_probe(struct pci_dev *dev, const struct pci_device_id *id);
This looks wrong. If you provide ->probe() function for others, it's
probably generalized enough and mustn't be PCI or any other bus
dependent. See below.
> +#endif /* __GNA_DEVICE_H__ */
> diff --git a/drivers/misc/intel/gna/gna_driver.c b/drivers/misc/intel/gna/gna_driver.c
> new file mode 100644
> index 000000000000..f4922a388be7
> --- /dev/null
> +++ b/drivers/misc/intel/gna/gna_driver.c
> @@ -0,0 +1,39 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright(c) 2017-2021 Intel Corporation
> +
> +#include <linux/jiffies.h>
> +#include <linux/module.h>
> +#include <linux/pci.h>
> +
> +#include "gna_device.h"
> +#include "gna_driver.h"
> +
> +static int recovery_timeout = 60;
> +module_param(recovery_timeout, int, 0644);
> +MODULE_PARM_DESC(recovery_timeout, "Recovery timeout in seconds");
Why module parameters?!
> +struct gna_driver_private gna_drv_priv;
Global?!
> +static struct pci_driver gna_driver = {
> + .name = GNA_DV_NAME,
> + .probe = gna_probe,
> +};
> +
> +static int __init gna_drv_init(void)
> +{
> + gna_drv_priv.recovery_timeout_jiffies = msecs_to_jiffies(recovery_timeout * 1000);
> +
> + return pci_register_driver(&gna_driver);
> +}
> +
> +static void __exit gna_drv_exit(void)
> +{
> + pci_unregister_driver(&gna_driver);
> +}
> +
> +module_init(gna_drv_init);
> +module_exit(gna_drv_exit);
What is this entire module for?!
Really you should not split this like above. This belongs to the PCI
glue driver.
> +MODULE_AUTHOR("Intel Corporation");
> +MODULE_DESCRIPTION("Intel(R) Gaussian & Neural Accelerator (Intel(R) GNA) Driver");
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/misc/intel/gna/gna_driver.h b/drivers/misc/intel/gna/gna_driver.h
> new file mode 100644
> index 000000000000..ed507ea10866
> --- /dev/null
> +++ b/drivers/misc/intel/gna/gna_driver.h
> @@ -0,0 +1,15 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/* Copyright(c) 2017-2021 Intel Corporation */
> +
> +#ifndef __GNA_DRIVER_H__
> +#define __GNA_DRIVER_H__
> +
> +#define GNA_DV_NAME "intel_gna"
> +
> +struct gna_driver_private {
> + int recovery_timeout_jiffies;
> +};
> +
> +extern struct gna_driver_private gna_drv_priv;
> +
> +#endif /* __GNA_DRIVER_H__ */
> diff --git a/include/uapi/misc/intel/gna.h b/include/uapi/misc/intel/gna.h
> new file mode 100644
> index 000000000000..a7e435b74a0a
> --- /dev/null
> +++ b/include/uapi/misc/intel/gna.h
> @@ -0,0 +1,155 @@
> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
> +/* Copyright(c) 2017-2021 Intel Corporation */
> +
> +#ifndef _UAPI_GNA_H_
> +#define _UAPI_GNA_H_
> +
> +#if defined(__cplusplus)
> +extern "C" {
> +#endif
> +#include <linux/types.h>
> +#include <linux/ioctl.h>
> +#include <linux/const.h>
Ordered?
> +#ifndef __user
> +#define __user
> +#endif
What for?
> +/* Operation modes */
> +#define GNA_MODE_GMM 0
> +#define GNA_MODE_XNN 1
> +
> +#define GNA_PARAM_DEVICE_ID 1
> +#define GNA_PARAM_RECOVERY_TIMEOUT 2
> +#define GNA_PARAM_DEVICE_TYPE 3
> +#define GNA_PARAM_INPUT_BUFFER_S 4
> +
> +#define GNA_STS_SCORE_COMPLETED _BITUL(0)
> +#define GNA_STS_STATISTICS_VALID _BITUL(3)
> +#define GNA_STS_PCI_MMU_ERR _BITUL(4)
> +#define GNA_STS_PCI_DMA_ERR _BITUL(5)
> +#define GNA_STS_PCI_UNEXCOMPL_ERR _BITUL(6)
> +#define GNA_STS_VA_OOR _BITUL(7)
> +#define GNA_STS_PARAM_OOR _BITUL(8)
> +#define GNA_STS_SATURATE _BITUL(17)
> +
> +#define GNA_ERROR (GNA_STS_PCI_DMA_ERR | \
Better to start value on the new line.
> + GNA_STS_PCI_MMU_ERR | \
> + GNA_STS_PCI_UNEXCOMPL_ERR | \
> + GNA_STS_PARAM_OOR | \
> + GNA_STS_VA_OOR)
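A sketch of the layout meant here, with the value starting on its own
line. The bit positions are copied from the header above; _BITUL() is
redefined locally (an assumption made only so the fragment builds
without <linux/const.h>).

```c
/* Local stand-in for _BITUL() from <linux/const.h> */
#ifndef _BITUL
#define _BITUL(x) (1UL << (x))
#endif

#define GNA_STS_PCI_MMU_ERR		_BITUL(4)
#define GNA_STS_PCI_DMA_ERR		_BITUL(5)
#define GNA_STS_PCI_UNEXCOMPL_ERR	_BITUL(6)
#define GNA_STS_VA_OOR			_BITUL(7)
#define GNA_STS_PARAM_OOR		_BITUL(8)
#define GNA_STS_SATURATE		_BITUL(17)

/* Value starts on the line after the macro name */
#define GNA_ERROR \
	(GNA_STS_PCI_DMA_ERR |		\
	 GNA_STS_PCI_MMU_ERR |		\
	 GNA_STS_PCI_UNEXCOMPL_ERR |	\
	 GNA_STS_PARAM_OOR |		\
	 GNA_STS_VA_OOR)
```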
> +
> +#define GNA_DEV_TYPE_0_9 0x09
> +#define GNA_DEV_TYPE_1_0 0x10
> +#define GNA_DEV_TYPE_2_0 0x20
> +
> +/*
> + * Structure describes part of memory to be overwritten before starting GNA
> + */
> +struct gna_memory_patch {
> + /* offset from targeted memory */
> + __u64 offset;
> +
> + __u64 size;
> + __u64 value;
> +};
> +
> +struct gna_buffer {
> + __u64 memory_id;
> +
> + __u64 offset;
> + __u64 size;
> +
> + __u64 patch_count;
> + __u64 patches_ptr;
> +};
> +
> +/*
> + * Driver performance timestamps in nanoseconds.
> + * Values are relative to system boot and do not advance during suspend.
> + */
> +struct gna_drv_perf {
> + __u64 pre_processing; /* driver starts pre-processing */
> + __u64 processing; /* hw starts processing */
> + __u64 hw_completed; /* hw finishes processing */
> + __u64 completion; /* driver finishes post-processing */
> +};
> +
> +struct gna_hw_perf {
> + __u64 total;
> + __u64 stall;
> +};
> +
> +struct gna_compute_cfg {
> + __u32 layer_base;
> + __u32 layer_count;
> +
> + /* List of GNA memory buffers */
> + __u64 buffers_ptr;
> + __u64 buffer_count;
> +
> + __u8 active_list_on;
> + __u8 gna_mode;
> + __u8 hw_perf_encoding;
> + __u8 pad[5];
> +};
> +
> +union gna_parameter {
> + struct {
> + __u64 id;
> + } in;
> +
> + struct {
> + __u64 value;
> + } out;
> +};
> +
> +union gna_memory_map {
> + struct {
> + __u64 address;
> + __u32 size;
> + __u32 pad;
> + } in;
> +
> + struct {
> + __u64 memory_id;
> + } out;
> +};
> +
> +union gna_compute {
> + struct {
> + struct gna_compute_cfg config;
> + } in;
> +
> + struct {
> + __u64 request_id;
> + } out;
> +};
> +
> +union gna_wait {
> + struct {
> + __u64 request_id;
> + __u32 timeout;
> + __u32 pad;
> + } in;
> +
> + struct {
> + __u32 hw_status;
> + __u32 pad;
> + struct gna_drv_perf drv_perf;
> + struct gna_hw_perf hw_perf;
> + } out;
> +};
For all unions:
How do you know which branch is used (out, in)? What field and where
in the ABI points to that?
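For context, the ioctl handlers elsewhere in this series copy the whole
union in with copy_from_user(), read .in, and then copy the whole union
back with copy_to_user() after filling .out. So the active branch is
implied by the phase of the call and nothing in the ABI tags it. A
userspace sketch of that implicit convention follows. fake_getparam()
and the returned value are made up for illustration.

```c
#include <stdint.h>

/* Same layout as union gna_parameter in the patch, with stdint types */
union gna_parameter {
	struct {
		uint64_t id;
	} in;

	struct {
		uint64_t value;
	} out;
};

/*
 * Hypothetical handler model: consumes .in, then overwrites the same
 * bytes with .out. After it returns, .in no longer holds the request.
 */
int fake_getparam(union gna_parameter *p)
{
	uint64_t id = p->in.id;	/* read the input before touching .out */

	if (id != 1)
		return -1;

	p->out.value = 0x5a11;	/* made-up value, illustration only */
	return 0;
}
```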
> +#define GNA_GET_PARAMETER _IOWR('C', 0x01, union gna_parameter)
> +#define GNA_MAP_MEMORY _IOWR('C', 0x02, union gna_memory_map)
> +#define GNA_UNMAP_MEMORY _IOWR('C', 0x03, __u64)
> +#define GNA_COMPUTE _IOWR('C', 0x04, union gna_compute)
> +#define GNA_WAIT _IOWR('C', 0x05, union gna_wait)
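As a reading aid for the command definitions above, this sketch shows
how an _IOWR() number packs direction, type letter, sequence number and
argument size, using the decoding macros from <linux/ioctl.h>.
gna_cmd_nr() is a hypothetical helper and not part of the patch.

```c
#include <linux/ioctl.h>
#include <linux/types.h>

/* Same definition as in the UAPI header above */
#define GNA_UNMAP_MEMORY _IOWR('C', 0x03, __u64)

/*
 * Hypothetical helper: return the sequence number of a command that
 * uses the 'C' type letter, or -1 for any other type.
 */
int gna_cmd_nr(unsigned int cmd)
{
	if (_IOC_TYPE(cmd) != 'C')
		return -1;
	return (int)_IOC_NR(cmd);
}
```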
> +
> +#if defined(__cplusplus)
> +}
> +#endif
> +
> +#endif /* _UAPI_GNA_H_ */
> --
> 2.28.0
>
--
With Best Regards,
Andy Shevchenko
On Wed, Mar 24, 2021 at 8:38 PM Maciej Kwapulinski
<[email protected]> wrote:
>
> From: Tomasz Jankowski <[email protected]>
>
> Add definitions and utilities to interact with the hardware
> device.
>
> Signed-off-by: Tomasz Jankowski <[email protected]>
> Tested-by: Savo Novakovic <[email protected]>
> Co-developed-by: Jianxun Zhang <[email protected]>
> Signed-off-by: Jianxun Zhang <[email protected]>
> Co-developed-by: Maciej Kwapulinski <[email protected]>
> Signed-off-by: Maciej Kwapulinski <[email protected]>
> ---
> drivers/misc/intel/gna/Kbuild | 2 +-
> drivers/misc/intel/gna/gna_device.h | 4 +
> drivers/misc/intel/gna/gna_hw.c | 125 ++++++++++++++++++++++++++++
> drivers/misc/intel/gna/gna_hw.h | 62 ++++++++++++++
> 4 files changed, 192 insertions(+), 1 deletion(-)
> create mode 100644 drivers/misc/intel/gna/gna_hw.c
> create mode 100644 drivers/misc/intel/gna/gna_hw.h
>
> diff --git a/drivers/misc/intel/gna/Kbuild b/drivers/misc/intel/gna/Kbuild
> index 5d3becc71683..0cf083bb211a 100644
> --- a/drivers/misc/intel/gna/Kbuild
> +++ b/drivers/misc/intel/gna/Kbuild
> @@ -1,5 +1,5 @@
> # SPDX-License-Identifier: GPL-2.0-only
>
> -intel_gna-y := gna_device.o gna_driver.o
> +intel_gna-y := gna_device.o gna_driver.o gna_hw.o
>
> obj-$(CONFIG_INTEL_GNA) += intel_gna.o
> diff --git a/drivers/misc/intel/gna/gna_device.h b/drivers/misc/intel/gna/gna_device.h
> index d0b47f75f47f..39dc03d53feb 100644
> --- a/drivers/misc/intel/gna/gna_device.h
> +++ b/drivers/misc/intel/gna/gna_device.h
> @@ -6,6 +6,8 @@
>
> #include <linux/types.h>
>
> +#include "gna_hw.h"
> +
> struct gna_driver_private;
> struct pci_device_id;
> struct pci_dev;
> @@ -17,6 +19,8 @@ struct gna_drv_info {
> u32 num_page_entries;
> u32 max_layer_count;
> u64 max_hw_mem;
> +
> + struct gna_desc_info desc_info;
> };
>
> struct gna_private {
> diff --git a/drivers/misc/intel/gna/gna_hw.c b/drivers/misc/intel/gna/gna_hw.c
> new file mode 100644
> index 000000000000..7d2f4ef00136
> --- /dev/null
> +++ b/drivers/misc/intel/gna/gna_hw.c
> @@ -0,0 +1,125 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright(c) 2017-2021 Intel Corporation
> +
> +#include <linux/pci.h>
> +
> +#include <uapi/misc/intel/gna.h>
> +
> +#include "gna_device.h"
> +#include "gna_driver.h"
> +#include "gna_hw.h"
> +
> +int gna_parse_hw_status(struct gna_private *gna_priv, u32 hw_status)
> +{
> + int status;
Redundant. See below.
> +
> + if (hw_status & GNA_ERROR) {
> + dev_dbg(&gna_priv->pdev->dev, "GNA completed with errors: %#x\n", hw_status);
Exactly my point, you need only one struct device, w/o these tricks.
> + status = -EIO;
return -EIO;
> + } else if (hw_status & GNA_STS_SCORE_COMPLETED) {
...drop else
> + status = 0;
> + dev_dbg(&gna_priv->pdev->dev, "GNA completed successfully: %#x\n", hw_status);
> + } else {
> + dev_err(&gna_priv->pdev->dev, "GNA not completed, status: %#x\n", hw_status);
> + status = -ENODATA;
> + }
> +
> + return status;
As above.
> +}
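Putting the comments on gna_parse_hw_status() together, a userspace
sketch of the early-return shape. The logging calls are left out and
the function takes only the status word, both simplifications made for
the sketch. The mask values are derived from the UAPI bit definitions
quoted earlier in this thread.

```c
#include <errno.h>

#define GNA_STS_SCORE_COMPLETED	0x1u	/* bit 0 */
#define GNA_ERROR		0x1f0u	/* bits 4..8, as composed in the UAPI header */

/* No local status variable and no else branches */
int gna_parse_hw_status(unsigned int hw_status)
{
	if (hw_status & GNA_ERROR)
		return -EIO;

	if (hw_status & GNA_STS_SCORE_COMPLETED)
		return 0;

	return -ENODATA;
}
```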
> +
> +void gna_print_error_status(struct gna_private *gna_priv, u32 hw_status)
> +{
> + if (hw_status & GNA_STS_PARAM_OOR)
> + dev_dbg(&gna_priv->pdev->dev, "GNA error: Param Out of Range Error\n");
> +
> + if (hw_status & GNA_STS_VA_OOR)
> + dev_dbg(&gna_priv->pdev->dev, "GNA error: VA Out of Range Error\n");
> +
> + if (hw_status & GNA_STS_PCI_MMU_ERR)
> + dev_dbg(&gna_priv->pdev->dev, "GNA error: PCI MMU Error\n");
> +
> + if (hw_status & GNA_STS_PCI_DMA_ERR)
> + dev_dbg(&gna_priv->pdev->dev, "GNA error: PCI DMA Error\n");
> +
> + if (hw_status & GNA_STS_PCI_UNEXCOMPL_ERR)
> + dev_dbg(&gna_priv->pdev->dev, "GNA error: PCI Unexpected Completion Error\n");
> +
> + if (hw_status & GNA_STS_SATURATE)
> + dev_dbg(&gna_priv->pdev->dev, "GNA error: Saturation Reached !\n");
> +}
> +
> +bool gna_hw_perf_enabled(struct gna_private *gna_priv)
> +{
> + void __iomem *addr = gna_priv->bar0_base;
> + u32 ctrl = gna_reg_read(addr, GNA_MMIO_CTRL);
If you want to have better helpers, supply priv directly to them. Look
into other (recent enough) drivers in the kernel how they do it.
Ditto for all cases.
> + return FIELD_GET(GNA_CTRL_COMP_STATS_EN, ctrl) ? true : false;
Missed bitfield.h.
Redundant ternary. Use !!
> +}
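A minimal sketch of the "!!" form, as plain user-space C. The mask is
GENMASK(15, 12) from the quoted gna_hw.h written out by hand; the helper
name gna_ctrl_perf_enabled() is hypothetical and takes an already-read CTRL
value so that the example stays self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same field as GENMASK(15, 12) in the quoted gna_hw.h, spelled out by hand. */
#define GNA_CTRL_COMP_STATS_EN 0x0000f000u

/* Hypothetical helper: operates on an already-read CTRL register value. */
static bool gna_ctrl_perf_enabled(uint32_t ctrl)
{
	/* '!!' collapses any non-zero field value to 1; no ternary needed. */
	return !!(ctrl & GNA_CTRL_COMP_STATS_EN);
}
```

Any non-zero encoding in the field yields true; bits outside the field are
ignored.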
> +
> +void gna_start_scoring(struct gna_private *gna_priv, void __iomem *addr,
> + struct gna_compute_cfg *compute_cfg)
> +{
> + u32 ctrl = gna_reg_read(addr, GNA_MMIO_CTRL);
> +
> + ctrl |= GNA_CTRL_START_ACCEL | GNA_CTRL_COMP_INT_EN | GNA_CTRL_ERR_INT_EN;
> +
> + ctrl &= ~GNA_CTRL_COMP_STATS_EN;
> + ctrl |= FIELD_PREP(GNA_CTRL_COMP_STATS_EN,
> + compute_cfg->hw_perf_encoding & FIELD_MAX(GNA_CTRL_COMP_STATS_EN));
> +
> + ctrl &= ~GNA_CTRL_ACTIVE_LIST_EN;
> + ctrl |= FIELD_PREP(GNA_CTRL_ACTIVE_LIST_EN,
> + compute_cfg->active_list_on & FIELD_MAX(GNA_CTRL_ACTIVE_LIST_EN));
> +
> + ctrl &= ~GNA_CTRL_OP_MODE;
> + ctrl |= FIELD_PREP(GNA_CTRL_OP_MODE,
> + compute_cfg->gna_mode & FIELD_MAX(GNA_CTRL_OP_MODE));
> +
> + gna_reg_write(addr, GNA_MMIO_CTRL, ctrl);
> +
> + dev_dbg(&gna_priv->pdev->dev, "scoring started...\n");
> +}
> +
> +static void gna_clear_saturation(struct gna_private *gna_priv)
> +{
> + void __iomem *addr = gna_priv->bar0_base;
> + u32 val;
> +
> + val = gna_reg_read(addr, GNA_MMIO_STS);
> + if (val & GNA_STS_SATURATE) {
> + dev_dbg(&gna_priv->pdev->dev, "saturation reached\n");
> + dev_dbg(&gna_priv->pdev->dev, "status: %#x\n", val);
> +
> + val = val & GNA_STS_SATURATE;
> + gna_reg_write(addr, GNA_MMIO_STS, val);
> + }
> +}
> +
> +void gna_abort_hw(struct gna_private *gna_priv)
> +{
> + void __iomem *addr = gna_priv->bar0_base;
> + u32 val;
> + int i;
unsigned.
> + /* saturation bit in the GNA status register needs
> + * to be explicitly cleared.
> + */
> + gna_clear_saturation(gna_priv);
> +
> + val = gna_reg_read(addr, GNA_MMIO_STS);
> + dev_dbg(&gna_priv->pdev->dev, "status before abort: %#x\n", val);
> +
> + val = gna_reg_read(addr, GNA_MMIO_CTRL);
> + val |= GNA_CTRL_ABORT_CLR_ACCEL;
> + gna_reg_write(addr, GNA_MMIO_CTRL, val);
> +
> + i = 100;
> + do {
> + val = gna_reg_read(addr, GNA_MMIO_STS);
> + if ((val & 0x1) == 0)
> + break;
> + } while (--i);
NIH of readx_poll_timeout() from iopoll.h.
> + if (i == 0)
> + dev_err(&gna_priv->pdev->dev, "abort did not complete\n");
> +}
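For reference, the pattern that readx_poll_timeout() packages up is "read,
test a condition, give up with -ETIMEDOUT after a budget". The sketch below
is only a simplified user-space model of that idea: it counts attempts
instead of microseconds and never sleeps, and struct fake_dev,
fake_read_sts() and poll_sts_clear() are hypothetical names, not kernel API.
In the driver itself, readl_poll_timeout(addr, val, cond, sleep_us,
timeout_us) from <linux/iopoll.h> would replace the open-coded loop.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the device's MMIO status register. */
struct fake_dev {
	uint32_t sts;            /* bit 0 stays set while the abort is pending */
	unsigned int busy_reads; /* reads left before bit 0 clears */
};

static uint32_t fake_read_sts(struct fake_dev *dev)
{
	if (dev->busy_reads)
		dev->busy_reads--;
	else
		dev->sts &= ~1u;
	return dev->sts;
}

/*
 * Poll until (status & mask) == 0 or the attempt budget runs out.
 * Returns 0 on success and -ETIMEDOUT otherwise; the last value read
 * is left in *val either way.
 */
static int poll_sts_clear(struct fake_dev *dev, uint32_t mask,
			  unsigned int max_tries, uint32_t *val)
{
	while (max_tries--) {
		*val = fake_read_sts(dev);
		if (!(*val & mask))
			return 0;
	}
	return -ETIMEDOUT;
}
```

With a helper of this shape the caller checks one return code instead of
inspecting the loop counter after the loop.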
> diff --git a/drivers/misc/intel/gna/gna_hw.h b/drivers/misc/intel/gna/gna_hw.h
> new file mode 100644
> index 000000000000..dd682f95094e
> --- /dev/null
> +++ b/drivers/misc/intel/gna/gna_hw.h
> @@ -0,0 +1,62 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/* Copyright(c) 2017-2021 Intel Corporation */
> +
> +#ifndef __GNA_HW_H__
> +#define __GNA_HW_H__
> +
> +#include <linux/bits.h>
> +#include <linux/bitfield.h>
No user of it here.
> +#include <linux/interrupt.h>
Ditto.
> +#include <linux/io.h>
> +/* GNA MMIO registers */
> +#define GNA_MMIO_STS 0x80
> +#define GNA_MMIO_CTRL 0x84
> +#define GNA_MMIO_PTC 0x8C
> +#define GNA_MMIO_PSC 0x90
> +#define GNA_MMIO_DESBASE 0xB0
> +#define GNA_MMIO_IBUFFS 0xB4
> +
> +#define GNA_PT_ENTRY_SIZE 4
> +/* there are up to 1024 32-bit pointers in one page in Page Table (L1) */
> +#define GNA_PT_LENGTH (PAGE_SIZE / GNA_PT_ENTRY_SIZE)
Missed header for PAGE_SIZE.
> +#define GNA_PGDIRN_LEN 64
> +#define GNA_PGDIR_ENTRIES 1024 /* 32-bit page addresses */
> +#define GNA_PGDIR_INVALID 1
> +
> +#define GNA_CTRL_START_ACCEL BIT(0)
> +#define GNA_CTRL_ACTIVE_LIST_EN BIT(1)
> +#define GNA_CTRL_ABORT_CLR_ACCEL BIT(2)
> +#define GNA_CTRL_OP_MODE GENMASK(6, 5)
> +#define GNA_CTRL_COMP_INT_EN BIT(8)
> +#define GNA_CTRL_ERR_INT_EN BIT(10)
> +#define GNA_CTRL_COMP_STATS_EN GENMASK(15, 12)
> +
> +struct gna_mmu_info {
> + u32 vamax_size;
> + u32 rsvd_size;
> + u32 pd_size;
> +};
Missed types.h.
> +struct gna_desc_info {
> + u32 rsvd_size;
> + u32 cfg_size;
> + u32 desc_size;
> + struct gna_mmu_info mmu_info;
> +};
> +
> +struct gna_private;
> +struct gna_compute_cfg;
> +
> +void gna_abort_hw(struct gna_private *gna_priv);
> +bool gna_hw_perf_enabled(struct gna_private *gna_priv);
> +int gna_parse_hw_status(struct gna_private *gna_priv, u32 hw_status);
> +void gna_print_error_status(struct gna_private *gna_priv, u32 hw_status);
> +void gna_start_scoring(struct gna_private *gna_priv, void __iomem *addr,
Missed header for __iomem (but I guess it's guaranteed to be included
by types.h, double check this).
> + struct gna_compute_cfg *compute_cfg);
> +
> +#define gna_reg_read(addr, offset) readl((addr) + (offset))
> +#define gna_reg_write(addr, offset, value) writel((value), (addr) + (offset))
No point. And make them functions, not macros.
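A rough sketch of function-style accessors that take priv directly. This is
a user-space model under stated assumptions: struct gna_private here is a
hypothetical stand-in whose array plays the role of BAR0, and the plain
array accesses stand where readl() / writel() would be in the kernel.

```c
#include <stdint.h>

#define GNA_MMIO_STS  0x80
#define GNA_MMIO_CTRL 0x84

/* Hypothetical stand-in for struct gna_private; the array models BAR0. */
struct gna_private {
	uint32_t bar0[64];
};

/* Functions give type checking on every argument, which macros do not. */
static inline uint32_t gna_reg_read(struct gna_private *gna_priv, uint32_t reg)
{
	return gna_priv->bar0[reg / sizeof(uint32_t)]; /* readl() in the kernel */
}

static inline void gna_reg_write(struct gna_private *gna_priv, uint32_t reg,
				 uint32_t val)
{
	gna_priv->bar0[reg / sizeof(uint32_t)] = val; /* writel() in the kernel */
}
```

Callers then no longer need a local "void __iomem *addr" just to pass it to
the helper.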
> +
> +#endif // __GNA_HW_H__
> --
> 2.28.0
>
--
With Best Regards,
Andy Shevchenko
On Wed, Mar 24, 2021 at 7:39 PM Maciej Kwapulinski
<[email protected]> wrote:
> This submission is a kernel driver to support Intel(R) Gaussian & Neural
> Accelerator (Intel(R) GNA). Intel(R) GNA is a PCI-based neural co-processor
> available on multiple Intel platforms.
I clearly remember Olof Johansson talking about the potential need
of creating a kernel subsystem for inference engines, so I believe he
wants to be in on this discussion.
There is already misc/habanalabs, and I personally feel this is already
going in the same direction as did pin control before we standardized
it (somewhat), with vendors claiming they are all necessarily different.
So are they necessarily different? New frontiers in the Wild West,
every vendor shooting from the hip without any attempts at
standardizing this thing?
Habanalabs was first at this and they made it in. Has there been
any attempt to see if the two drivers could actually share code or
have anything in common? Could they share interfaces to userspace?
That kind of thing.
In the end, what kernel users want is to be able to write userspace
software that makes use of any kind of inference/statistics engine
without having to worry about the underlying hardware; this is
what abstractions are for.
> The driver works with Intel(R) libraries in user space. The Intel(R) driver
> exposes a few IOCTL interfaces for use by libraries in user space. The
> libraries are open sourced and are available at:
> https://github.com/intel/gna
This is nice.
Have you made any attempts to cooperate with anyone else in the
world on this, or is this Intel's personal playground?
Yours,
Linus Walleij
Andy Shevchenko <[email protected]> writes:
> On Wed, Mar 24, 2021 at 8:38 PM Maciej Kwapulinski
> <[email protected]> wrote:
>> +#define gna_reg_write(addr, offset, value) writel((value), (addr) + (offset))
>
> No point. And make them functions, not macros.
>
>> +
>> +#endif // __GNA_HW_H__
>> --
>> 2.28.0
>>
Andy, Thank You for all Your comments on these two patches.
I'm starting to work on them. Just as with the v1 patch series, I'll
get back to You in case of any questions.
Regards,
Maciej
Andy Shevchenko <[email protected]> writes:
> On Wed, Mar 24, 2021 at 8:38 PM Maciej Kwapulinski
> <[email protected]> wrote:
>>
....
>> diff --git a/include/uapi/misc/intel/gna.h b/include/uapi/misc/intel/gna.h
>> new file mode 100644
>> index 000000000000..a7e435b74a0a
>> --- /dev/null
>> +++ b/include/uapi/misc/intel/gna.h
>> @@ -0,0 +1,155 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
>> +/* Copyright(c) 2017-2021 Intel Corporation */
>> +
>> +#ifndef _UAPI_GNA_H_
>> +#define _UAPI_GNA_H_
>> +
>> +#if defined(__cplusplus)
>> +extern "C" {
>> +#endif
>
>> +#include <linux/types.h>
>> +#include <linux/ioctl.h>
>> +#include <linux/const.h>
>
> Ordered?
>
What do You mean?
>>
......
>> +struct gna_compute_cfg {
>> + __u32 layer_base;
>> + __u32 layer_count;
>> +
>> + /* List of GNA memory buffers */
>> + __u64 buffers_ptr;
>> + __u64 buffer_count;
>> +
>> + __u8 active_list_on;
>> + __u8 gna_mode;
>> + __u8 hw_perf_encoding;
>> + __u8 pad[5];
>> +};
>> +
>> +union gna_parameter {
>> + struct {
>> + __u64 id;
>> + } in;
>> +
>> + struct {
>> + __u64 value;
>> + } out;
>> +};
>> +
>> +union gna_memory_map {
>> + struct {
>> + __u64 address;
>> + __u32 size;
>> + __u32 pad;
>> + } in;
>> +
>> + struct {
>> + __u64 memory_id;
>> + } out;
>> +};
>> +
>> +union gna_compute {
>> + struct {
>> + struct gna_compute_cfg config;
>> + } in;
>> +
>> + struct {
>> + __u64 request_id;
>> + } out;
>> +};
>> +
>> +union gna_wait {
>> + struct {
>> + __u64 request_id;
>> + __u32 timeout;
>> + __u32 pad;
>> + } in;
>> +
>> + struct {
>> + __u32 hw_status;
>> + __u32 pad;
>> + struct gna_drv_perf drv_perf;
>> + struct gna_hw_perf hw_perf;
>> + } out;
>> +};
>
> For all unions:
> How do you know which branch is used (out, in)? What field and where
> in the ABI points to that?
Each of the unions above plays the role of an in/out argument to its
corresponding ioctl call.
The 'in' part is used when ioctl() is called by the client (userland
application): the data is written by the app.
The 'out' part is read by the app on exit from ioctl(), but only when
ioctl() returns 0.
Do You suggest adding a comment to gna.h for the above?
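To illustrate the convention, here is a minimal user-space sketch. The
union has the same layout as union gna_memory_map in the quoted header but
uses stdint types; fake_map_ioctl() and the id value 42 are hypothetical
stand-ins for the real ioctl path, which is not shown here.

```c
#include <stdint.h>

/* Same layout as union gna_memory_map in the quoted header. */
union gna_memory_map {
	struct {
		uint64_t address;
		uint32_t size;
		uint32_t pad;
	} in;

	struct {
		uint64_t memory_id;
	} out;
};

/*
 * Hypothetical stand-in for the ioctl handler: it consumes 'in', then
 * overwrites the same storage with 'out'. Returns 0 on success and -1
 * on failure.
 */
static int fake_map_ioctl(union gna_memory_map *arg)
{
	uint64_t address = arg->in.address; /* copy inputs before writing 'out' */
	uint32_t size = arg->in.size;

	if (address == 0 || size == 0)
		return -1;               /* 'out' is not valid on failure */

	arg->out.memory_id = 42;         /* arbitrary id, for illustration only */
	return 0;
}
```

The caller fills arg.in, makes the call, and reads arg.out only when the
return value is 0; after a successful call the 'in' fields have been
overwritten.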
> .....
Andy Shevchenko <[email protected]> writes:
> On Wed, Mar 24, 2021 at 8:38 PM Maciej Kwapulinski
> <[email protected]> wrote:
>>
....
>> +static int recovery_timeout = 60;
>> +module_param(recovery_timeout, int, 0644);
>> +MODULE_PARM_DESC(recovery_timeout, "Recovery timeout in seconds");
>
> Why module parameters?!
>
It is used for testing on slower FPGA boards. In v3, it is present under
the CONFIG_DEBUG_INTEL_GNA flag.