2022-07-25 06:54:48

by Jiho Chu

Subject: [PATCH 0/9] Samsung Trinity NPU device driver

Hello,

My name is Jiho Chu, and I have been working on device drivers and
system daemons at Samsung Electronics for several years.

The Trinity Neural Processing Unit (NPU) series consists of hardware
accelerators for neural network processing in embedded systems, which
are integrated into application processors or SoCs. The Trinity NPU is
compatible with the AMBA bus architecture and was first launched in 2018
with its first version for vision processing, Trinity Version1 (TRIV1).
Its second version, TRIV2, was released in December 2021. Another
Trinity NPU, for audio processing, is referred to as TRIA.

TRIV2 ships in many models of 2022 Samsung TVs, providing acceleration
for various AI-based applications, including image recognition and
picture quality improvement for streaming video. These can be accessed
via GStreamer and its neural network plugins, NNStreamer.

This patch set includes the Trinity Vision 2 kernel device driver.
Trinity Vision 2 accelerates image inference for Convolutional Neural
Networks (CNNs). The CNN workload is executed by the Deep Learning
Accelerator (DLA), and general neural network layers are executed by the
Digital Signal Processor (DSP). A Control Processor (CP) controls the
DLA and DSP. These three IPs (DLA, DSP, CP) compose the Trinity Vision 2
NPU, and the device driver mainly supervises the CP to manage the entire
NPU.

DLA and DSP operations are controlled with internal command
instructions. The instructions for Trinity are similar to a general
processor's ISA, but specialized for neural processing operations. The
virtual ISA (vISA) is designed to process multiple data with a single
operation, like a modern SIMD processor. The device driver loads a
program to the CP at startup, and the program can decode a binary built
with the vISA. We call this decoding program the Instruction Decoding
Unit (IDU) program. While the NPU is running, the CP executes the IDU
program to fetch and decode vISA instructions, according to the
scheduling policy of the device driver.

The DLA, DSP and CP are loosely coupled using ARM's AMBA, so Trinity can
easily communicate with most ARM processors. Each IP has memory-mapped
registers that are used to control it, and the CP provides a
Wait-For-Event (WFE) operation to subscribe to interrupt signals from
the DLA and DSP. In addition, an embedded Direct Memory Access
Controller (DMAC) manages data transfers between internal SRAM and
external main memory, and an IOMMU module supports a unified memory
space.

A user can control the Trinity NPU with IOCTLs provided by the driver.
These include memory management operations to transfer model data
(HWMEM_ALLOC/HWMEM_DEALLOC), NPU workload control operations to submit
workloads (RUN/STOP), and statistics operations to check the current NPU
status (STAT).
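As an illustration of how model data is identified once it has been
handed to the driver: trinity_gen_model_id() in patch 5/9 packs a
rolling counter into the low bits of a 64-bit ID and the dma-buf fd
above it, and trinity_model_id_to_dbuf_fd() reverses this. Below is a
minimal user-space sketch of that packing. The shift of 16 and the names
MODEL_ID_SHIFT, pack_model_id() and unpack_dbuf_fd() are assumptions for
illustration only; the real value of TRINITY_SHIFT_MODEL_ID is defined
elsewhere in the series.

```c
#include <stdint.h>

/* Assumed value; the driver uses TRINITY_SHIFT_MODEL_ID instead. */
#define MODEL_ID_SHIFT 16

/* Pack a rolling counter (low bits) and a dma-buf fd (upper bits). */
static uint64_t pack_model_id(uint32_t counter, int32_t dbuf_fd)
{
	uint64_t low = counter & ((UINT32_C(1) << MODEL_ID_SHIFT) - 1);

	/* Widen before shifting so that no fd bits are lost. */
	return ((uint64_t)(uint32_t)dbuf_fd << MODEL_ID_SHIFT) | low;
}

/* Recover the dma-buf fd from a packed model ID. */
static int32_t unpack_dbuf_fd(uint64_t id)
{
	return (int32_t)((id >> MODEL_ID_SHIFT) & UINT32_MAX);
}
```

Because the fd can be recovered from the ID alone, no separate lookup is
needed to find the dma-buf behind a model.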

The device driver also implements features for developers. It provides
sysfs control attributes such as stop, suspend, sched_test, and profile.
It also provides status attributes such as app status, the number of
total requests, the number of active requests, and memory usage. For
tracing, several ftrace events are defined and embedded at important
points.
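The per-app and per-request statistics behind these attributes are kept
in fixed-size pools whose slots are tracked with a bitmap (see
trinity_stat.c in patch 5/9). A rough user-space sketch of that slot
bookkeeping follows. It uses a single 64-bit word instead of the kernel
bitmap API, and POOL_MAX_SLOTS, struct slot_pool and the pool_*() helpers
are hypothetical names, not driver code.

```c
#include <stdint.h>

/* Hypothetical hard limit, playing the role of TRINITY_STAT_MAX_APPS. */
#define POOL_MAX_SLOTS 64

struct slot_pool {
	uint64_t bitmap; /* bit set = slot in use or out of range */
};

/* Make the first `usable` slots available and mark the rest as taken. */
static void pool_resize(struct slot_pool *p, unsigned int usable)
{
	if (usable >= POOL_MAX_SLOTS)
		p->bitmap = 0;
	else
		p->bitmap = ~UINT64_C(0) << usable;
}

/* Claim the lowest free slot; return -1 when the pool is exhausted. */
static int pool_get(struct slot_pool *p)
{
	for (int slot = 0; slot < POOL_MAX_SLOTS; slot++) {
		if (!(p->bitmap & (UINT64_C(1) << slot))) {
			p->bitmap |= UINT64_C(1) << slot;
			return slot;
		}
	}
	return -1;
}

/* Release a previously claimed slot. */
static void pool_put(struct slot_pool *p, int slot)
{
	if (slot >= 0 && slot < POOL_MAX_SLOTS)
		p->bitmap &= ~(UINT64_C(1) << slot);
}
```

Marking the out-of-range slots as taken means the allocator needs no
separate bound check against the configured pool size. In the driver, an
exhausted pool additionally triggers one retry after old statistics have
been destroyed.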

I would highly appreciate your feedback: reviews, questions, or
anything else.


Thanks.
Jiho Chu

Jiho Chu (9):
trinity: Add base driver
trinity: Add dma memory module
trinity: Add load/unload IDU files
trinity: Add scheduler module
trinity: Add sysfs debugfs module
trinity: Add pm and ioctl feature
trinity: Add profile module
trinity: Add trace module
MAINTAINERS: add TRINITY driver

MAINTAINERS | 7 +
drivers/misc/Kconfig | 1 +
drivers/misc/Makefile | 1 +
drivers/misc/trinity/Kconfig | 27 +
drivers/misc/trinity/Makefile | 12 +
drivers/misc/trinity/sched/core.c | 170 ++
drivers/misc/trinity/sched/priority.c | 335 +++
drivers/misc/trinity/sched/priority.h | 18 +
drivers/misc/trinity/sched/sched.h | 52 +
drivers/misc/trinity/trinity.c | 1282 +++++++++++
drivers/misc/trinity/trinity_common.h | 434 ++++
drivers/misc/trinity/trinity_debug.c | 358 ++++
drivers/misc/trinity/trinity_hwmem.c | 438 ++++
drivers/misc/trinity/trinity_hwmem.h | 45 +
drivers/misc/trinity/trinity_pm.c | 76 +
drivers/misc/trinity/trinity_resv_mem.c | 264 +++
drivers/misc/trinity/trinity_resv_mem.h | 41 +
drivers/misc/trinity/trinity_stat.c | 893 ++++++++
drivers/misc/trinity/trinity_stat.h | 56 +
drivers/misc/trinity/trinity_sysfs.c | 864 ++++++++
drivers/misc/trinity/trinity_trace.c | 15 +
drivers/misc/trinity/trinity_trace.h | 406 ++++
drivers/misc/trinity/trinity_vision2_drv.c | 1893 +++++++++++++++++
.../misc/trinity/trinity_vision2_profile.h | 324 +++
drivers/misc/trinity/trinity_vision2_regs.h | 210 ++
include/uapi/misc/trinity.h | 458 ++++
26 files changed, 8680 insertions(+)
create mode 100644 drivers/misc/trinity/Kconfig
create mode 100644 drivers/misc/trinity/Makefile
create mode 100644 drivers/misc/trinity/sched/core.c
create mode 100644 drivers/misc/trinity/sched/priority.c
create mode 100644 drivers/misc/trinity/sched/priority.h
create mode 100644 drivers/misc/trinity/sched/sched.h
create mode 100644 drivers/misc/trinity/trinity.c
create mode 100644 drivers/misc/trinity/trinity_common.h
create mode 100644 drivers/misc/trinity/trinity_debug.c
create mode 100644 drivers/misc/trinity/trinity_hwmem.c
create mode 100644 drivers/misc/trinity/trinity_hwmem.h
create mode 100644 drivers/misc/trinity/trinity_pm.c
create mode 100644 drivers/misc/trinity/trinity_resv_mem.c
create mode 100644 drivers/misc/trinity/trinity_resv_mem.h
create mode 100644 drivers/misc/trinity/trinity_stat.c
create mode 100644 drivers/misc/trinity/trinity_stat.h
create mode 100644 drivers/misc/trinity/trinity_sysfs.c
create mode 100644 drivers/misc/trinity/trinity_trace.c
create mode 100644 drivers/misc/trinity/trinity_trace.h
create mode 100644 drivers/misc/trinity/trinity_vision2_drv.c
create mode 100644 drivers/misc/trinity/trinity_vision2_profile.h
create mode 100644 drivers/misc/trinity/trinity_vision2_regs.h
create mode 100644 include/uapi/misc/trinity.h

--
2.25.1


2022-07-25 06:54:56

by Jiho Chu

Subject: [PATCH 9/9] MAINTAINERS: add TRINITY driver

Add SAMSUNG TRINITY DRIVER.
Jiho Chu and Yelin Jeong are added as maintainers.

Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
---
MAINTAINERS | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3cf9842d9233..3c41497e5e76 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17690,6 +17690,13 @@ S: Maintained
F: Documentation/devicetree/bindings/thermal/samsung,exynos-thermal.yaml
F: drivers/thermal/samsung/

+SAMSUNG TRINITY DRIVER
+M: Jiho Chu <[email protected]>
+M: Yelin Jeong <[email protected]>
+S: Supported
+F: drivers/misc/trinity/
+F: include/uapi/misc/trinity.h
+
SAMSUNG USB2 PHY DRIVER
M: Sylwester Nawrocki <[email protected]>
L: [email protected]
--
2.25.1

2022-07-25 06:55:14

by Jiho Chu

Subject: [PATCH 5/9] trinity: Add sysfs debugfs module

This patch includes debugfs and sysfs interfaces.

They provide the NPU's internal statistics and status. The modules
show each request's status, scheduled time and duration. They can also
show total requests and memory usage. The statistics module
supports these calculations.
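For reviewers, the debugfs message buffer in trinity_debug.c behaves as
a fixed-size ring: once msg_max entries are stored, each new message
overwrites the oldest one, and the file is read oldest-first. A
simplified user-space sketch of that behaviour is shown here. MSG_MAX is
deliberately tiny, and struct msg_ring and the ring_*() helpers are
illustrative names only; locking and the reserved-memory backing are
left out.

```c
#include <stdio.h>
#include <string.h>

#define MSG_LEN 255 /* mirrors TRINITY_DEBUGFS_LENGTH */
#define MSG_MAX 4   /* tiny capacity, for illustration only */

struct msg_ring {
	char msg[MSG_MAX][MSG_LEN + 1]; /* +1 for the terminating NUL */
	unsigned long num;              /* number of valid entries */
	unsigned long off;              /* index of the oldest entry */
};

/* Return the slot for the next message, reusing the oldest when full. */
static char *ring_next_slot(struct msg_ring *r)
{
	char *slot;

	if (r->num == MSG_MAX) {
		slot = r->msg[r->off];
		r->off = (r->off + 1) % MSG_MAX;
	} else {
		slot = r->msg[r->num++];
	}
	memset(slot, 0, MSG_LEN + 1);
	return slot;
}

/* Store a message, truncating it to MSG_LEN characters. */
static void ring_push(struct msg_ring *r, const char *text)
{
	snprintf(ring_next_slot(r), MSG_LEN + 1, "%s", text);
}

/* Return the i-th message in oldest-first order, or NULL if out of range. */
static const char *ring_get(const struct msg_ring *r, unsigned long i)
{
	if (i >= r->num)
		return NULL;
	return r->msg[(r->off + i) % MSG_MAX];
}
```

Pushing five messages into this four-entry ring drops the first one, so
reading back starts from the second message.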

Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
drivers/misc/trinity/Makefile | 2 +
drivers/misc/trinity/trinity.c | 48 ++
drivers/misc/trinity/trinity_common.h | 22 +
drivers/misc/trinity/trinity_debug.c | 358 +++++++++++
drivers/misc/trinity/trinity_stat.c | 893 ++++++++++++++++++++++++++
drivers/misc/trinity/trinity_stat.h | 56 ++
drivers/misc/trinity/trinity_sysfs.c | 864 +++++++++++++++++++++++++
7 files changed, 2243 insertions(+)
create mode 100644 drivers/misc/trinity/trinity_debug.c
create mode 100644 drivers/misc/trinity/trinity_stat.c
create mode 100644 drivers/misc/trinity/trinity_stat.h
create mode 100644 drivers/misc/trinity/trinity_sysfs.c

diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
index dcf9d7ad1b4b..ce3539affbf2 100644
--- a/drivers/misc/trinity/Makefile
+++ b/drivers/misc/trinity/Makefile
@@ -5,5 +5,7 @@ obj-$(CONFIG_TRINITY_VISION2) += trinity_vision2.o
trinity-y := trinity.o
trinity-y += trinity_resv_mem.o trinity_hwmem.o
trinity-y += sched/core.o sched/priority.o
+trinity-y += trinity_debug.o
+trinity-y += trinity_sysfs.o trinity_stat.o

trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
index 4c1b8a7108d6..8f8ade0aff89 100644
--- a/drivers/misc/trinity/trinity.c
+++ b/drivers/misc/trinity/trinity.c
@@ -39,6 +39,11 @@

#define TRINITY_PADDR_BASE (0x0)

+#define TRINITY_MODEL_HASH_BITS 10
+#define TRINITY_MODEL_HASH_SIZE (1 << TRINITY_MODEL_HASH_BITS)
+
+static struct hlist_bl_head trinity_model_node_hlist[TRINITY_MODEL_HASH_SIZE];
+
/* A global lock for shared static variables such as dev_bitmap */
static DEFINE_SPINLOCK(trinity_lock);

@@ -53,6 +58,49 @@ phys_addr_t trinity_get_paddr(struct iommu_domain *domain, dma_addr_t daddr)
return TRINITY_PADDR_BASE + daddr;
}

+static uint64_t trinity_gen_model_id(int32_t dbuf_fd)
+{
+ static uint32_t id;
+ uint64_t mid = 0;
+
+ spin_lock(&trinity_lock);
+ if (++id >= (1 << TRINITY_SHIFT_MODEL_ID))
+ id = 0;
+ mid = id;
+ spin_unlock(&trinity_lock);
+
+ mid |= ((uint64_t)(uint32_t)dbuf_fd << TRINITY_SHIFT_MODEL_ID);
+
+ return mid;
+}
+
+static int32_t trinity_model_id_to_dbuf_fd(uint64_t id)
+{
+ return (id >> TRINITY_SHIFT_MODEL_ID) & UINT_MAX;
+}
+
+static void trinity_model_htable_init(void)
+{
+ int i;
+
+ for (i = 0; i < TRINITY_MODEL_HASH_SIZE; ++i)
+ INIT_HLIST_BL_HEAD(&trinity_model_node_hlist[i]);
+}
+
+/**
+ * trinity_init_model_htable() - Initialize model hash table
+ *
+ * @drv: An instance of the trinity driver
+ * @ht: hash table to be initialized
+ */
+void trinity_init_model_htable(const struct trinity_driver *drv,
+ struct trinity_model_htable *ht)
+{
+ ht->ht_heads = trinity_model_node_hlist;
+ ht->hash_size = TRINITY_MODEL_HASH_SIZE;
+ ht->hash_bits = TRINITY_MODEL_HASH_BITS;
+}
+
/**
* trinity_release() - A common callback for close() in file_operations for a
* Trinity device node. If there are device-specific data to be
diff --git a/drivers/misc/trinity/trinity_common.h b/drivers/misc/trinity/trinity_common.h
index 6940318362f6..c70f66722391 100644
--- a/drivers/misc/trinity/trinity_common.h
+++ b/drivers/misc/trinity/trinity_common.h
@@ -378,6 +378,8 @@ static inline int32_t trinity_get_app_id(void)
int trinity_create_node(struct trinity_driver *drv);
void trinity_destroy_node(struct trinity_driver *drv);
int trinity_wait_ready(struct trinity_driver *drv);
+void trinity_init_model_htable(const struct trinity_driver *drv,
+ struct trinity_model_htable *ht);
phys_addr_t trinity_get_paddr(struct iommu_domain *domain, dma_addr_t daddr);

/* File operations */
@@ -390,4 +392,24 @@ int trinity_probe(struct platform_device *pdev,
int trinity_remove(struct platform_device *pdev,
const struct trinity_desc *desc);

+/* sysfs operations */
+int trinity_sysfs_init(struct trinity_driver *drv);
+int trinity_sysfs_cleanup(struct trinity_driver *drv);
+
+/* debugfs operations */
+int trinity_debug_init(void);
+void trinity_debug_exit(void);
+
+int trinity_debug_add(struct trinity_driver *drv);
+void trinity_debug_remove(struct trinity_driver *drv);
+void trinity_debug_clear(struct trinity_driver *drv, unsigned long msg_max);
+unsigned long trinity_debug_get_max(struct trinity_driver *drv);
+void trinity_debug_dump_msg(struct trinity_driver *drv, const char *fmt, ...);
+void trinity_debug_dump_model(struct trinity_driver *drv,
+ const struct trinity_model *model,
+ const char *fmt, ...);
+void trinity_debug_dump_input(struct trinity_driver *drv,
+ const struct trinity_input *input,
+ const char *fmt, ...);
+
#endif /* __TRINITY_COMMON_H__ */
diff --git a/drivers/misc/trinity/trinity_debug.c b/drivers/misc/trinity/trinity_debug.c
new file mode 100644
index 000000000000..5b10446eced3
--- /dev/null
+++ b/drivers/misc/trinity/trinity_debug.c
@@ -0,0 +1,358 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/**
+ * Implementation of debug functions for trinity drivers
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/debugfs.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "trinity_common.h"
+#include "trinity_resv_mem.h"
+
+#define TRINITY_DEVVER(drv) (drv->desc->ver >> TRINITY_SHIFT_DEV)
+#define TRINITY_DEBUGFS_DIR ("trinity")
+#define TRINITY_DEBUGFS_MAX (1024UL)
+#define TRINITY_DEBUGFS_LENGTH (255)
+
+struct trinity_debugfs_msg {
+ char msg[TRINITY_DEBUGFS_LENGTH + 1]; /* including NULL */
+};
+
+struct trinity_debugfs_entry {
+ struct dentry *dentry;
+ spinlock_t lock;
+
+ unsigned long msg_max;
+ unsigned long msg_num;
+ unsigned long msg_off;
+
+ struct trinity_resv_mem msg_buf;
+};
+
+static struct dentry *trinity_debugfs;
+
+static size_t trinity_debug_append_app_id(struct trinity_driver *drv, char *msg)
+{
+ return snprintf(msg, TRINITY_DEBUGFS_LENGTH, "[%d] ",
+ trinity_get_app_id());
+}
+
+static char *trinity_debug_get_msg_buf(struct trinity_driver *drv)
+{
+ struct trinity_debugfs_entry *entry = drv->debugfs_pdata;
+ struct trinity_debugfs_msg *buf;
+
+ if (!entry || entry->msg_max == 0)
+ return NULL;
+
+ spin_lock(&entry->lock);
+ if (entry->msg_num == entry->msg_max) {
+ buf = &((struct trinity_debugfs_msg *)
+ entry->msg_buf.vaddr)[entry->msg_off];
+ entry->msg_off = (entry->msg_off + 1) % entry->msg_max;
+ } else {
+ buf = &((struct trinity_debugfs_msg *)
+ entry->msg_buf.vaddr)[entry->msg_num++];
+ }
+ spin_unlock(&entry->lock);
+
+ memset(buf, '\x00', sizeof(*buf));
+ return buf->msg;
+}
+
+/**
+ * trinity_debug_dump_msg() - Dump trinity debug message
+ *
+ * @drv: an instance of the trinity driver
+ * @fmt: tag message format
+ */
+void trinity_debug_dump_msg(struct trinity_driver *drv, const char *fmt, ...)
+{
+ char *msg;
+ size_t len;
+ va_list args;
+
+ msg = trinity_debug_get_msg_buf(drv);
+ if (msg == NULL)
+ return;
+
+ len = trinity_debug_append_app_id(drv, msg);
+
+ va_start(args, fmt);
+ len += vscnprintf(msg + len, TRINITY_DEBUGFS_LENGTH - len, fmt, args);
+ va_end(args);
+
+ if (drv->verbose > 0)
+ dev_info(drv_to_dev_ptr(drv), "%s", msg);
+}
+
+/**
+ * trinity_debug_dump_model() - Dump trinity model data
+ *
+ * @drv: an instance of the trinity driver
+ * @model: an instance of the trinity model
+ * @fmt: tag message format
+ */
+void trinity_debug_dump_model(struct trinity_driver *drv,
+ const struct trinity_model *model,
+ const char *fmt, ...)
+{
+ char *msg;
+ size_t len;
+ va_list args;
+
+ msg = trinity_debug_get_msg_buf(drv);
+ if (msg == NULL)
+ return;
+
+ len = trinity_debug_append_app_id(drv, msg);
+
+ va_start(args, fmt);
+ len += vscnprintf(msg + len, TRINITY_DEBUGFS_LENGTH - len, fmt, args);
+ va_end(args);
+
+ len += scnprintf(
+ msg + len, TRINITY_DEBUGFS_LENGTH - len,
+ "\n\tid(0x%llx) dbuf_fd(%d) program_offset_addr(0x%llx) program_size(0x%llx)\n",
+ model->config.id, model->config.dbuf_fd,
+ model->config.program_offset_addr, model->config.program_size);
+ if (TRINITY_DEVVER(drv) == 1) {
+ len += scnprintf(msg + len, TRINITY_DEBUGFS_LENGTH - len,
+ "\tweight_offset_addr(0x%llx)",
+ model->config.weight_offset_addr);
+ } else if (TRINITY_DEVVER(drv) == 2) {
+ len += scnprintf(
+ msg + len, TRINITY_DEBUGFS_LENGTH - len,
+ "\tmetadata_dbuf_fd(%d) metadata_ext_dbuf_fd(%d) metadata_ext_size(0x%llx)",
+ model->config.metadata_dbuf_fd,
+ model->config.metadata_ext_dbuf_fd,
+ model->config.metadata_ext_size);
+ }
+
+ if (drv->verbose > 0)
+ dev_info(drv_to_dev_ptr(drv), "%s", msg);
+}
+
+/**
+ * trinity_debug_dump_input() - Dump trinity input data
+ *
+ * @drv: an instance of the trinity driver
+ * @input: an instance of the trinity input
+ * @fmt: tag message format
+ */
+void trinity_debug_dump_input(struct trinity_driver *drv,
+ const struct trinity_input *input,
+ const char *fmt, ...)
+{
+ char *msg;
+ size_t len;
+ va_list args;
+
+ msg = trinity_debug_get_msg_buf(drv);
+ if (msg == NULL)
+ return;
+
+ len = trinity_debug_append_app_id(drv, msg);
+
+ va_start(args, fmt);
+ len += vscnprintf(msg + len, TRINITY_DEBUGFS_LENGTH - len, fmt, args);
+ va_end(args);
+
+ len += scnprintf(msg + len, TRINITY_DEBUGFS_LENGTH - len,
+ "\n\tdbuf_fd(%d) model_id(0x%llx)\n",
+ input->config.dbuf_fd, input->config.model_id);
+ if (TRINITY_DEVVER(drv) == 1) {
+ len += scnprintf(
+ msg + len, TRINITY_DEBUGFS_LENGTH - len,
+ "\tactivation_offset_addr0(0x%llx) activation_offset_addr1(0x%llx)",
+ input->config.activation_offset_addr0,
+ input->config.activation_offset_addr1);
+ } else if (TRINITY_DEVVER(drv) == 2) {
+ len += scnprintf(
+ msg + len, TRINITY_DEBUGFS_LENGTH - len,
+ "\ttimeout_ms(%lld) priority(%u) num_segments(%u) input_mode(%d) output_mode(%d)",
+ input->config.timeout_ms, input->config.priority,
+ input->config.num_segments, input->config.input_mode,
+ input->config.output_mode);
+ }
+
+ if (drv->verbose > 0)
+ dev_info(drv_to_dev_ptr(drv), "%s", msg);
+}
+
+static int trinity_debugfs_show(struct seq_file *s, void *unused)
+{
+ struct trinity_driver *drv = s->private;
+ struct trinity_debugfs_entry *entry = drv->debugfs_pdata;
+ struct trinity_debugfs_msg *msg;
+ unsigned long i, offset;
+
+ spin_lock(&entry->lock);
+ for (i = 0; i < entry->msg_num; i++) {
+ offset = (entry->msg_off + i) % entry->msg_max;
+ msg = &((struct trinity_debugfs_msg *)
+ entry->msg_buf.vaddr)[offset];
+
+ seq_puts(s, msg->msg);
+ seq_puts(s, "\n");
+ }
+ spin_unlock(&entry->lock);
+
+ return 0;
+}
+
+static int trinity_debugfs_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, trinity_debugfs_show, inode->i_private);
+}
+
+static const struct file_operations trinity_debugfs_fops = {
+ .open = trinity_debugfs_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+/**
+ * trinity_debug_add() - Add trinity debug file system entry
+ *
+ * @drv: an instance of the trinity driver
+ */
+int trinity_debug_add(struct trinity_driver *drv)
+{
+ struct trinity_debugfs_entry *entry;
+ struct dentry *dentry;
+ const char *name = drv->name;
+
+ if (name == NULL)
+ return -EINVAL;
+
+ entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+ if (!entry)
+ return -ENOMEM;
+
+ dentry = debugfs_create_file_unsafe(name, 0400, trinity_debugfs, drv,
+ &trinity_debugfs_fops);
+ if (IS_ERR(dentry)) {
+ kfree(entry);
+ return PTR_ERR(dentry);
+ }
+
+ entry->dentry = dentry;
+ spin_lock_init(&entry->lock);
+
+ drv->debugfs_pdata = entry;
+
+ return 0;
+}
+
+/**
+ * trinity_debug_remove() - Remove trinity debug file system entry
+ *
+ * @drv: an instance of the trinity driver
+ */
+void trinity_debug_remove(struct trinity_driver *drv)
+{
+ struct trinity_debugfs_entry *entry = drv->debugfs_pdata;
+
+ trinity_debug_clear(drv, 0);
+
+ debugfs_remove(entry->dentry);
+ kfree(entry);
+
+ drv->debugfs_pdata = NULL;
+}
+
+/**
+ * trinity_debug_clear() - Clear debug message entity
+ *
+ * @drv: an instance of the trinity driver
+ * @msg_max: reset max size of debug message entity
+ */
+void trinity_debug_clear(struct trinity_driver *drv, unsigned long msg_max)
+{
+ struct trinity_debugfs_entry *entry = drv->debugfs_pdata;
+ struct device *dev = drv_to_dev_ptr(drv);
+ size_t size;
+
+ /* maximum size limit: 256KiB */
+ if (msg_max > TRINITY_DEBUGFS_MAX) {
+ dev_err(dev, "Too many debugfs entries (limit: %lu)",
+ TRINITY_DEBUGFS_MAX);
+ return;
+ }
+
+ spin_lock(&entry->lock);
+
+ /* disable debugfs temporarily */
+ trinity_free_from_resv_mem(&entry->msg_buf, false);
+ entry->msg_max = 0;
+ entry->msg_num = 0;
+ entry->msg_off = 0;
+
+ if (msg_max == 0)
+ goto out;
+
+ /* reallocate debugfs buffer */
+ size = PAGE_ALIGN(msg_max * sizeof(struct trinity_debugfs_msg));
+ if (trinity_alloc_from_resv_mem(size, &entry->msg_buf, false) < 0) {
+ dev_warn(dev, "No available reserved memory for debugfs");
+ goto out;
+ }
+ /* more available entries due to page size alignment */
+ entry->msg_max = size / sizeof(struct trinity_debugfs_msg);
+
+out:
+ spin_unlock(&entry->lock);
+}
+
+/**
+ * trinity_debug_get_max() - Get max size of debug message entity
+ *
+ * @drv: an instance of the trinity driver
+ *
+ * Return: max size of debug message entity
+ */
+unsigned long trinity_debug_get_max(struct trinity_driver *drv)
+{
+ struct trinity_debugfs_entry *entry = drv->debugfs_pdata;
+ unsigned long msg_max;
+
+ spin_lock(&entry->lock);
+ msg_max = entry->msg_max;
+ spin_unlock(&entry->lock);
+
+ return msg_max;
+}
+
+/**
+ * trinity_debug_init() - Initialize debug file system
+ */
+int trinity_debug_init(void)
+{
+ struct dentry *entry;
+
+ entry = debugfs_create_dir(TRINITY_DEBUGFS_DIR, NULL);
+ if (IS_ERR(entry))
+ return PTR_ERR(entry);
+
+ trinity_debugfs = entry;
+
+ return 0;
+}
+
+/**
+ * trinity_debug_exit() - Exit debug file system
+ */
+void trinity_debug_exit(void)
+{
+ debugfs_remove_recursive(trinity_debugfs);
+}
diff --git a/drivers/misc/trinity/trinity_stat.c b/drivers/misc/trinity/trinity_stat.c
new file mode 100644
index 000000000000..388d38f81acd
--- /dev/null
+++ b/drivers/misc/trinity/trinity_stat.c
@@ -0,0 +1,893 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Providing statistics for Samsung Trinity device family support
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include "trinity_stat.h"
+#include "trinity_common.h"
+#include "trinity_resv_mem.h"
+
+#include <linux/bitmap.h>
+#include <linux/list_bl.h>
+
+/* maximum number of stats configurable from sysfs */
+#define TRINITY_STAT_MAX_APPS (128UL)
+#define TRINITY_STAT_MAX_REQS (4096UL)
+#define TRINITY_STAT_MAX_REQS_PER_APP (128UL)
+
+/* default number of stats */
+#define TRINITY_STAT_DEF_APPS (32UL)
+#define TRINITY_STAT_DEF_REQS (128UL)
+#define TRINITY_STAT_DEF_REQS_PER_APP (32UL)
+
+/**
+ * struct trinity_stat_pool - Statistics pool which maintains statistics for device
+ *
+ * @bitmap_app: bitmap for app
+ * @bitmap_req: bitmap for request
+ * @mem_app: reserved memory for applications
+ * @mem_req: reserved memory for request
+ * @max_stat_apps: max statistics size of applications
+ * @max_stat_reqs: max statistics size of requests.
+ * @max_stat_reqs_per_app: max statistics size of request per application
+ * @cur_stat_apps: current statistics for applications
+ * @cur_stat_reqs: current statistics for requests
+ * @drv: an instance of the trinity driver
+ */
+struct trinity_stat_pool {
+ DECLARE_BITMAP(bitmap_app, TRINITY_STAT_MAX_APPS);
+ DECLARE_BITMAP(bitmap_req, TRINITY_STAT_MAX_REQS);
+
+ struct trinity_resv_mem mem_app;
+ struct trinity_resv_mem mem_req;
+
+ unsigned long max_stat_apps;
+ unsigned long max_stat_reqs;
+ unsigned long max_stat_reqs_per_app;
+
+ unsigned long cur_stat_apps;
+ unsigned long cur_stat_reqs;
+
+ struct trinity_driver *drv;
+};
+
+/**
+ * trinity_stat_pool_init(): Initialize trinity statistics pool
+ *
+ * @drv: an instance of the trinity driver
+ *
+ * Returns: 0 on success. Otherwise, returns negative error.
+ */
+int trinity_stat_pool_init(struct trinity_driver *drv)
+{
+ struct trinity_stat_pool *pool;
+
+ pool = kzalloc(sizeof(*pool), GFP_KERNEL);
+ if (!pool)
+ return -ENOMEM;
+
+ pool->drv = drv;
+
+ drv->stat.pdata = pool;
+
+ return 0;
+}
+
+/**
+ * trinity_stat_pool_fini(): Finish trinity statistics pool
+ *
+ * @drv: an instance of the trinity driver
+ */
+void trinity_stat_pool_fini(struct trinity_driver *drv)
+{
+ struct trinity_stat_pool *pool = drv->stat.pdata;
+
+ if (!pool)
+ return;
+
+ trinity_free_from_resv_mem(&pool->mem_app, false);
+ trinity_free_from_resv_mem(&pool->mem_req, false);
+ kfree(pool);
+
+ drv->stat.pdata = NULL;
+}
+
+static void trinity_stat_pool_resize_apps(struct trinity_stat_pool *pool,
+ unsigned long num_apps)
+{
+ struct device *dev = drv_to_dev_ptr(pool->drv);
+ struct trinity_resv_mem mem;
+ unsigned long size;
+
+ if (num_apps > TRINITY_STAT_MAX_APPS) {
+ dev_err(dev, "The maximum number of stat apps: %lu",
+ TRINITY_STAT_MAX_APPS);
+ return;
+ }
+
+ size = PAGE_ALIGN(sizeof(struct trinity_stat_app) * num_apps);
+ if (trinity_alloc_from_resv_mem(size, &mem, false) == 0) {
+ trinity_free_from_resv_mem(&pool->mem_app, false);
+
+ bitmap_fill(pool->bitmap_app, TRINITY_STAT_MAX_APPS);
+ bitmap_clear(pool->bitmap_app, 0, num_apps);
+
+ pool->max_stat_apps = num_apps;
+ pool->mem_app = mem;
+ } else {
+ dev_warn(dev, "Unable to allocate stats for apps");
+ }
+}
+
+static void trinity_stat_pool_resize_reqs(struct trinity_stat_pool *pool,
+ unsigned long num_reqs)
+{
+ struct device *dev = drv_to_dev_ptr(pool->drv);
+ struct trinity_resv_mem mem;
+ unsigned long size;
+
+ if (num_reqs > TRINITY_STAT_MAX_REQS) {
+ dev_err(dev, "The maximum number of stat reqs: %lu",
+ TRINITY_STAT_MAX_REQS);
+ return;
+ }
+
+ size = PAGE_ALIGN(sizeof(struct trinity_stat_req) * num_reqs);
+ if (trinity_alloc_from_resv_mem(size, &mem, false) == 0) {
+ trinity_free_from_resv_mem(&pool->mem_req, false);
+
+ bitmap_fill(pool->bitmap_req, TRINITY_STAT_MAX_REQS);
+ bitmap_clear(pool->bitmap_req, 0, num_reqs);
+
+ pool->max_stat_reqs = num_reqs;
+ pool->mem_req = mem;
+ } else {
+ dev_warn(dev, "Unable to allocate stats for reqs");
+ }
+}
+
+static struct trinity_stat_app *
+trinity_stat_pool_get_app(struct trinity_driver *drv)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_pool *pool = stat->pdata;
+ struct trinity_stat_app *app = NULL;
+ unsigned long slot;
+ bool retried = false;
+
+ /* ensured that the lock is acquired */
+retry:
+ slot = find_first_zero_bit(pool->bitmap_app, TRINITY_STAT_MAX_APPS);
+ if (slot < TRINITY_STAT_MAX_APPS) {
+ app = &((struct trinity_stat_app *)pool->mem_app.vaddr)[slot];
+ memset(app, '\x00', sizeof(*app));
+ set_bit(slot, pool->bitmap_app);
+ app->slot = slot;
+ } else if (!retried) {
+ /* retry after destroy old stats */
+ retried = true;
+ trinity_destroy_stats(stat, true);
+ goto retry;
+ } else {
+ dev_warn(drv_to_dev_ptr(pool->drv),
+ "Please increase stat pool limit for apps");
+ }
+
+ return app;
+}
+
+static void trinity_stat_pool_put_app(struct trinity_driver *drv,
+ struct trinity_stat_app *app)
+{
+ struct trinity_stat_pool *pool = drv->stat.pdata;
+
+ /* ensured that the lock is acquired */
+ clear_bit(app->slot, pool->bitmap_app);
+}
+
+static struct trinity_stat_req *
+trinity_stat_pool_get_req(struct trinity_driver *drv)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_pool *pool = stat->pdata;
+ struct trinity_stat_req *req = NULL;
+ unsigned long slot;
+ bool retried = false;
+
+ /* ensured that the lock is acquired */
+retry:
+ slot = find_first_zero_bit(pool->bitmap_req, TRINITY_STAT_MAX_REQS);
+ if (slot < TRINITY_STAT_MAX_REQS) {
+ req = &((struct trinity_stat_req *)pool->mem_req.vaddr)[slot];
+ memset(req, '\x00', sizeof(*req));
+ set_bit(slot, pool->bitmap_req);
+ req->slot = slot;
+ } else if (!retried) {
+ /* retry after destroy old stats */
+ retried = true;
+ trinity_destroy_stats(stat, true);
+ goto retry;
+ } else {
+ dev_warn(drv_to_dev_ptr(pool->drv),
+ "Please increase stat pool limit for reqs");
+ }
+
+ return req;
+}
+
+static void trinity_stat_pool_put_req(struct trinity_driver *drv,
+ struct trinity_stat_req *req)
+{
+ struct trinity_stat_pool *pool = drv->stat.pdata;
+
+ /* ensured that the lock is acquired */
+ clear_bit(req->slot, pool->bitmap_req);
+}
+
+/**
+ * trinity_stat_init(): Initialize trinity statistics
+ *
+ * @drv: an instance of the trinity driver
+ */
+void trinity_stat_init(struct trinity_driver *drv)
+{
+ unsigned long i;
+
+ spin_lock_init(&drv->stat.lock);
+
+ INIT_LIST_HEAD(&drv->stat.list);
+ for (i = 0; i < TRINITY_STAT_HASH_SIZE; ++i)
+ INIT_HLIST_BL_HEAD(&drv->stat.hlist[i]);
+
+ trinity_stat_pool_init(drv);
+ /* initialize to default values */
+ trinity_stat_resize(drv, TRINITY_STAT_DEF_APPS, TRINITY_STAT_DEF_REQS,
+ TRINITY_STAT_DEF_REQS_PER_APP);
+}
+
+/**
+ * trinity_stat_fini(): Finish trinity statistics
+ *
+ * @drv: an instance of the trinity driver
+ */
+void trinity_stat_fini(struct trinity_driver *drv)
+{
+ trinity_stat_resize(drv, 0, 0, 0);
+ trinity_stat_pool_fini(drv);
+}
+
+/**
+ * trinity_stat_resize(): Resize trinity statistics
+ *
+ * @drv: an instance of the trinity driver
+ * @num_apps: a number of applications
+ * @num_reqs: a number of requests
+ * @num_reqs_per_app: a number of requests per application
+ */
+void trinity_stat_resize(struct trinity_driver *drv, unsigned long num_apps,
+ unsigned long num_reqs, unsigned long num_reqs_per_app)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_pool *pool = stat->pdata;
+ unsigned long i;
+
+ if (!pool)
+ return;
+
+ trinity_stat_lock(&drv->stat);
+
+ for (i = 0; i < TRINITY_STAT_HASH_SIZE; i++) {
+ struct trinity_stat_app *stat_app;
+ struct hlist_bl_node *hn;
+
+ hlist_bl_lock(&(stat->hlist[i]));
+ hlist_bl_for_each_entry(stat_app, hn, &(stat->hlist[i]),
+ hnode) {
+ if (stat_app->status != TRINITY_APP_STATUS_TERMINATED) {
+ dev_warn(drv_to_dev_ptr(drv),
+ "Still busy apps detected.. waiting");
+ hlist_bl_unlock(&(stat->hlist[i]));
+ goto unlock;
+ }
+ }
+ hlist_bl_unlock(&(stat->hlist[i]));
+ }
+
+ trinity_destroy_stats(stat, true);
+
+ /* re-allocate each stat buffer */
+ if (num_apps > 0)
+ trinity_stat_pool_resize_apps(pool, num_apps);
+
+ if (num_reqs > 0)
+ trinity_stat_pool_resize_reqs(pool, num_reqs);
+
+ if (num_reqs_per_app > 0)
+ pool->max_stat_reqs_per_app = num_reqs_per_app;
+
+unlock:
+ trinity_stat_unlock(&drv->stat);
+}
+
+/**
+ * trinity_stat_get_max_apps(): Get max statistics size for application
+ *
+ * @drv: an instance of the trinity driver
+ *
+ * Returns max number of statistics for applications. 0 on error.
+ */
+unsigned long trinity_stat_get_max_apps(struct trinity_driver *drv)
+{
+ struct trinity_stat_pool *pool = drv->stat.pdata;
+ unsigned long num;
+
+ if (!pool)
+ return 0;
+
+ trinity_stat_lock(&drv->stat);
+ num = pool->max_stat_apps;
+ trinity_stat_unlock(&drv->stat);
+
+ return num;
+}
+
+/**
+ * trinity_stat_get_max_reqs(): Get max statistics size for requests
+ *
+ * @drv: an instance of the trinity driver
+ *
+ * Returns max number of statistics for requests. 0 on error.
+ */
+unsigned long trinity_stat_get_max_reqs(struct trinity_driver *drv)
+{
+ struct trinity_stat_pool *pool = drv->stat.pdata;
+ unsigned long num;
+
+ if (!pool)
+ return 0;
+
+ trinity_stat_lock(&drv->stat);
+ num = pool->max_stat_reqs;
+ trinity_stat_unlock(&drv->stat);
+
+ return num;
+}
+
+/**
+ * trinity_stat_get_max_reqs_per_app(): Get max statistics size for requests per application
+ *
+ * @drv: an instance of the trinity driver
+ *
+ * Returns max number of statistics for requests per application. 0 on error.
+ */
+unsigned long trinity_stat_get_max_reqs_per_app(struct trinity_driver *drv)
+{
+ struct trinity_stat_pool *pool = drv->stat.pdata;
+ unsigned long num;
+
+ if (!pool)
+ return 0;
+
+ trinity_stat_lock(&drv->stat);
+ num = pool->max_stat_reqs_per_app;
+ trinity_stat_unlock(&drv->stat);
+
+ return num;
+}
+
+/**
+ * trinity_stat_lock(): Lock for trinity statistics
+ *
+ * @stat: an instance of trinity statistics
+ */
+void trinity_stat_lock(struct trinity_stat *stat)
+{
+ if (stat)
+ spin_lock(&stat->lock);
+}
+
+/**
+ * trinity_stat_unlock(): Unlock for trinity statistics
+ *
+ * @stat: an instance of trinity statistics
+ */
+void trinity_stat_unlock(struct trinity_stat *stat)
+{
+ if (stat)
+ spin_unlock(&stat->lock);
+}
+
+/**
+ * trinity_create_stat_app() - Create a stat structure for the opened app
+ *
+ * @drv: An instance of the trinity driver.
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
+static int trinity_create_stat_app(struct trinity_driver *drv)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_pool *pool = stat->pdata;
+ struct trinity_stat_app *stat_app;
+ unsigned long key;
+
+ trinity_stat_lock(stat);
+ stat_app = trinity_stat_pool_get_app(drv);
+ if (IS_ERR_OR_NULL(stat_app)) {
+ trinity_stat_unlock(stat);
+ dev_err(drv_to_dev_ptr(drv),
+ "Unable to allocate stat of app\n");
+ return -ENOMEM;
+ }
+
+ stat_app->parent = stat;
+ stat_app->app_id = trinity_get_app_id();
+ stat_app->total_alloc_mem = 0;
+ stat_app->total_freed_mem = 0;
+ stat_app->num_total_reqs = 0;
+ stat_app->num_kept_reqs = 0;
+ stat_app->num_active_reqs = 0;
+ stat_app->status = TRINITY_APP_STATUS_STARTED;
+
+ strncpy(stat_app->name, current->comm, TASK_COMM_LEN);
+ stat_app->name[TASK_COMM_LEN - 1] = '\x00';
+
+ INIT_HLIST_BL_NODE(&stat_app->hnode);
+ INIT_LIST_HEAD(&stat_app->reqs);
+
+ key = hash_long(stat_app->app_id, TRINITY_STAT_HASH_BITS);
+
+ hlist_bl_lock(&(stat->hlist[key]));
+ hlist_bl_add_head(&stat_app->hnode, &(stat->hlist[key]));
+ hlist_bl_unlock(&(stat->hlist[key]));
+
+ list_add_tail(&stat_app->lnode, &stat->list);
+ pool->cur_stat_apps++;
+
+ /* Remove terminated stats if the number reaches the maximum */
+ trinity_destroy_stats(stat, false);
+
+ trinity_stat_unlock(stat);
+
+ return 0;
+}
+
+static void trinity_destroy_stat_req(struct trinity_stat_req *stat_req)
+{
+ struct trinity_stat_app *stat_app = stat_req->parent;
+ struct trinity_stat *stat = stat_app->parent;
+ struct trinity_driver *drv =
+ container_of(stat, struct trinity_driver, stat);
+
+ if (stat_req->profile)
+ drv->desc->destroy_profile(drv, stat_req->profile);
+ list_del(&stat_req->list);
+ trinity_stat_pool_put_req(drv, stat_req);
+}
+
+static void trinity_destroy_stat_reqs(struct trinity_stat_app *stat_app)
+{
+ struct trinity_stat_req *stat_req, *tmp;
+
+ list_for_each_entry_safe(stat_req, tmp, &stat_app->reqs, list)
+ trinity_destroy_stat_req(stat_req);
+}
+
+/**
+ * trinity_destroy_stats - Destroy terminated stat structures
+ *
+ * @stat: An instance of trinity statistics
+ * @force: force destroy
+ */
+void trinity_destroy_stats(struct trinity_stat *stat, bool force)
+{
+ struct trinity_driver *drv =
+ container_of(stat, struct trinity_driver, stat);
+ struct trinity_stat_pool *pool = stat->pdata;
+ struct trinity_stat_app *stat_app;
+ struct hlist_bl_node *hn, *tmp;
+ int i;
+
+ /* the caller must already hold the stat lock */
+ if (!force && pool->cur_stat_apps <= pool->max_stat_apps)
+ return;
+
+ for (i = 0; i < TRINITY_STAT_HASH_SIZE; i++) {
+ hlist_bl_lock(&stat->hlist[i]);
+ hlist_bl_for_each_entry_safe(stat_app, hn, tmp,
+ &(stat->hlist[i]), hnode) {
+ enum trinity_app_status status = stat_app->status;
+
+ if (status == TRINITY_APP_STATUS_TERMINATED) {
+ hlist_bl_del(&stat_app->hnode);
+ list_del(&stat_app->lnode);
+
+ pool->cur_stat_apps--;
+
+ trinity_destroy_stat_reqs(stat_app);
+ trinity_stat_pool_put_app(drv, stat_app);
+ }
+ }
+ hlist_bl_unlock(&stat->hlist[i]);
+ }
+}
+
+static struct trinity_stat_app *
+trinity_get_stat_by_id(struct trinity_driver *drv, int32_t app_id)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_app *stat_app;
+ struct hlist_bl_node *hn;
+ unsigned long key;
+
+ key = hash_long(app_id, TRINITY_STAT_HASH_BITS);
+
+ hlist_bl_lock(&stat->hlist[key]);
+ hlist_bl_for_each_entry(stat_app, hn, &stat->hlist[key], hnode) {
+ if (stat_app->app_id == app_id)
+ goto out;
+ }
+ stat_app = NULL;
+out:
+ hlist_bl_unlock(&stat->hlist[key]);
+
+ return stat_app;
+}
+
+/**
+ * trinity_get_stat_app() - Get the statistics structure for the target app
+ *
+ * @drv: an instance of the trinity driver.
+ *
+ * Returns statistics for application on success. Otherwise, returns NULL.
+ *
+ * @note: If the stat is not allocated yet, try to create and return it.
+ */
+struct trinity_stat_app *trinity_get_stat_app(struct trinity_driver *drv)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_app *stat_app;
+ int app_id = trinity_get_app_id();
+
+retry:
+ trinity_stat_lock(stat);
+ stat_app = trinity_get_stat_by_id(drv, app_id);
+ trinity_stat_unlock(stat);
+
+ if (!IS_ERR_OR_NULL(stat_app))
+ return stat_app;
+
+ if (trinity_create_stat_app(drv) != 0)
+ return NULL;
+
+ goto retry;
+}
+
+/**
+ * trinity_stat_app_set_status() - Set the status of the target app
+ *
+ * @drv: an instance of the trinity driver.
+ * @status: application status
+ */
+void trinity_stat_app_set_status(struct trinity_driver *drv,
+ enum trinity_app_status status)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_app *stat_app;
+ int app_id = trinity_get_app_id();
+
+ trinity_stat_lock(stat);
+ stat_app = trinity_get_stat_by_id(drv, app_id);
+ trinity_stat_unlock(stat);
+
+ if (IS_ERR_OR_NULL(stat_app))
+ return;
+
+ stat_app->status = status;
+}
+
+/**
+ * trinity_stat_append_req() - Append request information for statistics
+ *
+ * @drv: an instance of the trinity driver.
+ * @req: an instance of request
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
+int trinity_stat_append_req(struct trinity_driver *drv, struct trinity_req *req)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_pool *pool = stat->pdata;
+ struct trinity_stat_app *stat_app;
+ struct trinity_stat_req *stat_req;
+
+ stat_app = trinity_get_stat_app(drv);
+ if (IS_ERR_OR_NULL(stat_app))
+ return -ENOMEM;
+
+ trinity_stat_lock(stat);
+ stat_req = trinity_stat_pool_get_req(drv);
+ if (!stat_req) {
+ trinity_stat_unlock(stat);
+ dev_err(drv_to_dev_ptr(drv),
+ "Unable to allocate stat of request");
+ return -ENOMEM;
+ }
+
+ stat_req->parent = stat_app;
+ stat_req->app_id = stat_app->app_id;
+ stat_req->req_id = req->input.config.req_id;
+ stat_req->model_id = req->input.config.model_id;
+ stat_req->submitted = ktime_get();
+ stat_req->status = TRINITY_REQ_STATUS_PENDING;
+ stat_req->priority =
+ (enum trinity_req_priority)req->input.config.priority;
+ stat_req->is_kernel = req->is_kernel;
+
+ req->stat = stat_req;
+
+ list_add_tail(&stat_req->list, &stat_app->reqs);
+
+ /* don't count kernel requests */
+ if (!req->is_kernel) {
+ if (stat_app->num_kept_reqs == pool->max_stat_reqs_per_app) {
+ struct trinity_stat_req *old_stat;
+
+ old_stat = list_first_entry(
+ &stat_app->reqs, struct trinity_stat_req, list);
+ /* skip any kernel or unfinished request */
+ while (old_stat->is_kernel ||
+ (old_stat->status !=
+ TRINITY_REQ_STATUS_FINISHED &&
+ old_stat->status != TRINITY_REQ_STATUS_ERROR))
+ old_stat = list_next_entry(old_stat, list);
+
+ WARN_ON(old_stat == NULL);
+
+ trinity_destroy_stat_req(old_stat);
+ stat_app->num_total_reqs--;
+ } else {
+ /* total number of user requests kept */
+ stat_app->num_kept_reqs++;
+ }
+ }
+
+ stat_app->num_total_reqs++;
+ stat_app->num_active_reqs++;
+
+ trinity_stat_unlock(stat);
+ return 0;
+}
+
+/**
+ * trinity_stat_remove_req() - Remove request information for statistics
+ *
+ * @drv: an instance of the trinity driver.
+ * @req: an instance of the request to be used for statistics
+ * @rollback: rollback statistics
+ */
+void trinity_stat_remove_req(struct trinity_driver *drv,
+ struct trinity_req *req, bool rollback)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_req *stat_req = req->stat;
+ struct trinity_stat_app *stat_app = stat_req->parent;
+
+ trinity_stat_lock(stat);
+
+ trinity_destroy_stat_req(stat_req);
+
+ if (!req->is_kernel) {
+ WARN_ON(stat_app->num_kept_reqs == 0);
+ stat_app->num_kept_reqs--;
+ }
+
+ if (rollback) {
+ WARN_ON(stat_app->num_total_reqs == 0);
+ stat_app->num_total_reqs--;
+ WARN_ON(stat_app->num_active_reqs == 0);
+ stat_app->num_active_reqs--;
+ }
+
+ trinity_stat_unlock(stat);
+}
+
+/**
+ * trinity_stat_finish_req() - Finish request for statistics
+ *
+ * @drv: an instance of the trinity driver.
+ * @req: an instance of the request to be used for statistics
+ */
+void trinity_stat_finish_req(struct trinity_driver *drv,
+ struct trinity_req *req)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_req *stat_req = req->stat;
+ struct trinity_stat_app *stat_app = stat_req->parent;
+
+ trinity_stat_lock(stat);
+ if (stat_app->num_active_reqs != 0)
+ stat_app->num_active_reqs--;
+ else
+ dev_err(drv_to_dev_ptr(drv),
+ "Failed to keep track of the active reqs\n");
+ trinity_stat_unlock(stat);
+}
+
+static void copy_stat_app_ioctl(struct trinity_stat_app *stat_app,
+ struct trinity_ioctl_stat_app *ioctl_stat_app)
+{
+ ioctl_stat_app->app_id = stat_app->app_id;
+ ioctl_stat_app->status = stat_app->status;
+ ioctl_stat_app->num_total_reqs = stat_app->num_total_reqs;
+ ioctl_stat_app->num_active_reqs = stat_app->num_active_reqs;
+ ioctl_stat_app->total_alloc_mem = stat_app->total_alloc_mem;
+ ioctl_stat_app->total_freed_mem = stat_app->total_freed_mem;
+
+ strncpy(ioctl_stat_app->name, stat_app->name, TASK_COMM_LEN);
+ ioctl_stat_app->name[TASK_COMM_LEN - 1] = '\x00';
+}
+
+static void copy_stat_req_ioctl(struct trinity_stat_req *stat_req,
+ struct trinity_ioctl_stat_req *ioctl_stat_req)
+{
+ ktime_t cur_time = ktime_get();
+ ktime_t submitted, scheduled, completed;
+
+ submitted = stat_req->submitted;
+ scheduled = stat_req->scheduled ? stat_req->scheduled : cur_time;
+ completed = stat_req->completed ? stat_req->completed : cur_time;
+
+ ioctl_stat_req->req_id = stat_req->req_id;
+ ioctl_stat_req->model_id = stat_req->model_id;
+ ioctl_stat_req->priority = stat_req->priority;
+ ioctl_stat_req->status = stat_req->status;
+
+ if (stat_req->priority == TRINITY_REQ_PRIORITY_HIGH)
+ ioctl_stat_req->sched_time = 0;
+ else
+ ioctl_stat_req->sched_time = TIME_DIFF(scheduled, submitted);
+ ioctl_stat_req->infer_time = TIME_DIFF(completed, scheduled);
+}
+
+/**
+ * trinity_stat_app_copy_ioctl() - Copy an application's statistics information to ioctl info
+ *
+ * @drv: an instance of the trinity driver.
+ * @ioctl_stat_app: ioctl statistics information for an application
+ */
+void trinity_stat_app_copy_ioctl(struct trinity_driver *drv,
+ struct trinity_ioctl_stat_app *ioctl_stat_app)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_app *stat_app;
+ int app_id = trinity_get_app_id();
+
+ trinity_stat_lock(stat);
+
+ stat_app = trinity_get_stat_by_id(drv, app_id);
+ if (IS_ERR_OR_NULL(stat_app)) {
+ ioctl_stat_app->app_id = app_id;
+ ioctl_stat_app->status = TRINITY_APP_STATUS_PENDING;
+ ioctl_stat_app->num_total_reqs = 0;
+ ioctl_stat_app->num_active_reqs = 0;
+ ioctl_stat_app->total_alloc_mem = 0;
+ ioctl_stat_app->total_freed_mem = 0;
+
+ strncpy(ioctl_stat_app->name, current->comm, TASK_COMM_LEN);
+ ioctl_stat_app->name[TASK_COMM_LEN - 1] = '\x00';
+ } else {
+ copy_stat_app_ioctl(stat_app, ioctl_stat_app);
+ }
+
+ trinity_stat_unlock(stat);
+}
+
+/**
+ * trinity_stat_apps_copy_ioctl() - Copy applications' statistics information to ioctl info
+ *
+ * @drv: an instance of the trinity driver.
+ * @ioctl_stat_apps: ioctl statistics information for applications
+ */
+void trinity_stat_apps_copy_ioctl(
+ struct trinity_driver *drv,
+ struct trinity_ioctl_stat_apps *ioctl_stat_apps)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_ioctl_stat_app *ioctl_stat_app;
+ struct trinity_stat_app *stat_app;
+ uint32_t idx = 0;
+
+ trinity_stat_lock(stat);
+
+ list_for_each_entry(stat_app, &stat->list, lnode) {
+ if (idx >= TRINITY_APP_STAT_MAX)
+ break;
+ ioctl_stat_app = &ioctl_stat_apps->stat[idx++];
+ copy_stat_app_ioctl(stat_app, ioctl_stat_app);
+ }
+ ioctl_stat_apps->num_apps = idx;
+
+ trinity_stat_unlock(stat);
+}
+
+/**
+ * trinity_stat_reqs_copy_ioctl() - Copy requests' statistics information to ioctl info
+ *
+ * @drv: an instance of the trinity driver.
+ * @ioctl_stat_reqs: ioctl statistics information for requests
+ */
+void trinity_stat_reqs_copy_ioctl(
+ struct trinity_driver *drv,
+ struct trinity_ioctl_stat_reqs *ioctl_stat_reqs)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_ioctl_stat_req *ioctl_stat_req;
+ struct trinity_stat_app *stat_app;
+ struct trinity_stat_req *stat_req;
+ uint32_t idx = 0;
+
+ trinity_stat_lock(stat);
+ stat_app = trinity_get_stat_by_id(drv, ioctl_stat_reqs->app_id);
+ if (IS_ERR_OR_NULL(stat_app)) {
+ ioctl_stat_reqs->num_reqs = 0;
+ trinity_stat_unlock(stat);
+ return;
+ }
+
+ list_for_each_entry(stat_req, &stat_app->reqs, list) {
+ if (idx >= TRINITY_REQ_STAT_MAX)
+ break;
+ ioctl_stat_req = &ioctl_stat_reqs->stat[idx++];
+ copy_stat_req_ioctl(stat_req, ioctl_stat_req);
+ }
+ ioctl_stat_reqs->num_reqs = idx;
+
+ trinity_stat_unlock(stat);
+}
+
+/**
+ * trinity_stat_app_total_alloc() - Append allocated size to application's total memory size
+ *
+ * @drv: an instance of the trinity driver.
+ * @size: allocated memory size
+ */
+void trinity_stat_app_total_alloc(struct trinity_driver *drv, size_t size)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_app *stat_app;
+
+ stat_app = trinity_get_stat_app(drv);
+ if (IS_ERR_OR_NULL(stat_app))
+ return;
+
+ trinity_stat_lock(stat);
+ stat_app->total_alloc_mem += size;
+ trinity_stat_unlock(stat);
+}
+
+/**
+ * trinity_stat_app_total_freed() - Append freed size to application's total freed memory size
+ *
+ * @drv: an instance of the trinity driver.
+ * @size: freed memory size
+ */
+void trinity_stat_app_total_freed(struct trinity_driver *drv, size_t size)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_app *stat_app;
+
+ stat_app = trinity_get_stat_app(drv);
+ if (IS_ERR_OR_NULL(stat_app))
+ return;
+
+ trinity_stat_lock(stat);
+ stat_app->total_freed_mem += size;
+ trinity_stat_unlock(stat);
+}
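A note on the eviction step in trinity_stat_append_req() above: once an
application holds max_stat_reqs_per_app user requests, the oldest user
request that has already finished (or failed) is dropped to make room.
The following is only a minimal userspace sketch of that selection rule,
not the driver code. The names stat_req, req_status, link_reqs() and
find_evictable() are hypothetical stand-ins, and a NULL-terminated singly
linked list is assumed instead of the kernel's circular list_head. Keep in
mind that list_next_entry() on a circular kernel list never yields NULL,
so a walk there needs an explicit end-of-list check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the driver's request status values. */
enum req_status { REQ_PENDING, REQ_RUNNING, REQ_FINISHED, REQ_ERROR };

struct stat_req {
	int req_id;
	bool is_kernel;
	enum req_status status;
	struct stat_req *next; /* oldest first, NULL-terminated */
};

/* Chain an array of requests into a list, oldest entry first. */
static void link_reqs(struct stat_req *reqs, size_t n)
{
	size_t i;

	assert(reqs != NULL && n > 0);
	for (i = 0; i + 1 < n; i++)
		reqs[i].next = &reqs[i + 1];
	reqs[n - 1].next = NULL;
}

/*
 * Return the oldest user request that has finished or failed, or NULL
 * when nothing can be evicted. Kernel requests and requests that are
 * still pending or running are skipped, and the scan stops at the end
 * of the list.
 */
static struct stat_req *find_evictable(struct stat_req *head)
{
	struct stat_req *cur;

	for (cur = head; cur != NULL; cur = cur->next) {
		if (cur->is_kernel)
			continue;
		if (cur->status == REQ_FINISHED || cur->status == REQ_ERROR)
			return cur;
	}
	return NULL;
}
```

A caller would evict the returned entry only when it is non-NULL and
otherwise keep the list as it is.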
diff --git a/drivers/misc/trinity/trinity_stat.h b/drivers/misc/trinity/trinity_stat.h
new file mode 100644
index 000000000000..6be666e4e102
--- /dev/null
+++ b/drivers/misc/trinity/trinity_stat.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Statistics header for trinity devices
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __TRINITY_STAT_H__
+#define __TRINITY_STAT_H__
+
+#include "trinity_common.h"
+
+void trinity_stat_init(struct trinity_driver *drv);
+void trinity_stat_fini(struct trinity_driver *drv);
+void trinity_stat_resize(struct trinity_driver *drv, unsigned long num_apps,
+ unsigned long num_reqs,
+ unsigned long num_reqs_per_app);
+
+void trinity_stat_lock(struct trinity_stat *stat);
+void trinity_stat_unlock(struct trinity_stat *stat);
+void trinity_destroy_stats(struct trinity_stat *stat, bool force);
+
+unsigned long trinity_stat_get_max_apps(struct trinity_driver *drv);
+unsigned long trinity_stat_get_max_reqs(struct trinity_driver *drv);
+unsigned long trinity_stat_get_max_reqs_per_app(struct trinity_driver *drv);
+
+struct trinity_stat_app *trinity_get_stat_app(struct trinity_driver *drv);
+
+void trinity_stat_app_total_alloc(struct trinity_driver *drv, size_t size);
+void trinity_stat_app_total_freed(struct trinity_driver *drv, size_t size);
+void trinity_stat_app_set_status(struct trinity_driver *drv,
+ enum trinity_app_status status);
+
+int trinity_stat_append_req(struct trinity_driver *drv,
+ struct trinity_req *req);
+void trinity_stat_remove_req(struct trinity_driver *drv,
+ struct trinity_req *req, bool rollback);
+void trinity_stat_finish_req(struct trinity_driver *drv,
+ struct trinity_req *req);
+
+void trinity_stat_app_copy_ioctl(struct trinity_driver *drv,
+ struct trinity_ioctl_stat_app *ioctl_stat_app);
+
+void trinity_stat_apps_copy_ioctl(
+ struct trinity_driver *drv,
+ struct trinity_ioctl_stat_apps *ioctl_stat_apps);
+
+void trinity_stat_reqs_copy_ioctl(
+ struct trinity_driver *drv,
+ struct trinity_ioctl_stat_reqs *ioctl_stat_reqs);
+
+#endif /* __TRINITY_STAT_H__ */
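The sysfs show handlers in the next file build their output in a single
PAGE_SIZE buffer by adding the return value of each print call to a
running offset. Because snprintf() reports the length it would have
written rather than the length it stored, both the size argument and the
accumulated offset need to be bounded by the space actually left. The
following is a minimal userspace sketch of that pattern, assuming a
hypothetical helper append_bounded() built on the standard vsnprintf().
It is not a kernel API and not part of this patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: append formatted text at offset `pos` of a buffer
 * of `size` bytes. Returns the number of characters actually stored
 * (excluding the terminating NUL), which is never more than the space
 * left, so the caller can safely add it to `pos`.
 */
static size_t append_bounded(char *buf, size_t size, size_t pos,
			     const char *fmt, ...)
{
	va_list args;
	int written;
	size_t room;

	assert(buf != NULL && size > 0);
	if (pos >= size - 1)
		return 0; /* only the terminating NUL fits */

	room = size - pos;
	va_start(args, fmt);
	written = vsnprintf(buf + pos, room, fmt, args);
	va_end(args);

	if (written < 0)
		return 0; /* encoding error: report nothing stored */
	if ((size_t)written >= room)
		return room - 1; /* output was truncated */
	return (size_t)written;
}
```

With this shape the running offset can never exceed size - 1, whatever
the individual messages contain.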
diff --git a/drivers/misc/trinity/trinity_sysfs.c b/drivers/misc/trinity/trinity_sysfs.c
new file mode 100644
index 000000000000..bdf630b04222
--- /dev/null
+++ b/drivers/misc/trinity/trinity_sysfs.c
@@ -0,0 +1,864 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Sysfs interface for Samsung Research Trinity device family
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2020 Wook Song <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/device.h>
+#include <linux/sysfs.h>
+
+#include "sched/sched.h"
+#include "trinity_common.h"
+#include "trinity_stat.h"
+
+enum trinity_sysfs_msg {
+ SYSFS_MSG_NORMAL = 0,
+ SYSFS_MSG_PROLOGUE,
+ SYSFS_MSG_EPILOGUE,
+ SYSFS_MSG_EMIT,
+};
+
+static ssize_t verbose_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ int32_t ret = 0;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ ret = kstrtoul(buf, 10, &drv->verbose);
+ if (ret != 0)
+ return -EINVAL;
+
+ return (ssize_t)count;
+}
+
+static ssize_t verbose_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return 0;
+
+ return snprintf(buf, PAGE_SIZE, "%lu\n", drv->verbose);
+}
+static DEVICE_ATTR_RW(verbose);
+
+static ssize_t debugfs_max_store(struct device *dev,
+ struct device_attribute *attr, const char *buf,
+ size_t count)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ unsigned long msg_max;
+ int32_t ret = 0;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ ret = kstrtoul(buf, 10, &msg_max);
+ if (ret != 0)
+ return -EINVAL;
+
+ trinity_debug_clear(drv, msg_max);
+
+ return (ssize_t)count;
+}
+
+static ssize_t debugfs_max_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return 0;
+
+ return snprintf(buf, PAGE_SIZE, "%lu\n", trinity_debug_get_max(drv));
+}
+static DEVICE_ATTR_RW(debugfs_max);
+
+static ssize_t show_profile_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ unsigned long req_id;
+ int32_t ret = 0;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ ret = kstrtoul(buf, 10, &req_id);
+ if (ret != 0)
+ return -EINVAL;
+
+ if (drv->desc->show_profile)
+ drv->desc->show_profile(drv, (int)req_id);
+
+ return (ssize_t)count;
+}
+static DEVICE_ATTR_WO(show_profile);
+
+static ssize_t idu_version_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return 0;
+
+ if (drv->desc->idu_version) {
+ uint32_t major, minor, extra;
+
+ if (drv->desc->idu_version(drv, &major, &minor, &extra) == 0)
+ return snprintf(buf, PAGE_SIZE, "v%u.%u.%u\n", major,
+ minor, extra);
+ }
+
+ return snprintf(buf, PAGE_SIZE,
+ "Unknown... v0.30.7 or higher version required.\n");
+}
+static DEVICE_ATTR_RO(idu_version);
+
+static struct attribute *trinity_attrs_debug[] = {
+ &dev_attr_verbose.attr, &dev_attr_debugfs_max.attr,
+ &dev_attr_show_profile.attr, &dev_attr_idu_version.attr, NULL
+};
+
+/* e.g., /sys/devices/platform/304f0000.triv2/debug/ */
+static struct attribute_group trinity_attrs_debug_group = {
+ .name = "debug",
+ .attrs = trinity_attrs_debug
+};
+
+static ssize_t max_stat_apps_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ unsigned long val;
+ int32_t ret = 0;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ ret = kstrtoul(buf, 10, &val);
+ if (ret != 0)
+ return -EINVAL;
+
+ trinity_stat_resize(drv, val, 0, 0);
+
+ return (ssize_t)count;
+}
+
+static ssize_t max_stat_apps_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return 0;
+
+ return snprintf(buf, PAGE_SIZE, "%lu\n",
+ trinity_stat_get_max_apps(drv));
+}
+static DEVICE_ATTR_RW(max_stat_apps);
+
+static ssize_t max_stat_reqs_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ unsigned long val;
+ int32_t ret = 0;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ ret = kstrtoul(buf, 10, &val);
+ if (ret != 0)
+ return -EINVAL;
+
+ trinity_stat_resize(drv, 0, val, 0);
+
+ return (ssize_t)count;
+}
+
+static ssize_t max_stat_reqs_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return 0;
+
+ return snprintf(buf, PAGE_SIZE, "%lu\n",
+ trinity_stat_get_max_reqs(drv));
+}
+static DEVICE_ATTR_RW(max_stat_reqs);
+
+static ssize_t max_stat_reqs_per_app_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ unsigned long val;
+ int32_t ret = 0;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ ret = kstrtoul(buf, 10, &val);
+ if (ret != 0)
+ return -EINVAL;
+
+ trinity_stat_resize(drv, 0, 0, val);
+
+ return (ssize_t)count;
+}
+
+static ssize_t max_stat_reqs_per_app_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return 0;
+
+ return snprintf(buf, PAGE_SIZE, "%lu\n",
+ trinity_stat_get_max_reqs_per_app(drv));
+}
+static DEVICE_ATTR_RW(max_stat_reqs_per_app);
+
+static ssize_t mem_usage_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ struct trinity_stat_app *stat_app;
+ ssize_t pos = 0;
+ bool first = true;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return 0;
+
+ trinity_stat_lock(&drv->stat);
+
+ list_for_each_entry(stat_app, &drv->stat.list, lnode) {
+ if (first) {
+ pos += scnprintf(
+ buf + pos, PAGE_SIZE - pos,
+ "Memory usage statistics for all opened devices\n");
+ first = false;
+ }
+
+ pos += scnprintf(
+ buf + pos, PAGE_SIZE - pos,
+ " [%d] total_alloc: %llu bytes, total_freed: %llu bytes\n",
+ stat_app->app_id, stat_app->total_alloc_mem,
+ stat_app->total_freed_mem);
+ }
+
+ if (first)
+ pos += scnprintf(buf + pos, PAGE_SIZE - pos, "No active devices\n");
+
+ trinity_stat_unlock(&drv->stat);
+
+ return pos;
+}
+static DEVICE_ATTR_RO(mem_usage);
+
+#define MODEL_REGISTERED_PROLOGUE \
+ "\n Model statistics registered in all opened devices\n" \
+ "+--------------+--------------+-----------+------------+\n" \
+ "| Model ID | Model Size | Dmabuf FD | Offset |\n" \
+ "+--------------+--------------+-----------+------------+\n"
+#define MODEL_REGISTERED_NORMAL "| %#12llx | %#12llx | %9d | %#10llx |\n"
+#define MODEL_REGISTERED_EPILOGUE \
+ "+--------------+--------------+-----------+------------+\n"
+
+static ssize_t print_registered_models(const struct trinity_model *model,
+ char *buf, enum trinity_sysfs_msg msg)
+{
+ ssize_t pos = 0;
+
+ switch (msg) {
+ case SYSFS_MSG_PROLOGUE:
+ pos = snprintf(buf, PAGE_SIZE, MODEL_REGISTERED_PROLOGUE);
+ break;
+ case SYSFS_MSG_NORMAL:
+ pos = snprintf(buf, PAGE_SIZE, MODEL_REGISTERED_NORMAL,
+ model->config.id, model->config.program_size,
+ model->config.dbuf_fd,
+ model->config.program_offset_addr);
+ break;
+ case SYSFS_MSG_EPILOGUE:
+ pos = snprintf(buf, PAGE_SIZE, MODEL_REGISTERED_EPILOGUE);
+ break;
+ default:
+ break;
+ }
+
+ return pos;
+}
+
+static ssize_t registered_models_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ struct trinity_model_htable ht;
+ struct trinity_model *model;
+ struct hlist_bl_node *hn;
+ ssize_t pos;
+ int i, num_printed = 0;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return 0;
+
+ trinity_init_model_htable(drv, &ht);
+
+ pos = print_registered_models(NULL, buf, SYSFS_MSG_PROLOGUE);
+
+ for (i = 0; i < ht.hash_size; i++) {
+ hlist_bl_lock(&(ht.ht_heads[i]));
+ hlist_bl_for_each_entry(model, hn, &(ht.ht_heads[i]), hnode) {
+ pos += print_registered_models(model, buf + pos,
+ SYSFS_MSG_NORMAL);
+ num_printed++;
+ }
+ hlist_bl_unlock(&(ht.ht_heads[i]));
+ }
+
+ if (num_printed > 0)
+ pos += print_registered_models(NULL, buf + pos,
+ SYSFS_MSG_EPILOGUE);
+
+ return pos;
+}
+static DEVICE_ATTR_RO(registered_models);
+
+static const char *priority_to_string(enum trinity_req_priority priority)
+{
+ static const char *const priority_strings[] = {
+ [TRINITY_REQ_PRIORITY_LOW] = "Low",
+ [TRINITY_REQ_PRIORITY_MID] = "Mid",
+ [TRINITY_REQ_PRIORITY_HIGH] = "High",
+ };
+ return priority_strings[priority];
+}
+
+static const char *status_to_string(enum trinity_req_status status)
+{
+ static const char *const status_strings[] = {
+ [TRINITY_REQ_STATUS_UNKNOWN] = "Unknown",
+ [TRINITY_REQ_STATUS_ERROR] = "Error",
+ [TRINITY_REQ_STATUS_PENDING] = "Pending",
+ [TRINITY_REQ_STATUS_RUNNING] = "Running",
+ [TRINITY_REQ_STATUS_FINISHED] = "Finished",
+ };
+ return status_strings[status];
+}
+
+#define APP_STATUS_LENGTH (77)
+#define USER_APP_STATUS_PROLOGUE \
+ "\n\tUser-level request statistics running in %s\n" \
+ "+-------+--------+----------+------+----------+--------------+-------------+\n" \
+ "| PID | Req ID | Model ID | Prio | Status | Sched (us) | Infer (us) |\n" \
+ "+-------+--------+----------+------+----------+--------------+-------------+\n"
+#define USER_APP_STATUS_NORMAL \
+ "| %5d | %6d | %#8llx | %4s | %8s | %12lld | %11lld |\n"
+#define USER_APP_STATUS_EMIT \
+ "| ... (omitted) ... |\n"
+#define USER_APP_STATUS_EPILOGUE \
+ "+-------+--------+----------+------+----------+--------------+-------------+\n"
+
+static ssize_t print_user_app_status(struct device *dev,
+ const struct trinity_stat_req *req,
+ char *buf, enum trinity_sysfs_msg msg)
+{
+ ssize_t pos = 0;
+
+ switch (msg) {
+ case SYSFS_MSG_PROLOGUE:
+ pos = snprintf(buf, APP_STATUS_LENGTH * 4 + 1,
+ USER_APP_STATUS_PROLOGUE, dev_name(dev));
+ break;
+ case SYSFS_MSG_NORMAL: {
+ ktime_t cur_time = ktime_get();
+ ktime_t submitted = req->submitted;
+ ktime_t scheduled = req->scheduled ? req->scheduled : cur_time;
+ ktime_t completed = req->completed ? req->completed : cur_time;
+
+ int64_t sched_diff = TIME_DIFF_US(scheduled, submitted);
+ int64_t infer_diff = TIME_DIFF_US(completed, scheduled);
+
+ if (req->status == TRINITY_REQ_STATUS_ERROR) {
+ sched_diff = 0;
+ infer_diff = 0;
+ }
+
+ pos = snprintf(buf, APP_STATUS_LENGTH + 1,
+ USER_APP_STATUS_NORMAL, req->app_id, req->req_id,
+ req->model_id, priority_to_string(req->priority),
+ status_to_string(req->status), sched_diff,
+ infer_diff);
+ } break;
+ case SYSFS_MSG_EMIT:
+ pos = snprintf(buf, APP_STATUS_LENGTH + 1,
+ USER_APP_STATUS_EMIT);
+ break;
+ case SYSFS_MSG_EPILOGUE:
+ pos = snprintf(buf, APP_STATUS_LENGTH + 1,
+ USER_APP_STATUS_EPILOGUE);
+ break;
+ default:
+ break;
+ }
+
+ return pos;
+}
+
+#define KERNEL_APP_STATUS_PROLOGUE \
+ "\n\tKernel-level request statistics running in %s\n" \
+ "+-------+--------+----------+------+----------+------------+---------------+\n" \
+ "| PID | Req ID | Model ID | Prio | Status | # Runs | Avg. Lat (us) |\n" \
+ "+-------+--------+----------+------+----------+------------+---------------+\n"
+#define KERNEL_APP_STATUS_NORMAL \
+ "| %5d | %6d | %#8llx | %4s | %8s | %10u | %13u |\n"
+#define KERNEL_APP_STATUS_EMIT \
+ "| ... (omitted) ... |\n"
+#define KERNEL_APP_STATUS_EPILOGUE \
+ "+-------+--------+----------+------+----------+------------+---------------+\n"
+
+static ssize_t print_kernel_app_status(struct device *dev,
+ const struct trinity_stat_req *req,
+ char *buf, enum trinity_sysfs_msg msg)
+{
+ ssize_t pos = 0;
+
+ switch (msg) {
+ case SYSFS_MSG_PROLOGUE:
+ pos = snprintf(buf, APP_STATUS_LENGTH * 4 + 1,
+ KERNEL_APP_STATUS_PROLOGUE, dev_name(dev));
+ break;
+ case SYSFS_MSG_NORMAL: {
+ uint32_t avg_latency = 0;
+
+ if (req->num_runs > 0)
+ avg_latency = req->total_time / req->num_runs;
+
+ pos = snprintf(buf, APP_STATUS_LENGTH + 1,
+ KERNEL_APP_STATUS_NORMAL, req->app_id,
+ req->req_id, req->model_id,
+ priority_to_string(req->priority),
+ status_to_string(req->status), req->num_runs,
+ avg_latency);
+ } break;
+ case SYSFS_MSG_EMIT:
+ pos = snprintf(buf, APP_STATUS_LENGTH + 1,
+ KERNEL_APP_STATUS_EMIT);
+ break;
+ case SYSFS_MSG_EPILOGUE:
+ pos = snprintf(buf, APP_STATUS_LENGTH + 1,
+ KERNEL_APP_STATUS_EPILOGUE);
+ break;
+ default:
+ break;
+ }
+
+ return pos;
+}
+
+static ssize_t app_status_user_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ struct trinity_stat_app *stat_app;
+ struct trinity_stat_req *stat_req;
+ int num_printed = 0;
+ ssize_t pos;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return 0;
+
+ pos = print_user_app_status(dev, NULL, buf, SYSFS_MSG_PROLOGUE);
+
+ trinity_stat_lock(&drv->stat);
+ list_for_each_entry(stat_app, &drv->stat.list, lnode) {
+ list_for_each_entry(stat_req, &stat_app->reqs, list) {
+ if (stat_req->is_kernel)
+ continue;
+
+ pos += print_user_app_status(dev, stat_req, buf + pos,
+ SYSFS_MSG_NORMAL);
+ num_printed++;
+
+ /* buffer size limit: PAGE_SIZE (also need reserved bytes) */
+ if (pos + APP_STATUS_LENGTH >
+ PAGE_SIZE - 2 * APP_STATUS_LENGTH) {
+ pos += print_user_app_status(
+ dev, NULL, buf + pos, SYSFS_MSG_EMIT);
+ /* clear old stats */
+ trinity_destroy_stats(&drv->stat, true);
+ goto out;
+ }
+ }
+ }
+out:
+ trinity_stat_unlock(&drv->stat);
+
+ if (num_printed > 0)
+ pos += print_user_app_status(dev, NULL, buf + pos,
+ SYSFS_MSG_EPILOGUE);
+
+ return pos;
+}
+static DEVICE_ATTR_RO(app_status_user);
+
+static ssize_t app_status_kernel_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ struct trinity_stat_app *stat_app;
+ struct trinity_stat_req *stat_req;
+ int num_printed = 0;
+ ssize_t pos;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return 0;
+
+ pos = print_kernel_app_status(dev, NULL, buf, SYSFS_MSG_PROLOGUE);
+
+ trinity_stat_lock(&drv->stat);
+ list_for_each_entry(stat_app, &drv->stat.list, lnode) {
+ list_for_each_entry(stat_req, &stat_app->reqs, list) {
+ if (!stat_req->is_kernel)
+ continue;
+
+ pos += print_kernel_app_status(dev, stat_req, buf + pos,
+ SYSFS_MSG_NORMAL);
+ num_printed++;
+
+ /* buffer size limit: PAGE_SIZE (also need reserved bytes) */
+ if (pos + APP_STATUS_LENGTH >
+ PAGE_SIZE - 2 * APP_STATUS_LENGTH) {
+ pos += print_kernel_app_status(
+ dev, NULL, buf + pos, SYSFS_MSG_EMIT);
+ /* clear old stats */
+ trinity_destroy_stats(&drv->stat, true);
+ goto out;
+ }
+ }
+ }
+out:
+ trinity_stat_unlock(&drv->stat);
+
+ if (num_printed > 0)
+ pos += print_kernel_app_status(dev, NULL, buf + pos,
+ SYSFS_MSG_EPILOGUE);
+
+ return pos;
+}
+static DEVICE_ATTR_RO(app_status_kernel);
+
+static ssize_t num_total_reqs_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ struct trinity_stat_app *stat_app;
+ uint32_t num_total_reqs = 0;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return 0;
+
+ trinity_stat_lock(&drv->stat);
+
+ list_for_each_entry(stat_app, &drv->stat.list, lnode) {
+ num_total_reqs += stat_app->num_total_reqs;
+ }
+
+ trinity_stat_unlock(&drv->stat);
+
+ return sysfs_emit(buf, "%u\n", num_total_reqs);
+}
+static DEVICE_ATTR_RO(num_total_reqs);
+
+static ssize_t num_active_reqs_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ struct trinity_stat_app *stat_app;
+ uint32_t num_active_reqs = 0;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return 0;
+
+ trinity_stat_lock(&drv->stat);
+
+ list_for_each_entry(stat_app, &drv->stat.list, lnode) {
+ num_active_reqs += stat_app->num_active_reqs;
+ }
+
+ trinity_stat_unlock(&drv->stat);
+
+ return sysfs_emit(buf, "%u\n", num_active_reqs);
+}
+static DEVICE_ATTR_RO(num_active_reqs);
+
+static struct attribute *trinity_attrs_stat[] = {
+ &dev_attr_max_stat_apps.attr, &dev_attr_max_stat_reqs.attr,
+ &dev_attr_max_stat_reqs_per_app.attr, &dev_attr_mem_usage.attr,
+ &dev_attr_registered_models.attr, &dev_attr_app_status_user.attr,
+ &dev_attr_app_status_kernel.attr, &dev_attr_num_total_reqs.attr,
+ &dev_attr_num_active_reqs.attr, NULL
+};
+
+/* e.g., /sys/devices/platform/304f0000.triv2/stat/ */
+static struct attribute_group trinity_attrs_stat_group = {
+ .name = "stat",
+ .attrs = trinity_attrs_stat
+};
+
+static ssize_t stop_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ unsigned long stop;
+ int32_t ret = 0;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return -ENODEV;
+
+ ret = kstrtoul(buf, 10, &stop);
+ if (ret != 0)
+ return ret;
+
+ if (stop == 1 && drv->desc->stop_reqs)
+ schedule_work(&drv->work_stop);
+
+ return (ssize_t)count;
+}
+
+static DEVICE_ATTR_WO(stop);
+
+static ssize_t idu_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ char dirpath[NAME_MAX];
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return -ENODEV;
+
+ strscpy(dirpath, buf, sizeof(dirpath));
+ /* remove newline if exists */
+ dirpath[strcspn(dirpath, "\n")] = '\x00';
+
+ mutex_lock(&drv->lock);
+ drv->desc->idu_load(drv, dirpath, true);
+ mutex_unlock(&drv->lock);
+
+ return (ssize_t)count;
+}
+
+static DEVICE_ATTR_WO(idu);
+
+static ssize_t suspend_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ unsigned long suspend;
+
+ if (kstrtoul(buf, 10, &suspend) != 0)
+ return -EINVAL;
+
+ /* Note: this interface is for testing purposes only */
+ if (suspend == 1) {
+ const struct dev_pm_ops *ops = dev->driver->pm;
+
+ ops->runtime_suspend(dev);
+ }
+
+ return (ssize_t)count;
+}
+
+static DEVICE_ATTR_WO(suspend);
+
+static ssize_t resume_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ unsigned long resume;
+
+ if (kstrtoul(buf, 10, &resume) != 0)
+ return -EINVAL;
+
+ /* Note: this interface is for testing purposes only */
+ if (resume == 1) {
+ const struct dev_pm_ops *ops = dev->driver->pm;
+
+ ops->runtime_resume(dev);
+ }
+
+ return (ssize_t)count;
+}
+
+static DEVICE_ATTR_WO(resume);
+
+static ssize_t profile_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ unsigned long profile;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return -ENODEV;
+
+ if (kstrtoul(buf, 10, &profile) != 0)
+ return -EINVAL;
+
+ /* Note: this interface is for testing purposes only */
+ if (drv->desc->init_profile)
+ drv->desc->init_profile(drv, profile);
+
+ return (ssize_t)count;
+}
+
+static DEVICE_ATTR_WO(profile);
+
+static ssize_t reset_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct platform_device *pdev;
+ struct trinity_driver *drv;
+ unsigned long reset;
+
+ pdev = container_of(dev, struct platform_device, dev);
+ drv = platform_get_drvdata(pdev);
+
+ if (drv == NULL)
+ return -ENODEV;
+
+ if (kstrtoul(buf, 10, &reset) != 0)
+ return -EINVAL;
+
+ if (reset == 1 && drv->desc->reset)
+ drv->desc->reset(drv);
+
+ return (ssize_t)count;
+}
+
+static DEVICE_ATTR_WO(reset);
+
+static struct attribute *trinity_attrs_control[] = { &dev_attr_stop.attr,
+ &dev_attr_idu.attr,
+ &dev_attr_suspend.attr,
+ &dev_attr_resume.attr,
+ &dev_attr_profile.attr,
+ &dev_attr_reset.attr,
+ NULL };
+
+/* e.g., /sys/devices/platform/304f0000.triv2/control/ */
+static struct attribute_group trinity_attrs_control_group = {
+ .name = "control",
+ .attrs = trinity_attrs_control
+};
+
+static const struct attribute_group *trinity_attrs_groups[] = {
+ &trinity_attrs_debug_group, &trinity_attrs_stat_group,
+ &trinity_attrs_control_group, NULL
+};
+
+int trinity_sysfs_init(struct trinity_driver *drv)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+ int err;
+
+ err = sysfs_create_groups(&dev->kobj, trinity_attrs_groups);
+ if (err < 0) {
+ dev_err(dev, "failed to create sysfs groups\n");
+ return err;
+ }
+
+ return 0;
+}
+
+int trinity_sysfs_cleanup(struct trinity_driver *drv)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+
+ sysfs_remove_groups(&dev->kobj, trinity_attrs_groups);
+
+ return 0;
+}
--
2.25.1

2022-07-25 06:55:41

by Jiho Chu

Subject: [PATCH 8/9] trinity: Add trace module

This patch adds the trace event declarations.

It introduces the 'trinity' trace system (TRACE_SYSTEM) with several
tracepoints. They trace each ioctl command, CP wakeup, IRQ handling,
and run trigger.
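
For reference, here is a minimal userspace sketch of how these events
might be switched on through tracefs. It assumes tracefs is mounted at
/sys/kernel/tracing and that the event directory is named "trinity",
following TRACE_SYSTEM in this patch. The helper names
trinity_event_enable_path() and trinity_event_set() are hypothetical
and not part of the driver.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "<root>/events/trinity/enable" (event == NULL, all events) or
 * "<root>/events/trinity/<event>/enable" (a single event).
 * Returns 0 on success, -1 if the path does not fit in 'out'.
 * Assumption: the "trinity" directory name follows TRACE_SYSTEM.
 */
int trinity_event_enable_path(char *out, size_t len,
			      const char *tracefs_root, const char *event)
{
	int n;

	if (event)
		n = snprintf(out, len, "%s/events/trinity/%s/enable",
			     tracefs_root, event);
	else
		n = snprintf(out, len, "%s/events/trinity/enable",
			     tracefs_root);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write "1" or "0" to an enable file.
 * Returns 0 on success, -1 on failure (e.g. missing file, no permission).
 */
int trinity_event_set(const char *path, int on)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fputs(on ? "1" : "0", f) >= 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A caller would build the path with, e.g., the root
"/sys/kernel/tracing" and the event "triv2_run_trigger", then pass it
to trinity_event_set(path, 1). Writing the file normally requires root.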

Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
drivers/misc/trinity/Makefile | 2 +-
drivers/misc/trinity/trinity.c | 82 ++++-
drivers/misc/trinity/trinity_trace.c | 15 +
drivers/misc/trinity/trinity_trace.h | 406 +++++++++++++++++++++
drivers/misc/trinity/trinity_vision2_drv.c | 11 +-
5 files changed, 513 insertions(+), 3 deletions(-)
create mode 100644 drivers/misc/trinity/trinity_trace.c
create mode 100644 drivers/misc/trinity/trinity_trace.h

diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
index 22141e2233e8..3b546c0f303d 100644
--- a/drivers/misc/trinity/Makefile
+++ b/drivers/misc/trinity/Makefile
@@ -6,7 +6,7 @@ trinity-y := trinity.o
trinity-y += trinity_resv_mem.o trinity_hwmem.o
trinity-y += sched/core.o sched/priority.o
trinity-y += trinity_pm.o
-trinity-y += trinity_debug.o
+trinity-y += trinity_debug.o trinity_trace.o
trinity-y += trinity_sysfs.o trinity_stat.o

trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
index 08d15f08da39..b7d6bdcd51d1 100644
--- a/drivers/misc/trinity/trinity.c
+++ b/drivers/misc/trinity/trinity.c
@@ -35,6 +35,7 @@
#include "trinity_common.h"
#include "trinity_resv_mem.h"
#include "trinity_stat.h"
+#include "trinity_trace.h"

#define BASE_DEV_NAME "trinity"

@@ -448,6 +449,9 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
if (copy_to_user((uint32_t __user *)arg, &(desc->ver),
sizeof((desc->ver))))
return -EFAULT;
+
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_GET_VERSION");
break;
}
case TRINITY_IOCTL_GET_API_LEVEL: {
@@ -456,6 +460,9 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
if (copy_to_user((uint32_t __user *)arg, &api_level,
sizeof(api_level)))
return -EFAULT;
+
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_GET_API_LEVEL");
break;
}
case TRINITY_IOCTL_GET_STATE: {
@@ -465,18 +472,29 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
if (copy_to_user((enum trinity_state __user *)arg, &ready,
sizeof(ready)))
return -EFAULT;
+
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_GET_STATE");
break;
}
case TRINITY_IOCTL_GET_TOPS: {
if (copy_to_user((uint32_t __user *)arg, &(drv->tops),
sizeof((drv->tops))))
return -EFAULT;
+
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_GET_TOPS");
+
break;
}
case TRINITY_IOCTL_GET_DSPM: {
if (copy_to_user((uint32_t __user *)arg, &(drv->dspm),
sizeof((drv->dspm))))
return -EFAULT;
+
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_GET_DSPM");
+
break;
}
case TRINITY_IOCTL_GET_NEXT_REQUEST: {
@@ -485,6 +503,10 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
if (copy_to_user((int32_t __user *)arg, &req_id,
sizeof(req_id)))
return -EFAULT;
+
+ trace_trinity_ioctl_next_req(drv->dev_id, trinity_get_app_id(),
+ req_id);
+
break;
}
case TRINITY_IOCTL_HWMEM_ALLOC: {
@@ -497,6 +519,10 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
hwmem.type);
if (err >= 0)
trinity_stat_app_total_alloc(drv, hwmem.size);
+
+ trace_trinity_ioctl_hwmem_alloc(
+ drv->dev_id, trinity_get_app_id(), hwmem.size, err);
+
break;
}
case TRINITY_IOCTL_HWMEM_DEALLOC: {
@@ -513,6 +539,10 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
err = trinity_hwmem_free(drv_to_dev_ptr(drv), hwmem.dbuf_fd);
if (err == 0)
trinity_stat_app_total_freed(drv, dbuf->size);
+
+ trace_trinity_ioctl_hwmem_dealloc(
+ drv->dev_id, trinity_get_app_id(), hwmem.dbuf_fd);
+
break;
}
case TRINITY_IOCTL_REGISTER_MODEL: {
@@ -536,6 +566,18 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
if (copy_to_user((struct trinity_model __user *)arg,
&model->config, sizeof(model->config)))
return -EFAULT;
+
+ trace_trinity_ioctl_register_model(
+ drv->dev_id, trinity_get_app_id(), model->config.id,
+ model->config.dbuf_fd,
+ model->config.program_offset_addr,
+ model->config.program_size);
+
+ trace_trinity_ioctl_register_model_drv_ver2(
+ model->config.metadata_dbuf_fd,
+ model->config.metadata_ext_dbuf_fd,
+ model->config.metadata_ext_size);
+
break;
}
case TRINITY_IOCTL_DEREGISTER_MODEL: {
@@ -545,6 +587,10 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
return -EFAULT;

err = trinity_deregister_model(drv, id);
+
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_DEREGISTER_MODEL");
+
break;
}
case TRINITY_IOCTL_RUN_INPUT: {
@@ -572,6 +618,15 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
return err;
}

+ trace_trinity_ioctl_run_input(drv->dev_id, trinity_get_app_id(),
+ input->config.dbuf_fd,
+ input->config.model_id);
+
+ trace_trinity_ioctl_run_input_drv_ver2(
+ input->config.timeout_ms, input->config.priority,
+ input->config.num_segments, input->config.input_mode,
+ input->config.output_mode);
+
if (copy_to_user((struct trinity_input __user *)arg,
&input->config, sizeof(input->config))) {
drv->desc->dealloc_req(drv, req);
@@ -585,8 +640,16 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
break;
}
case TRINITY_IOCTL_STOP_REQUESTS: {
- if (drv->desc->stop_reqs)
+ if (drv->desc->stop_reqs) {
schedule_work(&drv->work_stop);
+ trace_trinity_ioctl_msg(drv->dev_id,
+ trinity_get_app_id(),
+ "TRINITY_IOCTL_STOP_REQUESTS");
+ } else {
+ trace_trinity_ioctl_msg(
+ drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_STOP_REQUESTS: not supported");
+ }
break;
}
case TRINITY_IOCTL_STAT_CURRENT_APP: {
@@ -602,6 +665,9 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
if (copy_to_user((struct trinity_ioctl_stat_app __user *)arg,
&ioctl_stat_app, sizeof(ioctl_stat_app)))
return -EACCES;
+
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_STAT_CURRENT_APP");
break;
}
case TRINITY_IOCTL_STAT_APPS: {
@@ -617,6 +683,9 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
if (copy_to_user((struct trinity_ioctl_stat_apps __user *)arg,
&ioctl_stat_apps, sizeof(ioctl_stat_apps)))
return -EACCES;
+
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_STAT_APPS");
break;
}
case TRINITY_IOCTL_STAT_REQS: {
@@ -635,6 +704,9 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
if (copy_to_user((struct trinity_ioctl_stat_reqs __user *)arg,
&ioctl_stat_reqs, sizeof(ioctl_stat_reqs)))
return -EACCES;
+
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_STAT_REQS");
break;
}
case TRINITY_IOCTL_GET_PROFILE_META: {
@@ -659,6 +731,11 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
if (copy_to_user((struct trinity_ioctl_profile_meta __user *)arg,
&profile, sizeof(profile)))
return -EACCES;
+
+ trace_trinity_ioctl_get_profile_meta(drv->dev_id,
+ trinity_get_app_id(),
+ profile.req_id,
+ profile.profile_size);
break;
}
case TRINITY_IOCTL_GET_PROFILE_BUFF: {
@@ -677,6 +754,9 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
&profile, sizeof(profile)))
return -EACCES;

+ trace_trinity_ioctl_get_profile_buff(
+ drv->dev_id, trinity_get_app_id(), profile.req_id,
+ profile.profile_pos, profile.profile_size);
break;
}
case TRINITY_IOCTL_FPGA_MEMCPY: {
diff --git a/drivers/misc/trinity/trinity_trace.c b/drivers/misc/trinity/trinity_trace.c
new file mode 100644
index 000000000000..d5721273eeb1
--- /dev/null
+++ b/drivers/misc/trinity/trinity_trace.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/**
+ * Trace source for trinity devices
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __CHECKER__
+#define CREATE_TRACE_POINTS
+#include "trinity_trace.h"
+#endif
diff --git a/drivers/misc/trinity/trinity_trace.h b/drivers/misc/trinity/trinity_trace.h
new file mode 100644
index 000000000000..fd87f090b73d
--- /dev/null
+++ b/drivers/misc/trinity/trinity_trace.h
@@ -0,0 +1,406 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/**
+ * Trace header for trinity devices
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#if !defined(__TRINITY_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
+#define __TRINITY_TRACE_H__
+
+#include <linux/tracepoint.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM trinity
+#define TRACE_INCLUDE_FILE trinity_trace
+
+// clang-format off
+TRACE_EVENT(triv2_run_trigger,
+ TP_PROTO(u32 device_id, s32 slot),
+ TP_ARGS(device_id, slot),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, slot)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->slot = slot;
+ ),
+ TP_printk("device_id=%u slot=%d",
+ __entry->device_id,
+ __entry->slot)
+);
+TRACE_EVENT(triv2_wakeup_cp,
+ TP_PROTO(u32 device_id),
+ TP_ARGS(device_id),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ ),
+ TP_printk("device_id=%u",
+ __entry->device_id)
+);
+TRACE_EVENT(triv2_handle_irq,
+ TP_PROTO(u32 device_id, s32 irq),
+ TP_ARGS(device_id, irq),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, irq)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->irq = irq;
+ ),
+ TP_printk("device_id=%u irq=%d",
+ __entry->device_id,
+ __entry->irq)
+);
+TRACE_EVENT(triv2_handle_threaded_irq,
+ TP_PROTO(u32 device_id, s32 irq),
+ TP_ARGS(device_id, irq),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, irq)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->irq = irq;
+ ),
+ TP_printk("device_id=%u irq=%d",
+ __entry->device_id,
+ __entry->irq)
+);
+TRACE_EVENT(triv2_handle_cmd_done,
+ TP_PROTO(u32 device_id, s32 slot, u32 cycles, u32 time),
+ TP_ARGS(device_id, slot, cycles, time),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, slot)
+ __field(u32, cycles)
+ __field(u32, time)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->slot = slot;
+ __entry->cycles = cycles;
+ __entry->time = time;
+ ),
+ TP_printk("device_id=%u slot=%d cycles=%u time(us)=%u",
+ __entry->device_id,
+ __entry->slot,
+ __entry->cycles,
+ __entry->time)
+);
+TRACE_EVENT(triv2_map_sched_data,
+ TP_PROTO(u32 device_id, s32 slot, u32 batch_size, u32 in_cnt, u32 out_cnt),
+ TP_ARGS(device_id, slot, batch_size, in_cnt, out_cnt),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, slot)
+ __field(u32, batch_size)
+ __field(u32, in_cnt)
+ __field(u32, out_cnt)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->slot = slot;
+ __entry->batch_size = batch_size;
+ __entry->in_cnt = in_cnt;
+ __entry->out_cnt = out_cnt;
+ ),
+ TP_printk("device_id=%u slot=%d batch_size=%u in_cnt=%u out_cnt=%u",
+ __entry->device_id,
+ __entry->slot,
+ __entry->batch_size,
+ __entry->in_cnt,
+ __entry->out_cnt)
+);
+TRACE_EVENT(triv2_unmap_sched_data,
+ TP_PROTO(u32 device_id, s32 slot),
+ TP_ARGS(device_id, slot),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, slot)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->slot = slot;
+ ),
+ TP_printk("device_id=%u slot=%d",
+ __entry->device_id,
+ __entry->slot)
+);
+TRACE_EVENT(trinity_ioctl_msg,
+ TP_PROTO(u32 device_id, s32 app_id, char *msg),
+ TP_ARGS(device_id, app_id, msg),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(char*, msg)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->msg = msg;
+ ),
+ TP_printk("device_id=%u app_id=%d msg=%s",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->msg)
+);
+TRACE_EVENT(trinity_ioctl_next_req,
+ TP_PROTO(u32 device_id, s32 app_id, s32 req_id),
+ TP_ARGS(device_id, app_id, req_id),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s32, req_id)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->req_id = req_id;
+ ),
+ TP_printk("device_id=%u app_id=%d req_id=%d",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->req_id)
+);
+TRACE_EVENT(trinity_ioctl_stop_req,
+ TP_PROTO(u32 device_id, s32 app_id, s32 req_id),
+ TP_ARGS(device_id, app_id, req_id),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s32, req_id)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->req_id = req_id;
+ ),
+ TP_printk("device_id=%u app_id=%d req_id=%d",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->req_id)
+);
+TRACE_EVENT(trinity_ioctl_hwmem_alloc,
+ TP_PROTO(u32 device_id, s32 app_id, s64 size, s32 dbuf_fd),
+ TP_ARGS(device_id, app_id, size, dbuf_fd),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s64, size)
+ __field(s32, dbuf_fd)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->size = size;
+ __entry->dbuf_fd = dbuf_fd;
+ ),
+ TP_printk("device_id=%u app_id=%d size=%lld dbuf_fd=%d",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->size,
+ __entry->dbuf_fd)
+);
+TRACE_EVENT(trinity_ioctl_hwmem_dealloc,
+ TP_PROTO(u32 device_id, s32 app_id, s32 dbuf_fd),
+ TP_ARGS(device_id, app_id, dbuf_fd),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s32, dbuf_fd)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->dbuf_fd = dbuf_fd;
+ ),
+ TP_printk("device_id=%u app_id=%d dbuf_fd=%d",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->dbuf_fd)
+);
+TRACE_EVENT(trinity_ioctl_get_profile_meta,
+ TP_PROTO(u32 device_id, s32 app_id, s32 req_id, u32 profile_size),
+ TP_ARGS(device_id, app_id, req_id, profile_size),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s32, req_id)
+ __field(u32, profile_size)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->req_id = req_id;
+ __entry->profile_size = profile_size;
+ ),
+ TP_printk("device_id=%u app_id=%d req_id=%d profile_size=%u",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->req_id,
+ __entry->profile_size)
+);
+TRACE_EVENT(trinity_ioctl_get_profile_buff,
+ TP_PROTO(u32 device_id, s32 app_id, s32 req_id, u32 profile_pos,
+ u32 profile_size),
+ TP_ARGS(device_id, app_id, req_id, profile_pos, profile_size),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s32, req_id)
+ __field(u32, profile_pos)
+ __field(u32, profile_size)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->req_id = req_id;
+ __entry->profile_pos = profile_pos;
+ __entry->profile_size = profile_size;
+ ),
+ TP_printk("device_id=%u app_id=%d req_id=%d profile_pos=%u profile_size=%u",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->req_id,
+ __entry->profile_pos,
+ __entry->profile_size)
+);
+TRACE_EVENT(trinity_ioctl_register_model,
+ TP_PROTO(u32 device_id, s32 app_id, u64 config_id, s32 dbuf_fd,
+ u64 program_offset_addr, u64 program_size),
+ TP_ARGS(device_id, app_id, config_id, dbuf_fd,
+ program_offset_addr, program_size),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(u64, config_id)
+ __field(s32, dbuf_fd)
+ __field(u64, program_offset_addr)
+ __field(u64, program_size)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->config_id = config_id;
+ __entry->dbuf_fd = dbuf_fd;
+ __entry->program_offset_addr = program_offset_addr;
+ __entry->program_size = program_size;
+ ),
+ TP_printk("device_id=%u app_id=%d config_id=0x%llx dbuf_fd=%d program_offset_addr=0x%llx program_size=0x%llx",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->config_id,
+ __entry->dbuf_fd,
+ __entry->program_offset_addr,
+ __entry->program_size)
+);
+TRACE_EVENT(trinity_ioctl_register_model_drv_ver1,
+ TP_PROTO(u64 weight_offset_addr),
+ TP_ARGS(weight_offset_addr),
+ TP_STRUCT__entry(
+ __field(u64, weight_offset_addr)
+ ),
+ TP_fast_assign(
+ __entry->weight_offset_addr = weight_offset_addr;
+ ),
+ TP_printk("weight_offset_addr=0x%llx",
+ __entry->weight_offset_addr)
+);
+TRACE_EVENT(trinity_ioctl_register_model_drv_ver2,
+ TP_PROTO(s32 metadata_dbuf_fd, s32 metadata_ext_dbuf_fd,
+ u64 metadata_ext_size),
+ TP_ARGS(metadata_dbuf_fd, metadata_ext_dbuf_fd, metadata_ext_size),
+ TP_STRUCT__entry(
+ __field(s32, metadata_dbuf_fd)
+ __field(s32, metadata_ext_dbuf_fd)
+ __field(u64, metadata_ext_size)
+ ),
+ TP_fast_assign(
+ __entry->metadata_dbuf_fd = metadata_dbuf_fd;
+ __entry->metadata_ext_dbuf_fd = metadata_ext_dbuf_fd;
+ __entry->metadata_ext_size = metadata_ext_size;
+ ),
+ TP_printk("metadata_dbuf_fd=%d metadata_ext_dbuf_fd=%d metadata_ext_size=0x%llx",
+ __entry->metadata_dbuf_fd,
+ __entry->metadata_ext_dbuf_fd,
+ __entry->metadata_ext_size)
+);
+TRACE_EVENT(trinity_ioctl_run_input,
+ TP_PROTO(u32 device_id, s32 app_id, s32 dbuf_fd, u64 model_id),
+ TP_ARGS(device_id, app_id, dbuf_fd, model_id),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s32, dbuf_fd)
+ __field(u64, model_id)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->dbuf_fd = dbuf_fd;
+ __entry->model_id = model_id;
+ ),
+ TP_printk("device_id=%u app_id=%d dbuf_fd=%d model_id=0x%llx",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->dbuf_fd,
+ __entry->model_id)
+);
+TRACE_EVENT(trinity_ioctl_run_input_drv_ver1,
+ TP_PROTO(u64 activation_offset_addr0, u64 activation_offset_addr1),
+ TP_ARGS(activation_offset_addr0, activation_offset_addr1),
+ TP_STRUCT__entry(
+ __field(u64, activation_offset_addr0)
+ __field(u64, activation_offset_addr1)
+ ),
+ TP_fast_assign(
+ __entry->activation_offset_addr0 = activation_offset_addr0;
+ __entry->activation_offset_addr1 = activation_offset_addr1;
+ ),
+ TP_printk("activation_offset_addr0=0x%llx activation_offset_addr1=0x%llx",
+ __entry->activation_offset_addr0,
+ __entry->activation_offset_addr1)
+);
+TRACE_EVENT(trinity_ioctl_run_input_drv_ver2,
+ TP_PROTO(s64 timeout_ms, u32 priority, u32 num_segments, s32 input_mode,
+ s32 output_mode),
+ TP_ARGS(timeout_ms, priority, num_segments, input_mode, output_mode),
+ TP_STRUCT__entry(
+ __field(s64, timeout_ms)
+ __field(u32, priority)
+ __field(u32, num_segments)
+ __field(s32, input_mode)
+ __field(s32, output_mode)
+ ),
+ TP_fast_assign(
+ __entry->timeout_ms = timeout_ms;
+ __entry->priority = priority;
+ __entry->num_segments = num_segments;
+ __entry->input_mode = input_mode;
+ __entry->output_mode = output_mode;
+ ),
+ TP_printk("timeout_ms=%lld priority=%u num_segments=%u input_mode=%d output_mode=%d",
+ __entry->timeout_ms,
+ __entry->priority,
+ __entry->num_segments,
+ __entry->input_mode,
+ __entry->output_mode)
+);
+// clang-format on
+
+#endif /* __TRINITY_TRACE_H__ */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH ../../drivers/misc/trinity
+#include <trace/define_trace.h>
diff --git a/drivers/misc/trinity/trinity_vision2_drv.c b/drivers/misc/trinity/trinity_vision2_drv.c
index 539eadeca09d..d1633d8d2f90 100644
--- a/drivers/misc/trinity/trinity_vision2_drv.c
+++ b/drivers/misc/trinity/trinity_vision2_drv.c
@@ -177,7 +177,8 @@ static int triv2_idu_load(struct trinity_driver *drv, const char *dirpath,

static LIST_HEAD(triv2_driver_list);
static struct hlist_bl_head triv2_model_node_hlist[TRIV2_MODEL_HASH_SIZE];
-static const char * const triv2_op_names[] = TRIV2_FOREACH_OPNAME(TRIV2_GENERATE_OPNAME);
+static const char *const triv2_op_names[] =
+ TRIV2_FOREACH_OPNAME(TRIV2_GENERATE_OPNAME);

static struct triv2_profile *
triv2_find_profile(const struct trinity_driver *drv, int req_id)
@@ -418,6 +419,8 @@ static void triv2_wakeup_cp(const struct trinity_driver *drv)
void *addr =
trinity_get_iomem_addr(drv->mmreg_vaddr[0], OFFSET_CP_PROC_SET);

+ trace_triv2_wakeup_cp(drv->dev_id);
+
trinity_set_bit(BIT_SET_SEND_EVT1, addr);
}

@@ -541,6 +544,8 @@ static void triv2_run_trigger(const struct trinity_driver *drv, int slot)
struct triv2_cmd_info *cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
struct triv2_req *t_req = cmd_info->reqs[slot];

+ trace_triv2_run_trigger(drv->dev_id, slot);
+
if (!t_req) {
dev_err(drv_to_dev_ptr(drv),
"Unable to find the corresponding req");
@@ -605,6 +610,10 @@ static void triv2_handle_cmd_done(struct trinity_driver *drv,
req->stat->prev_cycles = cmd->total_cycles;
req->stat->num_runs++;
req->stat->total_time += req->stat->prev_time;
+
+ trace_triv2_handle_cmd_done(drv->dev_id, cmd->slot,
+ cmd->total_cycles,
+ req->stat->prev_time);
}

t_req->total_cycles = cmd->total_cycles;
--
2.25.1

2022-07-25 06:56:14

by Jiho Chu

Subject: [PATCH 3/9] trinity: Add load/unload IDU files

This patch implements loading and unloading of IDU files.

The Samsung NPU loads an Instruction Decoder Unit (IDU) program,
which decodes the binary code generated by the NPU compiler.
The IDU program is loaded when the driver is loaded, and it then
starts to parse the code of the compiled binary.
After that, all NPU operations are carried out through the decoder
program, which uses a predefined virtual ISA.
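
Elsewhere in this series, a write-only sysfs attribute (control/idu)
passes a directory path to the driver's idu_load() callback, so the IDU
can also be reloaded at runtime. Below is a minimal userspace sketch of
such a request. The helper names are hypothetical, and the attribute
and firmware paths in the usage note are examples only. The length
check mirrors the NAME_MAX-sized buffer used by idu_store().

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

#ifndef NAME_MAX
#define NAME_MAX 255
#endif

/*
 * Format the sysfs payload: the directory path followed by a newline
 * (the driver strips the newline). Paths that would not fit into the
 * driver's NAME_MAX-sized buffer are rejected.
 * Returns the payload length, or -1 on error.
 */
int idu_format_request(char *out, size_t len, const char *dirpath)
{
	size_t n = strlen(dirpath);

	if (n == 0 || n >= NAME_MAX || n + 2 > len)
		return -1;

	memcpy(out, dirpath, n);
	out[n] = '\n';
	out[n + 1] = '\0';

	return (int)(n + 1);
}

/*
 * Write the request to the given attribute file.
 * Returns 0 on success, -1 on failure.
 */
int idu_request_load(const char *attr_path, const char *dirpath)
{
	char msg[NAME_MAX + 2];
	int n = idu_format_request(msg, sizeof(msg), dirpath);
	FILE *f;
	int ok;

	if (n < 0)
		return -1;

	f = fopen(attr_path, "w");
	if (!f)
		return -1;

	ok = fwrite(msg, 1, (size_t)n, f) == (size_t)n;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

For example, idu_request_load() could be called with the attribute
"/sys/devices/platform/304f0000.triv2/control/idu" (the device name is
taken from the example comment in the sysfs patch) and a directory such
as "/lib/firmware/triv2", which is a made-up location here.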

Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
drivers/misc/trinity/trinity.c | 10 +
drivers/misc/trinity/trinity_common.h | 1 +
drivers/misc/trinity/trinity_vision2_drv.c | 397 ++++++++++++++++++++-
3 files changed, 404 insertions(+), 4 deletions(-)

diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
index 1ee9403dbdca..4c1b8a7108d6 100644
--- a/drivers/misc/trinity/trinity.c
+++ b/drivers/misc/trinity/trinity.c
@@ -37,12 +37,22 @@

#define BASE_DEV_NAME "trinity"

+#define TRINITY_PADDR_BASE (0x0)
+
/* A global lock for shared static variables such as dev_bitmap */
static DEFINE_SPINLOCK(trinity_lock);

/* A bitmap to keep track of active Trinity devices */
static unsigned long dev_bitmap[TRINITY_DEV_END];

+phys_addr_t trinity_get_paddr(struct iommu_domain *domain, dma_addr_t daddr)
+{
+ if (domain)
+ return iommu_iova_to_phys(domain, daddr);
+
+ return TRINITY_PADDR_BASE + daddr;
+}
+
/**
* trinity_release() - A common callback for close() in file_operations for a
* Trinity device node. If there are device-specific data to be
diff --git a/drivers/misc/trinity/trinity_common.h b/drivers/misc/trinity/trinity_common.h
index 7f576d4a71a5..6940318362f6 100644
--- a/drivers/misc/trinity/trinity_common.h
+++ b/drivers/misc/trinity/trinity_common.h
@@ -378,6 +378,7 @@ static inline int32_t trinity_get_app_id(void)
int trinity_create_node(struct trinity_driver *drv);
void trinity_destroy_node(struct trinity_driver *drv);
int trinity_wait_ready(struct trinity_driver *drv);
+phys_addr_t trinity_get_paddr(struct iommu_domain *domain, dma_addr_t daddr);

/* File operations */
int trinity_open(struct inode *inode, struct file *f);
diff --git a/drivers/misc/trinity/trinity_vision2_drv.c b/drivers/misc/trinity/trinity_vision2_drv.c
index f1c1e06d188e..9e616466c57b 100644
--- a/drivers/misc/trinity/trinity_vision2_drv.c
+++ b/drivers/misc/trinity/trinity_vision2_drv.c
@@ -105,6 +105,7 @@ struct triv2_cmd_info {

struct triv2_req *reqs[TRIV2_MAX_CMDSLOTS];
struct triv2_cmd cur_cmd;
+ struct trinity_resv_mem buf;
};

struct triv2_hashed_cmd_info {
@@ -124,6 +125,8 @@ struct triv2_kernel_req {
struct triv2_req {
struct trinity_req req;

+ struct trinity_hwmem_import *seg_import;
+
int cmd_slot;

/** kernel requets */
@@ -140,6 +143,9 @@ struct triv2_req {
struct triv2_idu {
phys_addr_t *addrs;
size_t addr_num;
+ struct trinity_resv_mem data;
+ struct trinity_resv_mem code;
+ dma_addr_t dspm;
};

struct triv2_pdata {
@@ -153,6 +159,9 @@ struct triv2_pdata {

/* command info */
struct triv2_cmd_info cmd_info;
+
+ /* back buffer for context switching */
+ struct trinity_resv_mem back_buf;
};

static void triv2_setup_buffers(struct trinity_driver *drv);
@@ -161,6 +170,74 @@ static int triv2_idu_load(struct trinity_driver *drv, const char *dirpath,

static LIST_HEAD(triv2_driver_list);

+/**
+ * triv2_get_state() - Get the state (TRINITY_STATE_READY/TRINITY_STATE_PAUSE) of the device.
+ * Return: TRINITY_STATE_READY (i.e., 1) or TRINITY_STATE_PAUSE (i.e., 0),
+ * according to the state of the device.
+ */
+int32_t triv2_get_state(const struct trinity_driver *drv)
+{
+ if (ioread32(drv->mmreg_vaddr[0] + OFFSET_NPU_CMD_READY) == 1)
+ return TRINITY_STATE_READY;
+
+ return TRINITY_STATE_PAUSE;
+}
+
+/**
+ * triv2_set_state() - Set state of the device to TRINITY_STATE_READY (1) or TRINITY_STATE_PAUSE (0)
+ */
+static void triv2_set_state(const struct trinity_driver *drv,
+ enum trinity_state state)
+{
+ void __iomem *addr;
+
+ switch (state) {
+ case TRINITY_STATE_PAUSE:
+ /* CP */
+ addr = trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_CP_PROC_SET);
+ trinity_set_bit(BIT_SET_PAUSE, addr);
+ iowrite32(0, addr);
+
+ /* DSP */
+ addr = trinity_get_iomem_addr(drv->mmreg_vaddr[1],
+ OFFSET_DSP_PROC_SET);
+ trinity_set_bit(BIT_SET_PAUSE, addr);
+ iowrite32(0, addr);
+
+ break;
+ case TRINITY_STATE_READY:
+ /* CP */
+ addr = trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_CP_PROC_CLR);
+ trinity_set_bit(BIT_CLR_PAUSE, addr);
+ iowrite32(0, addr);
+
+ /* DSP */
+ addr = trinity_get_iomem_addr(drv->mmreg_vaddr[1],
+ OFFSET_DSP_PROC_CLR);
+ trinity_set_bit(BIT_CLR_PAUSE, addr);
+ iowrite32(0, addr);
+
+ /* Performance Counter */
+ addr = trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_CP_CNT_CFG);
+ trinity_set_bit(BIT_CNT_IST_EN | BIT_CNT_FR_EN, addr);
+ break;
+ default:
+ dev_err(drv_to_dev_ptr(drv),
+ "failed to set the NPU state: %d", state);
+ }
+}
+
+static void triv2_wakeup_cp(const struct trinity_driver *drv)
+{
+ void __iomem *addr =
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0], OFFSET_CP_PROC_SET);
+
+ trinity_set_bit(BIT_SET_SEND_EVT1, addr);
+}
+
static void triv2_cancel_reqs(struct trinity_driver *drv)
{
struct triv2_cmd_info *info;
@@ -204,6 +281,8 @@ static void triv2_reset(struct trinity_driver *drv)
do_test = true;
list_for_each_entry(pdata, &triv2_driver_list, list) {
triv2_reset_devices(pdata->drv, do_test);
+ if (pdata->drv->opened > 0)
+ triv2_set_state(pdata->drv, TRINITY_STATE_READY);
do_test = false;
}

@@ -233,6 +312,20 @@ static const struct file_operations triv2_fops = {
.llseek = noop_llseek,
};

+static void triv2_setup_cp(struct trinity_driver *drv, phys_addr_t paddr)
+{
+ iowrite32(TRIV2_IDU_ADDR(paddr) >> 4,
+ drv->mmreg_vaddr[0] + OFFSET_CP_IMIF_BASE);
+ iowrite32(TRIV2_IDU_ADDR(drv->mmreg_paddr[2]),
+ drv->mmreg_vaddr[0] + OFFSET_NPU_CBOX_BASE);
+}
+
+static void triv2_setup_dsp(struct trinity_driver *drv, phys_addr_t paddr)
+{
+ iowrite32(TRIV2_IDU_ADDR(paddr) >> 4,
+ drv->mmreg_vaddr[1] + OFFSET_DSP_IMIF_BASE);
+}
+
static void triv2_init_common(void)
{
static bool done;
@@ -244,6 +337,20 @@ static void triv2_init_common(void)
done = true;
}

+static int triv2_idu_alloc(struct device *dev, struct trinity_resv_mem *mem)
+{
+ return trinity_alloc_from_resv_mem(mem->size, mem, false);
+}
+
+static void triv2_idu_free(struct device *dev, struct trinity_resv_mem *mem)
+{
+ if (!mem->vaddr)
+ return;
+
+ trinity_free_from_resv_mem(mem, false);
+ mem->vaddr = NULL;
+}
+
static int triv2_idu_version(struct trinity_driver *drv, uint32_t *major,
uint32_t *minor, uint32_t *extra)
{
@@ -272,36 +379,276 @@ static void triv2_idu_check(struct trinity_driver *drv)
struct device *dev = drv_to_dev_ptr(drv);
uint32_t major, minor, extra;

+ if (trinity_wait_ready(drv) != 0) {
+ dev_warn(dev, "Unable to load IDU properly");
+ return;
+ }
+
pdata->idu_version =
ioread32(drv->mmreg_vaddr[0] + OFFSET_NPU_IDU_VERSION);
if (triv2_idu_version(drv, &major, &minor, &extra) == 0)
dev_info(dev,
"Instruction Decoder Unit (IDU) v%u.%u.%u detected",
major, minor, extra);
+
+ /* paused until device is opened */
+ triv2_set_state(drv, TRINITY_STATE_PAUSE);
+}
+
+static int triv2_idu_load_file(struct trinity_driver *drv, const char *dirpath,
+ const char *file_name,
+ struct trinity_resv_mem *sector)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+ struct trinity_resv_mem mem;
+ char filepath[NAME_MAX];
+ struct kstat *stat;
+ struct file *filp;
+ loff_t pos = 0;
+ size_t size;
+ int ret;
+
+ dev = drv_to_dev_ptr(drv);
+ stat = vmalloc(sizeof(*stat));
+ if (stat == NULL)
+ return -ENOMEM;
+
+ /* if dirpath is null, use the default path */
+ if (dirpath)
+ snprintf(filepath, NAME_MAX, "%s/%s", dirpath, file_name);
+ else
+ snprintf(filepath, NAME_MAX, TRIV2_IDU_DIRPATH_FMT "/%s",
+ utsname()->release, file_name);
+
+ filp = filp_open(filepath, O_RDONLY, 0400);
+ if (IS_ERR(filp)) {
+ dev_err(dev, "Failed to open the idu binary: %s", filepath);
+ ret = PTR_ERR(filp);
+ goto out_free;
+ }
+
+ /* check file existence first */
+ ret = vfs_getattr(&filp->f_path, stat, STATX_SIZE,
+ AT_STATX_SYNC_AS_STAT);
+
+ if (ret != 0 || stat->size == 0) {
+ dev_warn(dev, "File not found: %s", filepath);
+ ret = -ENOENT;
+ goto out_close;
+ }
+
+ size = stat->size;
+ if (size > TRIV2_IDU_MAXSIZE) {
+ dev_err(dev, "Too large idu binary: %zu MiB", size >> 20);
+ ret = -EINVAL;
+ goto out_close;
+ }
+
+ mem.size = PAGE_ALIGN(size);
+ ret = triv2_idu_alloc(dev, &mem);
+ if (ret < 0) {
+ dev_err(dev, "Failed to allocate memory for idu");
+ goto out_close;
+ }
+
+ ret = read_idu_file(filp, pos, mem.vaddr, size);
+ if (ret != size) {
+ dev_err(dev, "Failed to read the file %s", filepath);
+ triv2_idu_free(dev, &mem);
+ ret = -ERANGE;
+ goto out_close;
+ }
+
+ /* free previous idu if exists */
+ if (sector->vaddr)
+ triv2_idu_free(dev, sector);
+
+ sector->daddr = mem.daddr;
+ sector->vaddr = mem.vaddr;
+ sector->size = mem.size;
+ sector->orig_size = size;
+
+ ret = 0;
+out_close:
+ filp_close(filp, NULL);
+out_free:
+ vfree(stat);
+
+ return ret;
+}
+
+static int triv2_idu_load_files(struct trinity_driver *drv, const char *dirpath)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct iommu_domain *domain;
+ phys_addr_t paddr;
+ int ret;
+
+ domain = iommu_get_domain_for_dev(drv_to_dev_ptr(drv));
+
+ ret = triv2_idu_load_file(drv, dirpath, "cp/data.bin",
+ &(pdata->idu_cp.data));
+ if (ret < 0)
+ return ret;
+
+ ret = triv2_idu_load_file(drv, dirpath, "cp/code.bin",
+ &(pdata->idu_cp.code));
+ if (ret < 0)
+ return ret;
+
+ paddr = trinity_get_paddr(domain, pdata->idu_cp.code.daddr);
+ pdata->idu_cp.addrs[TRIV2_IDU_CODEIDX] = paddr;
+
+ if (!pdata->idu_dsp.addrs)
+ return 0;
+
+ ret = triv2_idu_load_file(drv, dirpath, "dsp/data.bin",
+ &(pdata->idu_dsp.data));
+ if (ret < 0)
+ return ret;
+
+ ret = triv2_idu_load_file(drv, dirpath, "dsp/code.bin",
+ &(pdata->idu_dsp.code));
+ if (ret < 0)
+ return ret;
+
+ paddr = trinity_get_paddr(domain, pdata->idu_dsp.code.daddr);
+ pdata->idu_dsp.addrs[TRIV2_IDU_CODEIDX] = paddr;
+
+ return 0;
+}
+
+static void triv2_idu_fill_zero(struct trinity_driver *drv, phys_addr_t paddr,
+ size_t size)
+{
+ void __iomem *vaddr;
+
+ vaddr = ioremap(paddr, PAGE_ALIGN(size));
+ if (vaddr == NULL) {
+ dev_err(drv_to_dev_ptr(drv), "Failed to do ioremap() for 0x%lx",
+ (unsigned long)paddr);
+ return;
+ }
+ memset_io(vaddr, 0, size);
+
+ iounmap(vaddr);
+}
+
+static void triv2_idu_fill_data(struct trinity_driver *drv, phys_addr_t paddr,
+ struct trinity_resv_mem *data)
+{
+ void __iomem *vaddr;
+
+ vaddr = ioremap(paddr, data->size);
+ if (vaddr == NULL) {
+ dev_err(drv_to_dev_ptr(drv), "Failed to do ioremap() for 0x%lx",
+ (unsigned long)paddr);
+ return;
+ }
+ memcpy_toio(vaddr, data->vaddr, data->orig_size);
+
+ iounmap(vaddr);
+}
+
+static void triv2_idu_load_code(struct trinity_driver *drv)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+
+ /* CP is mandatory */
+ triv2_setup_cp(drv, pdata->idu_cp.addrs[TRIV2_IDU_CODEIDX]);
+
+ /* DSP is optional */
+ if (pdata->idu_dsp.addrs)
+ triv2_setup_dsp(drv, pdata->idu_dsp.addrs[TRIV2_IDU_CODEIDX]);
}

static int triv2_idu_load(struct trinity_driver *drv, const char *dirpath,
bool load_files)
{
- /* load idu data */
+ struct triv2_pdata *pdata;
+ struct triv2_idu *idu_cp;
+ struct triv2_idu *idu_dsp;
+ struct device *dev;
+
+ if (!drv)
+ return -EINVAL;
+
+ dev = drv_to_dev_ptr(drv);
+ if (load_files) {
+ int ret = triv2_idu_load_files(drv, dirpath);
+
+ if (ret != 0) {
+ dev_warn(dev, "Unable to load IDU files: %d", ret);
+ goto load_code;
+ }
+ }
+
+ pdata = TRIV2_DRV_GET_PDATA(drv);
+ idu_cp = &pdata->idu_cp;
+ idu_dsp = &pdata->idu_dsp;
+
+ triv2_idu_fill_zero(drv, idu_cp->addrs[TRIV2_IDU_ZEROIDX],
+ TRIV2_IDU_CP_DSPM_SIZE);
+ triv2_idu_fill_data(drv, idu_cp->addrs[TRIV2_IDU_DATAIDX],
+ &idu_cp->data);
+
+ if (!pdata->idu_dsp.addrs)
+ goto load_code;
+
+ triv2_idu_fill_zero(drv, idu_dsp->addrs[TRIV2_IDU_ZEROIDX],
+ drv->dspm + TRIV2_DSP_DSPM_OFFSET);
+ triv2_idu_fill_data(drv, idu_dsp->addrs[TRIV2_IDU_DATAIDX],
+ &idu_dsp->data);
+
+load_code:
+ triv2_idu_load_code(drv);

return 0;
}

static void triv2_idu_unload(struct trinity_driver *drv)
{
- /* unload idu data */
+ struct device *dev = drv_to_dev_ptr(drv);
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+
+ triv2_idu_free(dev, &pdata->idu_cp.data);
+ triv2_idu_free(dev, &pdata->idu_dsp.data);
+
+ triv2_idu_free(dev, &pdata->idu_cp.code);
+ triv2_idu_free(dev, &pdata->idu_dsp.code);
}

static void triv2_setup_buffers(struct trinity_driver *drv)
{
- /* setup buffer */
+ struct device *dev = drv_to_dev_ptr(drv);
+ struct iommu_domain *domain;
+ struct trinity_resv_mem *cmd_buf;
+ struct trinity_resv_mem *back_buf;
+ phys_addr_t paddr;
+
+ domain = iommu_get_domain_for_dev(dev);
+ cmd_buf = TRIV2_DRV_GET_CMD_BUF(drv);
+ back_buf = TRIV2_DRV_GET_BACK_BUF(drv);
+
+ /* command */
+ paddr = trinity_get_paddr(domain, cmd_buf->daddr);
+ iowrite32(TRIV2_IDU_ADDR(paddr),
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_CMD_BASE));
+ /* backup */
+ iowrite32(TRIV2_IDU_ADDR(back_buf->daddr),
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_BACK_ADDR));
+ iowrite32(back_buf->size, trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_BACK_SIZE));
}

static int32_t triv2_init_pdata(struct trinity_driver *drv)
{
struct triv2_pdata *pdata;
struct triv2_cmd_info *cmd_info;
+ struct trinity_resv_mem *cmd_buf;
+ struct trinity_resv_mem *back_buf;

/* alloc triv2 pdata */
drv->pdata = kzalloc(sizeof(struct triv2_pdata), GFP_KERNEL);
@@ -312,6 +659,8 @@ static int32_t triv2_init_pdata(struct trinity_driver *drv)
pdata->drv = drv;

cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+ cmd_buf = TRIV2_DRV_GET_CMD_BUF(drv);
+ back_buf = TRIV2_DRV_GET_BACK_BUF(drv);

spin_lock_init(&cmd_info->lock);
/* init cmd bitmap */
@@ -393,7 +742,21 @@ static int triv2_setup_idu(struct trinity_driver *drv)
triv2_idu_check(drv);
}

- /* setup dma info */
+ if (pdata->idu_dsp.addrs && drv->dspm > 0) {
+ struct iommu_domain *domain;
+ phys_addr_t paddr;
+ dma_addr_t daddr;
+
+ /* iommu mapping for dspm segment */
+ domain = iommu_get_domain_for_dev(dev);
+ if (!domain)
+ return 0;
+
+ paddr = pdata->idu_dsp.addrs[0] + TRIV2_DSP_DSPM_OFFSET;
+ daddr = dma_map_resource(dev, paddr, drv->dspm,
+ DMA_BIDIRECTIONAL, 0);
+ pdata->idu_dsp.dspm = daddr;
+ }

return 0;
}
@@ -412,11 +775,23 @@ static int32_t triv2_init(struct trinity_driver *drv)
*/
static void triv2_cleanup(struct trinity_driver *drv)
{
+ struct trinity_resv_mem *cmd_buf;
+ struct trinity_resv_mem *back_buf;
+
if (!drv->pdata)
return;

triv2_idu_unload(drv);

+ cmd_buf = TRIV2_DRV_GET_CMD_BUF(drv);
+ back_buf = TRIV2_DRV_GET_BACK_BUF(drv);
+
+ if (cmd_buf->vaddr)
+ trinity_free_from_resv_mem(cmd_buf, false);
+
+ if (back_buf->vaddr)
+ trinity_free_from_resv_mem(back_buf, false);
+
list_del(&(TRIV2_DRV_GET_PDATA(drv)->list));
kfree(drv->pdata);
drv->pdata = NULL;
@@ -430,6 +805,8 @@ static struct trinity_desc triv2_desc = {
.reset = triv2_reset,
.idu_load = triv2_idu_load,
.idu_version = triv2_idu_version,
+ .get_state = triv2_get_state,
+ .set_state = triv2_set_state,
/* req management */
.alloc_req = triv2_alloc_req,
.dealloc_req = triv2_dealloc_req,
@@ -456,6 +833,18 @@ static int trinity_triv2_probe(struct platform_device *pdev)
if (err < 0)
return err;

+ drv = (struct trinity_driver *)platform_get_drvdata(pdev);
+ if (drv->dspm > 0) {
+ /* Part of the DSPM is reserved for DSP kernel operations */
+ if (drv->dspm < TRIV2_DSP_DSPM_OFFSET) {
+ dev_err(drv_to_dev_ptr(drv),
+ "DSPM size is too small; wrong device tree?");
+ err = -EINVAL;
+ goto out_remove;
+ }
+ drv->dspm -= TRIV2_DSP_DSPM_OFFSET;
+ }
+
err = triv2_init(drv);
if (err < 0)
goto out_remove;
--
2.25.1

2022-07-25 06:56:57

by Jiho Chu

Subject: [PATCH 1/9] trinity: Add base driver

This patch contains the base code for the Trinity driver. It provides the
minimal code needed to load and probe a device. The Trinity family is
controlled through memory-mapped registers, so the register addresses and
offsets are described. User API interfaces are also provided so that the
device can be controlled through ioctl calls.

Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: yelini-jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: Parichay Kapoor <[email protected]>
Signed-off-by: Wook Song <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
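Note on the open/release behaviour in trinity.c: the first open() moves
the device to the ready state and the last close() pauses it again. The
userspace sketch below only models that reference counting. All sketch_*
names are hypothetical stand-ins, the register is simulated by a plain
variable, and the retry loop of trinity_wait_ready() is reduced to a
single check.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are not the driver's real definitions. */
enum sketch_state { SKETCH_STATE_PAUSE = 0, SKETCH_STATE_READY = 1 };

struct sketch_dev {
	uint32_t cmd_ready_reg; /* simulates the register polled by get_state() */
	int opened;             /* number of clients holding the node open */
};

static void sketch_set_state(struct sketch_dev *d, enum sketch_state s)
{
	d->cmd_ready_reg = (s == SKETCH_STATE_READY) ? 1u : 0u;
}

static enum sketch_state sketch_get_state(const struct sketch_dev *d)
{
	return d->cmd_ready_reg == 1u ? SKETCH_STATE_READY : SKETCH_STATE_PAUSE;
}

/* The first open resumes the device; later opens only bump the count. */
static int sketch_open(struct sketch_dev *d)
{
	if (d->opened == 0) {
		sketch_set_state(d, SKETCH_STATE_READY);
		if (sketch_get_state(d) != SKETCH_STATE_READY)
			return -1; /* the driver reports -ETIMEDOUT after retries */
	}
	d->opened++;
	return 0;
}

/* The last close pauses the device again. */
static void sketch_release(struct sketch_dev *d)
{
	d->opened--;
	if (d->opened == 0)
		sketch_set_state(d, SKETCH_STATE_PAUSE);
}

/* Walks through two clients opening and closing the same device. */
static void sketch_demo(void)
{
	struct sketch_dev d = { 0, 0 };

	assert(sketch_open(&d) == 0);
	assert(sketch_open(&d) == 0);
	sketch_release(&d);
	printf("after first close: state=%d opened=%d\n",
	       (int)sketch_get_state(&d), d.opened);
	sketch_release(&d);
	printf("after last close: state=%d opened=%d\n",
	       (int)sketch_get_state(&d), d.opened);
}
```

Calling sketch_demo() from a main() shows that the device stays ready
while at least one client holds it open. In the driver, drv->lock
serializes the counter updates; the sketch omits locking for brevity.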
drivers/misc/Kconfig | 1 +
drivers/misc/Makefile | 1 +
drivers/misc/trinity/Kconfig | 27 ++
drivers/misc/trinity/Makefile | 7 +
drivers/misc/trinity/trinity.c | 369 ++++++++++++++
drivers/misc/trinity/trinity_common.h | 392 +++++++++++++++
drivers/misc/trinity/trinity_vision2_drv.c | 512 ++++++++++++++++++++
drivers/misc/trinity/trinity_vision2_regs.h | 210 ++++++++
include/uapi/misc/trinity.h | 458 +++++++++++++++++
9 files changed, 1977 insertions(+)
create mode 100644 drivers/misc/trinity/Kconfig
create mode 100644 drivers/misc/trinity/Makefile
create mode 100644 drivers/misc/trinity/trinity.c
create mode 100644 drivers/misc/trinity/trinity_common.h
create mode 100644 drivers/misc/trinity/trinity_vision2_drv.c
create mode 100644 drivers/misc/trinity/trinity_vision2_regs.h
create mode 100644 include/uapi/misc/trinity.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 41d2bb0ae23a..ad0d5f6af291 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -500,4 +500,5 @@ source "drivers/misc/cardreader/Kconfig"
source "drivers/misc/habanalabs/Kconfig"
source "drivers/misc/uacce/Kconfig"
source "drivers/misc/pvpanic/Kconfig"
+source "drivers/misc/trinity/Kconfig"
endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 70e800e9127f..c63f3fc89780 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -60,3 +60,4 @@ obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o
obj-$(CONFIG_HISI_HIKEY_USB) += hisi_hikey_usb.o
obj-$(CONFIG_HI6421V600_IRQ) += hi6421v600-irq.o
obj-$(CONFIG_OPEN_DICE) += open-dice.o
+obj-$(CONFIG_TRINITY) += trinity/
diff --git a/drivers/misc/trinity/Kconfig b/drivers/misc/trinity/Kconfig
new file mode 100644
index 000000000000..ad4bab78f7c6
--- /dev/null
+++ b/drivers/misc/trinity/Kconfig
@@ -0,0 +1,27 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config TRINITY
+ bool "Samsung Neural Processing Unit"
+ depends on HAS_IOMEM
+ depends on HAS_DMA
+ default n
+ help
+ Select this option to enable driver support for Samsung
+ Neural Processing Unit (NPU).
+
+ This driver works as a base driver of the other drivers
+ for Trinity device family.
+
+ This option should be enabled to support Trinity
+ Vision 2 (TRIV2), and Trinity Audio (TRIA).
+
+config TRINITY_VISION2
+ tristate "Samsung NPU Trinity Vision 2"
+ depends on TRINITY
+ default n
+ help
+ Select this option to enable driver support for a Samsung
+ Neural Processing Unit (NPU), Trinity Vision 2.
+
+ This driver enables userspace system library to access the
+ device via /dev/triv2-N.
diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
new file mode 100644
index 000000000000..a8e5697d6d85
--- /dev/null
+++ b/drivers/misc/trinity/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_TRINITY_VISION2) += trinity_vision2.o
+
+trinity-y := trinity.o
+
+trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
new file mode 100644
index 000000000000..a85904c17f2e
--- /dev/null
+++ b/drivers/misc/trinity/trinity.c
@@ -0,0 +1,369 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Base device driver for Samsung NPU Trinity device family
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2020 Wook Song <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/cacheflush.h>
+#include <linux/bitmap.h>
+#include <linux/device.h>
+#include <linux/dma-buf.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/iommu.h>
+#include <linux/kernel.h>
+#include <linux/kobject.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_iommu.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/stddef.h>
+#include <linux/uaccess.h>
+
+#include "trinity_common.h"
+
+#define BASE_DEV_NAME "trinity"
+
+/* A global lock for shared static variables such as dev_bitmap */
+static DEFINE_SPINLOCK(trinity_lock);
+
+/* A bitmap to keep track of active Trinity devices */
+static unsigned long dev_bitmap[TRINITY_DEV_END];
+
+/**
+ * trinity_release() - A common callback for close() in file_operations for a
+ * Trinity device node. If there is device-specific data to be
+ * cleaned up, it must be cleaned up before invoking this
+ * callback.
+ *
+ * @inode: Inode to be closed
+ * @file: File to be closed
+ *
+ * Returns 0 on success. Otherwise, returns negative error.
+ */
+int trinity_release(struct inode *inode, struct file *file)
+{
+ struct trinity_driver *drv;
+
+ drv = file->private_data;
+
+ if (drv->verbose)
+ dev_info(drv_to_dev_ptr(drv), "%s\n", "Device closed");
+
+ mutex_lock(&drv->lock);
+ drv->opened = drv->opened - 1;
+ if (drv->opened == 0) {
+ /* wait for already submitted requests */
+ if (drv->desc->drain_reqs)
+ drv->desc->drain_reqs(drv);
+
+ drv->desc->set_state(drv, TRINITY_STATE_PAUSE);
+ }
+ mutex_unlock(&drv->lock);
+
+ return 0;
+}
+
+static bool trinity_is_empty(void)
+{
+ enum trinity_dev_type type;
+ bool empty = true;
+
+ spin_lock(&trinity_lock);
+ for (type = TRINITY_DEV_UNKNOWN, type++; type < TRINITY_DEV_END;
+ type++) {
+ if (find_first_bit(&dev_bitmap[type], TRINITY_DEV_EACH_MAX) !=
+ TRINITY_DEV_EACH_MAX) {
+ empty = false;
+ break;
+ }
+ }
+ spin_unlock(&trinity_lock);
+
+ return empty;
+}
+
+/**
+ * trinity_wait_ready() - Wait until the Trinity device is in the ready state
+ *
+ * @drv: an instance of trinity driver
+ *
+ * Returns 0 on success. Otherwise, returns negative error.
+ */
+int trinity_wait_ready(struct trinity_driver *drv)
+{
+ const unsigned long time_out = HZ / 100UL; /* 1/100 second */
+ const unsigned int max_retry = 10;
+ unsigned int retry = 0;
+ wait_queue_head_t wq;
+
+ drv->desc->set_state(drv, TRINITY_STATE_READY);
+
+ init_waitqueue_head(&wq);
+ /* try to ensure that NPU is in the ready state */
+ while (wait_event_timeout(
+ wq, drv->desc->get_state(drv) == TRINITY_STATE_READY,
+ time_out) == 0) {
+ /* regarded as failure */
+ if (retry == max_retry)
+ return -ETIMEDOUT;
+ retry++;
+ }
+
+ return 0;
+}
+
+/**
+ * trinity_open() - A common callback for open() in file_operations for a Trinity
+ * device node. If device-specific open() is required, this
+ * callback should be invoked by that open().
+ *
+ * @inode: inode to be opened
+ * @f: file to be opened
+ *
+ * Returns 0 on success. Otherwise, returns negative error.
+ */
+int trinity_open(struct inode *inode, struct file *f)
+{
+ struct miscdevice *miscdev;
+ struct trinity_driver *drv;
+ int ret = 0;
+
+ miscdev = (struct miscdevice *)f->private_data;
+ drv = container_of(miscdev, struct trinity_driver, mdev);
+ f->private_data = drv;
+
+ mutex_lock(&drv->lock);
+ /** remove PAUSE set on the CP of the NPU */
+ if (drv->opened == 0) {
+ ret = trinity_wait_ready(drv);
+ if (ret != 0)
+ goto out;
+ }
+ drv->opened = drv->opened + 1;
+
+ if (drv->verbose)
+ dev_info(drv_to_dev_ptr(drv), "%s\n", "Device opened");
+
+out:
+ mutex_unlock(&drv->lock);
+
+ return ret;
+}
+
+static void trinity_common_init(struct device *dev)
+{
+ if (!trinity_is_empty())
+ return;
+
+ /* Common init codes */
+}
+
+static void trinity_common_exit(void)
+{
+ if (!trinity_is_empty())
+ return;
+
+ /* Common deinit codes */
+}
+
+static int trinity_set_device_id(struct trinity_driver *drv)
+{
+ const struct trinity_desc *desc = drv->desc;
+ struct device *dev = drv_to_dev_ptr(drv);
+ int err = -EEXIST;
+
+ spin_lock(&trinity_lock);
+ drv->dev_id =
+ find_first_zero_bit(&dev_bitmap[dev->id], TRINITY_DEV_EACH_MAX);
+ if (drv->dev_id < TRINITY_DEV_EACH_MAX) {
+ set_bit(drv->dev_id, &dev_bitmap[dev->id]);
+ err = 0;
+ }
+ spin_unlock(&trinity_lock);
+
+ if (err == 0) {
+ drv->name = devm_kasprintf(dev, GFP_KERNEL, "%s-%u", desc->type,
+ drv->dev_id);
+ err = IS_ERR_OR_NULL(drv->name) ? -ENOMEM : 0;
+ }
+
+ return err;
+}
+
+int trinity_create_node(struct trinity_driver *drv)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+ int err;
+
+ /** register as a misc device */
+ drv->mdev.minor = MISC_DYNAMIC_MINOR;
+ drv->mdev.parent = NULL;
+ drv->mdev.name = drv->name;
+
+ err = misc_register(&drv->mdev);
+ if (err < 0)
+ dev_err(dev, "failed to register as a misc device");
+ else
+ dev_info(dev, "misc device created!");
+
+ return err;
+}
+
+void trinity_destroy_node(struct trinity_driver *drv)
+{
+ misc_deregister(&drv->mdev);
+}
+
+/**
+ * trinity_probe() - Probes a new Trinity device. This is a standard interface to
+ * probe a Trinity family device.
+ *
+ * @pdev: Platform device structure to probe
+ * @desc: Device description to probe
+ *
+ * Returns 0 on success. Otherwise, returns negative error.
+ */
+int trinity_probe(struct platform_device *pdev, const struct trinity_desc *desc)
+{
+ struct device_node *np;
+ struct device *dev;
+ struct trinity_driver *drv;
+ int irq_out;
+ int i, err;
+
+ dev = &pdev->dev;
+ dev->id = ((desc->ver & TRINITY_MASK_DEV) >> TRINITY_SHIFT_DEV);
+
+ /* set private data */
+ drv = devm_kzalloc(dev, sizeof(*drv), GFP_KERNEL);
+ if (drv == NULL)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, drv);
+ dev_set_drvdata(dev, drv);
+
+ drv->dev = dev;
+ drv->desc = desc;
+
+ np = dev->of_node;
+ if (of_property_match_string(np, "samsung,trinity-type", desc->type))
+ return -EPROBE_DEFER;
+
+ /* get reg info for MMREG_BASE */
+ for (i = 0; i < TRINITY_MAX_MMREGS; i++) {
+ struct resource mmreg;
+
+ err = of_address_to_resource(np, i, &mmreg);
+ if (err < 0) {
+ if (i == 0) {
+ dev_err(dev, "failed to get %d-th mmreg info",
+ i);
+ return -EINVAL;
+ }
+ break;
+ }
+
+ drv->mmreg_vaddr[i] = devm_ioremap_resource(dev, &mmreg);
+ if (IS_ERR(drv->mmreg_vaddr[i])) {
+ dev_err(dev,
+ "failed to remap %d-th mmreg resource info", i);
+ return PTR_ERR(drv->mmreg_vaddr[i]);
+ }
+ drv->mmreg_paddr[i] = mmreg.start;
+ }
+
+ /** get a TOPS property */
+ err = of_property_read_u32(np, "samsung,tops", &drv->tops);
+ if (err < 0) {
+ dev_err(dev, "failed to read 'tops' property: %d\n", err);
+ return err;
+ }
+
+ /** get a DSPM property */
+ err = of_property_read_u32(np, "samsung,dspm", &drv->dspm);
+ if (err < 0) {
+ dev_info(dev, "Setting the size of DSPM to 0\n");
+ drv->dspm = 0;
+ }
+
+ /* Set IRQ handlers */
+ irq_out = platform_get_irq(pdev, 0);
+ if (irq_out < 0) {
+ dev_err(dev, "IRQ is not supported");
+ return irq_out;
+ }
+
+ /* get the IRQ number from DT and set handlers for it */
+ err = devm_request_irq(dev, irq_out, desc->handle_irq,
+ IRQF_TRIGGER_HIGH, desc->type, &drv->mdev);
+ if (err < 0) {
+ dev_err(dev, "failed to register handlers for IRQ %d", irq_out);
+ return err;
+ }
+
+ /** Initialize device-specific variables */
+ init_completion(&drv->complete);
+ mutex_init(&drv->lock);
+ INIT_WORK(&drv->work_stop, desc->stop_reqs);
+ drv->mdev.fops = desc->fops;
+
+ trinity_common_init(dev);
+
+ err = trinity_set_device_id(drv);
+ if (err < 0) {
+ dev_err(dev, "Please unload old devices first (max: %d)\n",
+ TRINITY_DEV_EACH_MAX);
+ goto err_cleanup;
+ }
+
+ return 0;
+
+err_cleanup:
+ spin_lock(&trinity_lock);
+ clear_bit(drv->dev_id, &dev_bitmap[dev->id]);
+ spin_unlock(&trinity_lock);
+
+ trinity_common_exit();
+
+ return err;
+}
+
+/**
+ * trinity_remove() - Cleans up the device driver. This is a standard interface to
+ * remove a Trinity family device.
+ *
+ * @pdev: Platform device structure to remove
+ * @desc: Device description to remove
+ *
+ * Always returns 0.
+ */
+int trinity_remove(struct platform_device *pdev,
+ const struct trinity_desc *desc)
+{
+ struct trinity_driver *drv;
+ struct device *dev;
+
+ drv = (struct trinity_driver *)platform_get_drvdata(pdev);
+ dev = drv_to_dev_ptr(drv);
+
+ spin_lock(&trinity_lock);
+ clear_bit(drv->dev_id, &dev_bitmap[dev->id]);
+ spin_unlock(&trinity_lock);
+
+ trinity_common_exit();
+
+ return 0;
+}
diff --git a/drivers/misc/trinity/trinity_common.h b/drivers/misc/trinity/trinity_common.h
new file mode 100644
index 000000000000..37aba34ef9bc
--- /dev/null
+++ b/drivers/misc/trinity/trinity_common.h
@@ -0,0 +1,392 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/**
+ * Common header for trinity devices
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Wook Song <[email protected]>
+ * Copyright (C) 2020 Parichay Kapoor <[email protected]>
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __TRINITY_COMMON_H__
+#define __TRINITY_COMMON_H__
+
+#include <linux/idr.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/irqreturn.h>
+#include <linux/kernel.h>
+#include <linux/list_bl.h>
+#include <linux/miscdevice.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+#include <uapi/misc/trinity.h>
+
+/** Default timeout to wait for opening device in jiffies */
+#define TRINITY_DEV_TIMEOUT_MSEC (3000)
+#define TRINITY_DEV_TIMEOUT (msecs_to_jiffies(TRINITY_DEV_TIMEOUT_MSEC))
+
+/** Default timeout to wait for running input in jiffies */
+#define TRINITY_RUN_TIMEOUT_MSEC (4000)
+#define TRINITY_RUN_TIMEOUT (msecs_to_jiffies(TRINITY_RUN_TIMEOUT_MSEC))
+
+#define TRINITY_DEV_TYPE_LEN (16)
+#define TRINITY_DEV_EACH_MAX (2)
+#define TRINITY_MAX_MMREGS (3)
+
+/** A helper macro to generate the version code of the device driver */
+#define GENVER(dev, mj, mn, ex) \
+ ((dev << TRINITY_SHIFT_DEV) | (mj << TRINITY_SHIFT_MAJOR_VER) | \
+ (mn << TRINITY_SHIFT_MINOR_VER) | (ex << TRINITY_SHIFT_EXTRA_VER))
+
+#define trinity_get_iomem_addr(base, offset) (base + offset)
+#define drv_to_dev_ptr(d) (d->dev)
+#define drv_to_priv(drv) (drv->desc->pdata)
+
+#define TRINITY_STAT_HASH_BITS (10)
+#define TRINITY_STAT_HASH_SIZE (1 << TRINITY_STAT_HASH_BITS)
+
+#define TIME_DIFF(t1, t2) ktime_to_ms(ktime_sub(t1, t2))
+#define TIME_DIFF_US(t1, t2) ktime_to_us(ktime_sub(t1, t2))
+
+struct trinity_desc;
+struct trinity_driver;
+struct trinity_req;
+struct trinity_stat;
+struct trinity_stat_app;
+struct trinity_stat_req;
+struct trinity_model_htable;
+
+/**
+ * struct trinity_desc - a structure for device description
+ * @type: A string that indicates the type of this device.
+ * @ver: Coded version information generated via GENVER().
+ * @fops: Device-specific file_operations.
+ *
+ * @reset: reset trinity function
+ * @prepare_req: request configuration function before invoking
+ * trinity_submit_req() (if any). This requires a registered model
+ * to the driver.
+ * @handle_timeout: This function is invoked when the request times out
+ * @stop_reqs: stops the currently running request
+ * @drain_reqs: waits until the currently running requests finish.
+ * @init_profile: initialize profile configuration
+ * @check_profile: check current profile data
+ * @get_profile_meta: get profile metadata for the target request
+ * @get_profile_buff: get profile data buffer for the target request
+ * @show_profile: write out profile data
+ * @destroy_profile: destroy profile resources
+ *
+ * @idu_load: load the IDU binaries from the given directory path
+ * @idu_version: get IDU version info
+ * @get_state: get current state of IDU
+ * @set_state: set IDU state
+ * @alloc_req: allocate a new trinity request
+ * @dealloc_req: release request resources
+ * @invoke_req: prepare to run a request and send it to the scheduler
+ *
+ * @handle_irq: Device-specific IRQ handler.
+ */
+struct trinity_desc {
+ char *type;
+ uint32_t ver;
+
+ const struct file_operations *fops;
+
+ /* Optional */
+ void (*reset)(struct trinity_driver *drv);
+ int32_t (*prepare_req)(struct trinity_driver *drv,
+ struct trinity_req *req);
+ void (*handle_timeout)(struct trinity_driver *drv,
+ struct trinity_req *req);
+ void (*stop_reqs)(struct work_struct *work);
+ void (*drain_reqs)(struct trinity_driver *drv);
+ void (*init_profile)(struct trinity_driver *drv,
+ unsigned long profile_size);
+ int32_t (*check_profile)(struct trinity_driver *drv,
+ struct trinity_req *req);
+ int32_t (*get_profile_meta)(const struct trinity_driver *drv,
+ struct trinity_ioctl_profile_meta *meta);
+ int32_t (*get_profile_buff)(const struct trinity_driver *drv,
+ struct trinity_ioctl_profile_buff *buff);
+ void (*show_profile)(const struct trinity_driver *drv, int req_id);
+ void (*destroy_profile)(const struct trinity_driver *drv, void *data);
+
+ /* Mandatory */
+ int32_t (*idu_load)(struct trinity_driver *drv, const char *dirpath,
+ bool load_files);
+ int32_t (*idu_version)(struct trinity_driver *drv, uint32_t *major,
+ uint32_t *minor, uint32_t *extra);
+ int32_t (*get_state)(const struct trinity_driver *drv);
+ void (*set_state)(const struct trinity_driver *drv,
+ enum trinity_state state);
+ struct trinity_req *(*alloc_req)(struct trinity_driver *drv);
+ void (*dealloc_req)(struct trinity_driver *drv,
+ struct trinity_req *req);
+ int32_t (*invoke_req)(struct trinity_driver *drv,
+ struct trinity_req *req, void *sched_data);
+ irq_handler_t handle_irq;
+};
+
+/**
+ * struct trinity_stat - A structure for representing a device's statistics.
+ */
+struct trinity_stat {
+ spinlock_t lock;
+
+ struct hlist_bl_head hlist[TRINITY_STAT_HASH_SIZE];
+ struct list_head list;
+
+ void *pdata;
+};
+
+/**
+ * struct trinity_stat_app - a structure for representing statistics for each app
+ * @app_id: identifier for each app
+ * @name: name of stat
+ * @status: app status
+ * @parent: parent node
+ * @total_alloc_mem: total allocated memory size
+ * @total_freed_mem: total freed memory size
+ * @reqs: list of requests
+ * @num_total_reqs: the total number of requests
+ * @num_kept_reqs: the number of kept requests
+ * @num_active_reqs: the number of active requests
+ * @hnode: hash node
+ * @lnode: list node
+ */
+struct trinity_stat_app {
+ int32_t app_id; /* app identifier */
+ char name[TASK_COMM_LEN];
+ enum trinity_app_status status;
+
+ struct trinity_stat *parent;
+
+ uint64_t total_alloc_mem; /* total allocated memory */
+ uint64_t total_freed_mem; /* total freed memory */
+
+ struct list_head reqs;
+ uint32_t num_total_reqs;
+ uint32_t num_kept_reqs;
+ uint32_t num_active_reqs;
+
+ struct hlist_bl_node hnode; /* hash node */
+ struct list_head lnode; /* list node */
+
+ unsigned long slot;
+};
+
+/**
+ * struct trinity_stat_req - A structure for representing statistics for each request
+ * @status: request status
+ * @priority: priority of request
+ * @parent: parent node
+ * @app_id: app identifier
+ * @req_id: request identifier
+ * @model_id: model identifier
+ * @is_kernel: requested from other kernel module
+ * @submitted: submitted time (i.e., when request is submitted to global queue)
+ * @scheduled: scheduled time (i.e., when request is scheduled to device)
+ * @completed: completed time (i.e., when output notification arrives)
+ * @num_runs: total number of runs
+ * @total_time: total execution time
+ * @prev_time: previous execution time
+ * @prev_cycles: previous execution cycles
+ * @list: list node managed by trinity_stat_app
+ * @profile: profile data
+ * @slot: request slot
+ */
+struct trinity_stat_req {
+ enum trinity_req_status status; /* status of submit result */
+ enum trinity_req_priority priority;
+
+ struct trinity_stat_app *parent;
+
+ int32_t app_id;
+ int32_t req_id;
+ uint64_t model_id;
+
+ bool is_kernel;
+
+ ktime_t submitted;
+ ktime_t scheduled;
+ ktime_t completed;
+
+ uint32_t num_runs;
+ uint32_t total_time;
+
+ uint32_t prev_time;
+ uint32_t prev_cycles;
+
+ struct list_head list;
+ void *profile;
+
+ unsigned long slot;
+};
+
+/**
+ * struct trinity_driver - A private data structure for Trinity device driver
+ * @desc: A pointer to the device description
+ * @name: The id-annotated name of the device
+ * @pdata: private data
+ * @dev_id: device id
+ * @mdev: The &struct miscdevice with which the device is registered.
+ * @dev: A pointer to &struct device of the device.
+ * @complete: A &struct completion variable to maintain events from the device.
+ * @lock: A lock for access control to driver-level static variables
+ * @global_req_id: a counter used to generate an id for each request
+ * @mmreg_vaddr: The iomapped base address of memory-mapped registers
+ * @mmreg_paddr: The physical base address of memory-mapped registers
+ * @opened: The number of clients which open the device
+ * @verbose: show detailed information
+ * @work_stop: handle stop request
+ * @tops: Tera Operations Per Second (TOPS) of this device
+ * @dspm: The size of Data Scratch-Pad Memory (DSPM) in the DSP
+ * @stat: statistics information
+ * @debugfs_pdata: debugfs private data
+ */
+struct trinity_driver {
+ const struct trinity_desc *desc;
+ const char *name;
+ void *pdata;
+
+ uint32_t dev_id;
+ struct miscdevice mdev;
+ struct device *dev;
+ struct completion complete;
+ struct mutex lock;
+
+ atomic_t global_req_id;
+
+ void __iomem *mmreg_vaddr[TRINITY_MAX_MMREGS];
+ phys_addr_t mmreg_paddr[TRINITY_MAX_MMREGS];
+
+ int32_t opened;
+ unsigned long verbose;
+
+ struct work_struct work_stop;
+
+ uint32_t tops;
+ uint32_t dspm;
+
+ struct trinity_stat stat;
+ void *debugfs_pdata;
+};
+
+/**
+ * struct trinity_model - A structure for representing model data
+ * @config: model configuration
+ * @hnode: hash node for indexing
+ * @owner_id: Identifier for owner app
+ * @refcnt: reference count
+ */
+struct trinity_model {
+ struct trinity_ioctl_model config;
+ struct hlist_bl_node hnode;
+ int32_t owner_id;
+ struct kref refcnt;
+} __packed;
+
+/**
+ * struct trinity_input - A structure for representing input data
+ * @config: input configuration
+ */
+struct trinity_input {
+ struct trinity_ioctl_input config;
+} __packed;
+
+/**
+ * struct trinity_req - A structure for representing a request
+ * @drv: An instance of the driver.
+ * @input: Information of the input configuration to be run by this request
+ * @model: model information to be used for this request
+ * @status: Status of the submitted request
+ * @submit_retry: retry count of submit request
+ * @complete: completion information
+ * @llist: llist node for request queue
+ * @time_started: started time
+ * @is_kernel: requested from kernel module
+ * @scheduled: scheduled flag
+ * @priv: A handle of private data
+ * @note: The allocated 'trinity_req' is shared among the ioctl, scheduler,
+ * and interrupt handler routines. After an NPU request is invoked,
+ * the irq handler can complete the request at any time, which
+ * deallocates the struct.
+ */
+struct trinity_req {
+ /** context where the req belongs */
+ struct trinity_driver *drv;
+
+ struct trinity_input input; /* the req's input argument */
+ struct trinity_model *model;
+
+ struct trinity_stat_req *stat;
+
+ uint64_t submit_retry;
+ struct completion complete;
+ struct llist_node llist;
+
+ ktime_t time_started;
+ bool is_kernel;
+
+ bool scheduled;
+
+ void *priv;
+};
+
+/**
+ * struct trinity_model_htable - A common hashtable to maintain models
+ * @ht_heads: A pointer to heads of this hashtable
+ * @hash_bits: The number of bits to use in hashing.
+ * @hash_size: The number of hash buckets.
+ */
+struct trinity_model_htable {
+ struct hlist_bl_head *ht_heads;
+ int hash_bits;
+ int hash_size;
+};
+
+static inline void trinity_set_bit(uint32_t bit, void __iomem *addr)
+{
+ uint32_t reg = 0;
+
+ reg |= bit;
+ iowrite32(reg, addr);
+}
+
+/**
+ * trinity_get_app_id() - Get an app_id for the currently opened device
+ *
+ * Returns app_id (just returns its tgid for now).
+ */
+static inline int32_t trinity_get_app_id(void)
+{
+ return task_tgid_vnr(current);
+}
+
+/*
+ * Trinity common functions
+ */
+int trinity_create_node(struct trinity_driver *drv);
+void trinity_destroy_node(struct trinity_driver *drv);
+int trinity_wait_ready(struct trinity_driver *drv);
+
+/* File operations */
+int trinity_open(struct inode *inode, struct file *f);
+int trinity_release(struct inode *inode, struct file *f);
+
+/* Device probing and removing */
+int trinity_probe(struct platform_device *pdev,
+ const struct trinity_desc *desc);
+int trinity_remove(struct platform_device *pdev,
+ const struct trinity_desc *desc);
+
+#endif /* __TRINITY_COMMON_H__ */
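
A note on the timestamps in struct trinity_stat_req: the three ktime_t
fields mark submission, scheduling, and completion, so two intervals fall
out of them naturally. The sketch below is a user-space illustration only,
not code from this patch. struct req_times, delta_ms(), req_sched_ms() and
req_infer_ms() are hypothetical names, and reporting whole milliseconds
computed this way is an assumption on my side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000LL

/* Hypothetical user-space mirror of the three ktime_t stamps, in ns. */
struct req_times {
	int64_t submitted_ns;	/* request entered the global queue */
	int64_t scheduled_ns;	/* request was handed to the device */
	int64_t completed_ns;	/* output notification arrived */
};

/* Elapsed whole milliseconds; clamps to 0 if the stamps are out of order. */
uint32_t delta_ms(int64_t from_ns, int64_t to_ns)
{
	if (to_ns <= from_ns)
		return 0;
	return (uint32_t)((to_ns - from_ns) / NSEC_PER_MSEC);
}

/* Assumed definition: time spent waiting in the queue. */
uint32_t req_sched_ms(const struct req_times *t)
{
	assert(t != NULL);
	return delta_ms(t->submitted_ns, t->scheduled_ns);
}

/* Assumed definition: time spent running on the NPU. */
uint32_t req_infer_ms(const struct req_times *t)
{
	assert(t != NULL);
	return delta_ms(t->scheduled_ns, t->completed_ns);
}
```

Clamping to zero keeps a request that has not been scheduled or completed
yet from reporting a huge unsigned interval.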
diff --git a/drivers/misc/trinity/trinity_vision2_drv.c b/drivers/misc/trinity/trinity_vision2_drv.c
new file mode 100644
index 000000000000..a24eb0f6ac6d
--- /dev/null
+++ b/drivers/misc/trinity/trinity_vision2_drv.c
@@ -0,0 +1,512 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/**
+ * Samsung NPU Trinity Vision 2 driver
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Wook Song <[email protected]>
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/bitmap.h>
+#include <linux/dma-buf.h>
+#include <linux/fs.h>
+#include <linux/hashtable.h>
+#include <linux/highmem.h>
+#include <linux/init.h>
+#include <linux/io.h>
+#include <linux/kthread.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/utsname.h>
+#include <linux/version.h>
+
+#include <linux/delay.h>
+
+#include "trinity_common.h"
+#include "trinity_resv_mem.h"
+#include "trinity_trace.h"
+#include "trinity_vision2_profile.h"
+#include "trinity_vision2_regs.h"
+
+#define TRIV2_DRV_GET_PDATA(drv) ((struct triv2_pdata *)((drv)->pdata))
+#define TRIV2_DRV_GET_CMD_INFO(drv) (&(TRIV2_DRV_GET_PDATA(drv)->cmd_info))
+#define TRIV2_DRV_GET_CMD_BUF(drv) (&(TRIV2_DRV_GET_CMD_INFO(drv)->buf))
+#define TRIV2_DRV_GET_PROF_BUF(drv) (&(TRIV2_DRV_GET_PDATA(drv)->prof_buf))
+#define TRIV2_DRV_GET_BACK_BUF(drv) (&(TRIV2_DRV_GET_PDATA(drv)->back_buf))
+
+#define TRIV2_GET_CMD_FROM_SLOT(info, slot) \
+ ((struct triv2_cmd *)((info)->buf.vaddr + \
+ (slot) * sizeof(struct triv2_cmd)))
+
+#define TRIV2_GET_REQ(req) (container_of(req, struct triv2_req, req))
+
+#define HALF_PAGE_SIZE (PAGE_SIZE >> 1)
+
+enum triv2_cmd_status {
+ STATUS_CMD_NONE = 0,
+ STATUS_CMD_READY = 1,
+ STATUS_CMD_DONE = 2,
+};
+
+/** req command for triv2 */
+struct triv2_cmd {
+ union {
+ struct {
+ uint32_t slot;
+ uint32_t prog_addr;
+ uint32_t prog_size;
+ uint32_t segt_addr;
+ uint32_t num_visa;
+
+ uint32_t priority;
+ uint32_t status;
+ uint32_t input_mode;
+ uint32_t output_mode;
+
+ /** for profiling */
+ uint32_t profile_offset;
+
+ /** for preemptive scheduling */
+ uint32_t program_position;
+
+ /** for batch processing */
+ uint32_t batch_size;
+ uint32_t curr_cnt;
+ uint32_t in_addr[TRIV2_MAX_BATCH_SIZE];
+ uint32_t out_addr[TRIV2_MAX_BATCH_SIZE];
+ uint32_t poll_addr;
+ uint32_t poll_magic;
+ /* deprecated but kept for backward compatibility */
+ uint32_t in_seg_idx;
+ uint32_t out_seg_idx;
+
+ uint32_t total_cycles;
+
+ /* kernel requests */
+ uint32_t in_extern_seg_num;
+ uint32_t out_extern_seg_num;
+ uint32_t in_extern_seg_idx[TRIV2_MAX_TENSORS];
+ uint32_t out_extern_seg_idx[TRIV2_MAX_TENSORS];
+ };
+ uint8_t reserved[TRIV2_MAX_CMD_SIZE];
+ };
+} __packed;
+
+struct triv2_cmd_info {
+ DECLARE_BITMAP(bitmap, TRIV2_MAX_CMDSLOTS);
+ spinlock_t lock;
+
+ struct triv2_req *reqs[TRIV2_MAX_CMDSLOTS];
+ struct triv2_cmd cur_cmd;
+};
+
+struct triv2_hashed_cmd_info {
+ struct trinity_driver *drv;
+ struct hlist_bl_node hnode;
+ struct triv2_req *req;
+ struct triv2_cmd *cmd;
+};
+
+struct triv2_kernel_req {
+ uint32_t in_seg_idx[TRIV2_MAX_TENSORS];
+ uint32_t in_seg_size[TRIV2_MAX_TENSORS];
+ uint32_t out_seg_idx[TRIV2_MAX_TENSORS];
+ uint32_t out_seg_size[TRIV2_MAX_TENSORS];
+};
+
+struct triv2_req {
+ struct trinity_req req;
+
+ int cmd_slot;
+
+ /** kernel requests */
+ struct triv2_kernel_req *kernel;
+
+ /** profiling */
+ uint32_t profile_offset;
+ uint32_t total_cycles;
+
+ /** misc */
+ uint32_t total_segment_size;
+};
+
+struct triv2_idu {
+ phys_addr_t *addrs;
+ size_t addr_num;
+};
+
+struct triv2_pdata {
+ struct trinity_driver *drv;
+ struct list_head list;
+
+ /* idu info */
+ struct triv2_idu idu_cp;
+ struct triv2_idu idu_dsp;
+ uint32_t idu_version;
+
+ /* command info */
+ struct triv2_cmd_info cmd_info;
+};
+
+static void triv2_setup_buffers(struct trinity_driver *drv);
+static int triv2_idu_load(struct trinity_driver *drv, const char *dirpath,
+ bool load_files);
+
+static LIST_HEAD(triv2_driver_list);
+
+static void triv2_cancel_reqs(struct trinity_driver *drv)
+{
+ struct triv2_cmd_info *info;
+ unsigned long flags;
+
+ info = TRIV2_DRV_GET_CMD_INFO(drv);
+ spin_lock_irqsave(&info->lock, flags);
+
+ /* set command done */
+
+ spin_unlock_irqrestore(&info->lock, flags);
+}
+
+static void triv2_reset_devices(struct trinity_driver *drv, bool do_test)
+{
+ triv2_setup_buffers(drv);
+ triv2_idu_load(drv, NULL, false);
+}
+
+static void triv2_reset(struct trinity_driver *drv)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+ struct triv2_pdata *pdata;
+ bool do_test;
+
+ /* FIXME: The HW reset should handle all the devices simultaneously */
+
+ list_for_each_entry(pdata, &triv2_driver_list, list)
+ mutex_lock(&pdata->drv->lock);
+
+ dev_err(dev, "NPU HW reset started");
+
+ /* cancel all requests by force */
+ list_for_each_entry(pdata, &triv2_driver_list, list)
+ triv2_cancel_reqs(pdata->drv);
+
+ /* wait some pending requests in NPU */
+ msleep(100);
+
+ /* reset all devices */
+ do_test = true;
+ list_for_each_entry(pdata, &triv2_driver_list, list) {
+ triv2_reset_devices(pdata->drv, do_test);
+ do_test = false;
+ }
+
+ dev_err(dev, "NPU HW reset completed");
+
+ list_for_each_entry(pdata, &triv2_driver_list, list)
+ mutex_unlock(&pdata->drv->lock);
+}
+
+static long triv2_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+ /* handle ioctl */
+
+ return 0;
+}
+
+static int triv2_open(struct inode *inode, struct file *f)
+{
+ return trinity_open(inode, f);
+}
+
+static const struct file_operations triv2_fops = {
+ .owner = THIS_MODULE,
+ .unlocked_ioctl = triv2_ioctl,
+ .open = triv2_open,
+ .release = trinity_release,
+ .llseek = noop_llseek,
+};
+
+static void triv2_init_common(void)
+{
+ static bool done;
+
+ if (done)
+ return;
+
+ /* init hlists */
+ done = true;
+}
+
+static int triv2_idu_version(struct trinity_driver *drv, uint32_t *major,
+ uint32_t *minor, uint32_t *extra)
+{
+ struct triv2_pdata *pdata;
+ uint32_t val;
+
+ if (!drv || !major || !minor || !extra)
+ return -EINVAL;
+
+ pdata = TRIV2_DRV_GET_PDATA(drv);
+ val = pdata->idu_version;
+ if (val != 0) {
+ *major = (val & TRIV2_IDU_MASK_MAJOR) >> TRIV2_IDU_SHIFT_MAJOR;
+ *minor = (val & TRIV2_IDU_MASK_MINOR) >> TRIV2_IDU_SHIFT_MINOR;
+ *extra = val & TRIV2_IDU_MASK_EXTRA;
+ } else {
+ return -ENOENT;
+ }
+
+ return 0;
+}
+
+static void triv2_idu_check(struct trinity_driver *drv)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct device *dev = drv_to_dev_ptr(drv);
+ uint32_t major, minor, extra;
+
+ pdata->idu_version =
+ ioread32(drv->mmreg_vaddr[0] + OFFSET_NPU_IDU_VERSION);
+ if (triv2_idu_version(drv, &major, &minor, &extra) == 0)
+ dev_info(dev,
+ "Instruction Decode Unit (IDU) v%u.%u.%u detected",
+ major, minor, extra);
+}
+
+static int triv2_idu_load(struct trinity_driver *drv, const char *dirpath,
+ bool load_files)
+{
+ /* load idu data */
+
+ return 0;
+}
+
+static void triv2_idu_unload(struct trinity_driver *drv)
+{
+ /* unload idu data */
+}
+
+static void triv2_setup_buffers(struct trinity_driver *drv)
+{
+ /* setup buffer */
+}
+
+static int32_t triv2_init_pdata(struct trinity_driver *drv)
+{
+ struct triv2_pdata *pdata;
+ struct triv2_cmd_info *cmd_info;
+
+ /* alloc triv2 pdata */
+ drv->pdata = kzalloc(sizeof(struct triv2_pdata), GFP_KERNEL);
+ if (!drv->pdata)
+ return -ENOMEM;
+
+ pdata = drv->pdata;
+ pdata->drv = drv;
+
+ cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+
+ spin_lock_init(&cmd_info->lock);
+ /* init cmd bitmap */
+ bitmap_zero(cmd_info->bitmap, TRIV2_MAX_CMDSLOTS);
+
+ list_add_tail(&pdata->list, &triv2_driver_list);
+
+ return 0;
+}
+
+static int32_t parse_idu_property(struct device *dev,
+ const struct device_node *np,
+ const char *prop_name, struct triv2_idu *idu)
+{
+ struct property *prop;
+ u64 values[TRIV2_IDU_MAX_SECTORS];
+ size_t size;
+ int i, err;
+
+ memset(idu, 0, sizeof(*idu));
+
+ prop = of_find_property(np, prop_name, NULL);
+ if (!prop)
+ return -EINVAL;
+
+ size = prop->length / sizeof(u64);
+ if (size != TRIV2_IDU_MAX_SECTORS) {
+ dev_err(dev, "idu requires %d values", TRIV2_IDU_MAX_SECTORS);
+ return -EINVAL;
+ }
+
+ idu->addr_num = size;
+ idu->addrs = devm_kcalloc(dev, size, sizeof(*idu->addrs), GFP_KERNEL);
+ if (!idu->addrs) {
+ dev_err(dev, "failed to allocate memory for idu values");
+ return -ENOMEM;
+ }
+
+ err = of_property_read_u64_array(np, prop_name, values, size);
+ if (err < 0) {
+ dev_err(dev, "failed to read property u64 array: %d", err);
+ return err;
+ }
+
+ for (i = 0; i < TRIV2_IDU_MAX_SECTORS; i++)
+ idu->addrs[i] = (phys_addr_t)values[i];
+
+ return 0;
+}
+
+/**
+ * triv2_setup_idu() - Setup IDU (e.g., CP, DSP) sections for this device
+ */
+static int triv2_setup_idu(struct trinity_driver *drv)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct device *dev = drv_to_dev_ptr(drv);
+ struct device_node *np = dev->of_node;
+ int err;
+
+ /* get Instruction Decode Unit (IDU) property */
+ err = parse_idu_property(dev, np, "samsung,idu_cp", &pdata->idu_cp);
+ if (err < 0) {
+ dev_err(dev, "Failed to parse idu property: samsung,idu_cp");
+ return err;
+ }
+
+ err = parse_idu_property(dev, np, "samsung,idu_dsp", &pdata->idu_dsp);
+ if (err < 0) {
+ dev_info(dev, "DSP is not supported");
+ pdata->idu_dsp.addrs = NULL;
+ }
+
+ /* try to find the IDU files (default) */
+ if (triv2_idu_load(drv, NULL, true) < 0) {
+ dev_warn(dev, "Failed to load IDU in the default path\n");
+ dev_warn(dev, "Should load IDU using sysfs later\n");
+ } else {
+ triv2_idu_check(drv);
+ }
+
+ /* setup dma info */
+
+ return 0;
+}
+
+/**
+ * triv2_init() - Initialize necessary variables in TRIV2
+ */
+static int32_t triv2_init(struct trinity_driver *drv)
+{
+ triv2_init_common();
+ return triv2_init_pdata(drv);
+}
+
+/**
+ * triv2_cleanup() - Clean up initialized variables in TRIV2
+ */
+static void triv2_cleanup(struct trinity_driver *drv)
+{
+ if (!drv->pdata)
+ return;
+
+ triv2_idu_unload(drv);
+
+ list_del(&(TRIV2_DRV_GET_PDATA(drv)->list));
+ kfree(drv->pdata);
+ drv->pdata = NULL;
+}
+
+static struct trinity_desc triv2_desc = {
+ .type = "triv2",
+ .ver = GENVER(TRINITY_DEV_VISION2, VER_MAJOR, VER_MINOR, VER_EXTRA),
+ .fops = &triv2_fops,
+ /* device management */
+ .reset = triv2_reset,
+ .idu_load = triv2_idu_load,
+ .idu_version = triv2_idu_version,
+ /* req management */
+ .alloc_req = triv2_alloc_req,
+ .dealloc_req = triv2_dealloc_req,
+ .prepare_req = triv2_prepare_req,
+ .invoke_req = triv2_invoke_req,
+};
+
+static const struct of_device_id trinity_match[] = {
+ {
+ .compatible = "samsung,trinity",
+ },
+ { /** sentinel */ },
+};
+
+/**
+ * trinity_triv2_probe() - Probes for Trinity vision devices, inits them if found
+ */
+static int trinity_triv2_probe(struct platform_device *pdev)
+{
+ struct trinity_driver *drv;
+ int err;
+
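+ err = trinity_probe(pdev, &triv2_desc);
+ if (err < 0)
+ return err;
+
+ /*
+ * drv must be set before use. This assumes trinity_probe() stores it
+ * as drvdata, which trinity_triv2_remove() already relies on.
+ */
+ drv = platform_get_drvdata(pdev);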
+
+ err = triv2_init(drv);
+ if (err < 0)
+ goto out_remove;
+
+ err = triv2_setup_idu(drv);
+ if (err < 0) {
+ triv2_cleanup(drv);
+ goto out_remove;
+ }
+
+ err = trinity_create_node(drv);
+ if (err < 0) {
+ triv2_cleanup(drv);
+ goto out_remove;
+ }
+
+ dev_info(drv_to_dev_ptr(drv), "Trinity Vision2 (TRIV2) probed");
+
+ return 0;
+
+out_remove:
+ trinity_remove(pdev, &triv2_desc);
+ return err;
+}
+
+/**
+ * trinity_triv2_remove() - Removes an instance of a Trinity vision device
+ */
+static int trinity_triv2_remove(struct platform_device *pdev)
+{
+ struct trinity_driver *drv;
+
+ drv = (struct trinity_driver *)platform_get_drvdata(pdev);
+
+ trinity_destroy_node(drv);
+ triv2_cleanup(drv);
+ return trinity_remove(pdev, &triv2_desc);
+}
+
+static struct platform_driver trinity_triv2 = {
+ .probe = trinity_triv2_probe,
+ .remove = trinity_triv2_remove,
+ .driver = {
+ .name = "triv2",
+ .owner = THIS_MODULE,
+ .of_match_table = of_match_ptr(trinity_match),
+ },
+};
+
+module_platform_driver(trinity_triv2);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Samsung Electronics");
+MODULE_DESCRIPTION("Samsung NPU device driver for trinity vision 2");
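
For reference, triv2_idu_version() above splits the 32-bit IDU version word
into three fields. Below is a small user-space sketch of the same decoding,
reusing the mask and shift values that trinity_vision2_regs.h defines in
this patch. The names idu_version_decode() and struct idu_version are
hypothetical, and the sample word in the usage note is made up rather than
read from hardware.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same values as TRIV2_IDU_MASK_* and TRIV2_IDU_SHIFT_* in the patch. */
#define IDU_MASK_MAJOR	0xFF000000u
#define IDU_MASK_MINOR	0x00FFF000u
#define IDU_MASK_EXTRA	0x00000FFFu
#define IDU_SHIFT_MAJOR	24
#define IDU_SHIFT_MINOR	12

/* The three fields together use every bit of the version word. */
static_assert((IDU_MASK_MAJOR | IDU_MASK_MINOR | IDU_MASK_EXTRA) ==
	      0xFFFFFFFFu, "IDU masks must cover the whole word");

struct idu_version {
	uint32_t major;	/* bits 31..24 */
	uint32_t minor;	/* bits 23..12 */
	uint32_t extra;	/* bits 11..0 */
};

/* Returns 0 on success, -EINVAL on a NULL output, -ENOENT if val is 0. */
int idu_version_decode(uint32_t val, struct idu_version *out)
{
	if (!out)
		return -EINVAL;
	if (val == 0)
		return -ENOENT;	/* no IDU loaded yet, as in the driver */

	out->major = (val & IDU_MASK_MAJOR) >> IDU_SHIFT_MAJOR;
	out->minor = (val & IDU_MASK_MINOR) >> IDU_SHIFT_MINOR;
	out->extra = val & IDU_MASK_EXTRA;
	return 0;
}
```

With these masks, a register value of 0x02003004 decodes as v2.3.4.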
diff --git a/drivers/misc/trinity/trinity_vision2_regs.h b/drivers/misc/trinity/trinity_vision2_regs.h
new file mode 100644
index 000000000000..d934db04b2b0
--- /dev/null
+++ b/drivers/misc/trinity/trinity_vision2_regs.h
@@ -0,0 +1,210 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/**
+ * Register definitions for the Trinity Vision 2 (TRIV2) NPU device driver
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __REGS_TRINITY_VISION2_H__
+#define __REGS_TRINITY_VISION2_H__
+
+/* Register offsets for NPU CP (Config) */
+#define OFFSET_CP_INFO (0x000) /* Processor Information */
+#define OFFSET_CP_PROC_STAT (0x010) /* Processor Status */
+#define OFFSET_CP_PROC_SET (0x014) /* Processor Control (Set) */
+#define OFFSET_CP_PROC_CLR (0x018) /* Processor Control (Clear) */
+#define OFFSET_CP_IMIF_BASE (0x024) /* Instruction Base Address (DRAM) */
+#define OFFSET_CP_CNT_CFG (0x200) /* CP Performance Counter */
+
+/* Register offsets for NPU CP (IDU Setup) */
+#define OFFSET_NPU_PROG_BASE (0x100) /* GPR00: Instruction Base Address */
+#define OFFSET_NPU_PROG_SIZE (0x104) /* GPR01: Program Size */
+#define OFFSET_NPU_SEGT_ADDR (0x108) /* GPR02: Segment Table Address */
+#define OFFSET_NPU_PROF_ADDR (0x10C) /* GPR03: NPU Profiling Address */
+#define OFFSET_NPU_PROF_SIZE (0x110) /* GPR04: NPU Profiling Size */
+#define OFFSET_NPU_BACK_ADDR (0x114) /* GPR05: NPU Context Backup Address */
+#define OFFSET_NPU_BACK_SIZE (0x118) /* GPR06: NPU Context Backup Size */
+#define OFFSET_NPU_PC (0x11C) /* GPR07: NPU Program Counter */
+
+/* Register offsets for NPU CP (Commands) */
+#define OFFSET_NPU_CMD_READY (0x124) /* GPR09: Command Ready Status */
+#define OFFSET_NPU_CMD_BASE (0x128) /* GPR10: Command Base Address */
+#define OFFSET_NPU_CMD_REQ (0x12C) /* GPR11: Command Request Slots (not used) */
+#define OFFSET_NPU_CMD_FREE (0x130) /* GPR12: Command Free Slots */
+
+/* Register offsets for NPU CP (Cbox Setup) */
+#define OFFSET_NPU_CBOX_BASE (0x134) /* GPR13: NPU CBOX BASE */
+
+/* Register offsets for Debugging */
+#define OFFSET_NPU_IDU_VERSION (0x138) /* GPR14: NPU IDU VERSION */
+#define OFFSET_NPU_IDU_STAGE (0x13C) /* GPR15: NPU IDU STAGE */
+
+#define OFFSET_NPU_CP_DMAI_EADDR (0x300) /* CP DMA Source Address */
+#define OFFSET_NPU_CP_DMAI_IADDR (0x304) /* CP DMA Dest Address */
+#define OFFSET_NPU_CP_DMAI_TSIZE (0x308) /* CP DMA Transfer Size */
+#define OFFSET_NPU_CP_DMAI_CONTR (0x310) /* CP DMA Status */
+#define OFFSET_NPU_CP_DMAI_CMDID (0x314) /* CP DMA Command ID */
+#define OFFSET_NPU_CP_DMAI_LSTID \
+ (0x318) /* CP DMA Command ID of the last transfer */
+
+#define OFFSET_NPU_DLA_DMAI_EADDR (0x1000) /* DLA Input External Address */
+#define OFFSET_NPU_DLA_DMAI_EYMOD \
+ (0x1004) /* DLA Input External Address Y Modifier */
+#define OFFSET_NPU_DLA_DMAI_EZMOD \
+ (0x1008) /* DLA Input External Address Z Modifier */
+#define OFFSET_NPU_DLA_DMAI_IADDR (0x100C) /* DLA Input Internal Address */
+#define OFFSET_NPU_DLA_DMAI_IYMOD \
+ (0x1010) /* DLA Input Internal Address Y Modifier */
+#define OFFSET_NPU_DLA_DMAI_IZMOD \
+ (0x1014) /* DLA Input Internal Address Z Modifier */
+#define OFFSET_NPU_DLA_DMAI_SIZE0 (0x1018) /* DLA Input Data Size 0 */
+#define OFFSET_NPU_DLA_DMAI_SIZE1 (0x101C) /* DLA Input Data Size 1 */
+#define OFFSET_NPU_DLA_DMAI_CTRL (0x1020) /* DLA Input Channel Status */
+
+#define OFFSET_NPU_DLA_DMAO_EADDR (0x1080) /* DLA Output External Address */
+#define OFFSET_NPU_DLA_DMAO_EYMOD \
+ (0x1084) /* DLA Output External Address Y Modifier */
+#define OFFSET_NPU_DLA_DMAO_EZMOD \
+ (0x1088) /* DLA Output External Address Z Modifier */
+#define OFFSET_NPU_DLA_DMAO_IADDR (0x108C) /* DLA Output Internal Address */
+#define OFFSET_NPU_DLA_DMAO_IYMOD \
+ (0x1090) /* DLA Output Internal Address Y Modifier */
+#define OFFSET_NPU_DLA_DMAO_IZMOD \
+ (0x1094) /* DLA Output Internal Address Z Modifier */
+#define OFFSET_NPU_DLA_DMAO_SIZE0 (0x1098) /* DLA Output Data Size 0 */
+#define OFFSET_NPU_DLA_DMAO_SIZE1 (0x109C) /* DLA Output Data Size 1 */
+#define OFFSET_NPU_DLA_DMAO_CTRL (0x10A0) /* DLA Output Channel Status */
+
+#define OFFSET_NPU_DLA_CORE_OPC (0x1100) /* DLA Operation Code */
+#define OFFSET_NPU_DLA_CORE_WIND_CFG (0x1104)
+#define OFFSET_NPU_DLA_CORE_SIZE0 (0x1108)
+#define OFFSET_NPU_DLA_CORE_SIZE1 (0x110C)
+#define OFFSET_NPU_DLA_CORE_ZP (0x1110)
+#define OFFSET_NPU_DLA_CORE_OUT_MULT (0x1114)
+#define OFFSET_NPU_DLA_CORE_IN0_MULT (0x1118)
+#define OFFSET_NPU_DLA_CORE_IN1_MULT (0x111C)
+#define OFFSET_NPU_DLA_CORE_OUT_CFG (0x1120)
+#define OFFSET_NPU_DLA_CORE_OUT_MOD (0x1124)
+#define OFFSET_NPU_DLA_CORE_IN0_CFG (0x1128)
+#define OFFSET_NPU_DLA_CORE_IN0_MOD (0x112C)
+#define OFFSET_NPU_DLA_CORE_IN1_CFG (0x1130)
+#define OFFSET_NPU_DLA_CORE_IN1_MOD (0x1134)
+#define OFFSET_NPU_DLA_CORE_PARAM_ADDR (0x1138)
+#define OFFSET_NPU_DLA_CORE_PSUM_ADDR (0x113C)
+#define OFFSET_NPU_DLA_CORE_CWGT_ADDR (0x1140)
+#define OFFSET_NPU_DLA_CORE_CTR (0x1144) /* DLA Core Status */
+
+#define OFFSET_NPU_DSP_DMAI_EADDR (0x2000) /* DSP Input External Address */
+#define OFFSET_NPU_DSP_DMAI_EYMOD \
+ (0x2004) /* DSP Input External Address Y Modifier */
+#define OFFSET_NPU_DSP_DMAI_EZMOD \
+ (0x2008) /* DSP Input External Address Z Modifier */
+#define OFFSET_NPU_DSP_DMAI_IADDR (0x200C) /* DSP Input Internal Address */
+#define OFFSET_NPU_DSP_DMAI_IYMOD \
+ (0x2010) /* DSP Input Internal Address Y Modifier */
+#define OFFSET_NPU_DSP_DMAI_IZMOD \
+ (0x2014) /* DSP Input Internal Address Z Modifier */
+#define OFFSET_NPU_DSP_DMAI_SIZE0 (0x2018) /* DSP Input Data Size 0 */
+#define OFFSET_NPU_DSP_DMAI_SIZE1 (0x201C) /* DSP Input Data Size 1 */
+#define OFFSET_NPU_DSP_DMAI_CTRL (0x2020) /* DSP Input Channel Status */
+
+#define OFFSET_NPU_DSP_DMAO_EADDR (0x2080) /* DSP Output External Address */
+#define OFFSET_NPU_DSP_DMAO_EYMOD \
+ (0x2084) /* DSP Output External Address Y Modifier */
+#define OFFSET_NPU_DSP_DMAO_EZMOD \
+ (0x2088) /* DSP Output External Address Z Modifier */
+#define OFFSET_NPU_DSP_DMAO_IADDR (0x208C) /* DSP Output Internal Address */
+#define OFFSET_NPU_DSP_DMAO_IYMOD \
+ (0x2090) /* DSP Output Internal Address Y Modifier */
+#define OFFSET_NPU_DSP_DMAO_IZMOD \
+ (0x2094) /* DSP Output Internal Address Z Modifier */
+#define OFFSET_NPU_DSP_DMAO_SIZE0 (0x2098) /* DSP Output Data Size 0 */
+#define OFFSET_NPU_DSP_DMAO_SIZE1 (0x209C) /* DSP Output Data Size 1 */
+#define OFFSET_NPU_DSP_DMAO_CTRL (0x20A0) /* DSP Output Channel Status */
+#define OFFSET_NPU_DSP_CORE_CTRL (0x2140) /* DSP Core Status */
+
+/* Register offsets for NPU DSP */
+#define OFFSET_DSP_INFO (0x000) /* Processor Information */
+#define OFFSET_DSP_PROC_STAT (0x010) /* Processor Status */
+#define OFFSET_DSP_PROC_SET (0x014) /* Processor Control (Set) */
+#define OFFSET_DSP_PROC_CLR (0x018) /* Processor Control (Clear) */
+#define OFFSET_DSP_IMIF_BASE (0x024) /* Instruction Base Address (DRAM) */
+
+/* Register offsets for NPU ComBox (IRQ) */
+#define OFFSET_CBOX_EXT_IRQ_MSK (0x100) /* External IRQ Output Mask */
+#define OFFSET_CBOX_EXT_IRQ_STA (0x104) /* External IRQ Output Status */
+#define OFFSET_CBOX_CP_SWI_CLR (0x134) /* CP IRQ output Clear */
+#define OFFSET_CBOX_DSP_SWI_CLR (0x154) /* DSP IRQ output Clear */
+
+/* Location of bits inside corresponding registers */
+#define BIT_CLR_IRQ_OUT BIT(24)
+#define BIT_CLR_PAUSE BIT(0)
+#define BIT_SET_SEND_EVT1 BIT(18)
+#define BIT_SET_PAUSE BIT(0)
+#define BIT_STAT_PAUSED BIT(1)
+
+/* Performance counter configurations */
+#define BIT_CNT_DST_EN BIT(6)
+#define BIT_CNT_IST_EN BIT(5)
+#define BIT_CNT_ST_EN BIT(4)
+#define BIT_CNT_FR_EN BIT(0)
+
+/* Bit masks */
+#define MASK_DSP_SWI_STA BIT_MASK(1)
+#define MASK_CP_SWI_STA BIT_MASK(0)
+
+#define MASK_STAT_WFE_PARAM GENMASK(14, 6)
+#define MASK_STAT_WFE_PARAM_EVT1 BIT_MASK(8)
+#define MASK_STAT_WFE BIT_MASK(5)
+#define MASK_STAT_PAUSED BIT_MASK(1)
+#define MASK_STAT_PAUSE BIT_MASK(0)
+
+#define VER_MAJOR (2)
+#define VER_MINOR (0)
+#define VER_EXTRA (0)
+
+#define read_idu_file(file, pos, addr, size) kernel_read(file, addr, size, &(pos))
+
+/** Macros for Instruction Decode Unit (IDU) */
+#define TRIV2_IDU_DIRPATH_FMT "/lib/modules/%s/kernel/soc/idu"
+#define TRIV2_IDU_MAX_SECTORS (3)
+#define TRIV2_IDU_ZEROIDX (0)
+#define TRIV2_IDU_DATAIDX (1)
+#define TRIV2_IDU_CODEIDX (2)
+#define TRIV2_IDU_ADDR(addr) ((uint32_t)(addr))
+#define TRIV2_IDU_MAXSIZE (1 << 20) /* 1 MiB */
+
+#define TRIV2_IDU_CP_DSPM_SIZE (0x10000)
+
+#define TRIV2_IDU_MASK_MAJOR (0xFF000000)
+#define TRIV2_IDU_MASK_MINOR (0x00FFF000)
+#define TRIV2_IDU_MASK_EXTRA (0x00000FFF)
+
+#define TRIV2_IDU_SHIFT_MAJOR (24)
+#define TRIV2_IDU_SHIFT_MINOR (12)
+
+#define TRIV2_MODEL_HASH_BITS (8)
+#define TRIV2_MODEL_HASH_SIZE (1 << TRIV2_MODEL_HASH_BITS)
+#define TRIV2_PROFILE_HASH_BITS (6)
+#define TRIV2_PROFILE_HASH_SIZE (1 << TRIV2_PROFILE_HASH_BITS)
+#define TRIV2_PROFILE_HASH_KEY(id) (hash_long((id), TRIV2_PROFILE_HASH_BITS))
+
+#define TRIV2_MAX_SEGMENTS (256)
+/** Fits in a single 4K Page */
+#define TRIV2_MAX_CMDSLOTS (PAGE_SIZE / sizeof(struct triv2_cmd))
+
+#define TRIV2_MAX_TENSORS (16)
+#define TRIV2_MAX_CMD_SIZE (512)
+#define TRIV2_MAX_BATCH_SIZE (32)
+
+#define TRIV2_DLA_GBUFFER_SIZE (0x80000)
+#define TRIV2_DSP_DSPM_OFFSET (0x10000)
+
+/* 4MiB (~300ns to flush all caches) */
+#define TRIV2_CACHE_FLUSH_THRESHOLD (0x400000)
+#define TRIV2_KERN_TIMEOUT_RESET (1000)
+
+#endif /* __REGS_TRINITY_VISION2_H__ */
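
A quick check of the command-slot arithmetic: TRIV2_MAX_CMDSLOTS is
PAGE_SIZE / sizeof(struct triv2_cmd), and struct triv2_cmd is a union of
its named fields with a reserved[TRIV2_MAX_CMD_SIZE] array. The user-space
sketch below mirrors that layout with the constants from this header to
show that the named fields fit in the reserved area. The 4 KiB page size is
an assumption (it is configuration dependent), and struct cmd_fields,
union cmd and cmd_slots_per_page() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TENSORS		16	/* TRIV2_MAX_TENSORS */
#define MAX_CMD_SIZE		512	/* TRIV2_MAX_CMD_SIZE */
#define MAX_BATCH_SIZE		32	/* TRIV2_MAX_BATCH_SIZE */
#define ASSUMED_PAGE_SIZE	4096	/* assumption: 4 KiB pages */

/* Named fields of struct triv2_cmd, in the order the patch declares them. */
struct cmd_fields {
	uint32_t slot, prog_addr, prog_size, segt_addr, num_visa;
	uint32_t priority, status, input_mode, output_mode;
	uint32_t profile_offset;
	uint32_t program_position;
	uint32_t batch_size, curr_cnt;
	uint32_t in_addr[MAX_BATCH_SIZE];
	uint32_t out_addr[MAX_BATCH_SIZE];
	uint32_t poll_addr, poll_magic;
	uint32_t in_seg_idx, out_seg_idx;
	uint32_t total_cycles;
	uint32_t in_extern_seg_num, out_extern_seg_num;
	uint32_t in_extern_seg_idx[MAX_TENSORS];
	uint32_t out_extern_seg_idx[MAX_TENSORS];
};

/* The reserved array pins the command to a fixed size. */
union cmd {
	struct cmd_fields f;
	uint8_t reserved[MAX_CMD_SIZE];
};

/* If a field is added past the reserved size, the build fails here. */
static_assert(sizeof(struct cmd_fields) <= MAX_CMD_SIZE,
	      "named command fields must fit in the reserved area");

size_t cmd_slots_per_page(void)
{
	return ASSUMED_PAGE_SIZE / sizeof(union cmd);
}
```

Under these assumptions the named fields take 464 bytes, each command
occupies 512 bytes, and one page holds 8 command slots. A similar
compile-time check in the driver would catch a field addition that
silently grows the command past TRIV2_MAX_CMD_SIZE.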
diff --git a/include/uapi/misc/trinity.h b/include/uapi/misc/trinity.h
new file mode 100644
index 000000000000..50946cd0005a
--- /dev/null
+++ b/include/uapi/misc/trinity.h
@@ -0,0 +1,458 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/**
+ * User-level header for trinity devices.
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Wook Song <[email protected]>
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2020 Parichay Kapoor <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __TRINITY_H__
+#define __TRINITY_H__
+
+#include <linux/types.h>
+
+#define TRINITY_API_LEVEL 12
+
+/**
+ * enum trinity_state - Enum that describes a trinity device state
+ * @TRINITY_STATE_UNKNOWN: A device has unknown state
+ * @TRINITY_STATE_PAUSE: A device is paused
+ * @TRINITY_STATE_READY: A device is ready
+ * @TRINITY_STATE_END: End of trinity_state
+ */
+enum trinity_state {
+ TRINITY_STATE_UNKNOWN = -1,
+ TRINITY_STATE_PAUSE = 0,
+ TRINITY_STATE_READY,
+ TRINITY_STATE_END,
+};
+
+/**
+ * enum trinity_input_mode - Enum that describes an input source
+ * @TRINITY_INPUT_UNKNOWN: Unknown input mode
+ * @TRINITY_INPUT_CPU: Input fed by CPU
+ * @TRINITY_INPUT_HW: Input fed by third-party HW
+ * @TRINITY_INPUT_END: End of trinity_input_mode
+ */
+enum trinity_input_mode {
+ TRINITY_INPUT_UNKNOWN = -1,
+ TRINITY_INPUT_CPU = 0,
+ TRINITY_INPUT_HW,
+ TRINITY_INPUT_END,
+};
+
+/**
+ * enum trinity_output_mode - Enum that describes an output source
+ * @TRINITY_OUTPUT_UNKNOWN: Unknown output mode
+ * @TRINITY_OUTPUT_CPU_INTR: Output completion handling by interrupt
+ * @TRINITY_OUTPUT_CPU_POLL: Output completion handling by polling
+ * @TRINITY_OUTPUT_HW: Output completion handling by third-party HW
+ * @TRINITY_OUTPUT_END: End of trinity_output_mode
+ */
+enum trinity_output_mode {
+ TRINITY_OUTPUT_UNKNOWN = -1,
+ TRINITY_OUTPUT_CPU_INTR = 0,
+ TRINITY_OUTPUT_CPU_POLL,
+ TRINITY_OUTPUT_HW,
+ TRINITY_OUTPUT_END,
+};
+
+/**
+ * enum trinity_app_status - Enum that describes an app status
+ * @TRINITY_APP_STATUS_UNKNOWN: Unknown app status
+ * @TRINITY_APP_STATUS_ERROR: App has got some errors
+ * @TRINITY_APP_STATUS_PENDING: App is currently pending
+ * @TRINITY_APP_STATUS_STARTED: App was started
+ * @TRINITY_APP_STATUS_TERMINATED: App was terminated
+ */
+enum trinity_app_status {
+ TRINITY_APP_STATUS_UNKNOWN = 0,
+ TRINITY_APP_STATUS_ERROR = 1,
+ TRINITY_APP_STATUS_PENDING = 2,
+ TRINITY_APP_STATUS_STARTED = 3,
+ TRINITY_APP_STATUS_TERMINATED = 4
+};
+
+/**
+ * enum trinity_req_status - Enum that describes a request status
+ * @TRINITY_REQ_STATUS_UNKNOWN: Unknown request status
+ * @TRINITY_REQ_STATUS_ERROR: Request has got some errors
+ * @TRINITY_REQ_STATUS_PENDING: Request is currently pending
+ * @TRINITY_REQ_STATUS_RUNNING: Request is currently running
+ * @TRINITY_REQ_STATUS_FINISHED: Request was finished
+ */
+enum trinity_req_status {
+ TRINITY_REQ_STATUS_UNKNOWN = 0,
+ TRINITY_REQ_STATUS_ERROR = 1,
+ TRINITY_REQ_STATUS_PENDING = 2, /* A request is submitted */
+ TRINITY_REQ_STATUS_RUNNING = 3, /* A request is running on NPU */
+ TRINITY_REQ_STATUS_FINISHED = 4 /* A request is just finished */
+};
+
+/**
+ * enum trinity_req_priority - Enum that describes a request priority
+ * @TRINITY_REQ_PRIORITY_LOW: Low priority
+ * @TRINITY_REQ_PRIORITY_MID: Mid priority scheduled with a higher chance than low one
+ * @TRINITY_REQ_PRIORITY_HIGH: High priority preempting lower priority requests
+ */
+enum trinity_req_priority {
+ TRINITY_REQ_PRIORITY_LOW = 0,
+ TRINITY_REQ_PRIORITY_MID = 1,
+ TRINITY_REQ_PRIORITY_HIGH = 2,
+};
+
+/**
+ * enum trinity_hwmem_type - A type of DMA buffer allocation method.
+ * @TRINITY_HWMEM_DMA_CONT: Use CMA to allocate backing storage of DMA buffers.
+ * @TRINITY_HWMEM_DMA_IOMMU: Use IOMMU to allocate backing storage of DMA buffers.
+ * @TRINITY_HWMEM_END: Sentinel.
+ */
+enum trinity_hwmem_type {
+ TRINITY_HWMEM_DMA_CONT = 0,
+ TRINITY_HWMEM_DMA_IOMMU,
+ TRINITY_HWMEM_END,
+};
+
+#ifndef TASK_COMM_LEN
+#define TASK_COMM_LEN 16
+#endif
+
+#define TRINITY_APP_NAME_MAX TASK_COMM_LEN
+#define TRINITY_APP_STAT_MAX 10
+#define TRINITY_REQ_STAT_MAX 10
+
+/**
+ * struct trinity_ioctl_stat_app - Describes stat of the target app
+ * @app_id: Trinity app id (currently, equal to pid)
+ * @name: Trinity app name
+ * @status: Trinity app status
+ * @num_total_reqs: Number of total requests in app (including finished ones)
+ * @num_active_reqs: Number of active (running or pending) requests in app
+ * @total_alloc_mem: Total size of allocated memory in the device
+ * @total_freed_mem: Total size of freed memory in the device
+ */
+struct trinity_ioctl_stat_app {
+ __s32 app_id;
+
+ char name[TRINITY_APP_NAME_MAX];
+ enum trinity_app_status status;
+
+ __u32 num_total_reqs;
+ __u32 num_active_reqs;
+
+ __u64 total_alloc_mem;
+ __u64 total_freed_mem;
+} __packed;
+
+/**
+ * struct trinity_ioctl_stat_apps - Describes stats of the latest apps
+ * @num_apps: Number of apps for the stat list
+ * @stat: Stat of the latest apps
+ */
+struct trinity_ioctl_stat_apps {
+ __u32 num_apps;
+ struct trinity_ioctl_stat_app stat[TRINITY_APP_STAT_MAX];
+} __packed;
+
+/**
+ * struct trinity_ioctl_stat_req - Describes stat of the target request
+ * @req_id: Trinity req id
+ * @model_id: Trinity model id
+ * @priority: Request priority (low, mid, or high)
+ * @status: Request status
+ * @sched_time: scheduling time in ms
+ * @infer_time: inference time in ms
+ */
+struct trinity_ioctl_stat_req {
+ __s32 req_id;
+ __u64 model_id;
+
+ enum trinity_req_priority priority;
+ enum trinity_req_status status;
+
+ __u32 sched_time;
+ __u32 infer_time;
+} __packed;
+
+/**
+ * struct trinity_ioctl_stat_reqs - Describes stats of the latest reqs
+ * @app_id: Trinity app id (0 means 'current')
+ * @num_reqs: Number of reqs for stat list
+ * @stat: Stat of the latest reqs
+ */
+struct trinity_ioctl_stat_reqs {
+ __s32 app_id;
+ __u32 num_reqs;
+ struct trinity_ioctl_stat_req stat[TRINITY_REQ_STAT_MAX];
+} __packed;
+
+/**
+ * struct trinity_ioctl_hwmem - A structure that describes hardware memory (hwmem)
+ * @type: The type of hwmem
+ * @size: The size of hwmem
+ * @dbuf_fd: File descriptor for dmabuf representing hwmem
+ */
+struct trinity_ioctl_hwmem {
+ enum trinity_hwmem_type type;
+ __u64 size;
+ __s32 dbuf_fd;
+} __packed;
+
+/**
+ * struct trinity_ioctl_profile_meta - Describes profiling meta info.
+ * @req_id: The target req id for profiling
+ * @total_cycles: The total number of cycles of the given req
+ * @total_ops: The total number of operations of the given req
+ * @input_footprint: The DRAM footprint of input data
+ * @output_footprint: The DRAM footprint of output data
+ * @profile_size: The size of profiling data
+ */
+struct trinity_ioctl_profile_meta {
+ __s32 req_id;
+ __s64 total_cycles;
+ __u32 total_ops;
+ __s64 input_footprint;
+ __s64 output_footprint;
+ __u32 profile_size;
+} __packed;
+
+/**
+ * struct trinity_ioctl_profile_buff - Describes profiling buff info.
+ * @req_id: The target req id for profiling
+ * @profile_pos: The start position to extract profiling data
+ * @profile_size: The size of user-allocated profiling buffer
+ * @profile_buf: The profiling buffer which user allocated
+ */
+struct trinity_ioctl_profile_buff {
+ __s32 req_id;
+ __u32 profile_pos;
+ __u32 profile_size;
+ void __user *profile_buf;
+} __packed;
+
+/**
+ * struct trinity_ioctl_model - A structure that configures a model registered on NPU
+ * @id: Id for NPU model to extract the base phys addr
+ * @dbuf_fd: File descriptor for dmabuf representing the model
+ * @program_offset_addr: Offset address for the instructions (NPU_PROG_BASE)
+ * @program_size: Size of the program instructions (NPU_PROG_SIZE)
+ * @version: The version of npubinfmt
+ * @endp_trnt_model_common: Indicator for the end of common model parameters
+ * @weight_offset_addr: Offset address for storing weights (NPU_WGT_BASE)
+ * @metadata_dbuf_fd: File descriptor for dmabuf representing the metadata
+ * @metadata_ext_dbuf_fd: File descriptor for dmabuf representing the metadata extra
+ * @metadata_ext_size: Size of the metadata extra
+ * @num_visa_insts: Number of virtual ISA instructions
+ */
+struct trinity_ioctl_model {
+ __u64 id;
+ __s32 dbuf_fd;
+ __u64 program_offset_addr;
+ __u64 program_size;
+ __u32 version;
+ union {
+ __u8 endp_trnt_model_common[0];
+ struct {
+ __u64 weight_offset_addr;
+ } __packed;
+ struct {
+ __s32 metadata_dbuf_fd;
+ __s32 metadata_ext_dbuf_fd;
+ __u64 metadata_ext_size;
+ __u32 num_visa_insts;
+ } __packed;
+ };
+} __packed;
+
+/**
+ * struct trinity_ioctl_input - A structure that configures an input passed to NPU
+ * @dbuf_fd: File descriptor for dmabuf of I/O buffer (or segment table)
+ * @model_id: Model id received when setting the model in the NPU
+ * @req_id: Request id to distinguish each run_input
+ * @timeout_ms: Timeout in ms, zero is regarded as preemption
+ * @priority: Priority (LOW: 0, MID: 1, HIGH: 2)
+ * @endp_trnt_input_common: Indicator for the end of common input parameters
+ * @activation_offset_addr0: Offset address for storing activations (NPU_ACT_BASE0)
+ * @activation_offset_addr1: Offset address for storing activations (NPU_ACT_BASE1)
+ * @num_segments: Number of segments
+ * @input_mode: Input mode (who is supposed to feed input)
+ * @output_mode: Output mode (who is supposed to retrieve output)
+ * @hw_input_seg: Third-party HW's input segment idx
+ * @hw_output_seg: Third-party HW's output segment idx
+ * @task_handle: user requested task handle
+ * @subtask_idx: user requested subtask idx
+ * @task_id: kernel module requested task id
+ */
+struct trinity_ioctl_input {
+ __s32 dbuf_fd;
+ __u64 model_id;
+ __s32 req_id;
+ __s64 timeout_ms;
+ __u32 priority;
+ union {
+ __u8 endp_trnt_input_common[0];
+ struct {
+ /* added for TRIV-1 */
+ __u64 activation_offset_addr0;
+ __u64 activation_offset_addr1;
+ } __packed;
+ struct {
+ /* added for TRIV-2 */
+ __u32 num_segments;
+ enum trinity_input_mode input_mode;
+ enum trinity_output_mode output_mode;
+ __s32 hw_input_seg;
+ __s32 hw_output_seg;
+ /* [optional] vd scheduler info */
+ union {
+ struct { /* user request */
+ __u32 task_handle;
+ __u32 subtask_idx;
+ } __packed;
+ struct { /* kernel request */
+ __u32 task_id;
+ } __packed;
+ };
+ } __packed;
+ };
+} __packed;
+
+#define TRINITY_MASK_DEV (0xFF000000)
+#define TRINITY_MASK_MAJOR_VER (0x00FF0000)
+#define TRINITY_MASK_MINOR_VER (0x0000FF00)
+#define TRINITY_MASK_EXTRA_VER (0x000000FF)
+
+/**
+ * struct trinity_ioctl_fpga_memcpy - A structure that contains driver-assisted memcpy
+ * @dbuf_fd: File descriptor for dmabuf of the target buffer
+ * @dbuf_off: Offset from the dmabuf base address
+ * @user_addr: Address of user-level buffer
+ * @user_size: Size of user-level buffer
+ *
+ * @note: This is a workaround structure for the FPGA environment
+ */
+struct trinity_ioctl_fpga_memcpy {
+ __s32 dbuf_fd;
+ __u32 dbuf_off;
+ void __user *user_addr;
+ __u64 user_size;
+} __packed;
+
+#define TRINITY_SHIFT_DEV (24)
+#define TRINITY_SHIFT_MAJOR_VER (16)
+#define TRINITY_SHIFT_MINOR_VER (8)
+#define TRINITY_SHIFT_EXTRA_VER (0)
+#define TRINITY_SHIFT_MODEL_ID (16)
+
+#define trinity_gen_ver(dev, mj, mn, ex) \
+ ( \
+ ((dev) << TRINITY_SHIFT_DEV) | ((mj) << TRINITY_SHIFT_MAJOR_VER) | \
+ ((mn) << TRINITY_SHIFT_MINOR_VER) | \
+ ((ex) << TRINITY_SHIFT_EXTRA_VER) \
+ )
+
+/**
+ * enum trinity_dev_type - Enum that describes a trinity device type
+ * @TRINITY_DEV_UNKNOWN: Unknown device type
+ * @TRINITY_DEV_VISION: Trinity Vision (TRIV)
+ * @TRINITY_DEV_AUDIO: Trinity Asr (TRIA)
+ * @TRINITY_DEV_VISION2: Trinity Vision2 (TRIV2)
+ * @TRINITY_DEV_VISION2_CUSE: Trinity Vision2 (TRIV2), CUSE-based impl.
+ * @TRINITY_DEV_END: End of trinity_dev_type
+ */
+enum trinity_dev_type {
+ TRINITY_DEV_UNKNOWN = 0,
+ TRINITY_DEV_VISION,
+ TRINITY_DEV_AUDIO,
+ TRINITY_DEV_VISION2,
+ TRINITY_DEV_VISION2_CUSE, /* CUSE-based impl. for triv2 */
+ TRINITY_DEV_END /* sentinel */
+};
+
+/**
+ * Major number cannot be dynamic as ioctls need it.
+ */
+#define TRINITY_DRIVER_MAGIC 0x88
+
+#define TRINITY_IO(no) _IO(TRINITY_DRIVER_MAGIC, no)
+#define TRINITY_IOR(no, data_type) _IOR(TRINITY_DRIVER_MAGIC, no, data_type)
+#define TRINITY_IOW(no, data_type) _IOW(TRINITY_DRIVER_MAGIC, no, data_type)
+#define TRINITY_IOWR(no, data_type) _IOWR(TRINITY_DRIVER_MAGIC, no, data_type)
+
+/** Device Information */
+
+/** Get the device version information from the driver */
+#define TRINITY_IOCTL_GET_VERSION TRINITY_IOR(1, __u32)
+/** Get the device API level from the driver */
+#define TRINITY_IOCTL_GET_API_LEVEL TRINITY_IOR(2, __u32)
+/** Get the device state from the driver */
+#define TRINITY_IOCTL_GET_STATE TRINITY_IOR(3, __s32)
+/** Get the device tops information from the driver */
+#define TRINITY_IOCTL_GET_TOPS TRINITY_IOR(4, __u32)
+/** Get the device dspm information from the driver */
+#define TRINITY_IOCTL_GET_DSPM TRINITY_IOR(5, __u32)
+/** Get the next request ID from the driver */
+#define TRINITY_IOCTL_GET_NEXT_REQUEST TRINITY_IOR(6, __s32)
+
+/** Device Control */
+
+/** Allocate driver-managed memory */
+#define TRINITY_IOCTL_HWMEM_ALLOC TRINITY_IOW(21, struct trinity_ioctl_hwmem)
+
+/** De-allocate driver-managed memory */
+#define TRINITY_IOCTL_HWMEM_DEALLOC TRINITY_IOW(22, struct trinity_ioctl_hwmem)
+
+/** Register the given model config in the device and return model id */
+#define TRINITY_IOCTL_REGISTER_MODEL \
+ TRINITY_IOWR(23, struct trinity_ioctl_model)
+
+/** Unregister the model config associated with the given model_id */
+#define TRINITY_IOCTL_DEREGISTER_MODEL TRINITY_IOW(24, __u64)
+
+/** Run the device with the given input */
+#define TRINITY_IOCTL_RUN_INPUT TRINITY_IOWR(25, struct trinity_ioctl_input)
+
+/** Stop all requests submitted to the device */
+#define TRINITY_IOCTL_STOP_REQUESTS TRINITY_IO(26)
+
+/** Stop the target request with id returned by run_input */
+#define TRINITY_IOCTL_STOP_REQUEST TRINITY_IOW(27, __s32)
+
+/** Device Statistics/Profile */
+
+/** Get the current app stat in the opened device */
+#define TRINITY_IOCTL_STAT_CURRENT_APP \
+ TRINITY_IOR(51, struct trinity_ioctl_stat_app)
+
+/** Get latest apps' stat of the opened device */
+#define TRINITY_IOCTL_STAT_APPS TRINITY_IOR(52, struct trinity_ioctl_stat_apps)
+
+/** Get latest reqs' stat in the target app */
+#define TRINITY_IOCTL_STAT_REQS TRINITY_IOR(53, struct trinity_ioctl_stat_reqs)
+
+/** Get profiling metadata of the request */
+#define TRINITY_IOCTL_GET_PROFILE_META \
+ TRINITY_IOWR(54, struct trinity_ioctl_profile_meta)
+
+/** Get profiling per-op data of the request */
+#define TRINITY_IOCTL_GET_PROFILE_BUFF \
+ TRINITY_IOWR(55, struct trinity_ioctl_profile_buff)
+
+/** Device Testing/Workaround */
+
+/** Driver-assisted memory copy for FPGA env. */
+#define TRINITY_IOCTL_FPGA_MEMCPY \
+ TRINITY_IOWR(91, struct trinity_ioctl_fpga_memcpy)
+
+/** A wrapper of trinity_run_internal_req() */
+#define TRINITY_IOCTL_RUN_INTERNAL_REQ TRINITY_IOW(92, dev_t)
+
+#ifdef __KERNEL__
+__s32 trinity_run_internal_req(dev_t);
+#endif
+#endif /* __TRINITY_H__ */
--
2.25.1
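
A note on the version macros in the uapi header above: the masks and
shifts split a 32-bit word into four 8-bit fields (device type, major,
minor, extra). The sketch below is a user-space illustration of packing
and unpacking such a word. The example_* helpers are hypothetical and not
part of the patch; only the TRINITY_MASK_* and TRINITY_SHIFT_* values are
copied from the header. It assumes each field fits in 8 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Masks and shifts copied from the uapi header in this patch. */
#define TRINITY_MASK_DEV (0xFF000000)
#define TRINITY_MASK_MAJOR_VER (0x00FF0000)
#define TRINITY_MASK_MINOR_VER (0x0000FF00)
#define TRINITY_MASK_EXTRA_VER (0x000000FF)

#define TRINITY_SHIFT_DEV (24)
#define TRINITY_SHIFT_MAJOR_VER (16)
#define TRINITY_SHIFT_MINOR_VER (8)
#define TRINITY_SHIFT_EXTRA_VER (0)

/* Hypothetical helper (not in the patch): pack four 8-bit fields. */
static inline uint32_t example_pack_ver(uint32_t dev, uint32_t mj,
					uint32_t mn, uint32_t ex)
{
	/* Each field must fit in its 8-bit slot. */
	assert(dev <= 0xFF && mj <= 0xFF && mn <= 0xFF && ex <= 0xFF);

	return (dev << TRINITY_SHIFT_DEV) | (mj << TRINITY_SHIFT_MAJOR_VER) |
	       (mn << TRINITY_SHIFT_MINOR_VER) | (ex << TRINITY_SHIFT_EXTRA_VER);
}

/* Hypothetical accessors: mask first, then shift the field down. */
static inline uint32_t example_ver_dev(uint32_t ver)
{
	return (ver & TRINITY_MASK_DEV) >> TRINITY_SHIFT_DEV;
}

static inline uint32_t example_ver_major(uint32_t ver)
{
	return (ver & TRINITY_MASK_MAJOR_VER) >> TRINITY_SHIFT_MAJOR_VER;
}

static inline uint32_t example_ver_minor(uint32_t ver)
{
	return (ver & TRINITY_MASK_MINOR_VER) >> TRINITY_SHIFT_MINOR_VER;
}

static inline uint32_t example_ver_extra(uint32_t ver)
{
	return (ver & TRINITY_MASK_EXTRA_VER) >> TRINITY_SHIFT_EXTRA_VER;
}
```

For instance, example_pack_ver(3, 2, 1, 0) would describe device type 3
(TRINITY_DEV_VISION2 in the enum above) at version 2.1.0, and the
accessors recover the individual fields from that word.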

2022-07-25 07:05:31

by Jiho Chu

[permalink] [raw]
Subject: [PATCH 2/9] trinity: Add dma memory module

This patch adds the memory management module.

It provides an abstraction layer for handling DMA buffers.
The buffers are reserved when the driver starts and are
used to serve user requests. Alloc/dealloc functions
are provided to access the DMA buffers, and references
to the buffers are counted.

The DMA buffer address and its allocation depend on the
hardware, so a config for each supported hardware is introduced.

Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
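Note (illustration only, not part of the commit): the reserved pool in
this patch tracks pages with a bitmap and hands out the first run of
free pages that is long enough. The user-space sketch below shows that
first-fit idea in plain C. All ex_* names are hypothetical, and the
driver itself uses the kernel bitmap helpers rather than these loops.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define EX_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return 1 if the page at index 'bit' is in use, 0 otherwise. */
static inline int ex_test_bit(const unsigned long *map, size_t bit)
{
	return (map[bit / EX_BITS_PER_WORD] >> (bit % EX_BITS_PER_WORD)) & 1UL;
}

/* Mark 'nr' pages starting at 'start' as used. */
static inline void ex_set_bits(unsigned long *map, size_t start, size_t nr)
{
	for (size_t i = start; i < start + nr; i++)
		map[i / EX_BITS_PER_WORD] |= 1UL << (i % EX_BITS_PER_WORD);
}

/* Mark 'nr' pages starting at 'start' as free. */
static inline void ex_clear_bits(unsigned long *map, size_t start, size_t nr)
{
	for (size_t i = start; i < start + nr; i++)
		map[i / EX_BITS_PER_WORD] &= ~(1UL << (i % EX_BITS_PER_WORD));
}

/* First fit: index of the first run of 'nr' free pages, or -1 if none. */
static inline long ex_find_free_region(const unsigned long *map,
				       size_t num_bits, size_t nr)
{
	size_t run = 0;

	assert(nr > 0);
	for (size_t i = 0; i < num_bits; i++) {
		/* A used page breaks the current run of free pages. */
		run = ex_test_bit(map, i) ? 0 : run + 1;
		if (run == nr)
			return (long)(i + 1 - nr);
	}
	return -1;
}

/* Find a free run and mark it used; returns the first page or -1. */
static inline long ex_alloc_pages(unsigned long *map, size_t num_bits,
				  size_t nr)
{
	long page = ex_find_free_region(map, num_bits, nr);

	if (page >= 0)
		ex_set_bits(map, (size_t)page, nr);
	return page;
}

/* Give a previously allocated run back to the pool. */
static inline void ex_free_pages(unsigned long *map, size_t page, size_t nr)
{
	ex_clear_bits(map, page, nr);
}
```

A caller would multiply the returned page index by the page size to get
the offset that is added to the pool's DMA and virtual base addresses,
which is what the allocator in this patch does with its own bitmap.
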
drivers/misc/trinity/Makefile | 1 +
drivers/misc/trinity/trinity.c | 91 +++++
drivers/misc/trinity/trinity_common.h | 2 +-
drivers/misc/trinity/trinity_hwmem.c | 438 +++++++++++++++++++++
drivers/misc/trinity/trinity_hwmem.h | 45 +++
drivers/misc/trinity/trinity_resv_mem.c | 264 +++++++++++++
drivers/misc/trinity/trinity_resv_mem.h | 41 ++
drivers/misc/trinity/trinity_vision2_drv.c | 1 +
8 files changed, 882 insertions(+), 1 deletion(-)
create mode 100644 drivers/misc/trinity/trinity_hwmem.c
create mode 100644 drivers/misc/trinity/trinity_hwmem.h
create mode 100644 drivers/misc/trinity/trinity_resv_mem.c
create mode 100644 drivers/misc/trinity/trinity_resv_mem.h

diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
index a8e5697d6d85..cf313c3afb3d 100644
--- a/drivers/misc/trinity/Makefile
+++ b/drivers/misc/trinity/Makefile
@@ -3,5 +3,6 @@
obj-$(CONFIG_TRINITY_VISION2) += trinity_vision2.o

trinity-y := trinity.o
+trinity-y += trinity_resv_mem.o trinity_hwmem.o

trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
index a85904c17f2e..1ee9403dbdca 100644
--- a/drivers/misc/trinity/trinity.c
+++ b/drivers/misc/trinity/trinity.c
@@ -33,6 +33,7 @@
#include <linux/uaccess.h>

#include "trinity_common.h"
+#include "trinity_resv_mem.h"

#define BASE_DEV_NAME "trinity"

@@ -163,11 +164,88 @@ int trinity_open(struct inode *inode, struct file *f)
return 0;
}

+/**
+ * trinity_get_dma_memory() - Get the DMA memory information
+ *
+ * @dev: device structure
+ * @paddr: acquired physical address
+ * @daddr: acquired DMA address
+ * @size: acquired size of the resource
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
+static int trinity_get_dma_memory(struct device *dev, phys_addr_t *paddr,
+ dma_addr_t *daddr, size_t *size)
+{
+ struct device_node *np;
+ struct resource res;
+ int err;
+
+ if (!dev || !paddr || !daddr || !size)
+ return -EINVAL;
+
+ np = dev->of_node;
+ if (!np)
+ return -ENOENT;
+
+#ifdef ARM64
+ err = of_property_read_u64_array(np, "samsung,dma", info, 3);
+ if (err < 0)
+ return err;
+
+ *paddr = info[0];
+ *daddr = info[1];
+ *size = info[2];
+#else
+ err = of_address_to_resource(np, 0, &res);
+ if (err < 0)
+ return err;
+
+ *paddr = res.start;
+ *daddr = *paddr;
+ *size = resource_size(&res);
+#endif
+
+ dev_info(dev, "Detected DMA memory region: %lx-%lx\n",
+ (unsigned long)*paddr, (unsigned long)(*paddr + *size));
+ return 0;
+}
+
+static int trinity_declare_dma_memory(struct device *dev)
+{
+ phys_addr_t paddr;
+ dma_addr_t daddr;
+ size_t size;
+ int err;
+
+ err = trinity_get_dma_memory(dev, &paddr, &daddr, &size);
+ if (err < 0) {
+ dev_info(dev, "No available dma memory, skipping\n");
+ return 0;
+ }
+
+ err = trinity_declare_resv_mem(paddr, daddr, size);
+ if (err < 0) {
+ dev_err(dev, "Failed to declare reserved memory: %d\n", err);
+ return err;
+ }
+
+ return 0;
+}
+
+static void trinity_release_dma_memory(void)
+{
+ trinity_release_resv_mem();
+}
+
static void trinity_common_init(struct device *dev)
{
if (!trinity_is_empty())
return;

+ if (trinity_declare_dma_memory(dev) < 0)
+ dev_warn(dev, "Failed to declare DMA memory\n");
+
/* Common init codes */
}

@@ -176,6 +254,7 @@ static void trinity_common_exit(void)
if (!trinity_is_empty())
return;

+ trinity_release_dma_memory();
/* Common deinit codes */
}

@@ -203,6 +282,13 @@ static int trinity_set_device_id(struct trinity_driver *drv)
return err;
}

+/**
+ * trinity_create_node() - Create trinity node
+ *
+ * @drv: an instance of trinity driver
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
int trinity_create_node(struct trinity_driver *drv)
{
struct device *dev = drv_to_dev_ptr(drv);
@@ -222,6 +308,11 @@ int trinity_create_node(struct trinity_driver *drv)
return err;
}

+/**
+ * trinity_destroy_node() - Destroy trinity node
+ *
+ * @drv: an instance of trinity driver
+ */
void trinity_destroy_node(struct trinity_driver *drv)
{
misc_deregister(&drv->mdev);
diff --git a/drivers/misc/trinity/trinity_common.h b/drivers/misc/trinity/trinity_common.h
index 37aba34ef9bc..7f576d4a71a5 100644
--- a/drivers/misc/trinity/trinity_common.h
+++ b/drivers/misc/trinity/trinity_common.h
@@ -25,7 +25,6 @@
#include <linux/platform_device.h>
#include <linux/slab.h>
#include <linux/types.h>
-
#include <uapi/misc/trinity.h>

/** Default timeout to wait for opening device in jiffies */
@@ -131,6 +130,7 @@ struct trinity_desc {
struct trinity_req *req);
int32_t (*invoke_req)(struct trinity_driver *drv,
struct trinity_req *req, void *sched_data);
+
irq_handler_t handle_irq;
};

diff --git a/drivers/misc/trinity/trinity_hwmem.c b/drivers/misc/trinity/trinity_hwmem.c
new file mode 100644
index 000000000000..069c856589e3
--- /dev/null
+++ b/drivers/misc/trinity/trinity_hwmem.c
@@ -0,0 +1,438 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/**
+ * An abstraction layer to handle DMA memory buffers for Trinity device driver
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2020 Wook Song <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/dma-buf.h>
+#include <linux/mutex.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/version.h>
+
+#include "trinity_hwmem.h"
+#include "trinity_resv_mem.h"
+
+#define dbuf_to_trnt_hwmem(d) ((struct trinity_hwmem *)d->priv)
+#define vma_to_trnt_hwmem(v) ((struct trinity_hwmem *)v->vm_private_data)
+
+/**
+ * struct trinity_hwmem - A data structure for Trinity DMA buffer management
+ * @dev: A pointer to device which this hwmem belongs to.
+ * @dbuf: The dma_buf instance.
+ * @refcnt: Reference counts.
+ * @direction: A variable indicating the DMA data direction in allocating this
+ * dma_buf.
+ * @attrs: Attributes used in allocating this dma_buf.
+ * @req_size: The size of the DMA buffer that the user requested to allocate.
+ * @alc_size: The size of the DMA buffer which is actually allocated.
+ * @addr: The DMA (physical) address of this dma_buf.
+ * @cookie: The DMA cookies.
+ */
+struct trinity_hwmem {
+ struct device *dev;
+ struct dma_buf *dbuf;
+ struct kref refcnt;
+
+ enum dma_data_direction direction;
+ enum trinity_hwmem_type type;
+
+ unsigned long attrs;
+ size_t req_size;
+ size_t alc_size;
+
+ bool is_cont;
+ dma_addr_t addr;
+ void *cookie;
+};
+
+static void __trinity_hwmem_free(struct kref *refcnt)
+{
+ struct trinity_hwmem *mem =
+ container_of(refcnt, struct trinity_hwmem, refcnt);
+ /**
+ * when the dmabuf reference counter becomes zero,
+ * trinity_hwmem_dbuf_ops_release() is triggered.
+ */
+ dma_buf_put(mem->dbuf);
+}
+
+static void __trinity_hwmem_put(struct trinity_hwmem *mem)
+{
+ kref_put(&mem->refcnt, __trinity_hwmem_free);
+}
+
+static void __trinity_hwmem_put_dmabuf(struct dma_buf *dbuf)
+{
+ __trinity_hwmem_put(dbuf_to_trnt_hwmem(dbuf));
+}
+
+static struct trinity_hwmem *__trinity_hwmem_get(struct trinity_hwmem *mem)
+{
+ kref_get(&mem->refcnt);
+
+ return mem;
+}
+
+static void trinity_hwmem_dbuf_ops_detach(struct dma_buf *dbuf,
+ struct dma_buf_attachment *attachment)
+{
+ struct trinity_hwmem *mem = dbuf_to_trnt_hwmem(dbuf);
+
+ /* Decrease ref count of the backing storage */
+ __trinity_hwmem_put(mem);
+}
+
+static int trinity_hwmem_dbuf_ops_attach(struct dma_buf *dbuf,
+ struct dma_buf_attachment *attachment)
+{
+ struct trinity_hwmem *mem = dbuf_to_trnt_hwmem(dbuf);
+
+ /* Increase ref count of the backing storage */
+ mem = __trinity_hwmem_get(mem);
+ attachment->priv = mem;
+
+ return 0;
+}
+
+static struct sg_table *
+trinity_hwmem_dbuf_ops_map_dma_buf(struct dma_buf_attachment *attachment,
+ enum dma_data_direction dir)
+{
+ return NULL;
+}
+
+static void
+trinity_hwmem_dbuf_ops_unmap_dma_buf(struct dma_buf_attachment *attachment,
+ struct sg_table *sgt,
+ enum dma_data_direction dir)
+{
+}
+
+static void trinity_hwmem_vm_ops_open(struct vm_area_struct *vma)
+{
+ struct trinity_hwmem *mem = vma_to_trnt_hwmem(vma);
+
+ __trinity_hwmem_get(mem);
+}
+
+static void trinity_hwmem_vm_ops_close(struct vm_area_struct *vma)
+{
+ struct trinity_hwmem *mem = vma_to_trnt_hwmem(vma);
+
+ __trinity_hwmem_put(mem);
+}
+
+static const struct vm_operations_struct trinity_hwmem_vm_ops = {
+ .open = trinity_hwmem_vm_ops_open,
+ .close = trinity_hwmem_vm_ops_close,
+};
+
+static int32_t trinity_hwmem_dbuf_ops_mmap(struct dma_buf *dbuf,
+ struct vm_area_struct *vma)
+{
+ struct trinity_hwmem *mem;
+ int32_t ret;
+
+ if (!dbuf)
+ return -EINVAL;
+
+ mem = dbuf_to_trnt_hwmem(dbuf);
+ if (!mem)
+ return -EINVAL;
+
+ vma->vm_pgoff = 0;
+ if (mem->type == TRINITY_HWMEM_DMA_CONT)
+ ret = trinity_mmap_from_resv_mem(vma, mem->cookie,
+ mem->alc_size, mem->is_cont);
+ else
+ ret = dma_mmap_attrs(mem->dev, vma, mem->cookie, mem->addr,
+ mem->alc_size, mem->attrs);
+ if (ret)
+ return ret;
+
+ vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
+ vma->vm_private_data = mem;
+ vma->vm_ops = &trinity_hwmem_vm_ops;
+
+ vma->vm_ops->open(vma);
+
+ return 0;
+}
+
+static void trinity_hwmem_dbuf_ops_release(struct dma_buf *dbuf)
+{
+ struct trinity_hwmem *mem = dbuf_to_trnt_hwmem(dbuf);
+
+ if (mem->type == TRINITY_HWMEM_DMA_CONT) {
+ struct trinity_resv_mem resv_mem;
+
+ resv_mem.vaddr = mem->cookie;
+ resv_mem.daddr = mem->addr;
+ resv_mem.size = mem->alc_size;
+
+ trinity_free_from_resv_mem(&resv_mem, mem->is_cont);
+ } else {
+ dma_free_attrs(mem->dev, mem->alc_size, mem->cookie, mem->addr,
+ mem->attrs);
+ }
+ put_device(mem->dev);
+
+ mem->dbuf->priv = NULL;
+
+ kfree(mem);
+}
+
+static int trinity_hwmem_dbuf_ops_vmap(struct dma_buf *dbuf,
+ struct iosys_map *map)
+{
+ struct trinity_hwmem *mem;
+
+ if (!dbuf)
+ return -EINVAL;
+
+ mem = dbuf_to_trnt_hwmem(dbuf);
+ if (!mem)
+ return -ENOENT;
+
+ map->vaddr = mem->cookie;
+
+ return 0;
+}
+
+static struct dma_buf_ops trinity_hwmem_dbuf_ops = {
+ .vmap = trinity_hwmem_dbuf_ops_vmap,
+ .attach = trinity_hwmem_dbuf_ops_attach,
+ .detach = trinity_hwmem_dbuf_ops_detach,
+ .map_dma_buf = trinity_hwmem_dbuf_ops_map_dma_buf,
+ .unmap_dma_buf = trinity_hwmem_dbuf_ops_unmap_dma_buf,
+ .release = trinity_hwmem_dbuf_ops_release,
+ .mmap = trinity_hwmem_dbuf_ops_mmap,
+};
+
+static void *__trinity_hwmem_alloc(struct device *dev, const size_t size,
+ const enum dma_data_direction dir,
+ const enum trinity_hwmem_type type)
+{
+ size_t aligned_size = ALIGN(size, PAGE_SIZE);
+ struct trinity_hwmem *mem;
+ struct trinity_resv_mem resv_mem;
+ int ret;
+
+ if (WARN_ON(!dev))
+ return ERR_PTR(-EINVAL);
+
+ mem = kzalloc(sizeof(*mem), GFP_KERNEL);
+ if (!mem)
+ return ERR_PTR(-ENOMEM);
+
+ mem->dev = get_device(dev);
+ mem->req_size = size;
+ mem->alc_size = aligned_size;
+ mem->direction = dir;
+ mem->type = TRINITY_HWMEM_DMA_IOMMU;
+ mem->is_cont = (type == TRINITY_HWMEM_DMA_CONT);
+
+ mem->attrs |= DMA_ATTR_WRITE_COMBINE;
+ mem->attrs |= DMA_ATTR_SKIP_CPU_SYNC;
+
+ /**
+ * Try to allocate memory from resv mem first, regardless of hwmem type.
+ * But, the resv allocator should preserve a minimum space for vISA programs
+ * because they should be physically contiguous.
+ */
+ ret = trinity_alloc_from_resv_mem(aligned_size, &resv_mem,
+ mem->is_cont);
+ if (ret == 0) {
+ mem->addr = resv_mem.daddr;
+ mem->cookie = resv_mem.vaddr;
+ mem->type = TRINITY_HWMEM_DMA_CONT;
+ } else if (!mem->is_cont) {
+ mem->cookie = dma_alloc_attrs(dev, aligned_size, &mem->addr,
+ GFP_KERNEL, mem->attrs);
+ } else {
+ dev_err(mem->dev,
+ "Unable to allocate contiguous memory for program: %zu\n",
+ size);
+ }
+
+ if (!mem->cookie) {
+ ret = -ENOMEM;
+ goto free_mem;
+ }
+
+ kref_init(&mem->refcnt);
+
+ return mem;
+
+free_mem:
+ kfree(mem);
+
+ return ERR_PTR(ret);
+}
+
+static struct dma_buf *__trinity_hwmem_get_dmabuf(struct trinity_hwmem *mem,
+ unsigned long flags)
+{
+ DEFINE_DMA_BUF_EXPORT_INFO(einfo);
+ struct dma_buf *dbuf;
+
+ einfo.ops = &trinity_hwmem_dbuf_ops;
+ einfo.size = mem->alc_size;
+ einfo.flags = flags;
+ einfo.priv = (void *)mem;
+
+ dbuf = dma_buf_export(&einfo);
+ if (IS_ERR(dbuf))
+ return dbuf;
+
+ /* Increase ref count of the backing storage */
+ dbuf->priv = (void *)__trinity_hwmem_get(mem);
+ mem->dbuf = dbuf;
+
+ return dbuf;
+}
+
+/**
+ * trinity_hwmem_alloc() - Allocate Hardware memory according to type
+ * @dev: A pointer to the instance of the device to be attached the DMA buffer
+ * @size: Requested memory size
+ * @type: Requested memory type. It will try to allocate from reserved memory first
+ *
+ * Return: a file descriptor for the dma buffer on success.
+ * Otherwise, returns negative error.
+ */
+int32_t trinity_hwmem_alloc(struct device *dev, const size_t size,
+ enum trinity_hwmem_type type)
+{
+ struct trinity_hwmem *mem;
+ struct dma_buf *dbuf;
+ int32_t ret;
+
+ mem = __trinity_hwmem_alloc(dev, size, DMA_BIDIRECTIONAL, type);
+ if (IS_ERR(mem))
+ return PTR_ERR(mem);
+
+ dbuf = __trinity_hwmem_get_dmabuf(mem, O_CLOEXEC | O_RDWR);
+ if (IS_ERR(dbuf)) {
+ ret = PTR_ERR(dbuf);
+ goto err_put_mem;
+ }
+
+ ret = dma_buf_fd(dbuf, O_CLOEXEC);
+ if (ret < 0)
+ goto err_put_mem;
+
+ return ret;
+
+err_put_mem:
+ __trinity_hwmem_put(mem);
+
+ return ret;
+}
+
+/**
+ * trinity_hwmem_free() - Free Hardware memory
+ * @dev: A pointer to the instance of the device to be attached the DMA buffer
+ * @fd: A file descriptor for an allocated memory
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
+int32_t trinity_hwmem_free(struct device *dev, const int32_t fd)
+{
+ struct dma_buf *dbuf;
+
+ dbuf = dma_buf_get(fd);
+ if (!IS_ERR(dbuf)) {
+ struct trinity_hwmem *mem = dbuf_to_trnt_hwmem(dbuf);
+
+ /* Counterpart of __trinity_hwmem_get() in __trinity_hwmem_get_dmabuf() */
+ __trinity_hwmem_put_dmabuf(dbuf);
+ /* Counterpart of __trinity_hwmem_get() in __trinity_hwmem_alloc() */
+ __trinity_hwmem_put(mem);
+
+ dma_buf_put(dbuf);
+
+ return 0;
+ }
+
+ dev_err(dev,
+ "failed to free the dma_buf structure related to fd with %ld\n",
+ PTR_ERR(dbuf));
+
+ return PTR_ERR(dbuf);
+}
+
+/**
+ * trinity_hwmem_import_dmabuf_begin() - Defines the beginning of a section to
+ * import a given DMA buffer file descriptor.
+ * @dev: A pointer to the instance of the device to be attached the DMA buffer
+ * @dbuf_fd: The file descriptor of the DMA buffer to be imported.
+ * @import_info: If importing is successful, information such as the DMA
+ * address, the virtual address which is mapped to the DMA address,
+ * &struct dma_buf_attachment, a scatter-gather table, and &struct
+ * dma_buf corresponding to the file descriptor will be passed
+ * using this parameter.
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
+int32_t
+trinity_hwmem_import_dmabuf_begin(struct device *dev, const int32_t dbuf_fd,
+ struct trinity_hwmem_import *import_info)
+{
+ struct dma_buf_attachment *attachment;
+ struct dma_buf *buf;
+ struct trinity_hwmem *mem;
+ struct iosys_map map;
+ int32_t ret;
+
+ if (!import_info)
+ return -EINVAL;
+
+ buf = dma_buf_get(dbuf_fd);
+ if (IS_ERR(buf))
+ return PTR_ERR(buf);
+
+ attachment = dma_buf_attach(buf, dev);
+ if (IS_ERR(attachment)) {
+ ret = PTR_ERR(attachment);
+ goto err_dbuf_put;
+ }
+
+ mem = attachment->priv;
+ import_info->dma_addr = mem->addr;
+ ret = dma_buf_vmap(buf, &map);
+ if (ret)
+ goto err_dbuf_put;
+
+ import_info->addr = map.vaddr;
+ import_info->attachment = attachment;
+ import_info->buf = buf;
+
+ return 0;
+
+err_dbuf_put:
+ dma_buf_put(buf);
+
+ return ret;
+}
+
+/**
+ * trinity_hwmem_import_dmabuf_end() - Defines the ending of the section related
+ * to the given pointer to &struct trinity_hwmem_import.
+ * @import_info: Importing information related to the section to be ended.
+ */
+void trinity_hwmem_import_dmabuf_end(struct trinity_hwmem_import *import_info)
+{
+ if (!import_info || !import_info->buf)
+ return;
+ dma_buf_vunmap(import_info->buf, import_info->addr);
+ dma_buf_detach(import_info->buf, import_info->attachment);
+ dma_buf_put(import_info->buf);
+}
diff --git a/drivers/misc/trinity/trinity_hwmem.h b/drivers/misc/trinity/trinity_hwmem.h
new file mode 100644
index 000000000000..a64b83a1eec9
--- /dev/null
+++ b/drivers/misc/trinity/trinity_hwmem.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/**
+ * An abstraction layer to handle DMA memory buffers for Trinity device driver
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2020 Wook Song <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __DRIVERS_MISC_TRINITY_HWMEM_H__
+#define __DRIVERS_MISC_TRINITY_HWMEM_H__
+
+#include <linux/dma-buf.h>
+#include <linux/dma-mapping.h>
+#include <linux/kref.h>
+#include <uapi/misc/trinity.h>
+
+/**
+ * struct trinity_hwmem_import - A data structure to maintain imported hwmem
+ * (that is Trinity DMA buffer).
+ * @dma_addr: The physical DMA address of this DMA buffer.
+ * @addr: A virtual address of this DMA buffer.
+ * @attachment: A pointer to &struct dma_buf_attachment.
+ * @buf: &struct dma_buf that this hwmem wrapped.
+ */
+struct trinity_hwmem_import {
+ dma_addr_t dma_addr;
+ void *addr;
+ struct dma_buf_attachment *attachment;
+ struct dma_buf *buf;
+};
+
+int32_t
+trinity_hwmem_import_dmabuf_begin(struct device *dev, const int32_t dbuf_fd,
+ struct trinity_hwmem_import *import_info);
+void trinity_hwmem_import_dmabuf_end(struct trinity_hwmem_import *import_info);
+
+int32_t trinity_hwmem_alloc(struct device *dev, const size_t size,
+ enum trinity_hwmem_type type);
+int32_t trinity_hwmem_free(struct device *dev, const int32_t fd);
+
+#endif /* __DRIVERS_MISC_TRINITY_HWMEM_H__ */
diff --git a/drivers/misc/trinity/trinity_resv_mem.c b/drivers/misc/trinity/trinity_resv_mem.c
new file mode 100644
index 000000000000..9279452b9b2d
--- /dev/null
+++ b/drivers/misc/trinity/trinity_resv_mem.c
@@ -0,0 +1,264 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/**
+ * Reserved memory allocator for Trinity device drivers
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include "trinity_resv_mem.h"
+#include <linux/io.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+
+#define PROG_POOL_SIZE (6 * 1024 * 1024) /* FIXME: 6MB */
+#define IS_INITIALIZED(pool) (atomic_read(&((pool)->initialized)) == 1)
+#define SET_INITIALIZED(pool) atomic_set(&((pool)->initialized), 1)
+#define UNSET_INITIALIZED(pool) atomic_set(&((pool)->initialized), 0)
+
+struct trinity_resv_mem_pool {
+ phys_addr_t paddr_base;
+ dma_addr_t daddr_base;
+ void *vaddr_base;
+
+ size_t total_size;
+ size_t total_used;
+
+ unsigned int num_bits;
+ unsigned long *bitmap;
+
+ spinlock_t lock;
+ atomic_t initialized;
+};
+
+/* Trinity devices share this reserved memory pool */
+static struct trinity_resv_mem_pool resv_pool_cont;
+static struct trinity_resv_mem_pool resv_pool_norm;
+
+static int init_resv_mem(phys_addr_t paddr, dma_addr_t daddr, size_t size,
+ struct trinity_resv_mem_pool *pool)
+{
+ unsigned int num_bits = size >> PAGE_SHIFT;
+ int bitmap_size = BITS_TO_LONGS(num_bits) * sizeof(long);
+ void *vaddr;
+
+ vaddr = ioremap_wc(paddr, size);
+ if (unlikely(!vaddr))
+ return -EINVAL;
+
+ pool->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
+ if (unlikely(!pool->bitmap)) {
+ iounmap(vaddr);
+ return -ENOMEM;
+ }
+
+ pool->paddr_base = paddr;
+ pool->daddr_base = daddr;
+ pool->vaddr_base = vaddr;
+ pool->total_size = size;
+ pool->total_used = 0;
+ pool->num_bits = num_bits;
+
+ spin_lock_init(&pool->lock);
+ SET_INITIALIZED(pool);
+
+ return 0;
+}
+
+static void fini_resv_mem(struct trinity_resv_mem_pool *pool)
+{
+ if (!pool || unlikely(!IS_INITIALIZED(pool)))
+ return;
+
+ UNSET_INITIALIZED(pool);
+
+ iounmap(pool->vaddr_base);
+ kfree(pool->bitmap);
+ memset(pool, '\x00', sizeof(*pool));
+}
+
+/**
+ * trinity_declare_resv_mem() - Declare reserved memory
+ *
+ * @paddr: physical address of the region to be used for reserved memory
+ * @daddr: DMA buffer address to be used for reserved memory
+ * @size: size of requested memory
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
+int trinity_declare_resv_mem(phys_addr_t paddr, dma_addr_t daddr, size_t size)
+{
+ int ret;
+
+ /* skip if initialized before */
+ if (unlikely(IS_INITIALIZED(&resv_pool_cont) ||
+ IS_INITIALIZED(&resv_pool_norm)))
+ return 0;
+
+ ret = init_resv_mem(paddr, daddr, PROG_POOL_SIZE, &resv_pool_cont);
+ if (ret != 0)
+ return ret;
+
+ /* FIXME: reserve the first page (not used) */
+ set_bit(0, resv_pool_cont.bitmap);
+ resv_pool_cont.total_used = PAGE_SIZE;
+
+ ret = init_resv_mem(paddr + PROG_POOL_SIZE, daddr + PROG_POOL_SIZE,
+ size - PROG_POOL_SIZE, &resv_pool_norm);
+ if (ret != 0) {
+ fini_resv_mem(&resv_pool_cont);
+ return ret;
+ }
+
+ return 0;
+}
+
+/**
+ * trinity_release_resv_mem() - Release reserved memory
+ *
+ */
+void trinity_release_resv_mem(void)
+{
+ fini_resv_mem(&resv_pool_cont);
+ fini_resv_mem(&resv_pool_norm);
+}
+
+static int find_free_region(unsigned long *bitmap, unsigned long num_bits,
+ unsigned long nr)
+{
+ unsigned long index, start, end, i;
+
+ start = 0;
+retry:
+ index = find_next_zero_bit(bitmap, num_bits, start);
+ end = index + nr;
+ if (end > num_bits)
+ return -ERANGE;
+
+ i = find_next_bit(bitmap, end, index);
+ if (i < end) {
+ start = i + 1;
+ goto retry;
+ }
+ return index;
+}
+
+/**
+ * trinity_alloc_from_resv_mem() - Allocate reserved memory
+ *
+ * @size: size of requested memory
+ * @mem: allocated reserved memory information
+ * @is_cont: continuity of the memory region
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
+int trinity_alloc_from_resv_mem(const size_t size, struct trinity_resv_mem *mem,
+ bool is_cont)
+{
+ struct trinity_resv_mem_pool *pool;
+ dma_addr_t offset;
+ int pageno, err = 0;
+
+ pool = is_cont ? &resv_pool_cont : &resv_pool_norm;
+
+ if (unlikely(!IS_INITIALIZED(pool)))
+ return -EPERM;
+
+ if (unlikely(!IS_ALIGNED(size, PAGE_SIZE)))
+ return -EINVAL;
+
+ spin_lock(&pool->lock);
+
+ if (unlikely(size > pool->total_size)) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ pageno = find_free_region(pool->bitmap, pool->num_bits,
+ size >> PAGE_SHIFT);
+ if (unlikely(pageno < 0)) {
+ err = pageno;
+ goto out;
+ }
+ bitmap_set(pool->bitmap, pageno, size >> PAGE_SHIFT);
+ offset = (dma_addr_t)pageno << PAGE_SHIFT;
+
+ mem->daddr = pool->daddr_base + offset;
+ mem->vaddr = pool->vaddr_base + offset;
+ mem->size = size;
+
+ memset(mem->vaddr, '\x00', size);
+
+ pool->total_used += mem->size;
+out:
+ spin_unlock(&pool->lock);
+
+ return err;
+}
+
+/**
+ * trinity_free_from_resv_mem() - Free reserved memory
+ *
+ * @mem: allocated reserved memory information
+ * @is_cont: continuity of the memory region
+ */
+void trinity_free_from_resv_mem(struct trinity_resv_mem *mem, bool is_cont)
+{
+ struct trinity_resv_mem_pool *pool;
+
+ pool = is_cont ? &resv_pool_cont : &resv_pool_norm;
+
+ if (unlikely(!IS_INITIALIZED(pool)))
+ return;
+
+ if (likely(mem->vaddr != NULL)) {
+ int page = (mem->vaddr - pool->vaddr_base) >> PAGE_SHIFT;
+ int len = mem->size >> PAGE_SHIFT;
+
+ spin_lock(&pool->lock);
+
+ bitmap_clear(pool->bitmap, page, len);
+ pool->total_used -= mem->size;
+
+ spin_unlock(&pool->lock);
+ }
+}
+
+/**
+ * trinity_mmap_from_resv_mem() - mmap for reserved memory
+ *
+ * @vma: vma to map
+ * @vaddr: target virtual address
+ * @size: size of map area
+ * @is_cont: continuity of the memory region
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
+int trinity_mmap_from_resv_mem(struct vm_area_struct *vma, void *vaddr,
+ size_t size, bool is_cont)
+{
+ struct trinity_resv_mem_pool *pool;
+
+ pool = is_cont ? &resv_pool_cont : &resv_pool_norm;
+
+ if (likely(IS_INITIALIZED(pool))) {
+ unsigned long off = vma->vm_pgoff;
+ unsigned long pfn_base = PFN_DOWN(pool->paddr_base);
+ int start = (vaddr - pool->vaddr_base) >> PAGE_SHIFT;
+ int user_count = vma_pages(vma);
+ int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+
+ if (off < count && user_count <= count - off) {
+ unsigned long pfn = pfn_base + start + off;
+
+ return remap_pfn_range(vma, vma->vm_start, pfn,
+ user_count << PAGE_SHIFT,
+ vma->vm_page_prot);
+ }
+ }
+
+ return -ENXIO;
+}
diff --git a/drivers/misc/trinity/trinity_resv_mem.h b/drivers/misc/trinity/trinity_resv_mem.h
new file mode 100644
index 000000000000..94b1c712aa1d
--- /dev/null
+++ b/drivers/misc/trinity/trinity_resv_mem.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/**
+ * Reserved memory allocator for Trinity device drivers
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __DRIVERS_MISC_TRINITY_RESV_MEM_H__
+#define __DRIVERS_MISC_TRINITY_RESV_MEM_H__
+
+#include <linux/mm_types.h>
+#include <linux/types.h>
+
+/**
+ * struct trinity_resv_mem - A data structure to maintain reserved memory
+ *
+ * @daddr: The physical DMA address of this DMA buffer.
+ * @vaddr: A virtual address of this DMA buffer.
+ * @size: allocated reserved memory size. page aligned.
+ * @orig_size: requested memory size
+ */
+struct trinity_resv_mem {
+ dma_addr_t daddr;
+ void *vaddr;
+ size_t size;
+ size_t orig_size;
+};
+
+int trinity_declare_resv_mem(phys_addr_t paddr, dma_addr_t daddr, size_t size);
+void trinity_release_resv_mem(void);
+int trinity_alloc_from_resv_mem(const size_t size, struct trinity_resv_mem *mem,
+ bool is_cont);
+void trinity_free_from_resv_mem(struct trinity_resv_mem *mem, bool is_cont);
+int trinity_mmap_from_resv_mem(struct vm_area_struct *vma, void *vaddr,
+ size_t size, bool is_cont);
+
+#endif /* __DRIVERS_MISC_TRINITY_RESV_MEM_H__ */
diff --git a/drivers/misc/trinity/trinity_vision2_drv.c b/drivers/misc/trinity/trinity_vision2_drv.c
index a24eb0f6ac6d..f1c1e06d188e 100644
--- a/drivers/misc/trinity/trinity_vision2_drv.c
+++ b/drivers/misc/trinity/trinity_vision2_drv.c
@@ -507,6 +507,7 @@ static struct platform_driver trinity_triv2 = {

module_platform_driver(trinity_triv2);

+MODULE_IMPORT_NS(DMA_BUF);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Samsung Electronics");
MODULE_DESCRIPTION("Samsung NPU device driver for trinity vision 2");
--
2.25.1
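
The reserved-memory allocator in the patch above hands out pages by scanning a bitmap for a run of free pages (find_free_region()). The following is a minimal userspace sketch of that first-fit search, not the driver code itself. It assumes one bool per page instead of the kernel bitmap helpers, and the name sketch_find_free_region() and the -1 error value (the driver returns -ERANGE) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel bitmap: used[i] is true when
 * page i is allocated. Returns the index of the first run of nr free
 * pages, or -1 if no such run exists.
 */
static long sketch_find_free_region(const bool *used, size_t num_pages,
				    size_t nr)
{
	size_t start = 0;

	assert(nr > 0);
	while (start < num_pages) {
		size_t index = start, i;

		/* skip forward to the next free page */
		while (index < num_pages && used[index])
			index++;

		/* the candidate run must fit inside the pool */
		if (index + nr > num_pages)
			return -1;

		/* stop at the first used page inside the candidate run */
		for (i = index; i < index + nr && !used[i]; i++)
			;
		if (i == index + nr)
			return (long)index;

		/* restart the search just past the page that blocked us */
		start = i + 1;
	}
	return -1;
}
```

With pages 0, 2 and 6 marked as used in an 8-page pool, a request for two pages lands at index 3, since the single free page at index 1 is too small.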

2022-07-25 07:07:59

by Jiho Chu

[permalink] [raw]
Subject: [PATCH 4/9] trinity: Add scheduler module

This patch adds the NPU scheduler interface.

The scheduler pushes tasks to the NPU in order. The default scheduling
algorithm uses a priority policy. The scheduler waits for requests from
the user. When requests are invoked, it submits each of them to the NPU
in priority order and waits until the completion interrupt arrives. The
priority is calculated from the time remaining until the requested
timeout.

Because more scheduling algorithms may be added later, the patch
provides an interface that can support various schedulers.
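
The priority policy described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the driver code: struct sketch_req and both function names are hypothetical, and times are plain millisecond integers rather than ktime_t. A smaller value means a more urgent request, an expired timeout clamps to 0, and a timeout of 0 is also treated as 0, as in sched_calc_pri().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified request: only the fields the policy needs. */
struct sketch_req {
	int64_t timeout_ms;	/* 0 means no timeout was requested */
	int64_t elapsed_ms;	/* time since the request was started */
};

/* Remaining time until the timeout; smaller means more urgent. */
static int64_t sketch_calc_pri(const struct sketch_req *req)
{
	int64_t remaining;

	if (req->timeout_ms == 0)
		return 0;

	remaining = req->timeout_ms - req->elapsed_ms;
	return remaining < 0 ? 0 : remaining;
}

/* Linear scan for the most urgent request; the first one wins on ties. */
static size_t sketch_pick_req(const struct sketch_req *reqs, size_t n)
{
	int64_t top_pri = INT64_MAX;
	size_t i, top = 0;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		int64_t pri = sketch_calc_pri(&reqs[i]);

		if (pri < top_pri) {
			top_pri = pri;
			top = i;
		}
	}
	return top;
}
```

A linear scan is used here because the patch expects only a few pending requests at a time, so keeping the queue sorted would add complexity for little gain.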

Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
drivers/misc/trinity/Makefile | 1 +
drivers/misc/trinity/sched/core.c | 170 +++++++++++++
drivers/misc/trinity/sched/priority.c | 335 ++++++++++++++++++++++++++
drivers/misc/trinity/sched/priority.h | 18 ++
drivers/misc/trinity/sched/sched.h | 52 ++++
5 files changed, 576 insertions(+)
create mode 100644 drivers/misc/trinity/sched/core.c
create mode 100644 drivers/misc/trinity/sched/priority.c
create mode 100644 drivers/misc/trinity/sched/priority.h
create mode 100644 drivers/misc/trinity/sched/sched.h

diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
index cf313c3afb3d..dcf9d7ad1b4b 100644
--- a/drivers/misc/trinity/Makefile
+++ b/drivers/misc/trinity/Makefile
@@ -4,5 +4,6 @@ obj-$(CONFIG_TRINITY_VISION2) += trinity_vision2.o

trinity-y := trinity.o
trinity-y += trinity_resv_mem.o trinity_hwmem.o
+trinity-y += sched/core.o sched/priority.o

trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
diff --git a/drivers/misc/trinity/sched/core.c b/drivers/misc/trinity/sched/core.c
new file mode 100644
index 000000000000..2d94f5d07e8b
--- /dev/null
+++ b/drivers/misc/trinity/sched/core.c
@@ -0,0 +1,170 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * NPU scheduler interface
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/spinlock.h>
+
+#include "../trinity_common.h"
+#include "sched.h"
+#include "priority.h"
+
+static struct trinity_sched_desc *sched_table[SCHED_END];
+static DEFINE_SPINLOCK(sched_lock);
+
+/**
+ * trinity_sched_register() - Register trinity task scheduler
+ * It does nothing if it is already registered.
+ *
+ * @type: scheduler type
+ * @desc: scheduler description
+ */
+void trinity_sched_register(enum trinity_sched_type type,
+ struct trinity_sched_desc *desc)
+{
+ if (type >= SCHED_END)
+ return;
+
+ spin_lock(&sched_lock);
+ if (!sched_table[type])
+ sched_table[type] = desc;
+ spin_unlock(&sched_lock);
+}
+
+/**
+ * trinity_sched_unregister() - Unregister trinity task scheduler
+ *
+ * @type: scheduler type
+ * @desc: scheduler description
+ */
+void trinity_sched_unregister(enum trinity_sched_type type,
+ struct trinity_sched_desc *desc)
+{
+ if (type >= SCHED_END)
+ return;
+
+ spin_lock(&sched_lock);
+ if (sched_table[type] == desc)
+ sched_table[type] = NULL;
+ spin_unlock(&sched_lock);
+}
+
+/**
+ * trinity_sched_find() - Find trinity task scheduler
+ *
+ * @type: scheduler type
+ * Return: trinity scheduler description on Success, Otherwise return NULL.
+ */
+struct trinity_sched_desc *trinity_sched_find(enum trinity_sched_type type)
+{
+ struct trinity_sched_desc *desc;
+ unsigned long flags;
+
+ if (type >= SCHED_END)
+ return NULL;
+
+ spin_lock_irqsave(&sched_lock, flags);
+ desc = sched_table[type];
+ spin_unlock_irqrestore(&sched_lock, flags);
+
+ return desc;
+}
+
+/**
+ * trinity_sched_run_req() - Schedules a req to the target from the req queue
+ *
+ * @req_data: The data ptr to hold req information to be submitted.
+ *
+ * Return: 0 on success. Otherwise, returns negative error. Additional status of
+ * the submitted req is reported through req->stat->status.
+ */
+int32_t trinity_sched_run_req(void *req_data, void *sched_data)
+{
+ struct trinity_req *req = (struct trinity_req *)req_data;
+ struct trinity_driver *drv = req->drv;
+ int32_t err = 0;
+ int32_t ready;
+
+ /** setup is only allowed in ready state */
+ ready = drv->desc->get_state(drv);
+ if (ready != TRINITY_STATE_READY) {
+ dev_err(drv_to_dev_ptr(drv),
+ "Cannot setup NPU when it's in a non-ready state");
+ err = -EPERM;
+ goto out;
+ }
+
+ if (req->stat->status != TRINITY_REQ_STATUS_PENDING &&
+ req->stat->status != TRINITY_REQ_STATUS_FINISHED) {
+ dev_err(drv_to_dev_ptr(drv), "Invalid req status: %d",
+ req->stat->status);
+ err = -EINVAL;
+ goto out;
+ }
+
+ req->stat->status = TRINITY_REQ_STATUS_RUNNING;
+ err = drv->desc->invoke_req(drv, req, sched_data);
+out:
+ if (err != 0)
+ req->stat->status = TRINITY_REQ_STATUS_ERROR;
+
+ return err;
+}
+
+/**
+ * trinity_sched_suspend() - Suspend whole task schedulers
+ */
+void trinity_sched_suspend(void)
+{
+ enum trinity_sched_type type;
+ struct trinity_sched_desc *desc;
+
+ for (type = SCHED_PRI; type < SCHED_END; type++) {
+ desc = sched_table[type];
+ if (desc)
+ desc->suspend();
+ }
+}
+
+/**
+ * trinity_sched_resume() - Resume whole task schedulers
+ */
+void trinity_sched_resume(void)
+{
+ enum trinity_sched_type type;
+ struct trinity_sched_desc *desc;
+
+ for (type = SCHED_PRI; type < SCHED_END; type++) {
+ desc = sched_table[type];
+ if (desc)
+ desc->resume();
+ }
+}
+
+/**
+ * trinity_sched_init() - Initialize trinity task schedulers
+ *
+ * @dev: an instance of the device
+ * Return: always returns 0
+ */
+int32_t trinity_sched_init(struct device *dev)
+{
+ if (trinity_sched_init_pri(dev) < 0)
+ dev_warn(dev, "Unable to initialize SR task scheduler");
+
+ return 0;
+}
+
+/**
+ * trinity_sched_exit() - Exit trinity task schedulers
+ */
+void trinity_sched_exit(void)
+{
+ trinity_sched_exit_pri();
+}
diff --git a/drivers/misc/trinity/sched/priority.c b/drivers/misc/trinity/sched/priority.c
new file mode 100644
index 000000000000..3d27c84ff0ba
--- /dev/null
+++ b/drivers/misc/trinity/sched/priority.c
@@ -0,0 +1,335 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * NPU scheduler follows priority policy
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/fs.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/miscdevice.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+
+#include "../trinity_common.h"
+#include "sched.h"
+
+#define get_dev_ptr() (g_sched_priv.dev)
+
+struct trinity_sched_priv {
+ struct device *dev;
+ struct llist_head req_queue;
+ wait_queue_head_t wait_queue;
+ struct task_struct *sched_thread;
+ struct mutex lock;
+ unsigned long suspended;
+};
+
+static struct trinity_sched_priv g_sched_priv;
+
+/**
+ * sched_calc_pri() - Calculate priority using timeout
+ */
+static unsigned long sched_calc_pri(struct trinity_req *req)
+{
+ ktime_t elapsed_time;
+ int64_t priority;
+
+ if (req->input.config.timeout_ms == 0)
+ return 0; /** @todo need preemption */
+
+ elapsed_time = ktime_to_ms(ktime_sub(ktime_get(), req->time_started));
+ WARN_ON(elapsed_time < 0);
+
+ /**
+ * if the elapsed time exceeds the timeout of req,
+ * its priority value is set to the minimum (highest).
+ */
+ priority = req->input.config.timeout_ms - elapsed_time;
+ if (priority < 0)
+ priority = 0;
+
+ return priority;
+}
+
+/**
+ * sched_pick_req() - Pick the top-priority request from request queue
+ */
+static struct trinity_req *sched_pick_req(struct llist_head *queue)
+{
+ struct trinity_req *req, *req_prev;
+ struct trinity_req *top_req, *top_req_prev;
+ int64_t top_priority = S64_MAX;
+ unsigned long priority;
+
+ if (llist_empty(queue))
+ return NULL;
+
+ req = req_prev = NULL;
+ top_req = top_req_prev = NULL;
+
+ /**
+ * llist is not a doubly linked list, and sorting is not easy
+ * because llist provides only limited APIs.
+ * A linear scan is good enough when only a few reqs are pending.
+ * Note that each user application can submit only one req at once.
+ */
+ llist_for_each_entry(req, queue->first, llist) {
+ priority = sched_calc_pri(req);
+ if (top_priority > priority) {
+ top_priority = priority;
+ top_req = req;
+ top_req_prev = req_prev;
+ }
+
+ req_prev = req;
+ }
+
+ if (top_req_prev) {
+ WARN_ON(!top_req);
+ top_req_prev->llist.next = top_req->llist.next;
+ } else {
+ /** it's first entry */
+ top_req = llist_entry(llist_del_first(queue), typeof(*(req)),
+ llist);
+ }
+
+ return top_req;
+}
+
+/**
+ * llist_last() - Get the last node of the list
+ */
+static struct llist_node *llist_last(struct llist_node *first)
+{
+ struct llist_node *last = first;
+
+ while (first && first->next) {
+ last = first->next;
+ first = last;
+ }
+
+ return last;
+}
+
+/**
+ * sched_thread_func() - Scheduler thread function
+ */
+static int sched_thread_func(void *data)
+{
+ const unsigned long MAX_RETRY_COUNT = 100;
+
+ struct llist_head local_queue;
+ struct llist_node *new_first;
+
+ init_llist_head(&local_queue);
+repeat:
+ if (kthread_should_stop())
+ return 0;
+
+ /** extract requests from global queue without locking */
+ new_first = llist_del_all(&g_sched_priv.req_queue);
+ /** new and pending requests could be located together */
+ if (new_first) {
+ struct llist_node *new_last = llist_last(new_first);
+
+ llist_add_batch(new_first, new_last, &local_queue);
+ }
+
+ /** flush requests in the queue */
+ while (!llist_empty(&local_queue)) {
+ struct trinity_req *req;
+ int32_t ret;
+
+ /**
+ * pick the top-priority request from the queue.
+ * first and last node pointers are updated
+ */
+ req = sched_pick_req(&local_queue);
+ if (!req)
+ goto repeat;
+
+ mutex_lock(&g_sched_priv.lock);
+ ret = trinity_sched_run_req(req, NULL);
+ mutex_unlock(&g_sched_priv.lock);
+
+ /** do not modify or access 'req' except in an error case.
+ * it could be released by the interrupt.
+ */
+
+ if (ret == -EBUSY) {
+ if (req->submit_retry >= MAX_RETRY_COUNT) {
+ /** give up on handling this req */
+ complete_all(&req->complete);
+ } else {
+ req->submit_retry++;
+ /** push again and restart the loop */
+ llist_add(&req->llist, &local_queue);
+ }
+ goto repeat;
+ } else if (ret != 0) {
+ /** let's notify this unknown error */
+ complete_all(&req->complete);
+ }
+ }
+
+ /** ensure the local queue is empty */
+ WARN_ON(!llist_empty(&local_queue));
+
+ wait_event_interruptible(
+ g_sched_priv.wait_queue,
+ kthread_should_stop() ||
+ !llist_empty(&(g_sched_priv.req_queue)));
+ goto repeat;
+}
+
+/**
+ * sched_ready() - Check scheduler is ready
+ */
+static bool sched_ready(void)
+{
+ return !test_bit(1, &g_sched_priv.suspended);
+}
+
+/**
+ * sched_submit() - Submit request to scheduler
+ */
+static int32_t sched_submit(void *data)
+{
+ struct trinity_req *req = data;
+
+ if (!req)
+ return -EINVAL;
+
+ if (!sched_ready())
+ return -EAGAIN;
+
+ llist_add(&req->llist, &g_sched_priv.req_queue);
+ wake_up(&g_sched_priv.wait_queue);
+
+ return 0;
+}
+
+/**
+ * sched_notify() - Finish the request and mark it as no longer scheduled
+ */
+static void sched_notify(void *data, bool error)
+{
+ struct trinity_req *req = data;
+
+ req->scheduled = false;
+}
+
+/**
+ * sched_suspend() - Suspend scheduler
+ */
+static void sched_suspend(void)
+{
+ if (!test_and_set_bit(1, &g_sched_priv.suspended))
+ mutex_lock(&g_sched_priv.lock);
+}
+
+/**
+ * sched_resume() - Resume scheduler
+ */
+static void sched_resume(void)
+{
+ if (test_and_clear_bit(1, &g_sched_priv.suspended))
+ mutex_unlock(&g_sched_priv.lock);
+}
+
+static struct trinity_sched_desc trinity_sched_pri = {
+ .ready = sched_ready,
+ .submit = sched_submit,
+ .notify = sched_notify,
+ .suspend = sched_suspend,
+ .resume = sched_resume,
+};
+
+/**
+ * sched_open() - Open scheduler
+ */
+static int sched_open(struct inode *inodep, struct file *filp)
+{
+ return 0;
+}
+
+/**
+ * sched_release() - Release scheduler
+ */
+static int sched_release(struct inode *inodep, struct file *filp)
+{
+ return 0;
+}
+
+static const struct file_operations sched_fops = {
+ .owner = THIS_MODULE,
+ .open = sched_open,
+ .release = sched_release,
+ .llseek = no_llseek,
+};
+
+static struct miscdevice sched_device = {
+ .minor = MISC_DYNAMIC_MINOR,
+ .name = "trinity_sched_pri",
+ .fops = &sched_fops,
+};
+
+/**
+ * sched_init_priv() - Initialize scheduler
+ */
+static int sched_init_priv(void)
+{
+ g_sched_priv.dev = sched_device.this_device;
+
+ init_llist_head(&g_sched_priv.req_queue);
+ init_waitqueue_head(&g_sched_priv.wait_queue);
+ mutex_init(&g_sched_priv.lock);
+ clear_bit(1, &g_sched_priv.suspended);
+
+ /* the thread takes the mutex, so initialize it before starting */
+ g_sched_priv.sched_thread =
+ kthread_run(sched_thread_func, NULL, "trinity_sched_thread");
+ if (IS_ERR(g_sched_priv.sched_thread)) {
+ dev_err(get_dev_ptr(),
+ "Failed to create a thread for scheduling reqs");
+ misc_deregister(&sched_device);
+ return PTR_ERR(g_sched_priv.sched_thread);
+ }
+
+ return 0;
+}
+
+/**
+ * trinity_sched_init_pri() - Initialize trinity priority task schedulers
+ *
+ * @dev: an instance of the device
+ */
+int trinity_sched_init_pri(struct device *dev)
+{
+ int err;
+
+ err = misc_register(&sched_device);
+ if (err) {
+ dev_err(dev,
+ "Failed to register a misc device for scheduler\n");
+ return err;
+ }
+
+ trinity_sched_register(SCHED_PRI, &trinity_sched_pri);
+ return sched_init_priv();
+}
+
+/**
+ * trinity_sched_exit_pri() - Exit trinity priority task schedulers
+ */
+void trinity_sched_exit_pri(void)
+{
+ trinity_sched_unregister(SCHED_PRI, &trinity_sched_pri);
+ misc_deregister(&sched_device);
+}
diff --git a/drivers/misc/trinity/sched/priority.h b/drivers/misc/trinity/sched/priority.h
new file mode 100644
index 000000000000..35ac07530496
--- /dev/null
+++ b/drivers/misc/trinity/sched/priority.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/**
+ * NPU scheduler follows priority policy
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __TRINITY_SCHED_PRI_H__
+#define __TRINITY_SCHED_PRI_H__
+
+int trinity_sched_init_pri(struct device *dev);
+void trinity_sched_exit_pri(void);
+
+#endif /* __TRINITY_SCHED_PRI_H__ */
diff --git a/drivers/misc/trinity/sched/sched.h b/drivers/misc/trinity/sched/sched.h
new file mode 100644
index 000000000000..d13ef5857fc7
--- /dev/null
+++ b/drivers/misc/trinity/sched/sched.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/**
+ * Scheduler I/F header for trinity devices
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __TRINITY_SCHED_H__
+#define __TRINITY_SCHED_H__
+
+#include <linux/device.h>
+#include <linux/types.h>
+
+/**
+ * enum trinity_sched_type - scheduler type
+ */
+enum trinity_sched_type { SCHED_PRI = 0, SCHED_END };
+
+typedef void (*remove_req_cb)(void *data, void *req);
+
+/**
+ * struct trinity_sched_desc - a structure for scheduler description
+ */
+struct trinity_sched_desc {
+ bool (*ready)(void);
+ int32_t (*submit)(void *data);
+ bool (*cancel)(void *data);
+ void (*suspend)(void);
+ void (*resume)(void);
+ void (*notify)(void *data, bool error);
+
+ struct trinity_req *(*find_req)(uint32_t dev_id, int req_id);
+ void (*remove_reqs)(void *data, remove_req_cb cb);
+ void (*test_run)(void *data, int req_id);
+};
+
+struct trinity_sched_desc *trinity_sched_find(enum trinity_sched_type type);
+void trinity_sched_register(enum trinity_sched_type type,
+ struct trinity_sched_desc *desc);
+void trinity_sched_unregister(enum trinity_sched_type type,
+ struct trinity_sched_desc *desc);
+int32_t trinity_sched_run_req(void *req_data, void *sched_data);
+void trinity_sched_suspend(void);
+void trinity_sched_resume(void);
+int32_t trinity_sched_init(struct device *dev);
+void trinity_sched_exit(void);
+
+#endif /* __TRINITY_SCHED_H__ */
--
2.25.1

2022-07-25 07:10:22

by Jiho Chu

[permalink] [raw]
Subject: [PATCH 7/9] trinity: Add profile module

This patch adds the profile module.

The Samsung NPU provides internal statistics, including memory
read/write counts and the clock cycles consumed by each operation.
These statistics can be read through an ioctl command.
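
The driver sizes each profile record as a fixed header plus one entry per operation (see triv2_check_profile() below). The following userspace sketch shows that size computation together with a bounds check against the profile buffer. The check, the function name, and the two record sizes are assumptions for illustration only; the real layouts are defined in trinity_vision2_profile.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record sizes, standing in for the real struct sizes. */
#define SKETCH_HDR_SIZE 16u
#define SKETCH_OP_SIZE  64u

/*
 * Return header + total_ops * per-op size, or 0 if a record of that
 * size starting at offset would run past the end of the buffer.
 */
static size_t sketch_profile_size(size_t buf_size, size_t offset,
				  uint32_t total_ops)
{
	size_t avail, max_ops;

	if (offset >= buf_size)
		return 0;

	avail = buf_size - offset;
	if (avail < SKETCH_HDR_SIZE)
		return 0;

	/* divide instead of multiplying first, so nothing can overflow */
	max_ops = (avail - SKETCH_HDR_SIZE) / SKETCH_OP_SIZE;
	if (total_ops > max_ops)
		return 0;

	return SKETCH_HDR_SIZE + (size_t)total_ops * SKETCH_OP_SIZE;
}
```

Because total_ops is read back from memory the device writes to, validating it before copying may be worth considering; whether the hardware already guarantees this is not stated in the patch.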

Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
drivers/misc/trinity/trinity_vision2_drv.c | 467 +++++++++++++++++-
.../misc/trinity/trinity_vision2_profile.h | 324 ++++++++++++
2 files changed, 771 insertions(+), 20 deletions(-)
create mode 100644 drivers/misc/trinity/trinity_vision2_profile.h

diff --git a/drivers/misc/trinity/trinity_vision2_drv.c b/drivers/misc/trinity/trinity_vision2_drv.c
index ddc1739afdd8..539eadeca09d 100644
--- a/drivers/misc/trinity/trinity_vision2_drv.c
+++ b/drivers/misc/trinity/trinity_vision2_drv.c
@@ -177,31 +177,154 @@ static int triv2_idu_load(struct trinity_driver *drv, const char *dirpath,

static LIST_HEAD(triv2_driver_list);
static struct hlist_bl_head triv2_model_node_hlist[TRIV2_MODEL_HASH_SIZE];
+static const char * const triv2_op_names[] = TRIV2_FOREACH_OPNAME(TRIV2_GENERATE_OPNAME);

static struct triv2_profile *
triv2_find_profile(const struct trinity_driver *drv, int req_id)
{
- /* find profile */
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ unsigned long key = TRIV2_PROFILE_HASH_KEY(req_id);
+ struct triv2_profile *profile = NULL;
+
+ hash_for_each_possible(pdata->prof_htable, profile, hlist, key) {
+ if (profile->req_id == req_id)
+ break;
+ }

- return NULL;
+ return profile;
}

static void triv2_fini_profile(struct trinity_resv_mem *prof_buf)
{
- /* finish profile */
+ if (!prof_buf->vaddr)
+ return;
+
+ trinity_free_from_resv_mem(prof_buf, false);
+ memset(prof_buf, '\x00', sizeof(*prof_buf));
}

static void triv2_init_profile(struct trinity_driver *drv,
unsigned long profile_size)
{
- /* init profile */
+ struct device *dev = drv_to_dev_ptr(drv);
+ struct trinity_resv_mem *prof_buf = TRIV2_DRV_GET_PROF_BUF(drv);
+
+ if (profile_size > 0) {
+ /* allocate profile buffer and enable it */
+ struct iommu_domain *domain;
+ phys_addr_t paddr;
+ int status;
+
+ triv2_fini_profile(prof_buf);
+
+ profile_size = PAGE_ALIGN(profile_size);
+ status = trinity_alloc_from_resv_mem(profile_size, prof_buf,
+ false);
+ if (status < 0) {
+ dev_err(dev,
+ "Couldn't allocate memory for profiling buffer: %d",
+ status);
+ return;
+ }
+
+ domain = iommu_get_domain_for_dev(drv_to_dev_ptr(drv));
+ paddr = trinity_get_paddr(domain, prof_buf->daddr);
+ iowrite32(TRIV2_IDU_ADDR(paddr),
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_ADDR));
+ iowrite32(prof_buf->size,
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_SIZE));
+
+ if (drv->verbose)
+ dev_info(dev, "Profiling enabled (%ld bytes)",
+ profile_size);
+ } else {
+ /* disable profiling */
+ triv2_fini_profile(prof_buf);
+
+ iowrite32(0, trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_ADDR));
+ iowrite32(0, trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_SIZE));
+ if (drv->verbose)
+ dev_info(dev, "Profiling disabled");
+ }
+}
+
+static void triv2_assign_opnames(struct triv2_cmd_profile *cmd)
+{
+ struct triv2_op_profile *ops = cmd->profile_ops;
+ uint32_t i;
+
+ for (i = 0; i < cmd->total_ops; i++)
+ snprintf(ops[i].op_name, TRIV2_MAX_OPNAME, "%s",
+ triv2_op_names[ops[i].opcode]);
}

static int32_t triv2_check_profile(struct trinity_driver *drv,
struct trinity_req *req)
{
- /* check profile */
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct triv2_req *t_req = TRIV2_GET_REQ(req);
+ struct trinity_resv_mem *profile_buf;
+ struct triv2_cmd_profile *profile_cmd;
+ struct triv2_cmd_profile *profile_cmd_new;
+ struct triv2_profile *profile;
+
+ uint32_t offset = t_req->profile_offset;
+ uint32_t total_ops, total_size;
+
+ profile_buf = TRIV2_DRV_GET_PROF_BUF(drv);
+ if (!profile_buf->vaddr)
+ return 0;
+
+ if (profile_buf->size <= offset) {
+ dev_err(drv_to_dev_ptr(drv),
+ "Invalid profile offset detected: 0x%x", offset);
+ return -EINVAL;
+ }
+
+ profile_cmd = (struct triv2_cmd_profile *)((char *)profile_buf->vaddr +
+ offset);
+ profile_cmd->total_cycles = t_req->total_cycles;

+ total_ops = profile_cmd->total_ops;
+ total_size = sizeof(struct triv2_cmd_profile) +
+ total_ops * sizeof(struct triv2_op_profile);
+
+ profile_cmd_new = vzalloc(total_size);
+ if (!profile_cmd_new)
+ return -ENOMEM;
+
+ mutex_lock(&pdata->prof_lock);
+
+ profile = req->stat->profile;
+ if (profile) {
+ WARN_ON(!profile->data);
+ vfree(profile->data);
+ profile->data = profile_cmd_new;
+ } else {
+ int req_id = req->input.config.req_id;
+ unsigned long key = TRIV2_PROFILE_HASH_KEY(req_id);
+
+ profile = vzalloc(sizeof(struct triv2_profile));
+ if (!profile) {
+ vfree(profile_cmd_new);
+ mutex_unlock(&pdata->prof_lock);
+ return -ENOMEM;
+ }
+ profile->req_id = req_id;
+ profile->data = profile_cmd_new;
+
+ hash_add(pdata->prof_htable, &profile->hlist, key);
+
+ req->stat->profile = profile;
+ }
+ memcpy(profile_cmd_new, profile_cmd, total_size);
+ triv2_assign_opnames(profile_cmd_new);
+
+ mutex_unlock(&pdata->prof_lock);
return 0;
}

@@ -400,6 +523,47 @@ static void triv2_reset(struct trinity_driver *drv)
mutex_unlock(&pdata->drv->lock);
}

+enum triv2_idu_stage {
+ IDU_STAGE_UNKNOWN = 0,
+ IDU_STAGE_WAITING,
+ IDU_STAGE_GET_CMD,
+ IDU_STAGE_RUN_CMD,
+ IDU_STAGE_SWAP_OUT,
+ IDU_STAGE_SWAP_IN,
+ IDU_STAGE_SEND_IRQ,
+};
+
+/**
+ * triv2_run_trigger() - trigger memory-mapped register for inference running
+ */
+static void triv2_run_trigger(const struct trinity_driver *drv, int slot)
+{
+ struct triv2_cmd_info *cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+ struct triv2_req *t_req = cmd_info->reqs[slot];
+
+ if (!t_req) {
+ dev_err(drv_to_dev_ptr(drv),
+ "Unable to find the corresponding req");
+ return;
+ }
+
+ if (triv2_sync_segt_entries(drv, t_req) < 0)
+ dev_err(drv_to_dev_ptr(drv),
+ "Unable to sync the segment table");
+
+ /* sync the current bitmap */
+ iowrite32(*cmd_info->bitmap,
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_CMD_REQ));
+
+ t_req->req.stat->scheduled = ktime_get();
+ t_req->req.stat->completed = 0;
+ t_req->req.scheduled = true;
+
+ /* trigger the event (we do not assume that IDU always accepts this event) */
+ triv2_wakeup_cp(drv);
+}
+
static void triv2_clear_cmd(struct trinity_driver *drv, struct triv2_req *req,
struct triv2_cmd *cmd)
{
@@ -458,6 +622,128 @@ static void triv2_handle_cmd_done(struct trinity_driver *drv,
complete_all(&req->complete);
}

+/**
+ * triv2_prepare_cmd() - Prepare command info. for the target req before invoking
+ */
+static int32_t triv2_prepare_cmd(struct trinity_driver *drv,
+ struct trinity_req *req, void *sched_data)
+{
+ struct triv2_cmd_info *cmd_info;
+ struct triv2_cmd cmd = { 0 };
+ struct triv2_req *t;
+
+ const struct trinity_model *model = req->model;
+ const struct trinity_input *input = &req->input;
+
+ int32_t slot;
+ struct iommu_domain *domain;
+ phys_addr_t paddr;
+ unsigned long flags;
+
+ /** Note that the program base is not behind iommu */
+ domain = iommu_get_domain_for_dev(drv_to_dev_ptr(drv));
+
+ paddr = trinity_get_paddr(domain, model->import_info.dma_addr);
+ cmd.prog_addr = TRIV2_IDU_ADDR(paddr);
+ cmd.prog_addr += model->config.program_offset_addr;
+ cmd.prog_size = model->config.program_size;
+
+ paddr = trinity_get_paddr(domain, input->import_info.dma_addr);
+ cmd.segt_addr = TRIV2_IDU_ADDR(paddr);
+ cmd.num_visa = model->config.num_visa_insts;
+
+ cmd.priority = input->config.priority;
+ cmd.input_mode = input->config.input_mode;
+ cmd.output_mode = input->config.output_mode;
+
+ /** Find an empty cmd slot in bitmap (need a spin lock) */
+ cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+ t = TRIV2_GET_REQ(req);
+
+ spin_lock_irqsave(&cmd_info->lock, flags);
+
+ slot = find_first_zero_bit(cmd_info->bitmap, TRIV2_MAX_CMDSLOTS);
+ if (slot < TRIV2_MAX_CMDSLOTS) {
+ set_bit(slot, cmd_info->bitmap);
+ cmd_info->reqs[slot] = t;
+ t->cmd_slot = slot;
+ }
+
+ spin_unlock_irqrestore(&cmd_info->lock, flags);
+
+ /** Will be retried (rely on platform device's scheduling) */
+ if (slot >= TRIV2_MAX_CMDSLOTS)
+ return -EBUSY;
+
+ cmd.slot = slot;
+ cmd.status = STATUS_CMD_READY;
+
+ memcpy_toio(cmd_info->buf.vaddr + slot * sizeof(struct triv2_cmd), &cmd,
+ sizeof(struct triv2_cmd));
+
+ return slot;
+}
+
+/**
+ * triv2_invoke_req() - Invoke a req on the device. Note that all configurations
+ * required by running should be done before invocation of this function.
+ */
+static int32_t triv2_invoke_req(struct trinity_driver *drv,
+ struct trinity_req *req, void *sched_data)
+{
+ enum trinity_output_mode mode;
+ int32_t slot;
+
+ mode = req->input.config.output_mode;
+ slot = triv2_prepare_cmd(drv, req, sched_data);
+ if (slot < 0)
+ return slot;
+
+ if (mode == TRINITY_OUTPUT_HW || mode == TRINITY_OUTPUT_CPU_POLL ||
+ mode == TRINITY_OUTPUT_CPU_INTR) {
+ triv2_run_trigger(drv, slot);
+ } else {
+ dev_err(drv_to_dev_ptr(drv), "Invalid output mode: %d\n", mode);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static struct trinity_req *triv2_alloc_req(struct trinity_driver *drv)
+{
+ struct triv2_req *t_req;
+
+ t_req = kzalloc(sizeof(struct triv2_req), GFP_KERNEL);
+ if (!t_req)
+ return NULL;
+
+ t_req->cmd_slot = -1;
+
+ return &(t_req->req);
+}
+
+static void triv2_dealloc_req(struct trinity_driver *drv,
+ struct trinity_req *req)
+{
+ struct triv2_req *t_req = TRIV2_GET_REQ(req);
+
+ if (t_req->seg_import) {
+ struct trinity_hwmem_import *import;
+ uint32_t i;
+
+ for (i = 0; i < req->input.config.num_segments; i++) {
+ import = &(t_req->seg_import[i]);
+ if (import->addr)
+ trinity_hwmem_import_dmabuf_end(import);
+ }
+ kfree(t_req->seg_import);
+ }
+
+ kfree(t_req->kernel);
+ kfree(t_req);
+}
+
static void triv2_handle_timeout(struct trinity_driver *drv,
struct trinity_req *req)
{
@@ -494,6 +780,156 @@ static void triv2_stop_reqs(struct work_struct *work)
triv2_cancel_reqs(drv);
}

+/**
+ * triv2_get_profile_meta() - get profile metadata for the target req
+ */
+static int32_t triv2_get_profile_meta(const struct trinity_driver *drv,
+ struct trinity_ioctl_profile_meta *meta)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct triv2_profile *profile;
+ struct triv2_cmd_profile *profile_data;
+ int ret = 0;
+
+ mutex_lock(&pdata->prof_lock);
+
+ profile = triv2_find_profile(drv, meta->req_id);
+ if (!profile) {
+ ret = -ENOENT;
+ goto out;
+ }
+
+ profile_data = profile->data;
+ WARN_ON(!profile_data);
+
+ meta->total_cycles = profile_data->total_cycles;
+ meta->total_ops = profile_data->total_ops;
+ meta->profile_size =
+ profile_data->total_ops * sizeof(struct triv2_op_profile);
+ /* unsupported for now */
+ meta->input_footprint = -1;
+ meta->output_footprint = -1;
+
+out:
+ mutex_unlock(&pdata->prof_lock);
+
+ return ret;
+}
+
+/**
+ * triv2_get_profile_buff() - get profile buffer for the target req
+ */
+static int32_t triv2_get_profile_buff(const struct trinity_driver *drv,
+ struct trinity_ioctl_profile_buff *buff)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct triv2_profile *profile;
+ struct triv2_cmd_profile *profile_data;
+ uint32_t total_size;
+ int ret = 0;
+
+ mutex_lock(&pdata->prof_lock);
+
+ profile = triv2_find_profile(drv, buff->req_id);
+ if (!profile) {
+ ret = -ENOENT;
+ goto out;
+ }
+
+ profile_data = profile->data;
+ WARN_ON(!profile_data);
+
+ profile_data = profile->data;
+ total_size = profile_data->total_ops * sizeof(struct triv2_op_profile);
+
+ if (buff->profile_pos + buff->profile_size > total_size) {
+ dev_err(drv_to_dev_ptr(drv),
+ "Profile data out-of-range! pos(%u) size(%u) > total_size(%u)",
+ buff->profile_pos, buff->profile_size, total_size);
+ ret = -ERANGE;
+ goto out;
+ }
+
+ /* consider partial memory copies */
+ if (copy_to_user((char __user *)buff->profile_buf,
+ (char *)profile_data->profile_ops + buff->profile_pos,
+ buff->profile_size))
+ ret = -EACCES;
+
+out:
+ mutex_unlock(&pdata->prof_lock);
+
+ return ret;
+}
+
+static void triv2_show_profile(const struct trinity_driver *drv, int req_id)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct triv2_profile *profile;
+ struct triv2_cmd_profile *profile_data;
+ uint32_t i;
+
+ mutex_lock(&pdata->prof_lock);
+
+ profile = triv2_find_profile(drv, req_id);
+ if (!profile) {
+ dev_warn(dev, "Unable to find the profile data (req_id %d)",
+ req_id);
+ goto out;
+ }
+
+ profile_data = profile->data;
+ WARN_ON(!profile_data);
+
+ dev_info(dev, "Total cycles: %lld", profile_data->total_cycles);
+ dev_info(dev, "Total ops: %u", profile_data->total_ops);
+
+ for (i = 0; i < profile_data->total_ops; i++) {
+ struct triv2_op_profile *op = &profile_data->profile_ops[i];
+
+ dev_info(dev, "[%u] opcode: %u name:%s", i, op->opcode,
+ op->op_name);
+ dev_info(dev, "\tcycles: %lld", op->cycles);
+ dev_info(dev, "\tprog_seq: %lld", op->prog_seq);
+ dev_info(dev, "\texec_seq: %lld", op->exec_seq);
+ if (op->dram_read > 0)
+ dev_info(dev, "\tdram_read: %lld", op->dram_read);
+ if (op->dram_write > 0)
+ dev_info(dev, "\tdram_write: %lld", op->dram_write);
+ if (op->sram_read > 0)
+ dev_info(dev, "\tsram_read: %lld", op->sram_read);
+ if (op->sram_write > 0)
+ dev_info(dev, "\tsram_write: %lld", op->sram_write);
+ }
+out:
+ mutex_unlock(&pdata->prof_lock);
+}
+
+/**
+ * triv2_destroy_profile() - destroy profile data
+ */
+static void triv2_destroy_profile(const struct trinity_driver *drv, void *data)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct triv2_profile *profile = data;
+ struct triv2_cmd_profile *profile_data;
+
+ if (!profile)
+ return;
+
+ mutex_lock(&pdata->prof_lock);
+
+ profile_data = profile->data;
+ WARN_ON(!profile_data);
+ vfree(profile_data);
+
+ hash_del(&profile->hlist);
+ vfree(profile);
+
+ mutex_unlock(&pdata->prof_lock);
+}
+
static void triv2_handle_irq_cmds(struct trinity_driver *drv)
{
struct triv2_cmd_info *info;
@@ -667,18 +1103,6 @@ static int32_t triv2_prepare_req(struct trinity_driver *drv,
return ret;
}

-/**
- * triv2_invoke_req() - Invoke a req on the device. Note that all configurations
- * required by running should be done before invocation of this function.
- */
-static int32_t triv2_invoke_req(struct trinity_driver *drv,
- struct trinity_req *req, void *sched_data)
-{
- /* invoke request */
-
- return 0;
-}
-
static long triv2_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
{
struct trinity_driver *drv = f->private_data;
@@ -740,13 +1164,16 @@ static void triv2_setup_dsp(struct trinity_driver *drv, phys_addr_t paddr)

static void triv2_init_common(void)
{
- static bool done;
+ static bool need_init = true;
+ int i;

- if (done)
+ if (!need_init)
return;

/* init hlists */
- done = true;
+ for (i = 0; i < TRIV2_MODEL_HASH_SIZE; ++i)
+ INIT_HLIST_BL_HEAD(&triv2_model_node_hlist[i]);
+ need_init = false;
}

static int triv2_idu_alloc(struct device *dev, struct trinity_resv_mem *mem)
diff --git a/drivers/misc/trinity/trinity_vision2_profile.h b/drivers/misc/trinity/trinity_vision2_profile.h
new file mode 100644
index 000000000000..90b42cf56c54
--- /dev/null
+++ b/drivers/misc/trinity/trinity_vision2_profile.h
@@ -0,0 +1,324 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/**
+ * Profile header for TRIV2 devices
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __TRINITY_VISION2_PROFILE_H__
+#define __TRINITY_VISION2_PROFILE_H__
+
+#include <linux/types.h>
+
+#define TRIV2_MAX_OPNAME (128)
+#define TRIV2_MAX_PROFILE_SIZE (256)
+
+/**
+ * struct triv2_op_profile - A profile data per operation
+ *
+ * @op_name: name of the operation
+ * @cycles: total number of cycles
+ * @dram_read: a count for dram read
+ * @dram_write: a count for dram write
+ * @sram_read: a count for sram read
+ * @sram_write: a count for sram write
+ * @start_cycles: a count for starting cycles
+ * @end_cycles: a count for ending cycles
+ * @opcode: operation code
+ * @prog_seq: program sequence number
+ * @exec_seq: execution sequence number
+ * @reserved: reserved
+ */
+struct triv2_op_profile {
+ union {
+ struct {
+ char op_name[TRIV2_MAX_OPNAME];
+
+ int64_t cycles;
+
+ int64_t dram_read;
+ int64_t dram_write;
+
+ int64_t sram_read;
+ int64_t sram_write;
+
+ int64_t start_cycles;
+ int64_t end_cycles;
+
+ uint32_t opcode;
+ int64_t prog_seq;
+ int64_t exec_seq;
+ } __packed;
+ uint8_t reserved[TRIV2_MAX_PROFILE_SIZE];
+ };
+};
+
+/**
+ * struct triv2_cmd_profile - A profile data per command
+ *
+ * @total_cycles: total number of cycles for a command
+ * @total_ops: total operations of command
+ * @profile_ops: list of profile data for operations
+ */
+struct triv2_cmd_profile {
+ int64_t total_cycles;
+ uint32_t total_ops;
+ /* flexible array member */
+ struct triv2_op_profile profile_ops[];
+} __packed;
+
+/**
+ * struct triv2_profile - A profile data
+ *
+ * @req_id: ID of the request this profile belongs to
+ * @hlist: hash list node of this profile
+ * @data: command profile data
+ */
+struct triv2_profile {
+ int req_id;
+ struct hlist_node hlist;
+ struct triv2_cmd_profile *data;
+};
+
+enum {
+ NOP = 0x00,
+ HALT = 0x01,
+ ADMA_IN = 0x02,
+ ADMA_OUT = 0x03,
+ RESCALE_I8 = 0x04,
+ RESCALE_I16 = 0x05,
+ CONVERT_I16_I8 = 0x06,
+ CONVERT_I8_I16 = 0x07,
+ RELUN_I8 = 0x08,
+ RELUN_I16 = 0x09,
+ PRELU_I8 = 0x0A,
+ PRELU_I16 = 0x0B,
+ ADD_I8 = 0x0C,
+ ADD_I16 = 0x0D,
+ REDUCE_MEAN_I8 = 0x0E,
+ REDUCE_MEAN_I16 = 0x0F,
+ MAX_POOL_I8 = 0x10,
+ MAX_POOL_I16 = 0x11,
+ AVG_POOL_I8 = 0x12,
+ AVG_POOL_I16 = 0x13,
+ CONV_I8 = 0x14,
+ CONV_I16 = 0x15,
+ CONVE_I8 = 0x16,
+ CONVE_I16 = 0x17,
+ TCONV_I8 = 0x18,
+ TCONV_I16 = 0x19,
+ MUL_I8 = 0x1A,
+ MUL_I16 = 0x1B,
+ DCONV_I8 = 0x1C,
+ DCONV_I16 = 0x1D,
+ DCONVE_I8 = 0x1E,
+ DCONVE_I16 = 0x1F,
+ CONV_I8_P = 0x20,
+ CONV_I16_P = 0x21,
+ PDMA_IN = 0x40,
+ PDMA_OUT = 0x41,
+ ARGMAX_I8 = 0x42,
+ ARGMAX_I16 = 0x43,
+ RESHAPE_I8 = 0x44,
+ RESHAPE_I16 = 0x45,
+ TRANSPOSE_I8 = 0x46,
+ TRANSPOSE_I16 = 0x47,
+ CONCAT_I8 = 0x48,
+ CONCAT_I16 = 0x49,
+ PAD_I8 = 0x4A,
+ PAD_I16 = 0x4B,
+ STRIDED_SLICE_I8 = 0x4C,
+ STRIDED_SLICE_I16 = 0x4D,
+ CONVERT_FORMAT_I8 = 0x4E,
+ CONVERT_FORMAT_I16 = 0x4F,
+ SIGMOID_I8 = 0x50,
+ SIGMOID_I16 = 0x51,
+ TANH_I8 = 0x52,
+ TANH_I16 = 0x53,
+ ELU_I8 = 0x54,
+ ELU_I16 = 0x55,
+ FLOOR_I8 = 0x56,
+ FLOOR_I16 = 0x57,
+ RSQRT_I8 = 0x58,
+ RSQRT_I16 = 0x59,
+ SQRT_I8 = 0x5A,
+ SQRT_I16 = 0x5B,
+ SOFTMAX_I8 = 0x5C,
+ SOFTMAX_I16 = 0x5D,
+ DIVIDE_I8 = 0x60,
+ DIVIDE_I16 = 0x61,
+ FLOORDIV_I8 = 0x62,
+ FLOORDIV_I16 = 0x63,
+ LOGICAL_OR_I8 = 0x64,
+ LOGICAL_OR_I16 = 0x65,
+ GREATER_I8 = 0x66,
+ GREATER_I16 = 0x67,
+ GREATER_EQUAL_I8 = 0x68,
+ GREATER_EQUAL_I16 = 0x69,
+ POW_I8 = 0x6A,
+ POW_I16 = 0x6B,
+ EXP_I8 = 0x6C,
+ EXP_I16 = 0x6D,
+ NOT_EQUAL_I8 = 0x6E,
+ NOT_EQUAL_I16 = 0x6F,
+ BATCH_TO_SPACE_I8 = 0x70,
+ BATCH_TO_SPACE_I16 = 0x71,
+ SPACE_TO_BATCH_I8 = 0x72,
+ SPACE_TO_BATCH_I16 = 0x73,
+ DEPTH_TO_SPACE_I8 = 0x74,
+ DEPTH_TO_SPACE_I16 = 0x75,
+ SPACE_TO_DEPTH_I8 = 0x76,
+ SPACE_TO_DEPTH_I16 = 0x77,
+ YUV_TO_RGB_I8 = 0x7A,
+ YUV_TO_RGB_I16 = 0x7B,
+ RESIZE_BILINEAR_I8 = 0x7C,
+ RESIZE_BILINEAR_I16 = 0x7D,
+ RESIZE_NEAREST_NEIGHBOR_I8 = 0x7E,
+ RESIZE_NEAREST_NEIGHBOR_I16 = 0x7F,
+ LOCAL_RESPONSE_NORM_I8 = 0x80,
+ LOCAL_RESPONSE_NORM_I16 = 0x81,
+ INSTANCE_NORM_I8 = 0x82,
+ INSTANCE_NORM_I16 = 0x83,
+ REDUCED_SUM_SSUM_I8 = 0x84,
+ REDUCED_SUM_SSUM_I16 = 0x85,
+ REDUCED_SUM_SSUM_ACC_I8 = 0x86,
+ REDUCED_SUM_SSUM_ACC_I16 = 0x87,
+ REDUCED_SUM_2SUM_I8 = 0x88,
+ REDUCED_SUM_2SUM_I16 = 0x89,
+ REDUCED_MEAN_DEV_WSUM_I8 = 0x8A,
+ REDUCED_MEAN_DEV_WSUM_I16 = 0x8B,
+ REDUCED_MEAN_DEV_I8 = 0x8C,
+ REDUCED_MEAN_DEV_I16 = 0x8D,
+ RESCALE_CW_I8 = 0x8E,
+ RESCALE_CW_I16 = 0x8F,
+ REDUCED_MEAN_SCALE_WSUM_I8 = 0x90,
+ REDUCED_MEAN_SCALE_WSUM_I16 = 0x91,
+ RESCALE_CHANNELWISE_I8 = 0x92,
+ RESCALE_CHANNELWISE_I16 = 0x93,
+};
+
+/** generate opnames */
+#define TRIV2_GENERATE_OPNAME(OPNAME) \
+ [OPNAME] = #OPNAME,
+
+#define TRIV2_FOREACH_OPNAME(GEN) {\
+ GEN(NOP) \
+ GEN(HALT) \
+ GEN(ADMA_IN) \
+ GEN(ADMA_OUT) \
+ GEN(RESCALE_I8) \
+ GEN(RESCALE_I16) \
+ GEN(CONVERT_I16_I8) \
+ GEN(CONVERT_I8_I16) \
+ GEN(RELUN_I8) \
+ GEN(RELUN_I16) \
+ GEN(PRELU_I8) \
+ GEN(PRELU_I16) \
+ GEN(ADD_I8) \
+ GEN(ADD_I16) \
+ GEN(REDUCE_MEAN_I8) \
+ GEN(REDUCE_MEAN_I16) \
+ GEN(MAX_POOL_I8) \
+ GEN(MAX_POOL_I16) \
+ GEN(AVG_POOL_I8) \
+ GEN(AVG_POOL_I16) \
+ GEN(CONV_I8) \
+ GEN(CONV_I16) \
+ GEN(CONVE_I8) \
+ GEN(CONVE_I16) \
+ GEN(TCONV_I8) \
+ GEN(TCONV_I16) \
+ GEN(MUL_I8) \
+ GEN(MUL_I16) \
+ GEN(DCONV_I8) \
+ GEN(DCONV_I16) \
+ GEN(DCONVE_I8) \
+ GEN(DCONVE_I16) \
+ GEN(CONV_I8_P) \
+ GEN(CONV_I16_P) \
+ GEN(PDMA_IN) \
+ GEN(PDMA_OUT) \
+ GEN(ARGMAX_I8) \
+ GEN(ARGMAX_I16) \
+ GEN(RESHAPE_I8) \
+ GEN(RESHAPE_I16) \
+ GEN(TRANSPOSE_I8) \
+ GEN(TRANSPOSE_I16) \
+ GEN(CONCAT_I8) \
+ GEN(CONCAT_I16) \
+ GEN(PAD_I8) \
+ GEN(PAD_I16) \
+ GEN(STRIDED_SLICE_I8) \
+ GEN(STRIDED_SLICE_I16) \
+ GEN(CONVERT_FORMAT_I8) \
+ GEN(CONVERT_FORMAT_I16) \
+ GEN(SIGMOID_I8) \
+ GEN(SIGMOID_I16) \
+ GEN(TANH_I8) \
+ GEN(TANH_I16) \
+ GEN(ELU_I8) \
+ GEN(ELU_I16) \
+ GEN(FLOOR_I8) \
+ GEN(FLOOR_I16) \
+ GEN(RSQRT_I8) \
+ GEN(RSQRT_I16) \
+ GEN(SQRT_I8) \
+ GEN(SQRT_I16) \
+ GEN(SOFTMAX_I8) \
+ GEN(SOFTMAX_I16) \
+ GEN(DIVIDE_I8) \
+ GEN(DIVIDE_I16) \
+ GEN(FLOORDIV_I8) \
+ GEN(FLOORDIV_I16) \
+ GEN(LOGICAL_OR_I8) \
+ GEN(LOGICAL_OR_I16) \
+ GEN(GREATER_I8) \
+ GEN(GREATER_I16) \
+ GEN(GREATER_EQUAL_I8) \
+ GEN(GREATER_EQUAL_I16) \
+ GEN(POW_I8) \
+ GEN(POW_I16) \
+ GEN(EXP_I8) \
+ GEN(EXP_I16) \
+ GEN(NOT_EQUAL_I8) \
+ GEN(NOT_EQUAL_I16) \
+ GEN(BATCH_TO_SPACE_I8) \
+ GEN(BATCH_TO_SPACE_I16) \
+ GEN(SPACE_TO_BATCH_I8) \
+ GEN(SPACE_TO_BATCH_I16) \
+ GEN(DEPTH_TO_SPACE_I8) \
+ GEN(DEPTH_TO_SPACE_I16) \
+ GEN(SPACE_TO_DEPTH_I8) \
+ GEN(SPACE_TO_DEPTH_I16) \
+ GEN(YUV_TO_RGB_I8) \
+ GEN(YUV_TO_RGB_I16) \
+ GEN(RESIZE_BILINEAR_I8) \
+ GEN(RESIZE_BILINEAR_I16) \
+ GEN(RESIZE_NEAREST_NEIGHBOR_I8) \
+ GEN(RESIZE_NEAREST_NEIGHBOR_I16) \
+ GEN(LOCAL_RESPONSE_NORM_I8) \
+ GEN(LOCAL_RESPONSE_NORM_I16) \
+ GEN(INSTANCE_NORM_I8) \
+ GEN(INSTANCE_NORM_I16) \
+ GEN(REDUCED_SUM_SSUM_I8) \
+ GEN(REDUCED_SUM_SSUM_I16) \
+ GEN(REDUCED_SUM_SSUM_ACC_I8) \
+ GEN(REDUCED_SUM_SSUM_ACC_I16) \
+ GEN(REDUCED_SUM_2SUM_I8) \
+ GEN(REDUCED_SUM_2SUM_I16) \
+ GEN(REDUCED_MEAN_DEV_WSUM_I8) \
+ GEN(REDUCED_MEAN_DEV_WSUM_I16) \
+ GEN(REDUCED_MEAN_DEV_I8) \
+ GEN(REDUCED_MEAN_DEV_I16) \
+ GEN(RESCALE_CW_I8) \
+ GEN(RESCALE_CW_I16) \
+ GEN(REDUCED_MEAN_SCALE_WSUM_I8) \
+ GEN(REDUCED_MEAN_SCALE_WSUM_I16) \
+ GEN(RESCALE_CHANNELWISE_I8) \
+ GEN(RESCALE_CHANNELWISE_I16) \
+}
+#endif
--
2.25.1
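
A note on the opname macros in trinity_vision2_profile.h above:
TRIV2_FOREACH_OPNAME(TRIV2_GENERATE_OPNAME) expands to a brace-enclosed list
of designated initializers, so it can initialize a name table indexed by
opcode. The sketch below shows the same pattern with a hypothetical
three-entry enum; every DEMO_* name is an assumption made for illustration
and is not part of the patch. Opcodes that have no entry (the gaps in the
enum) are left as NULL, so a lookup should check for that.

```c
#include <stddef.h>

/* Hypothetical three-entry subset of the opcode enum, for illustration. */
enum demo_opcode {
	DEMO_NOP = 0x00,
	DEMO_HALT = 0x01,
	DEMO_PDMA_IN = 0x40,
};

/* Same shape as TRIV2_GENERATE_OPNAME: one designated initializer. */
#define DEMO_GENERATE_OPNAME(OPNAME) [OPNAME] = #OPNAME,

/* Same shape as TRIV2_FOREACH_OPNAME: a brace-enclosed initializer list. */
#define DEMO_FOREACH_OPNAME(GEN) { \
	GEN(DEMO_NOP) \
	GEN(DEMO_HALT) \
	GEN(DEMO_PDMA_IN) \
}

/* The array is sized by the largest designated index (0x40), plus one. */
static const char *const demo_opnames[] =
	DEMO_FOREACH_OPNAME(DEMO_GENERATE_OPNAME);

#define DEMO_NUM_OPNAMES (sizeof(demo_opnames) / sizeof(demo_opnames[0]))

/* Return the name of an opcode, or NULL if it is out of range or unnamed. */
static const char *demo_opname(unsigned int opcode)
{
	if (opcode >= DEMO_NUM_OPNAMES)
		return NULL;
	return demo_opnames[opcode];
}
```

With this table, demo_opname(0x40) yields "DEMO_PDMA_IN", while an opcode
in a gap such as 0x02 yields NULL rather than an empty string.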

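A note on the range check in triv2_get_profile_buff() above: the test
"profile_pos + profile_size > total_size" adds two user-supplied values.
Assuming both fields are 32-bit unsigned (the %u format specifiers suggest
this, but their declarations are not part of this patch), the sum can wrap
around and pass the check. A minimal sketch of a wrap-safe form follows;
the demo_* helpers are hypothetical and not part of the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when [pos, pos + size) lies inside
 * [0, total). No intermediate value can wrap, because the subtraction
 * only happens after pos <= total has been established.
 */
static bool demo_range_ok(uint32_t pos, uint32_t size, uint32_t total)
{
	return pos <= total && size <= total - pos;
}

/* The additive form, for comparison: the 32-bit sum may wrap. */
static bool demo_range_ok_naive(uint32_t pos, uint32_t size, uint32_t total)
{
	return !((uint32_t)(pos + size) > total);
}
```

For pos = UINT32_MAX and size = 2 the additive form sees a sum of 1 and
accepts the request, whereas the subtractive form rejects it.
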
2022-07-25 07:10:26

by Jiho Chu

Subject: [PATCH 6/9] trinity: Add pm and ioctl feature

This patch implements power management and ioctl behaviors.

Currently, the power management operations check whether the device
can enter suspend mode, using pm_runtime_allow() and
pm_runtime_forbid().

The ioctl routines are also added to give control to the user
library. TRINITY_IOCTL_HWMEM_ALLOC/DEALLOC handle memory
allocation for model loading, and the RUN/STOP operations
control NPU work. Several STAT controls provide NPU
statistics, giving user space an inspection feature.

Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
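
The model lifetime implied by the REGISTER_MODEL, RUN_INPUT and
DEREGISTER_MODEL paths in this patch can be sketched in userspace as
follows. In the patch, kref_init() gives a registered model one reference,
trinity_run_input() takes another with trinity_model_get(), and both
trinity_finish_req() and trinity_deregister_model() drop one with
trinity_model_put(). The demo_* names and the plain integer counter are
assumptions made for illustration; the driver uses struct kref, which is
atomic.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct trinity_model with a plain counter. */
struct demo_model {
	int refcnt;
	bool destroyed;
};

/* Registration: the count starts at one, as kref_init() does. */
static void demo_register(struct demo_model *m)
{
	m->refcnt = 1;
	m->destroyed = false;
}

/* Taken when a request starts using the model. */
static void demo_get(struct demo_model *m)
{
	m->refcnt++;
}

/* Dropped on request completion or deregistration; the last put destroys. */
static void demo_put(struct demo_model *m)
{
	if (--m->refcnt == 0)
		m->destroyed = true;
}
```

Under this scheme, deregistering a model while a request is still running
does not free it; the model is destroyed only when the request finishes and
drops the last reference.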
drivers/misc/trinity/Makefile | 1 +
drivers/misc/trinity/trinity.c | 690 ++++++++++++++++++++-
drivers/misc/trinity/trinity_common.h | 23 +-
drivers/misc/trinity/trinity_pm.c | 76 +++
drivers/misc/trinity/trinity_vision2_drv.c | 563 ++++++++++++++++-
5 files changed, 1344 insertions(+), 9 deletions(-)
create mode 100644 drivers/misc/trinity/trinity_pm.c

diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
index ce3539affbf2..22141e2233e8 100644
--- a/drivers/misc/trinity/Makefile
+++ b/drivers/misc/trinity/Makefile
@@ -5,6 +5,7 @@ obj-$(CONFIG_TRINITY_VISION2) += trinity_vision2.o
trinity-y := trinity.o
trinity-y += trinity_resv_mem.o trinity_hwmem.o
trinity-y += sched/core.o sched/priority.o
+trinity-y += trinity_pm.o
trinity-y += trinity_debug.o
trinity-y += trinity_sysfs.o trinity_stat.o

diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
index 8f8ade0aff89..08d15f08da39 100644
--- a/drivers/misc/trinity/trinity.c
+++ b/drivers/misc/trinity/trinity.c
@@ -34,6 +34,7 @@

#include "trinity_common.h"
#include "trinity_resv_mem.h"
+#include "trinity_stat.h"

#define BASE_DEV_NAME "trinity"

@@ -50,6 +51,9 @@ static DEFINE_SPINLOCK(trinity_lock);
/* A bitmap to keep track of active Trinity devices */
static unsigned long dev_bitmap[TRINITY_DEV_END];

+static void trinity_model_get(struct trinity_model *model);
+static void trinity_model_put(struct trinity_model *model);
+
phys_addr_t trinity_get_paddr(struct iommu_domain *domain, dma_addr_t daddr)
{
if (domain)
@@ -58,6 +62,16 @@ phys_addr_t trinity_get_paddr(struct iommu_domain *domain, dma_addr_t daddr)
return TRINITY_PADDR_BASE + daddr;
}

+void trinity_finish_req(struct trinity_driver *drv, struct trinity_req *req)
+{
+ if (drv->desc->check_profile(drv, req) < 0)
+ dev_warn(drv_to_dev_ptr(drv),
+ "Unable to get profile data from NPU\n");
+ trinity_hwmem_import_dmabuf_end(&req->input.import_info);
+ trinity_stat_finish_req(drv, req);
+ trinity_model_put(req->model);
+}
+
static uint64_t trinity_gen_model_id(int32_t dbuf_fd)
{
static uint32_t id;
@@ -101,6 +115,630 @@ void trinity_init_model_htable(const struct trinity_driver *drv,
ht->hash_bits = TRINITY_MODEL_HASH_BITS;
}

+static struct trinity_model *
+trinity_get_model_by_id(const struct trinity_driver *drv, const uint64_t id)
+{
+ struct trinity_model_htable ht;
+ struct hlist_bl_node *hn;
+ struct trinity_model *hm;
+ unsigned long key;
+ int32_t dbuf_fd;
+ bool found = false;
+
+ trinity_init_model_htable(drv, &ht);
+
+ dbuf_fd = trinity_model_id_to_dbuf_fd(id);
+ key = hash_long(dbuf_fd, ht.hash_bits);
+ hm = NULL;
+
+ hlist_bl_lock(&(ht.ht_heads[key]));
+ hlist_bl_for_each_entry(hm, hn, &(ht.ht_heads[key]), hnode) {
+ if (hm->config.id == id) {
+ found = true;
+ break;
+ }
+ }
+ hlist_bl_unlock(&(ht.ht_heads[key]));
+
+ return found ? hm : NULL;
+}
+
+/**
+ * trinity_register_model() - Registers a model to the internal hashtable. Note
+ * that the device is responsible for the hashtable maintenance.
+ *
+ * @drv: An instance of the trinity driver
+ * @model: Model information to be registered
+ *
+ * Returns 0 and sets model->config.id to a system-wide unique value on
+ * success. Otherwise, returns a negative error.
+ */
+int32_t trinity_register_model(struct trinity_driver *drv,
+ struct trinity_model *model)
+{
+ struct trinity_model_htable ht;
+ unsigned long key;
+ int32_t ret;
+
+ ret = trinity_hwmem_import_dmabuf_begin(drv_to_dev_ptr(drv),
+ model->config.dbuf_fd,
+ &model->import_info);
+ if (ret)
+ return ret;
+
+#ifdef ARM
+ /* sync model program data */
+ __cpuc_flush_dcache_area(model->import_info.addr,
+ model->import_info.buf->size);
+#endif
+
+ model->config.id = trinity_gen_model_id(model->config.dbuf_fd);
+ model->owner_id = trinity_get_app_id();
+
+ INIT_HLIST_BL_NODE(&model->hnode);
+
+ trinity_init_model_htable(drv, &ht);
+
+ key = hash_long(model->config.dbuf_fd, ht.hash_bits);
+
+ hlist_bl_lock(&(ht.ht_heads[key]));
+ hlist_bl_add_head(&model->hnode, &ht.ht_heads[key]);
+ hlist_bl_unlock(&(ht.ht_heads[key]));
+
+ kref_init(&model->refcnt);
+
+ return 0;
+}
+
+static void trinity_destroy_model(struct kref *refcnt)
+{
+ struct trinity_model *model =
+ container_of(refcnt, struct trinity_model, refcnt);
+
+ trinity_hwmem_import_dmabuf_end(&model->import_info);
+ kfree(model);
+}
+
+static void trinity_model_get(struct trinity_model *model)
+{
+ if (!model)
+ return;
+
+ kref_get(&model->refcnt);
+}
+
+static void trinity_model_put(struct trinity_model *model)
+{
+ if (!model)
+ return;
+
+ kref_put(&model->refcnt, trinity_destroy_model);
+}
+
+/**
+ * trinity_deregister_model() - Deregisters the model with a given id from the
+ * table
+ *
+ * @drv: An instance of the trinity driver
+ * @id: An id of the model to be deregistered
+ *
+ * Returns 0 on success. Otherwise, returns negative error.
+ */
+int32_t trinity_deregister_model(const struct trinity_driver *drv,
+ const uint64_t id)
+{
+ int32_t dbuf_fd = trinity_model_id_to_dbuf_fd(id);
+ struct trinity_model_htable ht;
+ unsigned long key;
+ struct hlist_bl_node *hn;
+ struct trinity_model *hm = NULL;
+
+ trinity_init_model_htable(drv, &ht);
+
+ key = hash_long(dbuf_fd, ht.hash_bits);
+ hlist_bl_lock(&(ht.ht_heads[key]));
+ hlist_bl_for_each_entry(hm, hn, &(ht.ht_heads[key]), hnode) {
+ if (hm->config.id == id) {
+ hlist_bl_del_init(&hm->hnode);
+ break;
+ }
+ }
+ hlist_bl_unlock(&(ht.ht_heads[key]));
+
+ if (!hm)
+ return -ENOENT;
+
+ trinity_model_put(hm);
+
+ return 0;
+}
+
+/**
+ * trinity_deregister_models_owned() - Deregisters models owned by the caller
+ *
+ * @drv: An instance of the trinity driver
+ */
+void trinity_deregister_models_owned(struct trinity_driver *drv)
+{
+ struct trinity_model_htable ht;
+ struct trinity_model *hm;
+ struct hlist_bl_node *hn;
+ int i = 0, app_id = trinity_get_app_id();
+
+ trinity_init_model_htable(drv, &ht);
+
+retry:
+ for (; i < TRINITY_MODEL_HASH_SIZE; i++) {
+ hlist_bl_lock(&(ht.ht_heads[i]));
+ hlist_bl_for_each_entry(hm, hn, &(ht.ht_heads[i]), hnode) {
+ if (hm->owner_id == app_id) {
+ hlist_bl_del_init(&hm->hnode);
+ hlist_bl_unlock(&(ht.ht_heads[i]));
+
+ trinity_model_put(hm);
+
+ goto retry;
+ }
+ }
+ hlist_bl_unlock(&(ht.ht_heads[i]));
+ }
+}
+
+/**
+ * get_trinity_sched() - Get scheduler for the request
+ *
+ * @req: The request to be scheduled
+ */
+struct trinity_sched_desc *get_trinity_sched(struct trinity_req *req)
+{
+ return trinity_sched_find(SCHED_PRI);
+}
+
+static int32_t trinity_submit_req(struct trinity_driver *drv,
+ struct trinity_req *req)
+{
+ struct trinity_sched_desc *sched;
+ struct device *dev;
+ wait_queue_head_t wq;
+ unsigned long timeout, timeout_ms;
+ unsigned long retry, max_retry = 10;
+ int ret = 0;
+
+ dev = drv_to_dev_ptr(drv);
+ sched = get_trinity_sched(req);
+ if (!sched) {
+ dev_err(dev, "Unable to find the target scheduler");
+ return -EINVAL;
+ }
+
+ /* optional req setup before submission */
+ if (drv->desc->prepare_req) {
+ ret = drv->desc->prepare_req(drv, req);
+ if (ret < 0) {
+ dev_err(dev, "Unable to prepare req submission: %d",
+ ret);
+ return ret;
+ }
+ }
+
+ req->submit_retry = 0;
+ timeout_ms = req->input.config.timeout_ms;
+ /* use the default timeout if the user didn't set one */
+ if (timeout_ms == 0)
+ timeout_ms = TRINITY_RUN_TIMEOUT_MSEC;
+
+ retry = 0;
+ init_waitqueue_head(&wq);
+ init_completion(&req->complete);
+
+ timeout = msecs_to_jiffies(timeout_ms);
+ while (wait_event_interruptible_timeout(wq, sched->ready(),
+ timeout / 10) == 0) {
+ if (retry == max_retry) {
+ ret = -ETIMEDOUT;
+ break;
+ }
+ retry++;
+ }
+
+ if (ret == 0) {
+ ret = trinity_stat_append_req(drv, req);
+ if (ret < 0) {
+ dev_err(dev, "Unable to append request stat: %d", ret);
+ return ret;
+ }
+
+ ret = sched->submit(req);
+ if (ret < 0)
+ trinity_stat_remove_req(drv, req, true);
+ }
+
+ if (ret < 0) {
+ dev_err(dev, "Unable to submit req to scheduler: %d", ret);
+ return ret;
+ }
+
+ if (req->input.config.output_mode != TRINITY_OUTPUT_HW) {
+ timeout = wait_for_completion_timeout(&req->complete, timeout);
+ /* Check and handle the timeout if its handler exists */
+ if (timeout == 0) {
+ bool canceled = false;
+
+ dev_warn(dev, "The request timeout reached: %lu ms",
+ timeout_ms);
+
+ if (sched->cancel) {
+ canceled = sched->cancel(req);
+ if (!canceled)
+ dev_warn(dev, "Unable to cancel req");
+ }
+
+ if (!canceled)
+ drv->desc->handle_timeout(drv, req);
+
+ req->stat->status = TRINITY_REQ_STATUS_ERROR;
+ ret = -ECANCELED;
+ } else if (req->stat->status == TRINITY_REQ_STATUS_ERROR) {
+ ret = -ECANCELED;
+ } else if (drv->verbose) {
+ dev_info(dev,
+ "Execution Cycles: %u, Elapsed Time (us): %u",
+ req->stat->prev_cycles, req->stat->prev_time);
+ }
+ trinity_finish_req(drv, req);
+ }
+
+ return ret;
+}
+
+static int32_t trinity_run_input(struct trinity_driver *drv,
+ struct trinity_input *input,
+ struct trinity_req *req)
+{
+ struct trinity_model *model;
+ int32_t err;
+
+ model = trinity_get_model_by_id(drv, input->config.model_id);
+ if (!model) {
+ dev_info(drv_to_dev_ptr(drv), "Unable to find the model");
+ return -EINVAL;
+ }
+
+ /* skip submitting this req */
+ if (model->config.program_size == 0 &&
+ input->config.output_mode != TRINITY_OUTPUT_HW)
+ return 0;
+
+ trinity_model_get(model);
+
+ err = trinity_hwmem_import_dmabuf_begin(drv_to_dev_ptr(drv),
+ input->config.dbuf_fd,
+ &input->import_info);
+ if (err < 0)
+ return err;
+
+ req->model = model;
+ err = trinity_submit_req(drv, req);
+ if (err == 0)
+ return 0;
+
+ if (err != -ECANCELED)
+ trinity_hwmem_import_dmabuf_end(&input->import_info);
+ return err;
+}
+
+/**
+ * trinity_ioctl() - A common callback for unlocked_ioctl() in file_operations for
+ * a Trinity device node.
+ *
+ * @f: A file instance of the opened device node
+ * @cmd: The target IOCTL command to be handled
+ * @arg: A user argument
+ *
+ * Returns 0 on success. Otherwise, returns negative error.
+ */
+long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+ struct trinity_driver *drv = f->private_data;
+ const struct trinity_desc *desc = drv->desc;
+ ssize_t err = 0L;
+
+ switch (cmd) {
+ case TRINITY_IOCTL_GET_VERSION: {
+ if (copy_to_user((uint32_t __user *)arg, &(desc->ver),
+ sizeof((desc->ver))))
+ return -EFAULT;
+ break;
+ }
+ case TRINITY_IOCTL_GET_API_LEVEL: {
+ uint32_t api_level = TRINITY_API_LEVEL;
+
+ if (copy_to_user((uint32_t __user *)arg, &api_level,
+ sizeof(api_level)))
+ return -EFAULT;
+ break;
+ }
+ case TRINITY_IOCTL_GET_STATE: {
+ enum trinity_state ready;
+
+ ready = drv->desc->get_state(drv);
+ if (copy_to_user((enum trinity_state __user *)arg, &ready,
+ sizeof(ready)))
+ return -EFAULT;
+ break;
+ }
+ case TRINITY_IOCTL_GET_TOPS: {
+ if (copy_to_user((uint32_t __user *)arg, &(drv->tops),
+ sizeof((drv->tops))))
+ return -EFAULT;
+ break;
+ }
+ case TRINITY_IOCTL_GET_DSPM: {
+ if (copy_to_user((uint32_t __user *)arg, &(drv->dspm),
+ sizeof((drv->dspm))))
+ return -EFAULT;
+ break;
+ }
+ case TRINITY_IOCTL_GET_NEXT_REQUEST: {
+ int32_t req_id = atomic_inc_return(&drv->global_req_id);
+
+ if (copy_to_user((int32_t __user *)arg, &req_id,
+ sizeof(req_id)))
+ return -EFAULT;
+ break;
+ }
+ case TRINITY_IOCTL_HWMEM_ALLOC: {
+ struct trinity_ioctl_hwmem hwmem;
+
+ if (copy_from_user(&hwmem, (size_t __user *)arg, sizeof(hwmem)))
+ return -EFAULT;
+
+ err = trinity_hwmem_alloc(drv_to_dev_ptr(drv), hwmem.size,
+ hwmem.type);
+ if (err >= 0)
+ trinity_stat_app_total_alloc(drv, hwmem.size);
+ break;
+ }
+ case TRINITY_IOCTL_HWMEM_DEALLOC: {
+ struct trinity_ioctl_hwmem hwmem;
+ struct dma_buf *dbuf;
+
+ if (copy_from_user(&hwmem, (size_t __user *)arg, sizeof(hwmem)))
+ return -EFAULT;
+
+ dbuf = dma_buf_get(hwmem.dbuf_fd);
+ if (IS_ERR(dbuf))
+ return PTR_ERR(dbuf);
+
+ err = trinity_hwmem_free(drv_to_dev_ptr(drv), hwmem.dbuf_fd);
+ if (err == 0)
+ trinity_stat_app_total_freed(drv, dbuf->size);
+ break;
+ }
+ case TRINITY_IOCTL_REGISTER_MODEL: {
+ struct trinity_model *model =
+ kzalloc(sizeof(struct trinity_model), GFP_KERNEL);
+
+ if (!model)
+ return -ENOMEM;
+
+ if (copy_from_user(&model->config,
+ (struct trinity_model __user *)arg,
+ sizeof(model->config))) {
+ kfree(model);
+ return -EFAULT;
+ }
+
+ err = trinity_register_model(drv, model);
+ if (err < 0)
+ break;
+
+ if (copy_to_user((struct trinity_model __user *)arg,
+ &model->config, sizeof(model->config)))
+ return -EFAULT;
+ break;
+ }
+ case TRINITY_IOCTL_DEREGISTER_MODEL: {
+ uint64_t id;
+
+ if (copy_from_user(&id, (uint64_t __user *)arg, sizeof(id)))
+ return -EFAULT;
+
+ err = trinity_deregister_model(drv, id);
+ break;
+ }
+ case TRINITY_IOCTL_RUN_INPUT: {
+ struct trinity_req *req;
+ struct trinity_input *input;
+
+ req = drv->desc->alloc_req(drv);
+ if (!req)
+ return -ENOMEM;
+ req->drv = drv;
+ req->time_started = ktime_get();
+
+ input = &(req->input);
+ /** run input based on config received from the user */
+ if (copy_from_user(&input->config,
+ (struct trinity_input __user *)arg,
+ sizeof(input->config))) {
+ drv->desc->dealloc_req(drv, req);
+ return -EACCES;
+ }
+
+ err = trinity_run_input(drv, input, req);
+ if (err < 0) {
+ drv->desc->dealloc_req(drv, req);
+ return err;
+ }
+
+ if (copy_to_user((struct trinity_input __user *)arg,
+ &input->config, sizeof(input->config))) {
+ drv->desc->dealloc_req(drv, req);
+ return -EACCES;
+ }
+
+ /* this will be freed when stop request is called */
+ if (!req->is_kernel)
+ drv->desc->dealloc_req(drv, req);
+
+ break;
+ }
+ case TRINITY_IOCTL_STOP_REQUESTS: {
+ if (drv->desc->stop_reqs)
+ schedule_work(&drv->work_stop);
+ break;
+ }
+ case TRINITY_IOCTL_STAT_CURRENT_APP: {
+ struct trinity_ioctl_stat_app ioctl_stat_app;
+
+ if (copy_from_user(&ioctl_stat_app,
+ (struct trinity_ioctl_stat_app __user *)arg,
+ sizeof(ioctl_stat_app)))
+ return -EACCES;
+
+ trinity_stat_app_copy_ioctl(drv, &ioctl_stat_app);
+
+ if (copy_to_user((struct trinity_ioctl_stat_app __user *)arg,
+ &ioctl_stat_app, sizeof(ioctl_stat_app)))
+ return -EACCES;
+ break;
+ }
+ case TRINITY_IOCTL_STAT_APPS: {
+ struct trinity_ioctl_stat_apps ioctl_stat_apps;
+
+ if (copy_from_user(&ioctl_stat_apps,
+ (struct trinity_ioctl_stat_apps __user *)arg,
+ sizeof(ioctl_stat_apps)))
+ return -EACCES;
+
+ trinity_stat_apps_copy_ioctl(drv, &ioctl_stat_apps);
+
+ if (copy_to_user((struct trinity_ioctl_stat_apps __user *)arg,
+ &ioctl_stat_apps, sizeof(ioctl_stat_apps)))
+ return -EACCES;
+ break;
+ }
+ case TRINITY_IOCTL_STAT_REQS: {
+ struct trinity_ioctl_stat_reqs ioctl_stat_reqs;
+
+ if (copy_from_user(&ioctl_stat_reqs,
+ (struct trinity_ioctl_stat_reqs __user *)arg,
+ sizeof(ioctl_stat_reqs)))
+ return -EACCES;
+
+ if (ioctl_stat_reqs.app_id == 0)
+ ioctl_stat_reqs.app_id = trinity_get_app_id();
+
+ trinity_stat_reqs_copy_ioctl(drv, &ioctl_stat_reqs);
+
+ if (copy_to_user((struct trinity_ioctl_stat_reqs __user *)arg,
+ &ioctl_stat_reqs, sizeof(ioctl_stat_reqs)))
+ return -EACCES;
+ break;
+ }
+ case TRINITY_IOCTL_GET_PROFILE_META: {
+ struct trinity_ioctl_profile_meta profile;
+
+ if (copy_from_user(
+ &profile,
+ (struct trinity_ioctl_profile_meta __user *)arg,
+ sizeof(profile)))
+ return -EACCES;
+
+ if (drv->desc->get_profile_meta) {
+ err = drv->desc->get_profile_meta(drv, &profile);
+ } else {
+ profile.total_cycles = -1;
+ profile.total_ops = 0;
+ profile.profile_size = 0;
+ profile.input_footprint = -1;
+ profile.output_footprint = -1;
+ }
+
+ if (copy_to_user((struct trinity_ioctl_profile_meta __user *)arg,
+ &profile, sizeof(profile)))
+ return -EACCES;
+ break;
+ }
+ case TRINITY_IOCTL_GET_PROFILE_BUFF: {
+ struct trinity_ioctl_profile_buff profile;
+
+ if (copy_from_user(
+ &profile,
+ (struct trinity_ioctl_profile_buff __user *)arg,
+ sizeof(profile)))
+ return -EACCES;
+
+ if (drv->desc->get_profile_buff)
+ err = drv->desc->get_profile_buff(drv, &profile);
+
+ if (copy_to_user((struct trinity_ioctl_profile_buff __user *)arg,
+ &profile, sizeof(profile)))
+ return -EACCES;
+
+ break;
+ }
+ case TRINITY_IOCTL_FPGA_MEMCPY: {
+ struct trinity_ioctl_fpga_memcpy fpga;
+ struct trinity_hwmem_import import_info;
+ struct iommu_domain *domain;
+ phys_addr_t paddr;
+ void __iomem *vaddr;
+ uint32_t val;
+ uint64_t i;
+
+ if (copy_from_user(
+ &fpga,
+ (struct trinity_ioctl_fpga_memcpy __user *)arg,
+ sizeof(fpga)))
+ return -EFAULT;
+
+ /* make sure that dbuf_off is PAGE_SIZE aligned */
+ if (!IS_ALIGNED(fpga.dbuf_off, PAGE_SIZE)) {
+ dev_err(drv->dev, "Unaligned dmabuf offset: 0x%x\n",
+ fpga.dbuf_off);
+ return -EINVAL;
+ }
+
+ err = trinity_hwmem_import_dmabuf_begin(
+ drv_to_dev_ptr(drv), fpga.dbuf_fd, &import_info);
+ if (err)
+ return err;
+
+ domain = iommu_get_domain_for_dev(drv->dev);
+ paddr = trinity_get_paddr(domain, import_info.dma_addr);
+
+ trinity_hwmem_import_dmabuf_end(&import_info);
+
+ vaddr = ioremap(paddr + fpga.dbuf_off,
+ PAGE_ALIGN(fpga.user_size));
+ if (vaddr == NULL) {
+ dev_err(drv->dev, "Unable to ioremap %lx",
+ (unsigned long)paddr);
+ return -EINVAL;
+ }
+
+ for (i = 0; i < fpga.user_size; i += sizeof(uint32_t)) {
+ val = ioread32((char *)vaddr + i);
+ if (copy_to_user(((char __user *)fpga.user_addr) + i,
+ &val, sizeof(uint32_t))) {
+ err = -EFAULT;
+ break;
+ }
+ }
+
+ iounmap(vaddr);
+
+ break;
+ }
+ default:
+ return -ENOTTY;
+ }
+
+ return err;
+}
+
/**
* trinity_release() - A common callback for close() in file_operations for a
* Trinity device node. If there are device-specific data to be
@@ -121,14 +759,24 @@ int trinity_release(struct inode *inode, struct file *file)
if (drv->verbose)
dev_info(drv_to_dev_ptr(drv), "%s\n", "Device closed");

+ trinity_stat_app_set_status(drv, TRINITY_APP_STATUS_TERMINATED);
+
mutex_lock(&drv->lock);
drv->opened = drv->opened - 1;
if (drv->opened == 0) {
+ /* block newly incoming requests */
+ trinity_sched_suspend();
+
/* wait already submitted requests */
if (drv->desc->drain_reqs)
drv->desc->drain_reqs(drv);

+ /* deregister models owned by this device handle */
+ trinity_deregister_models_owned(drv);
+
drv->desc->set_state(drv, TRINITY_STATE_PAUSE);
+
+ trinity_sched_resume();
}
mutex_unlock(&drv->lock);

@@ -216,6 +864,8 @@ int trinity_open(struct inode *inode, struct file *f)
if (drv->verbose)
dev_info(drv_to_dev_ptr(drv), "%s\n", "Device opened");

+ trinity_stat_app_set_status(drv, TRINITY_APP_STATUS_STARTED);
+
out:
mutex_unlock(&drv->lock);

@@ -301,10 +951,20 @@ static void trinity_common_init(struct device *dev)
if (!trinity_is_empty())
return;

+ trinity_reset_device(dev, true);
+ trinity_model_htable_init();
+
+ if (trinity_pm_runtime_init(dev) < 0)
+ dev_warn(dev, "Unable to initialize runtime pm\n");
+
+ if (trinity_debug_init() < 0)
+ dev_warn(dev, "Unable to initialize debugfs\n");
+
+ if (trinity_sched_init(dev) < 0)
+ dev_warn(dev, "Unable to initialize scheduler\n");
+
if (trinity_declare_dma_memory(dev) < 0)
dev_warn(dev, "Failed to declare DMA memory\n");
-
- /* Common init codes */
}

static void trinity_common_exit(void)
@@ -313,7 +973,8 @@ static void trinity_common_exit(void)
return;

trinity_release_dma_memory();
- /* Common deinit codes */
+ trinity_debug_exit();
+ trinity_sched_exit();
}

static int trinity_set_device_id(struct trinity_driver *drv)
@@ -454,6 +1115,7 @@ int trinity_probe(struct platform_device *pdev, const struct trinity_desc *desc)
dev_err(dev, "IRQ is not supported");
return irq_out;
}
+ trinity_set_irq_affinity(irq_out);

/* get the IRQ number from DT and set handlers for it */
err = devm_request_irq(dev, irq_out, desc->handle_irq,
@@ -478,8 +1140,26 @@ int trinity_probe(struct platform_device *pdev, const struct trinity_desc *desc)
goto err_cleanup;
}

+ err = trinity_sysfs_init(drv);
+ if (err < 0) {
+ dev_err(dev, "failed to initialize sysfs for a trinity device");
+ goto err_cleanup;
+ }
+
+ err = trinity_debug_add(drv);
+ if (err < 0) {
+ dev_err(dev,
+ "failed to add a debugging feature to the trinity device");
+ goto err_cleanup_sysfs;
+ }
+
+ trinity_stat_init(drv);
+
return 0;

+err_cleanup_sysfs:
+ trinity_sysfs_cleanup(drv);
+
err_cleanup:
spin_lock(&trinity_lock);
clear_bit(drv->dev_id, &dev_bitmap[dev->id]);
@@ -508,6 +1188,10 @@ int trinity_remove(struct platform_device *pdev,
drv = (struct trinity_driver *)platform_get_drvdata(pdev);
dev = drv_to_dev_ptr(drv);

+ trinity_stat_fini(drv);
+ trinity_debug_remove(drv);
+ trinity_sysfs_cleanup(drv);
+
spin_lock(&trinity_lock);
clear_bit(drv->dev_id, &dev_bitmap[dev->id]);
spin_unlock(&trinity_lock);
diff --git a/drivers/misc/trinity/trinity_common.h b/drivers/misc/trinity/trinity_common.h
index c70f66722391..461abca88c96 100644
--- a/drivers/misc/trinity/trinity_common.h
+++ b/drivers/misc/trinity/trinity_common.h
@@ -27,6 +27,9 @@
#include <linux/types.h>
#include <uapi/misc/trinity.h>

+#include "sched/sched.h"
+#include "trinity_hwmem.h"
+
/** Default timeout to wait for opening device in jiffies */
#define TRINITY_DEV_TIMEOUT_MSEC (3000)
#define TRINITY_DEV_TIMEOUT (msecs_to_jiffies(TRINITY_DEV_TIMEOUT_MSEC))
@@ -289,6 +292,7 @@ struct trinity_driver {
*/
struct trinity_model {
struct trinity_ioctl_model config;
+ struct trinity_hwmem_import import_info;
struct hlist_bl_node hnode;
int32_t owner_id;
struct kref refcnt;
@@ -301,6 +305,7 @@ struct trinity_model {
*/
struct trinity_input {
struct trinity_ioctl_input config;
+ struct trinity_hwmem_import import_info;
} __packed;

/**
@@ -377,14 +382,19 @@ static inline int32_t trinity_get_app_id(void)
*/
int trinity_create_node(struct trinity_driver *drv);
void trinity_destroy_node(struct trinity_driver *drv);
-int trinity_wait_ready(struct trinity_driver *drv);
+int trinity_idu_load(struct trinity_driver *drv, const char *dirpath);
void trinity_init_model_htable(const struct trinity_driver *drv,
struct trinity_model_htable *ht);
+int32_t trinity_get_app_id(void);
+void trinity_finish_req(struct trinity_driver *drv, struct trinity_req *req);
phys_addr_t trinity_get_paddr(struct iommu_domain *domain, dma_addr_t daddr);
+struct trinity_sched_desc *get_trinity_sched(struct trinity_req *req);
+int trinity_wait_ready(struct trinity_driver *drv);

/* File operations */
int trinity_open(struct inode *inode, struct file *f);
int trinity_release(struct inode *inode, struct file *f);
+long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg);

/* Device probing and removing */
int trinity_probe(struct platform_device *pdev,
@@ -399,7 +409,6 @@ int trinity_sysfs_cleanup(struct trinity_driver *drv);
/* debugfs operations */
int trinity_debug_init(void);
void trinity_debug_exit(void);
-
int trinity_debug_add(struct trinity_driver *drv);
void trinity_debug_remove(struct trinity_driver *drv);
void trinity_debug_clear(struct trinity_driver *drv, unsigned long msg_max);
@@ -412,4 +421,14 @@ void trinity_debug_dump_input(struct trinity_driver *drv,
const struct trinity_input *input,
const char *fmt, ...);

+/* pm operations */
+int trinity_pm_runtime_init(struct device *dev);
+int trinity_pm_runtime_forbid(struct device *dev);
+void trinity_pm_runtime_allow(struct device *dev);
+void trinity_pm_runtime_attach(struct device *dev);
+
+void trinity_reset_device(struct device *dev, bool do_test);
+void trinity_set_irq_affinity(int irq);
+void trinity_monitor_invalid_access(void);
+
#endif /* __TRINITY_COMMON_H__ */
diff --git a/drivers/misc/trinity/trinity_pm.c b/drivers/misc/trinity/trinity_pm.c
new file mode 100644
index 000000000000..bf292e168422
--- /dev/null
+++ b/drivers/misc/trinity/trinity_pm.c
@@ -0,0 +1,76 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/**
+ * Power Management Controls
+ *
+ * Copyright (C) 2022 Samsung Electronics
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/of_address.h>
+#include <linux/pm_runtime.h>
+
+/**
+ * trinity_pm_runtime_init() - Initialize runtime power management
+ * @dev: device structure
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
+int trinity_pm_runtime_init(struct device *dev)
+{
+ return 0;
+}
+
+/**
+ * trinity_pm_runtime_forbid() - Block runtime power management
+ *
+ * @dev: device structure
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
+int trinity_pm_runtime_forbid(struct device *dev)
+{
+ pm_runtime_forbid(dev);
+ return 0;
+}
+
+/**
+ * trinity_pm_runtime_allow() - Allow runtime power management
+ *
+ * @dev: device structure
+ */
+void trinity_pm_runtime_allow(struct device *dev)
+{
+ pm_runtime_allow(dev);
+}
+
+/**
+ * trinity_pm_runtime_attach() - Attach runtime power management
+ *
+ * @dev: device structure
+ */
+void trinity_pm_runtime_attach(struct device *dev)
+{
+}
+
+/**
+ * trinity_reset_device() - Reset device
+ *
+ * @dev: device structure
+ * @do_test: test reset
+ */
+void trinity_reset_device(struct device *dev, bool do_test)
+{
+}
+
+/**
+ * trinity_set_irq_affinity() - Set affinity for the IRQ
+ *
+ * @irq: IRQ number
+ */
+void trinity_set_irq_affinity(int irq)
+{
+}
diff --git a/drivers/misc/trinity/trinity_vision2_drv.c b/drivers/misc/trinity/trinity_vision2_drv.c
index 9e616466c57b..ddc1739afdd8 100644
--- a/drivers/misc/trinity/trinity_vision2_drv.c
+++ b/drivers/misc/trinity/trinity_vision2_drv.c
@@ -162,13 +162,48 @@ struct triv2_pdata {

/* back buffer for context switching */
struct trinity_resv_mem back_buf;
+
+ /* profiling */
+ struct trinity_resv_mem prof_buf;
+ struct mutex prof_lock;
+ DECLARE_HASHTABLE(prof_htable, TRIV2_PROFILE_HASH_BITS);
};

+static void triv2_handle_cmd_done(struct trinity_driver *drv,
+ struct triv2_cmd *cmd, bool timeout);
static void triv2_setup_buffers(struct trinity_driver *drv);
static int triv2_idu_load(struct trinity_driver *drv, const char *dirpath,
bool load_files);

static LIST_HEAD(triv2_driver_list);
+static struct hlist_bl_head triv2_model_node_hlist[TRIV2_MODEL_HASH_SIZE];
+
+static struct triv2_profile *
+triv2_find_profile(const struct trinity_driver *drv, int req_id)
+{
+ /* find profile */
+
+ return NULL;
+}
+
+static void triv2_fini_profile(struct trinity_resv_mem *prof_buf)
+{
+ /* finish profile */
+}
+
+static void triv2_init_profile(struct trinity_driver *drv,
+ unsigned long profile_size)
+{
+ /* init profile */
+}
+
+static int32_t triv2_check_profile(struct trinity_driver *drv,
+ struct trinity_req *req)
+{
+ /* check profile */
+
+ return 0;
+}

/**
* triv2_get_state() - Get state (TRINITY_STATE_READY/TRINITY_STATE_PAUSE) of the device.
@@ -230,6 +265,31 @@ static void triv2_set_state(const struct trinity_driver *drv,
}
}

+/**
+ * triv2_sync_segt_entries() - synchronize the segment table entries
+ */
+static int triv2_sync_segt_entries(const struct trinity_driver *drv,
+ struct triv2_req *req)
+{
+#ifdef ARM
+ struct trinity_input *input = &(req->req.input);
+ int i;
+
+ /* flush all caches for heavy models */
+ if (req->total_segment_size > TRIV2_CACHE_FLUSH_THRESHOLD ||
+ /* cannot handle external segments for kernel requests */
+ req->kernel != NULL) {
+ flush_cache_all();
+ return 0;
+ }
+
+ for (i = 0; i < input->config.num_segments; ++i)
+ __cpuc_flush_dcache_area(req->seg_import[i].addr,
+ req->seg_import[i].buf->size);
+#endif
+ return 0;
+}
+
static void triv2_wakeup_cp(const struct trinity_driver *drv)
{
void *addr =
@@ -241,18 +301,55 @@ static void triv2_wakeup_cp(const struct trinity_driver *drv)
static void triv2_cancel_reqs(struct trinity_driver *drv)
{
struct triv2_cmd_info *info;
+ struct triv2_cmd *cmd;
unsigned long flags;
+ int slot;

info = TRIV2_DRV_GET_CMD_INFO(drv);
spin_lock_irqsave(&info->lock, flags);

- /* set command done */
+ slot = find_first_bit(info->bitmap, TRIV2_MAX_CMDSLOTS);
+ while (slot < TRIV2_MAX_CMDSLOTS) {
+ cmd = TRIV2_GET_CMD_FROM_SLOT(info, slot);
+ triv2_handle_cmd_done(drv, cmd, true);
+ slot = find_next_bit(info->bitmap, TRIV2_MAX_CMDSLOTS,
+ slot + 1);
+ }
+
+ spin_unlock_irqrestore(&info->lock, flags);
+}
+
+static void triv2_drain_reqs(struct trinity_driver *drv)
+{
+ struct triv2_cmd_info *info;
+ unsigned long flags;
+ int cur_retries, max_retries = 1000; /* 1-sec */
+ int slot;
+
+ cur_retries = 0;
+ info = TRIV2_DRV_GET_CMD_INFO(drv);
+retry:
+ spin_lock_irqsave(&info->lock, flags);
+
+ /* wait until all bits are unset */
+ slot = find_first_bit(info->bitmap, TRIV2_MAX_CMDSLOTS);
+ if (slot < TRIV2_MAX_CMDSLOTS) {
+ spin_unlock_irqrestore(&info->lock, flags);
+
+ usleep_range(900, 1100);
+ if (cur_retries++ < max_retries)
+ goto retry;
+
+ spin_lock_irqsave(&info->lock, flags);
+ }

spin_unlock_irqrestore(&info->lock, flags);
}

static void triv2_reset_devices(struct trinity_driver *drv, bool do_test)
{
+ trinity_reset_device(drv_to_dev_ptr(drv), do_test);
+
triv2_setup_buffers(drv);
triv2_idu_load(drv, NULL, false);
}
@@ -270,6 +367,12 @@ static void triv2_reset(struct trinity_driver *drv)

dev_err(dev, "NPU HW reset started");

+ /* block runtime pm suspend */
+ trinity_pm_runtime_forbid(dev);
+
+ /* block new incoming requests first */
+ trinity_sched_suspend();
+
/* cancel all requests by force */
list_for_each_entry(pdata, &triv2_driver_list, list)
triv2_cancel_reqs(pdata->drv);
@@ -286,22 +389,331 @@ static void triv2_reset(struct trinity_driver *drv)
do_test = false;
}

+ /* resume scheduler */
+ trinity_sched_resume();
+
+ trinity_pm_runtime_allow(dev);
+
dev_err(dev, "NPU HW reset completed");

list_for_each_entry(pdata, &triv2_driver_list, list)
mutex_unlock(&pdata->drv->lock);
}

-static long triv2_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+static void triv2_clear_cmd(struct trinity_driver *drv, struct triv2_req *req,
+ struct triv2_cmd *cmd)
+{
+ struct triv2_cmd_info *cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+
+ cmd_info->reqs[req->cmd_slot] = NULL;
+ clear_bit(req->cmd_slot, cmd_info->bitmap);
+ req->cmd_slot = -1;
+
+ memset_io(cmd, '\x00', sizeof(struct triv2_cmd));
+}
+
+static void triv2_handle_cmd_done(struct trinity_driver *drv,
+ struct triv2_cmd *cmd, bool timeout)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+ struct triv2_cmd_info *cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+ struct triv2_req *t_req;
+ struct trinity_req *req;
+ struct trinity_sched_desc *sched;
+ uint32_t slot = cmd->slot;
+ int64_t time_diff;
+
+ t_req = cmd_info->reqs[slot];
+ if (!t_req) {
+ dev_err(dev, "Failed to find the req\n");
+ return;
+ }
+
+ req = &(t_req->req);
+ req->stat->completed = ktime_get();
+ req->stat->status = TRINITY_REQ_STATUS_FINISHED;
+
+ time_diff = TIME_DIFF_US(req->stat->completed, req->stat->scheduled);
+ if (time_diff < 0) {
+ dev_warn(dev, "Detected invalid inference time of request\n");
+ } else {
+ req->stat->prev_time = (uint32_t)time_diff;
+ req->stat->prev_cycles = cmd->total_cycles;
+ req->stat->num_runs++;
+ req->stat->total_time += req->stat->prev_time;
+ }
+
+ t_req->total_cycles = cmd->total_cycles;
+ t_req->profile_offset = cmd->profile_offset;
+
+ triv2_clear_cmd(drv, t_req, cmd);
+
+ /* notify the scheduler */
+ sched = get_trinity_sched(req);
+ if (sched && sched->notify)
+ sched->notify(req, timeout);
+
+ /* notify the caller */
+ if (!req->is_kernel)
+ complete_all(&req->complete);
+}
+
+static void triv2_handle_timeout(struct trinity_driver *drv,
+ struct trinity_req *req)
+{
+ struct triv2_cmd_info *cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+ struct triv2_cmd *cmd;
+ struct triv2_req *t;
+ unsigned long flags;
+
+ t = TRIV2_GET_REQ(req);
+
+ spin_lock_irqsave(&cmd_info->lock, flags);
+ if (t->cmd_slot >= 0) {
+ /* Timeout! check whether it's not handled in irq handler */
+ cmd = TRIV2_GET_CMD_FROM_SLOT(cmd_info, t->cmd_slot);
+ triv2_handle_cmd_done(drv, cmd, true);
+ }
+ spin_unlock_irqrestore(&cmd_info->lock, flags);
+}
+
+/**
+ * triv2_stop_reqs() - stop the submitted reqs to the driver
+ *
+ * For a req that is already executing, each device needs to determine the
+ * policy depending on its capability to terminate the running one.
+ */
+static void triv2_stop_reqs(struct work_struct *work)
+{
+ struct trinity_driver *drv;
+
+ drv = container_of(work, struct trinity_driver, work_stop);
+ if (drv == NULL)
+ return;
+
+ triv2_cancel_reqs(drv);
+}
+
+static void triv2_handle_irq_cmds(struct trinity_driver *drv)
+{
+ struct triv2_cmd_info *info;
+ struct triv2_cmd *cmd;
+ unsigned long flags;
+ int slot;
+
+ info = TRIV2_DRV_GET_CMD_INFO(drv);
+ spin_lock_irqsave(&info->lock, flags);
+
+ /** Search the bitmap to find the completed CMDs */
+ slot = find_first_bit(info->bitmap, TRIV2_MAX_CMDSLOTS);
+ while (slot < TRIV2_MAX_CMDSLOTS) {
+ cmd = TRIV2_GET_CMD_FROM_SLOT(info, slot);
+ if (cmd->status == STATUS_CMD_DONE)
+ triv2_handle_cmd_done(drv, cmd, false);
+ slot = find_next_bit(info->bitmap, TRIV2_MAX_CMDSLOTS,
+ slot + 1);
+ }
+
+ spin_unlock_irqrestore(&info->lock, flags);
+}
+
+/**
+ * triv2_handle_irq() - An IRQ handler to be called when a registered IRQ (IRQ_OUT) occurs.
+ */
+static irqreturn_t triv2_handle_irq(int irq_no, void *dev_id)
+{
+ struct miscdevice *_mdev;
+ struct trinity_driver *drv;
+ void __iomem *addr;
+ uint32_t interrupt;
+ uint32_t reg;
+
+ _mdev = (struct miscdevice *)dev_id;
+ drv = container_of(_mdev, struct trinity_driver, mdev);
+
+ /**
+ * Verify that the IRQ is actually from the NPU.
+ * This is required because IRQF_SHARED is used when setting up the IRQ.
+ */
+ addr = trinity_get_iomem_addr(drv->mmreg_vaddr[2],
+ OFFSET_CBOX_EXT_IRQ_STA);
+ reg = ioread32(addr);
+
+ interrupt = reg & MASK_CP_SWI_STA;
+ if (interrupt == 0)
+ return IRQ_NONE;
+
+ /** Clear the interrupt first */
+ addr = trinity_get_iomem_addr(drv->mmreg_vaddr[2],
+ OFFSET_CBOX_CP_SWI_CLR);
+ iowrite32(1, addr);
+
+ triv2_handle_irq_cmds(drv);
+ return IRQ_HANDLED;
+}
+
+/**
+ * triv2_prepare_req() - evaluate the physical address of entries in the segment table
+ */
+static int32_t triv2_prepare_req(struct trinity_driver *drv,
+ struct trinity_req *req)
{
- /* handle ioctl */
+ struct triv2_req *t = TRIV2_GET_REQ(req);
+ struct trinity_input *input = &(req->input);
+ struct trinity_hwmem_import *segt_import = &(input->import_info);
+ int32_t *segtable_dbuffd_base;
+ uint32_t *segtable_extra_base;
+ int ret, i;
+
+ if (input->config.num_segments == 0)
+ return -EINVAL;
+
+ if (input->config.num_segments > TRIV2_MAX_SEGMENTS)
+ return -ERANGE;
+
+ t->seg_import = kcalloc(input->config.num_segments,
+ sizeof(struct trinity_hwmem_import),
+ GFP_KERNEL);
+ if (!t->seg_import)
+ return -ENOMEM;
+
+ /* dmabuf fd to be resolved */
+ segtable_dbuffd_base = segt_import->addr;
+ /* extra value (e.g., offset or size) */
+ segtable_extra_base = segt_import->addr + HALF_PAGE_SIZE;
+
+#ifdef ARM
+ /* sync segment table */
+ __cpuc_flush_dcache_area(input->import_info.addr,
+ input->import_info.buf->size);
+#endif
+
+ for (i = 0; i < input->config.num_segments; ++i) {
+ struct trinity_hwmem_import *import;
+ int32_t fd = segtable_dbuffd_base[i];
+ dma_addr_t daddr;
+
+ if (fd < 0) {
+ uint32_t idx = (uint32_t)((fd + 1) * -1);
+ struct triv2_kernel_req *kreq;
+
+ /* it's for kernel input/output */
+ if (!req->is_kernel) {
+ req->is_kernel = true;
+ kreq = kzalloc(sizeof(*kreq), GFP_KERNEL);
+ if (!kreq) {
+ ret = -ENOMEM;
+ goto err;
+ }
+ t->kernel = kreq;
+ }
+
+ kreq = t->kernel;
+ if (idx < TRIV2_MAX_TENSORS) {
+ kreq->in_seg_idx[idx] = i;
+ kreq->in_seg_size[idx] = segtable_extra_base[i];
+ t->total_segment_size += kreq->in_seg_size[idx];
+ } else if (idx < TRIV2_MAX_TENSORS * 2) {
+ idx -= TRIV2_MAX_TENSORS;
+ kreq->out_seg_idx[idx] = i;
+ kreq->out_seg_size[idx] =
+ segtable_extra_base[i];
+ t->total_segment_size +=
+ kreq->out_seg_size[idx];
+ } else {
+ dev_err(drv_to_dev_ptr(drv),
+ "Invalid external segment (idx: %u)",
+ idx);
+ ret = -EINVAL;
+ goto err;
+ }
+ continue;
+ }
+
+ import = &(t->seg_import[i]);
+ ret = trinity_hwmem_import_dmabuf_begin(drv_to_dev_ptr(drv), fd,
+ import);
+ if (ret) {
+ dev_err(drv_to_dev_ptr(drv),
+ "%d-th segment with fd (%d) seems invalid: %d",
+ i, fd, ret);
+ goto err;
+ }
+
+ t->total_segment_size += import->buf->size;
+
+ /** @todo Use a local ptr variable */
+ daddr = import->dma_addr;
+ daddr += segtable_extra_base[i];
+
+ iowrite32(TRIV2_IDU_ADDR(daddr),
+ segt_import->addr + i * sizeof(u32));
+ }
+
+ /* set the dma address of DSPM (reserved index: TRIV2_MAX_SEGMENTS - 1) */
+ if (drv->dspm > 0) {
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+
+ iowrite32(TRIV2_IDU_ADDR(pdata->idu_dsp.dspm),
+ segt_import->addr +
+ (TRIV2_MAX_SEGMENTS - 1) * sizeof(u32));
+ }

return 0;
+
+err:
+ kfree(t->seg_import);
+ t->seg_import = NULL;
+ return ret;
+}
+
+/**
+ * triv2_invoke_req() - Invoke a req on the device. Note that all configurations
+ * required by running should be done before invocation of this function.
+ */
+static int32_t triv2_invoke_req(struct trinity_driver *drv,
+ struct trinity_req *req, void *sched_data)
+{
+ /* invoke request */
+
+ return 0;
+}
+
+static long triv2_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+ struct trinity_driver *drv = f->private_data;
+ struct device *dev = drv_to_dev_ptr(drv);
+ long ret;
+
+ if (trinity_pm_runtime_forbid(dev) != 0)
+ return -EBUSY;
+
+ ret = trinity_ioctl(f, cmd, arg);
+
+ trinity_pm_runtime_allow(dev);
+
+ return ret;
}

static int triv2_open(struct inode *inode, struct file *f)
{
- return trinity_open(inode, f);
+ struct miscdevice *miscdev;
+ struct trinity_driver *drv;
+ struct device *dev;
+ int ret;
+
+ miscdev = (struct miscdevice *)f->private_data;
+ drv = container_of(miscdev, struct trinity_driver, mdev);
+ dev = drv_to_dev_ptr(drv);
+
+ if (trinity_pm_runtime_forbid(dev) != 0)
+ return -EBUSY;
+
+ ret = trinity_open(inode, f);
+
+ trinity_pm_runtime_allow(dev);
+
+ return ret;
}

static const struct file_operations triv2_fops = {
@@ -624,11 +1036,13 @@ static void triv2_setup_buffers(struct trinity_driver *drv)
struct iommu_domain *domain;
struct trinity_resv_mem *cmd_buf;
struct trinity_resv_mem *back_buf;
+ struct trinity_resv_mem *prof_buf;
phys_addr_t paddr;

domain = iommu_get_domain_for_dev(dev);
cmd_buf = TRIV2_DRV_GET_CMD_BUF(drv);
back_buf = TRIV2_DRV_GET_BACK_BUF(drv);
+ prof_buf = TRIV2_DRV_GET_PROF_BUF(drv);

/* command */
paddr = trinity_get_paddr(domain, cmd_buf->daddr);
@@ -641,14 +1055,34 @@ static void triv2_setup_buffers(struct trinity_driver *drv)
OFFSET_NPU_BACK_ADDR));
iowrite32(back_buf->size, trinity_get_iomem_addr(drv->mmreg_vaddr[0],
OFFSET_NPU_BACK_SIZE));
+
+ /* profile */
+ if (prof_buf->size > 0) {
+ paddr = trinity_get_paddr(domain, prof_buf->daddr);
+ iowrite32(TRIV2_IDU_ADDR(paddr),
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_ADDR));
+ iowrite32(prof_buf->size,
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_SIZE));
+ } else {
+ iowrite32(0, trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_ADDR));
+ iowrite32(0, trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_SIZE));
+ }
}

static int32_t triv2_init_pdata(struct trinity_driver *drv)
{
+ struct device *dev = drv_to_dev_ptr(drv);
struct triv2_pdata *pdata;
struct triv2_cmd_info *cmd_info;
struct trinity_resv_mem *cmd_buf;
struct trinity_resv_mem *back_buf;
+ int status;
+
+ trinity_pm_runtime_attach(dev);

/* alloc triv2 pdata */
drv->pdata = kzalloc(sizeof(struct triv2_pdata), GFP_KERNEL);
@@ -662,13 +1096,42 @@ static int32_t triv2_init_pdata(struct trinity_driver *drv)
cmd_buf = TRIV2_DRV_GET_CMD_BUF(drv);
back_buf = TRIV2_DRV_GET_BACK_BUF(drv);

+ mutex_init(&pdata->prof_lock);
spin_lock_init(&cmd_info->lock);
/* init cmd bitmap */
bitmap_zero(cmd_info->bitmap, TRIV2_MAX_CMDSLOTS);

+ /* alloc command buffer */
+ status = trinity_alloc_from_resv_mem(PAGE_SIZE, cmd_buf, false);
+ if (status < 0) {
+ dev_err(dev, "Couldn't allocate memory for cmd slots");
+ goto free_pdata;
+ }
+ /* ensure the cmd buffer is zero-initialized, which is visible to the NPU as well */
+ memset_io(cmd_buf->vaddr, '\x00', PAGE_SIZE);
+
+ /* alloc backup buffer for preemption (GBUF + DSPM) */
+ status = trinity_alloc_from_resv_mem(TRIV2_DLA_GBUFFER_SIZE + drv->dspm,
+ back_buf, false);
+ if (status < 0) {
+ dev_err(dev,
+ "Couldn't allocate memory for context backup buffer");
+ goto free_cmd_info;
+ }
+
+ triv2_setup_buffers(drv);
list_add_tail(&pdata->list, &triv2_driver_list);

return 0;
+
+free_cmd_info:
+ dma_free_wc(drv_to_dev_ptr(drv), PAGE_SIZE, cmd_buf->vaddr,
+ cmd_buf->daddr);
+free_pdata:
+ kfree(drv->pdata);
+ drv->pdata = NULL;
+
+ return status;
}

static int32_t parse_idu_property(struct device *dev,
@@ -812,6 +1275,98 @@ static struct trinity_desc triv2_desc = {
.dealloc_req = triv2_dealloc_req,
.prepare_req = triv2_prepare_req,
.invoke_req = triv2_invoke_req,
+ /* profile */
+ .init_profile = triv2_init_profile,
+ .check_profile = triv2_check_profile,
+ .get_profile_meta = triv2_get_profile_meta,
+ .get_profile_buff = triv2_get_profile_buff,
+ .show_profile = triv2_show_profile,
+ .destroy_profile = triv2_destroy_profile,
+ /* etc. */
+ .handle_timeout = triv2_handle_timeout,
+ .stop_reqs = triv2_stop_reqs,
+ .drain_reqs = triv2_drain_reqs,
+ .handle_irq = triv2_handle_irq,
+};
+
+static int triv2_suspend(struct device *dev)
+{
+ return 0;
+}
+
+static int triv2_resume(struct device *dev)
+{
+ return 0;
+}
+
+static int triv2_runtime_suspended;
+static int triv2_runtime_resumed;
+
+static int triv2_runtime_suspend(struct device *dev)
+{
+ struct trinity_driver *drv;
+
+ drv = (struct trinity_driver *)dev_get_drvdata(dev);
+ if (!drv) {
+ dev_warn(dev, "Cannot find driver data");
+ return 0;
+ }
+
+ mutex_lock(&drv->lock);
+
+ /* 1) Ensure that the scheduler was suspended */
+ trinity_sched_suspend();
+
+ /* 2) Set pause state if it's in ready state */
+ if (triv2_get_state(drv) == TRINITY_STATE_READY)
+ triv2_set_state(drv, TRINITY_STATE_PAUSE);
+
+ mutex_unlock(&drv->lock);
+
+ triv2_runtime_suspended++;
+
+ return 0;
+}
+
+static int triv2_runtime_resume(struct device *dev)
+{
+ struct trinity_driver *drv;
+
+ drv = (struct trinity_driver *)dev_get_drvdata(dev);
+ if (!drv) {
+ dev_warn(dev, "Cannot find driver data");
+ return 0;
+ }
+
+ /* 0) Reset NPU devices (only once) */
+ trinity_reset_device(dev, triv2_runtime_resumed == 0);
+
+ mutex_lock(&drv->lock);
+
+ /* 1) Restore IDU setup */
+ triv2_setup_buffers(drv);
+ triv2_idu_load(drv, NULL, false);
+
+ /* 2) Set ready state if it was in ready state before */
+ if (drv->opened > 0)
+ triv2_set_state(drv, TRINITY_STATE_READY);
+
+ /* 3) Resume the req scheduler */
+ trinity_sched_resume();
+
+ mutex_unlock(&drv->lock);
+
+ if (++triv2_runtime_resumed == triv2_runtime_suspended)
+ triv2_runtime_resumed = triv2_runtime_suspended = 0;
+
+ return 0;
+}
+
+static const struct dev_pm_ops triv2_dev_pm_ops = {
+ // clang-format off
+ SET_SYSTEM_SLEEP_PM_OPS(triv2_suspend, triv2_resume)
+ SET_RUNTIME_PM_OPS(triv2_runtime_suspend, triv2_runtime_resume, NULL)
+ // clang-format on
};

static const struct of_device_id trinity_match[] = {
--
2.25.1
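
Note on the segment-table encoding in triv2_prepare_req() above: a
non-negative entry is treated as a dmabuf fd, while a negative entry is
mapped to a kernel tensor index via (fd + 1) * -1, with the first
TRIV2_MAX_TENSORS indices used for inputs and the next TRIV2_MAX_TENSORS
for outputs. The user-space sketch below restates that mapping so it can
be checked in isolation. It is an illustration only and not part of the
patch: SKETCH_MAX_TENSORS (assumed to be 16 here), enum seg_kind,
struct seg_class and classify_segment_fd() are hypothetical names, and
the real value of TRIV2_MAX_TENSORS is not shown in this patch.

```c
#include <stdint.h>

/* Assumed value for illustration; the driver's TRIV2_MAX_TENSORS may differ. */
#define SKETCH_MAX_TENSORS 16

/* Hypothetical classification of one segment-table entry. */
enum seg_kind {
	SEG_DMABUF,		/* entry is a dmabuf fd to be imported */
	SEG_KERNEL_INPUT,	/* entry refers to a kernel input tensor */
	SEG_KERNEL_OUTPUT,	/* entry refers to a kernel output tensor */
	SEG_INVALID		/* negative entry outside both ranges */
};

struct seg_class {
	enum seg_kind kind;
	uint32_t tensor_idx;	/* meaningful only for kernel input/output */
};

/* Mirror the fd decoding used in the patch: -1 -> 0, -2 -> 1, and so on. */
static struct seg_class classify_segment_fd(int32_t fd)
{
	struct seg_class c = { SEG_DMABUF, 0 };
	uint32_t idx;

	if (fd >= 0)
		return c;

	idx = (uint32_t)((fd + 1) * -1);

	if (idx < SKETCH_MAX_TENSORS) {
		c.kind = SEG_KERNEL_INPUT;
		c.tensor_idx = idx;
	} else if (idx < SKETCH_MAX_TENSORS * 2) {
		c.kind = SEG_KERNEL_OUTPUT;
		c.tensor_idx = idx - SKETCH_MAX_TENSORS;
	} else {
		c.kind = SEG_INVALID;
	}

	return c;
}
```

Under these assumptions, an entry of -1 maps to kernel input 0, -17 maps
to kernel output 0, and -33 falls outside both ranges, which corresponds
to the -EINVAL path in the patch.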

2022-07-25 09:10:13

by Greg Kroah-Hartman

Subject: Re: [PATCH 0/9] Samsung Trinity NPU device driver

On Mon, Jul 25, 2022 at 03:52:59PM +0900, Jiho Chu wrote:
> Hello,
>
> My name is Jiho Chu, and working for device driver and system daemon for
> several years at Samsung Electronics.
>
> Trinity Neural Processing Unit (NPU) series are hardware accelerators
> for neural network processing in embedded systems, which are integrated
> into application processors or SoCs. Trinity NPU is compatible with AMBA
> bus architecture and first launched in 2018 with its first version for
> vision processing, Trinity Version1 (TRIV1). Its second version, TRIV2,
> is released in Dec, 2021. Another Trinity NPU for audio processing is
> referred as TRIA.
>
> TRIV2 is shipped for many models of 2022 Samsung TVs, providing
> acceleration for various AI-based applications, which include image
> recognition and picture quality improvements for streaming video, which
> can be accessed via GStreamer and its neural network plugins,
> NNStreamer.
>
> In this patch set, it includes Trinity Vision 2 kernel device driver.
> Trinity Vision 2 supports accelerating image inference process for
> Convolution Neural Network (CNN). The CNN workload is executed by Deep
> Learning Accelerator (DLA), and general Neural Network Layers are
> executed by Digital Signal Processor (DSP). And there is a Control
> Processor (CP) which can control DLA and DSP. These three IPs (DLA, DSP,
> CP) are composing Trinity Vision 2 NPU, and the device driver mainly
> supervise the CP to manage entire NPU.
>
> Controlling DLA and DSP operations is performed with internal command
> instructions. and the instructions for the Trinity is similar with
> general processor's ISA, but it is specialized for Neural Processing
> operations. The virtual ISA (vISA) is designed for calculating multiple
> data with single operation, like modern SIMD processor. The device
> driver loads a program to CP at start up, and the program can decode a
> binary which is built with the vISA. We calls this decoding program as a
> Instruction Decoding Unit (IDU) program. While running the NPU, the CP
> executes IDU program to fetch and decode instructions which made up of
> vISA, by the scheduling policy of the device driver.
>
> These DLA, DSP and CP are loosely coupled using ARM's AMBA, so the
> Trinity can easily communicate with most ARM processors. Each IPs
> designed to have memory-mapped registers which can be used to control
> the IP, and the CP provides Wait-For-Event (WFE) operation to subscribe
> interrupt signals from the DLA and DSP. Also, embedded Direct Memory
> Access Controller (DMAC) manages data communications between internal
> SRAM and outer main memory, IOMMU module supports unified memory space.
>
> A user can control the Trinity NPU with IOCTLs provided by driver. These
> controls includes memory management operations to transfer model data
> (HWMEM_ALLOC/HWMEM_DEALLOC), NPU workload control operations to submit
> workload (RUN/STOP), and statistics operations to check current NPU
> status. (STAT)
>
> The device driver also implemented features for developers. It provides
> sysfs control attributes like stop, suspend, sched_test, and profile.
> Also, it provides status attributes like app status, a number of total
> requests, a number of active requests and memory usages. For the tracing
> operations, several ftrace events are defined and embedded for several
> important points.

If you have created sysfs files, you need to document them in
Documentation/ABI/ which I do not see in your diffstat. Perhaps add
that for your next respin?

Also, please remove the "tracing" logic you have in the code and use
ftrace instead. Don't abuse dev_info() everywhere; that's not needed at all.

thanks,

greg k-h

2022-07-25 09:51:55

by Oded Gabbay

Subject: Re: [PATCH 0/9] Samsung Trinity NPU device driver

On Mon, Jul 25, 2022 at 12:02 PM Greg KH <[email protected]> wrote:
>
> On Mon, Jul 25, 2022 at 03:52:59PM +0900, Jiho Chu wrote:
> > Hello,
> >
> > My name is Jiho Chu, and working for device driver and system daemon for
> > several years at Samsung Electronics.
> >
> > Trinity Neural Processing Unit (NPU) series are hardware accelerators
> > for neural network processing in embedded systems, which are integrated
> > into application processors or SoCs. Trinity NPU is compatible with AMBA
> > bus architecture and first launched in 2018 with its first version for
> > vision processing, Trinity Version1 (TRIV1). Its second version, TRIV2,
> > is released in Dec, 2021. Another Trinity NPU for audio processing is
> > referred as TRIA.
> >
> > TRIV2 is shipped for many models of 2022 Samsung TVs, providing
> > acceleration for various AI-based applications, which include image
> > recognition and picture quality improvements for streaming video, which
> > can be accessed via GStreamer and its neural network plugins,
> > NNStreamer.
> >
> > In this patch set, it includes Trinity Vision 2 kernel device driver.
> > Trinity Vision 2 supports accelerating image inference process for
> > Convolution Neural Network (CNN). The CNN workload is executed by Deep
> > Learning Accelerator (DLA), and general Neural Network Layers are
> > executed by Digital Signal Processor (DSP). And there is a Control
> > Processor (CP) which can control DLA and DSP. These three IPs (DLA, DSP,
> > CP) are composing Trinity Vision 2 NPU, and the device driver mainly
> > supervise the CP to manage entire NPU.
> >
> > Controlling DLA and DSP operations is performed with internal command
> > instructions. and the instructions for the Trinity is similar with
> > general processor's ISA, but it is specialized for Neural Processing
> > operations. The virtual ISA (vISA) is designed for calculating multiple
> > data with single operation, like modern SIMD processor. The device
> > driver loads a program to CP at start up, and the program can decode a
> > binary which is built with the vISA. We calls this decoding program as a
> > Instruction Decoding Unit (IDU) program. While running the NPU, the CP
> > executes IDU program to fetch and decode instructions which made up of
> > vISA, by the scheduling policy of the device driver.
> >
> > These DLA, DSP and CP are loosely coupled using ARM's AMBA, so the
> > Trinity can easily communicate with most ARM processors. Each IPs
> > designed to have memory-mapped registers which can be used to control
> > the IP, and the CP provides Wait-For-Event (WFE) operation to subscribe
> > interrupt signals from the DLA and DSP. Also, embedded Direct Memory
> > Access Controller (DMAC) manages data communications between internal
> > SRAM and outer main memory, IOMMU module supports unified memory space.
> >
> > A user can control the Trinity NPU with IOCTLs provided by driver. These
> > controls includes memory management operations to transfer model data
> > (HWMEM_ALLOC/HWMEM_DEALLOC), NPU workload control operations to submit
> > workload (RUN/STOP), and statistics operations to check current NPU
> > status. (STAT)
> >
> > The device driver also implemented features for developers. It provides
> > sysfs control attributes like stop, suspend, sched_test, and profile.
> > Also, it provides status attributes like app status, a number of total
> > requests, a number of active requests and memory usages. For the tracing
> > operations, several ftrace events are defined and embedded for several
> > important points.
>
> If you have created sysfs files, you need to document them in
> Documentation/ABI/ which I do not see in your diffstat. Perhaps add
> that for your next respin?
>
> Also, please remove the "tracing" logic you have in the code, use
> ftrace, don't abuse dev_info() everywhere, that's not needed at all.
>
> thanks,
>
> greg k-h

Hi,
Why isn't this submitted to soc/ subsystem ?
Don't you think that would be more appropriate, given that this IP is
integrated into application processors ?

Thanks,
Oded

2022-07-26 02:27:33

by MyungJoo Ham

Subject: RE: Re: [PATCH 0/9] Samsung Trinity NPU device driver

> Hi,
> Why isn't this submitted to soc/ subsystem ?
> Don't you think that would be more appropriate, given that this IP is
> integrated into application processors ?
>
> Thanks,
> Oded

This series (Trinity-V2.3, V2.4, A1, ...) is being integrated into multiple SoCs,
not limited to Samsung-designed chips (e.g., Exynos).
It's a bit weird to have them in /drivers/soc/samsung.

CC: Krzysztof and Alim (Samsung-SoC maintainers)

Cheers,
MyungJoo

2022-07-26 07:01:31

by Krzysztof Kozlowski

Subject: Re: [PATCH 0/9] Samsung Trinity NPU device driver

On 26/07/2022 04:09, MyungJoo Ham wrote:
>> Hi,
>> Why isn't this submitted to soc/ subsystem ?
>> Don't you think that would be more appropriate, given that this IP is
>> integrated into application processors ?
>>
>> Thanks,
>> Oded
>
> This series (Trinity-V2.3, V2.4, A1, ..) is being integrated to multiple SoCs,
> not limited to Samsung-designed chips (e.g., Exynos).
> It's a bit weird to have them in /drivers/soc/samsung.
>
> CC: Krzysztof and Alim (Samsung-SoC maintainers)

If it is not related to Samsung SoCs (or other designs by Samsung
Foundry), then it should not go to drivers/soc. Based on the cover letter,
it looks like this is the case.


Best regards,
Krzysztof

2022-07-26 07:01:33

by Krzysztof Kozlowski

Subject: Re: [PATCH 0/9] Samsung Trinity NPU device driver

On 25/07/2022 08:52, Jiho Chu wrote:
> Hello,
>
> My name is Jiho Chu, and working for device driver and system daemon for
> several years at Samsung Electronics.
>
> Trinity Neural Processing Unit (NPU) series are hardware accelerators
> for neural network processing in embedded systems, which are integrated
> into application processors or SoCs. Trinity NPU is compatible with AMBA
> bus architecture and first launched in 2018 with its first version for
> vision processing, Trinity Version1 (TRIV1). Its second version, TRIV2,
> is released in Dec, 2021. Another Trinity NPU for audio processing is
> referred as TRIA.
>

Why are there no bindings? How is it supposed to be used on ARM64 platforms?


Best regards,
Krzysztof

2022-07-26 08:45:53

by Arnd Bergmann

Subject: Re: [PATCH 0/9] Samsung Trinity NPU device driver

On Tue, Jul 26, 2022 at 8:59 AM Krzysztof Kozlowski wrote:
> On 26/07/2022 04:09, MyungJoo Ham wrote:
> >> Hi,
> >> Why isn't this submitted to soc/ subsystem ?
> >> Don't you think that would be more appropriate, given that this IP is
> >> integrated into application processors ?
> >>
> >> Thanks,
> >> Oded
> >
> > This series (Trinity-V2.3, V2.4, A1, ..) is being integrated to multiple SoCs,
> > not limited to Samsung-designed chips (e.g., Exynos).
> > It's a bit weird to have them in /drivers/soc/samsung.
> >
> > CC: Krzysztof and Alim (Samsung-SoC maintainers)
>
> If it is not related to Samsung SoCs (or other designs by Samsung
> Foundry), then it should not go to drivers/soc. Based on cover letter,
> it looks this is the case.

Agreed, and I also don't want to add any drivers with a user interface
to drivers/soc/. The things we have in there mainly fall into two categories:

- soc_device drivers for identifying the SoC itself from userspace or
  another driver

- drivers that provide exported symbols to other kernel drivers for things
  that do not have a proper subsystem abstraction (yet).

This driver clearly does not fall into those categories. As long as there
is no subsystem for NPUs, the only sensible options are drivers/gpu
and drivers/misc/.

Arnd

2022-07-26 11:34:48

by Oded Gabbay

Subject: Re: [PATCH 0/9] Samsung Trinity NPU device driver

On Tue, Jul 26, 2022 at 10:51 AM Arnd Bergmann wrote:
>
> On Tue, Jul 26, 2022 at 8:59 AM Krzysztof Kozlowski wrote:
> > On 26/07/2022 04:09, MyungJoo Ham wrote:
> > >> Hi,
> > >> Why isn't this submitted to soc/ subsystem ?
> > >> Don't you think that would be more appropriate, given that this IP is
> > >> integrated into application processors ?
> > >>
> > >> Thanks,
> > >> Oded
> > >
> > > This series (Trinity-V2.3, V2.4, A1, ..) is being integrated to multiple SoCs,
> > > not limited to Samsung-designed chips (e.g., Exynos).
> > > It's a bit weird to have them in /drivers/soc/samsung.
> > >
> > > CC: Krzysztof and Alim (Samsung-SoC maintainers)
> >
> > If it is not related to Samsung SoCs (or other designs by Samsung
> > Foundry), then it should not go to drivers/soc. Based on cover letter,
> > it looks this is the case.
>
> Agreed, and I also don't want to add any drivers with a user interface
> to drivers/soc/. The things we have in there mainly fall into two categories:
>
> - soc_device drivers for identifying the SoC itself from userspace or
> another driver
>
> - drivers that provide exported symbols to other kernel drivers for things
> that do not have a proper subsystem abstraction (yet).
>
> This driver clearly does not fall into those categories. As long as there
> is no subsystem for NPUs, the only sensible options are drivers/gpu
> and drivers/misc/.
>
> Arnd

Thanks for the explanation. I wasn't sure what the criteria for
getting into drivers/soc were, but now it is clear.

Oded

2022-07-26 15:05:53

by Jiho Chu

Subject: RE: [PATCH 0/9] Samsung Trinity NPU device driver

> --------- Original Message ---------
> Sender : Greg KH Date : 2022-07-25 18:02
> (GMT+9) Title : Re: [PATCH 0/9] Samsung Trinity NPU device driver
>
> On Mon, Jul 25, 2022 at 03:52:59PM +0900, Jiho Chu wrote:
> > Hello,
> >
> > My name is Jiho Chu, and working for device driver and system daemon for
> > several years at Samsung Electronics.
> >
> > Trinity Neural Processing Unit (NPU) series are hardware accelerators
> > for neural network processing in embedded systems, which are integrated
> > into application processors or SoCs. Trinity NPU is compatible with AMBA
> > bus architecture and first launched in 2018 with its first version for
> > vision processing, Trinity Version1 (TRIV1). Its second version, TRIV2,
> > is released in Dec, 2021. Another Trinity NPU for audio processing is
> > referred as TRIA.
> >
> > TRIV2 is shipped for many models of 2022 Samsung TVs, providing
> > acceleration for various AI-based applications, which include image
> > recognition and picture quality improvements for streaming video, which
> > can be accessed via GStreamer and its neural network plugins,
> > NNStreamer.
> >
> > In this patch set, it includes Trinity Vision 2 kernel device driver.
> > Trinity Vision 2 supports accelerating image inference process for
> > Convolution Neural Network (CNN). The CNN workload is executed by Deep
> > Learning Accelerator (DLA), and general Neural Network Layers are
> > executed by Digital Signal Processor (DSP). And there is a Control
> > Processor (CP) which can control DLA and DSP. These three IPs (DLA, DSP,
> > CP) are composing Trinity Vision 2 NPU, and the device driver mainly
> > supervise the CP to manage entire NPU.
> >
> > Controlling DLA and DSP operations is performed with internal command
> > instructions. and the instructions for the Trinity is similar with
> > general processor's ISA, but it is specialized for Neural Processing
> > operations. The virtual ISA (vISA) is designed for calculating multiple
> > data with single operation, like modern SIMD processor. The device
> > driver loads a program to CP at start up, and the program can decode a
> > binary which is built with the vISA. We calls this decoding program as a
> > Instruction Decoding Unit (IDU) program. While running the NPU, the CP
> > executes IDU program to fetch and decode instructions which made up of
> > vISA, by the scheduling policy of the device driver.
> >
> > These DLA, DSP and CP are loosely coupled using ARM's AMBA, so the
> > Trinity can easily communicate with most ARM processors. Each IPs
> > designed to have memory-mapped registers which can be used to control
> > the IP, and the CP provides Wait-For-Event (WFE) operation to subscribe
> > interrupt signals from the DLA and DSP. Also, embedded Direct Memory
> > Access Controller (DMAC) manages data communications between internal
> > SRAM and outer main memory, IOMMU module supports unified memory space.
> >
> > A user can control the Trinity NPU with IOCTLs provided by driver. These
> > controls includes memory management operations to transfer model data
> > (HWMEM_ALLOC/HWMEM_DEALLOC), NPU workload control operations to submit
> > workload (RUN/STOP), and statistics operations to check current NPU
> > status. (STAT)
> >
> > The device driver also implemented features for developers. It provides
> > sysfs control attributes like stop, suspend, sched_test, and profile.
> > Also, it provides status attributes like app status, a number of total
> > requests, a number of active requests and memory usages. For the tracing
> > operations, several ftrace events are defined and embedded for several
> > important points.
>
> If you have created sysfs files, you need to document them in
> Documentation/ABI/ which I do not see in your diffstat. Perhaps add
> that for your next respin?
>
> Also, please remove the "tracing" logic you have in the code, use
> ftrace, don't abuse dev_info() everywhere, that's not needed at all.
>
> thanks,
>
> greg k-h
>

Hi, Greg
Thanks for your review.
Documentation under Documentation/ABI/ has been added for the user interfaces.
Also, most of the unnecessary dev_info() calls have been removed, except for
initialization messages.

Thanks,

Jiho Chu


2022-07-26 15:56:52

by Jiho Chu

Subject: RE: RE: [PATCH 0/9] Samsung Trinity NPU device driver

> --------- Original Message ---------
> Sender : Greg KH Date : 2022-07-25 18:02
> (GMT+9) Title : Re: [PATCH 0/9] Samsung Trinity NPU device driver
>
> On Mon, Jul 25, 2022 at 03:52:59PM +0900, Jiho Chu wrote:
> > Hello,
> >
> > My name is Jiho Chu, and working for device driver and system daemon for
> > several years at Samsung Electronics.
> >
> > Trinity Neural Processing Unit (NPU) series are hardware accelerators
> > for neural network processing in embedded systems, which are integrated
> > into application processors or SoCs. Trinity NPU is compatible with AMBA
> > bus architecture and first launched in 2018 with its first version for
> > vision processing, Trinity Version1 (TRIV1). Its second version, TRIV2,
> > is released in Dec, 2021. Another Trinity NPU for audio processing is
> > referred as TRIA.
> >
> > TRIV2 is shipped for many models of 2022 Samsung TVs, providing
> > acceleration for various AI-based applications, which include image
> > recognition and picture quality improvements for streaming video, which
> > can be accessed via GStreamer and its neural network plugins,
> > NNStreamer.
> >
> > In this patch set, it includes Trinity Vision 2 kernel device driver.
> > Trinity Vision 2 supports accelerating image inference process for
> > Convolution Neural Network (CNN). The CNN workload is executed by Deep
> > Learning Accelerator (DLA), and general Neural Network Layers are
> > executed by Digital Signal Processor (DSP). And there is a Control
> > Processor (CP) which can control DLA and DSP. These three IPs (DLA, DSP,
> > CP) are composing Trinity Vision 2 NPU, and the device driver mainly
> > supervise the CP to manage entire NPU.
> >
> > Controlling DLA and DSP operations is performed with internal command
> > instructions. and the instructions for the Trinity is similar with
> > general processor's ISA, but it is specialized for Neural Processing
> > operations. The virtual ISA (vISA) is designed for calculating multiple
> > data with single operation, like modern SIMD processor. The device
> > driver loads a program to CP at start up, and the program can decode a
> > binary which is built with the vISA. We calls this decoding program as a
> > Instruction Decoding Unit (IDU) program. While running the NPU, the CP
> > executes IDU program to fetch and decode instructions which made up of
> > vISA, by the scheduling policy of the device driver.
> >
> > These DLA, DSP and CP are loosely coupled using ARM's AMBA, so the
> > Trinity can easily communicate with most ARM processors. Each IPs
> > designed to have memory-mapped registers which can be used to control
> > the IP, and the CP provides Wait-For-Event (WFE) operation to subscribe
> > interrupt signals from the DLA and DSP. Also, embedded Direct Memory
> > Access Controller (DMAC) manages data communications between internal
> > SRAM and outer main memory, IOMMU module supports unified memory space.
> >
> > A user can control the Trinity NPU with IOCTLs provided by driver. These
> > controls includes memory management operations to transfer model data
> > (HWMEM_ALLOC/HWMEM_DEALLOC), NPU workload control operations to submit
> > workload (RUN/STOP), and statistics operations to check current NPU
> > status. (STAT)
> >
> > The device driver also implemented features for developers. It provides
> > sysfs control attributes like stop, suspend, sched_test, and profile.
> > Also, it provides status attributes like app status, a number of total
> > requests, a number of active requests and memory usages. For the tracing
> > operations, several ftrace events are defined and embedded for several
> > important points.
>
> If you have created sysfs files, you need to document them in
> Documentation/ABI/ which I do not see in your diffstat. Perhaps add
> that for your next respin?
>
> Also, please remove the "tracing" logic you have in the code, use
> ftrace, don't abuse dev_info() everywhere, that's not needed at all.
>
> thanks,
>
> greg k-h
>
>

Hi, Greg
Thanks for your review.
Documentation under Documentation/ABI/ has been added for the user interfaces.
Also, most of the unnecessary dev_info() calls have been removed, except for
initialization messages.

Thanks,

Jiho Chu

2022-07-27 12:01:00

by Krzysztof Kozlowski

Subject: Re: [PATCH 1/9] trinity: Add base driver

On 25/07/2022 08:53, Jiho Chu wrote:
> It contains the base codes for trinity driver. Minimal codes to load and
> probe device is provided. The Trinity Family is controlled by the
> Memory-Mapped Registers, the register addresses and offsets are
> described. And user api interfaces are presented to control device under
> ioctl manner.
>

> + dev = &pdev->dev;
> + dev->id = ((desc->ver & TRINITY_MASK_DEV) >> TRINITY_SHIFT_DEV);
> +
> + /* set private data */
> + drv = devm_kzalloc(dev, sizeof(*drv), GFP_KERNEL);
> + if (drv == NULL)
> + return -ENOMEM;
> +
> + platform_set_drvdata(pdev, drv);
> + dev_set_drvdata(dev, drv);
> +
> + drv->dev = dev;
> + drv->desc = desc;
> +
> + np = dev->of_node;
> + if (of_property_match_string(np, "samsung,trinity-type", desc->type))

Let me be more specific.

You need to document your bindings.

Patch cannot be accepted without them.

Best regards,
Krzysztof

2022-07-27 12:36:05

by Jiho Chu

Subject: Re: [PATCH 0/9] Samsung Trinity NPU device driver

On Tue, 26 Jul 2022 08:57:08 +0200 Krzysztof Kozlowski wrote:

> On 25/07/2022 08:52, Jiho Chu wrote:
> > Hello,
> >
> > My name is Jiho Chu, and working for device driver and system daemon for
> > several years at Samsung Electronics.
> >
> > Trinity Neural Processing Unit (NPU) series are hardware accelerators
> > for neural network processing in embedded systems, which are integrated
> > into application processors or SoCs. Trinity NPU is compatible with AMBA
> > bus architecture and first launched in 2018 with its first version for
> > vision processing, Trinity Version1 (TRIV1). Its second version, TRIV2,
> > is released in Dec, 2021. Another Trinity NPU for audio processing is
> > referred as TRIA.
> >
>
> Why there are no bindings? How is it supposed to be used on ARM64 platforms?
>
>
> Best regards,
> Krzysztof
>

Hi, Krzysztof
Thanks for your review.
A dt-bindings document under 'bindings/arm/' is being prepared, and it can be
included in v2.

Sincerely,
Jiho Chu

2022-07-27 13:28:09

by Greg Kroah-Hartman

Subject: Re: [PATCH 5/9] trinity: Add sysfs debugfs module

On Mon, Jul 25, 2022 at 03:53:04PM +0900, Jiho Chu wrote:
> This patch includes debugfs and sysfs interfaces.

debugfs and sysfs are two totally different things, with different rules
and requirements. Split this up into at least 2 different patches and
don't mush them all together.

Would you want to try to review 2000+ lines of this type of thing that
does two totally different things at the same time?

Also, you forgot the sysfs Documentation/ABI/ entries, which are
required as you know.

And finally, you should never do this:

> +int trinity_sysfs_init(struct trinity_driver *drv)
> +{
> + struct device *dev = drv_to_dev_ptr(drv);
> + int err;
> +
> + err = sysfs_create_groups(&dev->kobj, trinity_attrs_groups);


You just raced with userspace and lost. Use the default groups pointer
for your driver and all will be fine. Makes for much smaller code that
works properly.

thanks,

greg k-h

2022-07-27 13:44:12

by Greg Kroah-Hartman

Subject: Re: [PATCH 3/9] trinity: Add load/unload IDU files

On Mon, Jul 25, 2022 at 03:53:02PM +0900, Jiho Chu wrote:
> +static int triv2_idu_load_file(struct trinity_driver *drv, const char *dirpath,
> + const char *file_name,
> + struct trinity_resv_mem *sector)
> +{
> + struct device *dev = drv_to_dev_ptr(drv);
> + struct trinity_resv_mem mem;
> + char filepath[NAME_MAX];
> + struct kstat *stat;
> + struct file *filp;
> + loff_t pos = 0;
> + size_t size;
> + int ret;
> +
> + dev = drv_to_dev_ptr(drv);
> + stat = vmalloc(sizeof(*stat));
> + if (stat == NULL)
> + return -ENOMEM;
> +
> + /* if dirpath is null, use the default path */
> + if (dirpath)
> + snprintf(filepath, NAME_MAX, "%s/%s", dirpath, file_name);
> + else
> + snprintf(filepath, NAME_MAX, TRIV2_IDU_DIRPATH_FMT "/%s",
> + utsname()->release, file_name);
> +
> + filp = filp_open(filepath, O_RDONLY, 0400);

That is cute. And totally not ok.

Please never do this, that is not how to properly load a firmware blob
in the kernel. This is racy and broken and probably a huge security
hole.

Heck, I wrote an article about this very topic, way back in 2005, with
the title "Things you should never do in the kernel", which can be seen
here:
https://www.linuxjournal.com/article/8110

This should not be news to anyone, again, never do this.

thanks,

greg k-h

2022-07-27 13:44:13

by Greg Kroah-Hartman

Subject: Re: [PATCH 1/9] trinity: Add base driver

On Mon, Jul 25, 2022 at 03:53:00PM +0900, Jiho Chu wrote:
> It contains the base codes for trinity driver. Minimal codes to load and
> probe device is provided. The Trinity Family is controlled by the
> Memory-Mapped Registers, the register addresses and offsets are
> described. And user api interfaces are presented to control device under
> ioctl manner.
>
> Signed-off-by: Jiho Chu <[email protected]>
> Signed-off-by: yelini-jeong <[email protected]>
> Signed-off-by: Dongju Chae <[email protected]>
> Signed-off-by: Parichay Kapoor <[email protected]>
> Signed-off-by: Wook Song <[email protected]>
> Signed-off-by: MyungJoo Ham <[email protected]>
> ---
> drivers/misc/Kconfig | 1 +
> drivers/misc/Makefile | 1 +
> drivers/misc/trinity/Kconfig | 27 ++
> drivers/misc/trinity/Makefile | 7 +
> drivers/misc/trinity/trinity.c | 369 ++++++++++++++
> drivers/misc/trinity/trinity_common.h | 392 +++++++++++++++
> drivers/misc/trinity/trinity_vision2_drv.c | 512 ++++++++++++++++++++
> drivers/misc/trinity/trinity_vision2_regs.h | 210 ++++++++
> include/uapi/misc/trinity.h | 458 +++++++++++++++++
> 9 files changed, 1977 insertions(+)
> create mode 100644 drivers/misc/trinity/Kconfig
> create mode 100644 drivers/misc/trinity/Makefile
> create mode 100644 drivers/misc/trinity/trinity.c
> create mode 100644 drivers/misc/trinity/trinity_common.h
> create mode 100644 drivers/misc/trinity/trinity_vision2_drv.c
> create mode 100644 drivers/misc/trinity/trinity_vision2_regs.h
> create mode 100644 include/uapi/misc/trinity.h
>
> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> index 41d2bb0ae23a..ad0d5f6af291 100644
> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -500,4 +500,5 @@ source "drivers/misc/cardreader/Kconfig"
> source "drivers/misc/habanalabs/Kconfig"
> source "drivers/misc/uacce/Kconfig"
> source "drivers/misc/pvpanic/Kconfig"
> +source "drivers/misc/trinity/Kconfig"
> endmenu
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index 70e800e9127f..c63f3fc89780 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -60,3 +60,4 @@ obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o
> obj-$(CONFIG_HISI_HIKEY_USB) += hisi_hikey_usb.o
> obj-$(CONFIG_HI6421V600_IRQ) += hi6421v600-irq.o
> obj-$(CONFIG_OPEN_DICE) += open-dice.o
> +obj-$(CONFIG_TRINITY) += trinity/
> diff --git a/drivers/misc/trinity/Kconfig b/drivers/misc/trinity/Kconfig
> new file mode 100644
> index 000000000000..ad4bab78f7c6
> --- /dev/null
> +++ b/drivers/misc/trinity/Kconfig
> @@ -0,0 +1,27 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config TRINITY
> + bool "Samsung Neural Processing Unit"
> + depends on HAS_IOMEM
> + depends on HAS_DMA
> + default n

The default is 'n', no need to ever say it again.

> + help
> + Select this option to enable driver support for Samsung
> + Neural Processing Unit (NPU).
> +
> + This driver works as a base driver of the other drivers
> + for Trinity device family.
> +
> + This option should be enabled to support Trinity
> + Vision 2 (TRIV2), and Trinity Audio (TRIA).
> +
> +config TRINITY_VISION2
> + tristate "Samsung NPU Trinity Vision 2"

What happened to "vision 1"?

> + depends on TRINITY
> + default n
> + help
> + Select this option to enable driver support for a Samsung
> + Neural Processing Unit (NPU), Tinity Vision 2.
> +
> + This driver enables userspace system library to access the
> + device via /dev/triv2-N.

What is the module name?

Where is the userspace library code that talks to this? Any
documentation for this interface anywhere?

> +#define BASE_DEV_NAME "trinity"

KBUILD_MODNAME?

> +/* A global lock for shared static variables such as dev_bitmap */
> +static DEFINE_SPINLOCK(trinity_lock);

That's a sign something is wrong, you should not need any module-wide
code variables.

> +/* A bitmap to keep track of active Trinity devices */
> +static unsigned long dev_bitmap[TRINITY_DEV_END];

Should not be needed, use a simple ida structure if you really want to
name things cleanly.

> +
> +/**
> + * trinity_release() - A common callback for close() in file_operations for a
> + * Trinity device node. If there are device-specific data to be
> + * cleaned-up, it is required to clean them up before invoke this
> + * callback.
> + *
> + * @inode: Inode to be closed
> + * @file: File to be closed
> + *
> + * Returns 0 on success. Otherwise, returns negative error.
> + */
> +int trinity_release(struct inode *inode, struct file *file)
> +{
> + struct trinity_driver *drv;
> +
> + drv = file->private_data;
> +
> + if (drv->verbose)
> + dev_info(drv_to_dev_ptr(drv), "%s\n", "Device closed");
> +
> + mutex_lock(&drv->lock);
> + drv->opened = drv->opened - 1;

That will never work, you can't keep track of open/close calls.

> + if (drv->opened == 0) {
> + /* wait already submitted requests */
> + if (drv->desc->drain_reqs)
> + drv->desc->drain_reqs(drv);
> +
> + drv->desc->set_state(drv, TRINITY_STATE_PAUSE);
> + }
> + mutex_unlock(&drv->lock);
> +
> + return 0;
> +}
> +
> +static bool trinity_is_empty(void)
> +{
> + enum trinity_dev_type type;
> + bool empty = true;
> +
> + spin_lock(&trinity_lock);
> + for (type = TRINITY_DEV_UNKNOWN, type++; type < TRINITY_DEV_END;
> + type++) {
> + if (find_first_bit(&dev_bitmap[type], TRINITY_DEV_EACH_MAX) !=
> + TRINITY_DEV_EACH_MAX) {
> + empty = false;
> + break;
> + }
> + }
> + spin_unlock(&trinity_lock);
> +
> + return empty;
> +}
> +
> +/**
> + * trinity_wait_ready() - Wait until trinity is ready state
> + *
> + * @drv: an instance of trinity driver
> + *
> + * Returns 0 on success. Otherwise, returns negative error.
> + */
> +int trinity_wait_ready(struct trinity_driver *drv)
> +{
> + const unsigned long time_out = HZ / 100UL; /* 1/100 seconds*/
> + const unsigned int max_retry = 10;
> + unsigned int retry = 0;
> + wait_queue_head_t wq;
> +
> + drv->desc->set_state(drv, TRINITY_STATE_READY);
> +
> + init_waitqueue_head(&wq);
> + /* try to ensure that NPU is in the ready state */
> + while (wait_event_timeout(
> + wq, drv->desc->get_state(drv) == TRINITY_STATE_READY,
> + time_out) == 0) {
> + /* regarded as failure */
> + if (retry == max_retry)
> + return -ETIMEDOUT;
> + retry++;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * trinity_open() - A common callback for open() in file_operations for a Trinity
> + * device node. If device-specific open() is required, this
> + * callback should be invoked by that open().
> + *
> + * @inode: inode to be opened
> + * @f: file to be opened
> + *
> + * Returns 0 on success. Otherwise, returns negative error.
> + */
> +int trinity_open(struct inode *inode, struct file *f)
> +{
> + struct miscdevice *miscdev;
> + struct trinity_driver *drv;
> + int ret = 0;
> +
> + miscdev = (struct miscdevice *)f->private_data;

Why the cast?

> + drv = container_of(miscdev, struct trinity_driver, mdev);
> + f->private_data = drv;
> +
> + mutex_lock(&drv->lock);
> + /** remove PAUSE set on the CP of the NPU */
> + if (drv->opened == 0) {
> + ret = trinity_wait_ready(drv);
> + if (ret != 0)
> + goto out;
> + }
> + drv->opened = drv->opened + 1;

Again, trying to keep track of open/close calls will never work. Just
let the vfs handle that for you (you will note it does that already).
Your driver should never need to worry about it.


> +
> + if (drv->verbose)
> + dev_info(drv_to_dev_ptr(drv), "%s\n", "Device opened");
> +
> +out:
> + mutex_unlock(&drv->lock);
> +
> + return 0;
> +}
> +
> +static void trinity_common_init(struct device *dev)
> +{
> + if (!trinity_is_empty())
> + return;
> +
> + /* Common init codes */
> +}

Missing something?


> +
> +static void trinity_common_exit(void)
> +{
> + if (!trinity_is_empty())
> + return;
> +
> + /* Common deinit codes */
> +}
> +

Don't provide empty functions that do nothing please.

> +static int trinity_set_device_id(struct trinity_driver *drv)
> +{
> + const struct trinity_desc *desc = drv->desc;
> + struct device *dev = drv_to_dev_ptr(drv);
> + int err = -EEXIST;
> +
> + spin_lock(&trinity_lock);
> + drv->dev_id =
> + find_first_zero_bit(&dev_bitmap[dev->id], TRINITY_DEV_EACH_MAX);

Again, use an ida structure please.

> + if (drv->dev_id < TRINITY_DEV_EACH_MAX) {
> + set_bit(drv->dev_id, &dev_bitmap[dev->id]);
> + err = 0;
> + }
> + spin_unlock(&trinity_lock);
> +
> + if (err == 0) {
> + drv->name = devm_kasprintf(dev, GFP_KERNEL, "%s-%u", desc->type,
> + drv->dev_id);
> + err = IS_ERR_OR_NULL(drv->name) ? -ENOMEM : 0;

Spell out if statements, this just makes things hard to read. And you
just leaked a "bit" if this failed, so are you sure this was ever
tested?
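
The two points above (spell out the error check, and give the ID back when a
later step fails) can be sketched in plain userspace C. This is only a model
under stated assumptions: `struct id_pool`, `pool_alloc()`, `pool_free()` and
`set_device_id()` are hypothetical names, and the fixed-size pool stands in for
the driver's bitmap. In the driver itself the kernel's ida (`ida_alloc()` /
`ida_free()`) would likely take the place of the pool.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define POOL_MAX 8 /* hypothetical pool size, stands in for TRINITY_DEV_EACH_MAX */

/* Hypothetical ID pool; a userspace stand-in for the bitmap or an ida. */
struct id_pool {
	bool used[POOL_MAX];
};

/* Returns the lowest free ID, or -ENOSPC if the pool is full. */
static int pool_alloc(struct id_pool *p)
{
	for (int i = 0; i < POOL_MAX; i++) {
		if (!p->used[i]) {
			p->used[i] = true;
			return i;
		}
	}
	return -ENOSPC;
}

/* Marks a previously allocated ID as free again. */
static void pool_free(struct id_pool *p, int id)
{
	assert(id >= 0 && id < POOL_MAX);
	p->used[id] = false;
}

/*
 * Allocate an ID and build the "<type>-<id>" name.
 * fail_name simulates the name allocation failing, so that the
 * error path can be exercised.
 */
static int set_device_id(struct id_pool *p, const char *type, bool fail_name,
			 int *id_out, char **name_out)
{
	char *name;
	int id, len;

	id = pool_alloc(p);
	if (id < 0)
		return id;

	len = snprintf(NULL, 0, "%s-%d", type, id);
	name = fail_name ? NULL : malloc((size_t)len + 1);
	if (!name) {
		/* Give the ID back, otherwise it leaks on this error path. */
		pool_free(p, id);
		return -ENOMEM;
	}
	snprintf(name, (size_t)len + 1, "%s-%d", type, id);

	*id_out = id;
	*name_out = name;
	return 0;
}
```

The outputs are only written on success, so a caller never sees a half-set-up
ID and name pair.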



> + }
> +
> + return err;
> +}
> +
> +int trinity_create_node(struct trinity_driver *drv)
> +{
> + struct device *dev = drv_to_dev_ptr(drv);
> + int err;
> +
> + /** register as a misc device */
> + drv->mdev.minor = MISC_DYNAMIC_MINOR;
> + drv->mdev.parent = NULL;

No parent device? Why not? What bus does this device live on? This is
a platform device lower on in this code, please use that, don't just
hang out there at the top of the device tree.


> + drv->mdev.name = drv->name;
> +
> + err = misc_register(&drv->mdev);
> + if (err < 0)
> + dev_err(dev, "failed to register as a misc device");
> + else
> + dev_info(dev, "misc device created!");

Again, drivers are quiet if all goes well.

I stopped here.

Also, please remove the layers of abstraction you have in your
structures that you never use, but yet still define in this patch for
some reason...

thanks,

greg k-h

2022-07-27 13:44:49

by Greg Kroah-Hartman

Subject: Re: [PATCH 4/9] trinity: Add schduler module

On Mon, Jul 25, 2022 at 03:53:03PM +0900, Jiho Chu wrote:
> This patch includes NPU scheduler interface.
>
> Tasks can be pushed to the NPU in order by the scheduler. The default
> schduling algorithm is provided using Priority policy.
> The scheduler waits request from the user. When the requests are
> invoked, it submits each request to the NPU by the priority, and waits
> until complete interrupt arrives. The priority is calculated with
> remained time to requested timeout.
>
> Thus the scheduler algorithm may be added more in the later, it
> provides an interface which can support various schedulers.

Please do not add interfaces that you do not use at all. Just make it
simple for the first version and then, if you really need to add new
types of "schedulers" add them later on.

As it is, this is a whole layer of abstraction that is not needed and
can be removed.

thanks,

greg k-h

2022-07-27 13:58:49

by Greg Kroah-Hartman

Subject: Re: [PATCH 2/9] tirnity: Add dma memory module

On Mon, Jul 25, 2022 at 03:53:01PM +0900, Jiho Chu wrote:
> This patch includes memory management module.
>
> It provides abstraction layer to handle DMA buffer.

Again, no abstractions please. Get this working and merged properly
first, before worrying about any sort of additional hardware models or
abstractions. It just makes this so much harder to review and to
determine what you really are, or are not, using here.

So far, it seems you aren't using any of these new abstractions, which
is odd. Or I just can't find them. Either way that's a huge sign this
code is wrong and needs to be cleaned up.

thanks,

greg k-h

2022-07-28 02:26:12

by MyungJoo Ham

Subject: RE: Re: [PATCH 1/9] trinity: Add base driver

>
> > + help
> > + Select this option to enable driver support for Samsung
> > + Neural Processing Unit (NPU).
> > +
> > + This driver works as a base driver of the other drivers
> > + for Trinity device family.
> > +
> > + This option should be enabled to support Trinity
> > + Vision 2 (TRIV2), and Trinity Audio (TRIA).
> > +
> > +config TRINITY_VISION2
> > + tristate "Samsung NPU Trinity Vision 2"
>
> What happened to "vision 1"?

It was designed before Vision 1, but its products are not yet
released. The two have different target neural networks.

>
> > + depends on TRINITY
> > + default n
> > + help
> > + Select this option to enable driver support for a Samsung
> > + Neural Processing Unit (NPU), Tinity Vision 2.
> > +
> > + This driver enables userspace system library to access the
> > + device via /dev/triv2-N.
>
> What is the module name?
>
> Where is the userspace library code that talks to this? Any
> documentation for this interface anywhere?
>

I believe Jiho will provide documents soon; however, the userspace
library code, which is shipped with 2022 TVs, is available at
https://git.tizen.org/cgit/platform/adaptation/npu/trix-engine/

Cheers,
MyungJoo

2022-07-29 17:56:33

by Pavel Machek

Subject: Re: [PATCH 0/9] Samsung Trinity NPU device driver

Hi!

> This driver clearly does not fall into those categories. As long as there
> is no subsystem for NPUs, the only sensible options are drivers/gpu
> and drivers/misc/.

Well, we can create drivers/npu. I'm sure these will get more
widespread.

And GPU people really should be cc-ed.

Best regards,
Pavel

--

2022-09-01 19:01:31

by Mark Brown

Subject: Re: [PATCH 1/9] trinity: Add base driver

On Mon, Jul 25, 2022 at 03:53:00PM +0900, Jiho Chu wrote:

> + drv->opened = drv->opened - 1;
> + if (drv->opened == 0) {
> + /* wait already submitted requests */
> + if (drv->desc->drain_reqs)
> + drv->desc->drain_reqs(drv);

> + drv->desc->set_state(drv, TRINITY_STATE_PAUSE);

> + mutex_lock(&drv->lock);
> + /** remove PAUSE set on the CP of the NPU */
> + if (drv->opened == 0) {
> + ret = trinity_wait_ready(drv);
> + if (ret != 0)
> + goto out;
> + }
> + drv->opened = drv->opened + 1;

Would it perhaps be cleaner to hold a runtime PM reference on the
device for each file and deal with the power up/down of the hardware in
the runtime PM callbacks?
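
A minimal sketch of the pattern suggested here, not taken from the patch. In the kernel, open() would typically take a reference with pm_runtime_resume_and_get() and release() would drop it with pm_runtime_put(), while powering the hardware up and down lives in the driver's runtime_resume/runtime_suspend callbacks. The code below is only a userspace model of that usage-count behaviour; struct model_dev and all model_* names are invented for illustration.

```c
#include <assert.h>

/* Userspace model only; these are not kernel types. */
struct model_dev {
	int usage;	/* references held, one per open file */
	int powered;	/* 1 while the modelled hardware is running */
	int resumes;	/* how many times the resume callback ran */
	int suspends;	/* how many times the suspend callback ran */
};

/* Stand-in for the driver's runtime_resume callback. */
static int model_runtime_resume(struct model_dev *d)
{
	d->powered = 1;
	d->resumes++;
	return 0;
}

/* Stand-in for the driver's runtime_suspend callback. */
static void model_runtime_suspend(struct model_dev *d)
{
	d->powered = 0;
	d->suspends++;
}

/* Models taking a reference: power up on the 0 -> 1 transition. */
static int model_get(struct model_dev *d)
{
	if (d->usage == 0) {
		int ret = model_runtime_resume(d);

		if (ret)
			return ret;	/* count is left untouched on failure */
	}
	d->usage++;
	return 0;
}

/* Models dropping a reference: power down on the 1 -> 0 transition. */
static void model_put(struct model_dev *d)
{
	assert(d->usage > 0);
	if (--d->usage == 0)
		model_runtime_suspend(d);
}
```

With this shape the driver's open() and release() only take and drop a reference; the decision about when the hardware is actually paused is kept in one place.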



2022-09-01 19:15:34

by Dafna Hirschfeld

Subject: Re: [PATCH 1/9] trinity: Add base driver

On 27.07.2022 15:22, Greg KH wrote:
>On Mon, Jul 25, 2022 at 03:53:00PM +0900, Jiho Chu wrote:
>> It contains the base codes for trinity driver. Minimal codes to load and
>> probe device is provided. The Trinity Family is controlled by the
>> Memory-Mapped Registers, the register addresses and offsets are
>> described. And user api interfaces are presented to control device under
>> ioctl manner.
>>
>> Signed-off-by: Jiho Chu <[email protected]>
>> Signed-off-by: yelini-jeong <[email protected]>
>> Signed-off-by: Dongju Chae <[email protected]>
>> Signed-off-by: Parichay Kapoor <[email protected]>
>> Signed-off-by: Wook Song <[email protected]>
>> Signed-off-by: MyungJoo Ham <[email protected]>
>> ---
>> drivers/misc/Kconfig | 1 +
>> drivers/misc/Makefile | 1 +
>> drivers/misc/trinity/Kconfig | 27 ++
>> drivers/misc/trinity/Makefile | 7 +
>> drivers/misc/trinity/trinity.c | 369 ++++++++++++++
>> drivers/misc/trinity/trinity_common.h | 392 +++++++++++++++
>> drivers/misc/trinity/trinity_vision2_drv.c | 512 ++++++++++++++++++++
>> drivers/misc/trinity/trinity_vision2_regs.h | 210 ++++++++
>> include/uapi/misc/trinity.h | 458 +++++++++++++++++
>> 9 files changed, 1977 insertions(+)
>> create mode 100644 drivers/misc/trinity/Kconfig
>> create mode 100644 drivers/misc/trinity/Makefile
>> create mode 100644 drivers/misc/trinity/trinity.c
>> create mode 100644 drivers/misc/trinity/trinity_common.h
>> create mode 100644 drivers/misc/trinity/trinity_vision2_drv.c
>> create mode 100644 drivers/misc/trinity/trinity_vision2_regs.h
>> create mode 100644 include/uapi/misc/trinity.h
>>
>> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
>> index 41d2bb0ae23a..ad0d5f6af291 100644
>> --- a/drivers/misc/Kconfig
>> +++ b/drivers/misc/Kconfig
>> @@ -500,4 +500,5 @@ source "drivers/misc/cardreader/Kconfig"
>> source "drivers/misc/habanalabs/Kconfig"
>> source "drivers/misc/uacce/Kconfig"
>> source "drivers/misc/pvpanic/Kconfig"
>> +source "drivers/misc/trinity/Kconfig"
>> endmenu
>> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
>> index 70e800e9127f..c63f3fc89780 100644
>> --- a/drivers/misc/Makefile
>> +++ b/drivers/misc/Makefile
>> @@ -60,3 +60,4 @@ obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o
>> obj-$(CONFIG_HISI_HIKEY_USB) += hisi_hikey_usb.o
>> obj-$(CONFIG_HI6421V600_IRQ) += hi6421v600-irq.o
>> obj-$(CONFIG_OPEN_DICE) += open-dice.o
>> +obj-$(CONFIG_TRINITY) += trinity/
>> diff --git a/drivers/misc/trinity/Kconfig b/drivers/misc/trinity/Kconfig
>> new file mode 100644
>> index 000000000000..ad4bab78f7c6
>> --- /dev/null
>> +++ b/drivers/misc/trinity/Kconfig
>> @@ -0,0 +1,27 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +
>> +config TRINITY
>> + bool "Samsung Neural Processing Unit"
>> + depends on HAS_IOMEM
>> + depends on HAS_DMA
>> + default n
>
>The default is 'n', no need to ever say it again.
>
>> + help
>> + Select this option to enable driver support for Samsung
>> + Neural Processing Unit (NPU).
>> +
>> + This driver works as a base driver of the other drivers
>> + for Trinity device family.
>> +
>> + This option should be enabled to support Trinity
>> + Vision 2 (TRIV2), and Trinity Audio (TRIA).
>> +
>> +config TRINITY_VISION2
>> + tristate "Samsung NPU Trinity Vision 2"
>
>What happened to "vision 1"?
>
>> + depends on TRINITY
>> + default n
>> + help
>> + Select this option to enable driver support for a Samsung
>> + Neural Processing Unit (NPU), Tinity Vision 2.
>> +
>> + This driver enables userspace system library to access the
>> + device via /dev/triv2-N.
>
>What is the module name?
>
>Where is the userspace library code that talks to this? Any
>documentation for this interface anywhere?
>
>> +#define BASE_DEV_NAME "trinity"
>
>KBUILD_MODNAME?
>
>> +/* A global lock for shared static variables such as dev_bitmap */
>> +static DEFINE_SPINLOCK(trinity_lock);
>
>That's a sign something is wrong, you should not need any module-wide
>code variables.
>
>> +/* A bitmap to keep track of active Trinity devices */
>> +static unsigned long dev_bitmap[TRINITY_DEV_END];
>
>Should not be needed, use a simple ida structure if you really want to
>name things cleanly.
>
>> +
>> +/**
>> + * trinity_release() - A common callback for close() in file_operations for a
>> + * Trinity device node. If there are device-specific data to be
>> + * cleaned-up, it is required to clean them up before invoke this
>> + * callback.
>> + *
>> + * @inode: Inode to be closed
>> + * @file: File to be closed
>> + *
>> + * Returns 0 on success. Otherwise, returns negative error.
>> + */
>> +int trinity_release(struct inode *inode, struct file *file)
>> +{
>> + struct trinity_driver *drv;
>> +
>> + drv = file->private_data;
>> +
>> + if (drv->verbose)
>> + dev_info(drv_to_dev_ptr(drv), "%s\n", "Device closed");
>> +
>> + mutex_lock(&drv->lock);
>> + drv->opened = drv->opened - 1;
>
>That will never work, you can't keep track of open/close calls.

Hi, can you explain why this will not work?

Thanks,
Dafna

>
>> + if (drv->opened == 0) {
>> + /* wait already submitted requests */
>> + if (drv->desc->drain_reqs)
>> + drv->desc->drain_reqs(drv);
>> +
>> + drv->desc->set_state(drv, TRINITY_STATE_PAUSE);
>> + }
>> + mutex_unlock(&drv->lock);
>> +
>> + return 0;
>> +}
>> +
>> +static bool trinity_is_empty(void)
>> +{
>> + enum trinity_dev_type type;
>> + bool empty = true;
>> +
>> + spin_lock(&trinity_lock);
>> + for (type = TRINITY_DEV_UNKNOWN, type++; type < TRINITY_DEV_END;
>> + type++) {
>> + if (find_first_bit(&dev_bitmap[type], TRINITY_DEV_EACH_MAX) !=
>> + TRINITY_DEV_EACH_MAX) {
>> + empty = false;
>> + break;
>> + }
>> + }
>> + spin_unlock(&trinity_lock);
>> +
>> + return empty;
>> +}
>> +
>> +/**
>> + * trinity_wait_ready() - Wait until trinity is ready state
>> + *
>> + * @drv: an instance of trinity driver
>> + *
>> + * Returns 0 on success. Otherwise, returns negative error.
>> + */
>> +int trinity_wait_ready(struct trinity_driver *drv)
>> +{
>> + const unsigned long time_out = HZ / 100UL; /* 1/100 seconds*/
>> + const unsigned int max_retry = 10;
>> + unsigned int retry = 0;
>> + wait_queue_head_t wq;
>> +
>> + drv->desc->set_state(drv, TRINITY_STATE_READY);
>> +
>> + init_waitqueue_head(&wq);
>> + /* try to ensure that NPU is in the ready state */
>> + while (wait_event_timeout(
>> + wq, drv->desc->get_state(drv) == TRINITY_STATE_READY,
>> + time_out) == 0) {
>> + /* regarded as failure */
>> + if (retry == max_retry)
>> + return -ETIMEDOUT;
>> + retry++;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +/**
>> + * trinity_open() - A common callback for open() in file_operations for a Trinity
>> + * device node. If device-specific open() is required, this
>> + * callback should be invoked by that open().
>> + *
>> + * @inode: inode to be opened
>> + * @f: file to be opened
>> + *
>> + * Returns 0 on success. Otherwise, returns negative error.
>> + */
>> +int trinity_open(struct inode *inode, struct file *f)
>> +{
>> + struct miscdevice *miscdev;
>> + struct trinity_driver *drv;
>> + int ret = 0;
>> +
>> + miscdev = (struct miscdevice *)f->private_data;
>
>Why the cast?
>
>> + drv = container_of(miscdev, struct trinity_driver, mdev);
>> + f->private_data = drv;
>> +
>> + mutex_lock(&drv->lock);
>> + /** remove PAUSE set on the CP of the NPU */
>> + if (drv->opened == 0) {
>> + ret = trinity_wait_ready(drv);
>> + if (ret != 0)
>> + goto out;
>> + }
>> + drv->opened = drv->opened + 1;
>
>Again, trying to keep track of open/close calls will never work. Just
>let the vfs handle that for you (you will note it does that already).
>Your driver should never need to worry about it.
>
>
>> +
>> + if (drv->verbose)
>> + dev_info(drv_to_dev_ptr(drv), "%s\n", "Device opened");
>> +
>> +out:
>> + mutex_unlock(&drv->lock);
>> +
>> + return 0;
>> +}
>> +
>> +static void trinity_common_init(struct device *dev)
>> +{
>> + if (!trinity_is_empty())
>> + return;
>> +
>> + /* Common init codes */
>> +}
>
>Missing something?
>
>
>> +
>> +static void trinity_common_exit(void)
>> +{
>> + if (!trinity_is_empty())
>> + return;
>> +
>> + /* Common deinit codes */
>> +}
>> +
>
>Don't provide empty functions that do nothing please.
>
>> +static int trinity_set_device_id(struct trinity_driver *drv)
>> +{
>> + const struct trinity_desc *desc = drv->desc;
>> + struct device *dev = drv_to_dev_ptr(drv);
>> + int err = -EEXIST;
>> +
>> + spin_lock(&trinity_lock);
>> + drv->dev_id =
>> + find_first_zero_bit(&dev_bitmap[dev->id], TRINITY_DEV_EACH_MAX);
>
>Again, use an ida structure please.
>
>> + if (drv->dev_id < TRINITY_DEV_EACH_MAX) {
>> + set_bit(drv->dev_id, &dev_bitmap[dev->id]);
>> + err = 0;
>> + }
>> + spin_unlock(&trinity_lock);
>> +
>> + if (err == 0) {
>> + drv->name = devm_kasprintf(dev, GFP_KERNEL, "%s-%u", desc->type,
>> + drv->dev_id);
>> + err = IS_ERR_OR_NULL(drv->name) ? -ENOMEM : 0;
>
>Spell out if statements, this just makes things hard to read. And you
>just leaked a "bit" if this failed, so are you sure this was ever
>tested?
>
>
>
>> + }
>> +
>> + return err;
>> +}
>> +
>> +int trinity_create_node(struct trinity_driver *drv)
>> +{
>> + struct device *dev = drv_to_dev_ptr(drv);
>> + int err;
>> +
>> + /** register as a misc device */
>> + drv->mdev.minor = MISC_DYNAMIC_MINOR;
>> + drv->mdev.parent = NULL;
>
>No parent device? Why not? What bus does this device live on? This is
>a platform device lower on in this code, please use that, don't just
>hang out there at the top of the device tree.
>
>
>> + drv->mdev.name = drv->name;
>> +
>> + err = misc_register(&drv->mdev);
>> + if (err < 0)
>> + dev_err(dev, "failed to register as a misc device");
>> + else
>> + dev_info(dev, "misc device created!");
>
>Again, drivers are quiet if all goes well.
>
>I stopped here.
>
>Also, please remove the layers of abstraction you have in your
>structures that you never use, but yet still define in this patch for
>some reason...
>
>thanks,
>
>greg k-h
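
The "leaked a bit" remark in the review above refers to trinity_set_device_id(): if devm_kasprintf() fails after set_bit(), the ID is never cleared. In the kernel, ida_alloc_max() and ida_free() would replace both the bitmap and the global spinlock. The sketch below is only a userspace model of the same allocate-then-roll-back shape; model_ida, model_alloc_id, model_free_id and model_set_device_id are invented names, MODEL_MAX_IDS is an arbitrary stand-in for TRINITY_DEV_EACH_MAX, and the fail_name flag merely simulates a failed name allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define MODEL_MAX_IDS 8	/* arbitrary stand-in for TRINITY_DEV_EACH_MAX */

struct model_ida {
	unsigned long used;	/* bit n set means ID n is taken */
};

/* Return the lowest free ID, or -ENOSPC if all IDs are taken. */
static int model_alloc_id(struct model_ida *ida)
{
	for (int id = 0; id < MODEL_MAX_IDS; id++) {
		if (!(ida->used & (1UL << id))) {
			ida->used |= 1UL << id;
			return id;
		}
	}
	return -ENOSPC;
}

/* Give an ID back to the allocator. */
static void model_free_id(struct model_ida *ida, int id)
{
	ida->used &= ~(1UL << id);
}

/*
 * Allocate an ID and build the node name from it. If building the
 * name fails, the ID is released again so that it is not leaked.
 * Returns the ID on success or a negative errno value.
 */
static int model_set_device_id(struct model_ida *ida, char *name,
			       size_t len, int fail_name)
{
	int id = model_alloc_id(ida);
	int n;

	if (id < 0)
		return id;

	/* fail_name simulates the name allocation failing */
	n = fail_name ? -1 : snprintf(name, len, "triv2-%d", id);
	if (n < 0 || (size_t)n >= len) {
		model_free_id(ida, id);	/* roll back the allocation */
		return -ENOMEM;
	}

	return id;
}
```

Writing the error path as an explicit if statement, as the review asks, also makes the missing rollback easy to spot.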

2022-09-02 05:55:02

by Greg Kroah-Hartman

Subject: Re: [PATCH 1/9] trinity: Add base driver

On Thu, Sep 01, 2022 at 10:04:43PM +0300, Dafna Hirschfeld wrote:
> On 27.07.2022 15:22, Greg KH wrote:
> > On Mon, Jul 25, 2022 at 03:53:00PM +0900, Jiho Chu wrote:
> > > +/**
> > > + * trinity_release() - A common callback for close() in file_operations for a
> > > + * Trinity device node. If there are device-specific data to be
> > > + * cleaned-up, it is required to clean them up before invoke this
> > > + * callback.
> > > + *
> > > + * @inode: Inode to be closed
> > > + * @file: File to be closed
> > > + *
> > > + * Returns 0 on success. Otherwise, returns negative error.
> > > + */
> > > +int trinity_release(struct inode *inode, struct file *file)
> > > +{
> > > + struct trinity_driver *drv;
> > > +
> > > + drv = file->private_data;
> > > +
> > > + if (drv->verbose)
> > > + dev_info(drv_to_dev_ptr(drv), "%s\n", "Device closed");
> > > +
> > > + mutex_lock(&drv->lock);
> > > + drv->opened = drv->opened - 1;
> >
> > That will never work, you can't keep track of open/close calls.
>
> Hi, can you explain why this will not work?

Let me switch it the other way around, can you explain to me how this
will actually work? Think about userspace calling dup(2) and passing
file handles around to other processes...

It's an impossible thing, just don't worry about it at all. If
userspace wants to open multiple instances of the same device and do
foolish things with it, let it. That's a userspace bug, not a kernel
issue.

thanks,

greg k-h

2022-09-02 09:14:18

by Jiho Chu

Subject: Re: [PATCH 1/9] trinity: Add base driver

On Thu, 1 Sep 2022 19:36:01 +0100
Mark Brown <[email protected]> wrote:

> On Mon, Jul 25, 2022 at 03:53:00PM +0900, Jiho Chu wrote:
>
> > + drv->opened = drv->opened - 1;
> > + if (drv->opened == 0) {
> > + /* wait already submitted requests */
> > + if (drv->desc->drain_reqs)
> > + drv->desc->drain_reqs(drv);
>
> > + drv->desc->set_state(drv, TRINITY_STATE_PAUSE);
>
> > + mutex_lock(&drv->lock);
> > + /** remove PAUSE set on the CP of the NPU */
> > + if (drv->opened == 0) {
> > + ret = trinity_wait_ready(drv);
> > + if (ret != 0)
> > + goto out;
> > + }
> > + drv->opened = drv->opened + 1;
>
> Would it perhaps be cleaner to hold a runtime PM reference on the
> device for each file and deal with the power up/down of the hardware in
> the runtime PM callbacks?

Hi, Mark.
The open count will be removed, as Greg's review requested.
In any case, the PM callbacks for suspend/resume are set through the device_driver struct.

@@ -1400,6 +1833,7 @@ static struct platform_driver trinity_triv2 = {
.name = "triv2",
.owner = THIS_MODULE,
.of_match_table = of_match_ptr(trinity_match),
+ .pm = &triv2_dev_pm_ops,
},
};

Thanks.
Jiho Chu

2022-09-17 08:15:41

by Jiho Chu

Subject: Re: [PATCH 3/9] trinity: Add load/unload IDU files

On Wed, 27 Jul 2022 15:14:10 +0200
Greg KH <[email protected]> wrote:

> On Mon, Jul 25, 2022 at 03:53:02PM +0900, Jiho Chu wrote:
> > +static int triv2_idu_load_file(struct trinity_driver *drv, const char *dirpath,
> > + const char *file_name,
> > + struct trinity_resv_mem *sector)
> > +{
> > + struct device *dev = drv_to_dev_ptr(drv);
> > + struct trinity_resv_mem mem;
> > + char filepath[NAME_MAX];
> > + struct kstat *stat;
> > + struct file *filp;
> > + loff_t pos = 0;
> > + size_t size;
> > + int ret;
> > +
> > + dev = drv_to_dev_ptr(drv);
> > + stat = vmalloc(sizeof(*stat));
> > + if (stat == NULL)
> > + return -ENOMEM;
> > +
> > + /* if dirpath is null, use the default path */
> > + if (dirpath)
> > + snprintf(filepath, NAME_MAX, "%s/%s", dirpath, file_name);
> > + else
> > + snprintf(filepath, NAME_MAX, TRIV2_IDU_DIRPATH_FMT "/%s",
> > + utsname()->release, file_name);
> > +
> > + filp = filp_open(filepath, O_RDONLY, 0400);
>
> That is cute. And totally not ok.
>
> Please never do this, that is not how to properly load a firmware blob
> in the kernel. This is racy and broken and probably a huge security
> hole.
>
> Heck, I wrote an article about this very topic, way back in 2005, with
> the title of, "Things you should never do in the kernel" and can be seen
> here:
> https://www.linuxjournal.com/article/8110
>
> This should not be news to anyone, again, never do this.
>
> thanks,
>
> greg k-h
>

Hi, Greg.
I have just sent the second revision of the driver.
Following your reference, the mechanism that read a file from user space has been changed to use an ioctl call.
Many of your review comments (abstraction, open count, documentation, ...) were very helpful and have been addressed in this revision.
If anything has been changed in the wrong way, please let me know.

Thanks for your review.

Thanks,
Jiho Chu