Hello,
This patch set is v2 of the Samsung Trinity NPU driver.
Following the v1 reviews, unnecessary logs have been removed, and the
essential documents, including the DT binding and the sysfs ABI, have been added.
v1 had several violations, such as user space file access, an unneeded
abstraction layer, and open counting. They have been fixed as far as possible;
if any problems remain, please let me know.
Unnecessary functions have been removed, and each patch has been reduced
in size to make review easier.
Thanks for the reviews.
The main changes are:
Since v1:
- Remove all tracing info logs
- Remove abstraction layer for scheduler
- Remove access to user space file
- Use IDA to generate IDs
- Add ABI document for sysfs
- Add dt-bindings document
- Use default group for sysfs
Link to v1:
https://lore.kernel.org/all/[email protected]/
I would highly appreciate any feedback:
reviews, questions, or anything else.
Thanks,
Jiho Chu
Jiho Chu (13):
trinity: Add base driver
trinity: Add memory module
trinity: Add IDU feature
trinity: Add scheduler module
trinity: Add debugfs module
trinity: add statistics module
trinity: Add sysfs module
trinity: Add ioctl feature
trinity: Add request and pm feature
trinity: Add profile module
trinity: Add trace module
MAINTAINERS: add TRINITY driver
dt-bindings: arm: Add Samsung Trinity bindings
.../ABI/testing/sysfs-driver-trinity | 55 +
.../bindings/arm/samsung,trinity.yaml | 115 ++
MAINTAINERS | 8 +
drivers/misc/Kconfig | 1 +
drivers/misc/Makefile | 1 +
drivers/misc/trinity/Kconfig | 25 +
drivers/misc/trinity/Makefile | 13 +
drivers/misc/trinity/trinity.c | 1019 ++++++++++
drivers/misc/trinity/trinity_common.h | 437 +++++
drivers/misc/trinity/trinity_debug.c | 331 ++++
drivers/misc/trinity/trinity_dma.c | 83 +
drivers/misc/trinity/trinity_dma.h | 87 +
drivers/misc/trinity/trinity_hwmem.c | 380 ++++
drivers/misc/trinity/trinity_hwmem.h | 81 +
drivers/misc/trinity/trinity_sched.c | 338 ++++
drivers/misc/trinity/trinity_sched.h | 24 +
drivers/misc/trinity/trinity_stat.c | 898 +++++++++
drivers/misc/trinity/trinity_stat.h | 56 +
drivers/misc/trinity/trinity_sysfs.c | 667 +++++++
drivers/misc/trinity/trinity_trace.c | 15 +
drivers/misc/trinity/trinity_trace.h | 329 ++++
drivers/misc/trinity/trinity_vision2_drv.c | 1685 +++++++++++++++++
.../misc/trinity/trinity_vision2_profile.h | 324 ++++
drivers/misc/trinity/trinity_vision2_regs.h | 210 ++
include/uapi/misc/trinity.h | 476 +++++
25 files changed, 7658 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-driver-trinity
create mode 100644 Documentation/devicetree/bindings/arm/samsung,trinity.yaml
create mode 100644 drivers/misc/trinity/Kconfig
create mode 100644 drivers/misc/trinity/Makefile
create mode 100644 drivers/misc/trinity/trinity.c
create mode 100644 drivers/misc/trinity/trinity_common.h
create mode 100644 drivers/misc/trinity/trinity_debug.c
create mode 100644 drivers/misc/trinity/trinity_dma.c
create mode 100644 drivers/misc/trinity/trinity_dma.h
create mode 100644 drivers/misc/trinity/trinity_hwmem.c
create mode 100644 drivers/misc/trinity/trinity_hwmem.h
create mode 100644 drivers/misc/trinity/trinity_sched.c
create mode 100644 drivers/misc/trinity/trinity_sched.h
create mode 100644 drivers/misc/trinity/trinity_stat.c
create mode 100644 drivers/misc/trinity/trinity_stat.h
create mode 100644 drivers/misc/trinity/trinity_sysfs.c
create mode 100644 drivers/misc/trinity/trinity_trace.c
create mode 100644 drivers/misc/trinity/trinity_trace.h
create mode 100644 drivers/misc/trinity/trinity_vision2_drv.c
create mode 100644 drivers/misc/trinity/trinity_vision2_profile.h
create mode 100644 drivers/misc/trinity/trinity_vision2_regs.h
create mode 100644 include/uapi/misc/trinity.h
--
2.25.1
Add a MAINTAINERS entry for the SAMSUNG TRINITY DRIVER.
Jiho Chu and Yelin Jeong are added as the maintainers.
Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
---
MAINTAINERS | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 3cf9842d9233..e166558e693e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17690,6 +17690,14 @@ S: Maintained
F: Documentation/devicetree/bindings/thermal/samsung,exynos-thermal.yaml
F: drivers/thermal/samsung/
+SAMSUNG TRINITY DRIVER
+M: Jiho Chu <[email protected]>
+M: Yelin Jeong <[email protected]>
+S: Supported
+F: Documentation/devicetree/bindings/arm/samsung,trinity.yaml
+F: drivers/misc/trinity/
+F: include/uapi/misc/trinity.h
+
SAMSUNG USB2 PHY DRIVER
M: Sylwester Nawrocki <[email protected]>
L: [email protected]
--
2.25.1
This patch adds the statistics module.
It keeps per-application statistics and per-request statistics.
The application statistics record the total number of requests,
the number of active requests, and the allocated and freed memory.
The request statistics count the number of runs and the total consumed
time, and also keep profile data for the requests.
Signed-off-by: Jiho Chu <[email protected]>
---
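The slot handling in trinity_stat_pool_*() can be sketched in plain
user-space C as below. This is a minimal illustration under stated
assumptions, not the driver code: slot_pool, pool_resize(), pool_get() and
pool_put() are hypothetical names, and the kernel helpers bitmap_fill(),
bitmap_zero() and find_first_zero_bit() are replaced by open-coded loops.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound, playing the role of TRINITY_STAT_MAX_APPS. */
#define POOL_MAX_SLOTS 128UL
#define BITS_PER_WORD  (8 * sizeof(unsigned long))
#define POOL_WORDS     ((POOL_MAX_SLOTS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct slot_pool {
	unsigned long bitmap[POOL_WORDS]; /* bit set: slot busy or out of range */
	unsigned long max_slots;          /* currently configured capacity */
};

static int pool_test(const struct slot_pool *p, unsigned long i)
{
	return (p->bitmap[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL;
}

static void pool_set(struct slot_pool *p, unsigned long i)
{
	p->bitmap[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
}

static void pool_clear(struct slot_pool *p, unsigned long i)
{
	p->bitmap[i / BITS_PER_WORD] &= ~(1UL << (i % BITS_PER_WORD));
}

/*
 * Mark every slot busy, then free only the first num slots. A search over
 * the whole bitmap can then never hand out a slot beyond the capacity.
 * Returns 0 on success, -1 if num exceeds the compile-time bound.
 */
static int pool_resize(struct slot_pool *p, unsigned long num)
{
	unsigned long i;

	if (num > POOL_MAX_SLOTS)
		return -1;

	memset(p->bitmap, 0xff, sizeof(p->bitmap));
	for (i = 0; i < num; i++)
		pool_clear(p, i);
	p->max_slots = num;
	return 0;
}

/* Returns the first free slot and marks it busy, or POOL_MAX_SLOTS if full. */
static unsigned long pool_get(struct slot_pool *p)
{
	unsigned long i;

	for (i = 0; i < POOL_MAX_SLOTS; i++) {
		if (!pool_test(p, i)) {
			pool_set(p, i);
			return i;
		}
	}
	return POOL_MAX_SLOTS;
}

/* Gives a slot back to the pool. */
static void pool_put(struct slot_pool *p, unsigned long slot)
{
	assert(slot < p->max_slots);
	pool_clear(p, slot);
}
```

With a capacity of 2, two calls to pool_get() return slots 0 and 1, and a
third returns POOL_MAX_SLOTS until a slot is given back with pool_put().
In the driver, that "pool full" case first triggers trinity_destroy_stats()
and one retry before a warning is printed.
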
drivers/misc/trinity/Makefile | 1 +
drivers/misc/trinity/trinity.c | 3 +
drivers/misc/trinity/trinity_stat.c | 898 ++++++++++++++++++++++++++++
drivers/misc/trinity/trinity_stat.h | 56 ++
4 files changed, 958 insertions(+)
create mode 100644 drivers/misc/trinity/trinity_stat.c
create mode 100644 drivers/misc/trinity/trinity_stat.h
diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
index 5d3e89dd0dd7..b475938a0db6 100644
--- a/drivers/misc/trinity/Makefile
+++ b/drivers/misc/trinity/Makefile
@@ -6,5 +6,6 @@ trinity-y := trinity.o
trinity-y += trinity_dma.o trinity_hwmem.o
trinity-y += trinity_sched.o
trinity-y += trinity_debug.o
+trinity-y += trinity_stat.o
trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
index 0c75eb13967c..a785a5dca4d9 100644
--- a/drivers/misc/trinity/trinity.c
+++ b/drivers/misc/trinity/trinity.c
@@ -15,6 +15,7 @@
#include "trinity_common.h"
#include "trinity_sched.h"
+#include "trinity_stat.h"
#define TRINITY_PADDR_BASE (0x0)
@@ -100,6 +101,8 @@ int trinity_open(struct inode *inode, struct file *f)
drv = container_of(miscdev, struct trinity_driver, mdev);
f->private_data = drv;
+ trinity_stat_app_set_status(drv, TRINITY_APP_STATUS_STARTED);
+
return 0;
}
diff --git a/drivers/misc/trinity/trinity_stat.c b/drivers/misc/trinity/trinity_stat.c
new file mode 100644
index 000000000000..0cbba08ee0b0
--- /dev/null
+++ b/drivers/misc/trinity/trinity_stat.c
@@ -0,0 +1,898 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Providing statistics for Samsung Trinity device family support
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include "trinity_stat.h"
+#include "trinity_common.h"
+
+#include <linux/bitmap.h>
+#include <linux/list_bl.h>
+
+/* maximum number of stats configurable from sysfs */
+#define TRINITY_STAT_MAX_APPS (128UL)
+#define TRINITY_STAT_MAX_REQS (4096UL)
+#define TRINITY_STAT_MAX_REQS_PER_APP (128UL)
+
+/* default number of stats */
+#define TRINITY_STAT_DEF_APPS (32UL)
+#define TRINITY_STAT_DEF_REQS (128UL)
+#define TRINITY_STAT_DEF_REQS_PER_APP (32UL)
+
+/**
+ * struct trinity_stat_pool - Statistics pool which maintains statistics for the device
+ *
+ * @bitmap_app: bitmap for app
+ * @bitmap_req: bitmap for request
+ * @mem_app: reserved memory for applications
+ * @mem_req: reserved memory for request
+ * @max_stat_apps: max statistics size of applications
+ * @max_stat_reqs: max statistics size of requests.
+ * @max_stat_reqs_per_app: max statistics size of request per application
+ * @cur_stat_apps: current statistics for applications
+ * @cur_stat_reqs: current statistics for requests
+ * @drv: an instance of the trinity driver
+ */
+struct trinity_stat_pool {
+ DECLARE_BITMAP(bitmap_app, TRINITY_STAT_MAX_APPS);
+ DECLARE_BITMAP(bitmap_req, TRINITY_STAT_MAX_REQS);
+
+ struct trinity_dma mem_app;
+ struct trinity_dma mem_req;
+
+ unsigned long max_stat_apps;
+ unsigned long max_stat_reqs;
+ unsigned long max_stat_reqs_per_app;
+
+ unsigned long cur_stat_apps;
+ unsigned long cur_stat_reqs;
+
+ struct trinity_driver *drv;
+};
+
+/**
+ * trinity_stat_pool_init(): Initialize trinity statistics pool
+ *
+ * @drv: an instance of the trinity driver
+ *
+ * Returns: 0 on success. Otherwise, returns negative error.
+ */
+int trinity_stat_pool_init(struct trinity_driver *drv)
+{
+ struct trinity_stat_pool *pool;
+
+ pool = kzalloc(sizeof(*pool), GFP_KERNEL);
+ if (!pool)
+ return -ENOMEM;
+
+ pool->drv = drv;
+
+ drv->stat.pdata = pool;
+
+ return 0;
+}
+
+/**
+ * trinity_stat_pool_fini(): Finish trinity statistics pool
+ *
+ * @drv: an instance of the trinity driver
+ */
+void trinity_stat_pool_fini(struct trinity_driver *drv)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+ struct trinity_stat_pool *pool = drv->stat.pdata;
+
+ if (!pool)
+ return;
+
+ trinity_dma_free(dev, &pool->mem_app);
+ trinity_dma_free(dev, &pool->mem_req);
+ kfree(pool);
+
+ drv->stat.pdata = NULL;
+}
+
+static void trinity_stat_pool_resize_apps(struct trinity_stat_pool *pool,
+ unsigned long num_apps)
+{
+ struct device *dev = drv_to_dev_ptr(pool->drv);
+ struct trinity_dma mem;
+ unsigned long size;
+ int err;
+
+ if (num_apps > TRINITY_STAT_MAX_APPS) {
+ dev_err(dev, "The maximum number of stat apps: %lu",
+ TRINITY_STAT_MAX_APPS);
+ return;
+ }
+
+ size = sizeof(struct trinity_stat_app) * num_apps;
+ err = trinity_dma_alloc(dev, size, &mem);
+ if (err < 0) {
+ dev_warn(dev, "Unable to allocate stats for apps");
+ return;
+ }
+
+ trinity_dma_free(dev, &pool->mem_app);
+
+ bitmap_fill(pool->bitmap_app, TRINITY_STAT_MAX_APPS);
+ bitmap_zero(pool->bitmap_app, num_apps);
+
+ pool->max_stat_apps = num_apps;
+ pool->mem_app = mem;
+}
+
+static void trinity_stat_pool_resize_reqs(struct trinity_stat_pool *pool,
+ unsigned long num_reqs)
+{
+ struct device *dev = drv_to_dev_ptr(pool->drv);
+ struct trinity_dma mem;
+ unsigned long size;
+ int err;
+
+ if (num_reqs > TRINITY_STAT_MAX_REQS) {
+ dev_err(dev, "The maximum number of stat reqs: %lu",
+ TRINITY_STAT_MAX_REQS);
+ return;
+ }
+
+ size = sizeof(struct trinity_stat_req) * num_reqs;
+ err = trinity_dma_alloc(dev, size, &mem);
+ if (err < 0) {
+ dev_warn(dev, "Unable to allocate stats for reqs");
+ return;
+ }
+ trinity_dma_free(dev, &pool->mem_req);
+
+ bitmap_fill(pool->bitmap_req, TRINITY_STAT_MAX_REQS);
+ bitmap_zero(pool->bitmap_req, num_reqs);
+
+ pool->max_stat_reqs = num_reqs;
+ pool->mem_req = mem;
+}
+
+static struct trinity_stat_app *
+trinity_stat_pool_get_app(struct trinity_driver *drv)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_pool *pool = stat->pdata;
+ struct trinity_stat_app *app = NULL;
+ unsigned long slot;
+ bool retried = false;
+
+ /* ensured that the lock is acquired */
+retry:
+ slot = find_first_zero_bit(pool->bitmap_app, TRINITY_STAT_MAX_APPS);
+ if (slot < TRINITY_STAT_MAX_APPS) {
+ app = &((struct trinity_stat_app *)pool->mem_app.addr)[slot];
+ memset(app, '\x00', sizeof(*app));
+ set_bit(slot, pool->bitmap_app);
+ app->slot = slot;
+ } else if (!retried) {
+ /* retry after destroying old stats */
+ retried = true;
+ trinity_destroy_stats(stat, true);
+ goto retry;
+ } else {
+ dev_warn(drv_to_dev_ptr(pool->drv),
+ "Please increase stat pool limit for apps");
+ }
+
+ return app;
+}
+
+static void trinity_stat_pool_put_app(struct trinity_driver *drv,
+ struct trinity_stat_app *app)
+{
+ struct trinity_stat_pool *pool = drv->stat.pdata;
+
+ /* ensured that the lock is acquired */
+ clear_bit(app->slot, pool->bitmap_app);
+}
+
+static struct trinity_stat_req *
+trinity_stat_pool_get_req(struct trinity_driver *drv)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_pool *pool = stat->pdata;
+ struct trinity_stat_req *req = NULL;
+ unsigned long slot;
+ bool retried = false;
+
+ /* ensured that the lock is acquired */
+retry:
+ slot = find_first_zero_bit(pool->bitmap_req, TRINITY_STAT_MAX_REQS);
+ if (slot < TRINITY_STAT_MAX_REQS) {
+ req = &((struct trinity_stat_req *)pool->mem_req.addr)[slot];
+ memset(req, '\x00', sizeof(*req));
+ set_bit(slot, pool->bitmap_req);
+ req->slot = slot;
+ } else if (!retried) {
+ /* retry after destroying old stats */
+ retried = true;
+ trinity_destroy_stats(stat, true);
+ goto retry;
+ } else {
+ dev_warn(drv_to_dev_ptr(pool->drv),
+ "Please increase stat pool limit for reqs");
+ }
+
+ return req;
+}
+
+static void trinity_stat_pool_put_req(struct trinity_driver *drv,
+ struct trinity_stat_req *req)
+{
+ struct trinity_stat_pool *pool = drv->stat.pdata;
+
+ /* ensured that the lock is acquired */
+ clear_bit(req->slot, pool->bitmap_req);
+}
+
+/**
+ * trinity_stat_init(): Initialize trinity statistics
+ *
+ * @drv: an instance of the trinity driver
+ */
+void trinity_stat_init(struct trinity_driver *drv)
+{
+ unsigned long i;
+
+ spin_lock_init(&drv->stat.lock);
+
+ INIT_LIST_HEAD(&drv->stat.list);
+ for (i = 0; i < TRINITY_STAT_HASH_SIZE; ++i)
+ INIT_HLIST_BL_HEAD(&drv->stat.hlist[i]);
+
+ trinity_stat_pool_init(drv);
+ /* initialize to default values */
+ trinity_stat_resize(drv, TRINITY_STAT_DEF_APPS, TRINITY_STAT_DEF_REQS,
+ TRINITY_STAT_DEF_REQS_PER_APP);
+}
+
+/**
+ * trinity_stat_fini(): Finish trinity statistics
+ *
+ * @drv: an instance of the trinity driver
+ */
+void trinity_stat_fini(struct trinity_driver *drv)
+{
+ trinity_stat_resize(drv, 0, 0, 0);
+ trinity_stat_pool_fini(drv);
+}
+
+/**
+ * trinity_stat_resize(): Resize trinity statistics
+ *
+ * @drv: an instance of the trinity driver
+ * @num_apps: the number of applications
+ * @num_reqs: the number of requests
+ * @num_reqs_per_app: the number of requests per application
+ */
+void trinity_stat_resize(struct trinity_driver *drv, unsigned long num_apps,
+ unsigned long num_reqs, unsigned long num_reqs_per_app)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_pool *pool = stat->pdata;
+ unsigned long i;
+
+ if (!pool)
+ return;
+
+ trinity_stat_lock(&drv->stat);
+
+ for (i = 0; i < TRINITY_STAT_HASH_SIZE; i++) {
+ struct trinity_stat_app *stat_app;
+ struct hlist_bl_node *hn;
+
+ hlist_bl_lock(&(stat->hlist[i]));
+ hlist_bl_for_each_entry(stat_app, hn, &(stat->hlist[i]),
+ hnode) {
+ if (stat_app->status != TRINITY_APP_STATUS_TERMINATED) {
+ dev_warn(drv_to_dev_ptr(drv),
+ "Still busy apps detected.. waiting");
+ hlist_bl_unlock(&(stat->hlist[i]));
+ goto unlock;
+ }
+ }
+ hlist_bl_unlock(&(stat->hlist[i]));
+ }
+
+ trinity_destroy_stats(stat, true);
+
+ /* re-allocate each stat buffer */
+ if (num_apps > 0)
+ trinity_stat_pool_resize_apps(pool, num_apps);
+
+ if (num_reqs > 0)
+ trinity_stat_pool_resize_reqs(pool, num_reqs);
+
+ if (num_reqs_per_app > 0)
+ pool->max_stat_reqs_per_app = num_reqs_per_app;
+
+unlock:
+ trinity_stat_unlock(&drv->stat);
+}
+
+/**
+ * trinity_stat_get_max_apps(): Get max statistics size for application
+ *
+ * @drv: an instance of the trinity driver
+ *
+ * Returns max number of statistics for applications. 0 on error.
+ */
+unsigned long trinity_stat_get_max_apps(struct trinity_driver *drv)
+{
+ struct trinity_stat_pool *pool = drv->stat.pdata;
+ unsigned long num;
+
+ if (!pool)
+ return 0;
+
+ trinity_stat_lock(&drv->stat);
+ num = pool->max_stat_apps;
+ trinity_stat_unlock(&drv->stat);
+
+ return num;
+}
+
+/**
+ * trinity_stat_get_max_reqs(): Get max statistics size for requests
+ *
+ * @drv: an instance of the trinity driver
+ *
+ * Returns max number of statistics for requests. 0 on error.
+ */
+unsigned long trinity_stat_get_max_reqs(struct trinity_driver *drv)
+{
+ struct trinity_stat_pool *pool = drv->stat.pdata;
+ unsigned long num;
+
+ if (!pool)
+ return 0;
+
+ trinity_stat_lock(&drv->stat);
+ num = pool->max_stat_reqs;
+ trinity_stat_unlock(&drv->stat);
+
+ return num;
+}
+
+/**
+ * trinity_stat_get_max_reqs_per_app(): Get max statistics size for requests per application
+ *
+ * @drv: an instance of the trinity driver
+ *
+ * Returns max number of statistics for requests per application. 0 on error.
+ */
+unsigned long trinity_stat_get_max_reqs_per_app(struct trinity_driver *drv)
+{
+ struct trinity_stat_pool *pool = drv->stat.pdata;
+ unsigned long num;
+
+ if (!pool)
+ return 0;
+
+ trinity_stat_lock(&drv->stat);
+ num = pool->max_stat_reqs_per_app;
+ trinity_stat_unlock(&drv->stat);
+
+ return num;
+}
+
+/**
+ * trinity_stat_lock(): Lock for trinity statistics
+ *
+ * @stat: an instance of trinity statistics
+ */
+void trinity_stat_lock(struct trinity_stat *stat)
+{
+ if (stat)
+ spin_lock(&stat->lock);
+}
+
+/**
+ * trinity_stat_unlock(): Unlock for trinity statistics
+ *
+ * @stat: an instance of trinity statistics
+ */
+void trinity_stat_unlock(struct trinity_stat *stat)
+{
+ if (stat)
+ spin_unlock(&stat->lock);
+}
+
+/**
+ * trinity_create_stat_app() - Create a stat structure for the opened app
+ *
+ * @drv: An instance of the trinity driver.
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
+static int trinity_create_stat_app(struct trinity_driver *drv)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_pool *pool = stat->pdata;
+ struct trinity_stat_app *stat_app;
+ unsigned long key;
+
+ trinity_stat_lock(stat);
+ stat_app = trinity_stat_pool_get_app(drv);
+ if (IS_ERR_OR_NULL(stat_app)) {
+ trinity_stat_unlock(stat);
+ dev_err(drv_to_dev_ptr(drv),
+ "Unable to allocate stat of app");
+ return -ENOMEM;
+ }
+
+ stat_app->parent = stat;
+ stat_app->app_id = trinity_get_app_id();
+ stat_app->total_alloc_mem = 0;
+ stat_app->total_freed_mem = 0;
+ stat_app->num_total_reqs = 0;
+ stat_app->num_kept_reqs = 0;
+ stat_app->num_active_reqs = 0;
+ stat_app->status = TRINITY_APP_STATUS_STARTED;
+
+ strncpy(stat_app->name, current->comm, TASK_COMM_LEN);
+ stat_app->name[TASK_COMM_LEN - 1] = '\x00';
+
+ INIT_HLIST_BL_NODE(&stat_app->hnode);
+ INIT_LIST_HEAD(&stat_app->reqs);
+
+ key = hash_long(stat_app->app_id, TRINITY_STAT_HASH_BITS);
+
+ hlist_bl_lock(&(stat->hlist[key]));
+ hlist_bl_add_head(&stat_app->hnode, &(stat->hlist[key]));
+ hlist_bl_unlock(&(stat->hlist[key]));
+
+ list_add_tail(&stat_app->lnode, &stat->list);
+ pool->cur_stat_apps++;
+
+ /* Remove terminated stats if the number reaches the maximum */
+ trinity_destroy_stats(stat, false);
+
+ trinity_stat_unlock(stat);
+
+ return 0;
+}
+
+static void trinity_destroy_stat_req(struct trinity_stat_req *stat_req)
+{
+ struct trinity_stat_app *stat_app = stat_req->parent;
+ struct trinity_stat *stat = stat_app->parent;
+ struct trinity_driver *drv =
+ container_of(stat, struct trinity_driver, stat);
+
+ if (stat_req->profile)
+ drv->desc->destroy_profile(drv, stat_req->profile);
+ list_del(&stat_req->list);
+ trinity_stat_pool_put_req(drv, stat_req);
+}
+
+static void trinity_destroy_stat_reqs(struct trinity_stat_app *stat_app)
+{
+ struct trinity_stat_req *stat_req, *tmp;
+
+ list_for_each_entry_safe(stat_req, tmp, &stat_app->reqs, list)
+ trinity_destroy_stat_req(stat_req);
+}
+
+/**
+ * trinity_destroy_stats - Destroy terminated stat structures
+ *
+ * @stat: an instance of trinity statistics
+ * @force: force destroy
+ */
+void trinity_destroy_stats(struct trinity_stat *stat, bool force)
+{
+ struct trinity_driver *drv =
+ container_of(stat, struct trinity_driver, stat);
+ struct trinity_stat_pool *pool = stat->pdata;
+ struct trinity_stat_app *stat_app;
+ struct hlist_bl_node *hn, *tmp;
+ int i;
+
+ /* lock should be acquired before */
+ if (!force && pool->cur_stat_apps <= pool->max_stat_apps)
+ return;
+
+ for (i = 0; i < TRINITY_STAT_HASH_SIZE; i++) {
+ hlist_bl_lock(&stat->hlist[i]);
+ hlist_bl_for_each_entry_safe(stat_app, hn, tmp,
+ &(stat->hlist[i]), hnode) {
+ enum trinity_app_status status = stat_app->status;
+
+ if (status == TRINITY_APP_STATUS_TERMINATED) {
+ hlist_bl_del(&stat_app->hnode);
+ list_del(&stat_app->lnode);
+
+ pool->cur_stat_apps--;
+
+ trinity_destroy_stat_reqs(stat_app);
+ trinity_stat_pool_put_app(drv, stat_app);
+ }
+ }
+ hlist_bl_unlock(&stat->hlist[i]);
+ }
+}
+
+static struct trinity_stat_app *
+trinity_get_stat_by_id(struct trinity_driver *drv, int32_t app_id)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_app *stat_app;
+ struct hlist_bl_node *hn;
+ unsigned long key;
+
+ key = hash_long(app_id, TRINITY_STAT_HASH_BITS);
+
+ hlist_bl_lock(&stat->hlist[key]);
+ hlist_bl_for_each_entry(stat_app, hn, &stat->hlist[key], hnode) {
+ if (stat_app->app_id == app_id)
+ goto out;
+ }
+ stat_app = NULL;
+out:
+ hlist_bl_unlock(&stat->hlist[key]);
+
+ return stat_app;
+}
+
+/**
+ * trinity_get_stat_app() - Get a statistics structure for the target app
+ *
+ * @drv: an instance of the trinity driver.
+ *
+ * Returns statistics for application on success. Otherwise, returns NULL.
+ *
+ * @note: If the stat is not allocated yet, try to create and return it.
+ */
+struct trinity_stat_app *trinity_get_stat_app(struct trinity_driver *drv)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_app *stat_app;
+ int app_id = trinity_get_app_id();
+
+retry:
+ trinity_stat_lock(stat);
+ stat_app = trinity_get_stat_by_id(drv, app_id);
+ trinity_stat_unlock(stat);
+
+ if (!IS_ERR_OR_NULL(stat_app))
+ return stat_app;
+
+ if (trinity_create_stat_app(drv) != 0)
+ return NULL;
+
+ goto retry;
+}
+
+/**
+ * trinity_stat_app_set_status() - Set the status of the target app
+ *
+ * @drv: an instance of the trinity driver.
+ * @status: application status
+ */
+void trinity_stat_app_set_status(struct trinity_driver *drv,
+ enum trinity_app_status status)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_app *stat_app;
+ int app_id = trinity_get_app_id();
+
+ trinity_stat_lock(stat);
+ stat_app = trinity_get_stat_by_id(drv, app_id);
+ trinity_stat_unlock(stat);
+
+ if (IS_ERR_OR_NULL(stat_app))
+ return;
+
+ stat_app->status = status;
+}
+
+/**
+ * trinity_stat_append_req() - Append request information for statistics
+ *
+ * @drv: an instance of the trinity driver.
+ * @req: an instance of request
+ *
+ * Return: 0 on success. Otherwise, returns negative error.
+ */
+int trinity_stat_append_req(struct trinity_driver *drv, struct trinity_req *req)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_pool *pool = stat->pdata;
+ struct trinity_stat_app *stat_app;
+ struct trinity_stat_req *stat_req;
+
+ stat_app = trinity_get_stat_app(drv);
+ if (IS_ERR_OR_NULL(stat_app))
+ return -ENOMEM;
+
+ trinity_stat_lock(stat);
+ stat_req = trinity_stat_pool_get_req(drv);
+ if (!stat_req) {
+ trinity_stat_unlock(stat);
+ dev_err(drv_to_dev_ptr(drv),
+ "Unable to allocate stat of request");
+ return -ENOMEM;
+ }
+
+ stat_req->parent = stat_app;
+ stat_req->app_id = stat_app->app_id;
+ stat_req->req_id = req->input.config.req_id;
+ stat_req->model_id = req->input.config.model_id;
+ stat_req->submitted = ktime_get();
+ stat_req->status = TRINITY_REQ_STATUS_PENDING;
+ stat_req->priority =
+ (enum trinity_req_priority)req->input.config.priority;
+ stat_req->is_kernel = req->is_kernel;
+
+ req->stat = stat_req;
+
+ list_add_tail(&stat_req->list, &stat_app->reqs);
+
+ /* don't count kernel requests */
+ if (!req->is_kernel) {
+ if (stat_app->num_kept_reqs == pool->max_stat_reqs_per_app) {
+ struct trinity_stat_req *old_stat;
+
+ old_stat = list_first_entry(
+ &stat_app->reqs, struct trinity_stat_req, list);
+ /* skip any kernel or unfinished request */
+ while (old_stat->is_kernel ||
+ (old_stat->status !=
+ TRINITY_REQ_STATUS_FINISHED &&
+ old_stat->status != TRINITY_REQ_STATUS_ERROR))
+ old_stat = list_next_entry(old_stat, list);
+
+ WARN_ON(old_stat == NULL);
+
+ trinity_destroy_stat_req(old_stat);
+ stat_app->num_total_reqs--;
+ } else {
+ /* total number of user requests kept */
+ stat_app->num_kept_reqs++;
+ }
+ }
+
+ stat_app->num_total_reqs++;
+ stat_app->num_active_reqs++;
+
+ trinity_stat_unlock(stat);
+ return 0;
+}
+
+/**
+ * trinity_stat_remove_req() - Remove request information for statistics
+ *
+ * @drv: an instance of the trinity driver.
+ * @req: an instance of the request to be used for statistics
+ * @rollback: rollback statistics
+ */
+void trinity_stat_remove_req(struct trinity_driver *drv,
+ struct trinity_req *req, bool rollback)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_req *stat_req = req->stat;
+ struct trinity_stat_app *stat_app = stat_req->parent;
+
+ trinity_stat_lock(stat);
+
+ trinity_destroy_stat_req(stat_req);
+
+ if (!req->is_kernel) {
+ WARN_ON(stat_app->num_kept_reqs == 0);
+ stat_app->num_kept_reqs--;
+ }
+
+ if (rollback) {
+ WARN_ON(stat_app->num_total_reqs == 0);
+ stat_app->num_total_reqs--;
+ WARN_ON(stat_app->num_active_reqs == 0);
+ stat_app->num_active_reqs--;
+ }
+
+ trinity_stat_unlock(stat);
+}
+
+/**
+ * trinity_stat_finish_req() - Finish request for statistics
+ *
+ * @drv: an instance of the trinity driver.
+ * @req: an instance of the request to be used for statistics
+ */
+void trinity_stat_finish_req(struct trinity_driver *drv,
+ struct trinity_req *req)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_req *stat_req = req->stat;
+ struct trinity_stat_app *stat_app = stat_req->parent;
+
+ trinity_stat_lock(stat);
+ if (stat_app->num_active_reqs != 0)
+ stat_app->num_active_reqs--;
+ else
+ dev_err(drv_to_dev_ptr(drv),
+ "Fail to keep track of the active reqs");
+ trinity_stat_unlock(stat);
+}
+
+static void copy_stat_app_ioctl(struct trinity_stat_app *stat_app,
+ struct trinity_ioctl_stat_app *ioctl_stat_app)
+{
+ ioctl_stat_app->app_id = stat_app->app_id;
+ ioctl_stat_app->status = stat_app->status;
+ ioctl_stat_app->num_total_reqs = stat_app->num_total_reqs;
+ ioctl_stat_app->num_active_reqs = stat_app->num_active_reqs;
+ ioctl_stat_app->total_alloc_mem = stat_app->total_alloc_mem;
+ ioctl_stat_app->total_freed_mem = stat_app->total_freed_mem;
+
+ strncpy(ioctl_stat_app->name, stat_app->name, TASK_COMM_LEN);
+ ioctl_stat_app->name[TASK_COMM_LEN - 1] = '\x00';
+}
+
+static void copy_stat_req_ioctl(struct trinity_stat_req *stat_req,
+ struct trinity_ioctl_stat_req *ioctl_stat_req)
+{
+ ktime_t cur_time = ktime_get();
+ ktime_t submitted, scheduled, completed;
+
+ submitted = stat_req->submitted;
+ scheduled = stat_req->scheduled ? stat_req->scheduled : cur_time;
+ completed = stat_req->completed ? stat_req->completed : cur_time;
+
+ ioctl_stat_req->req_id = stat_req->req_id;
+ ioctl_stat_req->model_id = stat_req->model_id;
+ ioctl_stat_req->priority = stat_req->priority;
+ ioctl_stat_req->status = stat_req->status;
+
+ if (stat_req->priority == TRINITY_REQ_PRIORITY_HIGH)
+ ioctl_stat_req->sched_time = 0;
+ else
+ ioctl_stat_req->sched_time = TIME_DIFF(scheduled, submitted);
+ ioctl_stat_req->infer_time = TIME_DIFF(completed, scheduled);
+}
+
+/**
+ * trinity_stat_app_copy_ioctl() - Copy an application's statistics information to ioctl info
+ *
+ * @drv: an instance of the trinity driver.
+ * @ioctl_stat_app: ioctl statistics information for an application
+ */
+void trinity_stat_app_copy_ioctl(struct trinity_driver *drv,
+ struct trinity_ioctl_stat_app *ioctl_stat_app)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_app *stat_app;
+ int app_id = trinity_get_app_id();
+
+ trinity_stat_lock(stat);
+
+ stat_app = trinity_get_stat_by_id(drv, app_id);
+ if (IS_ERR_OR_NULL(stat_app)) {
+ ioctl_stat_app->app_id = app_id;
+ ioctl_stat_app->status = TRINITY_APP_STATUS_PENDING;
+ ioctl_stat_app->num_total_reqs = 0;
+ ioctl_stat_app->num_active_reqs = 0;
+ ioctl_stat_app->total_alloc_mem = 0;
+ ioctl_stat_app->total_freed_mem = 0;
+
+ strncpy(ioctl_stat_app->name, current->comm, TASK_COMM_LEN);
+ ioctl_stat_app->name[TASK_COMM_LEN - 1] = '\x00';
+ } else {
+ copy_stat_app_ioctl(stat_app, ioctl_stat_app);
+ }
+
+ trinity_stat_unlock(stat);
+}
+
+/**
+ * trinity_stat_apps_copy_ioctl() - Copy applications' statistics information to ioctl info
+ *
+ * @drv: an instance of the trinity driver.
+ * @ioctl_stat_apps: ioctl statistics information for applications
+ */
+void trinity_stat_apps_copy_ioctl(
+ struct trinity_driver *drv,
+ struct trinity_ioctl_stat_apps *ioctl_stat_apps)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_ioctl_stat_app *ioctl_stat_app;
+ struct trinity_stat_app *stat_app;
+ uint32_t idx = 0;
+
+ trinity_stat_lock(stat);
+
+ list_for_each_entry(stat_app, &stat->list, lnode) {
+ if (idx >= TRINITY_APP_STAT_MAX)
+ break;
+ ioctl_stat_app = &ioctl_stat_apps->stat[idx++];
+ copy_stat_app_ioctl(stat_app, ioctl_stat_app);
+ }
+ ioctl_stat_apps->num_apps = idx;
+
+ trinity_stat_unlock(stat);
+}
+
+/**
+ * trinity_stat_reqs_copy_ioctl() - Copy requests' statistics information to ioctl info
+ *
+ * @drv: an instance of the trinity driver.
+ * @ioctl_stat_reqs: ioctl statistics information for requests
+ */
+void trinity_stat_reqs_copy_ioctl(
+ struct trinity_driver *drv,
+ struct trinity_ioctl_stat_reqs *ioctl_stat_reqs)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_ioctl_stat_req *ioctl_stat_req;
+ struct trinity_stat_app *stat_app;
+ struct trinity_stat_req *stat_req;
+ uint32_t idx = 0;
+
+ trinity_stat_lock(stat);
+ stat_app = trinity_get_stat_by_id(drv, ioctl_stat_reqs->app_id);
+ if (IS_ERR_OR_NULL(stat_app)) {
+ ioctl_stat_reqs->num_reqs = 0;
+ trinity_stat_unlock(stat);
+ return;
+ }
+
+ list_for_each_entry(stat_req, &stat_app->reqs, list) {
+ if (idx >= TRINITY_REQ_STAT_MAX)
+ break;
+ ioctl_stat_req = &ioctl_stat_reqs->stat[idx++];
+ copy_stat_req_ioctl(stat_req, ioctl_stat_req);
+ }
+ ioctl_stat_reqs->num_reqs = idx;
+
+ trinity_stat_unlock(stat);
+}
+
+/**
+ * trinity_stat_app_total_alloc() - Append allocated size to application's total memory size
+ *
+ * @drv: an instance of the trinity driver.
+ * @size: allocated memory size
+ */
+void trinity_stat_app_total_alloc(struct trinity_driver *drv, size_t size)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_app *stat_app;
+
+ stat_app = trinity_get_stat_app(drv);
+ if (IS_ERR_OR_NULL(stat_app))
+ return;
+
+ trinity_stat_lock(stat);
+ stat_app->total_alloc_mem += size;
+ trinity_stat_unlock(stat);
+}
+
+/**
+ * trinity_stat_app_total_freed() - Add freed size to application's total freed memory size
+ *
+ * @drv: an instance of the trinity driver.
+ * @size: freed memory size
+ */
+void trinity_stat_app_total_freed(struct trinity_driver *drv, size_t size)
+{
+ struct trinity_stat *stat = &drv->stat;
+ struct trinity_stat_app *stat_app;
+
+ stat_app = trinity_get_stat_app(drv);
+ if (IS_ERR_OR_NULL(stat_app))
+ return;
+
+ trinity_stat_lock(stat);
+ stat_app->total_freed_mem += size;
+ trinity_stat_unlock(stat);
+}
diff --git a/drivers/misc/trinity/trinity_stat.h b/drivers/misc/trinity/trinity_stat.h
new file mode 100644
index 000000000000..8ae02769efa0
--- /dev/null
+++ b/drivers/misc/trinity/trinity_stat.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Statistics header for trinity devices
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __DRIVERS_MISC_TRINITY_STAT_H__
+#define __DRIVERS_MISC_TRINITY_STAT_H__
+
+#include "trinity_common.h"
+
+void trinity_stat_init(struct trinity_driver *drv);
+void trinity_stat_fini(struct trinity_driver *drv);
+void trinity_stat_resize(struct trinity_driver *drv, unsigned long num_apps,
+ unsigned long num_reqs,
+ unsigned long num_reqs_per_app);
+
+void trinity_stat_lock(struct trinity_stat *stat);
+void trinity_stat_unlock(struct trinity_stat *stat);
+void trinity_destroy_stats(struct trinity_stat *stat, bool force);
+
+unsigned long trinity_stat_get_max_apps(struct trinity_driver *drv);
+unsigned long trinity_stat_get_max_reqs(struct trinity_driver *drv);
+unsigned long trinity_stat_get_max_reqs_per_app(struct trinity_driver *drv);
+
+struct trinity_stat_app *trinity_get_stat_app(struct trinity_driver *drv);
+
+void trinity_stat_app_total_alloc(struct trinity_driver *drv, size_t size);
+void trinity_stat_app_total_freed(struct trinity_driver *drv, size_t size);
+void trinity_stat_app_set_status(struct trinity_driver *drv,
+ enum trinity_app_status status);
+
+int trinity_stat_append_req(struct trinity_driver *drv,
+ struct trinity_req *req);
+void trinity_stat_remove_req(struct trinity_driver *drv,
+ struct trinity_req *req, bool rollback);
+void trinity_stat_finish_req(struct trinity_driver *drv,
+ struct trinity_req *req);
+
+void trinity_stat_app_copy_ioctl(struct trinity_driver *drv,
+ struct trinity_ioctl_stat_app *ioctl_stat_app);
+
+void trinity_stat_apps_copy_ioctl(
+ struct trinity_driver *drv,
+ struct trinity_ioctl_stat_apps *ioctl_stat_apps);
+
+void trinity_stat_reqs_copy_ioctl(
+ struct trinity_driver *drv,
+ struct trinity_ioctl_stat_reqs *ioctl_stat_reqs);
+
+#endif /* __DRIVERS_MISC_TRINITY_STAT_H__ */
--
2.25.1
This patch adds the trace declarations.
The 'trinity' trace system defines several tracepoints.
They are placed on each ioctl command, the CP wakeup,
the IRQ handlers, and the run trigger.
Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
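Note for testers: the sketch below is a minimal userspace helper, not part
of the patch, that parses the payload printed by the trinity_ioctl_msg
event. It relies only on the TP_printk format in this patch
("device_id=%u app_id=%d msg=%s"). The struct and function names are
hypothetical, and the 64-byte message limit is an assumption made for
illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one parsed trinity_ioctl_msg record. */
struct ioctl_msg_rec {
	unsigned int device_id;
	int app_id;
	char msg[64]; /* assumed upper bound, not taken from the driver */
};

/*
 * Parse the payload of one trace line. Returns 0 on success, -1 if the
 * line does not carry the "device_id=... app_id=... msg=..." text.
 */
static int parse_ioctl_msg(const char *line, struct ioctl_msg_rec *rec)
{
	const char *p;

	assert(line && rec);

	/* Skip whatever prefix the tracer prepends to the payload. */
	p = strstr(line, "device_id=");
	if (!p)
		return -1;

	/* %63[^\n] keeps messages that contain spaces. */
	if (sscanf(p, "device_id=%u app_id=%d msg=%63[^\n]",
		   &rec->device_id, &rec->app_id, rec->msg) != 3)
		return -1;

	return 0;
}
```

Searching for "device_id=" instead of assuming a fixed prefix keeps the
helper independent of the tracer's output options.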
drivers/misc/trinity/Makefile | 1 +
drivers/misc/trinity/trinity.c | 58 +++-
drivers/misc/trinity/trinity_trace.c | 15 +
drivers/misc/trinity/trinity_trace.h | 329 +++++++++++++++++++++
drivers/misc/trinity/trinity_vision2_drv.c | 9 +
5 files changed, 410 insertions(+), 2 deletions(-)
create mode 100644 drivers/misc/trinity/trinity_trace.c
create mode 100644 drivers/misc/trinity/trinity_trace.h
diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
index 462b7c61f39f..ac747bdbf46d 100644
--- a/drivers/misc/trinity/Makefile
+++ b/drivers/misc/trinity/Makefile
@@ -8,5 +8,6 @@ trinity-y += trinity_sched.o
trinity-y += trinity_debug.o
trinity-y += trinity_stat.o
trinity-y += trinity_sysfs.o
+trinity-y += trinity_trace.o
trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
index 0463140c0ae6..53c6ab92c26d 100644
--- a/drivers/misc/trinity/trinity.c
+++ b/drivers/misc/trinity/trinity.c
@@ -16,6 +16,7 @@
#include "trinity_common.h"
#include "trinity_sched.h"
#include "trinity_stat.h"
+#include "trinity_trace.h"
#define TRINITY_PADDR_BASE (0x0)
@@ -375,6 +376,8 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
sizeof((desc->ver))))
return -EFAULT;
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_GET_VERSION");
break;
}
case TRINITY_IOCTL_GET_API_LEVEL: {
@@ -384,6 +387,8 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
sizeof(api_level)))
return -EFAULT;
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_GET_API_LEVEL");
break;
}
case TRINITY_IOCTL_GET_STATE: {
@@ -394,6 +399,8 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
sizeof(ready)))
return -EFAULT;
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_GET_STATE");
break;
}
case TRINITY_IOCTL_GET_TOPS: {
@@ -401,6 +408,9 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
sizeof((drv->tops))))
return -EFAULT;
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_GET_TOPS");
+
break;
}
case TRINITY_IOCTL_GET_DSPM: {
@@ -408,6 +418,9 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
sizeof((drv->dspm))))
return -EFAULT;
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_GET_DSPM");
+
break;
}
case TRINITY_IOCTL_GET_NEXT_REQUEST: {
@@ -417,6 +430,9 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
sizeof(req_id)))
return -EFAULT;
+ trace_trinity_ioctl_next_req(drv->dev_id, trinity_get_app_id(),
+ req_id);
+
break;
}
case TRINITY_IOCTL_HWMEM_ALLOC: {
@@ -430,6 +446,9 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
if (err >= 0)
trinity_stat_app_total_alloc(drv, hwmem.size);
+ trace_trinity_ioctl_hwmem_alloc(
+ drv->dev_id, trinity_get_app_id(), hwmem.size, err);
+
break;
}
case TRINITY_IOCTL_HWMEM_DEALLOC: {
@@ -447,6 +466,9 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
if (err == 0)
trinity_stat_app_total_freed(drv, dbuf->size);
+ trace_trinity_ioctl_hwmem_dealloc(
+ drv->dev_id, trinity_get_app_id(), hwmem.dbuf_fd);
+
break;
}
case TRINITY_IOCTL_REGISTER_MODEL: {
@@ -471,6 +493,11 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
&model->config, sizeof(model->config)))
return -EFAULT;
+ trace_trinity_ioctl_register_model(
+ model->config.metadata_dbuf_fd,
+ model->config.metadata_ext_dbuf_fd,
+ model->config.metadata_ext_size);
+
break;
}
case TRINITY_IOCTL_DEREGISTER_MODEL: {
@@ -481,6 +508,8 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
err = trinity_deregister_model(drv, id);
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_DEREGISTER_MODEL");
break;
}
case TRINITY_IOCTL_RUN_INPUT: {
@@ -511,6 +540,11 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
return err;
}
+ trace_trinity_ioctl_run_input(
+ input->config.timeout_ms, input->config.priority,
+ input->config.num_segments, input->config.input_mode,
+ input->config.output_mode);
+
if (copy_to_user((struct trinity_input __user *)arg,
&input->config, sizeof(input->config))) {
drv->desc->dealloc_req(drv, req);
@@ -527,9 +561,16 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
if (!IDU_LOADED(drv))
return -EFAULT;
- if (drv->desc->stop_reqs)
+ if (drv->desc->stop_reqs) {
schedule_work(&drv->work_stop);
-
+ trace_trinity_ioctl_msg(drv->dev_id,
+ trinity_get_app_id(),
+ "TRINITY_IOCTL_STOP_REQUESTS");
+ } else {
+ trace_trinity_ioctl_msg(
+ drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_STOP_REQUESTS: not supported");
+ }
break;
}
case TRINITY_IOCTL_STAT_CURRENT_APP: {
@@ -546,6 +587,8 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
&ioctl_stat_app, sizeof(ioctl_stat_app)))
return -EACCES;
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_STAT_CURRENT_APP");
break;
}
case TRINITY_IOCTL_STAT_APPS: {
@@ -562,6 +605,8 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
&ioctl_stat_apps, sizeof(ioctl_stat_apps)))
return -EACCES;
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_STAT_APPS");
break;
}
case TRINITY_IOCTL_STAT_REQS: {
@@ -581,6 +626,8 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
&ioctl_stat_reqs, sizeof(ioctl_stat_reqs)))
return -EACCES;
+ trace_trinity_ioctl_msg(drv->dev_id, trinity_get_app_id(),
+ "TRINITY_IOCTL_STAT_REQS");
break;
}
case TRINITY_IOCTL_GET_PROFILE_META: {
@@ -606,6 +653,10 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
&profile, sizeof(profile)))
return -EACCES;
+ trace_trinity_ioctl_get_profile_meta(drv->dev_id,
+ trinity_get_app_id(),
+ profile.req_id,
+ profile.profile_size);
break;
}
case TRINITY_IOCTL_GET_PROFILE_BUFF: {
@@ -624,6 +675,9 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
&profile, sizeof(profile)))
return -EACCES;
+ trace_trinity_ioctl_get_profile_buff(
+ drv->dev_id, trinity_get_app_id(), profile.req_id,
+ profile.profile_pos, profile.profile_size);
break;
}
case TRINITY_IOCTL_IDU_SET: {
diff --git a/drivers/misc/trinity/trinity_trace.c b/drivers/misc/trinity/trinity_trace.c
new file mode 100644
index 000000000000..d5721273eeb1
--- /dev/null
+++ b/drivers/misc/trinity/trinity_trace.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Trace source for trinity devices
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __CHECKER__
+#define CREATE_TRACE_POINTS
+#include "trinity_trace.h"
+#endif
diff --git a/drivers/misc/trinity/trinity_trace.h b/drivers/misc/trinity/trinity_trace.h
new file mode 100644
index 000000000000..c4f03deeee90
--- /dev/null
+++ b/drivers/misc/trinity/trinity_trace.h
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Trace header for trinity devices
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#if !defined(__DRIVERS_MISC_TRINITY_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
+#define __DRIVERS_MISC_TRINITY_TRACE_H__
+
+#include <linux/tracepoint.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM trinity
+#define TRACE_INCLUDE_FILE trinity_trace
+
+// clang-format off
+TRACE_EVENT(triv2_run_trigger,
+ TP_PROTO(u32 device_id, s32 slot),
+ TP_ARGS(device_id, slot),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, slot)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->slot = slot;
+ ),
+ TP_printk("device_id=%u slot=%d",
+ __entry->device_id,
+ __entry->slot)
+);
+TRACE_EVENT(triv2_wakeup_cp,
+ TP_PROTO(u32 device_id),
+ TP_ARGS(device_id),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ ),
+ TP_printk("device_id=%u",
+ __entry->device_id)
+);
+TRACE_EVENT(triv2_handle_irq,
+ TP_PROTO(u32 device_id, s32 irq),
+ TP_ARGS(device_id, irq),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, irq)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->irq = irq;
+ ),
+ TP_printk("device_id=%u irq=%d",
+ __entry->device_id,
+ __entry->irq)
+);
+TRACE_EVENT(triv2_handle_threaded_irq,
+ TP_PROTO(u32 device_id, s32 irq),
+ TP_ARGS(device_id, irq),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, irq)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->irq = irq;
+ ),
+ TP_printk("device_id=%u irq=%d",
+ __entry->device_id,
+ __entry->irq)
+);
+TRACE_EVENT(triv2_handle_cmd_done,
+ TP_PROTO(u32 device_id, s32 slot, u32 cycles, u32 time),
+ TP_ARGS(device_id, slot, cycles, time),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, slot)
+ __field(u32, cycles)
+ __field(u32, time)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->slot = slot;
+ __entry->cycles = cycles;
+ __entry->time = time;
+ ),
+ TP_printk("device_id=%u slot=%d cycles=%u time(us)=%u",
+ __entry->device_id,
+ __entry->slot,
+ __entry->cycles,
+ __entry->time)
+);
+TRACE_EVENT(triv2_map_sched_data,
+ TP_PROTO(u32 device_id, s32 slot, u32 batch_size, u32 in_cnt, u32 out_cnt),
+ TP_ARGS(device_id, slot, batch_size, in_cnt, out_cnt),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, slot)
+ __field(u32, batch_size)
+ __field(u32, in_cnt)
+ __field(u32, out_cnt)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->slot = slot;
+ __entry->batch_size = batch_size;
+ __entry->in_cnt = in_cnt;
+ __entry->out_cnt = out_cnt;
+ ),
+ TP_printk("device_id=%u slot=%d batch_size=%u in_cnt=%u out_cnt=%u",
+ __entry->device_id,
+ __entry->slot,
+ __entry->batch_size,
+ __entry->in_cnt,
+ __entry->out_cnt)
+);
+TRACE_EVENT(triv2_unmap_sched_data,
+ TP_PROTO(u32 device_id, s32 slot),
+ TP_ARGS(device_id, slot),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, slot)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->slot = slot;
+ ),
+ TP_printk("device_id=%u slot=%d",
+ __entry->device_id,
+ __entry->slot)
+);
+TRACE_EVENT(trinity_ioctl_msg,
+ TP_PROTO(u32 device_id, s32 app_id, const char *msg),
+ TP_ARGS(device_id, app_id, msg),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __string(msg, msg)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __assign_str(msg, msg);
+ ),
+ TP_printk("device_id=%u app_id=%d msg=%s",
+ __entry->device_id,
+ __entry->app_id,
+ __get_str(msg))
+);
+TRACE_EVENT(trinity_ioctl_next_req,
+ TP_PROTO(u32 device_id, s32 app_id, s32 req_id),
+ TP_ARGS(device_id, app_id, req_id),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s32, req_id)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->req_id = req_id;
+ ),
+ TP_printk("device_id=%u app_id=%d req_id=%d",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->req_id)
+);
+TRACE_EVENT(trinity_ioctl_stop_req,
+ TP_PROTO(u32 device_id, s32 app_id, s32 req_id),
+ TP_ARGS(device_id, app_id, req_id),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s32, req_id)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->req_id = req_id;
+ ),
+ TP_printk("device_id=%u app_id=%d req_id=%d",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->req_id)
+);
+TRACE_EVENT(trinity_ioctl_hwmem_alloc,
+ TP_PROTO(u32 device_id, s32 app_id, s64 size, s32 dbuf_fd),
+ TP_ARGS(device_id, app_id, size, dbuf_fd),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s64, size)
+ __field(s32, dbuf_fd)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->size = size;
+ __entry->dbuf_fd = dbuf_fd;
+ ),
+ TP_printk("device_id=%u app_id=%d size=%lld dbuf_fd=%d",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->size,
+ __entry->dbuf_fd)
+);
+TRACE_EVENT(trinity_ioctl_hwmem_dealloc,
+ TP_PROTO(u32 device_id, s32 app_id, s32 dbuf_fd),
+ TP_ARGS(device_id, app_id, dbuf_fd),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s32, dbuf_fd)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->dbuf_fd = dbuf_fd;
+ ),
+ TP_printk("device_id=%u app_id=%d dbuf_fd=%d",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->dbuf_fd)
+);
+TRACE_EVENT(trinity_ioctl_get_profile_meta,
+ TP_PROTO(u32 device_id, s32 app_id, s32 req_id, u32 profile_size),
+ TP_ARGS(device_id, app_id, req_id, profile_size),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s32, req_id)
+ __field(u32, profile_size)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->req_id = req_id;
+ __entry->profile_size = profile_size;
+ ),
+ TP_printk("device_id=%u app_id=%d req_id=%d profile_size=%u",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->req_id,
+ __entry->profile_size)
+);
+TRACE_EVENT(trinity_ioctl_get_profile_buff,
+ TP_PROTO(u32 device_id, s32 app_id, s32 req_id, u32 profile_pos,
+ u32 profile_size),
+ TP_ARGS(device_id, app_id, req_id, profile_pos, profile_size),
+ TP_STRUCT__entry(
+ __field(u32, device_id)
+ __field(s32, app_id)
+ __field(s32, req_id)
+ __field(u32, profile_pos)
+ __field(u32, profile_size)
+ ),
+ TP_fast_assign(
+ __entry->device_id = device_id;
+ __entry->app_id = app_id;
+ __entry->req_id = req_id;
+ __entry->profile_pos = profile_pos;
+ __entry->profile_size = profile_size;
+ ),
+ TP_printk("device_id=%u app_id=%d req_id=%d profile_pos=%u profile_size=%u",
+ __entry->device_id,
+ __entry->app_id,
+ __entry->req_id,
+ __entry->profile_pos,
+ __entry->profile_size)
+);
+TRACE_EVENT(trinity_ioctl_register_model,
+ TP_PROTO(s32 metadata_dbuf_fd, s32 metadata_ext_dbuf_fd,
+ u64 metadata_ext_size),
+ TP_ARGS(metadata_dbuf_fd, metadata_ext_dbuf_fd, metadata_ext_size),
+ TP_STRUCT__entry(
+ __field(s32, metadata_dbuf_fd)
+ __field(s32, metadata_ext_dbuf_fd)
+ __field(u64, metadata_ext_size)
+ ),
+ TP_fast_assign(
+ __entry->metadata_dbuf_fd = metadata_dbuf_fd;
+ __entry->metadata_ext_dbuf_fd = metadata_ext_dbuf_fd;
+ __entry->metadata_ext_size = metadata_ext_size;
+ ),
+ TP_printk("metadata_dbuf_fd=%d metadata_ext_dbuf_fd=%d metadata_ext_size=0x%llx",
+ __entry->metadata_dbuf_fd,
+ __entry->metadata_ext_dbuf_fd,
+ __entry->metadata_ext_size)
+);
+TRACE_EVENT(trinity_ioctl_run_input,
+ TP_PROTO(s64 timeout_ms, u32 priority, u32 num_segments, s32 input_mode,
+ s32 output_mode),
+ TP_ARGS(timeout_ms, priority, num_segments, input_mode, output_mode),
+ TP_STRUCT__entry(
+ __field(s64, timeout_ms)
+ __field(u32, priority)
+ __field(u32, num_segments)
+ __field(s32, input_mode)
+ __field(s32, output_mode)
+ ),
+ TP_fast_assign(
+ __entry->timeout_ms = timeout_ms;
+ __entry->priority = priority;
+ __entry->num_segments = num_segments;
+ __entry->input_mode = input_mode;
+ __entry->output_mode = output_mode;
+ ),
+ TP_printk("timeout_ms=%lld priority=%u num_segments=%u input_mode=%d output_mode=%d",
+ __entry->timeout_ms,
+ __entry->priority,
+ __entry->num_segments,
+ __entry->input_mode,
+ __entry->output_mode)
+);
+// clang-format on
+
+#endif /* __DRIVERS_MISC_TRINITY_TRACE_H__ */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH ../../drivers/misc/trinity
+#include <trace/define_trace.h>
diff --git a/drivers/misc/trinity/trinity_vision2_drv.c b/drivers/misc/trinity/trinity_vision2_drv.c
index 111623322895..8299cb3e25c1 100644
--- a/drivers/misc/trinity/trinity_vision2_drv.c
+++ b/drivers/misc/trinity/trinity_vision2_drv.c
@@ -18,6 +18,7 @@
#include "trinity_common.h"
#include "trinity_sched.h"
+#include "trinity_trace.h"
#include "trinity_vision2_profile.h"
#include "trinity_vision2_regs.h"
@@ -396,6 +397,8 @@ static void triv2_wakeup_cp(const struct trinity_driver *drv)
void *addr =
trinity_get_iomem_addr(drv->mmreg_vaddr[0], OFFSET_CP_PROC_SET);
+ trace_triv2_wakeup_cp(drv->dev_id);
+
trinity_set_bit(BIT_SET_SEND_EVT1, addr);
}
@@ -482,6 +485,8 @@ static void triv2_run_trigger(const struct trinity_driver *drv, int slot)
struct triv2_cmd_info *cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
struct triv2_req *t_req = cmd_info->reqs[slot];
+ trace_triv2_run_trigger(drv->dev_id, slot);
+
if (!t_req) {
dev_err(drv_to_dev_ptr(drv),
"Unable to find the corresponding req");
@@ -546,6 +551,10 @@ static void triv2_handle_cmd_done(struct trinity_driver *drv,
req->stat->prev_cycles = cmd->total_cycles;
req->stat->num_runs++;
req->stat->total_time += req->stat->prev_time;
+
+ trace_triv2_handle_cmd_done(drv->dev_id, cmd->slot,
+ cmd->total_cycles,
+ req->stat->prev_time);
}
t_req->total_cycles = cmd->total_cycles;
--
2.25.1
This patch adds the profile module.
The Samsung NPU provides internal statistics data,
including memory read/write counts and the clock cycles
consumed by each operation. These statistics can be read
through ioctl commands.
Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
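Note for reviewers: the sketch below is a minimal userspace model, not
driver code, of the range check that the profile buffer ioctl applies to
a (profile_pos, profile_size) pair. It assumes each per-operation record
occupies 256 bytes (TRIV2_MAX_PROFILE_SIZE, the size of the reserved
member of the union in struct triv2_op_profile). All function names are
hypothetical. The comparison is written so that a large profile_pos
cannot wrap around in 32-bit arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed record size: the union pads each entry to 256 bytes. */
#define OP_PROFILE_SIZE 256u

/* Bytes of per-operation data for a command with total_ops entries. */
static uint32_t profile_total_size(uint32_t total_ops)
{
	return total_ops * OP_PROFILE_SIZE;
}

/* Same meaning as pos + size <= total, but immune to u32 wraparound. */
static bool profile_range_ok(uint32_t pos, uint32_t size, uint32_t total)
{
	return pos <= total && size <= total - pos;
}

/* Number of reads of at most chunk bytes needed to fetch total bytes. */
static uint32_t profile_num_chunks(uint32_t total, uint32_t chunk)
{
	assert(chunk > 0);
	return total / chunk + (total % chunk != 0);
}
```

A caller would typically query the metadata first to learn the total
size, then issue profile_num_chunks() partial reads, each validated the
same way as profile_range_ok().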
drivers/misc/trinity/trinity_vision2_drv.c | 326 ++++++++++++++++++
.../misc/trinity/trinity_vision2_profile.h | 324 +++++++++++++++++
2 files changed, 650 insertions(+)
create mode 100644 drivers/misc/trinity/trinity_vision2_profile.h
diff --git a/drivers/misc/trinity/trinity_vision2_drv.c b/drivers/misc/trinity/trinity_vision2_drv.c
index 3dd89920cdf5..111623322895 100644
--- a/drivers/misc/trinity/trinity_vision2_drv.c
+++ b/drivers/misc/trinity/trinity_vision2_drv.c
@@ -18,6 +18,7 @@
#include "trinity_common.h"
#include "trinity_sched.h"
+#include "trinity_vision2_profile.h"
#include "trinity_vision2_regs.h"
#define TRIV2_DRV_GET_PDATA(drv) ((struct triv2_pdata *)(drv->pdata))
@@ -146,6 +147,11 @@ struct triv2_pdata {
/* back buffer for context switching */
struct trinity_dma back_buf;
+
+ /* profiling */
+ struct trinity_dma prof_buf;
+ struct mutex prof_lock;
+ DECLARE_HASHTABLE(prof_htable, TRIV2_PROFILE_HASH_BITS);
};
static void triv2_idu_setup(struct trinity_driver *drv);
@@ -156,6 +162,150 @@ static void triv2_handle_cmd_done(struct trinity_driver *drv,
struct triv2_cmd *cmd, bool timeout);
static void triv2_setup_buffers(struct trinity_driver *drv);
+static const char *const triv2_op_names[] =
+ TRIV2_FOREACH_OPNAME(TRIV2_GENERATE_OPNAME);
+
+static struct triv2_profile *
+triv2_find_profile(const struct trinity_driver *drv, int req_id)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ unsigned long key = TRIV2_PROFILE_HASH_KEY(req_id);
+ struct triv2_profile *profile = NULL;
+
+ hash_for_each_possible(pdata->prof_htable, profile, hlist, key) {
+ if (profile->req_id == req_id)
+ break;
+ }
+
+ return profile;
+}
+
+static void triv2_fini_profile(struct device *dev, struct trinity_dma *prof_buf)
+{
+ if (!prof_buf->addr)
+ return;
+
+ trinity_dma_free(dev, prof_buf);
+ memset(prof_buf, '\x00', sizeof(*prof_buf));
+}
+
+static void triv2_init_profile(struct trinity_driver *drv,
+ unsigned long profile_size)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+ struct trinity_dma *prof_buf = TRIV2_DRV_GET_PROF_BUF(drv);
+
+ if (profile_size > 0) {
+ /* allocate profile buffer and enable it */
+ struct iommu_domain *domain;
+ phys_addr_t paddr;
+ int status;
+
+ triv2_fini_profile(dev, prof_buf);
+
+ status = trinity_dma_alloc(dev, profile_size, prof_buf);
+ if (status < 0) {
+ dev_err(dev,
+ "Couldn't allocate memory for profiling buffer: %d",
+ status);
+ return;
+ }
+
+ domain = iommu_get_domain_for_dev(drv_to_dev_ptr(drv));
+ paddr = trinity_get_paddr(domain, prof_buf->dma_handle);
+ iowrite32(TRIV2_IDU_ADDR(paddr),
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_ADDR));
+ iowrite32(prof_buf->size,
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_SIZE));
+ } else {
+ /* disable profiling */
+ triv2_fini_profile(dev, prof_buf);
+
+ iowrite32(0, trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_ADDR));
+ iowrite32(0, trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_SIZE));
+ }
+}
+
+static void triv2_assign_opnames(struct triv2_cmd_profile *cmd)
+{
+ struct triv2_op_profile *ops = cmd->profile_ops;
+ uint32_t i;
+
+ for (i = 0; i < cmd->total_ops; i++)
+ snprintf(ops[i].op_name, TRIV2_MAX_OPNAME, "%s",
+ triv2_op_names[ops[i].opcode]);
+}
+
+static int32_t triv2_check_profile(struct trinity_driver *drv,
+ struct trinity_req *req)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct triv2_req *t_req = TRIV2_GET_REQ(req);
+ struct trinity_dma *profile_buf;
+ struct triv2_cmd_profile *profile_cmd;
+ struct triv2_cmd_profile *profile_cmd_new;
+ struct triv2_profile *profile;
+
+ uint32_t offset = t_req->profile_offset;
+ uint32_t total_ops, total_size;
+
+ profile_buf = TRIV2_DRV_GET_PROF_BUF(drv);
+ if (!profile_buf->addr)
+ return 0;
+
+ if (profile_buf->size <= offset) {
+ dev_err(drv_to_dev_ptr(drv),
+ "Invalid profile offset detected: 0x%x", offset);
+ return -EINVAL;
+ }
+
+ profile_cmd = (struct triv2_cmd_profile *)((char *)profile_buf->addr +
+ offset);
+ profile_cmd->total_cycles = t_req->total_cycles;
+
+ total_ops = profile_cmd->total_ops;
+ total_size = sizeof(struct triv2_cmd_profile) +
+ total_ops * sizeof(struct triv2_op_profile);
+
+ profile_cmd_new = vzalloc(total_size);
+ if (!profile_cmd_new)
+ return -ENOMEM;
+
+ mutex_lock(&pdata->prof_lock);
+
+ profile = req->stat->profile;
+ if (profile) {
+ WARN_ON(!profile->data);
+ vfree(profile->data);
+ profile->data = profile_cmd_new;
+ } else {
+ int req_id = req->input.config.req_id;
+ unsigned long key = TRIV2_PROFILE_HASH_KEY(req_id);
+
+ profile = vzalloc(sizeof(struct triv2_profile));
+ if (!profile) {
+ vfree(profile_cmd_new);
+ mutex_unlock(&pdata->prof_lock);
+ return -ENOMEM;
+ }
+ profile->req_id = req_id;
+ profile->data = profile_cmd_new;
+
+ hash_add(pdata->prof_htable, &profile->hlist, key);
+
+ req->stat->profile = profile;
+ }
+ memcpy(profile_cmd_new, profile_cmd, total_size);
+ triv2_assign_opnames(profile_cmd_new);
+
+ mutex_unlock(&pdata->prof_lock);
+ return 0;
+}
+
/**
* triv2_get_state() - Get state (TRINITY_STATE_READY/TRINITY_STATE_PAUSE) of the device.
* @returns (enum triv2_state) TRINITY_STATE_READY (i.e., 1) or TRINITY_STATE_PAUSE (i.e., 0 )
@@ -447,6 +597,157 @@ static void triv2_stop_reqs(struct work_struct *work)
triv2_cancel_reqs(drv);
}
+/**
+ * triv2_get_profile_meta() - get profile metadata for the target req
+ */
+static int32_t triv2_get_profile_meta(const struct trinity_driver *drv,
+ struct trinity_ioctl_profile_meta *meta)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct triv2_profile *profile;
+ struct triv2_cmd_profile *profile_data;
+ int ret = 0;
+
+ mutex_lock(&pdata->prof_lock);
+
+ profile = triv2_find_profile(drv, meta->req_id);
+ if (!profile) {
+ ret = -ENOENT;
+ goto out;
+ }
+
+ profile_data = profile->data;
+ WARN_ON(!profile_data);
+
+ meta->total_cycles = profile_data->total_cycles;
+ meta->total_ops = profile_data->total_ops;
+ meta->profile_size =
+ profile_data->total_ops * sizeof(struct triv2_op_profile);
+ /* unsupported for now */
+ meta->input_footprint = -1;
+ meta->output_footprint = -1;
+
+out:
+ mutex_unlock(&pdata->prof_lock);
+
+ return ret;
+}
+
+/**
+ * triv2_get_profile_buff() - get profile buffer for the target req
+ */
+static int32_t triv2_get_profile_buff(const struct trinity_driver *drv,
+ struct trinity_ioctl_profile_buff *buff)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct triv2_profile *profile;
+ struct triv2_cmd_profile *profile_data;
+ uint32_t total_size;
+ int ret = 0;
+
+ mutex_lock(&pdata->prof_lock);
+
+ profile = triv2_find_profile(drv, buff->req_id);
+ if (!profile) {
+ ret = -ENOENT;
+ goto out;
+ }
+
+ profile_data = profile->data;
+ WARN_ON(!profile_data);
+
+ total_size = profile_data->total_ops * sizeof(struct triv2_op_profile);
+
+ if (buff->profile_pos > total_size ||
+ buff->profile_size > total_size - buff->profile_pos) {
+ dev_err(drv_to_dev_ptr(drv),
+ "Profile data out-of-range! pos(%u) size(%u) > total_size(%u)",
+ buff->profile_pos, buff->profile_size, total_size);
+ ret = -ERANGE;
+ goto out;
+ }
+
+ /* consider partial memory copies */
+ if (copy_to_user((char __user *)buff->profile_buf,
+ (char *)profile_data->profile_ops + buff->profile_pos,
+ buff->profile_size))
+ ret = -EFAULT;
+
+out:
+ mutex_unlock(&pdata->prof_lock);
+
+ return ret;
+}
+
+static ssize_t triv2_get_profile(const struct trinity_driver *drv, char *buf, int req_id)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct triv2_profile *profile;
+ struct triv2_cmd_profile *profile_data;
+ uint32_t i;
+ ssize_t len = 0;
+
+ mutex_lock(&pdata->prof_lock);
+
+ profile = triv2_find_profile(drv, req_id);
+ if (!profile) {
+ len += scnprintf(buf + len, PAGE_SIZE - len,
+ "Unable to find the profile data (req_id %d)", req_id);
+ goto out;
+ }
+
+ profile_data = profile->data;
+ WARN_ON(!profile_data);
+
+ len += scnprintf(buf + len, PAGE_SIZE - len, "Total cycles: %lld", profile_data->total_cycles);
+ len += scnprintf(buf + len, PAGE_SIZE - len, "Total ops: %u", profile_data->total_ops);
+
+ for (i = 0; i < profile_data->total_ops; i++) {
+ struct triv2_op_profile *op = &profile_data->profile_ops[i];
+
+ len += scnprintf(buf + len, PAGE_SIZE - len, "[%u] opcode: %u name:%s", i,
+ op->opcode, op->op_name);
+ len += scnprintf(buf + len, PAGE_SIZE - len, "\tcycles: %lld", op->cycles);
+ len += scnprintf(buf + len, PAGE_SIZE - len, "\tprog_seq: %lld", op->prog_seq);
+ len += scnprintf(buf + len, PAGE_SIZE - len, "\texec_seq: %lld", op->exec_seq);
+ if (op->dram_read > 0)
+ len += scnprintf(buf + len, PAGE_SIZE - len, "\tdram_read: %lld", op->dram_read);
+ if (op->dram_write > 0)
+ len += scnprintf(buf + len, PAGE_SIZE - len, "\tdram_write: %lld", op->dram_write);
+ if (op->sram_read > 0)
+ len += scnprintf(buf + len, PAGE_SIZE - len, "\tsram_read: %lld", op->sram_read);
+ if (op->sram_write > 0)
+ len += scnprintf(buf + len, PAGE_SIZE - len, "\tsram_write: %lld", op->sram_write);
+ }
+out:
+ mutex_unlock(&pdata->prof_lock);
+ return len;
+}
+
+/**
+ * triv2_destroy_profile() - destroy profile data
+ */
+static void triv2_destroy_profile(const struct trinity_driver *drv, void *data)
+{
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+ struct triv2_profile *profile = data;
+ struct triv2_cmd_profile *profile_data;
+
+ if (!profile)
+ return;
+
+ mutex_lock(&pdata->prof_lock);
+
+ profile_data = profile->data;
+ WARN_ON(!profile_data);
+ vfree(profile_data);
+
+ hash_del(&profile->hlist);
+ vfree(profile);
+
+ mutex_unlock(&pdata->prof_lock);
+}
+
static void triv2_handle_irq_cmds(struct trinity_driver *drv)
{
struct triv2_cmd_info *info;
@@ -1021,11 +1322,13 @@ static void triv2_setup_buffers(struct trinity_driver *drv)
struct iommu_domain *domain;
struct trinity_dma *cmd_buf;
struct trinity_dma *back_buf;
+ struct trinity_dma *prof_buf;
phys_addr_t paddr;
domain = iommu_get_domain_for_dev(dev);
cmd_buf = TRIV2_DRV_GET_CMD_BUF(drv);
back_buf = TRIV2_DRV_GET_BACK_BUF(drv);
+ prof_buf = TRIV2_DRV_GET_PROF_BUF(drv);
/* command */
paddr = trinity_get_paddr(domain, cmd_buf->dma_handle);
@@ -1038,6 +1341,22 @@ static void triv2_setup_buffers(struct trinity_driver *drv)
OFFSET_NPU_BACK_ADDR));
iowrite32(back_buf->size, trinity_get_iomem_addr(drv->mmreg_vaddr[0],
OFFSET_NPU_BACK_SIZE));
+
+ /* profile */
+ if (prof_buf->size > 0) {
+ paddr = trinity_get_paddr(domain, prof_buf->dma_handle);
+ iowrite32(TRIV2_IDU_ADDR(paddr),
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_ADDR));
+ iowrite32(prof_buf->size,
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_SIZE));
+ } else {
+ iowrite32(0, trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_ADDR));
+ iowrite32(0, trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_PROF_SIZE));
+ }
}
static int32_t triv2_init_pdata(struct trinity_driver *drv)
@@ -1203,6 +1522,13 @@ static struct trinity_desc triv2_desc = {
.dealloc_req = triv2_dealloc_req,
.prepare_req = triv2_prepare_req,
.invoke_req = triv2_invoke_req,
+ /* profile */
+ .init_profile = triv2_init_profile,
+ .check_profile = triv2_check_profile,
+ .get_profile_meta = triv2_get_profile_meta,
+ .get_profile_buff = triv2_get_profile_buff,
+ .get_profile = triv2_get_profile,
+ .destroy_profile = triv2_destroy_profile,
/* etc. */
.handle_timeout = triv2_handle_timeout,
.stop_reqs = triv2_stop_reqs,
diff --git a/drivers/misc/trinity/trinity_vision2_profile.h b/drivers/misc/trinity/trinity_vision2_profile.h
new file mode 100644
index 000000000000..7e5b169eca6b
--- /dev/null
+++ b/drivers/misc/trinity/trinity_vision2_profile.h
@@ -0,0 +1,324 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Profile header for TRIV2 devices
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __DRIVERS_MISC_TRINITY_VISION2_PROFILE_H__
+#define __DRIVERS_MISC_TRINITY_VISION2_PROFILE_H__
+
+#include <linux/types.h>
+
+#define TRIV2_MAX_OPNAME (128)
+#define TRIV2_MAX_PROFILE_SIZE (256)
+
+/**
+ * struct triv2_op_profile - A profile data per operation
+ *
+ * @op_name: name of the operation
+ * @cycles: total number of cycles
+ * @dram_read: a count for dram read
+ * @dram_write: a count for dram write
+ * @sram_read: a count for sram read
+ * @sram_write: a count for sram write
+ * @start_cycles: a count for starting cycles
+ * @end_cycles: a count for ending cycles
+ * @opcode: operation code
+ * @prog_seq: program sequence number
+ * @exec_seq: execution sequence number
+ * @reserved: reserved
+ */
+struct triv2_op_profile {
+ union {
+ struct {
+ char op_name[TRIV2_MAX_OPNAME];
+
+ int64_t cycles;
+
+ int64_t dram_read;
+ int64_t dram_write;
+
+ int64_t sram_read;
+ int64_t sram_write;
+
+ int64_t start_cycles;
+ int64_t end_cycles;
+
+ uint32_t opcode;
+ int64_t prog_seq;
+ int64_t exec_seq;
+ } __packed;
+ uint8_t reserved[TRIV2_MAX_PROFILE_SIZE];
+ };
+};
+
+/**
+ * struct triv2_cmd_profile - A profile data per command
+ *
+ * @total_cycles: total number of cycles for a command
+ * @total_ops: total operations of command
+ * @profile_ops: list of profile data for operations
+ */
+struct triv2_cmd_profile {
+ int64_t total_cycles;
+ uint32_t total_ops;
+ /* flexible array member */
+ struct triv2_op_profile profile_ops[];
+} __packed;
+
+/**
+ * struct triv2_profile - A profile data
+ *
+ * @req_id: id of the profiled request
+ * @hlist: list of profile data
+ * @data: command profile data
+ */
+struct triv2_profile {
+ int req_id;
+ struct hlist_node hlist;
+ struct triv2_cmd_profile *data;
+};
+
+enum {
+ NOP = 0x00,
+ HALT = 0x01,
+ ADMA_IN = 0x02,
+ ADMA_OUT = 0x03,
+ RESCALE_I8 = 0x04,
+ RESCALE_I16 = 0x05,
+ CONVERT_I16_I8 = 0x06,
+ CONVERT_I8_I16 = 0x07,
+ RELUN_I8 = 0x08,
+ RELUN_I16 = 0x09,
+ PRELU_I8 = 0x0A,
+ PRELU_I16 = 0x0B,
+ ADD_I8 = 0x0C,
+ ADD_I16 = 0x0D,
+ REDUCE_MEAN_I8 = 0x0E,
+ REDUCE_MEAN_I16 = 0x0F,
+ MAX_POOL_I8 = 0x10,
+ MAX_POOL_I16 = 0x11,
+ AVG_POOL_I8 = 0x12,
+ AVG_POOL_I16 = 0x13,
+ CONV_I8 = 0x14,
+ CONV_I16 = 0x15,
+ CONVE_I8 = 0x16,
+ CONVE_I16 = 0x17,
+ TCONV_I8 = 0x18,
+ TCONV_I16 = 0x19,
+ MUL_I8 = 0x1A,
+ MUL_I16 = 0x1B,
+ DCONV_I8 = 0x1C,
+ DCONV_I16 = 0x1D,
+ DCONVE_I8 = 0x1E,
+ DCONVE_I16 = 0x1F,
+ CONV_I8_P = 0x20,
+ CONV_I16_P = 0x21,
+ PDMA_IN = 0x40,
+ PDMA_OUT = 0x41,
+ ARGMAX_I8 = 0x42,
+ ARGMAX_I16 = 0x43,
+ RESHAPE_I8 = 0x44,
+ RESHAPE_I16 = 0x45,
+ TRANSPOSE_I8 = 0x46,
+ TRANSPOSE_I16 = 0x47,
+ CONCAT_I8 = 0x48,
+ CONCAT_I16 = 0x49,
+ PAD_I8 = 0x4A,
+ PAD_I16 = 0x4B,
+ STRIDED_SLICE_I8 = 0x4C,
+ STRIDED_SLICE_I16 = 0x4D,
+ CONVERT_FORMAT_I8 = 0x4E,
+ CONVERT_FORMAT_I16 = 0x4F,
+ SIGMOID_I8 = 0x50,
+ SIGMOID_I16 = 0x51,
+ TANH_I8 = 0x52,
+ TANH_I16 = 0x53,
+ ELU_I8 = 0x54,
+ ELU_I16 = 0x55,
+ FLOOR_I8 = 0x56,
+ FLOOR_I16 = 0x57,
+ RSQRT_I8 = 0x58,
+ RSQRT_I16 = 0x59,
+ SQRT_I8 = 0x5A,
+ SQRT_I16 = 0x5B,
+ SOFTMAX_I8 = 0x5C,
+ SOFTMAX_I16 = 0x5D,
+ DIVIDE_I8 = 0x60,
+ DIVIDE_I16 = 0x61,
+ FLOORDIV_I8 = 0x62,
+ FLOORDIV_I16 = 0x63,
+ LOGICAL_OR_I8 = 0x64,
+ LOGICAL_OR_I16 = 0x65,
+ GREATER_I8 = 0x66,
+ GREATER_I16 = 0x67,
+ GREATER_EQUAL_I8 = 0x68,
+ GREATER_EQUAL_I16 = 0x69,
+ POW_I8 = 0x6A,
+ POW_I16 = 0x6B,
+ EXP_I8 = 0x6C,
+ EXP_I16 = 0x6D,
+ NOT_EQUAL_I8 = 0x6E,
+ NOT_EQUAL_I16 = 0x6F,
+ BATCH_TO_SPACE_I8 = 0x70,
+ BATCH_TO_SPACE_I16 = 0x71,
+ SPACE_TO_BATCH_I8 = 0x72,
+ SPACE_TO_BATCH_I16 = 0x73,
+ DEPTH_TO_SPACE_I8 = 0x74,
+ DEPTH_TO_SPACE_I16 = 0x75,
+ SPACE_TO_DEPTH_I8 = 0x76,
+ SPACE_TO_DEPTH_I16 = 0x77,
+ YUV_TO_RGB_I8 = 0x7A,
+ YUV_TO_RGB_I16 = 0x7B,
+ RESIZE_BILINEAR_I8 = 0x7C,
+ RESIZE_BILINEAR_I16 = 0x7D,
+ RESIZE_NEAREST_NEIGHBOR_I8 = 0x7E,
+ RESIZE_NEAREST_NEIGHBOR_I16 = 0x7F,
+ LOCAL_RESPONSE_NORM_I8 = 0x80,
+ LOCAL_RESPONSE_NORM_I16 = 0x81,
+ INSTANCE_NORM_I8 = 0x82,
+ INSTANCE_NORM_I16 = 0x83,
+ REDUCED_SUM_SSUM_I8 = 0x84,
+ REDUCED_SUM_SSUM_I16 = 0x85,
+ REDUCED_SUM_SSUM_ACC_I8 = 0x86,
+ REDUCED_SUM_SSUM_ACC_I16 = 0x87,
+ REDUCED_SUM_2SUM_I8 = 0x88,
+ REDUCED_SUM_2SUM_I16 = 0x89,
+ REDUCED_MEAN_DEV_WSUM_I8 = 0x8A,
+ REDUCED_MEAN_DEV_WSUM_I16 = 0x8B,
+ REDUCED_MEAN_DEV_I8 = 0x8C,
+ REDUCED_MEAN_DEV_I16 = 0x8D,
+ RESCALE_CW_I8 = 0x8E,
+ RESCALE_CW_I16 = 0x8F,
+ REDUCED_MEAN_SCALE_WSUM_I8 = 0x90,
+ REDUCED_MEAN_SCALE_WSUM_I16 = 0x91,
+ RESCALE_CHANNELWISE_I8 = 0x92,
+ RESCALE_CHANNELWISE_I16 = 0x93,
+};
+
+/** generate opnames */
+#define TRIV2_GENERATE_OPNAME(OPNAME) \
+ [OPNAME] = #OPNAME,
+
+#define TRIV2_FOREACH_OPNAME(GEN) {\
+ GEN(NOP) \
+ GEN(HALT) \
+ GEN(ADMA_IN) \
+ GEN(ADMA_OUT) \
+ GEN(RESCALE_I8) \
+ GEN(RESCALE_I16) \
+ GEN(CONVERT_I16_I8) \
+ GEN(CONVERT_I8_I16) \
+ GEN(RELUN_I8) \
+ GEN(RELUN_I16) \
+ GEN(PRELU_I8) \
+ GEN(PRELU_I16) \
+ GEN(ADD_I8) \
+ GEN(ADD_I16) \
+ GEN(REDUCE_MEAN_I8) \
+ GEN(REDUCE_MEAN_I16) \
+ GEN(MAX_POOL_I8) \
+ GEN(MAX_POOL_I16) \
+ GEN(AVG_POOL_I8) \
+ GEN(AVG_POOL_I16) \
+ GEN(CONV_I8) \
+ GEN(CONV_I16) \
+ GEN(CONVE_I8) \
+ GEN(CONVE_I16) \
+ GEN(TCONV_I8) \
+ GEN(TCONV_I16) \
+ GEN(MUL_I8) \
+ GEN(MUL_I16) \
+ GEN(DCONV_I8) \
+ GEN(DCONV_I16) \
+ GEN(DCONVE_I8) \
+ GEN(DCONVE_I16) \
+ GEN(CONV_I8_P) \
+ GEN(CONV_I16_P) \
+ GEN(PDMA_IN) \
+ GEN(PDMA_OUT) \
+ GEN(ARGMAX_I8) \
+ GEN(ARGMAX_I16) \
+ GEN(RESHAPE_I8) \
+ GEN(RESHAPE_I16) \
+ GEN(TRANSPOSE_I8) \
+ GEN(TRANSPOSE_I16) \
+ GEN(CONCAT_I8) \
+ GEN(CONCAT_I16) \
+ GEN(PAD_I8) \
+ GEN(PAD_I16) \
+ GEN(STRIDED_SLICE_I8) \
+ GEN(STRIDED_SLICE_I16) \
+ GEN(CONVERT_FORMAT_I8) \
+ GEN(CONVERT_FORMAT_I16) \
+ GEN(SIGMOID_I8) \
+ GEN(SIGMOID_I16) \
+ GEN(TANH_I8) \
+ GEN(TANH_I16) \
+ GEN(ELU_I8) \
+ GEN(ELU_I16) \
+ GEN(FLOOR_I8) \
+ GEN(FLOOR_I16) \
+ GEN(RSQRT_I8) \
+ GEN(RSQRT_I16) \
+ GEN(SQRT_I8) \
+ GEN(SQRT_I16) \
+ GEN(SOFTMAX_I8) \
+ GEN(SOFTMAX_I16) \
+ GEN(DIVIDE_I8) \
+ GEN(DIVIDE_I16) \
+ GEN(FLOORDIV_I8) \
+ GEN(FLOORDIV_I16) \
+ GEN(LOGICAL_OR_I8) \
+ GEN(LOGICAL_OR_I16) \
+ GEN(GREATER_I8) \
+ GEN(GREATER_I16) \
+ GEN(GREATER_EQUAL_I8) \
+ GEN(GREATER_EQUAL_I16) \
+ GEN(POW_I8) \
+ GEN(POW_I16) \
+ GEN(EXP_I8) \
+ GEN(EXP_I16) \
+ GEN(NOT_EQUAL_I8) \
+ GEN(NOT_EQUAL_I16) \
+ GEN(BATCH_TO_SPACE_I8) \
+ GEN(BATCH_TO_SPACE_I16) \
+ GEN(SPACE_TO_BATCH_I8) \
+ GEN(SPACE_TO_BATCH_I16) \
+ GEN(DEPTH_TO_SPACE_I8) \
+ GEN(DEPTH_TO_SPACE_I16) \
+ GEN(SPACE_TO_DEPTH_I8) \
+ GEN(SPACE_TO_DEPTH_I16) \
+ GEN(YUV_TO_RGB_I8) \
+ GEN(YUV_TO_RGB_I16) \
+ GEN(RESIZE_BILINEAR_I8) \
+ GEN(RESIZE_BILINEAR_I16) \
+ GEN(RESIZE_NEAREST_NEIGHBOR_I8) \
+ GEN(RESIZE_NEAREST_NEIGHBOR_I16) \
+ GEN(LOCAL_RESPONSE_NORM_I8) \
+ GEN(LOCAL_RESPONSE_NORM_I16) \
+ GEN(INSTANCE_NORM_I8) \
+ GEN(INSTANCE_NORM_I16) \
+ GEN(REDUCED_SUM_SSUM_I8) \
+ GEN(REDUCED_SUM_SSUM_I16) \
+ GEN(REDUCED_SUM_SSUM_ACC_I8) \
+ GEN(REDUCED_SUM_SSUM_ACC_I16) \
+ GEN(REDUCED_SUM_2SUM_I8) \
+ GEN(REDUCED_SUM_2SUM_I16) \
+ GEN(REDUCED_MEAN_DEV_WSUM_I8) \
+ GEN(REDUCED_MEAN_DEV_WSUM_I16) \
+ GEN(REDUCED_MEAN_DEV_I8) \
+ GEN(REDUCED_MEAN_DEV_I16) \
+ GEN(RESCALE_CW_I8) \
+ GEN(RESCALE_CW_I16) \
+ GEN(REDUCED_MEAN_SCALE_WSUM_I8) \
+ GEN(REDUCED_MEAN_SCALE_WSUM_I16) \
+ GEN(RESCALE_CHANNELWISE_I8) \
+ GEN(RESCALE_CHANNELWISE_I16) \
+}
+#endif /* __DRIVERS_MISC_TRINITY_VISION2_PROFILE_H__ */
--
2.25.1
This patch adds the NPU scheduler interface.
The scheduler pushes tasks to the NPU in order. The default
scheduling algorithm uses a priority policy.
When requests are invoked, it calculates each priority from the time
remaining until timeout and submits requests to the NPU in priority
order. It waits until the completion interrupt arrives from the NPU,
and then pushes the next request.
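The priority policy described above can be sketched in plain user-space C.
This is a minimal illustration, not the driver code: struct sketch_req,
sketch_calc_pri() and sketch_pick_req() are hypothetical names, timestamps
are passed in as plain milliseconds instead of ktime_t, and an array stands
in for the llist. As in the patch, a smaller value means a higher priority,
and a request without a timeout gets the value 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a pending request; not struct trinity_req. */
struct sketch_req {
	int64_t timeout_ms;	/* 0 means "no timeout" */
	int64_t started_ms;	/* timestamp taken when the request started */
};

/* Remaining time to timeout; a smaller value means a higher priority. */
static int64_t sketch_calc_pri(const struct sketch_req *req, int64_t now_ms)
{
	int64_t remaining;

	assert(req != NULL);

	if (req->timeout_ms == 0)
		return 0;

	remaining = req->timeout_ms - (now_ms - req->started_ms);

	/* An already expired request is clamped to the minimum value. */
	return remaining < 0 ? 0 : remaining;
}

/* Linear scan; returns the index of the most urgent request, or -1. */
static int sketch_pick_req(const struct sketch_req *reqs, size_t n,
			   int64_t now_ms)
{
	int64_t best = INT64_MAX;
	int best_idx = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		int64_t pri = sketch_calc_pri(&reqs[i], now_ms);

		if (pri < best) {
			best = pri;
			best_idx = (int)i;
		}
	}

	return best_idx;
}
```

A linear scan keeps the sketch close to the patch, which also walks the
pending list once per pick instead of keeping it sorted.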
Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
drivers/misc/trinity/Makefile | 1 +
drivers/misc/trinity/trinity.c | 1 +
drivers/misc/trinity/trinity_sched.c | 338 +++++++++++++++++++++
drivers/misc/trinity/trinity_sched.h | 24 ++
drivers/misc/trinity/trinity_vision2_drv.c | 1 +
5 files changed, 365 insertions(+)
create mode 100644 drivers/misc/trinity/trinity_sched.c
create mode 100644 drivers/misc/trinity/trinity_sched.h
diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
index 5d2b75112482..2a8c4fed135e 100644
--- a/drivers/misc/trinity/Makefile
+++ b/drivers/misc/trinity/Makefile
@@ -4,5 +4,6 @@ obj-$(CONFIG_TRINITY_VISION2) += trinity_vision2.o
trinity-y := trinity.o
trinity-y += trinity_dma.o trinity_hwmem.o
+trinity-y += trinity_sched.o
trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
index 3e8157dd4664..0c75eb13967c 100644
--- a/drivers/misc/trinity/trinity.c
+++ b/drivers/misc/trinity/trinity.c
@@ -14,6 +14,7 @@
#include <linux/of_reserved_mem.h>
#include "trinity_common.h"
+#include "trinity_sched.h"
#define TRINITY_PADDR_BASE (0x0)
diff --git a/drivers/misc/trinity/trinity_sched.c b/drivers/misc/trinity/trinity_sched.c
new file mode 100644
index 000000000000..6e19841b345d
--- /dev/null
+++ b/drivers/misc/trinity/trinity_sched.c
@@ -0,0 +1,338 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * NPU scheduler for trinity requests
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/kthread.h>
+#include <linux/spinlock.h>
+
+#include "trinity_common.h"
+
+struct trinity_sched_data {
+ struct llist_head req_queue;
+ wait_queue_head_t wait_queue;
+ struct task_struct *sched_thread;
+ struct mutex lock;
+ unsigned long suspended;
+};
+
+/**
+ * sched_calc_pri() - Calculate priority using timeout
+ */
+static unsigned long sched_calc_pri(struct trinity_req *req)
+{
+ ktime_t elapsed_time;
+ int64_t priority;
+
+ if (req->input.config.timeout_ms == 0)
+ return 0; /** @todo need preemption */
+
+ elapsed_time = ktime_to_ms(ktime_sub(ktime_get(), req->time_started));
+ WARN_ON(elapsed_time < 0);
+
+ /**
+ * if the elapsed time exceeds the timeout of req,
+ * its priority value is set to the minimum (highest).
+ */
+ priority = req->input.config.timeout_ms - elapsed_time;
+ if (priority < 0)
+ priority = 0;
+
+ return priority;
+}
+
+/**
+ * sched_pick_req() - Pick the top-priority request from request queue
+ */
+static struct trinity_req *sched_pick_req(struct llist_head *queue)
+{
+ struct trinity_req *req, *req_prev;
+ struct trinity_req *top_req, *top_req_prev;
+ int64_t top_priority = S64_MAX;
+ unsigned long priority;
+
+ if (llist_empty(queue))
+ return NULL;
+
+ req = req_prev = NULL;
+ top_req = top_req_prev = NULL;
+
+ /**
+ * llist is not a doubly linked list, and sorting is not easy
+ * because llist provides only limited APIs.
+ * A linear scan can beat sorting when only a few reqs are pending.
+ * Note that each user application can submit only one req at once.
+ */
+ llist_for_each_entry(req, queue->first, llist) {
+ priority = sched_calc_pri(req);
+ if (top_priority > priority) {
+ top_priority = priority;
+ top_req = req;
+ top_req_prev = req_prev;
+ }
+
+ req_prev = req;
+ }
+
+ if (top_req_prev) {
+ WARN_ON(!top_req);
+ top_req_prev->llist.next = top_req->llist.next;
+ } else {
+ /** it's first entry */
+ top_req = llist_entry(llist_del_first(queue), typeof(*(req)),
+ llist);
+ }
+
+ return top_req;
+}
+
+/**
+ * llist_last() - Get the last node of the list
+ */
+static struct llist_node *llist_last(struct llist_node *first)
+{
+ struct llist_node *last = first;
+
+ while (first && first->next) {
+ last = first->next;
+ first = last;
+ }
+
+ return last;
+}
+
+/**
+ * sched_run_req() - Schedules a req to the target from the req queue
+ *
+ * @req: Request information to be submitted.
+ *
+ * Return: 0 on success. Otherwise, returns negative error. Additional status of
+ * the submitted req could be passed by req->status.
+ */
+static int32_t sched_run_req(struct trinity_req *req, struct trinity_sched_data *sched)
+{
+ struct trinity_driver *drv = req->drv;
+ struct device *dev = drv_to_dev_ptr(drv);
+ int32_t err = 0;
+ int32_t ready;
+
+ /** setup is only allowed in ready state */
+ ready = drv->desc->get_state(drv);
+ if (ready != TRINITY_STATE_READY) {
+ dev_err(dev,
+ "Cannot setup NPU when it's in a non-ready state");
+ err = -EPERM;
+ goto out;
+ }
+
+ if (req->stat->status != TRINITY_REQ_STATUS_PENDING &&
+ req->stat->status != TRINITY_REQ_STATUS_FINISHED) {
+ dev_err(dev, "Invalid req status: %d",
+ req->stat->status);
+ err = -EINVAL;
+ goto out;
+ }
+
+ req->stat->status = TRINITY_REQ_STATUS_RUNNING;
+ err = drv->desc->invoke_req(drv, req, NULL);
+out:
+ if (err != 0)
+ req->stat->status = TRINITY_REQ_STATUS_ERROR;
+
+ return err;
+}
+
+/**
+ * sched_thread_func() - Scheduler thread function
+ */
+static int sched_thread_func(void *data)
+{
+ const unsigned long MAX_RETRY_COUNT = 100;
+ struct trinity_sched_data *sched;
+ struct llist_head local_queue;
+ struct llist_node *new_first;
+
+ sched = data;
+ init_llist_head(&local_queue);
+
+repeat:
+ if (kthread_should_stop())
+ return 0;
+
+ /** extract requests from global queue without locking */
+ new_first = llist_del_all(&sched->req_queue);
+ /** new and pending requests could be located together */
+ if (new_first) {
+ struct llist_node *new_last = llist_last(new_first);
+
+ llist_add_batch(new_first, new_last, &local_queue);
+ }
+
+ /** flush requests in the queue */
+ while (!llist_empty(&local_queue)) {
+ struct trinity_req *req;
+ int32_t ret;
+
+ /**
+ * pick the top-priority request from the queue.
+ * first and last node pointers are updated
+ */
+ req = sched_pick_req(&local_queue);
+ if (!req)
+ goto repeat;
+
+ mutex_lock(&sched->lock);
+ ret = sched_run_req(req, sched);
+ mutex_unlock(&sched->lock);
+
+ /** do not modify or access 'req' except in an error case.
+ * it could be released by the interrupt.
+ */
+ if (ret == -EBUSY) {
+ if (req->submit_retry >= MAX_RETRY_COUNT) {
+ /** give up handling this req */
+ complete_all(&req->complete);
+ } else {
+ req->submit_retry++;
+ /** push again and restart the loop */
+ llist_add(&req->llist, &local_queue);
+ }
+ goto repeat;
+ } else if (ret != 0) {
+ /** let's notify this unknown error */
+ complete_all(&req->complete);
+ }
+ }
+
+ /** ensure the local queue is empty */
+ WARN_ON(!llist_empty(&local_queue));
+
+ wait_event_interruptible(
+ sched->wait_queue,
+ kthread_should_stop() ||
+ !llist_empty(&sched->req_queue));
+ goto repeat;
+}
+
+/**
+ * trinity_sched_ready() - Check whether the scheduler is ready
+ *
+ * @drv: an instance of trinity driver
+ */
+bool trinity_sched_ready(struct trinity_driver *drv)
+{
+ struct trinity_sched_data *sched = drv->sched_pdata;
+
+ return (test_bit(1, &sched->suspended) != 1);
+}
+
+/**
+ * trinity_sched_submit() - Submit request to scheduler
+ *
+ * @drv: an instance of trinity driver
+ * @req: request to be submitted
+ *
+ * Return: returns 0 on Success, otherwise returns negative error
+ */
+int32_t trinity_sched_submit(struct trinity_driver *drv, struct trinity_req *req)
+{
+ struct trinity_sched_data *sched = drv->sched_pdata;
+
+ if (!req)
+ return -EINVAL;
+
+ if (!trinity_sched_ready(drv))
+ return -EAGAIN;
+
+ llist_add(&req->llist, &sched->req_queue);
+ wake_up(&sched->wait_queue);
+
+ return 0;
+}
+
+/**
+ * trinity_sched_notify() - Finish the request and notify that it was handled
+ */
+void trinity_sched_notify(struct trinity_req *req, bool timeout)
+{
+ req->scheduled = false;
+ req->timeout = timeout;
+}
+
+/**
+ * trinity_sched_suspend() - Suspend scheduler
+ *
+ * @drv: an instance of trinity driver
+ */
+void trinity_sched_suspend(struct trinity_driver *drv)
+{
+ struct trinity_sched_data *sched = drv->sched_pdata;
+
+ if (!test_and_set_bit(1, &sched->suspended))
+ mutex_lock(&sched->lock);
+}
+
+/**
+ * trinity_sched_resume() - Resume scheduler
+ *
+ * @drv: an instance of trinity driver
+ */
+void trinity_sched_resume(struct trinity_driver *drv)
+{
+ struct trinity_sched_data *sched = drv->sched_pdata;
+
+ if (test_and_clear_bit(1, &sched->suspended))
+ mutex_unlock(&sched->lock);
+}
+
+/**
+ * trinity_sched_init() - Initialize trinity task schedulers
+ *
+ * @dev: an instance of the device
+ * Return: returns 0 on Success, otherwise returns negative error
+ */
+int trinity_sched_init(struct device *dev)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ struct trinity_sched_data *sched;
+
+ sched = devm_kzalloc(dev, sizeof(*sched), GFP_KERNEL);
+ if (!sched)
+ return -ENOMEM;
+
+ init_llist_head(&sched->req_queue);
+ init_waitqueue_head(&sched->wait_queue);
+
+ mutex_init(&sched->lock);
+ clear_bit(1, &sched->suspended);
+
+ sched->sched_thread =
+ kthread_run(sched_thread_func, sched, "trinity_sched_thread");
+ if (IS_ERR(sched->sched_thread)) {
+ dev_err(dev,
+ "Failed to create a thread for scheduler");
+ return PTR_ERR(sched->sched_thread);
+ }
+ drv->sched_pdata = sched;
+
+ return 0;
+}
+
+/**
+ * trinity_sched_exit() - Exit trinity task schedulers
+ *
+ * @dev: an instance of the device
+ */
+void trinity_sched_exit(struct device *dev)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+
+ if (drv->sched_pdata)
+ devm_kfree(dev, drv->sched_pdata);
+}
diff --git a/drivers/misc/trinity/trinity_sched.h b/drivers/misc/trinity/trinity_sched.h
new file mode 100644
index 000000000000..751d82d4374e
--- /dev/null
+++ b/drivers/misc/trinity/trinity_sched.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * NPU scheduler for trinity requests
+ *
+ * Copyright (C) 2021-2022 Samsung Electronics
+ * Copyright (C) 2021 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __DRIVERS_MISC_TRINITY_SCHED_H__
+#define __DRIVERS_MISC_TRINITY_SCHED_H__
+
+bool trinity_sched_ready(struct trinity_driver *drv);
+int32_t trinity_sched_submit(struct trinity_driver *drv,
+ struct trinity_req *req);
+void trinity_sched_notify(struct trinity_req *req, bool timeout);
+void trinity_sched_suspend(struct trinity_driver *drv);
+void trinity_sched_resume(struct trinity_driver *drv);
+int32_t trinity_sched_init(struct device *dev);
+void trinity_sched_exit(struct device *dev);
+
+#endif /* __DRIVERS_MISC_TRINITY_SCHED_H__ */
diff --git a/drivers/misc/trinity/trinity_vision2_drv.c b/drivers/misc/trinity/trinity_vision2_drv.c
index 4bfc7f97769c..70b8b6fd5843 100644
--- a/drivers/misc/trinity/trinity_vision2_drv.c
+++ b/drivers/misc/trinity/trinity_vision2_drv.c
@@ -16,6 +16,7 @@
#include <linux/utsname.h>
#include "trinity_common.h"
+#include "trinity_sched.h"
#include "trinity_vision2_regs.h"
#define TRIV2_DRV_GET_PDATA(drv) ((struct triv2_pdata *)(drv->pdata))
--
2.25.1
This patch implements the request and PM features.
Trinity requests are created by ioctl and invoked by the
scheduler. Each request is prepared to run on the NPU, including
segment allocation and command setup. Requests are managed by
the command structure, which keeps the information needed to
manage each request's lifecycle.
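The command slot part of that lifecycle can be sketched as a small bitmap
allocator. This is a simplified user-space illustration under stated
assumptions: struct sketch_cmd_info, SKETCH_MAX_SLOTS and both helpers are
hypothetical names, a single 32-bit word stands in for the kernel bitmap,
and the spin lock that the driver takes around these steps is left out.
Where the sketch returns -1, the driver returns -EBUSY so that the
scheduler can retry the request later.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_SLOTS 32	/* hypothetical; not TRIV2_MAX_CMDSLOTS */

struct sketch_cmd_info {
	uint32_t bitmap;		/* bit i set: slot i is in flight */
	int req_ids[SKETCH_MAX_SLOTS];	/* request bound to each slot */
};

/* Claim the lowest free slot for req_id; return the slot, or -1 if full. */
static int sketch_claim_slot(struct sketch_cmd_info *info, int req_id)
{
	int slot;

	for (slot = 0; slot < SKETCH_MAX_SLOTS; slot++) {
		if (!(info->bitmap & (UINT32_C(1) << slot))) {
			info->bitmap |= UINT32_C(1) << slot;
			info->req_ids[slot] = req_id;
			return slot;
		}
	}

	return -1;
}

/* Release a slot once its command has completed or timed out. */
static void sketch_release_slot(struct sketch_cmd_info *info, int slot)
{
	assert(slot >= 0 && slot < SKETCH_MAX_SLOTS);

	info->bitmap &= ~(UINT32_C(1) << slot);
	info->req_ids[slot] = -1;
}
```

Keeping the slot state in a bitmap lets the completion path find finished
commands by scanning only the set bits.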
Power management operations are also provided; runtime suspend is
controlled with pm_runtime_allow and pm_runtime_forbid.
Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
drivers/misc/trinity/trinity.c | 62 +++
drivers/misc/trinity/trinity_vision2_drv.c | 605 +++++++++++++++++++++
2 files changed, 667 insertions(+)
diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
index 0fb5ccf9f035..0463140c0ae6 100644
--- a/drivers/misc/trinity/trinity.c
+++ b/drivers/misc/trinity/trinity.c
@@ -664,6 +664,23 @@ long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
*/
int trinity_release(struct inode *inode, struct file *file)
{
+ struct trinity_driver *drv;
+
+ drv = file->private_data;
+
+ trinity_stat_app_set_status(drv, TRINITY_APP_STATUS_TERMINATED);
+ /* block newly incoming requests */
+ trinity_sched_suspend(drv);
+
+ /* wait for already submitted requests */
+ if (drv->desc->drain_reqs)
+ drv->desc->drain_reqs(drv);
+
+ /* deregister models owned by this device handle */
+ trinity_deregister_models_owned(drv);
+
+ trinity_sched_resume(drv);
+
return 0;
}
@@ -735,6 +752,27 @@ int trinity_open(struct inode *inode, struct file *f)
return 0;
}
+static void trinity_common_init(struct device *dev)
+{
+ trinity_model_htable_init(dev);
+
+ if (trinity_debug_init() < 0)
+ dev_warn(dev, "Unable to initialize debugfs\n");
+
+ if (trinity_sched_init(dev) < 0)
+ dev_warn(dev, "Unable to initialize scheduler\n");
+
+ if (trinity_dma_init(dev) < 0)
+ dev_warn(dev, "Failed to init DMA memory\n");
+}
+
+static void trinity_common_exit(struct device *dev)
+{
+ trinity_dma_exit(dev);
+ trinity_debug_exit();
+ trinity_sched_exit(dev);
+}
+
/**
* trinity_create_node() - Create trinity node
*
@@ -868,9 +906,27 @@ int trinity_probe(struct platform_device *pdev, const struct trinity_desc *desc)
mutex_init(&drv->lock);
INIT_WORK(&drv->work_stop, desc->stop_reqs);
+ trinity_common_init(dev);
+
+ err = trinity_sysfs_init(drv);
+ if (err < 0) {
+ dev_err(dev, "failed to initialize sysfs for a trinity device");
+ goto err_cleanup_common;
+ }
+
+ err = trinity_debug_add(drv);
+ if (err < 0) {
+ dev_err(dev,
+ "failed to add a debugging feature to the trinity device");
+ trinity_sysfs_cleanup(drv);
+ goto err_cleanup_common;
+ }
+ trinity_stat_init(drv);
+
return 0;
err_cleanup_common:
+ trinity_common_exit(dev);
devm_free_irq(dev, drv->irq, &drv->mdev);
err_cleanup:
@@ -895,6 +951,12 @@ int trinity_remove(struct platform_device *pdev,
struct trinity_driver *drv = platform_get_drvdata(pdev);
struct device *dev = drv_to_dev_ptr(drv);
+ trinity_stat_fini(drv);
+ trinity_debug_remove(drv);
+ trinity_sysfs_cleanup(drv);
+
+ trinity_common_exit(dev);
+
ida_free(&dev_nrs, drv->dev_id);
devm_free_irq(dev, drv->irq, &drv->mdev);
devm_kfree(dev, drv);
diff --git a/drivers/misc/trinity/trinity_vision2_drv.c b/drivers/misc/trinity/trinity_vision2_drv.c
index 70b8b6fd5843..3dd89920cdf5 100644
--- a/drivers/misc/trinity/trinity_vision2_drv.c
+++ b/drivers/misc/trinity/trinity_vision2_drv.c
@@ -13,6 +13,7 @@
#include <linux/delay.h>
#include <linux/hashtable.h>
#include <linux/of_device.h>
+#include <linux/pm_runtime.h>
#include <linux/utsname.h>
#include "trinity_common.h"
@@ -151,6 +152,9 @@ static void triv2_idu_setup(struct trinity_driver *drv);
static void triv2_idu_unset(struct trinity_driver *drv);
static int32_t triv2_idu_set(struct trinity_driver *drv,
struct trinity_ioctl_idu *config);
+static void triv2_handle_cmd_done(struct trinity_driver *drv,
+ struct triv2_cmd *cmd, bool timeout);
+static void triv2_setup_buffers(struct trinity_driver *drv);
/**
* triv2_get_state() - Get state (TRINITY_STATE_READY/TRINITY_STATE_PAUSE) of the device.
@@ -212,6 +216,31 @@ static void triv2_set_state(const struct trinity_driver *drv,
}
}
+/**
+ * triv2_sync_segt_entries() - synchronize the segment table entries
+ */
+static int triv2_sync_segt_entries(const struct trinity_driver *drv,
+ struct triv2_req *req)
+{
+#ifdef ARM
+ struct trinity_input *input = &(req->req.input);
+ int i;
+
+ /* flush all caches for heavy models */
+ if (req->total_segment_size > TRIV2_CACHE_FLUSH_THRESHOLD ||
+ /* cannot handle external segments for kernel requests */
+ req->kernel != NULL) {
+ flush_cache_all();
+ return 0;
+ }
+
+ for (i = 0; i < input->config.num_segments; ++i)
+ __cpuc_flush_dcache_area(req->seg_import[i].addr,
+ req->seg_import[i].buf->size);
+#endif
+ return 0;
+}
+
static void triv2_wakeup_cp(const struct trinity_driver *drv)
{
void *addr =
@@ -220,36 +249,552 @@ static void triv2_wakeup_cp(const struct trinity_driver *drv)
trinity_set_bit(BIT_SET_SEND_EVT1, addr);
}
+static void triv2_cancel_reqs(struct trinity_driver *drv)
+{
+ struct triv2_cmd_info *info;
+ struct triv2_cmd *cmd;
+ unsigned long flags;
+ int slot;
+
+ info = TRIV2_DRV_GET_CMD_INFO(drv);
+ spin_lock_irqsave(&info->lock, flags);
+
+ slot = find_first_bit(info->bitmap, TRIV2_MAX_CMDSLOTS);
+ while (slot < TRIV2_MAX_CMDSLOTS) {
+ cmd = TRIV2_GET_CMD_FROM_SLOT(info, slot);
+ triv2_handle_cmd_done(drv, cmd, true);
+ slot = find_next_bit(info->bitmap, TRIV2_MAX_CMDSLOTS,
+ slot + 1);
+ }
+
+ spin_unlock_irqrestore(&info->lock, flags);
+}
+
+static void triv2_drain_reqs(struct trinity_driver *drv)
+{
+ struct triv2_cmd_info *info;
+ unsigned long flags;
+ int cur_retries, max_retries = 1000; /* 1-sec */
+ int slot;
+
+ cur_retries = 0;
+ info = TRIV2_DRV_GET_CMD_INFO(drv);
+retry:
+ spin_lock_irqsave(&info->lock, flags);
+
+ /* wait until all bits are unset */
+ slot = find_first_bit(info->bitmap, TRIV2_MAX_CMDSLOTS);
+ if (slot < TRIV2_MAX_CMDSLOTS) {
+ spin_unlock_irqrestore(&info->lock, flags);
+
+ usleep_range(900, 1100);
+ if (cur_retries++ < max_retries)
+ goto retry;
+
+ spin_lock_irqsave(&info->lock, flags);
+ }
+
+ spin_unlock_irqrestore(&info->lock, flags);
+}
+
static void triv2_reset(struct trinity_driver *drv)
{
+ struct device *dev = drv_to_dev_ptr(drv);
struct triv2_pdata *pdata = drv->pdata;
mutex_lock(&pdata->drv->lock);
+ /* block runtime pm suspend */
+ pm_runtime_forbid(dev);
+
+ /* block new incoming requests first */
+ trinity_sched_suspend(drv);
+
triv2_cancel_reqs(pdata->drv);
msleep(100);
triv2_setup_buffers(drv);
triv2_idu_unset(drv);
+ /* resume scheduler */
+ trinity_sched_resume(drv);
+
+ pm_runtime_allow(dev);
+
mutex_unlock(&pdata->drv->lock);
}
+/**
+ * triv2_run_trigger() - trigger memory-mapped register for inference running
+ */
+static void triv2_run_trigger(const struct trinity_driver *drv, int slot)
+{
+ struct triv2_cmd_info *cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+ struct triv2_req *t_req = cmd_info->reqs[slot];
+
+ if (!t_req) {
+ dev_err(drv_to_dev_ptr(drv),
+ "Unable to find the corresponding req");
+ return;
+ }
+
+ if (triv2_sync_segt_entries(drv, t_req) < 0)
+ dev_err(drv_to_dev_ptr(drv),
+ "Unable to sync the segment table");
+
+ /* sync the current bitmap */
+ iowrite32(*cmd_info->bitmap,
+ trinity_get_iomem_addr(drv->mmreg_vaddr[0],
+ OFFSET_NPU_CMD_REQ));
+
+ t_req->req.stat->scheduled = ktime_get();
+ t_req->req.stat->completed = 0;
+ t_req->req.scheduled = true;
+ t_req->req.timeout = false;
+
+ /* trigger the event (we do not assume that IDU always accepts this event) */
+ triv2_wakeup_cp(drv);
+}
+
+static void triv2_clear_cmd(struct trinity_driver *drv, struct triv2_req *req,
+ struct triv2_cmd *cmd)
+{
+ struct triv2_cmd_info *cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+
+ cmd_info->reqs[req->cmd_slot] = NULL;
+ clear_bit(req->cmd_slot, cmd_info->bitmap);
+ req->cmd_slot = -1;
+
+ memset(cmd, '\x00', sizeof(struct triv2_cmd));
+}
+
+static void triv2_handle_cmd_done(struct trinity_driver *drv,
+ struct triv2_cmd *cmd, bool timeout)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+ struct triv2_cmd_info *cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+ struct triv2_req *t_req;
+ struct trinity_req *req;
+ uint32_t slot = cmd->slot;
+ int64_t time_diff;
+
+ t_req = cmd_info->reqs[slot];
+ if (!t_req) {
+ dev_err(dev, "Failed to find the req\n");
+ return;
+ }
+
+ req = &(t_req->req);
+ req->stat->completed = ktime_get();
+ req->stat->status = TRINITY_REQ_STATUS_FINISHED;
+
+ time_diff = TIME_DIFF_US(req->stat->completed, req->stat->scheduled);
+ if (time_diff < 0) {
+ dev_warn(dev, "Detected invalid inference time of request\n");
+ } else {
+ req->stat->prev_time = (uint32_t)time_diff;
+ req->stat->prev_cycles = cmd->total_cycles;
+ req->stat->num_runs++;
+ req->stat->total_time += req->stat->prev_time;
+ }
+
+ t_req->total_cycles = cmd->total_cycles;
+ t_req->profile_offset = cmd->profile_offset;
+
+ triv2_clear_cmd(drv, t_req, cmd);
+
+ /* notify to the scheduler */
+ trinity_sched_notify(req, timeout);
+
+ /* notify to the caller */
+ if (!req->is_kernel)
+ complete_all(&req->complete);
+}
+
+static void triv2_handle_timeout(struct trinity_driver *drv,
+ struct trinity_req *req)
+{
+ struct triv2_cmd_info *cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+ struct triv2_cmd *cmd;
+ struct triv2_req *t;
+ unsigned long flags;
+
+ t = TRIV2_GET_REQ(req);
+
+ spin_lock_irqsave(&cmd_info->lock, flags);
+ if (t->cmd_slot >= 0) {
+ /* Timeout! Check that it was not already handled in the irq handler */
+ cmd = TRIV2_GET_CMD_FROM_SLOT(cmd_info, t->cmd_slot);
+ triv2_handle_cmd_done(drv, cmd, true);
+ }
+ spin_unlock_irqrestore(&cmd_info->lock, flags);
+}
+
+/**
+ * triv2_stop_reqs() - stop the submitted reqs to the driver
+ *
+ * In case of an already-executed req, each device needs to determine the
+ * policy depending on its capability to terminate the running one.
+ */
+static void triv2_stop_reqs(struct work_struct *work)
+{
+ struct trinity_driver *drv;
+
+ drv = container_of(work, struct trinity_driver, work_stop);
+ if (drv == NULL)
+ return;
+
+ triv2_cancel_reqs(drv);
+}
+
+static void triv2_handle_irq_cmds(struct trinity_driver *drv)
+{
+ struct triv2_cmd_info *info;
+ struct triv2_cmd *cmd;
+ unsigned long flags;
+ int slot;
+
+ info = TRIV2_DRV_GET_CMD_INFO(drv);
+ spin_lock_irqsave(&info->lock, flags);
+
+ /** Search the bitmap to find the completed CMDs */
+ slot = find_first_bit(info->bitmap, TRIV2_MAX_CMDSLOTS);
+ while (slot < TRIV2_MAX_CMDSLOTS) {
+ cmd = TRIV2_GET_CMD_FROM_SLOT(info, slot);
+ if (cmd->status == STATUS_CMD_DONE)
+ triv2_handle_cmd_done(drv, cmd, false);
+ slot = find_next_bit(info->bitmap, TRIV2_MAX_CMDSLOTS,
+ slot + 1);
+ }
+
+ spin_unlock_irqrestore(&info->lock, flags);
+}
+
+/**
+ * triv2_handle_irq() - An IRQ handler to be called when a registered IRQ (IRQ_OUT) occurs.
+ */
+static irqreturn_t triv2_handle_irq(int irq_no, void *dev_id)
+{
+ struct miscdevice *_mdev;
+ struct trinity_driver *drv;
+ void __iomem *addr;
+ uint32_t interrupt;
+ uint32_t reg;
+
+ _mdev = (struct miscdevice *)dev_id;
+ drv = container_of(_mdev, struct trinity_driver, mdev);
+
+ /**
+ * Verify that the IRQ is actually from the NPU
+ * This is required as IRQF_SHARED is used when setting up IRQ
+ */
+ addr = trinity_get_iomem_addr(drv->mmreg_vaddr[2],
+ OFFSET_CBOX_EXT_IRQ_STA);
+ reg = ioread32(addr);
+
+ interrupt = reg & MASK_CP_SWI_STA;
+ if (interrupt == 0)
+ return IRQ_NONE;
+
+ /** Clear the interrupt first */
+ addr = trinity_get_iomem_addr(drv->mmreg_vaddr[2],
+ OFFSET_CBOX_CP_SWI_CLR);
+ iowrite32(1, addr);
+
+ triv2_handle_irq_cmds(drv);
+ return IRQ_HANDLED;
+}
+
+/**
+ * triv2_prepare_req() - evaluate the physical address of entries in the segment table
+ */
+static int32_t triv2_prepare_req(struct trinity_driver *drv,
+ struct trinity_req *req)
+{
+ struct triv2_req *t = TRIV2_GET_REQ(req);
+ struct trinity_input *input = &(req->input);
+ struct trinity_hwmem_import *segt_import = &(input->import_info);
+ int32_t *segtable_dbuffd_base;
+ uint32_t *segtable_extra_base;
+ int ret, i;
+
+ if (input->config.num_segments == 0)
+ return -EINVAL;
+
+ if (input->config.num_segments > TRIV2_MAX_SEGMENTS)
+ return -ERANGE;
+
+ t->seg_import = kcalloc(input->config.num_segments,
+ sizeof(struct trinity_hwmem_import),
+ GFP_KERNEL);
+ if (!t->seg_import)
+ return -ENOMEM;
+
+ /* dmabuf fd to be resolved */
+ segtable_dbuffd_base = segt_import->addr;
+ /* extra value (e.g., offset or size) */
+ segtable_extra_base = segt_import->addr + HALF_PAGE_SIZE;
+
+#ifdef ARM
+ /* sync segment table */
+ __cpuc_flush_dcache_area(input->import_info.addr,
+ input->import_info.buf->size);
+#endif
+
+ for (i = 0; i < input->config.num_segments; ++i) {
+ struct trinity_hwmem_import *import;
+ int32_t fd = segtable_dbuffd_base[i];
+ dma_addr_t daddr;
+
+ if (fd < 0) {
+ uint32_t idx = (uint32_t)((fd + 1) * -1);
+ struct triv2_kernel_req *kreq;
+
+ /* it's for kernel input/output */
+ if (!req->is_kernel) {
+ req->is_kernel = true;
+ kreq = kzalloc(sizeof(*kreq), GFP_KERNEL);
+ if (!kreq) {
+ ret = -ENOMEM;
+ goto err;
+ }
+ t->kernel = kreq;
+ }
+
+ kreq = t->kernel;
+ if (idx < TRIV2_MAX_TENSORS) {
+ kreq->in_seg_idx[idx] = i;
+ kreq->in_seg_size[idx] = segtable_extra_base[i];
+ t->total_segment_size += kreq->in_seg_size[idx];
+ } else if (idx < TRIV2_MAX_TENSORS * 2) {
+ idx -= TRIV2_MAX_TENSORS;
+ kreq->out_seg_idx[idx] = i;
+ kreq->out_seg_size[idx] =
+ segtable_extra_base[i];
+ t->total_segment_size +=
+ kreq->out_seg_size[idx];
+ } else {
+ dev_err(drv_to_dev_ptr(drv),
+ "Invalid external segment (idx: %u)",
+ idx);
+ ret = -EINVAL;
+ goto err;
+ }
+ continue;
+ }
+
+ import = &(t->seg_import[i]);
+ ret = trinity_hwmem_import_dmabuf_begin(drv_to_dev_ptr(drv), fd,
+ import);
+ if (ret) {
+ dev_err(drv_to_dev_ptr(drv),
+ "%d-th segment with fd (%d) seems invalid: %d",
+ i, fd, ret);
+ goto err;
+ }
+
+ t->total_segment_size += import->buf->size;
+
+ /** @todo Use a local ptr variable */
+ daddr = import->dma_addr;
+ daddr += segtable_extra_base[i];
+
+ iowrite32(TRIV2_IDU_ADDR(daddr),
+ segt_import->addr + i * sizeof(u32));
+ }
+
+ /* set the dma address of DSPM (reserved index: TRIV2_MAX_SEGMENTS - 1) */
+ if (drv->dspm > 0) {
+ struct triv2_pdata *pdata = TRIV2_DRV_GET_PDATA(drv);
+
+ iowrite32(TRIV2_IDU_ADDR(pdata->idu_dsp.dspm),
+ segt_import->addr +
+ (TRIV2_MAX_SEGMENTS - 1) * sizeof(u32));
+ }
+
+ return 0;
+
+err:
+ kfree(t->seg_import);
+ t->seg_import = NULL;
+ return ret;
+}
+
+/**
+ * triv2_prepare_cmd() - Prepare command info. for the target req before invoking
+ */
+static int32_t triv2_prepare_cmd(struct trinity_driver *drv,
+ struct trinity_req *req)
+{
+ struct triv2_cmd_info *cmd_info;
+ struct triv2_cmd cmd = { 0 };
+ struct triv2_req *t;
+
+ const struct trinity_model *model = req->model;
+ const struct trinity_input *input = &req->input;
+
+ int32_t slot;
+ struct iommu_domain *domain;
+ phys_addr_t paddr;
+ unsigned long flags;
+
+ /** Note that the program base is not behind iommu */
+ domain = iommu_get_domain_for_dev(drv_to_dev_ptr(drv));
+
+ paddr = trinity_get_paddr(domain, model->import_info.dma_addr);
+ cmd.prog_addr = TRIV2_IDU_ADDR(paddr);
+ cmd.prog_addr += model->config.program_offset_addr;
+ cmd.prog_size = model->config.program_size;
+
+ paddr = trinity_get_paddr(domain, input->import_info.dma_addr);
+ cmd.segt_addr = TRIV2_IDU_ADDR(paddr);
+ cmd.num_visa = model->config.num_visa_insts;
+
+ cmd.priority = input->config.priority;
+ cmd.input_mode = input->config.input_mode;
+ cmd.output_mode = input->config.output_mode;
+
+ /** Find an empty cmd slot in the bitmap (needs a spin lock) */
+ cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+ t = TRIV2_GET_REQ(req);
+
+ spin_lock_irqsave(&cmd_info->lock, flags);
+
+ slot = find_first_zero_bit(cmd_info->bitmap, TRIV2_MAX_CMDSLOTS);
+ if (slot < TRIV2_MAX_CMDSLOTS) {
+ set_bit(slot, cmd_info->bitmap);
+ cmd_info->reqs[slot] = t;
+ t->cmd_slot = slot;
+ }
+
+ spin_unlock_irqrestore(&cmd_info->lock, flags);
+
+ /** Will be retried (rely on platform device's scheduling) */
+ if (slot >= TRIV2_MAX_CMDSLOTS)
+ return -EBUSY;
+
+ cmd.slot = slot;
+ cmd.status = STATUS_CMD_READY;
+
+ memcpy(cmd_info->buf.addr + slot * sizeof(struct triv2_cmd), &cmd,
+ sizeof(struct triv2_cmd));
+
+ return slot;
+}
+
+/**
+ * triv2_invoke_req() - Invoke a req on the device. Note that all configurations
+ * required by running should be done before invocation of this function.
+ */
+static int32_t triv2_invoke_req(struct trinity_driver *drv,
+ struct trinity_req *req, void *sched_data)
+{
+ enum trinity_output_mode mode;
+ int32_t slot;
+
+ mode = req->input.config.output_mode;
+ slot = triv2_prepare_cmd(drv, req);
+ if (slot < 0)
+ return slot;
+
+ if (mode == TRINITY_OUTPUT_HW || mode == TRINITY_OUTPUT_CPU_POLL ||
+ mode == TRINITY_OUTPUT_CPU_INTR) {
+ triv2_run_trigger(drv, slot);
+ } else {
+ dev_err(drv_to_dev_ptr(drv), "Invalid output mode: %d\n", mode);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static struct trinity_req *triv2_alloc_req(struct trinity_driver *drv)
+{
+ struct triv2_req *t_req;
+
+ t_req = kzalloc(sizeof(struct triv2_req), GFP_KERNEL);
+ if (!t_req)
+ return NULL;
+
+ t_req->cmd_slot = -1;
+
+ if (atomic_fetch_inc(&drv->active_reqs) == 0)
+ trinity_wait_ready(drv);
+
+ return &(t_req->req);
+}
+
+static void triv2_dealloc_req(struct trinity_driver *drv,
+ struct trinity_req *req)
+{
+ struct triv2_req *t_req = TRIV2_GET_REQ(req);
+
+ if (atomic_dec_return(&drv->active_reqs) == 0)
+ trinity_set_pause(drv);
+
+ if (t_req->seg_import) {
+ struct trinity_hwmem_import *import;
+ uint32_t i;
+
+ for (i = 0; i < req->input.config.num_segments; i++) {
+ import = &(t_req->seg_import[i]);
+ if (import->addr)
+ trinity_hwmem_import_dmabuf_end(import);
+ }
+ kfree(t_req->seg_import);
+ }
+
+ kfree(t_req->kernel);
+ kfree(t_req);
+}
+
+static long triv2_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+ struct trinity_driver *drv = f->private_data;
+ struct device *dev = drv_to_dev_ptr(drv);
+ long ret;
+
+ pm_runtime_forbid(dev);
+
+ ret = trinity_ioctl(f, cmd, arg);
+
+ pm_runtime_allow(dev);
+
+ return ret;
+}
+
static int triv2_open(struct inode *inode, struct file *f)
{
+ struct miscdevice *miscdev;
+ struct trinity_driver *drv;
+ struct device *dev;
int ret;
+ miscdev = (struct miscdevice *)f->private_data;
+ drv = container_of(miscdev, struct trinity_driver, mdev);
+ dev = drv_to_dev_ptr(drv);
+
+ pm_runtime_forbid(dev);
+
ret = trinity_open(inode, f);
+ pm_runtime_allow(dev);
+
return ret;
}
static int triv2_release(struct inode *inode, struct file *f)
{
+ struct trinity_driver *drv = f->private_data;
+ struct device *dev = drv_to_dev_ptr(drv);
int ret;
+ pm_runtime_forbid(dev);
+
ret = trinity_release(inode, f);
+ pm_runtime_allow(dev);
+
return ret;
}
@@ -515,6 +1060,7 @@ static int32_t triv2_init_pdata(struct trinity_driver *drv)
cmd_buf = TRIV2_DRV_GET_CMD_BUF(drv);
back_buf = TRIV2_DRV_GET_BACK_BUF(drv);
+ mutex_init(&pdata->prof_lock);
spin_lock_init(&cmd_info->lock);
/* init cmd bitmap */
bitmap_zero(cmd_info->bitmap, TRIV2_MAX_CMDSLOTS);
@@ -657,6 +1203,64 @@ static struct trinity_desc triv2_desc = {
.dealloc_req = triv2_dealloc_req,
.prepare_req = triv2_prepare_req,
.invoke_req = triv2_invoke_req,
+ /* etc. */
+ .handle_timeout = triv2_handle_timeout,
+ .stop_reqs = triv2_stop_reqs,
+ .drain_reqs = triv2_drain_reqs,
+ .handle_irq = triv2_handle_irq,
+};
+
+static int triv2_suspend(struct device *dev)
+{
+ return 0;
+}
+
+static int triv2_resume(struct device *dev)
+{
+ return 0;
+}
+
+static int triv2_runtime_suspend(struct device *dev)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+
+ mutex_lock(&drv->lock);
+
+ /* 1) Ensure that the scheduler was suspended */
+ trinity_sched_suspend(drv);
+
+ /* 2) Set pause state if it's in ready state */
+ if (triv2_get_state(drv) == TRINITY_STATE_READY)
+ triv2_set_state(drv, TRINITY_STATE_PAUSE);
+
+ mutex_unlock(&drv->lock);
+
+ return 0;
+}
+
+static int triv2_runtime_resume(struct device *dev)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+
+ mutex_lock(&drv->lock);
+
+ /* 1) Restore IDU setup */
+ triv2_setup_buffers(drv);
+ triv2_idu_setup(drv);
+
+ /* 2) Resume the req scheduler */
+ trinity_sched_resume(drv);
+
+ mutex_unlock(&drv->lock);
+
+ return 0;
+}
+
+static const struct dev_pm_ops triv2_dev_pm_ops = {
+ // clang-format off
+ SET_SYSTEM_SLEEP_PM_OPS(triv2_suspend, triv2_resume)
+ SET_RUNTIME_PM_OPS(triv2_runtime_suspend, triv2_runtime_resume, NULL)
+ // clang-format on
};
static const struct of_device_id trinity_match[] = {
@@ -734,6 +1338,7 @@ static struct platform_driver trinity_triv2 = {
.name = "triv2",
.owner = THIS_MODULE,
.of_match_table = of_match_ptr(trinity_match),
+ .pm = &triv2_dev_pm_ops,
},
};
--
2.25.1
This patch adds the debugfs feature.
It creates a debugfs entry for each driver and provides APIs to dump
messages, model info and input info. The default debugfs directory
is named 'trinity', and each driver has its own
debug file.
Signed-off-by: Jiho Chu <[email protected]>
---
drivers/misc/trinity/Makefile | 1 +
drivers/misc/trinity/trinity_debug.c | 331 +++++++++++++++++++++++++++
2 files changed, 332 insertions(+)
create mode 100644 drivers/misc/trinity/trinity_debug.c
diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
index 2a8c4fed135e..5d3e89dd0dd7 100644
--- a/drivers/misc/trinity/Makefile
+++ b/drivers/misc/trinity/Makefile
@@ -5,5 +5,6 @@ obj-$(CONFIG_TRINITY_VISION2) += trinity_vision2.o
trinity-y := trinity.o
trinity-y += trinity_dma.o trinity_hwmem.o
trinity-y += trinity_sched.o
+trinity-y += trinity_debug.o
trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
diff --git a/drivers/misc/trinity/trinity_debug.c b/drivers/misc/trinity/trinity_debug.c
new file mode 100644
index 000000000000..9add728a101b
--- /dev/null
+++ b/drivers/misc/trinity/trinity_debug.c
@@ -0,0 +1,331 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/**
+ * Implementation of debug functions for trinity drivers
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/debugfs.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "trinity_common.h"
+
+#define TRINITY_DEVVER(drv) (drv->desc->ver >> TRINITY_SHIFT_DEV)
+#define TRINITY_DEBUGFS_DIR ("trinity")
+#define TRINITY_DEBUGFS_MAX (1024UL)
+#define TRINITY_DEBUGFS_LENGTH (255)
+
+struct trinity_debugfs_msg {
+ char msg[TRINITY_DEBUGFS_LENGTH + 1]; /* including the terminating NUL */
+};
+
+struct trinity_debugfs_entry {
+ struct dentry *dentry;
+ spinlock_t lock;
+
+ unsigned long msg_max;
+ unsigned long msg_num;
+ unsigned long msg_off;
+
+ struct trinity_dma msg_buf;
+};
+
+static struct dentry *trinity_debugfs;
+
+static size_t trinity_debug_append_app_id(struct trinity_driver *drv, char *msg)
+{
+ return snprintf(msg, TRINITY_DEBUGFS_LENGTH, "[%d] ",
+ trinity_get_app_id());
+}
+
+static char *trinity_debug_get_msg_buf(struct trinity_driver *drv)
+{
+ struct trinity_debugfs_entry *entry = drv->debugfs_pdata;
+ struct trinity_debugfs_msg *buf;
+
+ if (!entry || entry->msg_max == 0)
+ return NULL;
+
+ spin_lock(&entry->lock);
+ if (entry->msg_num == entry->msg_max) {
+ buf = &((struct trinity_debugfs_msg *)
+ entry->msg_buf.addr)[entry->msg_off];
+ entry->msg_off = (entry->msg_off + 1) % entry->msg_max;
+ } else {
+ buf = &((struct trinity_debugfs_msg *)
+ entry->msg_buf.addr)[entry->msg_num++];
+ }
+ spin_unlock(&entry->lock);
+
+ memset(buf, '\x00', sizeof(*buf));
+ return buf->msg;
+}
+
+/**
+ * trinity_debug_dump_msg() - Dump trinity debug message
+ *
+ * @drv: an instance of the trinity driver
+ * @fmt: tag message format
+ */
+void trinity_debug_dump_msg(struct trinity_driver *drv, const char *fmt, ...)
+{
+ char *msg;
+ size_t len;
+ va_list args;
+
+ msg = trinity_debug_get_msg_buf(drv);
+ if (msg == NULL)
+ return;
+
+ len = trinity_debug_append_app_id(drv, msg);
+
+ va_start(args, fmt);
+ len += vsnprintf(msg + len, TRINITY_DEBUGFS_LENGTH - len, fmt, args);
+ va_end(args);
+}
+
+/**
+ * trinity_debug_dump_model() - Dump trinity model data
+ *
+ * @drv: an instance of the trinity driver
+ * @model: an instance of the trinity model
+ * @fmt: tag message format
+ */
+void trinity_debug_dump_model(struct trinity_driver *drv,
+ const struct trinity_model *model,
+ const char *fmt, ...)
+{
+ char *msg;
+ size_t len;
+ va_list args;
+
+ msg = trinity_debug_get_msg_buf(drv);
+ if (msg == NULL)
+ return;
+
+ len = trinity_debug_append_app_id(drv, msg);
+
+ va_start(args, fmt);
+ len += vsnprintf(msg + len, TRINITY_DEBUGFS_LENGTH - len, fmt, args);
+ va_end(args);
+
+ len += snprintf(msg + len, TRINITY_DEBUGFS_LENGTH - len,
+ "\n\tid(0x%llx) dbuf_fd(%d) program_offset_addr(0x%llx) program_size(0x%llx)\n",
+ model->config.id, model->config.dbuf_fd,
+ model->config.program_offset_addr, model->config.program_size);
+ len += snprintf(msg + len, TRINITY_DEBUGFS_LENGTH - len,
+ "\tmetadata_dbuf_fd(%d) metadata_ext_dbuf_fd(%d) metadata_ext_size(0x%llx)",
+ model->config.metadata_dbuf_fd,
+ model->config.metadata_ext_dbuf_fd,
+ model->config.metadata_ext_size);
+}
+
+/**
+ * trinity_debug_dump_input() - Dump trinity input data
+ *
+ * @drv: an instance of the trinity driver
+ * @input: an instance of the trinity input
+ * @fmt: tag message format
+ */
+void trinity_debug_dump_input(struct trinity_driver *drv,
+ const struct trinity_input *input,
+ const char *fmt, ...)
+{
+ char *msg;
+ size_t len;
+ va_list args;
+
+ msg = trinity_debug_get_msg_buf(drv);
+ if (msg == NULL)
+ return;
+
+ len = trinity_debug_append_app_id(drv, msg);
+
+ va_start(args, fmt);
+ len += vsnprintf(msg + len, TRINITY_DEBUGFS_LENGTH - len, fmt, args);
+ va_end(args);
+
+ len += snprintf(msg + len, TRINITY_DEBUGFS_LENGTH - len,
+ "\n\tdbuf_fd(%d) model_id(0x%llx)\n",
+ input->config.dbuf_fd, input->config.model_id);
+ len += snprintf(msg + len, TRINITY_DEBUGFS_LENGTH - len,
+ "\ttimeout_ms(%lld) priority(%u) num_segments(%u) input_mode(%d) output_mode(%d)",
+ input->config.timeout_ms, input->config.priority,
+ input->config.num_segments, input->config.input_mode,
+ input->config.output_mode);
+}
+
+static int trinity_debugfs_show(struct seq_file *s, void *unused)
+{
+ struct trinity_driver *drv = s->private;
+ struct trinity_debugfs_entry *entry = drv->debugfs_pdata;
+ struct trinity_debugfs_msg *msg;
+ unsigned long i, offset;
+
+ spin_lock(&entry->lock);
+ for (i = 0; i < entry->msg_num; i++) {
+ offset = (entry->msg_off + i) % entry->msg_max;
+ msg = &((struct trinity_debugfs_msg *)
+ entry->msg_buf.addr)[offset];
+
+ seq_puts(s, msg->msg);
+ seq_puts(s, "\n");
+ }
+ spin_unlock(&entry->lock);
+
+ return 0;
+}
+
+static int trinity_debugfs_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, trinity_debugfs_show, inode->i_private);
+}
+
+static const struct file_operations trinity_debugfs_fops = {
+ .open = trinity_debugfs_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+/**
+ * trinity_debug_add() - Add trinity debug file system entry
+ *
+ * @drv: an instance of the trinity driver
+ */
+int trinity_debug_add(struct trinity_driver *drv)
+{
+ struct trinity_debugfs_entry *entry;
+ struct dentry *dentry;
+ const char *name = drv->name;
+
+ if (name == NULL)
+ return -EINVAL;
+
+ entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+ if (!entry)
+ return -ENOMEM;
+
+ dentry = debugfs_create_file_unsafe(name, 0400, trinity_debugfs, drv,
+ &trinity_debugfs_fops);
+ if (IS_ERR(dentry)) {
+ kfree(entry);
+ return PTR_ERR(dentry);
+ }
+
+ entry->dentry = dentry;
+ spin_lock_init(&entry->lock);
+
+ drv->debugfs_pdata = entry;
+
+ return 0;
+}
+
+/**
+ * trinity_debug_remove() - Remove trinity debug file system entry
+ *
+ * @drv: an instance of the trinity driver
+ */
+void trinity_debug_remove(struct trinity_driver *drv)
+{
+ struct trinity_debugfs_entry *entry = drv->debugfs_pdata;
+
+ trinity_debug_clear(drv, 0);
+
+ debugfs_remove(entry->dentry);
+ kfree(entry);
+
+ drv->debugfs_pdata = NULL;
+}
+
+/**
+ * trinity_debug_clear() - Clear debug message entity
+ *
+ * @drv: an instance of the trinity driver
+ * @msg_max: reset max size of debug message entity
+ */
+void trinity_debug_clear(struct trinity_driver *drv, unsigned long msg_max)
+{
+ struct trinity_debugfs_entry *entry = drv->debugfs_pdata;
+ struct device *dev = drv_to_dev_ptr(drv);
+ size_t size;
+
+ /* maximum size limit: 256KiB */
+ if (msg_max > TRINITY_DEBUGFS_MAX) {
+ dev_err(dev, "Too many debugfs entries (limit: %lu)",
+ TRINITY_DEBUGFS_MAX);
+ return;
+ }
+
+ spin_lock(&entry->lock);
+
+ /* disable debugfs temporarily */
+ trinity_dma_free(dev, &entry->msg_buf);
+ entry->msg_max = 0;
+ entry->msg_num = 0;
+ entry->msg_off = 0;
+
+ if (msg_max == 0)
+ goto out;
+
+ /* reallocate debugfs buffer */
+ size = PAGE_ALIGN(msg_max * sizeof(struct trinity_debugfs_msg));
+ if (trinity_dma_alloc(dev, size, &entry->msg_buf) < 0) {
+ dev_warn(dev, "No available memory for debugfs");
+ goto out;
+ }
+ /* more available entries due to page size alignment */
+ entry->msg_max = size / sizeof(struct trinity_debugfs_msg);
+
+out:
+ spin_unlock(&entry->lock);
+}
+
+/**
+ * trinity_debug_get_max() - Get max size of debug message entity
+ *
+ * @drv: an instance of the trinity driver
+ *
+ * Return: max size of debug message entity
+ */
+unsigned long trinity_debug_get_max(struct trinity_driver *drv)
+{
+ struct trinity_debugfs_entry *entry = drv->debugfs_pdata;
+ unsigned long msg_max;
+
+ spin_lock(&entry->lock);
+ msg_max = entry->msg_max;
+ spin_unlock(&entry->lock);
+
+ return msg_max;
+}
+
+/**
+ * trinity_debug_init() - Initialize debug file system
+ */
+int trinity_debug_init(void)
+{
+ struct dentry *entry;
+
+ entry = debugfs_create_dir(TRINITY_DEBUGFS_DIR, NULL);
+ if (IS_ERR(entry))
+ return PTR_ERR(entry);
+
+ trinity_debugfs = entry;
+
+ return 0;
+}
+
+/**
+ * trinity_debug_exit() - Exit debug file system
+ */
+void trinity_debug_exit(void)
+{
+ debugfs_remove_recursive(trinity_debugfs);
+}
--
2.25.1
This patch contains the base code for the trinity driver. It provides the
minimal code needed to load and probe a device. The Trinity family is
controlled through memory-mapped registers, and the register addresses
and offsets are described. User API interfaces are provided to control
the device via ioctl.
Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: yelini-jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: Parichay Kapoor <[email protected]>
Signed-off-by: Wook Song <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
drivers/misc/Kconfig | 1 +
drivers/misc/Makefile | 1 +
drivers/misc/trinity/Kconfig | 25 +
drivers/misc/trinity/Makefile | 7 +
drivers/misc/trinity/trinity.c | 225 +++++++++
drivers/misc/trinity/trinity_common.h | 437 ++++++++++++++++++
drivers/misc/trinity/trinity_vision2_drv.c | 278 ++++++++++++
drivers/misc/trinity/trinity_vision2_regs.h | 210 +++++++++
include/uapi/misc/trinity.h | 476 ++++++++++++++++++++
9 files changed, 1660 insertions(+)
create mode 100644 drivers/misc/trinity/Kconfig
create mode 100644 drivers/misc/trinity/Makefile
create mode 100644 drivers/misc/trinity/trinity.c
create mode 100644 drivers/misc/trinity/trinity_common.h
create mode 100644 drivers/misc/trinity/trinity_vision2_drv.c
create mode 100644 drivers/misc/trinity/trinity_vision2_regs.h
create mode 100644 include/uapi/misc/trinity.h
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 41d2bb0ae23a..ad0d5f6af291 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -500,4 +500,5 @@ source "drivers/misc/cardreader/Kconfig"
source "drivers/misc/habanalabs/Kconfig"
source "drivers/misc/uacce/Kconfig"
source "drivers/misc/pvpanic/Kconfig"
+source "drivers/misc/trinity/Kconfig"
endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 70e800e9127f..c63f3fc89780 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -60,3 +60,4 @@ obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o
obj-$(CONFIG_HISI_HIKEY_USB) += hisi_hikey_usb.o
obj-$(CONFIG_HI6421V600_IRQ) += hi6421v600-irq.o
obj-$(CONFIG_OPEN_DICE) += open-dice.o
+obj-$(CONFIG_TRINITY) += trinity/
diff --git a/drivers/misc/trinity/Kconfig b/drivers/misc/trinity/Kconfig
new file mode 100644
index 000000000000..02ad03c2ca0e
--- /dev/null
+++ b/drivers/misc/trinity/Kconfig
@@ -0,0 +1,25 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config TRINITY
+ bool "Samsung Neural Processing Unit"
+ depends on HAS_IOMEM
+ depends on HAS_DMA
+ help
+ Select this option to enable driver support for Samsung
+ Neural Processing Unit (NPU).
+
+ This driver works as a base driver of the other drivers
+ for Trinity device family.
+
+ This option should be enabled to support Trinity
+ Vision 2 (TRIV2), and Trinity Audio (TRIA).
+
+config TRINITY_VISION2
+ tristate "Samsung NPU Trinity Vision 2"
+ depends on TRINITY
+ help
+ Select this option to enable driver support for a Samsung
+ Neural Processing Unit (NPU), Trinity Vision 2.
+
+ This driver enables userspace system library to access the
+ device via /dev/triv2-N.
diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
new file mode 100644
index 000000000000..a8e5697d6d85
--- /dev/null
+++ b/drivers/misc/trinity/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_TRINITY_VISION2) += trinity_vision2.o
+
+trinity-y := trinity.o
+
+trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
new file mode 100644
index 000000000000..1704eecfc439
--- /dev/null
+++ b/drivers/misc/trinity/trinity.c
@@ -0,0 +1,225 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Base device driver for Samsung NPU Trinity device family
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2020 Wook Song <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/of_address.h>
+
+#include "trinity_common.h"
+
+#define TRINITY_PADDR_BASE (0x0)
+
+static DEFINE_IDA(dev_nrs);
+static DEFINE_IDA(model_ids);
+
+/**
+ * trinity_release() - A common callback for close() in file_operations for a
+ * Trinity device node. If there are device-specific data to be
+ * cleaned up, they must be cleaned up before invoking this
+ * callback.
+ *
+ * @inode: Inode to be closed
+ * @file: File to be closed
+ *
+ * Returns 0 on success. Otherwise, returns negative error.
+ */
+int trinity_release(struct inode *inode, struct file *file)
+{
+ return 0;
+}
+
+/**
+ * trinity_open() - A common callback for open() in file_operations for a Trinity
+ * device node. If device-specific open() is required, this
+ * callback should be invoked by that open().
+ *
+ * @inode: inode to be opened
+ * @f: file to be opened
+ *
+ * Returns 0 on success. Otherwise, returns negative error.
+ */
+int trinity_open(struct inode *inode, struct file *f)
+{
+ struct miscdevice *miscdev;
+ struct trinity_driver *drv;
+
+ miscdev = f->private_data;
+ drv = container_of(miscdev, struct trinity_driver, mdev);
+ f->private_data = drv;
+
+ return 0;
+}
+
+/**
+ * trinity_create_node() - Create trinity node
+ *
+ * @drv: an instance of trinity driver
+ *
+ * Returns 0 on success. Otherwise, returns negative error.
+ */
+int trinity_create_node(struct trinity_driver *drv)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+ int err;
+
+ /** register as a misc device */
+ drv->mdev.minor = MISC_DYNAMIC_MINOR;
+ drv->mdev.parent = dev;
+ drv->mdev.name = drv->name;
+ drv->mdev.fops = drv->desc->fops;
+
+ err = misc_register(&drv->mdev);
+ if (err < 0)
+ dev_err(dev, "failed to register as a misc device");
+
+ return err;
+}
+
+/**
+ * trinity_destroy_node() - Destroy trinity node
+ *
+ * @drv: an instance of trinity driver
+ */
+void trinity_destroy_node(struct trinity_driver *drv)
+{
+ misc_deregister(&drv->mdev);
+}
+
+/**
+ * trinity_probe() - Probes a new Trinity device. This is a standard interface to
+ * probe a Trinity family device.
+ *
+ * @pdev: Platform device structure to probe
+ * @desc: Device description to probe
+ *
+ * Returns 0 on success. Otherwise, returns negative error.
+ */
+int trinity_probe(struct platform_device *pdev, const struct trinity_desc *desc)
+{
+ struct device_node *np;
+ struct device *dev;
+ struct trinity_driver *drv;
+ int i, err;
+
+ dev = &pdev->dev;
+ dev->id = ((desc->ver & TRINITY_MASK_DEV) >> TRINITY_SHIFT_DEV);
+
+ /* set private data */
+ drv = devm_kzalloc(dev, sizeof(*drv), GFP_KERNEL);
+ if (!drv)
+ return -ENOMEM;
+
+ drv->dev_id = ida_alloc(&dev_nrs, GFP_KERNEL);
+ if (drv->dev_id < 0) {
+ /* drv is devm-managed and is freed automatically when probe fails */
+ return drv->dev_id;
+ }
+ snprintf(drv->name, DEV_NAME_LEN, "%s-%u", desc->type, drv->dev_id);
+
+ platform_set_drvdata(pdev, drv);
+ dev_set_drvdata(dev, drv);
+
+ drv->dev = dev;
+ drv->desc = desc;
+
+ np = dev->of_node;
+ if (of_property_match_string(np, "samsung,trinity-type", desc->type)) {
+ err = -EPROBE_DEFER;
+ goto err_cleanup;
+ }
+
+ /* get reg info for MMREG_BASE */
+ for (i = 0; i < TRINITY_MAX_MMREGS; i++) {
+ struct resource mmreg;
+
+ err = of_address_to_resource(np, i, &mmreg);
+ if (err < 0) {
+ dev_err(dev, "failed to get %d-th mmreg info", i);
+ goto err_cleanup;
+ }
+
+ drv->mmreg_vaddr[i] = devm_ioremap_resource(dev, &mmreg);
+ if (IS_ERR(drv->mmreg_vaddr[i])) {
+ dev_err(dev,
+ "failed to remap %d-th mmreg resource info", i);
+ err = PTR_ERR(drv->mmreg_vaddr[i]);
+ goto err_cleanup;
+ }
+ drv->mmreg_paddr[i] = mmreg.start;
+ }
+
+ /** get a TOPS property */
+ err = of_property_read_u32(np, "samsung,tops", &drv->tops);
+ if (err < 0) {
+ dev_err(dev, "failed to read 'tops' property: %d\n", err);
+ goto err_cleanup;
+ }
+
+ /** get a DSPM property */
+ err = of_property_read_u32(np, "samsung,dspm", &drv->dspm);
+ if (err < 0) {
+ dev_warn(dev, "Setting the size of DSPM to 0\n");
+ drv->dspm = 0;
+ }
+
+ /* Set IRQ handlers */
+ drv->irq = platform_get_irq(pdev, 0);
+ if (drv->irq < 0) {
+ dev_err(dev, "IRQ is not supported");
+ err = drv->irq;
+ goto err_cleanup;
+ }
+
+ /* get the IRQ number from DT and set handlers for it */
+ err = devm_request_irq(dev, drv->irq, desc->handle_irq,
+ IRQF_TRIGGER_HIGH, desc->type, &drv->mdev);
+ if (err < 0) {
+ dev_err(dev, "failed to register handlers for IRQ %d", drv->irq);
+ goto err_cleanup;
+ }
+
+ /** Initialize device-specific variables */
+ init_completion(&drv->complete);
+ mutex_init(&drv->lock);
+ INIT_WORK(&drv->work_stop, desc->stop_reqs);
+
+ return 0;
+
+err_cleanup_common:
+ devm_free_irq(dev, drv->irq, &drv->mdev);
+
+err_cleanup:
+ ida_free(&dev_nrs, drv->dev_id);
+ devm_kfree(dev, drv);
+
+ return err;
+}
+
+/**
+ * trinity_remove() - Cleans up the device driver. This is a standard interface to
+ * remove a Trinity family device.
+ *
+ * @pdev: Platform device structure to remove
+ * @desc: Device description to remove
+ *
+ * Always returns 0.
+ */
+int trinity_remove(struct platform_device *pdev,
+ const struct trinity_desc *desc)
+{
+ struct trinity_driver *drv = platform_get_drvdata(pdev);
+ struct device *dev = drv_to_dev_ptr(drv);
+
+ ida_free(&dev_nrs, drv->dev_id);
+ devm_free_irq(dev, drv->irq, &drv->mdev);
+ devm_kfree(dev, drv);
+
+ return 0;
+}
diff --git a/drivers/misc/trinity/trinity_common.h b/drivers/misc/trinity/trinity_common.h
new file mode 100644
index 000000000000..f35d964ab387
--- /dev/null
+++ b/drivers/misc/trinity/trinity_common.h
@@ -0,0 +1,437 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/**
+ * Common header for trinity devices
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Wook Song <[email protected]>
+ * Copyright (C) 2020 Parichay Kapoor <[email protected]>
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __DRIVERS_MISC_TRINITY_COMMON_H__
+#define __DRIVERS_MISC_TRINITY_COMMON_H__
+
+#include <linux/errno.h>
+#include <linux/idr.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/irqreturn.h>
+#include <linux/kernel.h>
+#include <linux/list_bl.h>
+#include <linux/miscdevice.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <uapi/misc/trinity.h>
+
+#include "trinity_dma.h"
+#include "trinity_hwmem.h"
+
+#define DEV_NAME_LEN (16)
+
+/** Default timeout to wait for opening device in jiffies */
+#define TRINITY_DEV_TIMEOUT_MSEC (3000)
+#define TRINITY_DEV_TIMEOUT (msecs_to_jiffies(TRINITY_DEV_TIMEOUT_MSEC))
+
+/** Default timeout to wait for running input in jiffies */
+#define TRINITY_RUN_TIMEOUT_MSEC (4000)
+#define TRINITY_RUN_TIMEOUT (msecs_to_jiffies(TRINITY_RUN_TIMEOUT_MSEC))
+
+#define TRINITY_DEV_TYPE_LEN (16)
+#define TRINITY_DEV_EACH_MAX (2)
+#define TRINITY_MAX_MMREGS (3)
+
+/** A helper function to generate the version code of the device driver */
+#define GENVER(dev, mj, mn, ex) \
+ ((dev << TRINITY_SHIFT_DEV) | (mj << TRINITY_SHIFT_MAJOR_VER) | \
+ (mn << TRINITY_SHIFT_MINOR_VER) | (ex << TRINITY_SHIFT_EXTRA_VER))
+
+#define IDU_LOADED(drv) (drv->idu_set)
+
+#define trinity_get_iomem_addr(base, offset) (base + offset)
+#define drv_to_dev_ptr(d) (d->dev)
+#define drv_to_priv(drv) (drv->desc->pdata)
+
+#define TRINITY_STAT_HASH_BITS (10)
+#define TRINITY_STAT_HASH_SIZE (1 << TRINITY_STAT_HASH_BITS)
+
+#define TRINITY_MODEL_HASH_BITS (10)
+#define TRINITY_MODEL_HASH_SIZE (1 << TRINITY_MODEL_HASH_BITS)
+
+#define TIME_DIFF(t1, t2) ktime_to_ms(ktime_sub(t1, t2))
+#define TIME_DIFF_US(t1, t2) ktime_to_us(ktime_sub(t1, t2))
+
+struct trinity_desc;
+struct trinity_driver;
+struct trinity_req;
+struct trinity_stat;
+struct trinity_stat_app;
+struct trinity_stat_req;
+struct trinity_model_htable;
+struct trinity_idu;
+
+/**
+ * struct trinity_desc - a structure for device description
+ *
+ * @type: A string that indicates the type of this device.
+ * @ver: Coded version information generated via GENVER().
+ * @fops: Device-specific file_operations.
+ *
+ * @reset: reset trinity function
+ * @prepare_req: request configuration function before invoking
+ * trinity_submit_req() (if any). This requires a registered model
+ * to the driver.
+ * @handle_timeout: This function is invoked when a request times out
+ * @stop_reqs: stops the currently running requests
+ * @drain_reqs: waits until the currently running requests finish
+ * @init_profile: initialize profile configuration
+ * @check_profile: check current profile data
+ * @get_profile_meta: get profile metadata for the target request
+ * @get_profile_buff: get profile data buffer for the target request
+ * @get_profile: get profile data
+ * @destroy_profile: destroy profile resources
+ *
+ * @idu_set: set IDU binary
+ * @idu_unset: unset IDU binary
+ * @idu_version: get IDU version info
+ * @get_state: get current state of IDU
+ * @set_state: set IDU state
+ * @alloc_req: allocate a new trinity request
+ * @dealloc_req: release request resources
+ * @invoke_req: prepare to run a request and send it to the scheduler
+ *
+ * @handle_irq: Device-specific IRQ handler.
+ */
+struct trinity_desc {
+ char *type;
+ uint32_t ver;
+
+ const struct file_operations *fops;
+
+ /* Optional */
+ void (*reset)(struct trinity_driver *drv);
+ int32_t (*prepare_req)(struct trinity_driver *drv,
+ struct trinity_req *req);
+ void (*handle_timeout)(struct trinity_driver *drv,
+ struct trinity_req *req);
+ void (*stop_reqs)(struct work_struct *work);
+ void (*drain_reqs)(struct trinity_driver *drv);
+ void (*init_profile)(struct trinity_driver *drv,
+ unsigned long profile_size);
+ int32_t (*check_profile)(struct trinity_driver *drv,
+ struct trinity_req *req);
+ int32_t (*get_profile_meta)(const struct trinity_driver *drv,
+ struct trinity_ioctl_profile_meta *meta);
+ int32_t (*get_profile_buff)(const struct trinity_driver *drv,
+ struct trinity_ioctl_profile_buff *buff);
+ ssize_t (*get_profile)(const struct trinity_driver *drv, char *buf, int req_id);
+ void (*destroy_profile)(const struct trinity_driver *drv, void *data);
+
+ /* Mandatory */
+ int32_t (*idu_set)(struct trinity_driver *drv, struct trinity_ioctl_idu *idu);
+ void (*idu_unset)(struct trinity_driver *drv);
+ int32_t (*idu_version)(struct trinity_driver *drv, uint32_t *major,
+ uint32_t *minor, uint32_t *extra);
+ int32_t (*get_state)(const struct trinity_driver *drv);
+ void (*set_state)(const struct trinity_driver *drv,
+ enum trinity_state state);
+ struct trinity_req *(*alloc_req)(struct trinity_driver *drv);
+ void (*dealloc_req)(struct trinity_driver *drv,
+ struct trinity_req *req);
+ int32_t (*invoke_req)(struct trinity_driver *drv,
+ struct trinity_req *req, void *sched_data);
+ irq_handler_t handle_irq;
+};
+
+/**
+ * struct trinity_stat - A structure for representing a device's statistics.
+ *
+ * @lock: stat lock
+ * @hlist: stat hash list heads
+ * @list: list head
+ * @pdata: private data
+ */
+struct trinity_stat {
+ spinlock_t lock;
+
+ struct hlist_bl_head hlist[TRINITY_STAT_HASH_SIZE];
+ struct list_head list;
+
+ void *pdata;
+};
+
+/**
+ * struct trinity_stat_app - a structure for representing statistics for each app
+ * @app_id: identifier for each app
+ * @name: name of stat
+ * @status: app status
+ * @parent: parent node
+ * @total_alloc_mem: total allocated memory size
+ * @total_freed_mem: total freed memory size
+ * @reqs: list of requests
+ * @num_total_reqs: a number of total requests
+ * @num_kept_reqs: a number of kept requests
+ * @num_active_reqs: a number of active requests
+ * @hnode: hash node
+ * @lnode: list node
+ */
+struct trinity_stat_app {
+ int32_t app_id; /* app identifier */
+ char name[TASK_COMM_LEN];
+ enum trinity_app_status status;
+
+ struct trinity_stat *parent;
+
+ uint64_t total_alloc_mem; /* total allocated memory */
+ uint64_t total_freed_mem; /* total freed memory */
+
+ struct list_head reqs;
+ uint32_t num_total_reqs;
+ uint32_t num_kept_reqs;
+ uint32_t num_active_reqs;
+
+ struct hlist_bl_node hnode; /* hash node */
+ struct list_head lnode; /* list node */
+
+ unsigned long slot;
+};
+
+/**
+ * struct trinity_stat_req - A structure for representing statistics for each request
+ * @status: request status
+ * @priority: priority of request
+ * @parent: parent node
+ * @app_id: app identifier
+ * @req_id: request identifier
+ * @model_id: model identifier
+ * @is_kernel: requested from other kernel module
+ * @submitted: submitted time (i.e., when request is submitted to global queue)
+ * @scheduled: scheduled time (i.e., when request is scheduled to device)
+ * @completed: completed time (i.e., when output notification arrives)
+ * @num_runs: total number of runs
+ * @total_time: total execution time
+ * @prev_time: previous execution time
+ * @prev_cycles: previous execution cycles
+ * @list: list node managed by trinity_stat_app
+ * @profile: profile data
+ * @slot: request slot
+ */
+struct trinity_stat_req {
+ enum trinity_req_status status; /* status of submit result */
+ enum trinity_req_priority priority;
+
+ struct trinity_stat_app *parent;
+
+ int32_t app_id;
+ int32_t req_id;
+ uint64_t model_id;
+
+ bool is_kernel;
+
+ ktime_t submitted;
+ ktime_t scheduled;
+ ktime_t completed;
+
+ uint32_t num_runs;
+ uint32_t total_time;
+
+ uint32_t prev_time;
+ uint32_t prev_cycles;
+
+ struct list_head list;
+ void *profile;
+
+ unsigned long slot;
+};
+
+/**
+ * struct trinity_driver - A private data structure for Trinity device driver
+ * @desc: A pointer to the device description
+ * @name: The id-annotated name of the device
+ * @pdata: private data
+ * @dev_id: device id
+ * @mdev: The &struct miscdevice with which the device is registered.
+ * @dev: A pointer to &struct device of the device.
+ * @complete: A &struct completion variable to maintain events from the device.
+ * @lock: A lock for access control to driver-level static variables
+ * @irq: acquired IRQ number used for complete event
+ * @global_req_id: a counter used to generate an id for each request
+ * @active_reqs: number of active requests
+ * @mmreg_vaddr: The iomapped base address of memory-mapped registers
+ * @mmreg_paddr: The physical base address of memory-mapped registers
+ * @idu_set: IDU binary is set
+ * @work_stop: handle stop request
+ * @tops: Tera Operations Per Second (TOPS) of this device
+ * @dspm: The size of Data Scratch-Pad Memory (DSPM) in the DSP
+ * @stat: statistics information
+ * @debugfs_pdata: debugfs private data
+ * @sched_pdata: NPU scheduler private data
+ * @model_htable: model hash table
+ * @profile_req_id: sysfs requested profile id for show_profile
+ */
+struct trinity_driver {
+ const struct trinity_desc *desc;
+ char name[DEV_NAME_LEN];
+ void *pdata;
+
+ uint32_t dev_id;
+ struct miscdevice mdev;
+ struct device *dev;
+ struct completion complete;
+ struct mutex lock;
+ int irq;
+
+ atomic_t global_req_id;
+ atomic_t active_reqs;
+
+ void __iomem *mmreg_vaddr[TRINITY_MAX_MMREGS];
+ phys_addr_t mmreg_paddr[TRINITY_MAX_MMREGS];
+
+ bool idu_set;
+
+ struct work_struct work_stop;
+
+ uint32_t tops;
+ uint32_t dspm;
+
+ struct trinity_stat stat;
+ void *debugfs_pdata;
+ void *sched_pdata;
+
+ struct hlist_bl_head model_htable[TRINITY_MODEL_HASH_SIZE];
+
+ /* sysfs info */
+ int profile_req_id;
+};
+
+/**
+ * struct trinity_model - A structure for representing model data
+ * @config: model configuration
+ * @import_info: Cached hwmem import info.
+ * @hnode: hash node for indexing
+ * @owner_id: Identifier for owner app
+ * @refcnt: reference count
+ */
+struct trinity_model {
+ struct trinity_ioctl_model config;
+ struct trinity_hwmem_import import_info;
+ struct hlist_bl_node hnode;
+ int32_t owner_id;
+ struct kref refcnt;
+} __packed;
+
+/**
+ * struct trinity_input - A structure for representing input data
+ * @config: input configuration
+ * @import_info: Cached hwmem import info.
+ */
+struct trinity_input {
+ struct trinity_ioctl_input config;
+ struct trinity_hwmem_import import_info;
+} __packed;
+
+/**
+ * struct trinity_req - A structure for representing a request
+ * @drv: An instance of the driver.
+ * @input: Information of the input configuration to be run by this request
+ * @model: model information to be used for this request
+ * @stat: Statistics entry of the submitted request
+ * @submit_retry: retry count of submit request
+ * @complete: completion information
+ * @llist: llist node for request queue
+ * @time_started: started time
+ * @is_kernel: requested from kernel module
+ * @scheduled: scheduled flag
+ * @timeout: timeout flag
+ * @priv: A handle of private data
+ *
+ * Note: a 'trinity_req' is shared by the ioctl, scheduler and irq handler paths.
+ * Once invoked, the irq handler may complete and deallocate it at any time.
+ */
+struct trinity_req {
+ /* context to which the req belongs */
+ struct trinity_driver *drv;
+
+ struct trinity_input input; /* the req's input argument */
+ struct trinity_model *model;
+
+ struct trinity_stat_req *stat;
+
+ uint64_t submit_retry;
+ struct completion complete;
+ struct llist_node llist;
+
+ ktime_t time_started;
+ bool is_kernel;
+
+ bool scheduled;
+ bool timeout;
+
+ void *priv;
+};
+
+static inline void trinity_set_bit(uint32_t bit, void __iomem *addr)
+{
+ uint32_t reg = 0;
+
+ reg |= bit;
+ iowrite32(reg, addr);
+}
+
+/**
+ * trinity_get_app_id() - Get an app_id for the currently opened device
+ *
+ * Returns app_id (just returns its tgid for now).
+ */
+static inline int32_t trinity_get_app_id(void)
+{
+ return task_tgid_vnr(current);
+}
+
+/*
+ * Trinity common functions
+ */
+int trinity_create_node(struct trinity_driver *drv);
+void trinity_destroy_node(struct trinity_driver *drv);
+int trinity_set_pause(struct trinity_driver *drv);
+int trinity_wait_ready(struct trinity_driver *drv);
+void trinity_init_model_htable(const struct trinity_driver *drv,
+ struct trinity_model_htable *ht);
+int32_t trinity_get_app_id(void);
+phys_addr_t trinity_get_paddr(struct iommu_domain *domain, dma_addr_t daddr);
+
+/* File operations */
+int trinity_open(struct inode *inode, struct file *f);
+int trinity_release(struct inode *inode, struct file *f);
+long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg);
+
+/* Device probing and removing */
+int trinity_probe(struct platform_device *pdev,
+ const struct trinity_desc *desc);
+int trinity_remove(struct platform_device *pdev,
+ const struct trinity_desc *desc);
+
+/* sysfs operations */
+int trinity_sysfs_init(struct trinity_driver *drv);
+int trinity_sysfs_cleanup(struct trinity_driver *drv);
+
+/* debugfs operations */
+int trinity_debug_init(void);
+void trinity_debug_exit(void);
+int trinity_debug_add(struct trinity_driver *drv);
+void trinity_debug_remove(struct trinity_driver *drv);
+void trinity_debug_clear(struct trinity_driver *drv, unsigned long msg_max);
+unsigned long trinity_debug_get_max(struct trinity_driver *drv);
+void trinity_debug_dump_msg(struct trinity_driver *drv, const char *fmt, ...);
+void trinity_debug_dump_model(struct trinity_driver *drv,
+ const struct trinity_model *model,
+ const char *fmt, ...);
+void trinity_debug_dump_input(struct trinity_driver *drv,
+ const struct trinity_input *input,
+ const char *fmt, ...);
+
+#endif /* __DRIVERS_MISC_TRINITY_COMMON_H__ */
diff --git a/drivers/misc/trinity/trinity_vision2_drv.c b/drivers/misc/trinity/trinity_vision2_drv.c
new file mode 100644
index 000000000000..bc43cafa39fb
--- /dev/null
+++ b/drivers/misc/trinity/trinity_vision2_drv.c
@@ -0,0 +1,278 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Samsung NPU Trinity Vision 2 driver
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Wook Song <[email protected]>
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/delay.h>
+#include <linux/hashtable.h>
+#include <linux/of_device.h>
+#include <linux/utsname.h>
+
+#include "trinity_common.h"
+#include "trinity_vision2_regs.h"
+
+#define TRIV2_DRV_GET_PDATA(drv) ((struct triv2_pdata *)((drv)->pdata))
+#define TRIV2_DRV_GET_CMD_INFO(drv) (&(TRIV2_DRV_GET_PDATA(drv)->cmd_info))
+#define TRIV2_DRV_GET_CMD_BUF(drv) (&(TRIV2_DRV_GET_CMD_INFO(drv)->buf))
+#define TRIV2_DRV_GET_PROF_BUF(drv) (&(TRIV2_DRV_GET_PDATA(drv)->prof_buf))
+#define TRIV2_DRV_GET_BACK_BUF(drv) (&(TRIV2_DRV_GET_PDATA(drv)->back_buf))
+
+#define TRIV2_GET_CMD_FROM_SLOT(info, slot) \
+ ((struct triv2_cmd *)((info)->buf.addr + \
+ (slot) * sizeof(struct triv2_cmd)))
+
+#define TRIV2_GET_REQ(r) (container_of(r, struct triv2_req, req))
+
+#define HALF_PAGE_SIZE (PAGE_SIZE >> 1)
+
+enum triv2_cmd_status {
+ STATUS_CMD_NONE = 0,
+ STATUS_CMD_READY = 1,
+ STATUS_CMD_DONE = 2,
+};
+
+/* req command for triv2 */
+struct triv2_cmd {
+ union {
+ struct {
+ uint32_t slot;
+ uint32_t prog_addr;
+ uint32_t prog_size;
+ uint32_t segt_addr;
+ uint32_t num_visa;
+
+ uint32_t priority;
+ uint32_t status;
+ uint32_t input_mode;
+ uint32_t output_mode;
+
+ /** for profiling */
+ uint32_t profile_offset;
+
+ /** for preemptive scheduling */
+ uint32_t program_position;
+
+ /** for batch processing */
+ uint32_t batch_size;
+ uint32_t curr_cnt;
+ uint32_t in_addr[TRIV2_MAX_BATCH_SIZE];
+ uint32_t out_addr[TRIV2_MAX_BATCH_SIZE];
+ uint32_t poll_addr;
+ uint32_t poll_magic;
+ /* deprecated but kept for backward compatibility */
+ uint32_t in_seg_idx;
+ uint32_t out_seg_idx;
+
+ uint32_t total_cycles;
+
+ /* kernel requests */
+ uint32_t in_extern_seg_num;
+ uint32_t out_extern_seg_num;
+ uint32_t in_extern_seg_idx[TRIV2_MAX_TENSORS];
+ uint32_t out_extern_seg_idx[TRIV2_MAX_TENSORS];
+ };
+ uint8_t reserved[TRIV2_MAX_CMD_SIZE];
+ };
+} __packed;
+
+struct triv2_cmd_info {
+ DECLARE_BITMAP(bitmap, TRIV2_MAX_CMDSLOTS);
+ spinlock_t lock;
+
+ struct triv2_req *reqs[TRIV2_MAX_CMDSLOTS];
+ struct triv2_cmd cur_cmd;
+};
+
+struct triv2_hashed_cmd_info {
+ struct trinity_driver *drv;
+ struct hlist_bl_node hnode;
+ struct triv2_req *req;
+ struct triv2_cmd *cmd;
+};
+
+struct triv2_kernel_req {
+ uint32_t in_seg_idx[TRIV2_MAX_TENSORS];
+ uint32_t in_seg_size[TRIV2_MAX_TENSORS];
+ uint32_t out_seg_idx[TRIV2_MAX_TENSORS];
+ uint32_t out_seg_size[TRIV2_MAX_TENSORS];
+};
+
+struct triv2_req {
+ struct trinity_req req;
+
+ int cmd_slot;
+
+ /* kernel requests */
+ struct triv2_kernel_req *kernel;
+
+ /** profiling */
+ uint32_t profile_offset;
+ uint32_t total_cycles;
+
+ /** misc */
+ uint32_t total_segment_size;
+};
+
+struct triv2_pdata {
+ struct trinity_driver *drv;
+
+ /* command info */
+ struct triv2_cmd_info cmd_info;
+};
+
+static void triv2_reset(struct trinity_driver *drv)
+{
+ struct triv2_pdata *pdata = drv->pdata;
+
+ mutex_lock(&pdata->drv->lock);
+
+ triv2_cancel_reqs(pdata->drv);
+ msleep(100);
+
+ mutex_unlock(&pdata->drv->lock);
+}
+
+static int triv2_open(struct inode *inode, struct file *f)
+{
+ int ret;
+
+ ret = trinity_open(inode, f);
+
+ return ret;
+}
+
+static int triv2_release(struct inode *inode, struct file *f)
+{
+ int ret;
+
+ ret = trinity_release(inode, f);
+
+ return ret;
+}
+
+static const struct file_operations triv2_fops = {
+ .owner = THIS_MODULE,
+ .open = triv2_open,
+ .release = triv2_release,
+ .llseek = noop_llseek,
+};
+
+static int32_t triv2_init_pdata(struct trinity_driver *drv)
+{
+ struct triv2_pdata *pdata;
+ struct triv2_cmd_info *cmd_info;
+
+ pdata = kzalloc(sizeof(*pdata), GFP_KERNEL);
+ if (!pdata)
+ return -ENOMEM;
+
+ drv->pdata = pdata;
+ pdata->drv = drv;
+
+ cmd_info = TRIV2_DRV_GET_CMD_INFO(drv);
+
+ spin_lock_init(&cmd_info->lock);
+ /* init cmd bitmap */
+ bitmap_zero(cmd_info->bitmap, TRIV2_MAX_CMDSLOTS);
+
+ return 0;
+}
+
+/**
+ * triv2_cleanup() - Clean up initialized variables in TRIV2
+ */
+static void triv2_cleanup(struct trinity_driver *drv)
+{
+ if (!drv->pdata)
+ return;
+
+ kfree(drv->pdata);
+ drv->pdata = NULL;
+}
+
+static struct trinity_desc triv2_desc = {
+ .type = "triv2",
+ .ver = GENVER(TRINITY_DEV_VISION2, VER_MAJOR, VER_MINOR, VER_EXTRA),
+ .fops = &triv2_fops,
+ /* device management */
+ .reset = triv2_reset,
+ /* req management */
+ .alloc_req = triv2_alloc_req,
+ .dealloc_req = triv2_dealloc_req,
+ .prepare_req = triv2_prepare_req,
+ .invoke_req = triv2_invoke_req,
+};
+
+static const struct of_device_id trinity_match[] = {
+ {
+ .compatible = "samsung,trinity",
+ },
+ { /* sentinel */ },
+};
+
+/**
+ * trinity_triv2_probe() - Probes for Trinity vision devices, inits them if found
+ */
+static int trinity_triv2_probe(struct platform_device *pdev)
+{
+ struct trinity_driver *drv;
+ int err;
+
+ err = trinity_probe(pdev, &triv2_desc);
+ if (err < 0)
+ return err;
+ drv = platform_get_drvdata(pdev); /* assumed set by trinity_probe() */
+ err = triv2_init_pdata(drv);
+ if (err < 0)
+ goto out_remove;
+
+ err = trinity_create_node(drv);
+ if (err < 0) {
+ triv2_cleanup(drv);
+ goto out_remove;
+ }
+
+ dev_info(drv_to_dev_ptr(drv), "Trinity Vision2 (TRIV2)\n");
+
+ return 0;
+
+out_remove:
+ trinity_remove(pdev, &triv2_desc);
+ return err;
+}
+
+/**
+ * trinity_triv2_remove() - Removes an instance of a Trinity vision device
+ */
+static int trinity_triv2_remove(struct platform_device *pdev)
+{
+ struct trinity_driver *drv = platform_get_drvdata(pdev);
+
+ trinity_destroy_node(drv);
+ triv2_cleanup(drv);
+ return trinity_remove(pdev, &triv2_desc);
+}
+
+static struct platform_driver trinity_triv2 = {
+ .probe = trinity_triv2_probe,
+ .remove = trinity_triv2_remove,
+ .driver = {
+ .name = "triv2",
+ .owner = THIS_MODULE,
+ .of_match_table = of_match_ptr(trinity_match),
+ },
+};
+
+module_platform_driver(trinity_triv2);
+
+MODULE_IMPORT_NS(DMA_BUF);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Samsung Electronics");
+MODULE_DESCRIPTION("Samsung NPU device driver for trinity vision 2");
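Note on TRIV2_GET_REQ(): the macro recovers the enclosing struct triv2_req
from a pointer to its embedded struct trinity_req. Below is a minimal
user-space sketch of that pattern, not the driver's code. The structs are
trimmed stand-ins, container_of() is defined locally with offsetof(), and
to_triv2_req() is a hypothetical helper name. A typed inline helper is shown
because, unlike a macro, it cannot confuse its parameter with the member name.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's container_of() (illustration only) */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Trimmed stand-ins for the structs in this patch */
struct trinity_req {
	int scheduled;
};

struct triv2_req {
	struct trinity_req req;	/* embedded generic request */
	int cmd_slot;		/* device-specific field */
};

/* Hypothetical typed helper: map the embedded req back to its container */
static inline struct triv2_req *to_triv2_req(struct trinity_req *req)
{
	assert(req != NULL);
	return container_of(req, struct triv2_req, req);
}
```

Given a struct triv2_req t, to_triv2_req(&t.req) yields &t, so callers that
only hold the generic request can still reach cmd_slot.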
diff --git a/drivers/misc/trinity/trinity_vision2_regs.h b/drivers/misc/trinity/trinity_vision2_regs.h
new file mode 100644
index 000000000000..58ef037d320e
--- /dev/null
+++ b/drivers/misc/trinity/trinity_vision2_regs.h
@@ -0,0 +1,210 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Trinity Vision2 Registers
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __DRIVERS_MISC_TRINITY_VISION2_REGS_H__
+#define __DRIVERS_MISC_TRINITY_VISION2_REGS_H__
+
+/* Register offsets for NPU CP (Config) */
+#define OFFSET_CP_INFO (0x000) /* Processor Information */
+#define OFFSET_CP_PROC_STAT (0x010) /* Processor Status */
+#define OFFSET_CP_PROC_SET (0x014) /* Processor Control (Set) */
+#define OFFSET_CP_PROC_CLR (0x018) /* Processor Control (Clear) */
+#define OFFSET_CP_IMIF_BASE (0x024) /* Instruction Base Address (DRAM) */
+#define OFFSET_CP_CNT_CFG (0x200) /* CP Performance Counter */
+
+/* Register offsets for NPU CP (IDU Setup) */
+#define OFFSET_NPU_PROG_BASE (0x100) /* GPR00: Instruction Base Address */
+#define OFFSET_NPU_PROG_SIZE (0x104) /* GPR01: Program Size */
+#define OFFSET_NPU_SEGT_ADDR (0x108) /* GPR02: Segment Table Address */
+#define OFFSET_NPU_PROF_ADDR (0x10C) /* GPR03: NPU Profiling Address */
+#define OFFSET_NPU_PROF_SIZE (0x110) /* GPR04: NPU Profiling Size */
+#define OFFSET_NPU_BACK_ADDR (0x114) /* GPR05: NPU Context Backup Address */
+#define OFFSET_NPU_BACK_SIZE (0x118) /* GPR06: NPU Context Backup Size */
+#define OFFSET_NPU_PC (0x11C) /* GPR07: NPU Program Counter */
+
+/* Register offsets for NPU CP (Commands) */
+#define OFFSET_NPU_CMD_READY (0x124) /* GPR09: Command Ready Status */
+#define OFFSET_NPU_CMD_BASE (0x128) /* GPR10: Command Base Address */
+#define OFFSET_NPU_CMD_REQ (0x12C) /* GPR11: Command Request Slots (not used) */
+#define OFFSET_NPU_CMD_FREE (0x130) /* GPR12: Command Free Slots */
+
+/* Register offsets for NPU CP (Cbox Setup) */
+#define OFFSET_NPU_CBOX_BASE (0x134) /* GPR13: NPU CBOX BASE */
+
+/* Register offsets for Debugging */
+#define OFFSET_NPU_IDU_VERSION (0x138) /* GPR14: NPU IDU VERSION */
+#define OFFSET_NPU_IDU_STAGE (0x13C) /* GPR15: NPU IDU STAGE */
+
+#define OFFSET_NPU_CP_DMAI_EADDR (0x300) /* CP DMA Source Address */
+#define OFFSET_NPU_CP_DMAI_IADDR (0x304) /* CP DMA Dest Address */
+#define OFFSET_NPU_CP_DMAI_TSIZE (0x308) /* CP DMA Transfer Size */
+#define OFFSET_NPU_CP_DMAI_CONTR (0x310) /* CP DMA Status */
+#define OFFSET_NPU_CP_DMAI_CMDID (0x314) /* CP DMA Command ID */
+#define OFFSET_NPU_CP_DMAI_LSTID \
+ (0x318) /* CP DMA Command ID of the last transfer */
+
+#define OFFSET_NPU_DLA_DMAI_EADDR (0x1000) /* DLA Input External Address */
+#define OFFSET_NPU_DLA_DMAI_EYMOD \
+ (0x1004) /* DLA Input External Address Y Modifier */
+#define OFFSET_NPU_DLA_DMAI_EZMOD \
+ (0x1008) /* DLA Input External Address Z Modifier */
+#define OFFSET_NPU_DLA_DMAI_IADDR (0x100C) /* DLA Input Internal Address */
+#define OFFSET_NPU_DLA_DMAI_IYMOD \
+ (0x1010) /* DLA Input Internal Address Y Modifier */
+#define OFFSET_NPU_DLA_DMAI_IZMOD \
+ (0x1014) /* DLA Input Internal Address Z Modifier */
+#define OFFSET_NPU_DLA_DMAI_SIZE0 (0x1018) /* DLA Input Data Size 0 */
+#define OFFSET_NPU_DLA_DMAI_SIZE1 (0x101C) /* DLA Input Data Size 1 */
+#define OFFSET_NPU_DLA_DMAI_CTRL (0x1020) /* DLA Input Channel Status */
+
+#define OFFSET_NPU_DLA_DMAO_EADDR (0x1080) /* DLA Output External Address */
+#define OFFSET_NPU_DLA_DMAO_EYMOD \
+ (0x1084) /* DLA Output External Address Y Modifier */
+#define OFFSET_NPU_DLA_DMAO_EZMOD \
+ (0x1088) /* DLA Output External Address Z Modifier */
+#define OFFSET_NPU_DLA_DMAO_IADDR (0x108C) /* DLA Output Internal Address */
+#define OFFSET_NPU_DLA_DMAO_IYMOD \
+ (0x1090) /* DLA Output Internal Address Y Modifier */
+#define OFFSET_NPU_DLA_DMAO_IZMOD \
+ (0x1094) /* DLA Output Internal Address Z Modifier */
+#define OFFSET_NPU_DLA_DMAO_SIZE0 (0x1098) /* DLA Output Data Size 0 */
+#define OFFSET_NPU_DLA_DMAO_SIZE1 (0x109C) /* DLA Output Data Size 1 */
+#define OFFSET_NPU_DLA_DMAO_CTRL (0x10A0) /* DLA Output Channel Status */
+
+#define OFFSET_NPU_DLA_CORE_OPC (0x1100) /* DLA Operation Code */
+#define OFFSET_NPU_DLA_CORE_WIND_CFG (0x1104)
+#define OFFSET_NPU_DLA_CORE_SIZE0 (0x1108)
+#define OFFSET_NPU_DLA_CORE_SIZE1 (0x110C)
+#define OFFSET_NPU_DLA_CORE_ZP (0x1110)
+#define OFFSET_NPU_DLA_CORE_OUT_MULT (0x1114)
+#define OFFSET_NPU_DLA_CORE_IN0_MULT (0x1118)
+#define OFFSET_NPU_DLA_CORE_IN1_MULT (0x111C)
+#define OFFSET_NPU_DLA_CORE_OUT_CFG (0x1120)
+#define OFFSET_NPU_DLA_CORE_OUT_MOD (0x1124)
+#define OFFSET_NPU_DLA_CORE_IN0_CFG (0x1128)
+#define OFFSET_NPU_DLA_CORE_IN0_MOD (0x112C)
+#define OFFSET_NPU_DLA_CORE_IN1_CFG (0x1130)
+#define OFFSET_NPU_DLA_CORE_IN1_MOD (0x1134)
+#define OFFSET_NPU_DLA_CORE_PARAM_ADDR (0x1138)
+#define OFFSET_NPU_DLA_CORE_PSUM_ADDR (0x113C)
+#define OFFSET_NPU_DLA_CORE_CWGT_ADDR (0x1140)
+#define OFFSET_NPU_DLA_CORE_CTR (0x1144) /* DLA Core Status */
+
+#define OFFSET_NPU_DSP_DMAI_EADDR (0x2000) /* DSP Input External Address */
+#define OFFSET_NPU_DSP_DMAI_EYMOD \
+ (0x2004) /* DSP Input External Address Y Modifier */
+#define OFFSET_NPU_DSP_DMAI_EZMOD \
+ (0x2008) /* DSP Input External Address Z Modifier */
+#define OFFSET_NPU_DSP_DMAI_IADDR (0x200C) /* DSP Input Internal Address */
+#define OFFSET_NPU_DSP_DMAI_IYMOD \
+ (0x2010) /* DSP Input Internal Address Y Modifier */
+#define OFFSET_NPU_DSP_DMAI_IZMOD \
+ (0x2014) /* DSP Input Internal Address Z Modifier */
+#define OFFSET_NPU_DSP_DMAI_SIZE0 (0x2018) /* DSP Input Data Size 0 */
+#define OFFSET_NPU_DSP_DMAI_SIZE1 (0x201C) /* DSP Input Data Size 1 */
+#define OFFSET_NPU_DSP_DMAI_CTRL (0x2020) /* DSP Input Channel Status */
+
+#define OFFSET_NPU_DSP_DMAO_EADDR (0x2080) /* DSP Output External Address */
+#define OFFSET_NPU_DSP_DMAO_EYMOD \
+ (0x2084) /* DSP Output External Address Y Modifier */
+#define OFFSET_NPU_DSP_DMAO_EZMOD \
+ (0x2088) /* DSP Output External Address Z Modifier */
+#define OFFSET_NPU_DSP_DMAO_IADDR (0x208C) /* DSP Output Internal Address */
+#define OFFSET_NPU_DSP_DMAO_IYMOD \
+ (0x2090) /* DSP Output Internal Address Y Modifier */
+#define OFFSET_NPU_DSP_DMAO_IZMOD \
+ (0x2094) /* DSP Output Internal Address Z Modifier */
+#define OFFSET_NPU_DSP_DMAO_SIZE0 (0x2098) /* DSP Output Data Size 0 */
+#define OFFSET_NPU_DSP_DMAO_SIZE1 (0x209C) /* DSP Output Data Size 1 */
+#define OFFSET_NPU_DSP_DMAO_CTRL (0x20A0) /* DSP Output Channel Status */
+#define OFFSET_NPU_DSP_CORE_CTRL (0x2140) /* DSP Core Status */
+
+/* Register offsets for NPU DSP */
+#define OFFSET_DSP_INFO (0x000) /* Processor Information */
+#define OFFSET_DSP_PROC_STAT (0x010) /* Processor Status */
+#define OFFSET_DSP_PROC_SET (0x014) /* Processor Control (Set) */
+#define OFFSET_DSP_PROC_CLR (0x018) /* Processor Control (Clear) */
+#define OFFSET_DSP_IMIF_BASE (0x024) /* Instruction Base Address (DRAM) */
+
+/* Register offsets for NPU ComBox (IRQ) */
+#define OFFSET_CBOX_EXT_IRQ_MSK (0x100) /* External IRQ Output Mask */
+#define OFFSET_CBOX_EXT_IRQ_STA (0x104) /* External IRQ Output Status */
+#define OFFSET_CBOX_CP_SWI_CLR (0x134) /* CP IRQ output Clear */
+#define OFFSET_CBOX_DSP_SWI_CLR (0x154) /* DSP IRQ output Clear */
+
+/* Location of bits inside corresponding registers */
+#define BIT_CLR_IRQ_OUT BIT(24)
+#define BIT_CLR_PAUSE BIT(0)
+#define BIT_SET_SEND_EVT1 BIT(18)
+#define BIT_SET_PAUSE BIT(0)
+#define BIT_STAT_PAUSED BIT(1)
+
+/* Performance counter configurations */
+#define BIT_CNT_DST_EN BIT(6)
+#define BIT_CNT_IST_EN BIT(5)
+#define BIT_CNT_ST_EN BIT(4)
+#define BIT_CNT_FR_EN BIT(0)
+
+/* Bit masks */
+#define MASK_DSP_SWI_STA BIT_MASK(1)
+#define MASK_CP_SWI_STA BIT_MASK(0)
+
+#define MASK_STAT_WFE_PARAM GENMASK(14, 6)
+#define MASK_STAT_WFE_PARAM_EVT1 BIT_MASK(8)
+#define MASK_STAT_WFE BIT_MASK(5)
+#define MASK_STAT_PAUSED BIT_MASK(1)
+#define MASK_STAT_PAUSE BIT_MASK(0)
+
+#define VER_MAJOR (2)
+#define VER_MINOR (0)
+#define VER_EXTRA (0)
+
+#define read_idu_file(file, pos, addr, size) kernel_read(file, addr, size, &(pos))
+
+/** Macros for Instruction Decode Unit (IDU) */
+#define TRIV2_IDU_DIRPATH_FMT "/lib/modules/%s/kernel/soc/idu"
+#define TRIV2_IDU_MAX_SECTORS (3)
+#define TRIV2_IDU_ZEROIDX (0)
+#define TRIV2_IDU_DATAIDX (1)
+#define TRIV2_IDU_CODEIDX (2)
+#define TRIV2_IDU_ADDR(addr) ((uint32_t)(addr))
+#define TRIV2_IDU_MAXSIZE (1 << 20) /* 1 MiB */
+
+#define TRIV2_IDU_CP_DSPM_SIZE (0x10000)
+
+#define TRIV2_IDU_MASK_MAJOR (0xFF000000)
+#define TRIV2_IDU_MASK_MINOR (0x00FFF000)
+#define TRIV2_IDU_MASK_EXTRA (0x00000FFF)
+
+#define TRIV2_IDU_SHIFT_MAJOR (24)
+#define TRIV2_IDU_SHIFT_MINOR (12)
+
+#define TRIV2_MODEL_HASH_BITS (8)
+#define TRIV2_MODEL_HASH_SIZE (1 << TRIV2_MODEL_HASH_BITS)
+#define TRIV2_PROFILE_HASH_BITS (6)
+#define TRIV2_PROFILE_HASH_SIZE (1 << TRIV2_PROFILE_HASH_BITS)
+#define TRIV2_PROFILE_HASH_KEY(id) (hash_long((id), TRIV2_PROFILE_HASH_BITS))
+
+#define TRIV2_MAX_SEGMENTS (256)
+/** Fits in a single 4K Page */
+#define TRIV2_MAX_CMDSLOTS (PAGE_SIZE / sizeof(struct triv2_cmd))
+
+#define TRIV2_MAX_TENSORS (16)
+#define TRIV2_MAX_CMD_SIZE (512)
+#define TRIV2_MAX_BATCH_SIZE (32)
+
+#define TRIV2_DLA_GBUFFER_SIZE (0x80000)
+#define TRIV2_DSP_DSPM_OFFSET (0x10000)
+
+/* 4MiB (~300ns to flush all caches) */
+#define TRIV2_CACHE_FLUSH_THRESHOLD (0x400000)
+#define TRIV2_KERN_TIMEOUT_RESET (1000)
+
+#endif /* __DRIVERS_MISC_TRINITY_VISION2_REGS_H__ */
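Note on the TRIV2_IDU_MASK_* and TRIV2_IDU_SHIFT_* macros: they describe a
32-bit version word split into an 8-bit major, a 12-bit minor and a 12-bit
extra field. A small user-space sketch of decoding and encoding such a word
follows. The mask and shift values are copied from the header above; struct
idu_version and both function names are hypothetical, and that the word is
the one read from OFFSET_NPU_IDU_VERSION is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from trinity_vision2_regs.h */
#define TRIV2_IDU_MASK_MAJOR (0xFF000000)
#define TRIV2_IDU_MASK_MINOR (0x00FFF000)
#define TRIV2_IDU_MASK_EXTRA (0x00000FFF)
#define TRIV2_IDU_SHIFT_MAJOR (24)
#define TRIV2_IDU_SHIFT_MINOR (12)

/* Hypothetical container for the three version fields */
struct idu_version {
	uint32_t ver_major;
	uint32_t ver_minor;
	uint32_t ver_extra;
};

/* Split a packed version word into its fields */
static struct idu_version idu_version_decode(uint32_t word)
{
	struct idu_version v;

	v.ver_major = (word & TRIV2_IDU_MASK_MAJOR) >> TRIV2_IDU_SHIFT_MAJOR;
	v.ver_minor = (word & TRIV2_IDU_MASK_MINOR) >> TRIV2_IDU_SHIFT_MINOR;
	v.ver_extra = word & TRIV2_IDU_MASK_EXTRA;
	return v;
}

/* Pack the fields back into one word */
static uint32_t idu_version_encode(struct idu_version v)
{
	/* Each field must fit its mask: 8, 12 and 12 bits respectively */
	assert(v.ver_major <= 0xFF && v.ver_minor <= 0xFFF &&
	       v.ver_extra <= 0xFFF);

	return (v.ver_major << TRIV2_IDU_SHIFT_MAJOR) |
	       (v.ver_minor << TRIV2_IDU_SHIFT_MINOR) | v.ver_extra;
}
```

For example, the word 0x02003004 decodes to major 2, minor 3, extra 4, and
encoding those fields gives the same word back.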
diff --git a/include/uapi/misc/trinity.h b/include/uapi/misc/trinity.h
new file mode 100644
index 000000000000..9f4808bfdd90
--- /dev/null
+++ b/include/uapi/misc/trinity.h
@@ -0,0 +1,476 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * User-level header for trinity devices.
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Wook Song <[email protected]>
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2020 Parichay Kapoor <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#ifndef __TRINITY_H__
+#define __TRINITY_H__
+
+#include <linux/types.h>
+
+#define TRINITY_API_LEVEL 12
+
+/**
+ * enum trinity_state - Enum that describes a trinity device state
+ * @TRINITY_STATE_UNKNOWN: A device has unknown state
+ * @TRINITY_STATE_PAUSE: A device is paused
+ * @TRINITY_STATE_READY: A device is ready
+ * @TRINITY_STATE_END: End of trinity_state
+ */
+enum trinity_state {
+ TRINITY_STATE_UNKNOWN = -1,
+ TRINITY_STATE_PAUSE = 0,
+ TRINITY_STATE_READY,
+ TRINITY_STATE_END,
+};
+
+/**
+ * enum trinity_input_mode - Enum that describes an input source
+ * @TRINITY_INPUT_UNKNOWN: Unknown input mode
+ * @TRINITY_INPUT_CPU: Input feed by CPU
+ * @TRINITY_INPUT_HW: Input feed by third-party HW
+ * @TRINITY_INPUT_END: End of trinity_input_mode
+ */
+enum trinity_input_mode {
+ TRINITY_INPUT_UNKNOWN = -1,
+ TRINITY_INPUT_CPU = 0,
+ TRINITY_INPUT_HW,
+ TRINITY_INPUT_END,
+};
+
+/**
+ * enum trinity_output_mode - Enum that describes an output source
+ * @TRINITY_OUTPUT_UNKNOWN: Unknown output mode
+ * @TRINITY_OUTPUT_CPU_INTR: Output completion handling by interrupt
+ * @TRINITY_OUTPUT_CPU_POLL: Output completion handling by polling
+ * @TRINITY_OUTPUT_HW: Output completion handling by third-party HW
+ * @TRINITY_OUTPUT_END: End of trinity_output_mode
+ */
+enum trinity_output_mode {
+ TRINITY_OUTPUT_UNKNOWN = -1,
+ TRINITY_OUTPUT_CPU_INTR = 0,
+ TRINITY_OUTPUT_CPU_POLL,
+ TRINITY_OUTPUT_HW,
+ TRINITY_OUTPUT_END,
+};
+
+/**
+ * enum trinity_app_status - Enum that describes an app status
+ * @TRINITY_APP_STATUS_UNKNOWN: Unknown app status
+ * @TRINITY_APP_STATUS_ERROR: App has got some errors
+ * @TRINITY_APP_STATUS_PENDING: App is currently pending
+ * @TRINITY_APP_STATUS_STARTED: App was started
+ * @TRINITY_APP_STATUS_TERMINATED: App was terminated
+ */
+enum trinity_app_status {
+ TRINITY_APP_STATUS_UNKNOWN = 0,
+ TRINITY_APP_STATUS_ERROR = 1,
+ TRINITY_APP_STATUS_PENDING = 2,
+ TRINITY_APP_STATUS_STARTED = 3,
+ TRINITY_APP_STATUS_TERMINATED = 4
+};
+
+/**
+ * enum trinity_req_status - Enum that describes a request status
+ * @TRINITY_REQ_STATUS_UNKNOWN: Unknown request status
+ * @TRINITY_REQ_STATUS_ERROR: Request has got some errors
+ * @TRINITY_REQ_STATUS_PENDING: Request is currently pending
+ * @TRINITY_REQ_STATUS_RUNNING: Request is currently running
+ * @TRINITY_REQ_STATUS_FINISHED: Request was finished
+ */
+enum trinity_req_status {
+ TRINITY_REQ_STATUS_UNKNOWN = 0,
+ TRINITY_REQ_STATUS_ERROR = 1,
+ TRINITY_REQ_STATUS_PENDING = 2, /* A request is submitted */
+ TRINITY_REQ_STATUS_RUNNING = 3, /* A request is running on NPU */
+ TRINITY_REQ_STATUS_FINISHED = 4 /* A request is just finished */
+};
+
+/**
+ * enum trinity_req_priority - Enum that describes a request priority
+ * @TRINITY_REQ_PRIORITY_LOW: Low priority
+ * @TRINITY_REQ_PRIORITY_MID: Mid priority scheduled with a higher chance than low one
+ * @TRINITY_REQ_PRIORITY_HIGH: High priority preempting lower priority requests
+ */
+enum trinity_req_priority {
+ TRINITY_REQ_PRIORITY_LOW = 0,
+ TRINITY_REQ_PRIORITY_MID = 1,
+ TRINITY_REQ_PRIORITY_HIGH = 2,
+};
+
+/**
+ * enum trinity_hwmem_type - A type of DMA buffer allocation method.
+ * @TRINITY_HWMEM_DMA_CONT: Use CMA to allocate backing storage of DMA buffers.
+ * @TRINITY_HWMEM_DMA_IOMMU: Use IOMMU to allocate backing storage of DMA buffers.
+ * @TRINITY_HWMEM_END: Sentinel.
+ */
+enum trinity_hwmem_type {
+ TRINITY_HWMEM_DMA_CONT = 0,
+ TRINITY_HWMEM_DMA_IOMMU,
+ TRINITY_HWMEM_END,
+};
+
+#ifndef TASK_COMM_LEN
+#define TASK_COMM_LEN 16
+#endif
+
+#define TRINITY_APP_NAME_MAX TASK_COMM_LEN
+#define TRINITY_APP_STAT_MAX 10
+#define TRINITY_REQ_STAT_MAX 10
+
+/**
+ * struct trinity_ioctl_stat_app - Describes stat of the target app
+ * @app_id: Trinity app id (currently, equal to pid)
+ * @name: Trinity app name
+ * @status: Trinity app status
+ * @num_total_reqs: Number of total requests in app (including finished ones)
+ * @num_active_reqs: Number of active (running or pending) requests in app
+ * @total_alloc_mem: Total size of allocated memory in the device
+ * @total_freed_mem: Total size of freed memory in the device
+ */
+struct trinity_ioctl_stat_app {
+ __s32 app_id;
+
+ char name[TRINITY_APP_NAME_MAX];
+ enum trinity_app_status status;
+
+ __u32 num_total_reqs;
+ __u32 num_active_reqs;
+
+ __u64 total_alloc_mem;
+ __u64 total_freed_mem;
+} __packed;
+
+/**
+ * struct trinity_ioctl_stat_apps - Describes stats of the latest apps
+ * @num_apps: Number of apps for the stat list
+ * @stat: Stat of the latest apps
+ */
+struct trinity_ioctl_stat_apps {
+ __u32 num_apps;
+ struct trinity_ioctl_stat_app stat[TRINITY_APP_STAT_MAX];
+} __packed;
+
+/**
+ * struct trinity_ioctl_stat_req - Describes stat of the target request
+ * @req_id: Trinity req id
+ * @model_id: Trinity model id
+ * @priority: Request priority (low, mid, or high)
+ * @status: Request status
+ * @sched_time: scheduling time in ms
+ * @infer_time: inference time in ms
+ */
+struct trinity_ioctl_stat_req {
+ __s32 req_id;
+ __u64 model_id;
+
+ enum trinity_req_priority priority;
+ enum trinity_req_status status;
+
+ __u32 sched_time;
+ __u32 infer_time;
+} __packed;
+
+/**
+ * struct trinity_ioctl_stat_reqs - Describes stats of the latest reqs
+ * @app_id: Trinity app id (0 means 'current')
+ * @num_reqs: Number of reqs for stat list
+ * @stat: Stat of the latest reqs
+ */
+struct trinity_ioctl_stat_reqs {
+ __s32 app_id;
+ __u32 num_reqs;
+ struct trinity_ioctl_stat_req stat[TRINITY_REQ_STAT_MAX];
+} __packed;
+
+/**
+ * struct trinity_ioctl_hwmem - A structure that describes hardware memory (hwmem)
+ * @type: The type of hwmem
+ * @size: The size of hwmem
+ * @dbuf_fd: File descriptor for dmabuf representing hwmem
+ */
+struct trinity_ioctl_hwmem {
+ enum trinity_hwmem_type type;
+ __u64 size;
+ __s32 dbuf_fd;
+} __packed;
+
+/**
+ * struct trinity_ioctl_profile_meta - Describes profiling meta info.
+ * @req_id: The target req id for profiling
+ * @total_cycles: The total number of cycles of the given req
+ * @total_ops: The total number of operations of the given req
+ * @input_footprint: The DRAM footprint of input data
+ * @output_footprint: The DRAM footprint of output data
+ * @profile_size: The size of profiling data
+ */
+struct trinity_ioctl_profile_meta {
+ __s32 req_id;
+ __s64 total_cycles;
+ __u32 total_ops;
+ __s64 input_footprint;
+ __s64 output_footprint;
+ __u32 profile_size;
+} __packed;
+
+/**
+ * struct trinity_ioctl_profile_buff - Describes profiling buff info.
+ * @req_id: The target req id for profiling
+ * @profile_pos: The start position to extract profiling data
+ * @profile_size: The size of user-allocated profiling buffer
+ * @profile_buf: The profiling buffer which user allocated
+ */
+struct trinity_ioctl_profile_buff {
+ __s32 req_id;
+ __u32 profile_pos;
+ __u32 profile_size;
+ void __user *profile_buf;
+} __packed;
+
+/**
+ * struct trinity_ioctl_model - A structure that configures a model registered on NPU
+ * @id: Id for NPU model to extract the base phys addr
+ * @dbuf_fd: File descriptor for dmabuf representing the model
+ * @program_offset_addr: Offset address for the instructions (NPU_PROG_BASE)
+ * @program_size: Size of the program instructions (NPU_PROG_SIZE)
+ * @version: The version of npubinfmt
+ * @endp_trnt_model_common: Indicator for the end of common model parameters
+ * @weight_offset_addr: Offset address for storing weights (NPU_WGT_BASE)
+ * @metadata_dbuf_fd: File descriptor for dmabuf representing the metadata
+ * @metadata_ext_dbuf_fd: File descriptor for dmabuf representing the metadata extra
+ * @metadata_ext_size: Size of the metadata extra
+ * @num_visa_insts: Number of virtual ISA instructions
+ */
+struct trinity_ioctl_model {
+ __u64 id;
+ __s32 dbuf_fd;
+ __u64 program_offset_addr;
+ __u64 program_size;
+ __u32 version;
+ union {
+ __u8 endp_trnt_model_common[0];
+ struct {
+ __u64 weight_offset_addr;
+ } __packed;
+ struct {
+ __s32 metadata_dbuf_fd;
+ __s32 metadata_ext_dbuf_fd;
+ __u64 metadata_ext_size;
+ __u32 num_visa_insts;
+ } __packed;
+ };
+} __packed;
+
+/**
+ * struct trinity_ioctl_input - Configures an input passed to the NPU
+ * @dbuf_fd: File descriptor for dmabuf of I/O buffer (or segment table)
+ * @model_id: Model id received when setting the model in the NPU
+ * @req_id: Request id to distinguish each run_input
+ * @timeout_ms: Timeout in ms, zero is regarded as preemption
+ * @priority: Priority (LOW: 0, MID: 1, HIGH: 2)
+ * @endp_trnt_input_common: Indicator for the end of common input parameters
+ * @activation_offset_addr0: Offset address for storing activations (NPU_ACT_BASE0)
+ * @activation_offset_addr1: Offset address for storing activations (NPU_ACT_BASE1)
+ * @num_segments: Number of segments
+ * @input_mode: Input mode (who is supposed to feed input)
+ * @output_mode: Output mode (who is supposed to retrieve output)
+ * @hw_input_seg: Third-party HW's input segment idx
+ * @hw_output_seg: Third-party HW's output segment idx
+ * @task_handle: user requested task handle
+ * @subtask_idx: user requested subtask idx
+ * @task_id: kernel module requested task id
+ */
+struct trinity_ioctl_input {
+ __s32 dbuf_fd;
+ __u64 model_id;
+ __s32 req_id;
+ __s64 timeout_ms;
+ __u32 priority;
+ union {
+ __u8 endp_trnt_input_common[0];
+ struct {
+ /* added for TRIV-1 */
+ __u64 activation_offset_addr0;
+ __u64 activation_offset_addr1;
+ } __packed;
+ struct {
+ /* added for TRIV-2 */
+ __u32 num_segments;
+ enum trinity_input_mode input_mode;
+ enum trinity_output_mode output_mode;
+ __s32 hw_input_seg;
+ __s32 hw_output_seg;
+ /* [optional] vd scheduler info */
+ union {
+ struct { /* user request */
+ __u32 task_handle;
+ __u32 subtask_idx;
+ } __packed;
+ struct { /* kernel request */
+ __u32 task_id;
+ } __packed;
+ };
+ } __packed;
+ };
+} __packed;
+
+#define TRINITY_MASK_DEV (0xFF000000)
+#define TRINITY_MASK_MAJOR_VER (0x00FF0000)
+#define TRINITY_MASK_MINOR_VER (0x0000FF00)
+#define TRINITY_MASK_EXTRA_VER (0x000000FF)
+
+/**
+ * struct trinity_ioctl_fpga_memcpy - A structure that contains driver-assisted memcpy
+ * @dbuf_fd: File descriptor for dmabuf of the target buffer
+ * @dbuf_off: Offset from the dmabuf base address
+ * @user_addr: Address of user-level buffer
+ * @user_size: Size of user-level buffer
+ *
+ * @note: This is a workaround structure for the FPGA environment
+ */
+struct trinity_ioctl_fpga_memcpy {
+ __s32 dbuf_fd;
+ __u32 dbuf_off;
+ void __user *user_addr;
+ __u64 user_size;
+} __packed;
+
+/**
+ * struct trinity_ioctl_idu - Describes IDU
+ * @dbuf_cp_data: CP data dmabuf which is allocated with TRINITY_IOCTL_HWMEM_ALLOC
+ * @dbuf_cp_code: CP code dmabuf which is allocated with TRINITY_IOCTL_HWMEM_ALLOC
+ * @dbuf_dsp_data: dsp data dmabuf which is allocated with TRINITY_IOCTL_HWMEM_ALLOC
+ * @dbuf_dsp_code: dsp code dmabuf which is allocated with TRINITY_IOCTL_HWMEM_ALLOC
+ */
+struct trinity_ioctl_idu {
+ __s32 dbuf_cp_data;
+ __s32 dbuf_cp_code;
+ __s32 dbuf_dsp_data;
+ __s32 dbuf_dsp_code;
+} __attribute__((packed));
+
+#define TRINITY_SHIFT_DEV (24)
+#define TRINITY_SHIFT_MAJOR_VER (16)
+#define TRINITY_SHIFT_MINOR_VER (8)
+#define TRINITY_SHIFT_EXTRA_VER (0)
+#define TRINITY_SHIFT_MODEL_ID (16)
+
+#define trinity_gen_ver(dev, mj, mn, ex) \
+ { \
+ (dev << TRINITY_SHIFT_DEV) | (mj << TRINITY_SHIFT_MAJOR_VER) | \
+ (mn << TRINITY_SHIFT_MINOR_VER) | \
+ (ex << TRINITY_SHIFT_EXTRA_VER) \
+ }
+
+/**
+ * enum trinity_dev_type - Enum that describes a trinity device type
+ * @TRINITY_DEV_UNKNOWN: Unknown device type
+ * @TRINITY_DEV_VISION: Trinity Vision (TRIV)
+ * @TRINITY_DEV_AUDIO: Trinity Asr (TRIA)
+ * @TRINITY_DEV_VISION2: Trinity Vision2 (TRIV2)
+ * @TRINITY_DEV_VISION2_CUSE: Trinity Vision2 (TRIV2), CUSE-based impl.
+ * @TRINITY_DEV_END: End of trinity_dev_type
+ */
+enum trinity_dev_type {
+ TRINITY_DEV_UNKNOWN = 0,
+ TRINITY_DEV_VISION,
+ TRINITY_DEV_AUDIO,
+ TRINITY_DEV_VISION2,
+ TRINITY_DEV_VISION2_CUSE, /* CUSE-based impl. for triv2 */
+ TRINITY_DEV_END /* sentinel */
+};
+
+/**
+ * Major number cannot be dynamic as ioctls need it.
+ */
+#define TRINITY_DRIVER_MAGIC 0x88
+
+#define TRINITY_IO(no) _IO(TRINITY_DRIVER_MAGIC, no)
+#define TRINITY_IOR(no, data_type) _IOR(TRINITY_DRIVER_MAGIC, no, data_type)
+#define TRINITY_IOW(no, data_type) _IOW(TRINITY_DRIVER_MAGIC, no, data_type)
+#define TRINITY_IOWR(no, data_type) _IOWR(TRINITY_DRIVER_MAGIC, no, data_type)
+
+/** Device Information */
+
+/** Get the device version information from the driver */
+#define TRINITY_IOCTL_GET_VERSION TRINITY_IOR(1, __u32)
+/** Get the device API level from the driver */
+#define TRINITY_IOCTL_GET_API_LEVEL TRINITY_IOR(2, __u32)
+/** Get the device state from the driver */
+#define TRINITY_IOCTL_GET_STATE TRINITY_IOR(3, __s32)
+/** Get the device tops information from the driver */
+#define TRINITY_IOCTL_GET_TOPS TRINITY_IOR(4, __u32)
+/** Get the device dspm information from the driver */
+#define TRINITY_IOCTL_GET_DSPM TRINITY_IOR(5, __u32)
+/** Get the next request ID from the driver */
+#define TRINITY_IOCTL_GET_NEXT_REQUEST TRINITY_IOR(6, __s32)
+
+/** Device Control */
+
+/** Allocate driver-managed memory */
+#define TRINITY_IOCTL_HWMEM_ALLOC TRINITY_IOW(21, struct trinity_ioctl_hwmem)
+
+/** De-allocate driver-managed memory */
+#define TRINITY_IOCTL_HWMEM_DEALLOC TRINITY_IOW(22, struct trinity_ioctl_hwmem)
+
+/** Register the given model config in the device and return model id */
+#define TRINITY_IOCTL_REGISTER_MODEL \
+ TRINITY_IOWR(23, struct trinity_ioctl_model)
+
+/** Unregister the model config associated with the given model_id */
+#define TRINITY_IOCTL_DEREGISTER_MODEL TRINITY_IOW(24, __u64)
+
+/** Run the device with the given input */
+#define TRINITY_IOCTL_RUN_INPUT TRINITY_IOWR(25, struct trinity_ioctl_input)
+
+/** Stop all requests submitted to the device */
+#define TRINITY_IOCTL_STOP_REQUESTS TRINITY_IO(26)
+
+/** Stop the target request with id returned by run_input */
+#define TRINITY_IOCTL_STOP_REQUEST TRINITY_IOW(27, __s32)
+
+/** Device Statistics/Profile */
+
+/** Get the current app stat in the opened device */
+#define TRINITY_IOCTL_STAT_CURRENT_APP \
+ TRINITY_IOR(51, struct trinity_ioctl_stat_app)
+
+/** Get latest apps' stat of the opened device */
+#define TRINITY_IOCTL_STAT_APPS TRINITY_IOR(52, struct trinity_ioctl_stat_apps)
+
+/** Get latest reqs' stat in the target app */
+#define TRINITY_IOCTL_STAT_REQS TRINITY_IOR(53, struct trinity_ioctl_stat_reqs)
+
+/** Get profiling metadata of the request */
+#define TRINITY_IOCTL_GET_PROFILE_META \
+ TRINITY_IOWR(54, struct trinity_ioctl_profile_meta)
+
+/** Get profiling per-op data of the request */
+#define TRINITY_IOCTL_GET_PROFILE_BUFF \
+ TRINITY_IOWR(55, struct trinity_ioctl_profile_buff)
+
+/** Set IDU binary */
+#define TRINITY_IOCTL_IDU_SET \
+ TRINITY_IOW(56, struct trinity_ioctl_idu)
+
+/** Device Testing/Workaround */
+
+/** Driver-assisted memory copy for FPGA env. */
+#define TRINITY_IOCTL_FPGA_MEMCPY \
+ TRINITY_IOWR(91, struct trinity_ioctl_fpga_memcpy)
+
+/** A wrapper of trinity_run_internal_req() */
+#define TRINITY_IOCTL_RUN_INTERNAL_REQ TRINITY_IOW(92, dev_t)
+
+#ifdef __KERNEL__
+__s32 trinity_run_internal_req(dev_t);
+#endif
+#endif /* __TRINITY_H__ */
--
2.25.1
This patch implements the ioctl operations that give the user
library control over the device. TRINITY_IOCTL_HWMEM_ALLOC/DEALLOC
allocate and free the memory used to load a model, the RUN/STOP
operations control NPU workloads, and several STAT operations
report NPU statistics.
Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
drivers/misc/trinity/trinity.c | 629 +++++++++++++++++++++++++++++++++
1 file changed, 629 insertions(+)
diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
index a785a5dca4d9..0fb5ccf9f035 100644
--- a/drivers/misc/trinity/trinity.c
+++ b/drivers/misc/trinity/trinity.c
@@ -22,6 +22,635 @@
static DEFINE_IDA(dev_nrs);
static DEFINE_IDA(model_ids);
+static uint64_t trinity_gen_model_id(int32_t dbuf_fd)
+{
+ uint64_t mid;
+
+ mid = ida_alloc_max(&model_ids, ((1 << TRINITY_SHIFT_MODEL_ID) - 1), GFP_KERNEL);
+ if ((int64_t)mid < 0)
+ return mid;
+
+ mid |= (dbuf_fd << TRINITY_SHIFT_MODEL_ID);
+
+ return mid;
+}
+
+static void trinity_release_model_id(uint64_t mid)
+{
+ ida_free(&model_ids, mid & ((1 << TRINITY_SHIFT_MODEL_ID) - 1));
+}
+
+static int32_t trinity_model_id_to_dbuf_fd(uint64_t id)
+{
+ return (id >> TRINITY_SHIFT_MODEL_ID) & UINT_MAX;
+}
+
+static void trinity_model_htable_init(const struct device *dev)
+{
+ int i;
+ struct trinity_driver *drv;
+
+ drv = dev_get_drvdata(dev);
+ for (i = 0; i < TRINITY_MODEL_HASH_SIZE; ++i)
+ INIT_HLIST_BL_HEAD(drv->model_htable + i);
+}
+
+static struct trinity_model *
+trinity_get_model_by_id(const struct trinity_driver *drv, const uint64_t id)
+{
+ struct hlist_bl_head *model_hlist;
+ struct hlist_bl_node *hn;
+ struct trinity_model *hm;
+ unsigned long key;
+ int32_t dbuf_fd;
+
+ dbuf_fd = trinity_model_id_to_dbuf_fd(id);
+ key = hash_long(dbuf_fd, TRINITY_MODEL_HASH_BITS);
+ model_hlist = (struct hlist_bl_head *)(drv->model_htable + key);
+
+ hlist_bl_lock(model_hlist);
+ hlist_bl_for_each_entry(hm, hn, model_hlist, hnode) {
+ if (hm->config.id != id)
+ continue;
+
+ hlist_bl_unlock(model_hlist);
+ return hm;
+ }
+ hlist_bl_unlock(model_hlist);
+
+ return NULL;
+}
+
+/**
+ * trinity_register_model() - Registers a model to the internal hashtable. Note
+ * that the device is responsible for the hashtable maintenance.
+ *
+ * @drv: An instance of the trinity driver
+ * @model: Model information to be registered
+ *
+ * Returns 0 and sets model->id with a valid value, which is unique system-wide,
+ * on success. Otherwise, returns negative error.
+ */
+int32_t trinity_register_model(struct trinity_driver *drv,
+ struct trinity_model *model)
+{
+ struct hlist_bl_head *model_hlist;
+ unsigned long key;
+ int32_t ret;
+
+ ret = trinity_hwmem_import_dmabuf_begin(drv_to_dev_ptr(drv),
+ model->config.dbuf_fd,
+ &model->import_info);
+ if (ret)
+ return ret;
+
+#ifdef ARM
+ /* sync model program data */
+ __cpuc_flush_dcache_area(model->import_info.addr,
+ model->import_info.buf->size);
+#endif
+
+ model->config.id = trinity_gen_model_id(model->config.dbuf_fd);
+ model->owner_id = trinity_get_app_id();
+
+ INIT_HLIST_BL_NODE(&model->hnode);
+
+ key = hash_long(model->config.dbuf_fd, TRINITY_MODEL_HASH_BITS);
+ model_hlist = (struct hlist_bl_head *)(drv->model_htable + key);
+
+ hlist_bl_lock(model_hlist);
+ hlist_bl_add_head(&model->hnode, model_hlist);
+ hlist_bl_unlock(model_hlist);
+
+ kref_init(&model->refcnt);
+
+ return 0;
+}
+
+static void trinity_destroy_model(struct kref *refcnt)
+{
+ struct trinity_model *model =
+ container_of(refcnt, struct trinity_model, refcnt);
+
+ trinity_release_model_id(model->config.id);
+ trinity_hwmem_import_dmabuf_end(&model->import_info);
+ kfree(model);
+}
+
+static void trinity_model_get(struct trinity_model *model)
+{
+ if (!model)
+ return;
+
+ kref_get(&model->refcnt);
+}
+
+static void trinity_model_put(struct trinity_model *model)
+{
+ if (!model)
+ return;
+
+ kref_put(&model->refcnt, trinity_destroy_model);
+}
+
+phys_addr_t trinity_get_paddr(struct iommu_domain *domain, dma_addr_t daddr)
+{
+ if (domain)
+ return iommu_iova_to_phys(domain, daddr);
+
+ return TRINITY_PADDR_BASE + daddr;
+}
+
+void trinity_finish_req(struct trinity_driver *drv, struct trinity_req *req)
+{
+ if (drv->desc->check_profile(drv, req) < 0)
+ dev_warn(drv_to_dev_ptr(drv),
+ "Unable to get profile data from NPU\n");
+ trinity_hwmem_import_dmabuf_end(&req->input.import_info);
+ trinity_stat_finish_req(drv, req);
+ trinity_model_put(req->model);
+}
+
+/**
+ * trinity_deregister_model() - Deregisters the model with a given id from the
+ * table
+ *
+ * @drv: An instance of the trinity driver
+ * @id: An id of the model to be deregistered
+ *
+ * Returns 0 on success. Otherwise, returns negative error.
+ */
+int32_t trinity_deregister_model(const struct trinity_driver *drv,
+ const uint64_t id)
+{
+ struct hlist_bl_head *model_hlist;
+ int32_t dbuf_fd;
+ unsigned long key;
+ struct hlist_bl_node *hn;
+ struct trinity_model *hm = NULL;
+
+ dbuf_fd = trinity_model_id_to_dbuf_fd(id);
+ key = hash_long(dbuf_fd, TRINITY_MODEL_HASH_BITS);
+ model_hlist = (struct hlist_bl_head *)(drv->model_htable + key);
+
+ hlist_bl_lock(model_hlist);
+ hlist_bl_for_each_entry(hm, hn, model_hlist, hnode) {
+ if (hm->config.id == id) {
+ hlist_bl_del_init(&hm->hnode);
+ break;
+ }
+ }
+ hlist_bl_unlock(model_hlist);
+
+ if (!hm)
+ return -ENOENT;
+
+ trinity_model_put(hm);
+
+ return 0;
+}
+
+/**
+ * trinity_deregister_models_owned() - Deregisters models owned
+ *
+ * @drv: An instance of the trinity driver
+ */
+void trinity_deregister_models_owned(struct trinity_driver *drv)
+{
+ struct hlist_bl_head *model_htable;
+ struct trinity_model *hm;
+ struct hlist_bl_node *hn;
+ int i, app_id;
+
+ app_id = trinity_get_app_id();
+ model_htable = drv->model_htable;
+retry:
+ for (i = 0; i < TRINITY_MODEL_HASH_SIZE; i++) {
+ hlist_bl_lock(model_htable + i);
+ hlist_bl_for_each_entry(hm, hn, model_htable + i, hnode) {
+ if (hm->owner_id == app_id) {
+ hlist_bl_del_init(&hm->hnode);
+ hlist_bl_unlock(model_htable + i);
+
+ trinity_model_put(hm);
+
+ goto retry;
+ }
+ }
+ hlist_bl_unlock(model_htable + i);
+ }
+}
+
+static int32_t trinity_submit_req(struct trinity_driver *drv,
+ struct trinity_req *req)
+{
+ struct device *dev;
+ wait_queue_head_t wq;
+ unsigned long timeout, timeout_ms;
+ unsigned long retry;
+ int ret = 0;
+
+ dev = drv_to_dev_ptr(drv);
+
+ /* optional req setup before submission */
+ if (drv->desc->prepare_req) {
+ ret = drv->desc->prepare_req(drv, req);
+ if (ret < 0) {
+ dev_err(dev, "Unable to prepare req submission: %d\n",
+ ret);
+ return ret;
+ }
+ }
+
+ req->submit_retry = 0;
+ timeout_ms = req->input.config.timeout_ms;
+ /* use the default timeout if a user didn't set */
+ if (timeout_ms == 0)
+ timeout_ms = TRINITY_RUN_TIMEOUT_MSEC;
+
+ retry = 0;
+ init_waitqueue_head(&wq);
+ init_completion(&req->complete);
+
+ timeout = msecs_to_jiffies(timeout_ms);
+ while (wait_event_interruptible_timeout(wq, trinity_sched_ready(drv),
+ timeout / 10) == 0) {
+ if (retry == 10) {
+ ret = -ETIMEDOUT;
+ break;
+ }
+ retry++;
+ }
+
+ if (ret == 0) {
+ ret = trinity_stat_append_req(drv, req);
+ if (ret < 0) {
+ dev_err(dev, "Unable to append request stat: %d\n", ret);
+ return ret;
+ }
+
+ ret = trinity_sched_submit(drv, req);
+ if (ret < 0)
+ trinity_stat_remove_req(drv, req, true);
+ }
+
+ if (ret < 0) {
+ dev_err(dev, "Unable to submit req to scheduler: %d\n", ret);
+ return ret;
+ }
+
+ if (req->input.config.output_mode != TRINITY_OUTPUT_HW) {
+ timeout = wait_for_completion_timeout(&req->complete, timeout);
+ /* Check and handle the timeout if its handler exists */
+ if (timeout == 0) {
+ drv->desc->handle_timeout(drv, req);
+
+ req->stat->status = TRINITY_REQ_STATUS_ERROR;
+ ret = -ECANCELED;
+ } else if (req->stat->status == TRINITY_REQ_STATUS_ERROR) {
+ ret = -ECANCELED;
+ }
+
+ trinity_finish_req(drv, req);
+ }
+
+ return ret;
+}
+
+static int32_t trinity_run_input(struct trinity_driver *drv,
+ struct trinity_input *input,
+ struct trinity_req *req)
+{
+ struct trinity_model *model;
+ int32_t err;
+
+ model = trinity_get_model_by_id(drv, input->config.model_id);
+ if (!model) {
+ dev_err(drv_to_dev_ptr(drv), "Unable to find the model\n");
+ return -EINVAL;
+ }
+
+ /* skip to submit this req */
+ if (model->config.program_size == 0 &&
+ input->config.output_mode != TRINITY_OUTPUT_HW)
+ return 0;
+
+ trinity_model_get(model);
+
+ err = trinity_hwmem_import_dmabuf_begin(drv_to_dev_ptr(drv),
+ input->config.dbuf_fd,
+ &input->import_info);
+ if (err < 0)
+ return err;
+
+ req->model = model;
+ err = trinity_submit_req(drv, req);
+ if (err == 0)
+ return 0;
+
+ if (err != -ECANCELED)
+ trinity_hwmem_import_dmabuf_end(&input->import_info);
+ return err;
+}
+
+/**
+ * trinity_ioctl() - A common callback for unlocked_ioctl() in file_operations for
+ * a Trinity device node.
+ *
+ * @f: A file instance of the opened device node
+ * @cmd: The target IOCTL command to be handled
+ * @arg: A user argument
+ *
+ * Returns 0 on success. Otherwise, returns negative error.
+ */
+long trinity_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+ struct trinity_driver *drv = f->private_data;
+ const struct trinity_desc *desc = drv->desc;
+ ssize_t err = 0L;
+
+ switch (cmd) {
+ case TRINITY_IOCTL_GET_VERSION: {
+ if (copy_to_user((uint32_t __user *)arg, &(desc->ver),
+ sizeof((desc->ver))))
+ return -EFAULT;
+
+ break;
+ }
+ case TRINITY_IOCTL_GET_API_LEVEL: {
+ uint32_t api_level = TRINITY_API_LEVEL;
+
+ if (copy_to_user((uint32_t __user *)arg, &api_level,
+ sizeof(api_level)))
+ return -EFAULT;
+
+ break;
+ }
+ case TRINITY_IOCTL_GET_STATE: {
+ enum trinity_state ready;
+
+ ready = drv->desc->get_state(drv);
+ if (copy_to_user((enum trinity_state __user *)arg, &ready,
+ sizeof(ready)))
+ return -EFAULT;
+
+ break;
+ }
+ case TRINITY_IOCTL_GET_TOPS: {
+ if (copy_to_user((uint32_t __user *)arg, &(drv->tops),
+ sizeof((drv->tops))))
+ return -EFAULT;
+
+ break;
+ }
+ case TRINITY_IOCTL_GET_DSPM: {
+ if (copy_to_user((uint32_t __user *)arg, &(drv->dspm),
+ sizeof((drv->dspm))))
+ return -EFAULT;
+
+ break;
+ }
+ case TRINITY_IOCTL_GET_NEXT_REQUEST: {
+ int32_t req_id = atomic_inc_return(&drv->global_req_id);
+
+ if (copy_to_user((int32_t __user *)arg, &req_id,
+ sizeof(req_id)))
+ return -EFAULT;
+
+ break;
+ }
+ case TRINITY_IOCTL_HWMEM_ALLOC: {
+ struct trinity_ioctl_hwmem hwmem;
+
+ if (copy_from_user(&hwmem, (void __user *)arg, sizeof(hwmem)))
+ return -EFAULT;
+
+ err = trinity_hwmem_alloc(drv_to_dev_ptr(drv), hwmem.size,
+ hwmem.type);
+ if (err >= 0)
+ trinity_stat_app_total_alloc(drv, hwmem.size);
+
+ break;
+ }
+ case TRINITY_IOCTL_HWMEM_DEALLOC: {
+ struct trinity_ioctl_hwmem hwmem;
+ struct dma_buf *dbuf;
+
+ if (copy_from_user(&hwmem, (void __user *)arg, sizeof(hwmem)))
+ return -EFAULT;
+
+ dbuf = dma_buf_get(hwmem.dbuf_fd);
+ if (IS_ERR(dbuf))
+ return PTR_ERR(dbuf);
+
+ err = trinity_hwmem_free(drv_to_dev_ptr(drv), hwmem.dbuf_fd);
+ if (err == 0)
+ trinity_stat_app_total_freed(drv, dbuf->size);
+
+ break;
+ }
+ case TRINITY_IOCTL_REGISTER_MODEL: {
+ struct trinity_model *model =
+ kzalloc(sizeof(struct trinity_model), GFP_KERNEL);
+
+ if (!model)
+ return -ENOMEM;
+
+ if (copy_from_user(&model->config,
+ (struct trinity_model __user *)arg,
+ sizeof(model->config))) {
+ kfree(model);
+ return -EFAULT;
+ }
+
+ err = trinity_register_model(drv, model);
+ if (err < 0)
+ break;
+
+ if (copy_to_user((struct trinity_model __user *)arg,
+ &model->config, sizeof(model->config)))
+ return -EFAULT;
+
+ break;
+ }
+ case TRINITY_IOCTL_DEREGISTER_MODEL: {
+ uint64_t id;
+
+ if (copy_from_user(&id, (uint64_t __user *)arg, sizeof(id)))
+ return -EFAULT;
+
+ err = trinity_deregister_model(drv, id);
+
+ break;
+ }
+ case TRINITY_IOCTL_RUN_INPUT: {
+ struct trinity_req *req;
+ struct trinity_input *input;
+
+ if (!IDU_LOADED(drv))
+ return -EFAULT;
+
+ req = drv->desc->alloc_req(drv);
+ if (!req)
+ return -ENOMEM;
+ req->drv = drv;
+ req->time_started = ktime_get();
+
+ input = &(req->input);
+ /* run input based on config received from the user */
+ if (copy_from_user(&input->config,
+ (struct trinity_input __user *)arg,
+ sizeof(input->config))) {
+ drv->desc->dealloc_req(drv, req);
+ return -EFAULT;
+ }
+
+ err = trinity_run_input(drv, input, req);
+ if (err < 0) {
+ drv->desc->dealloc_req(drv, req);
+ return err;
+ }
+
+ if (copy_to_user((struct trinity_input __user *)arg,
+ &input->config, sizeof(input->config))) {
+ drv->desc->dealloc_req(drv, req);
+ return -EFAULT;
+ }
+
+ /* this will be freed when stop request is called */
+ if (!req->is_kernel)
+ drv->desc->dealloc_req(drv, req);
+
+ break;
+ }
+ case TRINITY_IOCTL_STOP_REQUESTS: {
+ if (!IDU_LOADED(drv))
+ return -EFAULT;
+
+ if (drv->desc->stop_reqs)
+ schedule_work(&drv->work_stop);
+
+ break;
+ }
+ case TRINITY_IOCTL_STAT_CURRENT_APP: {
+ struct trinity_ioctl_stat_app ioctl_stat_app;
+
+ if (copy_from_user(&ioctl_stat_app,
+ (struct trinity_ioctl_stat_app __user *)arg,
+ sizeof(ioctl_stat_app)))
+ return -EFAULT;
+
+ trinity_stat_app_copy_ioctl(drv, &ioctl_stat_app);
+
+ if (copy_to_user((struct trinity_ioctl_stat_app __user *)arg,
+ &ioctl_stat_app, sizeof(ioctl_stat_app)))
+ return -EFAULT;
+
+ break;
+ }
+ case TRINITY_IOCTL_STAT_APPS: {
+ struct trinity_ioctl_stat_apps ioctl_stat_apps;
+
+ if (copy_from_user(&ioctl_stat_apps,
+ (struct trinity_ioctl_stat_apps __user *)arg,
+ sizeof(ioctl_stat_apps)))
+ return -EFAULT;
+
+ trinity_stat_apps_copy_ioctl(drv, &ioctl_stat_apps);
+
+ if (copy_to_user((struct trinity_ioctl_stat_apps __user *)arg,
+ &ioctl_stat_apps, sizeof(ioctl_stat_apps)))
+ return -EFAULT;
+
+ break;
+ }
+ case TRINITY_IOCTL_STAT_REQS: {
+ struct trinity_ioctl_stat_reqs ioctl_stat_reqs;
+
+ if (copy_from_user(&ioctl_stat_reqs,
+ (struct trinity_ioctl_stat_reqs __user *)arg,
+ sizeof(ioctl_stat_reqs)))
+ return -EFAULT;
+
+ if (ioctl_stat_reqs.app_id == 0)
+ ioctl_stat_reqs.app_id = trinity_get_app_id();
+
+ trinity_stat_reqs_copy_ioctl(drv, &ioctl_stat_reqs);
+
+ if (copy_to_user((struct trinity_ioctl_stat_reqs __user *)arg,
+ &ioctl_stat_reqs, sizeof(ioctl_stat_reqs)))
+ return -EFAULT;
+
+ break;
+ }
+ case TRINITY_IOCTL_GET_PROFILE_META: {
+ struct trinity_ioctl_profile_meta profile;
+
+ if (copy_from_user(
+ &profile,
+ (struct trinity_ioctl_profile_meta __user *)arg,
+ sizeof(profile)))
+ return -EFAULT;
+
+ if (drv->desc->get_profile_meta) {
+ err = drv->desc->get_profile_meta(drv, &profile);
+ } else {
+ profile.total_cycles = -1;
+ profile.total_ops = 0;
+ profile.profile_size = 0;
+ profile.input_footprint = -1;
+ profile.output_footprint = -1;
+ }
+
+ if (copy_to_user((struct trinity_ioctl_profile_meta __user *)arg,
+ &profile, sizeof(profile)))
+ return -EFAULT;
+
+ break;
+ }
+ case TRINITY_IOCTL_GET_PROFILE_BUFF: {
+ struct trinity_ioctl_profile_buff profile;
+
+ if (copy_from_user(
+ &profile,
+ (struct trinity_ioctl_profile_buff __user *)arg,
+ sizeof(profile)))
+ return -EFAULT;
+
+ if (drv->desc->get_profile_buff)
+ err = drv->desc->get_profile_buff(drv, &profile);
+
+ if (copy_to_user((struct trinity_ioctl_profile_buff __user *)arg,
+ &profile, sizeof(profile)))
+ return -EFAULT;
+
+ break;
+ }
+ case TRINITY_IOCTL_IDU_SET: {
+ struct trinity_ioctl_idu idu;
+
+ if ((struct trinity_ioctl_idu __user *)arg == NULL) {
+ drv->desc->idu_unset(drv);
+ break;
+ }
+
+ if (copy_from_user(&idu,
+ (struct trinity_ioctl_idu __user *)arg,
+ sizeof(idu))) {
+ return -EFAULT;
+ }
+
+ err = drv->desc->idu_set(drv, &idu);
+
+ break;
+ }
+ default:
+ return -ENOTTY;
+ }
+
+ return err;
+}
+
/**
* trinity_release() - A common callback for close() in file_operations for a
* Trinity device node. If there are device-specific data to be
--
2.25.1
On Sat, Sep 17, 2022, at 9:23 AM, Jiho Chu wrote:
> It contains the base codes for trinity driver. Minimal codes to load and
> probe device is provided. The Trinity Family is controlled by the
> Memory-Mapped Registers, the register addresses and offsets are
> described. And user api interfaces are presented to control device under
> ioctl manner.
I'm not doing a full review of the driver at the moment, but
here are some comments on the usage of chardev ioctl based on
Documentation/driver-api/ioctl.rst
> +int trinity_probe(struct platform_device *pdev, const struct
> trinity_desc *desc)
> +{
> + struct device_node *np;
> + struct device *dev;
> + struct trinity_driver *drv;
> + int i, err;
> +
> + dev = &pdev->dev;
> + dev->id = ((desc->ver & TRINITY_MASK_DEV) >> TRINITY_SHIFT_DEV);
> +
> + /* set private data */
> + drv = devm_kzalloc(dev, sizeof(*drv), GFP_KERNEL);
> + if (!drv)
> + return -ENOMEM;
> +
> + drv->dev_id = ida_alloc(&dev_nrs, GFP_KERNEL);
> + if (drv->dev_id < 0) {
> + devm_kfree(dev, drv);
> + return drv->dev_id;
> + }
> + snprintf(drv->name, DEV_NAME_LEN, "%s-%u", desc->type, drv->dev_id);
> +
> + platform_set_drvdata(pdev, drv);
> + dev_set_drvdata(dev, drv);
> +
If you have the need to manage multiple devices here, maybe use
a dynamic major number and have the chardev code allocate the
minor numbers, instead of using multiple misc devices and
doing that yourself.
> +
> +#ifndef TASK_COMM_LEN
> +#define TASK_COMM_LEN 16
> +#endif
> +
> +#define TRINITY_APP_NAME_MAX TASK_COMM_LEN
> +#define TRINITY_APP_STAT_MAX 10
> +#define TRINITY_REQ_STAT_MAX 10
The structure layout should not depend on whether an application
has included a header that defines TASK_COMM_LEN.
What is the purpose of including an application name here?
> +/**
> + * struct trinity_ioctl_stat_app - Describes stat of the target app
> + * @app_id: Trinity app id (currently, equal to pid)
> + * @name: Trinity app name
> + * @status: Trinity app status
> + * @num_total_reqs: Number of total requests in app (including
> finished ones)
> + * @num_active_reqs: Number of active (running or pending) requests in
> app
> + * @total_alloc_mem: Total size of allocated memory in the device
> + * @total_freed_mem: Total size of freed memory in the device
> + */
> +struct trinity_ioctl_stat_app {
> + __s32 app_id;
> +
> + char name[TRINITY_APP_NAME_MAX];
> + enum trinity_app_status status;
> +
> + __u32 num_total_reqs;
> + __u32 num_active_reqs;
> +
> + __u64 total_alloc_mem;
> + __u64 total_freed_mem;
> +} __packed;
'enum' in a uapi structure is not well-defined across
architectures, so better use a fixed-size type there.
Instead of packing the structure, you should keep all
members naturally aligned and add explicit padding
or change some members from 32-bit to 64-bit size
to keep everything naturally aligned.
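As a rough, untested sketch of that layout: the structure quoted above
could drop both the enum and __packed as shown below. The __u32 type
for the status field and the literal 16 for the name length are
assumptions made for illustration, not something the patch defines.

```c
#include <assert.h>       /* static_assert (C11) */
#include <stddef.h>       /* offsetof */
#include <linux/types.h>  /* __s32, __u32, __u64 */

/* Assumption: a fixed literal instead of depending on TASK_COMM_LEN. */
#define TRINITY_APP_NAME_MAX 16

struct trinity_ioctl_stat_app {
	__s32 app_id;
	char name[TRINITY_APP_NAME_MAX];
	__u32 status;            /* carries enum trinity_app_status values */
	__u32 num_total_reqs;
	__u32 num_active_reqs;
	__u64 total_alloc_mem;   /* starts at offset 32, a multiple of 8 */
	__u64 total_freed_mem;
};

/* The layout is identical with or without __packed, so no padding member
 * is needed for this particular structure.
 */
static_assert(offsetof(struct trinity_ioctl_stat_app, total_alloc_mem) == 32,
	      "64-bit members must be naturally aligned");
static_assert(sizeof(struct trinity_ioctl_stat_app) == 48,
	      "no implicit padding expected");
```

With this member order every field already sits on its natural
alignment, so the compile-time checks only document the layout that
32-bit and 64-bit user space are expected to share.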
> +/**
> + * struct trinity_ioctl_profile_buff - Describes profiling buff info.
> + * @req_id: The target req id for profiling
> + * @profile_pos: The start position to extract profiling data
> + * @profile_size: The size of user-allocated profiling buffer
> + * @profile_buf: The profiling buffer which user allocated
> + */
> +struct trinity_ioctl_profile_buff {
> + __s32 req_id;
> + __u32 profile_pos;
> + __u32 profile_size;
> + void __user *profile_buf;
> +} __packed;
Don't put pointers into ioctl structures, they just make compat
mode unnecessarily hard. You can use a __u64 member.
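A minimal sketch of that change, assuming a hypothetical 'reserved'
padding member and a hypothetical user-space helper (neither is part of
the patch):

```c
#include <assert.h>       /* static_assert (C11) */
#include <stddef.h>       /* offsetof */
#include <stdint.h>       /* uintptr_t */
#include <linux/types.h>  /* __s32, __u32, __u64 */

struct trinity_ioctl_profile_buff {
	__s32 req_id;
	__u32 profile_pos;
	__u32 profile_size;
	__u32 reserved;     /* hypothetical explicit padding, set to zero */
	__u64 profile_buf;  /* user address of the buffer, as an integer */
};

/* Same size and offsets for 32-bit and 64-bit user space. */
static_assert(offsetof(struct trinity_ioctl_profile_buff, profile_buf) == 16,
	      "profile_buf must be 8-byte aligned");
static_assert(sizeof(struct trinity_ioctl_profile_buff) == 24,
	      "no implicit padding expected");

/* Hypothetical user-space helper: store a buffer pointer in the __u64. */
static inline void
trinity_profile_buff_set(struct trinity_ioctl_profile_buff *arg,
			 void *buf, __u32 size)
{
	arg->profile_size = size;
	arg->reserved = 0;
	arg->profile_buf = (__u64)(uintptr_t)buf;
}
```

On the kernel side the member would be converted back with
u64_to_user_ptr() before being handed to copy_to_user().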
> +/**
> + * Major number can not be dynamic as ioctls need it,
> + */
> +#define TRINITY_DRIVER_MAGIC 0x88
> +
> +#define TRINITY_IO(no) _IO(TRINITY_DRIVER_MAGIC, no)
> +#define TRINITY_IOR(no, data_type) _IOR(TRINITY_DRIVER_MAGIC, no,
> data_type)
> +#define TRINITY_IOW(no, data_type) _IOW(TRINITY_DRIVER_MAGIC, no,
> data_type)
> +#define TRINITY_IOWR(no, data_type) _IOWR(TRINITY_DRIVER_MAGIC, no,
> data_type)
These macros just hurt tools that want to parse the headers.
Please just open-code the usage.
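One possible reading of "open-code", shown for a few of the commands
only: call the standard _IO*() macros directly and keep just the magic
number as a named constant. The command numbers and argument types are
taken from the patch; everything else is an untested sketch.

```c
#include <assert.h>       /* used by callers that want to check the encoding */
#include <linux/ioctl.h>  /* _IO, _IOR, _IOW, _IOC_* helpers */
#include <linux/types.h>  /* __s32, __u32 */

#define TRINITY_DRIVER_MAGIC 0x88

/* Each command is spelled out with the generic macro, no wrapper layer. */
#define TRINITY_IOCTL_GET_VERSION   _IOR(TRINITY_DRIVER_MAGIC, 1, __u32)
#define TRINITY_IOCTL_GET_API_LEVEL _IOR(TRINITY_DRIVER_MAGIC, 2, __u32)
#define TRINITY_IOCTL_STOP_REQUESTS _IO(TRINITY_DRIVER_MAGIC, 26)
#define TRINITY_IOCTL_STOP_REQUEST  _IOW(TRINITY_DRIVER_MAGIC, 27, __s32)
```

The encoded values are the same as with the wrapper macros, so existing
user space would not notice the difference.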
> +#ifdef __KERNEL__
> +__s32 trinity_run_internal_req(dev_t);
> +#endif
This doesn't seem to belong in the uapi header.
Arnd
This patch includes sysfs interfaces.
The sysfs interface provides the NPU's internal statistics, status and
control attributes.
The sysfs information provided by Trinity is:
- IDU version
- profiling result
- allocated debugfs buffer
The control attributes include:
- initialize profile operation
- NPU control (suspend/resume/stop)
Signed-off-by: Jiho Chu <[email protected]>
Signed-off-by: Yelin Jeong <[email protected]>
Signed-off-by: Dongju Chae <[email protected]>
Signed-off-by: MyungJoo Ham <[email protected]>
---
.../ABI/testing/sysfs-driver-trinity | 55 ++
drivers/misc/trinity/Makefile | 1 +
drivers/misc/trinity/trinity_sysfs.c | 667 ++++++++++++++++++
3 files changed, 723 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-driver-trinity
create mode 100644 drivers/misc/trinity/trinity_sysfs.c
diff --git a/Documentation/ABI/testing/sysfs-driver-trinity b/Documentation/ABI/testing/sysfs-driver-trinity
new file mode 100644
index 000000000000..754e6f36a1dc
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-trinity
@@ -0,0 +1,55 @@
+What: /sys/devices/platform/trinity/*.triv2/debug/debugfs_max
+Date: July 2022
+KernelVersion: 5.19-rc8
+Contact: Jiho Chu <[email protected]>
+Description: Shows the currently allocated debugfs entry size.
+ Note that writing a max entry size allocates the NPU's
+ hardware memory for debugfs entries.
+
+What: /sys/devices/platform/trinity/*.triv2/debug/idu_version
+Date: July 2022
+KernelVersion: 5.19-rc8
+Contact: Jiho Chu <[email protected]>
+Description: Shows IDU version
+
+What: /sys/devices/platform/trinity/*.triv2/debug/show_profile
+Date: July 2022
+Contact: Jiho Chu <[email protected]>
+KernelVersion: 5.19-rc8
+Description: Shows profile information.
+ After writing Request ID, it shows information for the
+ request. This includes number of total cycles, number of
+ total operations and further information
+ (read/write count etc.) for each operation.
+
+What: /sys/devices/platform/trinity/*.triv2/control/profile
+Date: July 2022
+Contact: Jiho Chu <[email protected]>
+Description: Initializes the NPU profile operation with a profile size.
+ It allocates the NPU's hardware memory and activates the
+ profile operation in the NPU. The size is written in bytes.
+
+What: /sys/devices/platform/trinity/*.triv2/control/reset
+Date: July 2022
+KernelVersion: 5.19-rc8
+Contact: Jiho Chu <[email protected]>
+Description: Resets the NPU and reloads the IDU binary.
+
+What: /sys/devices/platform/trinity/*.triv2/control/resume
+Date: July 2022
+KernelVersion: 5.19-rc8
+Contact: Jiho Chu <[email protected]>
+Description: Resume NPU operation
+
+What: /sys/devices/platform/trinity/*.triv2/control/suspend
+Date: July 2022
+KernelVersion: 5.19-rc8
+Contact: Jiho Chu <[email protected]>
+Description: Enter suspend state
+
+What: /sys/devices/platform/trinity/*.triv2/control/stop
+Date: July 2022
+KernelVersion: 5.19-rc8
+Contact: Jiho Chu <[email protected]>
+Description: Cancels all NPU workloads.
+
diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
index b475938a0db6..462b7c61f39f 100644
--- a/drivers/misc/trinity/Makefile
+++ b/drivers/misc/trinity/Makefile
@@ -7,5 +7,6 @@ trinity-y += trinity_dma.o trinity_hwmem.o
trinity-y += trinity_sched.o
trinity-y += trinity_debug.o
trinity-y += trinity_stat.o
+trinity-y += trinity_sysfs.o
trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
diff --git a/drivers/misc/trinity/trinity_sysfs.c b/drivers/misc/trinity/trinity_sysfs.c
new file mode 100644
index 000000000000..d716607efa28
--- /dev/null
+++ b/drivers/misc/trinity/trinity_sysfs.c
@@ -0,0 +1,667 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Sysfs interface for Samsung Research Trinity device family
+ *
+ * Copyright (C) 2020-2022 Samsung Electronics
+ * Copyright (C) 2020 Dongju Chae <[email protected]>
+ * Copyright (C) 2020 Wook Song <[email protected]>
+ * Copyright (C) 2022 MyungJoo Ham <[email protected]>
+ * Copyright (C) 2022 Yelin Jeong <[email protected]>
+ * Copyright (C) 2022 Jiho Chu <[email protected]>
+ */
+
+#include <linux/device.h>
+#include <linux/sysfs.h>
+
+#include "trinity_common.h"
+#include "trinity_stat.h"
+
+enum trinity_sysfs_msg {
+ SYSFS_MSG_NORMAL = 0,
+ SYSFS_MSG_PROLOGUE,
+ SYSFS_MSG_EPILOGUE,
+ SYSFS_MSG_EMIT,
+};
+
+static ssize_t debugfs_max_store(struct device *dev,
+ struct device_attribute *attr, const char *buf,
+ size_t count)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ unsigned long msg_max;
+ int32_t ret = 0;
+
+ ret = kstrtoul(buf, 10, &msg_max);
+ if (ret != 0)
+ return -EINVAL;
+
+ trinity_debug_clear(drv, msg_max);
+
+ return count;
+}
+
+static ssize_t debugfs_max_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+
+ return snprintf(buf, PAGE_SIZE, "%lu\n", trinity_debug_get_max(drv));
+}
+static DEVICE_ATTR_RW(debugfs_max);
+
+static ssize_t show_profile_store(struct device *dev,
+ struct device_attribute *attr, const char *buf,
+ size_t count)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ unsigned long val;
+ int32_t ret = 0;
+
+ ret = kstrtoul(buf, 10, &val);
+ if (ret != 0)
+ return -EINVAL;
+
+ drv->profile_req_id = val;
+
+ return count;
+}
+
+static ssize_t show_profile_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+
+ if (!drv->desc->get_profile)
+ return snprintf(buf, PAGE_SIZE, "profile is not supported\n");
+
+ if (drv->profile_req_id < 0)
+ return snprintf(buf, PAGE_SIZE, "invalid request id(%d)\n",
+ drv->profile_req_id);
+
+ return drv->desc->get_profile(drv, buf, drv->profile_req_id);
+}
+static DEVICE_ATTR_RW(show_profile);
+
+static ssize_t idu_version_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+
+ if (drv->desc->idu_version) {
+ uint32_t major, minor, extra;
+
+ if (drv->desc->idu_version(drv, &major, &minor, &extra) == 0)
+ return snprintf(buf, PAGE_SIZE, "v%u.%u.%u\n", major,
+ minor, extra);
+ }
+
+ return snprintf(buf, PAGE_SIZE,
+ "Unknown... v0.30.7 or higher version required.\n");
+}
+static DEVICE_ATTR_RO(idu_version);
+
+static struct attribute *trinity_attrs_debug[] = {
+ &dev_attr_debugfs_max.attr, &dev_attr_show_profile.attr,
+ &dev_attr_idu_version.attr, NULL
+};
+
+/* e.g., /sys/devices/platform/304f0000.triv2/debug/ */
+static struct attribute_group trinity_attrs_debug_group = {
+ .name = "debug",
+ .attrs = trinity_attrs_debug
+};
+
+static ssize_t max_stat_apps_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ unsigned long val;
+ int32_t ret = 0;
+
+ ret = kstrtoul(buf, 10, &val);
+ if (ret != 0)
+ return -EINVAL;
+
+ trinity_stat_resize(drv, val, 0, 0);
+
+ return count;
+}
+
+static ssize_t max_stat_apps_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+
+ return snprintf(buf, PAGE_SIZE, "%lu\n",
+ trinity_stat_get_max_apps(drv));
+}
+static DEVICE_ATTR_RW(max_stat_apps);
+
+static ssize_t max_stat_reqs_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ unsigned long val;
+ int32_t ret = 0;
+
+ ret = kstrtoul(buf, 10, &val);
+ if (ret != 0)
+ return -EINVAL;
+
+ trinity_stat_resize(drv, 0, val, 0);
+
+ return count;
+}
+
+static ssize_t max_stat_reqs_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+
+ return snprintf(buf, PAGE_SIZE, "%lu\n",
+ trinity_stat_get_max_reqs(drv));
+}
+static DEVICE_ATTR_RW(max_stat_reqs);
+
+static ssize_t max_stat_reqs_per_app_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ unsigned long val;
+ int32_t ret = 0;
+
+ ret = kstrtoul(buf, 10, &val);
+ if (ret != 0)
+ return -EINVAL;
+
+ trinity_stat_resize(drv, 0, 0, val);
+
+ return count;
+}
+
+static ssize_t max_stat_reqs_per_app_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+
+ return snprintf(buf, PAGE_SIZE, "%lu\n",
+ trinity_stat_get_max_reqs_per_app(drv));
+}
+static DEVICE_ATTR_RW(max_stat_reqs_per_app);
+
+static ssize_t mem_usage_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ struct trinity_stat_app *stat_app;
+ ssize_t pos = 0;
+ bool first = true;
+
+ trinity_stat_lock(&drv->stat);
+
+ list_for_each_entry(stat_app, &drv->stat.list, lnode) {
+ if (first) {
+ pos += scnprintf(
+ buf + pos, PAGE_SIZE - pos,
+ "Memory usage statistics for all opened devices\n");
+ first = false;
+ }
+
+ pos += scnprintf(
+ buf + pos, PAGE_SIZE - pos,
+ " [%d] total_alloc: %llu bytes, total_freed: %llu bytes\n",
+ stat_app->app_id, stat_app->total_alloc_mem,
+ stat_app->total_freed_mem);
+ }
+
+ if (first)
+ pos += scnprintf(buf + pos, PAGE_SIZE - pos, "No active devices\n");
+
+ trinity_stat_unlock(&drv->stat);
+
+ return pos;
+}
+static DEVICE_ATTR_RO(mem_usage);
+
+#define MODEL_REGISTERED_PROLOGUE \
+ "\n Model statistics registered in all opened devices\n" \
+ "+--------------+--------------+-----------+------------+\n" \
+ "| Model ID | Model Size | Dmabuf FD | Offset |\n" \
+ "+--------------+--------------+-----------+------------+\n"
+#define MODEL_REGISTERED_NORMAL "| %#12llx | %#12llx | %9d | %#10llx |\n"
+#define MODEL_REGISTERED_EPILOGUE \
+ "+--------------+--------------+-----------+------------+\n"
+
+static ssize_t print_registered_models(const struct trinity_model *model,
+ char *buf, enum trinity_sysfs_msg msg)
+{
+ ssize_t pos = 0;
+
+ switch (msg) {
+ case SYSFS_MSG_PROLOGUE:
+ pos = snprintf(buf, PAGE_SIZE, MODEL_REGISTERED_PROLOGUE);
+ break;
+ case SYSFS_MSG_NORMAL:
+ pos = snprintf(buf, PAGE_SIZE, MODEL_REGISTERED_NORMAL,
+ model->config.id, model->config.program_size,
+ model->config.dbuf_fd,
+ model->config.program_offset_addr);
+ break;
+ case SYSFS_MSG_EPILOGUE:
+ pos = snprintf(buf, PAGE_SIZE, MODEL_REGISTERED_EPILOGUE);
+ break;
+ default:
+ break;
+ }
+
+ return pos;
+}
+
+static ssize_t registered_models_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ struct hlist_bl_head *model_htable;
+ struct trinity_model *model;
+ struct hlist_bl_node *hn;
+ ssize_t pos;
+ int i, num_printed = 0;
+
+ model_htable = drv->model_htable;
+
+ pos = print_registered_models(NULL, buf, SYSFS_MSG_PROLOGUE);
+
+ for (i = 0; i < TRINITY_MODEL_HASH_SIZE; i++) {
+ hlist_bl_lock(model_htable + i);
+ hlist_bl_for_each_entry(model, hn, model_htable + i, hnode) {
+ pos += print_registered_models(model, buf + pos,
+ SYSFS_MSG_NORMAL);
+ num_printed++;
+ }
+ hlist_bl_unlock(model_htable + i);
+ }
+
+ if (num_printed > 0)
+ pos += print_registered_models(NULL, buf + pos,
+ SYSFS_MSG_EPILOGUE);
+
+ return pos;
+}
+static DEVICE_ATTR_RO(registered_models);
+
+static const char *priority_to_string(enum trinity_req_priority priority)
+{
+ static const char *const priority_strings[] = {
+ [TRINITY_REQ_PRIORITY_LOW] = "Low",
+ [TRINITY_REQ_PRIORITY_MID] = "Mid",
+ [TRINITY_REQ_PRIORITY_HIGH] = "High",
+ };
+ return priority_strings[priority];
+}
+
+static const char *status_to_string(enum trinity_req_status status)
+{
+ static const char *const status_strings[] = {
+ [TRINITY_REQ_STATUS_UNKNOWN] = "Unknown",
+ [TRINITY_REQ_STATUS_ERROR] = "Error",
+ [TRINITY_REQ_STATUS_PENDING] = "Pending",
+ [TRINITY_REQ_STATUS_RUNNING] = "Running",
+ [TRINITY_REQ_STATUS_FINISHED] = "Finished",
+ };
+ return status_strings[status];
+}
+
+#define APP_STATUS_LENGTH (77)
+#define USER_APP_STATUS_PROLOGUE \
+ "\n\tUser-level request statistics running in %s\n" \
+ "+-------+--------+----------+------+----------+--------------+-------------+\n" \
+ "| PID | Req ID | Model ID | Prio | Status | Sched (us) | Infer (us) |\n" \
+ "+-------+--------+----------+------+----------+--------------+-------------+\n"
+#define USER_APP_STATUS_NORMAL \
+ "| %5d | %6d | %#8llx | %4s | %8s | %12lld | %11lld |\n"
+#define USER_APP_STATUS_EMIT \
+ "| ... (emitted) ... |\n"
+#define USER_APP_STATUS_EPILOGUE \
+ "+-------+--------+----------+------+----------+--------------+-------------+\n"
+
+static ssize_t print_user_app_status(struct device *dev,
+ const struct trinity_stat_req *req,
+ char *buf, enum trinity_sysfs_msg msg)
+{
+ ssize_t pos = 0;
+
+ switch (msg) {
+ case SYSFS_MSG_PROLOGUE:
+ pos = snprintf(buf, APP_STATUS_LENGTH * 4 + 1,
+ USER_APP_STATUS_PROLOGUE, dev_name(dev));
+ break;
+ case SYSFS_MSG_NORMAL: {
+ ktime_t cur_time = ktime_get();
+ ktime_t submitted = req->submitted;
+ ktime_t scheduled = req->scheduled ? req->scheduled : cur_time;
+ ktime_t completed = req->completed ? req->completed : cur_time;
+
+ int64_t sched_diff = TIME_DIFF_US(scheduled, submitted);
+ int64_t infer_diff = TIME_DIFF_US(completed, scheduled);
+
+ if (req->status == TRINITY_REQ_STATUS_ERROR) {
+ sched_diff = 0;
+ infer_diff = 0;
+ }
+
+ pos = snprintf(buf, APP_STATUS_LENGTH + 1,
+ USER_APP_STATUS_NORMAL, req->app_id, req->req_id,
+ req->model_id, priority_to_string(req->priority),
+ status_to_string(req->status), sched_diff,
+ infer_diff);
+ } break;
+ case SYSFS_MSG_EMIT:
+ pos = snprintf(buf, APP_STATUS_LENGTH + 1,
+ USER_APP_STATUS_EMIT);
+ break;
+ case SYSFS_MSG_EPILOGUE:
+ pos = snprintf(buf, APP_STATUS_LENGTH + 1,
+ USER_APP_STATUS_EPILOGUE);
+ break;
+ default:
+ break;
+ }
+
+ return pos;
+}
+
+#define KERNEL_APP_STATUS_PROLOGUE \
+ "\n\tKernel-level request statistics running in %s\n" \
+ "+-------+--------+----------+------+----------+------------+---------------+\n" \
+ "| PID | Req ID | Model ID | Prio | Status | # Runs | Avg. Lat (us) |\n" \
+ "+-------+--------+----------+------+----------+------------+---------------+\n"
+#define KERNEL_APP_STATUS_NORMAL \
+ "| %5d | %6d | %#8llx | %4s | %8s | %10u | %13u |\n"
+#define KERNEL_APP_STATUS_EMIT \
+ "| ... (emitted) ... |\n"
+#define KERNEL_APP_STATUS_EPILOGUE \
+ "+-------+--------+----------+------+----------+------------+---------------+\n"
+
+static ssize_t print_kernel_app_status(struct device *dev,
+ const struct trinity_stat_req *req,
+ char *buf, enum trinity_sysfs_msg msg)
+{
+ ssize_t pos = 0;
+
+ switch (msg) {
+ case SYSFS_MSG_PROLOGUE:
+ pos = snprintf(buf, APP_STATUS_LENGTH * 4 + 1,
+ KERNEL_APP_STATUS_PROLOGUE, dev_name(dev));
+ break;
+ case SYSFS_MSG_NORMAL: {
+ uint32_t avg_latency = 0;
+
+ if (req->num_runs > 0)
+ avg_latency = req->total_time / req->num_runs;
+
+ pos = snprintf(buf, APP_STATUS_LENGTH + 1,
+ KERNEL_APP_STATUS_NORMAL, req->app_id,
+ req->req_id, req->model_id,
+ priority_to_string(req->priority),
+ status_to_string(req->status), req->num_runs,
+ avg_latency);
+ } break;
+ case SYSFS_MSG_EMIT:
+ pos = snprintf(buf, APP_STATUS_LENGTH + 1,
+ KERNEL_APP_STATUS_EMIT);
+ break;
+ case SYSFS_MSG_EPILOGUE:
+ pos = snprintf(buf, APP_STATUS_LENGTH + 1,
+ KERNEL_APP_STATUS_EPILOGUE);
+ break;
+ default:
+ break;
+ }
+
+ return pos;
+}
+
+static ssize_t app_status_user_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ struct trinity_stat_app *stat_app;
+ struct trinity_stat_req *stat_req;
+ int num_printed = 0;
+ ssize_t pos;
+
+ pos = print_user_app_status(dev, NULL, buf, SYSFS_MSG_PROLOGUE);
+
+ trinity_stat_lock(&drv->stat);
+ list_for_each_entry(stat_app, &drv->stat.list, lnode) {
+ list_for_each_entry(stat_req, &stat_app->reqs, list) {
+ if (stat_req->is_kernel)
+ continue;
+
+ pos += print_user_app_status(dev, stat_req, buf + pos,
+ SYSFS_MSG_NORMAL);
+ num_printed++;
+
+ /* buffer size limit: PAGE_SIZE (also need reserved bytes) */
+ if (pos + APP_STATUS_LENGTH >
+ PAGE_SIZE - 2 * APP_STATUS_LENGTH) {
+ pos += print_user_app_status(
+ dev, NULL, buf + pos, SYSFS_MSG_EMIT);
+ /* clear old stats */
+ trinity_destroy_stats(&drv->stat, true);
+ goto out;
+ }
+ }
+ }
+out:
+ trinity_stat_unlock(&drv->stat);
+
+ if (num_printed > 0)
+ pos += print_user_app_status(dev, NULL, buf + pos,
+ SYSFS_MSG_EPILOGUE);
+
+ return pos;
+}
+static DEVICE_ATTR_RO(app_status_user);
+
+static ssize_t app_status_kernel_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ struct trinity_stat_app *stat_app;
+ struct trinity_stat_req *stat_req;
+ int num_printed = 0;
+ ssize_t pos;
+
+ pos = print_kernel_app_status(dev, NULL, buf, SYSFS_MSG_PROLOGUE);
+
+ trinity_stat_lock(&drv->stat);
+ list_for_each_entry(stat_app, &drv->stat.list, lnode) {
+ list_for_each_entry(stat_req, &stat_app->reqs, list) {
+ if (!stat_req->is_kernel)
+ continue;
+
+ pos += print_kernel_app_status(dev, stat_req, buf + pos,
+ SYSFS_MSG_NORMAL);
+ num_printed++;
+
+ /* buffer size limit: PAGE_SIZE (also need reserved bytes) */
+ if (pos + APP_STATUS_LENGTH >
+ PAGE_SIZE - 2 * APP_STATUS_LENGTH) {
+ pos += print_kernel_app_status(
+ dev, NULL, buf + pos, SYSFS_MSG_EMIT);
+ /* clear old stats */
+ trinity_destroy_stats(&drv->stat, true);
+ goto out;
+ }
+ }
+ }
+out:
+ trinity_stat_unlock(&drv->stat);
+
+ if (num_printed > 0)
+ pos += print_kernel_app_status(dev, NULL, buf + pos,
+ SYSFS_MSG_EPILOGUE);
+
+ return pos;
+}
+static DEVICE_ATTR_RO(app_status_kernel);
+
+static ssize_t num_total_reqs_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ struct trinity_stat_app *stat_app;
+ uint32_t num_total_reqs = 0;
+
+ trinity_stat_lock(&drv->stat);
+ list_for_each_entry(stat_app, &drv->stat.list, lnode) {
+ num_total_reqs += stat_app->num_total_reqs;
+ }
+ trinity_stat_unlock(&drv->stat);
+
+ return snprintf(buf, PAGE_SIZE, "%u\n", num_total_reqs);
+}
+static DEVICE_ATTR_RO(num_total_reqs);
+
+static ssize_t num_active_reqs_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ struct trinity_stat_app *stat_app;
+ uint32_t num_active_reqs = 0;
+
+ trinity_stat_lock(&drv->stat);
+ list_for_each_entry(stat_app, &drv->stat.list, lnode) {
+ num_active_reqs += stat_app->num_active_reqs;
+ }
+ trinity_stat_unlock(&drv->stat);
+
+ return snprintf(buf, PAGE_SIZE, "%u\n", num_active_reqs);
+}
+static DEVICE_ATTR_RO(num_active_reqs);
+
+static struct attribute *trinity_attrs_stat[] = {
+ &dev_attr_max_stat_apps.attr, &dev_attr_max_stat_reqs.attr,
+ &dev_attr_max_stat_reqs_per_app.attr, &dev_attr_mem_usage.attr,
+ &dev_attr_registered_models.attr, &dev_attr_app_status_user.attr,
+ &dev_attr_app_status_kernel.attr, &dev_attr_num_total_reqs.attr,
+ &dev_attr_num_active_reqs.attr, NULL
+};
+
+/* e.g., /sys/devices/platform/304f0000.triv2/stat/ */
+static struct attribute_group trinity_attrs_stat_group = {
+ .name = "stat",
+ .attrs = trinity_attrs_stat
+};
+
+static ssize_t stop_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+
+ if (drv->desc->stop_reqs)
+ schedule_work(&drv->work_stop);
+
+ return count;
+}
+static DEVICE_ATTR_WO(stop);
+
+static ssize_t suspend_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ if (dev->driver->pm)
+ dev->driver->pm->runtime_suspend(dev);
+
+ return count;
+}
+static DEVICE_ATTR_WO(suspend);
+
+static ssize_t resume_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ if (dev->driver->pm)
+ dev->driver->pm->runtime_resume(dev);
+
+ return count;
+}
+static DEVICE_ATTR_WO(resume);
+
+static ssize_t profile_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ unsigned long profile;
+
+ if (kstrtoul(buf, 10, &profile) != 0)
+ return -EINVAL;
+
+ /* Note that this interface is used only for testing purposes */
+ if (drv->desc->init_profile)
+ drv->desc->init_profile(drv, profile);
+
+ return count;
+}
+static DEVICE_ATTR_WO(profile);
+
+static ssize_t reset_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct trinity_driver *drv = dev_get_drvdata(dev);
+ unsigned long reset;
+
+ if (kstrtoul(buf, 10, &reset) != 0)
+ return -EINVAL;
+
+ if (reset == 1 && drv->desc->reset)
+ drv->desc->reset(drv);
+
+ return count;
+}
+static DEVICE_ATTR_WO(reset);
+
+static struct attribute *trinity_attrs_control[] = { &dev_attr_stop.attr,
+ &dev_attr_suspend.attr,
+ &dev_attr_resume.attr,
+ &dev_attr_profile.attr,
+ &dev_attr_reset.attr,
+ NULL };
+
+/* e.g., /sys/devices/platform/304f0000.triv2/control/ */
+static struct attribute_group trinity_attrs_control_group = {
+ .name = "control",
+ .attrs = trinity_attrs_control
+};
+
+static const struct attribute_group *trinity_attrs_groups[] = {
+ &trinity_attrs_debug_group, &trinity_attrs_stat_group,
+ &trinity_attrs_control_group, NULL
+};
+
+int trinity_sysfs_init(struct trinity_driver *drv)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+ int err;
+
+ err = device_add_groups(dev, trinity_attrs_groups);
+ if (err < 0) {
+ dev_err(dev, "failed to create sysfs groups\n");
+ return err;
+ }
+
+ return 0;
+}
+
+int trinity_sysfs_cleanup(struct trinity_driver *drv)
+{
+ struct device *dev = drv_to_dev_ptr(drv);
+
+ device_remove_groups(dev, trinity_attrs_groups);
+
+ return 0;
+}
--
2.25.1
On Sat, 17 Sep 2022 09:41:13 +0200
"Arnd Bergmann" <[email protected]> wrote:
> On Sat, Sep 17, 2022, at 9:23 AM, Jiho Chu wrote:
> > It contains the base codes for trinity driver. Minimal codes to load and
> > probe device is provided. The Trinity Family is controlled by the
> > Memory-Mapped Registers, the register addresses and offsets are
> > described. And user api interfaces are presented to control device under
> > ioctl manner.
>
> I'm not doing a full review of the driver at the moment, but
> here are some comments on the usage of chardev ioctl based on
> Documentation/driver-api/ioctl.rst
>
Hi Arnd,
Thanks for your review.
I'll read the document more carefully.
> > +int trinity_probe(struct platform_device *pdev, const struct
> > trinity_desc *desc)
> > +{
> > + struct device_node *np;
> > + struct device *dev;
> > + struct trinity_driver *drv;
> > + int i, err;
> > +
> > + dev = &pdev->dev;
> > + dev->id = ((desc->ver & TRINITY_MASK_DEV) >> TRINITY_SHIFT_DEV);
> > +
> > + /* set private data */
> > + drv = devm_kzalloc(dev, sizeof(*drv), GFP_KERNEL);
> > + if (!drv)
> > + return -ENOMEM;
> > +
> > + drv->dev_id = ida_alloc(&dev_nrs, GFP_KERNEL);
> > + if (drv->dev_id < 0) {
> > + devm_kfree(dev, drv);
> > + return drv->dev_id;
> > + }
> > + snprintf(drv->name, DEV_NAME_LEN, "%s-%u", desc->type, drv->dev_id);
> > +
> > + platform_set_drvdata(pdev, drv);
> > + dev_set_drvdata(dev, drv);
> > +
>
> If you have the need to manage multiple devices here, maybe use
> a dynamic major number and have the chardev code allocate the
> minor numbers, instead of using multiple misc devices and
> doing that yourself.
>
I'm a little confused. Do you mean that the driver should manage its own
char devices instead of using misc devices? It would still live under the
misc directory, though.
> > +
> > +#ifndef TASK_COMM_LEN
> > +#define TASK_COMM_LEN 16
> > +#endif
> > +
> > +#define TRINITY_APP_NAME_MAX TASK_COMM_LEN
> > +#define TRINITY_APP_STAT_MAX 10
> > +#define TRINITY_REQ_STAT_MAX 10
>
> The structure layout should not depend on whether an application
> has included a header that defines TASK_COMM_LEN.
>
> What is the purpose of including an application name here?
>
> > +/**
> > + * struct trinity_ioctl_stat_app - Describes stat of the target app
> > + * @app_id: Trinity app id (currently, equal to pid)
> > + * @name: Trinity app name
> > + * @status: Trinity app status
> > + * @num_total_reqs: Number of total requests in app (including
> > finished ones)
> > + * @num_active_reqs: Number of active (running or pending) requests in
> > app
> > + * @total_alloc_mem: Total size of allocated memory in the device
> > + * @total_freed_mem: Total size of freed memory in the device
> > + */
> > +struct trinity_ioctl_stat_app {
> > + __s32 app_id;
> > +
> > + char name[TRINITY_APP_NAME_MAX];
> > + enum trinity_app_status status;
> > +
> > + __u32 num_total_reqs;
> > + __u32 num_active_reqs;
> > +
> > + __u64 total_alloc_mem;
> > + __u64 total_freed_mem;
> > +} __packed;
>
> 'enum' in a uapi structure is not well-defined across
> architectures, so better use a fixed-size type there.
>
> Instead of packing the structure, you should keep all
> members naturally aligned and add explicit padding
> or change some members for 32-bit to 64-bit size
> to keep everything naturally aligned.
>
I checked this; the members will be kept naturally aligned.
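To make the layout point concrete, here is a minimal userspace sketch of how
such a structure could look without __packed. It is not the driver's actual
header: the sketch_ names and the fixed name length of 16 are assumptions made
for illustration. The enum is replaced by a fixed-size integer, and the members
are ordered so that each one sits on its natural alignment with no hidden
padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: name length fixed by the ABI itself, not by TASK_COMM_LEN. */
#define SKETCH_APP_NAME_MAX 16

/*
 * Hypothetical re-layout of the per-app stat structure.
 * No __packed: every member is naturally aligned by construction.
 */
struct sketch_stat_app {
	int32_t  app_id;                    /* offset 0 */
	uint32_t status;                    /* offset 4, fixed-size instead of enum */
	char     name[SKETCH_APP_NAME_MAX]; /* offset 8 */
	uint32_t num_total_reqs;            /* offset 24 */
	uint32_t num_active_reqs;           /* offset 28 */
	uint64_t total_alloc_mem;           /* offset 32, 8-byte aligned */
	uint64_t total_freed_mem;           /* offset 40 */
};

/* Compile-time checks that the layout has no implicit padding. */
_Static_assert(offsetof(struct sketch_stat_app, name) == 8, "name offset");
_Static_assert(offsetof(struct sketch_stat_app, total_alloc_mem) == 32,
	       "64-bit members must be 8-byte aligned");
_Static_assert(sizeof(struct sketch_stat_app) == 48, "unexpected padding");

/* Small helper so the size can also be checked at run time. */
static inline size_t sketch_stat_app_size(void)
{
	return sizeof(struct sketch_stat_app);
}
```

With this ordering the structure has the same size and member offsets on
32-bit and 64-bit ABIs, which is the property the review comment asks for.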
> > +/**
> > + * struct trinity_ioctl_profile_buff - Describes profiling buff info.
> > + * @req_id: The target req id for profiling
> > + * @profile_pos: The start position to extract profiling data
> > + * @profile_size: The size of user-allocated profiling buffer
> > + * @profile_buf: The profiling buffer which user allocated
> > + */
> > +struct trinity_ioctl_profile_buff {
> > + __s32 req_id;
> > + __u32 profile_pos;
> > + __u32 profile_size;
> > + void __user *profile_buf;
> > +} __packed;
>
> Don't put pointers into ioctl structures, they just make compat
> mode unnecessarily hard. You can use a __u64 member.
>
OK, thanks.
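A minimal userspace sketch of the suggested change follows. It is only an
illustration: the sketch_ names and the reserved padding field are assumptions,
not part of the posted patch. The user buffer address travels as a 64-bit
integer, so the structure has the same layout for 32-bit and 64-bit callers and
no compat translation is needed. On the kernel side, the usual counterpart for
turning such a value back into a user pointer is u64_to_user_ptr().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pointer-free variant of the profiling request. */
struct sketch_profile_buff {
	int32_t  req_id;        /* offset 0 */
	uint32_t profile_pos;   /* offset 4 */
	uint32_t profile_size;  /* offset 8 */
	uint32_t reserved;      /* offset 12, explicit padding, must be zero */
	uint64_t profile_buf;   /* offset 16, user address as an integer */
};

/* Convert a userspace pointer to the integer carried in the structure. */
static inline uint64_t sketch_ptr_to_u64(void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Inverse conversion, shown only to demonstrate the round trip. */
static inline void *sketch_u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}

/* Fill a request for the given (hypothetical) request id and buffer. */
static inline void sketch_fill_profile_req(struct sketch_profile_buff *req,
					   int32_t req_id, void *buf,
					   uint32_t size)
{
	req->req_id = req_id;
	req->profile_pos = 0;
	req->profile_size = size;
	req->reserved = 0;
	req->profile_buf = sketch_ptr_to_u64(buf);
}
```

The explicit reserved member keeps the 64-bit field naturally aligned, so
__packed is not needed here either.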
> > +/**
> > + * Major number can not be dynamic as ioctls need it,
> > + */
> > +#define TRINITY_DRIVER_MAGIC 0x88
> > +
> > +#define TRINITY_IO(no) _IO(TRINITY_DRIVER_MAGIC, no)
> > +#define TRINITY_IOR(no, data_type) _IOR(TRINITY_DRIVER_MAGIC, no,
> > data_type)
> > +#define TRINITY_IOW(no, data_type) _IOW(TRINITY_DRIVER_MAGIC, no,
> > data_type)
> > +#define TRINITY_IOWR(no, data_type) _IOWR(TRINITY_DRIVER_MAGIC, no,
> > data_type)
>
> These macros just hurt tools that want to parse the headers.
> Please just open-code the usage.
>
> > +#ifdef __KERNEL__
> > +__s32 trinity_run_internal_req(dev_t);
> > +#endif
>
> This doesn't seem to belong into the uapi header.
>
> Arnd
>
The macros and the unnecessary code will be removed.
Thanks.
Jiho Chu
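For reference, open-coding the ioctl numbers as requested in the review could
look roughly like the sketch below. Only the magic value 0x88 comes from the
quoted header; the command names, command numbers and the request structure
are hypothetical and do not reflect the driver's real command set. Each
definition uses the standard _IOR()/_IOW() macros directly, so tools that parse
uapi headers can recognise them without expanding a driver-private wrapper.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/ioctl.h>

/* Magic value taken from the quoted header. */
#define SKETCH_TRINITY_MAGIC 0x88

/* Hypothetical request payload, for illustration only. */
struct sketch_req {
	int32_t  req_id;
	uint32_t flags;
};

/* Open-coded definitions: no TRINITY_IOR()/TRINITY_IOW() style wrappers. */
#define SKETCH_IOCTL_GET_VERSION _IOR(SKETCH_TRINITY_MAGIC, 1, uint32_t)
#define SKETCH_IOCTL_RUN_REQ     _IOW(SKETCH_TRINITY_MAGIC, 2, struct sketch_req)
```

The encoded type, number, direction and size can be recovered from each
command with _IOC_TYPE(), _IOC_NR(), _IOC_DIR() and _IOC_SIZE().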
On Sat, Sep 17, 2022 at 04:23:44PM +0900, Jiho Chu wrote:
> It contains the base codes for trinity driver. Minimal codes to load and
> probe device is provided. The Trinity Family is controlled by the
> Memory-Mapped Registers, the register addresses and offsets are
> described. And user api interfaces are presented to control device under
> ioctl manner.
Where is the documentation for how the userspace api works? And where
is a link to the userspace code that talks to these devices? That
belongs here in this commit changelog text please.
>
> Signed-off-by: Jiho Chu <[email protected]>
> Signed-off-by: yelini-jeong <[email protected]>
> Signed-off-by: Dongju Chae <[email protected]>
> Signed-off-by: Parichay Kapoor <[email protected]>
> Signed-off-by: Wook Song <[email protected]>
> Signed-off-by: MyungJoo Ham <[email protected]>
> ---
> drivers/misc/Kconfig | 1 +
> drivers/misc/Makefile | 1 +
> drivers/misc/trinity/Kconfig | 25 +
> drivers/misc/trinity/Makefile | 7 +
> drivers/misc/trinity/trinity.c | 225 +++++++++
> drivers/misc/trinity/trinity_common.h | 437 ++++++++++++++++++
> drivers/misc/trinity/trinity_vision2_drv.c | 278 ++++++++++++
> drivers/misc/trinity/trinity_vision2_regs.h | 210 +++++++++
> include/uapi/misc/trinity.h | 476 ++++++++++++++++++++
> 9 files changed, 1660 insertions(+)
> create mode 100644 drivers/misc/trinity/Kconfig
> create mode 100644 drivers/misc/trinity/Makefile
> create mode 100644 drivers/misc/trinity/trinity.c
> create mode 100644 drivers/misc/trinity/trinity_common.h
> create mode 100644 drivers/misc/trinity/trinity_vision2_drv.c
> create mode 100644 drivers/misc/trinity/trinity_vision2_regs.h
> create mode 100644 include/uapi/misc/trinity.h
>
> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> index 41d2bb0ae23a..ad0d5f6af291 100644
> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -500,4 +500,5 @@ source "drivers/misc/cardreader/Kconfig"
> source "drivers/misc/habanalabs/Kconfig"
> source "drivers/misc/uacce/Kconfig"
> source "drivers/misc/pvpanic/Kconfig"
> +source "drivers/misc/trinity/Kconfig"
> endmenu
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index 70e800e9127f..c63f3fc89780 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -60,3 +60,4 @@ obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o
> obj-$(CONFIG_HISI_HIKEY_USB) += hisi_hikey_usb.o
> obj-$(CONFIG_HI6421V600_IRQ) += hi6421v600-irq.o
> obj-$(CONFIG_OPEN_DICE) += open-dice.o
> +obj-$(CONFIG_TRINITY) += trinity/
> diff --git a/drivers/misc/trinity/Kconfig b/drivers/misc/trinity/Kconfig
> new file mode 100644
> index 000000000000..02ad03c2ca0e
> --- /dev/null
> +++ b/drivers/misc/trinity/Kconfig
> @@ -0,0 +1,25 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config TRINITY
> + bool "Samsung Neural Processing Unit"
> + depends on HAS_IOMEM
> + depends on HAS_DMA
> + help
> + Select this option to enable driver support for Samsung
> + Neural Processing Unit (NPU).
> +
> + This driver works as a base driver of the other drivers
> + for Trinity device family.
> +
> + This option should be enabled to support Trinity
> + Vision 2 (TRIV2), and Trinity Audio (TRIA).
> +
> +config TRINITY_VISION2
> + tristate "Samsung NPU Trinity Vision 2"
> + depends on TRINITY
> + help
> + Select this option to enable driver support for a Samsung
> + Neural Processing Unit (NPU), Trinity Vision 2.
> +
> + This driver enables userspace system library to access the
> + device via /dev/triv2-N.
Why do you have 2 Kconfig entries for only a single driver? Please just
make it one.
> diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
> new file mode 100644
> index 000000000000..a8e5697d6d85
> --- /dev/null
> +++ b/drivers/misc/trinity/Makefile
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +obj-$(CONFIG_TRINITY_VISION2) += trinity_vision2.o
> +
> +trinity-y := trinity.o
> +
> +trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
> diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
> new file mode 100644
> index 000000000000..1704eecfc439
> --- /dev/null
> +++ b/drivers/misc/trinity/trinity.c
> @@ -0,0 +1,225 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Base device driver for Samsung NPU Trinity device family
> + *
> + * Copyright (C) 2020-2022 Samsung Electronics
> + * Copyright (C) 2020 Dongju Chae <[email protected]>
> + * Copyright (C) 2020 Wook Song <[email protected]>
> + * Copyright (C) 2022 MyungJoo Ham <[email protected]>
> + * Copyright (C) 2022 Yelin Jeong <[email protected]>
> + * Copyright (C) 2022 Jiho Chu <[email protected]>
> + */
> +
> +#include <linux/of_address.h>
> +
> +#include "trinity_common.h"
> +
> +#define TRINITY_PADDR_BASE (0x0)
> +
> +static DEFINE_IDA(dev_nrs);
> +static DEFINE_IDA(model_ids);
> +
> +/**
> + * trinity_release() - A common callback for close() in file_operations for a
> + * Trinity device node. If there are device-specific data to be
> + * cleaned-up, it is required to clean them up before invoke this
> + * callback.
> + *
> + * @inode: Inode to be closed
> + * @file: File to be closed
> + *
> + * Returns 0 on success. Otherwise, returns negative error.
> + */
> +int trinity_release(struct inode *inode, struct file *file)
> +{
> + return 0;
If a callback does nothing, odds are it is not needed at all. Please
just remove.
And why is this a global function?
> +}
> +
> +/**
> + * trinity_open() - A common callback for open() in file_operations for a Trinity
> + * device node. If device-specific open() is required, this
> + * callback should be invoked by that open().
> + *
> + * @inode: inode to be opened
> + * @f: file to be opened
> + *
> + * Returns 0 on success. Otherwise, returns negative error.
> + */
> +int trinity_open(struct inode *inode, struct file *f)
> +{
> + struct miscdevice *miscdev;
> + struct trinity_driver *drv;
> +
> + miscdev = f->private_data;
> + drv = container_of(miscdev, struct trinity_driver, mdev);
> + f->private_data = drv;
> +
> + return 0;
> +}
> +
> +/**
> + * trinity_create_node() - Create trinity node
> + *
> + * @drv: an instance of trinity driver
> + *
> + * Returns 0 on success. Otherwise, returns negative error.
> + */
> +int trinity_create_node(struct trinity_driver *drv)
> +{
> + struct device *dev = drv_to_dev_ptr(drv);
> + int err;
> +
> + /** register as a misc device */
> + drv->mdev.minor = MISC_DYNAMIC_MINOR;
> + drv->mdev.parent = dev;
> + drv->mdev.name = drv->name;
> + drv->mdev.fops = drv->desc->fops;
> +
> + err = misc_register(&drv->mdev);
> + if (err < 0)
> + dev_err(dev, "failed to register as a misc device");
> +
> + return err;
> +}
> +
> +/**
> + * trinity_destroy_node() - Destroy trinity node
> + *
> + * @drv: an instance of trinity driver
> + */
> +void trinity_destroy_node(struct trinity_driver *drv)
> +{
> + misc_deregister(&drv->mdev);
> +}
> +
> +/**
> + * trinity_probe() - Probes a new Trinity device. This is a standard interface to
> + * probe a Trinity family device.
> + *
> + * @pdev: Platform device structure to probe
> + * @desc: Device description to probe
> + *
> + * Returns 0 on success. Otherwise, returns negative error.
> + */
> +int trinity_probe(struct platform_device *pdev, const struct trinity_desc *desc)
> +{
> + struct device_node *np;
> + struct device *dev;
> + struct trinity_driver *drv;
> + int i, err;
> +
> + dev = &pdev->dev;
> + dev->id = ((desc->ver & TRINITY_MASK_DEV) >> TRINITY_SHIFT_DEV);
> +
> + /* set private data */
> + drv = devm_kzalloc(dev, sizeof(*drv), GFP_KERNEL);
> + if (!drv)
> + return -ENOMEM;
> +
> + drv->dev_id = ida_alloc(&dev_nrs, GFP_KERNEL);
> + if (drv->dev_id < 0) {
> + devm_kfree(dev, drv);
> + return drv->dev_id;
> + }
> + snprintf(drv->name, DEV_NAME_LEN, "%s-%u", desc->type, drv->dev_id);
> +
> + platform_set_drvdata(pdev, drv);
> + dev_set_drvdata(dev, drv);
> +
> + drv->dev = dev;
> + drv->desc = desc;
> +
> + np = dev->of_node;
> + if (of_property_match_string(np, "samsung,trinity-type", desc->type)) {
> + err = -EPROBE_DEFER;
> + goto err_cleanup;
> + }
> +
> + /* get reg info for MMREG_BASE */
> + for (i = 0; i < TRINITY_MAX_MMREGS; i++) {
> + struct resource mmreg;
> +
> + err = of_address_to_resource(np, i, &mmreg);
> + if (err < 0) {
> + dev_err(dev, "failed to get %d-th mmreg info", i);
> + goto err_cleanup;
> + }
> +
> + drv->mmreg_vaddr[i] = devm_ioremap_resource(dev, &mmreg);
> + if (IS_ERR(drv->mmreg_vaddr[i])) {
> + dev_err(dev,
> + "failed to remap %d-th mmreg resource info", i);
> + err = PTR_ERR(drv->mmreg_vaddr[i]);
> + goto err_cleanup;
> + }
> + drv->mmreg_paddr[i] = mmreg.start;
> + }
> +
> + /** get a TOPS property */
Why the odd "**" in comments?
thanks,
greg k-h
On Sat, Sep 17, 2022 at 04:23:50PM +0900, Jiho Chu wrote:
> This patch includes sysfs interfaces.
>
> sysfs interface provides NPU's internal statistics, status and control
> attributes.
>
> The sysfs information provided by the Trinity are:
> - IDU version
> - profiling result
> - allocated debugfs buffer
>
> The control attributes include:
> - initialize profile operation
> - NPU control (suspend/resume/stop)
>
> Signed-off-by: Jiho Chu <[email protected]>
> Signed-off-by: Yelin Jeong <[email protected]>
> Signed-off-by: Dongju Chae <[email protected]>
> Signed-off-by: MyungJoo Ham <[email protected]>
> ---
> .../ABI/testing/sysfs-driver-trinity | 55 ++
> drivers/misc/trinity/Makefile | 1 +
> drivers/misc/trinity/trinity_sysfs.c | 667 ++++++++++++++++++
> 3 files changed, 723 insertions(+)
> create mode 100644 Documentation/ABI/testing/sysfs-driver-trinity
> create mode 100644 drivers/misc/trinity/trinity_sysfs.c
>
> diff --git a/Documentation/ABI/testing/sysfs-driver-trinity b/Documentation/ABI/testing/sysfs-driver-trinity
> new file mode 100644
> index 000000000000..754e6f36a1dc
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-driver-trinity
> @@ -0,0 +1,55 @@
> +What: /sys/devices/platform/trinity/*.triv2/debug/debugfs_max
> +Date: July 2022
> +KernelVersion: 5.19-rc8
> +Contact: Jiho Chu <[email protected]>
> +Description: Shows current allocated debugfs entry size.
> + Note that, Writing max entry size allocates NPU's hardware
> + memory for debugfs entries.
Why are debugfs things being mentioned in sysfs entries?
That's not needed, nor is it allowed, sorry.
Please put all debugfs stuff in debugfs.
Also, sysfs is "one value per file", you violate that in lots of ways
with this patch. Please fix all of that, and use the sysfs_emit() calls
instead of snprintf() for your sysfs show calls.
thanks,
greg k-h
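
For readers following the thread, the request above (one value per file,
sysfs_emit() in show callbacks) could look roughly like the sketch below.
It is kernel code and only builds as part of a driver. The struct
example_npu, its idu_version field and the example_npu_* names are
assumptions made for illustration; they are not taken from the patch.

```c
/* Sketch only: kernel code, not a standalone program. */
#include <linux/device.h>
#include <linux/sysfs.h>

/* Hypothetical per-device state; the real driver's struct is not shown. */
struct example_npu {
	u32 idu_version;
};

/* One attribute file reports exactly one value. */
static ssize_t idu_version_show(struct device *dev,
				struct device_attribute *attr, char *buf)
{
	struct example_npu *npu = dev_get_drvdata(dev);

	/* sysfs_emit() limits the output to the PAGE_SIZE sysfs buffer. */
	return sysfs_emit(buf, "%u\n", npu->idu_version);
}
static DEVICE_ATTR_RO(idu_version);

static struct attribute *example_npu_attrs[] = {
	&dev_attr_idu_version.attr,
	NULL,
};
/* Generates example_npu_groups from example_npu_attrs. */
ATTRIBUTE_GROUPS(example_npu);
```

Pointing the driver's dev_groups field at example_npu_groups lets the
driver core create and remove the files, so no manual sysfs_create_group()
call is needed.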
On Sun, 18 Sep 2022 12:35:23 +0200
Greg KH <[email protected]> wrote:
> On Sat, Sep 17, 2022 at 04:23:44PM +0900, Jiho Chu wrote:
> > This patch contains the base code for the trinity driver: the minimal
> > code needed to load and probe the device. The Trinity family is
> > controlled through memory-mapped registers, and the register addresses
> > and offsets are described. A user API for controlling the device via
> > ioctl is also presented.
>
> Where is the documentation for how the userspace api works? And where
> is a link to the userspace code that talks to these devices? That
> belongs here in this commit changelog text please.
>
Hi Greg,
Thanks for your review.
The user space library is published at:
https://review.tizen.org/gerrit/gitweb?p=platform/adaptation/npu/trix-engine.git;a=summary
Access requires a Tizen login; the access guide is at:
https://docs.tizen.org/platform/get-started/open-source-project/
I will include this information in the commit log.
> > +config TRINITY
> > + bool "Samsung Neural Processing Unit"
> > + depends on HAS_IOMEM
> > + depends on HAS_DMA
> > + help
> > + Select this option to enable driver support for Samsung
> > + Neural Processing Unit (NPU).
> > +
> > + This driver works as a base driver of the other drivers
> > + for Trinity device family.
> > +
> > + This option should be enabled to support Trinity
> > + Vision 2 (TRIV2), and Trinity Audio (TRIA).
> > +
> > +config TRINITY_VISION2
> > + tristate "Samsung NPU Trinity Vision 2"
> > + depends on TRINITY
> > + help
> > + Select this option to enable driver support for a Samsung
> > + Neural Processing Unit (NPU), Trinity Vision 2.
> > +
> > + This driver enables userspace system library to access the
> > + device via /dev/triv2-N.
>
> Why do you have 2 Kconfig entries for only a single driver? Please just
> make it one.
>
OK. I'll merge them into a single Kconfig entry.
> > diff --git a/drivers/misc/trinity/Makefile b/drivers/misc/trinity/Makefile
> > new file mode 100644
> > index 000000000000..a8e5697d6d85
> > --- /dev/null
> > +++ b/drivers/misc/trinity/Makefile
> > @@ -0,0 +1,7 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +obj-$(CONFIG_TRINITY_VISION2) += trinity_vision2.o
> > +
> > +trinity-y := trinity.o
> > +
> > +trinity_vision2-objs := $(trinity-y) trinity_vision2_drv.o
> > diff --git a/drivers/misc/trinity/trinity.c b/drivers/misc/trinity/trinity.c
> > new file mode 100644
> > index 000000000000..1704eecfc439
> > --- /dev/null
> > +++ b/drivers/misc/trinity/trinity.c
> > @@ -0,0 +1,225 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Base device driver for Samsung NPU Trinity device family
> > + *
> > + * Copyright (C) 2020-2022 Samsung Electronics
> > + * Copyright (C) 2020 Dongju Chae <[email protected]>
> > + * Copyright (C) 2020 Wook Song <[email protected]>
> > + * Copyright (C) 2022 MyungJoo Ham <[email protected]>
> > + * Copyright (C) 2022 Yelin Jeong <[email protected]>
> > + * Copyright (C) 2022 Jiho Chu <[email protected]>
> > + */
> > +
> > +#include <linux/of_address.h>
> > +
> > +#include "trinity_common.h"
> > +
> > +#define TRINITY_PADDR_BASE (0x0)
> > +
> > +static DEFINE_IDA(dev_nrs);
> > +static DEFINE_IDA(model_ids);
> > +
> > +/**
> > + * trinity_release() - A common callback for close() in file_operations for a
> > + * Trinity device node. If there are device-specific data to be
> > + * cleaned-up, it is required to clean them up before invoke this
> > + * callback.
> > + *
> > + * @inode: Inode to be closed
> > + * @file: File to be closed
> > + *
> > + * Returns 0 on success. Otherwise, returns negative error.
> > + */
> > +int trinity_release(struct inode *inode, struct file *file)
> > +{
> > + return 0;
>
> If a callback does nothing, odds are it is not needed at all. Please
> just remove.
>
> And why is this a global function?
>
>
Thanks. All empty functions will be removed.
> > +
> > + /* get reg info for MMREG_BASE */
> > + for (i = 0; i < TRINITY_MAX_MMREGS; i++) {
> > + struct resource mmreg;
> > +
> > + err = of_address_to_resource(np, i, &mmreg);
> > + if (err < 0) {
> > + dev_err(dev, "failed to get %d-th mmreg info", i);
> > + goto err_cleanup;
> > + }
> > +
> > + drv->mmreg_vaddr[i] = devm_ioremap_resource(dev, &mmreg);
> > + if (IS_ERR(drv->mmreg_vaddr[i])) {
> > + dev_err(dev,
> > + "failed to remap %d-th mmreg resource info", i);
> > + err = PTR_ERR(drv->mmreg_vaddr[i]);
> > + goto err_cleanup;
> > + }
> > + drv->mmreg_paddr[i] = mmreg.start;
> > + }
> > +
> > + /** get a TOPS property */
>
> Why the odd "**" in comments?
>
> thanks,
>
> greg k-h
>
It is fixed.
Thanks for checking that.
Thanks.
Jiho Chu
> On Sun, 18 Sep 2022 12:35:23 +0200
> Greg KH <[email protected]> wrote:
>
> > On Sat, Sep 17, 2022 at 04:23:44PM +0900, Jiho Chu wrote:
> > > This patch contains the base code for the trinity driver: the minimal
> > > code needed to load and probe the device. The Trinity family is
> > > controlled through memory-mapped registers, and the register addresses
> > > and offsets are described. A user API for controlling the device via
> > > ioctl is also presented.
> >
> > Where is the documentation for how the userspace api works? And where
> > is a link to the userspace code that talks to these devices? That
> > belongs here in this commit changelog text please.
> >
>
> Hi Greg,
> Thanks for your review.
>
> The user space library is published at:
> https://review.tizen.org/gerrit/gitweb?p=platform/adaptation/npu/trix-engine.git;a=summary
>
> Access requires a Tizen login; the access guide is at:
> https://docs.tizen.org/platform/get-started/open-source-project/
>
> I will include this information in the commit log.
A URL that does not require a login is more appropriate.
Please use https://git.tizen.org/cgit/platform/adaptation/npu/trix-engine/
instead of review.tizen.org.
Cheers,
MyungJoo
On Sat, 17 Sep 2022 09:41:13 +0200
"Arnd Bergmann" <[email protected]> wrote:
>
> > +
> > +#ifndef TASK_COMM_LEN
> > +#define TASK_COMM_LEN 16
> > +#endif
> > +
> > +#define TRINITY_APP_NAME_MAX TASK_COMM_LEN
> > +#define TRINITY_APP_STAT_MAX 10
> > +#define TRINITY_REQ_STAT_MAX 10
>
> The structure layout should not depend on whether an application
> has included a header that defines TASK_COMM_LEN.
>
> What is the purpose of including an application name here?
>
I agree. TASK_COMM_LEN will be removed.
app_name is the executable name of the current context, and it is used for
per-app NPU statistics.
Thanks,
Jiho Chu
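
A minimal sketch of the point discussed above: the name length is fixed by
the interface itself, so the structure layout cannot change depending on
whether the including program defined TASK_COMM_LEN. The struct, its fields
and the EXAMPLE_* names are hypothetical and are not the driver's actual
uapi definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fixed by the ABI itself, not borrowed from TASK_COMM_LEN. */
#define EXAMPLE_APP_NAME_MAX 16

/* Hypothetical per-app statistics record; field names are illustrative. */
struct example_app_stat {
	char app_name[EXAMPLE_APP_NAME_MAX];
	uint64_t total_requests;
};

/* The size is the same no matter which headers the caller included. */
static_assert(sizeof(struct example_app_stat) == 24, "ABI size changed");

/* Copy a name into the fixed-size field, always NUL-terminated. */
static void example_set_app_name(struct example_app_stat *st, const char *name)
{
	strncpy(st->app_name, name, sizeof(st->app_name) - 1);
	st->app_name[sizeof(st->app_name) - 1] = '\0';
}
```

The value 16 matches the fallback the patch used for TASK_COMM_LEN, so
names that fit before still fit; longer names are truncated to 15
characters plus the terminator.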
On Sun, 18 Sep 2022 12:33:06 +0200
Greg KH <[email protected]> wrote:
> On Sat, Sep 17, 2022 at 04:23:50PM +0900, Jiho Chu wrote:
> > This patch includes sysfs interfaces.
> >
> > The sysfs interface provides the NPU's internal statistics, status and
> > control attributes.
> >
> > The sysfs information provided by Trinity is:
> > - IDU version
> > - profiling result
> > - allocated debugfs buffer
> >
> > The control attributes include:
> > - initialize profile operation
> > - NPU control (suspend/resume/stop)
> >
> > Signed-off-by: Jiho Chu <[email protected]>
> > Signed-off-by: Yelin Jeong <[email protected]>
> > Signed-off-by: Dongju Chae <[email protected]>
> > Signed-off-by: MyungJoo Ham <[email protected]>
> > ---
> > .../ABI/testing/sysfs-driver-trinity | 55 ++
> > drivers/misc/trinity/Makefile | 1 +
> > drivers/misc/trinity/trinity_sysfs.c | 667 ++++++++++++++++++
> > 3 files changed, 723 insertions(+)
> > create mode 100644 Documentation/ABI/testing/sysfs-driver-trinity
> > create mode 100644 drivers/misc/trinity/trinity_sysfs.c
> >
> > diff --git a/Documentation/ABI/testing/sysfs-driver-trinity b/Documentation/ABI/testing/sysfs-driver-trinity
> > new file mode 100644
> > index 000000000000..754e6f36a1dc
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-driver-trinity
> > @@ -0,0 +1,55 @@
> > +What: /sys/devices/platform/trinity/*.triv2/debug/debugfs_max
> > +Date: July 2022
> > +KernelVersion: 5.19-rc8
> > +Contact: Jiho Chu <[email protected]>
> > +Description: Shows current allocated debugfs entry size.
> > + Note that, Writing max entry size allocates NPU's hardware
> > + memory for debugfs entries.
>
> Why are debugfs things being mentioned in sysfs entries?
>
> That's not needed, nor is it allowed, sorry.
>
> Please put all debugfs stuff in debugfs.
>
> Also, sysfs is "one value per file", you violate that in lots of ways
> with this patch. Please fix all of that, and use the sysfs_emit() calls
> instead of snprintf() for your sysfs show calls.
>
> thanks,
>
> greg k-h
>
Thanks for checking.
I'll fix the sysfs entries.
Best regards,
Jiho Chu
On Sat, Sep 17, 2022, at 4:49 PM, Jiho Chu wrote:
> On Sat, 17 Sep 2022 09:41:13 +0200
> "Arnd Bergmann" <[email protected]> wrote:
>>
>> If you have the need to manage multiple devices here, maybe use
>> a dynamic major number and have the chardev code allocate the
>> minor numbers, instead of using multiple misc devices and
>> doing that yourself.
>>
>
> I'm a little confused. Does this mean the driver should manage its own
> char devices rather than use a misc device? It would still live under
> the misc dir.
There is no strict connection between miscdevices and drivers/misc.
The former is for drivers that tend to have only one instance
in a system, while the latter is for drivers that do not have
a separate subsystem.
Arnd
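
One common way to do what is suggested above, sketched under assumptions:
reserve a dynamic major with a range of minors once, then hand out one
minor per device instance. This is kernel code and only builds inside a
driver. All example_* names and the limit of 16 devices are made up for
illustration and do not come from the patch.

```c
/* Sketch only: kernel code, not a standalone program. */
#include <linux/cdev.h>
#include <linux/fs.h>
#include <linux/idr.h>
#include <linux/module.h>

#define EXAMPLE_MAX_DEVS 16	/* hypothetical upper bound */

static dev_t example_devt_base;	/* dynamic major, first minor */
static DEFINE_IDA(example_minors);

static const struct file_operations example_fops = {
	.owner = THIS_MODULE,
};

/* Called once at module init: reserve a major and a range of minors. */
static int example_chrdev_init(void)
{
	return alloc_chrdev_region(&example_devt_base, 0, EXAMPLE_MAX_DEVS,
				   "example_npu");
}

/* Called per probed device: pick a free minor and register a cdev. */
static int example_add_instance(struct cdev *cdev)
{
	int minor, err;

	minor = ida_alloc_max(&example_minors, EXAMPLE_MAX_DEVS - 1,
			      GFP_KERNEL);
	if (minor < 0)
		return minor;

	cdev_init(cdev, &example_fops);
	cdev->owner = THIS_MODULE;
	err = cdev_add(cdev, MKDEV(MAJOR(example_devt_base), minor), 1);
	if (err) {
		ida_free(&example_minors, minor);
		return err;
	}
	return minor;
}
```

On removal the matching calls would be cdev_del() and ida_free() per
instance, and unregister_chrdev_region() once at module exit.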
On Thu, 22 Sep 2022 15:56:51 +0200
"Arnd Bergmann" <[email protected]> wrote:
> On Sat, Sep 17, 2022, at 4:49 PM, Jiho Chu wrote:
> > On Sat, 17 Sep 2022 09:41:13 +0200
> > "Arnd Bergmann" <[email protected]> wrote:
> >>
> >> If you have the need to manage multiple devices here, maybe use
> >> a dynamic major number and have the chardev code allocate the
> >> minor numbers, instead of using multiple misc devices and
> >> doing that yourself.
> >>
> >
> > I'm a little confused. Does this mean the driver should manage its own
> > char devices rather than use a misc device? It would still live under
> > the misc dir.
>
> There is no strict connection between miscdevices and drivers/misc.
>
> The former is for drivers that tend to have only one instance
> in a system, while the latter is for drivers that do not have
> a separate subsystem.
>
> Arnd
>
Thanks for the clarification. Allocating a dynamic major could be better
for trinity, which could have multiple instances.
I'll rewrite the code accordingly in the next revision.
Best regards,
Jiho Chu