Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753714AbbKYQLp (ORCPT ); Wed, 25 Nov 2015 11:11:45 -0500 Received: from mail-wm0-f50.google.com ([74.125.82.50]:38814 "EHLO mail-wm0-f50.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752850AbbKYQLj (ORCPT ); Wed, 25 Nov 2015 11:11:39 -0500 Subject: Re: [RESEND RFC/PATCH 3/8] media: platform: mtk-vpu: Support Mediatek VPU To: Tiffany Lin , Rob Herring , Pawel Moll , Mark Rutland , Ian Campbell , Kumar Gala , Catalin Marinas , Will Deacon , Mauro Carvalho Chehab , Matthias Brugger , Daniel Kurtz , Sascha Hauer , Hongzhou Yang , Hans Verkuil , Laurent Pinchart , Sakari Ailus , Geert Uytterhoeven , Mikhail Ulyanov , Fabien Dessenne , Arnd Bergmann , Darren Etheridge , Peter Griffin , Benoit Parrot References: <1447764885-23100-1-git-send-email-tiffany.lin@mediatek.com> <1447764885-23100-4-git-send-email-tiffany.lin@mediatek.com> Cc: Andrew-CT Chen , Eddie Huang , Yingjoe Chen , James Liao , Daniel Hsiao , devicetree@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-media@vger.kernel.org, linux-mediatek@lists.infradead.org From: Daniel Thompson Message-ID: <5655DDB4.2080002@linaro.org> Date: Wed, 25 Nov 2015 16:11:32 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: <1447764885-23100-4-git-send-email-tiffany.lin@mediatek.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 37025 Lines: 1257 On 17/11/15 12:54, Tiffany Lin wrote: > From: Andrew-CT Chen > > The VPU driver for hw video codec embedded in Mediatek's MT8173 SOCs. > It is able to handle video decoding/encoding of in a range of formats. > The driver provides with VPU firmware download, memory management and > the communication interface between CPU and VPU. > For VPU initialization, it will create virtual memory for CPU access and > IOMMU address for vcodec hw device access. When a decode/encode instance > opens a device node, vpu driver will download vpu firmware to the device. > A decode/encode instant will decode/encode a frame using VPU > interface to interrupt vpu to handle decoding/encoding jobs. > > Signed-off-by: Andrew-CT Chen > --- > drivers/media/platform/Kconfig | 6 + > drivers/media/platform/Makefile | 2 + > drivers/media/platform/mtk-vpu/Makefile | 1 + > .../platform/mtk-vpu/h264_enc/venc_h264_vpu.h | 127 +++ > .../media/platform/mtk-vpu/include/venc_ipi_msg.h | 212 +++++ > drivers/media/platform/mtk-vpu/mtk_vpu_core.c | 823 ++++++++++++++++++++ > drivers/media/platform/mtk-vpu/mtk_vpu_core.h | 161 ++++ > .../media/platform/mtk-vpu/vp8_enc/venc_vp8_vpu.h | 119 +++ > 8 files changed, 1451 insertions(+) > create mode 100644 drivers/media/platform/mtk-vpu/Makefile > create mode 100644 drivers/media/platform/mtk-vpu/h264_enc/venc_h264_vpu.h > create mode 100644 drivers/media/platform/mtk-vpu/include/venc_ipi_msg.h > create mode 100644 drivers/media/platform/mtk-vpu/mtk_vpu_core.c > create mode 100644 drivers/media/platform/mtk-vpu/mtk_vpu_core.h > create mode 100644 drivers/media/platform/mtk-vpu/vp8_enc/venc_vp8_vpu.h > > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig > index ccbc974..f98eb47 100644 > --- a/drivers/media/platform/Kconfig > +++ b/drivers/media/platform/Kconfig > @@ -148,6 +148,12 @@ config VIDEO_CODA > Coda is a range of video codec IPs that supports > H.264, MPEG-4, and other video formats. > > +config MEDIATEK_VPU > + bool "Mediatek Video Processor Unit" > + ---help--- > + This driver provides downloading firmware vpu and > + communicating with vpu. > + This looks pretty broken. Shouldn't this option be tristate? Why are there no depends-on or selects? Also I think the help text could be more descriptive here (and so does checkpatch ;-) ). > diff --git a/drivers/media/platform/mtk-vpu/h264_enc/venc_h264_vpu.h b/drivers/media/platform/mtk-vpu/h264_enc/venc_h264_vpu.h > new file mode 100644 > index 0000000..9c8ebdd > --- /dev/null > +++ b/drivers/media/platform/mtk-vpu/h264_enc/venc_h264_vpu.h Why is this file included in *this* patch? It is not included by anything in the patch and defines functions that do not exist yet. > diff --git a/drivers/media/platform/mtk-vpu/include/venc_ipi_msg.h b/drivers/media/platform/mtk-vpu/include/venc_ipi_msg.h > new file mode 100644 > index 0000000..a345b98 This file also is not included by anything and should, perhaps be included in a different patch. > diff --git a/drivers/media/platform/mtk-vpu/mtk_vpu_core.c b/drivers/media/platform/mtk-vpu/mtk_vpu_core.c > new file mode 100644 > index 0000000..b524dbc > --- /dev/null > +++ b/drivers/media/platform/mtk-vpu/mtk_vpu_core.c > @@ -0,0 +1,823 @@ > +/* > +* Copyright (c) 2015 MediaTek Inc. > +* Author: Andrew-CT Chen > +* > +* This program is free software; you can redistribute it and/or modify > +* it under the terms of the GNU General Public License version 2 as > +* published by the Free Software Foundation. > +* > +* This program is distributed in the hope that it will be useful, > +* but WITHOUT ANY WARRANTY; without even the implied warranty of > +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +* GNU General Public License for more details. > +*/ > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "mtk_vpu_core.h" > + > +/** > + * VPU (video processor unit) is a tiny processor controlling video hardware > + * related to video codec, scaling and color format converting. > + * VPU interfaces with other blocks by share memory and interrupt. > + **/ > +#define MTK_VPU_DRV_NAME "mtk_vpu" This is only used once, why not just put this string directly into the .name field? > + > +#define INIT_TIMEOUT_MS 2000U > +#define IPI_TIMEOUT_MS 2000U > +#define VPU_FW_VER_LEN 16 > + > +/* vpu extended virtural address */ > +#define VPU_PMEM0_VIRT(vpu) ((vpu)->mem.p_va) > +#define VPU_DMEM0_VIRT(vpu) ((vpu)->mem.d_va) > +/* vpu extended iova address*/ > +#define VPU_PMEM0_IOVA(vpu) ((vpu)->mem.p_iova) > +#define VPU_DMEM0_IOVA(vpu) ((vpu)->mem.d_iova) These feel like obfuscation to me. The code looks clearer without the macro For example: vpu->mem.p_va = dma_alloc_coherent(dev, ...); Is much clearer than: VPU_PMEM0_VIRT(vpu) = dma_alloc_coherent(dev, ...); > + > +#define VPU_PTCM(dev) ((dev)->reg.sram) > +#define VPU_DTCM(dev) ((dev)->reg.sram + VPU_DTCM_OFFSET) These macros also seem to add very little value. They are consumed only a couple of times and only serve to conceal how the sram is mapped and consumed: dest = vpu->reg.sram; if (fw_type == D_FW) dest += VPU_DTCM_OFFSET; Compared to: dest = fw_type ? VPU_DTCM(vpu) : VPU_PTCM(vpu); > + > +#define VPU_PTCM_SIZE (96 * SZ_1K) > +#define VPU_DTCM_SIZE (32 * SZ_1K) > +#define VPU_DTCM_OFFSET 0x18000UL > +#define VPU_EXT_P_SIZE SZ_1M > +#define VPU_EXT_D_SIZE SZ_4M > +#define VPU_P_FW_SIZE (VPU_PTCM_SIZE + VPU_EXT_P_SIZE) > +#define VPU_D_FW_SIZE (VPU_DTCM_SIZE + VPU_EXT_D_SIZE) > +#define SHARE_BUF_SIZE 48 > + > +#define VPU_P_FW "vpu_p.bin" > +#define VPU_D_FW "vpu_d.bin" These macros are of questionable value. > + > +#define VPU_BASE 0x0 Is this register really called "base" in the datasheet? From the use in the code it looks like it performs a reset and/or PM function. > +#define VPU_TCM_CFG 0x0008 > +#define VPU_PMEM_EXT0_ADDR 0x000C > +#define VPU_PMEM_EXT1_ADDR 0x0010 > +#define VPU_TO_HOST 0x001C > +#define VPU_DMEM_EXT0_ADDR 0x0014 > +#define VPU_DMEM_EXT1_ADDR 0x0018 > +#define HOST_TO_VPU 0x0024 > +#define VPU_PC_REG 0x0060 > +#define VPU_WDT_REG 0x0084 > + > +/* vpu inter-processor communication interrupt */ > +#define VPU_IPC_INT BIT(8) > + > +/** > + * enum vpu_fw_type - VPU firmware type > + * > + * @P_FW: program firmware > + * @D_FW: data firmware > + * > + */ > +enum vpu_fw_type { > + P_FW, > + D_FW, > +}; > + > +/** > + * struct vpu_mem - VPU memory information Perhaps "VPU extended program/data memory information" instead. > + * > + * @p_va: the kernel virtual memory address of > + * VPU extended program memory > + * @d_va: the kernel virtual memory address of VPU extended data memory > + * @p_iova: the iova memory address of VPU extended program memory > + * @d_iova: the iova memory address of VPU extended data memory > + */ > +struct vpu_mem { > + void *p_va; > + void *d_va; > + dma_addr_t p_iova; > + dma_addr_t d_iova; > +}; Might be better as: struct { void *va; dma_addr_t iova; } This can be then be used as: vpu->mem[P_FW].va . This doesn't matter much yet but it would helps us common up some code later. > + > +/** > + * struct vpu_regs - VPU SRAM and configuration registers > + * > + * @sram: the register for VPU sram > + * @cfg: the register for VPU configuration > + * @irq: the irq number for VPU interrupt > + */ > +struct vpu_regs { > + void __iomem *sram; This is called TCM everywhere else. Why a different name for it here? > + void __iomem *cfg; > + int irq; > +}; > + > +/** > + * struct vpu_run - VPU initialization status > + * > + * @signaled: the signal of vpu initialization completed > + * @fw_ver: VPU firmware version > + * @wq: wait queue for VPU initialization status > + */ > +struct vpu_run { > + u32 signaled; > + char fw_ver[VPU_FW_VER_LEN]; > + wait_queue_head_t wq; > +}; > + > +/** > + * struct vpu_ipi_desc - VPU IPI descriptor > + * > + * @handler: IPI handler > + * @name: the name of IPI handler > + * @priv: the private data of IPI handler > + */ > +struct vpu_ipi_desc { > + ipi_handler_t handler; > + const char *name; > + void *priv; > +}; > + > +/** > + * struct share_obj - The DTCM (Data Tightly-Coupled Memory) buffer shared with > + * AP and VPU Remove "The" (there are more than one of these buffers). > + * > + * @id: IPI id > + * @len: share buffer length > + * @share_buf: share buffer data > + */ > +struct share_obj { > + int32_t id; > + uint32_t len; > + unsigned char share_buf[SHARE_BUF_SIZE]; > +}; > + > +/** > + * struct mtk_vpu - vpu driver data > + * @mem: VPU extended memory information > + * @reg: VPU SRAM and configuration registers > + * @run: VPU initialization status > + * @ipi_desc: VPU IPI descriptor > + * @recv_buf: VPU DTCM share buffer for receiving. The > + * receive buffer is only accessed in interrupt context. > + * @send_buf: VPU DTCM share buffer for sending > + * @dev: VPU struct device > + * @clk: VPU clock on/off > + * @vpu_mutex: protect mtk_vpu (except recv_buf) and ensure only > + * one client to use VPU service at a time. For example, > + * suppose a client is using VPU to decode VP8. > + * If the other client wants to encode VP8, > + * it has to wait until VP8 decode completes. > + * > + */ > +struct mtk_vpu { > + struct vpu_mem mem; Rename to extmem? > + struct vpu_regs reg; > + struct vpu_run run; > + struct vpu_ipi_desc ipi_desc[IPI_MAX]; > + struct share_obj *recv_buf; > + struct share_obj *send_buf; > + struct device *dev; > + struct clk *clk; > + struct mutex vpu_mutex; /* for protecting vpu data data structure */ > +}; > + > +/* the thread calls the function should hold the |vpu_mutex| */ Remove this comment. Its unhelpful: the code does not meet this requirement because it overstates the scope of vpu_mutex . > +static inline void vpu_cfg_writel(struct mtk_vpu *vpu, u32 val, u32 offset) > +{ > + writel(val, vpu->reg.cfg + offset); > +} > + > +static inline u32 vpu_cfg_readl(struct mtk_vpu *vpu, u32 offset) > +{ > + return readl(vpu->reg.cfg + offset); > +} > + > +static inline bool vpu_running(struct mtk_vpu *vpu) > +{ > + return vpu_cfg_readl(vpu, VPU_BASE) & BIT(0); > +} > + > +void vpu_disable_clock(struct platform_device *pdev) > +{ > + struct mtk_vpu *vpu = platform_get_drvdata(pdev); > + > + /* Disable VPU watchdog */ > + vpu_cfg_writel(vpu, > + vpu_cfg_readl(vpu, VPU_WDT_REG) & ~(1L<<31), > + VPU_WDT_REG); This code combines a reference counted clock disable with a not-reference counted register write and will result in the watchdot being spuriously disabled. This will definitely happen if vpu_debug_read() is called at the wrong time, possibly may also be an issue for concurrent H.264 and VP8 operations. > + > + clk_disable_unprepare(vpu->clk); > +} > + > +int vpu_enable_clock(struct platform_device *pdev) > +{ > + struct mtk_vpu *vpu = platform_get_drvdata(pdev); > + int ret; > + > + ret = clk_prepare_enable(vpu->clk); > + if (ret) > + return ret; > + /* Enable VPU watchdog */ > + vpu_cfg_writel(vpu, vpu_cfg_readl(vpu, VPU_WDT_REG) | (1L << 31), > + VPU_WDT_REG); As above. > + > + return ret; > +} > + > +int vpu_ipi_register(struct platform_device *pdev, > + enum ipi_id id, ipi_handler_t handler, > + const char *name, void *priv) > +{ > + struct mtk_vpu *vpu = platform_get_drvdata(pdev); > + struct vpu_ipi_desc *ipi_desc; > + > + if (!vpu) { > + dev_err(&pdev->dev, "vpu device in not ready\n"); > + return -EPROBE_DEFER; > + } > + > + if (id < IPI_MAX && handler != NULL) { > + ipi_desc = vpu->ipi_desc; > + ipi_desc[id].name = name; > + ipi_desc[id].handler = handler; > + ipi_desc[id].priv = priv; > + return 0; > + } > + > + dev_err(&pdev->dev, "register vpu ipi with invalid arguments\n"); > + return -EINVAL; > +} This API, and manu of its friends lower down the file, appear to be a way to send 32 byte messages to different endpoints on another processor on the SoC. Just interested to know if you evaluated the mailbox driver infrastructure for this? Its a good fit for the messaging but doesn't offer any support for the direct access to TCM or extended memory. > +int vpu_ipi_send(struct platform_device *pdev, > + enum ipi_id id, void *buf, > + unsigned int len, unsigned int wait) > +{ > + struct mtk_vpu *vpu = platform_get_drvdata(pdev); > + struct share_obj *send_obj = vpu->send_buf; > + unsigned long timeout; > + > + if (id >= IPI_MAX || len > sizeof(send_obj->share_buf) || buf == NULL) { > + dev_err(vpu->dev, "failed to send ipi message\n"); > + return -EINVAL; > + } > + > + if (!vpu_running(vpu)) { > + dev_err(vpu->dev, "vpu_ipi_send: VPU is not running\n"); > + return -ENXIO; > + } > + > + mutex_lock(&vpu->vpu_mutex); > + if (vpu_cfg_readl(vpu, HOST_TO_VPU) && !wait) { > + mutex_unlock(&vpu->vpu_mutex); > + return -EBUSY; > + } This branch is unreachable (no caller ever sets wait is never set to false). > + > + if (wait) > + while (vpu_cfg_readl(vpu, HOST_TO_VPU)) > + ; What is this loop for? This code should only be reachable if we timed out in the code below so the likely effect of this loop is to permantently wedge a thread in a manner that frustrates signal delivery. > + > + memcpy((void *)send_obj->share_buf, buf, len); > + send_obj->len = len; > + send_obj->id = id; > + vpu_cfg_writel(vpu, 0x1, HOST_TO_VPU); > + > + /* Wait until VPU receives the command */ > + timeout = jiffies + msecs_to_jiffies(IPI_TIMEOUT_MS); > + do { > + if (time_after(jiffies, timeout)) { > + dev_err(vpu->dev, "vpu_ipi_send: IPI timeout!\n"); > + return -EIO; > + } > + } while (vpu_cfg_readl(vpu, HOST_TO_VPU)); Do we need to busy wait every time we communicate with the co-processor? Couldn't we put this wait *before* we write to HOST_TO_VPU above. That way we only spin when there is a need to. > + > + mutex_unlock(&vpu->vpu_mutex); > + > + return 0; > +} > + > +void *vpu_mapping_dm_addr(struct platform_device *pdev, > + void *dtcm_dmem_addr) Use a different type: dtcm_dmem_addr is not a pointer on this CPU. > +{ > + struct mtk_vpu *vpu = platform_get_drvdata(pdev); > + unsigned long p_vpu_dtcm = (unsigned long)VPU_DTCM(vpu); p_vpu_dtcm *is* a pointer on this CPU. Why cast it so that it isn't. > + unsigned long ul_dtcm_dmem_addr = (unsigned long)(dtcm_dmem_addr); > + > + if (dtcm_dmem_addr == NULL || > + (ul_dtcm_dmem_addr > (VPU_DTCM_SIZE + VPU_EXT_D_SIZE))) { > + dev_err(vpu->dev, "invalid virtual data memory address\n"); > + return ERR_PTR(-EINVAL); > + } > + > + if (ul_dtcm_dmem_addr < VPU_DTCM_SIZE) > + return (void *)(ul_dtcm_dmem_addr + p_vpu_dtcm); Starting with the pointer will preserve type better: return vpu->reg.sram + VPU_DTCM_OFFSET + ul_dtcm_dmem_addr; > + > + return (void *)((ul_dtcm_dmem_addr - VPU_DTCM_SIZE) + > + VPU_DMEM0_VIRT(vpu)); Likewise this code is clearer with the pointer first. return vpu->mem.d_va + (ul_dtcm_dmem_addr - VPU_DTCM_SIZE); > +} > + > +dma_addr_t *vpu_mapping_iommu_dm_addr(struct platform_device *pdev, > + void *dmem_addr) This function does not return a pointer to a dma_addr_t. It just returns a regular dma_addr_t (that's why all the callers of this function cast it back ;-) ). dmem_addr is also not a pointer and this function does not return a pointer to a dma_addr_t. It just returns a regular dma_addr_t. > +{ > + unsigned long ul_dmem_addr = (unsigned long)(dmem_addr); > + struct mtk_vpu *vpu = platform_get_drvdata(pdev); > + > + if (dmem_addr == NULL || > + (ul_dmem_addr < VPU_DTCM_SIZE) || > + (ul_dmem_addr > (VPU_DTCM_SIZE + VPU_EXT_D_SIZE))) { > + dev_err(vpu->dev, "invalid IOMMU data memory address\n"); > + return ERR_PTR(-EINVAL); > + } > + > + return (dma_addr_t *)((ul_dmem_addr - VPU_DTCM_SIZE) + > + VPU_DMEM0_IOVA(vpu)); Better written as (this would also have made explicit the type problem with the function: return vpu->mem.d_iova + (ul_dmem_addr - VPU_DTCM_SIZE); > +} > + > +struct platform_device *vpu_get_plat_device(struct platform_device *pdev) > +{ > + struct device *dev = &pdev->dev; > + struct device_node *vpu_node; > + struct platform_device *vpu_pdev; > + > + vpu_node = of_parse_phandle(dev->of_node, "vpu", 0); > + if (!vpu_node) { > + dev_err(dev, "can't get vpu node\n"); > + return NULL; > + } > + > + vpu_pdev = of_find_device_by_node(vpu_node); > + if (WARN_ON(!vpu_pdev)) { > + dev_err(dev, "vpu pdev failed\n"); > + of_node_put(vpu_node); > + return NULL; > + } > + > + return vpu_pdev; > +} This function looks a bit weird to me. Why do we need to keep consulting the devicetree every time we want to start/stop the clock? I would have expected this code to be part of the init routine of anything that needs this and the platform_device would then be cached. > + > +/* load vpu program/data memory */ > +static void load_requested_vpu(struct mtk_vpu *vpu, > + size_t fw_size, > + const u8 *fw_data, > + u8 fw_type) > +{ > + size_t target_size = fw_type ? VPU_DTCM_SIZE : VPU_PTCM_SIZE; > + size_t extra_fw_size = 0; > + void *dest; > + > + /* reset VPU */ > + vpu_cfg_writel(vpu, 0x0, VPU_BASE); > + > + /* handle extended firmware size */ > + if (fw_size > target_size) { > + dev_dbg(vpu->dev, "fw size %lx > limited fw size %lx\n", > + fw_size, target_size); > + extra_fw_size = fw_size - target_size; > + dev_dbg(vpu->dev, "extra_fw_size %lx\n", extra_fw_size); > + fw_size = target_size; > + } > + dest = fw_type ? VPU_DTCM(vpu) : VPU_PTCM(vpu); > + memcpy(dest, fw_data, fw_size); > + /* download to extended memory if need */ > + if (extra_fw_size > 0) { > + dest = fw_type ? > + VPU_DMEM0_VIRT(vpu) : VPU_PMEM0_VIRT(vpu); > + > + dev_dbg(vpu->dev, "download extended memory type %x\n", > + fw_type); > + memcpy(dest, fw_data + target_size, extra_fw_size); > + } > +} > + > +int vpu_load_firmware(struct platform_device *pdev) > +{ > + struct mtk_vpu *vpu = platform_get_drvdata(pdev); > + struct device *dev = &pdev->dev; > + struct vpu_run *run = &vpu->run; > + const struct firmware *vpu_fw; > + int ret; > + > + if (!pdev) { > + dev_err(dev, "VPU platform device is invalid\n"); > + return -EINVAL; > + } > + > + mutex_lock(&vpu->vpu_mutex); > + > + ret = vpu_enable_clock(pdev); > + if (ret) { > + dev_err(dev, "enable clock failed %d\n", ret); > + goto OUT_LOAD_FW; > + } > + > + if (vpu_running(vpu)) { > + vpu_disable_clock(pdev); > + mutex_unlock(&vpu->vpu_mutex); > + dev_warn(dev, "vpu is running already\n"); > + return 0; > + } > + > + run->signaled = false; > + dev_dbg(vpu->dev, "firmware request\n"); > + ret = request_firmware(&vpu_fw, VPU_P_FW, dev); > + if (ret < 0) { > + dev_err(dev, "Failed to load %s, %d\n", VPU_P_FW, ret); > + goto OUT_LOAD_FW; > + } > + if (vpu_fw->size > VPU_P_FW_SIZE) { > + ret = -EFBIG; > + dev_err(dev, "program fw size %zu is abnormal\n", vpu_fw->size); > + goto OUT_LOAD_FW; > + } Possibly move request_firmware(), release_firmware() and the associated error handling into load_requested_vpu(). It can all be parameterized and the filename of the firmware means error reports will still be clear about whether the p or d firmware is faulty. > + dev_dbg(vpu->dev, "Downloaded program fw size: %zu.\n", > + vpu_fw->size); > + /* Downloading program firmware to device*/ > + load_requested_vpu(vpu, vpu_fw->size, vpu_fw->data, > + P_FW); > + release_firmware(vpu_fw); > + > + ret = request_firmware(&vpu_fw, VPU_D_FW, dev); > + if (ret < 0) { > + dev_err(dev, "Failed to load %s, %d\n", VPU_D_FW, ret); > + goto OUT_LOAD_FW; > + } > + if (vpu_fw->size > VPU_D_FW_SIZE) { > + ret = -EFBIG; > + dev_err(dev, "data fw size %zu is abnormal\n", vpu_fw->size); > + goto OUT_LOAD_FW; > + } > + dev_dbg(vpu->dev, "Downloaded data fw size: %zu.\n", > + vpu_fw->size); > + /* Downloading data firmware to device */ > + load_requested_vpu(vpu, vpu_fw->size, vpu_fw->data, > + D_FW); > + release_firmware(vpu_fw); > + /* boot up vpu */ > + vpu_cfg_writel(vpu, 0x1, VPU_BASE); > + > + ret = wait_event_interruptible_timeout(run->wq, > + run->signaled, > + msecs_to_jiffies(INIT_TIMEOUT_MS) > + ); > + if (0 == ret) { > + ret = -ETIME; > + dev_err(dev, "wait vpu initialization timout!\n"); > + goto OUT_LOAD_FW; > + } else if (-ERESTARTSYS == ret) { > + dev_err(dev, "wait vpu interrupted by a signal!\n"); > + goto OUT_LOAD_FW; > + } > + > + ret = 0; > + dev_info(dev, "vpu is ready. Fw version %s\n", run->fw_ver); > + > +OUT_LOAD_FW: > + vpu_disable_clock(pdev); > + mutex_unlock(&vpu->vpu_mutex); > + > + return ret; > +} > + > +static void vpu_init_ipi_handler(void *data, unsigned int len, void *priv) > +{ > + struct mtk_vpu *vpu = (struct mtk_vpu *)priv; > + struct vpu_run *run = (struct vpu_run *)data; > + > + vpu->run.signaled = run->signaled; > + strncpy(vpu->run.fw_ver, run->fw_ver, VPU_FW_VER_LEN); > + wake_up_interruptible(&vpu->run.wq); > +} > + > +static int vpu_debug_open(struct inode *inode, struct file *file) > +{ > + file->private_data = inode->i_private; > + return 0; > +} > + > +static ssize_t vpu_debug_read(struct file *file, char __user *user_buf, > + size_t count, loff_t *ppos) > +{ > + char buf[256]; > + unsigned int len; > + unsigned int running, pc, vpu_to_host, host_to_vpu, wdt; > + int ret; > + struct device *dev = file->private_data; > + struct platform_device *pdev = to_platform_device(dev); > + struct mtk_vpu *vpu = dev_get_drvdata(dev); > + > + ret = vpu_enable_clock(pdev); > + if (ret) { > + dev_err(vpu->dev, "[VPU] enable clock failed %d\n", ret); > + return 0; > + } > + > + /* vpu register status */ > + running = vpu_running(vpu); > + pc = vpu_cfg_readl(vpu, VPU_PC_REG); > + wdt = vpu_cfg_readl(vpu, VPU_WDT_REG); > + host_to_vpu = vpu_cfg_readl(vpu, HOST_TO_VPU); > + vpu_to_host = vpu_cfg_readl(vpu, VPU_TO_HOST); > + vpu_disable_clock(pdev); > + > + if (running) { > + len = sprintf(buf, "VPU is running\n\n" > + "FW Version: %s\n" > + "PC: 0x%x\n" > + "WDT: 0x%x\n" > + "Host to VPU: 0x%x\n" > + "VPU to Host: 0x%x\n", > + vpu->run.fw_ver, pc, wdt, > + host_to_vpu, vpu_to_host); > + } else { > + len = sprintf(buf, "VPU not running\n"); > + } > + > + return simple_read_from_buffer(user_buf, count, ppos, buf, len); > +} > + > +static const struct file_operations vpu_debug_fops = { > + .open = vpu_debug_open, > + .read = vpu_debug_read, > +}; These operations should be conditional on CONFIG_DEBUG_FS. > + > +static void vpu_free_p_ext_mem(struct mtk_vpu *vpu) > +{ > + struct device *dev = vpu->dev; > + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); > + > + dma_free_coherent(dev, VPU_EXT_P_SIZE, VPU_PMEM0_VIRT(vpu), > + VPU_PMEM0_IOVA(vpu)); > + > + if (domain) > + iommu_detach_device(domain, vpu->dev); > +} > + > +static void vpu_free_d_ext_mem(struct mtk_vpu *vpu) > +{ > + struct device *dev = vpu->dev; > + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); > + > + dma_free_coherent(dev, VPU_EXT_D_SIZE, VPU_DMEM0_VIRT(vpu), > + VPU_DMEM0_IOVA(vpu)); > + > + if (domain) > + iommu_detach_device(domain, dev); > +} Look like this could be parameterized and combined with vpu_free_p_ext_mem. > + > +static int vpu_alloc_p_ext_mem(struct mtk_vpu *vpu) > +{ > + struct device *dev = vpu->dev; > + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); > + phys_addr_t p_pa; > + > + VPU_PMEM0_VIRT(vpu) = dma_alloc_coherent(dev, > + VPU_EXT_P_SIZE, > + &(VPU_PMEM0_IOVA(vpu)), > + GFP_KERNEL); > + if (VPU_PMEM0_VIRT(vpu) == NULL) { > + dev_err(dev, "Failed to allocate the extended program memory\n"); > + return PTR_ERR(VPU_PMEM0_VIRT(vpu)); > + } > + > + p_pa = iommu_iova_to_phys(domain, vpu->mem.p_iova); > + /* Disable extend0. Enable extend1 */ > + vpu_cfg_writel(vpu, 0x1, VPU_PMEM_EXT0_ADDR); > + vpu_cfg_writel(vpu, (p_pa & 0xFFFFF000), VPU_PMEM_EXT1_ADDR); > + > + dev_info(dev, "Program extend memory phy=0x%llx virt=0x%p iova=0x%llx\n", > + (unsigned long long)p_pa, > + VPU_PMEM0_VIRT(vpu), > + (unsigned long long)VPU_PMEM0_IOVA(vpu)); > + > + return 0; > +} > + > +static int vpu_alloc_d_ext_mem(struct mtk_vpu *vpu) > +{ > + struct device *dev = vpu->dev; > + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); > + phys_addr_t d_pa; > + > + VPU_DMEM0_VIRT(vpu) = dma_alloc_coherent(dev, > + VPU_EXT_D_SIZE, > + &(VPU_DMEM0_IOVA(vpu)), > + GFP_KERNEL); > + if (VPU_DMEM0_VIRT(vpu) == NULL) { > + dev_err(dev, "Failed to allocate the extended data memory\n"); > + return PTR_ERR(VPU_DMEM0_VIRT(vpu)); > + } > + > + d_pa = iommu_iova_to_phys(domain, vpu->mem.d_iova); > + > + /* Disable extend0. Enable extend1 */ > + vpu_cfg_writel(vpu, 0x1, VPU_DMEM_EXT0_ADDR); > + vpu_cfg_writel(vpu, (d_pa & 0xFFFFF000), > + VPU_DMEM_EXT1_ADDR); > + > + dev_info(dev, "Data extend memory phy=0x%llx virt=0x%p iova=0x%llx\n", > + (unsigned long long)d_pa, > + VPU_DMEM0_VIRT(vpu), > + (unsigned long long)VPU_DMEM0_IOVA(vpu)); > + > + return 0; > +} Also looks suitable for parameterizing and combining with vpu_alloc_p_ext_mem . > + > +static void vpu_ipi_handler(struct mtk_vpu *vpu) > +{ > + struct share_obj *rcv_obj = vpu->recv_buf; > + struct vpu_ipi_desc *ipi_desc = vpu->ipi_desc; > + > + if (rcv_obj->id < IPI_MAX && ipi_desc[rcv_obj->id].handler) { > + ipi_desc[rcv_obj->id].handler(rcv_obj->share_buf, > + rcv_obj->len, > + ipi_desc[rcv_obj->id].priv); > + } else { > + dev_err(vpu->dev, "No such ipi id = %d\n", rcv_obj->id); > + } > +} > + > +static int vpu_ipi_init(struct mtk_vpu *vpu) > +{ > + /* Disable VPU to host interrupt */ > + vpu_cfg_writel(vpu, 0x0, VPU_TO_HOST); > + > + /* shared buffer initialization */ > + vpu->recv_buf = (struct share_obj *)VPU_DTCM(vpu); > + vpu->send_buf = vpu->recv_buf + 1; > + memset(vpu->recv_buf, 0, sizeof(struct share_obj)); > + memset(vpu->send_buf, 0, sizeof(struct share_obj)); > + mutex_init(&vpu->vpu_mutex); > + > + return 0; > +} > + > +static irqreturn_t vpu_irq_handler(int irq, void *priv) > +{ > + struct mtk_vpu *vpu = priv; > + uint32_t vpu_to_host = vpu_cfg_readl(vpu, VPU_TO_HOST); > + > + if (vpu_to_host & VPU_IPC_INT) > + vpu_ipi_handler(vpu); > + else > + dev_err(vpu->dev, "vpu watchdog timeout!\n"); > + > + /* VPU won't send another interrupt until we set VPU_TO_HOST to 0. */ > + vpu_cfg_writel(vpu, 0x0, VPU_TO_HOST); If we were triggered by a watchdog then how long will it be before the next watchdog interrupt? Will we end up spamming the logs? > + > + return IRQ_HANDLED; > +} > + > +static struct dentry *vpu_debugfs; > +static int mtk_vpu_probe(struct platform_device *pdev) > +{ > + struct mtk_vpu *vpu; > + struct device *dev; > + struct resource *res; > + int ret = 0; > + > + dev_dbg(&pdev->dev, "initialization\n"); > + > + dev = &pdev->dev; > + vpu = devm_kzalloc(dev, sizeof(*vpu), GFP_KERNEL); > + if (!vpu) > + return -ENOMEM; > + > + vpu->dev = &pdev->dev; > + res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "sram"); > + vpu->reg.sram = devm_ioremap_resource(dev, res); > + if (IS_ERR(vpu->reg.sram)) { > + dev_err(dev, "devm_ioremap_resource vpu sram failed.\n"); > + return PTR_ERR(vpu->reg.sram); > + } > + > + res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "cfg_reg"); > + vpu->reg.cfg = devm_ioremap_resource(dev, res); > + if (IS_ERR(vpu->reg.cfg)) { > + dev_err(dev, "devm_ioremap_resource vpu cfg failed.\n"); > + return PTR_ERR(vpu->reg.cfg); > + } > + > + /* Get VPU clock */ > + vpu->clk = devm_clk_get(dev, "main"); > + if (vpu->clk == NULL) { > + dev_err(dev, "get vpu clock fail\n"); > + return -EINVAL; > + } > + > + platform_set_drvdata(pdev, vpu); > + > + ret = vpu_enable_clock(pdev); > + if (ret) { > + ret = -EINVAL; > + return ret; Why not just "return ret" (or "return -EINVAL")? > + } > + > + dev_dbg(dev, "vpu ipi init\n"); > + ret = vpu_ipi_init(vpu); > + if (ret) { > + dev_err(dev, "Failed to init ipi\n"); > + goto disable_vpu_clk; > + } > + > + platform_set_drvdata(pdev, vpu); > + > + /* register vpu initialization IPI */ > + ret = vpu_ipi_register(pdev, IPI_VPU_INIT, vpu_init_ipi_handler, > + "vpu_init", vpu); > + if (ret) { > + dev_err(dev, "Failed to register IPI_VPU_INIT\n"); > + goto vpu_mutex_destroy; > + } > + > + vpu_debugfs = debugfs_create_file("mtk_vpu", S_IRUGO, NULL, (void *)dev, > + &vpu_debug_fops); > + if (!vpu_debugfs) { > + ret = -ENOMEM; > + goto cleanup_ipi; > + } > + > + /* Set PTCM to 96K and DTCM to 32K */ > + vpu_cfg_writel(vpu, 0x2, VPU_TCM_CFG); > + > + ret = vpu_alloc_p_ext_mem(vpu); > + if (ret) { > + dev_err(dev, "Allocate PM failed\n"); > + goto remove_debugfs; > + } > + > + ret = vpu_alloc_d_ext_mem(vpu); > + if (ret) { > + dev_err(dev, "Allocate DM failed\n"); > + goto free_p_mem; > + } > + > + init_waitqueue_head(&vpu->run.wq); > + > + res = platform_get_resource(pdev, IORESOURCE_IRQ, 0); > + if (res == NULL) { > + dev_err(dev, "get IRQ resource failed.\n"); > + ret = -ENXIO; > + goto free_d_mem; > + } > + vpu->reg.irq = platform_get_irq(pdev, 0); > + ret = devm_request_irq(dev, vpu->reg.irq, vpu_irq_handler, 0, > + pdev->name, vpu); > + if (ret) { > + dev_err(dev, "failed to request irq\n"); > + goto free_d_mem; > + } > + > + vpu_disable_clock(pdev); > + dev_dbg(dev, "initialization completed\n"); > + > + return 0; > + > +free_d_mem: > + vpu_free_d_ext_mem(vpu); > +free_p_mem: > + vpu_free_p_ext_mem(vpu); > +remove_debugfs: > + debugfs_remove(vpu_debugfs); > +cleanup_ipi: > + memset(vpu->ipi_desc, 0, sizeof(struct vpu_ipi_desc)*IPI_MAX); > +vpu_mutex_destroy: > + mutex_destroy(&vpu->vpu_mutex); > +disable_vpu_clk: > + vpu_disable_clock(pdev); > + > + return ret; > +} > + > +static const struct of_device_id mtk_vpu_match[] = { > + { > + .compatible = "mediatek,mt8173-vpu", > + }, > + {}, > +}; > +MODULE_DEVICE_TABLE(of, mtk_vpu_match); > + > +static int mtk_vpu_remove(struct platform_device *pdev) > +{ > + struct mtk_vpu *vpu = platform_get_drvdata(pdev); > + > + vpu_free_p_ext_mem(vpu); > + vpu_free_d_ext_mem(vpu); This looks like it leaks cpu_debugfs and the vpu_mutex. > + > + return 0; > +} > + > +static struct platform_driver mtk_vpu_driver = { > + .probe = mtk_vpu_probe, > + .remove = mtk_vpu_remove, > + .driver = { > + .name = MTK_VPU_DRV_NAME, > + .owner = THIS_MODULE, > + .of_match_table = mtk_vpu_match, > + }, > +}; > + > +module_platform_driver(mtk_vpu_driver); > + > +MODULE_LICENSE("GPL v2"); > +MODULE_DESCRIPTION("Mediatek Video Prosessor Unit driver"); > diff --git a/drivers/media/platform/mtk-vpu/mtk_vpu_core.h b/drivers/media/platform/mtk-vpu/mtk_vpu_core.h > new file mode 100644 > index 0000000..20cf2a0 > --- /dev/null > +++ b/drivers/media/platform/mtk-vpu/mtk_vpu_core.h > @@ -0,0 +1,161 @@ > +/* > +* Copyright (c) 2015 MediaTek Inc. > +* Author: Andrew-CT Chen > +* > +* This program is free software; you can redistribute it and/or modify > +* it under the terms of the GNU General Public License version 2 as > +* published by the Free Software Foundation. > +* > +* This program is distributed in the hope that it will be useful, > +* but WITHOUT ANY WARRANTY; without even the implied warranty of > +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +* GNU General Public License for more details. > +*/ > + > +#ifndef _MTK_VPU_CORE_H > +#define _MTK_VPU_CORE_H > + > +#include > + > +/** > + * VPU (video processor unit) is a tiny processor controlling video hardware > + * related to video codec, scaling and color format converting. > + * VPU interfaces with other blocks by share memory and interrupt. > + **/ > + > +typedef void (*ipi_handler_t) (void *data, > + unsigned int len, > + void *priv); > + > +/** > + * enum ipi_id - the id of inter-processor interrupt > + * > + * @IPI_VPU_INIT: The interrupt from vpu is to notfiy kernel > + VPU initialization completed. > + * @IPI_VENC_H264: The interrupt from vpu is to notify kernel to > + handle H264 video encoder job, and vice versa. > + * @IPI_VENC_VP8: The interrupt fro vpu is to notify kernel to > + handle VP8 video encoder job,, and vice versa. > + * @IPI_VENC_CAPABILITY: The interrupt from vpu is to > + get venc hardware capability. > + * @IPI_MAX: The maximum IPI number > + */ > +enum ipi_id { > + IPI_VPU_INIT = 0, > + IPI_VENC_H264, > + IPI_VENC_VP8, > + IPI_VENC_CAPABILITY, > + IPI_MAX, > +}; > + > +/** > + * vpu_disable_clock - Disable VPU clock > + * > + * @pdev: VPU platform device > + * > + * > + * Return: Return 0 if the clock is disabled successfully, > + * otherwise it is failed. > + * > + **/ > +void vpu_disable_clock(struct platform_device *pdev); > + > +/** > + * vpu_enable_clock - Enable VPU clock > + * > + * @pdev: VPU platform device > + * > + * Return: Return 0 if the clock is enabled successfully, > + * otherwise it is failed. > + * > + **/ > +int vpu_enable_clock(struct platform_device *pdev); > + > +/** > + * vpu_ipi_register - register an ipi function > + * > + * @pdev: VPU platform device > + * @id: IPI ID > + * @handler: IPI handler > + * @name: IPI name > + * @priv: private data for IPI handler > + * > + * Register an ipi function to receive ipi interrupt from VPU. > + * > + * Return: Return 0 if ipi registers successfully, otherwise it is failed. > + */ > +int vpu_ipi_register(struct platform_device *pdev, enum ipi_id id, > + ipi_handler_t handler, const char *name, void *priv); > + > +/** > + * vpu_ipi_send - send data from AP to vpu. > + * > + * @pdev: VPU platform device > + * @id: IPI ID > + * @buf: the data buffer > + * @len: the data buffer length > + * @wait: wait for the last ipi completed. > + * > + * This function is thread-safe. When this function returns, > + * VPU has received the data and starts the processing. > + * When the processing completes, IPI handler registered > + * by vpu_ipi_register will be called in interrupt context. > + * > + * Return: Return 0 if sending data successfully, otherwise it is failed. > + **/ > +int vpu_ipi_send(struct platform_device *pdev, > + enum ipi_id id, void *buf, > + unsigned int len, > + unsigned int wait); > + > +/** > + * vpu_get_plat_device - get VPU's platform device > + * > + * @pdev: the platform device of the module requesting VPU platform > + * device for using VPU API. > + * > + * Return: Return NULL if it is failed. > + * otherwise it is VPU's platform device > + **/ > +struct platform_device *vpu_get_plat_device(struct platform_device *pdev); > + > +/** > + * vpu_load_firmware - download VPU firmware and boot it > + * > + * @pdev: VPU platform device > + * > + * Return: Return 0 if downloading firmware successfully, > + * otherwise it is failed > + **/ > +int vpu_load_firmware(struct platform_device *pdev); > + > +/** > + * vpu_mapping_dm_addr - Mapping DTCM/DMEM to kernel virtual address > + * > + * @pdev: VPU platform device > + * @dmem_addr: VPU's data memory address > + * > + * Mapping the VPU's DTCM (Data Tightly-Coupled Memory) / > + * DMEM (Data Extended Memory) memory address to > + * kernel virtual address. > + * > + * Return: Return ERR_PTR(-EINVAL) if mapping failed, > + * otherwise the mapped kernel virtual address > + **/ > +void *vpu_mapping_dm_addr(struct platform_device *pdev, > + void *dtcm_dmem_addr); > + > +/** > + * vpu_mapping_iommu_dm_addr - Mapping to iommu address > + * > + * @pdev: VPU platform device > + * @dmem_addr: VPU's extended data memory address > + * > + * Mapping the VPU's extended data address to iommu address > + * > + * Return: Return ERR_PTR(-EINVAL) if mapping failed, > + * otherwise the mapped iommu address > + **/ > +dma_addr_t *vpu_mapping_iommu_dm_addr(struct platform_device *pdev, > + void *dmem_addr); > +#endif /* _MTK_VPU_CORE_H */ > diff --git a/drivers/media/platform/mtk-vpu/vp8_enc/venc_vp8_vpu.h b/drivers/media/platform/mtk-vpu/vp8_enc/venc_vp8_vpu.h > new file mode 100644 > index 0000000..4e09eec > --- /dev/null > +++ b/drivers/media/platform/mtk-vpu/vp8_enc/venc_vp8_vpu.h Like the H.264 header. Why is this file included in this patch? It is not included by anything in the patch and defines symbols that do not exist yet. Daniel. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/