2019-11-01 08:43:56

by Peter Ujfalusi

Subject: [PATCH v4 00/15] dmaengine/soc: Add Texas Instruments UDMA support

Hi,

Changes since v3
(https://patchwork.kernel.org/project/linux-dmaengine/list/?series=180679&state=*):
- Based on 5.4-rc5
- Fixed typos pointed out by Tero
- Added reviewed-by tags from Tero

- ring accelerator driver
- TODO_GS is removed from the header
- pm_runtime removed as NAVSS and its components are always on
- Check validity of Message mode setup (element size > 8 bytes must use proxy)

- cppi5 header
- add commit message

- UDMAP DT bindings
- Drop the psil-config node use on the remote PSI-L side and use only one cell
which is the remote threadID:

dmas = <&main_udmap 0xc400>, <&main_udmap 0x4400>;
dma-names = "tx", "rx";

- The PSI-L thread configuration description is moved to kernel as a new module:
k3-psil/k3-psil-am654/k3-psil-j721e
- ti,psil-base has been removed and moved to kernel
- removed the no longer needed dt-bindings/dma/k3-udma.h
- Convert the document to schema (yaml)

- NEW PSI-L endpoint configuration database
- a simple database holding the remote end's configuration needed for UDMAP
configuration. All previous parameters from DT have been moved here and merged
with the Linux-only TR mode channel flag.
- Client drivers can update the remote endpoint configuration as it can be
different based on system configuration and the endpoint itself is under the
control of the peripheral driver.
- database for am654 and j721e

- UDMAP DMAengine driver
- pm_runtime removed as NAVSS and its components are always on
- rchan_oes_offset added to MSI domain allocation
- Use the new PSI-L endpoint database for UDMAP configuration
- Support for waiting for PDMA teardown completion on j721e instead of
returning right away. depends on:
https://lkml.org/lkml/2019/10/25/189
Not included in this series, but it is in the branch I have prepared.
- psil-base is moved from DT to be part of udma_match_data
- tr_thread map is removed; the PSI-L endpoint configuration is used instead

- UDMAP glue layer
- pm_runtime removed as NAVSS and its components are always on
- Use the new PSI-L endpoint database for UDMAP configuration

Changes since v2
(https://patchwork.kernel.org/project/linux-dmaengine/list/?series=152609&state=*)
- Based on 5.4-rc1
- Support for Flow only data transfer for the glue layer

- cppi5 header
- comments converted to kernel-doc style
- Remove the excessive WARN_ONs and rely on the user for sanity
- new macro for checking TearDown Completion Message

- ring accelerator driver
- fixed up the commit message (SoB, TI-SCI)
- fixed ring reset
- CONFIG_TI_K3_RINGACC_DEBUG is removed along with the dbg_write/read functions
and use dev_dbg()
- k3_ringacc_ring_dump() is moved to static
- step numbering removed from k3_ringacc_ring_reset_dma()
- Add clarification comment for shared ring usage in k3_ringacc_ring_cfg()
- Magic shift values in k3_ringacc_ring_cfg_proxy() got defined
- K3_RINGACC_RING_MODE_QM is removed as it is not supported

- UDMAP DT bindings
- Fix property prefixing: s/pdma,/ti,pdma-
- Add ti,notdpkt property to suppress teardown completion message on tchan
- example updated accordingly

- UDMAP DMAengine driver
- Change __raw_readl/writel to readl/writel
- Split up the udma_tisci_channel_config() into m2m, tx and rx tisci
configuration functions for clarity
- DT bindings change: s/pdma,/ti,pdma-
- Cleanup of udma_tx_status():
- residue calculation fix for m2m
- no need to read packet counter as it is not used
- peer byte counter only available in PDMAs
- Proper locking to avoid race with interrupt handler (polled m2m fix)
- Support for ti,notdpkt
- RFLOW management rework to support data movement without channel:
- the channel is not controlled by Linux but by another core; we only have
rflows and rings to do the DMA transfers.
This mode is only supported by the Glue layer for now.

- UDMAP glue layer
- Debug print improvements
- Support for rflow/ring only data movement

Changes since v1
(https://patchwork.kernel.org/project/linux-dmaengine/list/?series=114105&state=*)
- Added support for j721e
- Based on 5.3-rc2
- dropped ti_sci API patch for RM management as it is already upstream
- dropped dmadev_get_slave_channel() patch, using __dma_request_channel()
- Added Rob's Reviewed-by to ringacc DT binding document patch
- DT bindings changes:
- linux,udma-mode is gone, I have a simple lookup table in the driver to flag
TR channels.
- Support for j721e
- Fix bug in of_node_put() handling in xlate function

Changes since RFC (https://patchwork.kernel.org/cover/10612465/):
- Based on linux-next (20190506) which now has the ti_sci interrupt support
- The series can be applied and the UDMA via DMAengine API will be functional
- Included in the series: ti_sci Resource management API, cppi5 header and
driver for the ring accelerator.
- The DMAengine core patches have been updated as per the review comments for
the earlier submission.
- The DMAengine driver patch is artificially split up into 6 smaller patches

The k3-udma driver implements the Data Movement Architecture described in
AM65x TRM (http://www.ti.com/lit/pdf/spruid7) and
j721e TRM (http://www.ti.com/lit/pdf/spruil1)

This DMA architecture is a big departure from the 'traditional' architecture
where we had either EDMA or sDMA as the system DMA.

Packet DMAs were used as dedicated DMAs to service only networking (Keystone2)
or USB (am335x) while other peripherals were serviced by EDMA.

In AM65x/j721e the UDMA (Unified DMA) is used for all data movement within the
SoC, tasked to service all peripherals (UART, McSPI, McASP, networking, etc).

The NAVSS/UDMA is built around CPPI5 (Communications Port Programming Interface)
and it supports Packet mode (similar to CPPI4.1 in Keystone2 for networking) and
TR mode (similar to EDMA descriptor).
The data movement is done within a PSI-L fabric, peripherals (including the
UDMA-P) are not addressed by their I/O register as with traditional DMAs but
with their PSI-L thread ID.

In AM65x/j721e we have two main types of peripherals:
Legacy: McASP, McSPI, UART, etc.
to provide connectivity they are serviced by PDMA (Peripheral DMA).
PDMA threads are locked to service a given peripheral, for example PSI-L thread
0x4400/0xc400 services McASP0 rx/tx.
The PDMA configuration can be done via the UDMA Real Time Peer registers.
Native: Networking, security accelerator
these peripherals have native support for PSI-L.

To be able to use the DMA the following generic steps need to be taken:
- configure a DMA channel (tchan for TX, rchan for RX)
- channel mode: Packet or TR mode
- for memcpy a tchan and rchan pair is used.
- for packet mode RX we also need to configure a receive flow for packet
reception
- the source and destination threads must be paired
- at minimum one pair of rings needs to be configured:
- tx: transfer ring and transfer completion ring
- rx: free descriptor ring and receive ring
- two interrupts: UDMA-P channel interrupt and ring interrupt for tc_ring/r_ring
- If the channel is in packet mode or configured to memcpy then we only need
one interrupt from the ring; events from UDMAP are not used (a minimal
client-side sketch of these steps follows this list).
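
A minimal client-side sketch of these steps (illustrative only): it assumes
the DT example above (dma-names "tx"/"rx") and a hypothetical peripheral
driver; the FIFO address and burst settings are placeholders. The UDMA driver
performs the tchan/rchan, flow, ring pair and interrupt setup behind these
generic DMAengine calls:

#include <linux/dmaengine.h>

static int example_setup_tx_channel(struct device *dev, struct dma_chan **out)
{
        struct dma_slave_config cfg = { };
        struct dma_chan *chan;
        int ret;

        /* "tx" maps to PSI-L thread 0xc400 in the DT example above */
        chan = dma_request_chan(dev, "tx");
        if (IS_ERR(chan))
                return PTR_ERR(chan);

        cfg.direction = DMA_MEM_TO_DEV;
        cfg.dst_addr = 0x02b00000;      /* hypothetical peripheral FIFO */
        cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
        cfg.dst_maxburst = 1;

        ret = dmaengine_slave_config(chan, &cfg);
        if (ret) {
                dma_release_channel(chan);
                return ret;
        }

        *out = chan;
        return 0;
}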

When the channel setup is completed we only interact with the rings:
- TX: push a descriptor to t_ring and wait for it to be pushed to the tc_ring by
the UDMA-P
- RX: push a descriptor to the fd_ring and wait for UDMA-P to push it back to
the r_ring.

Since we have FIFOs in the DMA fabric (UDMA-P, PSI-L and PDMA), which were not
present in previous DMAs, we need to report the amount of data held in these
FIFOs to clients (delay calculation for ALSA, UART FIFO flush support).
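
A minimal client-side sketch of consuming this report (assuming the
in_flight_bytes field added by patch 05 of this series); how the value feeds
into the delay calculation is peripheral specific:

static size_t example_bytes_pending(struct dma_chan *chan, dma_cookie_t cookie)
{
        struct dma_tx_state state;
        enum dma_status status;

        status = dmaengine_tx_status(chan, cookie, &state);
        if (status == DMA_COMPLETE)
                return 0;

        /* data still queued in memory plus data held in UDMA-P/PSI-L/PDMA FIFOs */
        return state.residue + state.in_flight_bytes;
}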

Metadata support:
The SA2UL DMAengine user driver was posted upstream, based on and tested
against v1 of the UDMA series: https://lkml.org/lkml/2019/6/28/20
SA2UL uses the metadata DMAengine API (a minimal sketch follows).
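
For reference, a minimal sketch of the client-side metadata attach, assuming
the helper introduced by the metadata_ops patch in this series (illustrative
only; the buffer handling of a real client is more involved):

static int example_attach_metadata(struct dma_async_tx_descriptor *desc,
                                   void *meta, size_t meta_len)
{
        /* client metadata mode: the client owns the metadata buffer */
        return dmaengine_desc_attach_metadata(desc, meta, meta_len);
}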

Note on the last patch:
In Keystone2 the networking had a dedicated DMA (packet DMA); this is no longer
the case, and the DMAengine API is currently missing support for the features we
would need for networking, such as:
- support for receive descriptor 'classification'
- we need to support several receive queues for a channel.
- the queues are used for packet priority handling for example, but they can be
used to have pools of descriptors for different sizes.
- out of order completion of descriptors on a channel
- when we have several queues to handle different priority packets the
descriptors will be completed 'out-of-order'
- NAPI type of operation (polling instead of interrupt driven transfer)
- without this we cannot sustain gigabit speeds and we need to support NAPI
- not to limit this to networking, but other high performance operations

It is my intention to work on these so that the 'glue' layer can be removed and
we can switch to the DMAengine API - or to an API alongside DMAengine providing
a generic way to support networking. But given how controversial and non-trivial
these changes are, we need something to support networking in the meantime.

The series (+DT patches to enable DMA on AM65x and j721e) on top of 5.4-rc5 is
available:
https://github.com/omap-audio/linux-audio.git peter/udma/series_v4-5.4-rc5

Regards,
Peter
---
Grygorii Strashko (3):
bindings: soc: ti: add documentation for k3 ringacc
soc: ti: k3: add navss ringacc driver
dmaengine: ti: k3-udma: Add glue layer for non DMAengine users

Peter Ujfalusi (12):
dmaengine: doc: Add sections for per descriptor metadata support
dmaengine: Add metadata_ops for dma_async_tx_descriptor
dmaengine: Add support for reporting DMA cached data amount
dmaengine: ti: Add cppi5 header for K3 NAVSS/UDMA
dmaengine: ti: k3 PSI-L remote endpoint configuration
dt-bindings: dma: ti: Add document for K3 UDMA
dmaengine: ti: New driver for K3 UDMA - split#1: defines, structs, io
func
dmaengine: ti: New driver for K3 UDMA - split#2: probe/remove, xlate
and filter_fn
dmaengine: ti: New driver for K3 UDMA - split#3: alloc/free
chan_resources
dmaengine: ti: New driver for K3 UDMA - split#4: dma_device callbacks
1
dmaengine: ti: New driver for K3 UDMA - split#5: dma_device callbacks
2
dmaengine: ti: New driver for K3 UDMA - split#6: Kconfig and Makefile

.../devicetree/bindings/dma/ti/k3-udma.yaml | 190 +
.../devicetree/bindings/soc/ti/k3-ringacc.txt | 59 +
Documentation/driver-api/dmaengine/client.rst | 75 +
.../driver-api/dmaengine/provider.rst | 46 +
drivers/dma/dmaengine.c | 73 +
drivers/dma/dmaengine.h | 8 +
drivers/dma/ti/Kconfig | 26 +
drivers/dma/ti/Makefile | 3 +
drivers/dma/ti/k3-psil-am654.c | 172 +
drivers/dma/ti/k3-psil-j721e.c | 219 ++
drivers/dma/ti/k3-psil-priv.h | 39 +
drivers/dma/ti/k3-psil.c | 97 +
drivers/dma/ti/k3-udma-glue.c | 1202 ++++++
drivers/dma/ti/k3-udma-private.c | 133 +
drivers/dma/ti/k3-udma.c | 3425 +++++++++++++++++
drivers/dma/ti/k3-udma.h | 151 +
drivers/soc/ti/Kconfig | 12 +
drivers/soc/ti/Makefile | 1 +
drivers/soc/ti/k3-ringacc.c | 1158 ++++++
include/linux/dma/k3-psil.h | 47 +
include/linux/dma/k3-udma-glue.h | 134 +
include/linux/dma/ti-cppi5.h | 1049 +++++
include/linux/dmaengine.h | 110 +
include/linux/soc/ti/k3-ringacc.h | 244 ++
24 files changed, 8673 insertions(+)
create mode 100644 Documentation/devicetree/bindings/dma/ti/k3-udma.yaml
create mode 100644 Documentation/devicetree/bindings/soc/ti/k3-ringacc.txt
create mode 100644 drivers/dma/ti/k3-psil-am654.c
create mode 100644 drivers/dma/ti/k3-psil-j721e.c
create mode 100644 drivers/dma/ti/k3-psil-priv.h
create mode 100644 drivers/dma/ti/k3-psil.c
create mode 100644 drivers/dma/ti/k3-udma-glue.c
create mode 100644 drivers/dma/ti/k3-udma-private.c
create mode 100644 drivers/dma/ti/k3-udma.c
create mode 100644 drivers/dma/ti/k3-udma.h
create mode 100644 drivers/soc/ti/k3-ringacc.c
create mode 100644 include/linux/dma/k3-psil.h
create mode 100644 include/linux/dma/k3-udma-glue.h
create mode 100644 include/linux/dma/ti-cppi5.h
create mode 100644 include/linux/soc/ti/k3-ringacc.h

--
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


2019-11-01 08:44:03

by Peter Ujfalusi

Subject: [PATCH v4 05/15] dmaengine: Add support for reporting DMA cached data amount

DMA hardware can have a big cache or FIFO, and the amount of data sitting in
the DMA fabric can be of interest to the clients.

For example, in audio we want to know the delay in the data flow, and if the
DMA has a significantly large FIFO/cache it can affect the latency/delay.
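
A hypothetical provider-side sketch of using the new helper from a
device_tx_status() callback; the zero values are placeholders for what a real
driver would read from its channel/peer registers:

static enum dma_status example_tx_status(struct dma_chan *chan,
                                         dma_cookie_t cookie,
                                         struct dma_tx_state *txstate)
{
        enum dma_status ret;

        ret = dma_cookie_status(chan, cookie, txstate);
        if (ret == DMA_COMPLETE || !txstate)
                return ret;

        /* placeholders: a real driver reads these from hardware */
        dma_set_residue(txstate, 0);
        dma_set_in_flight_bytes(txstate, 0);

        return ret;
}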

Signed-off-by: Peter Ujfalusi <[email protected]>
Reviewed-by: Tero Kristo <[email protected]>
---
drivers/dma/dmaengine.h | 8 ++++++++
include/linux/dmaengine.h | 2 ++
2 files changed, 10 insertions(+)

diff --git a/drivers/dma/dmaengine.h b/drivers/dma/dmaengine.h
index 501c0b063f85..b0b97475707a 100644
--- a/drivers/dma/dmaengine.h
+++ b/drivers/dma/dmaengine.h
@@ -77,6 +77,7 @@ static inline enum dma_status dma_cookie_status(struct dma_chan *chan,
state->last = complete;
state->used = used;
state->residue = 0;
+ state->in_flight_bytes = 0;
}
return dma_async_is_complete(cookie, complete, used);
}
@@ -87,6 +88,13 @@ static inline void dma_set_residue(struct dma_tx_state *state, u32 residue)
state->residue = residue;
}

+static inline void dma_set_in_flight_bytes(struct dma_tx_state *state,
+ u32 in_flight_bytes)
+{
+ if (state)
+ state->in_flight_bytes = in_flight_bytes;
+}
+
struct dmaengine_desc_callback {
dma_async_tx_callback callback;
dma_async_tx_callback_result callback_result;
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 0e8b426bbde9..c4c5219030a6 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -682,11 +682,13 @@ static inline struct dma_async_tx_descriptor *txd_next(struct dma_async_tx_descr
* @residue: the remaining number of bytes left to transmit
* on the selected transfer for states DMA_IN_PROGRESS and
* DMA_PAUSED if this is implemented in the driver, else 0
+ * @in_flight_bytes: amount of data in bytes cached by the DMA.
*/
struct dma_tx_state {
dma_cookie_t last;
dma_cookie_t used;
u32 residue;
+ u32 in_flight_bytes;
};

/**
--
Peter


2019-11-01 08:44:25

by Peter Ujfalusi

Subject: [PATCH v4 02/15] soc: ti: k3: add navss ringacc driver

From: Grygorii Strashko <[email protected]>

The Ring Accelerator (RINGACC or RA) provides hardware acceleration to
enable straightforward passing of work between a producer and a consumer.
There is one RINGACC module per NAVSS on TI AM65x SoCs.

The RINGACC converts constant-address read and write accesses to equivalent
read or write accesses to a circular data structure in memory. The RINGACC
eliminates the need for each DMA controller which needs to access ring
elements from having to know the current state of the ring (base address,
current offset). The DMA controller performs a read or write access to a
specific address range (which maps to the source interface on the RINGACC)
and the RINGACC replaces the address for the transaction with a new address
which corresponds to the head or tail element of the ring (head for reads,
tail for writes). Since the RINGACC maintains the state, multiple DMA
controllers or channels are allowed to coherently share the same rings as
applicable. The RINGACC is able to place data which is destined towards
software into cached memory directly.

Supported ring modes:
- Ring Mode
- Messaging Mode
- Credentials Mode
- Queue Manager Mode

TI-SCI integration:

Texas Instruments' System Control Interface (TI-SCI) Message Protocol now
has control over Ringacc module resource management (RM) and ring
configuration.

The corresponding support for the TI-SCI Ringacc module RM protocol is
introduced as an option through DT parameters:
- ti,sci: phandle on TI-SCI firmware controller DT node
- ti,sci-dev-id: TI-SCI device identifier as per TI-SCI firmware spec

If both parameters are present, the Ringacc driver will configure/free/reset
rings using the TI-SCI Message Ringacc RM Protocol.

The Ringacc driver now manages ring allocation by itself and requests the
TI-SCI firmware only to allocate and configure specific rings. It is done this
way because the Linux driver implements two-stage ring allocation and
configuration (allocate ring, then configure ring) while the TI-SCI Message
Protocol supports only one combined operation (allocate+configure).
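
For illustration, a minimal sketch of the two-stage sequence from a ring
user's point of view, using the API added by this patch (values are arbitrary,
error handling trimmed):

#include <linux/soc/ti/k3-ringacc.h>

static struct k3_ring *example_get_gp_ring(struct k3_ringacc *ringacc)
{
        struct k3_ring_cfg cfg = {
                .size = 128,                            /* elements */
                .elm_size = K3_RINGACC_RING_ELSIZE_8,   /* 8 byte elements */
                .mode = K3_RINGACC_RING_MODE_RING,
                .flags = 0,
        };
        struct k3_ring *ring;

        /* stage 1: reserve any free general purpose ring */
        ring = k3_ringacc_request_ring(ringacc, K3_RINGACC_RING_ID_ANY, 0);
        if (!ring)
                return NULL;

        /* stage 2: configure it; the combined TI-SCI allocate+configure is sent here */
        if (k3_ringacc_ring_cfg(ring, &cfg)) {
                k3_ringacc_ring_free(ring);
                return NULL;
        }

        return ring;
}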

Signed-off-by: Grygorii Strashko <[email protected]>
Signed-off-by: Peter Ujfalusi <[email protected]>
Reviewed-by: Tero Kristo <[email protected]>
---
drivers/soc/ti/Kconfig | 12 +
drivers/soc/ti/Makefile | 1 +
drivers/soc/ti/k3-ringacc.c | 1158 +++++++++++++++++++++++++++++
include/linux/soc/ti/k3-ringacc.h | 244 ++++++
4 files changed, 1415 insertions(+)
create mode 100644 drivers/soc/ti/k3-ringacc.c
create mode 100644 include/linux/soc/ti/k3-ringacc.h

diff --git a/drivers/soc/ti/Kconfig b/drivers/soc/ti/Kconfig
index cf545f428d03..87722d33333a 100644
--- a/drivers/soc/ti/Kconfig
+++ b/drivers/soc/ti/Kconfig
@@ -80,6 +80,18 @@ config TI_SCI_PM_DOMAINS
called ti_sci_pm_domains. Note this is needed early in boot before
rootfs may be available.

+config TI_K3_RINGACC
+ tristate "K3 Ring accelerator Sub System"
+ depends on ARCH_K3 || COMPILE_TEST
+ depends on TI_SCI_INTA_IRQCHIP
+ default y
+ help
+ Say y here to support the K3 Ring accelerator module.
+ The Ring Accelerator (RINGACC or RA) provides hardware acceleration
+ to enable straightforward passing of work between a producer
+ and a consumer. There is one RINGACC module per NAVSS on TI AM65x SoCs.
+ If unsure, say N.
+
endif # SOC_TI

config TI_SCI_INTA_MSI_DOMAIN
diff --git a/drivers/soc/ti/Makefile b/drivers/soc/ti/Makefile
index b3868d392d4f..cc4bc8b08bf5 100644
--- a/drivers/soc/ti/Makefile
+++ b/drivers/soc/ti/Makefile
@@ -9,3 +9,4 @@ obj-$(CONFIG_AMX3_PM) += pm33xx.o
obj-$(CONFIG_WKUP_M3_IPC) += wkup_m3_ipc.o
obj-$(CONFIG_TI_SCI_PM_DOMAINS) += ti_sci_pm_domains.o
obj-$(CONFIG_TI_SCI_INTA_MSI_DOMAIN) += ti_sci_inta_msi.o
+obj-$(CONFIG_TI_K3_RINGACC) += k3-ringacc.o
diff --git a/drivers/soc/ti/k3-ringacc.c b/drivers/soc/ti/k3-ringacc.c
new file mode 100644
index 000000000000..0059ca97c029
--- /dev/null
+++ b/drivers/soc/ti/k3-ringacc.c
@@ -0,0 +1,1158 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * TI K3 NAVSS Ring Accelerator subsystem driver
+ *
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
+ */
+
+#include <linux/dma-mapping.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/soc/ti/k3-ringacc.h>
+#include <linux/soc/ti/ti_sci_protocol.h>
+#include <linux/soc/ti/ti_sci_inta_msi.h>
+#include <linux/of_irq.h>
+#include <linux/irqdomain.h>
+
+static LIST_HEAD(k3_ringacc_list);
+static DEFINE_MUTEX(k3_ringacc_list_lock);
+
+#define K3_RINGACC_CFG_RING_SIZE_ELCNT_MASK GENMASK(19, 0)
+
+/**
+ * struct k3_ring_rt_regs - The RA Control/Status Registers region
+ */
+struct k3_ring_rt_regs {
+ u32 resv_16[4];
+ u32 db; /* RT Ring N Doorbell Register */
+ u32 resv_4[1];
+ u32 occ; /* RT Ring N Occupancy Register */
+ u32 indx; /* RT Ring N Current Index Register */
+ u32 hwocc; /* RT Ring N Hardware Occupancy Register */
+ u32 hwindx; /* RT Ring N Current Index Register */
+};
+
+#define K3_RINGACC_RT_REGS_STEP 0x1000
+
+/**
+ * struct k3_ring_fifo_regs - The Ring Accelerator Queues Registers region
+ */
+struct k3_ring_fifo_regs {
+ u32 head_data[128]; /* Ring Head Entry Data Registers */
+ u32 tail_data[128]; /* Ring Tail Entry Data Registers */
+ u32 peek_head_data[128]; /* Ring Peek Head Entry Data Regs */
+ u32 peek_tail_data[128]; /* Ring Peek Tail Entry Data Regs */
+};
+
+/**
+ * struct k3_ringacc_proxy_gcfg_regs - RA Proxy Global Config MMIO Region
+ */
+struct k3_ringacc_proxy_gcfg_regs {
+ u32 revision; /* Revision Register */
+ u32 config; /* Config Register */
+};
+
+#define K3_RINGACC_PROXY_CFG_THREADS_MASK GENMASK(15, 0)
+
+/**
+ * struct k3_ringacc_proxy_target_regs - Proxy Datapath MMIO Region
+ */
+struct k3_ringacc_proxy_target_regs {
+ u32 control; /* Proxy Control Register */
+ u32 status; /* Proxy Status Register */
+ u8 resv_512[504];
+ u32 data[128]; /* Proxy Data Register */
+};
+
+#define K3_RINGACC_PROXY_TARGET_STEP 0x1000
+#define K3_RINGACC_PROXY_NOT_USED (-1)
+
+enum k3_ringacc_proxy_access_mode {
+ PROXY_ACCESS_MODE_HEAD = 0,
+ PROXY_ACCESS_MODE_TAIL = 1,
+ PROXY_ACCESS_MODE_PEEK_HEAD = 2,
+ PROXY_ACCESS_MODE_PEEK_TAIL = 3,
+};
+
+#define K3_RINGACC_FIFO_WINDOW_SIZE_BYTES (512U)
+#define K3_RINGACC_FIFO_REGS_STEP 0x1000
+#define K3_RINGACC_MAX_DB_RING_CNT (127U)
+
+/**
+ * struct k3_ring_ops - Ring operations
+ */
+struct k3_ring_ops {
+ int (*push_tail)(struct k3_ring *ring, void *elm);
+ int (*push_head)(struct k3_ring *ring, void *elm);
+ int (*pop_tail)(struct k3_ring *ring, void *elm);
+ int (*pop_head)(struct k3_ring *ring, void *elm);
+};
+
+/**
+ * struct k3_ring - RA Ring descriptor
+ *
+ * @rt - Ring control/status registers
+ * @fifos - Ring queues registers
+ * @proxy - Ring Proxy Datapath registers
+ * @ring_mem_dma - Ring buffer dma address
+ * @ring_mem_virt - Ring buffer virt address
+ * @ops - Ring operations
+ * @size - Ring size in elements
+ * @elm_size - Size of the ring element
+ * @mode - Ring mode
+ * @flags - flags
+ * @free - Number of free elements
+ * @occ - Ring occupancy
+ * @windex - Write index (only for @K3_RINGACC_RING_MODE_RING)
+ * @rindex - Read index (only for @K3_RINGACC_RING_MODE_RING)
+ * @ring_id - Ring Id
+ * @parent - Pointer on struct @k3_ringacc
+ * @use_count - Use count for shared rings
+ * @proxy_id - RA Ring Proxy Id (only if @K3_RINGACC_RING_USE_PROXY)
+ */
+struct k3_ring {
+ struct k3_ring_rt_regs __iomem *rt;
+ struct k3_ring_fifo_regs __iomem *fifos;
+ struct k3_ringacc_proxy_target_regs __iomem *proxy;
+ dma_addr_t ring_mem_dma;
+ void *ring_mem_virt;
+ struct k3_ring_ops *ops;
+ u32 size;
+ enum k3_ring_size elm_size;
+ enum k3_ring_mode mode;
+ u32 flags;
+#define K3_RING_FLAG_BUSY BIT(1)
+#define K3_RING_FLAG_SHARED BIT(2)
+ u32 free;
+ u32 occ;
+ u32 windex;
+ u32 rindex;
+ u32 ring_id;
+ struct k3_ringacc *parent;
+ u32 use_count;
+ int proxy_id;
+};
+
+/**
+ * struct k3_ringacc - Rings accelerator descriptor
+ *
+ * @dev - pointer on RA device
+ * @proxy_gcfg - RA proxy global config registers
+ * @proxy_target_base - RA proxy datapath region
+ * @num_rings - number of ring in RA
+ * @rings_inuse - bitfield for ring usage tracking
+ * @rm_gp_range - general purpose rings range from tisci
+ * @dma_ring_reset_quirk - DMA reset w/a enable
+ * @num_proxies - number of RA proxies
+ * @proxy_inuse - bitfield for proxy usage tracking
+ * @rings - array of rings descriptors (struct @k3_ring)
+ * @list - list of RAs in the system
+ * @tisci - pointer ti-sci handle
+ * @tisci_ring_ops - ti-sci rings ops
+ * @tisci_dev_id - ti-sci device id
+ */
+struct k3_ringacc {
+ struct device *dev;
+ struct k3_ringacc_proxy_gcfg_regs __iomem *proxy_gcfg;
+ void __iomem *proxy_target_base;
+ u32 num_rings; /* number of rings in Ringacc module */
+ unsigned long *rings_inuse;
+ struct ti_sci_resource *rm_gp_range;
+
+ bool dma_ring_reset_quirk;
+ u32 num_proxies;
+ unsigned long *proxy_inuse;
+
+ struct k3_ring *rings;
+ struct list_head list;
+ struct mutex req_lock; /* protect rings allocation */
+
+ const struct ti_sci_handle *tisci;
+ const struct ti_sci_rm_ringacc_ops *tisci_ring_ops;
+ u32 tisci_dev_id;
+};
+
+static long k3_ringacc_ring_get_fifo_pos(struct k3_ring *ring)
+{
+ return K3_RINGACC_FIFO_WINDOW_SIZE_BYTES -
+ (4 << ring->elm_size);
+}
+
+static void *k3_ringacc_get_elm_addr(struct k3_ring *ring, u32 idx)
+{
+ return (ring->ring_mem_virt + idx * (4 << ring->elm_size));
+}
+
+static int k3_ringacc_ring_push_mem(struct k3_ring *ring, void *elem);
+static int k3_ringacc_ring_pop_mem(struct k3_ring *ring, void *elem);
+
+static struct k3_ring_ops k3_ring_mode_ring_ops = {
+ .push_tail = k3_ringacc_ring_push_mem,
+ .pop_head = k3_ringacc_ring_pop_mem,
+};
+
+static int k3_ringacc_ring_push_io(struct k3_ring *ring, void *elem);
+static int k3_ringacc_ring_pop_io(struct k3_ring *ring, void *elem);
+static int k3_ringacc_ring_push_head_io(struct k3_ring *ring, void *elem);
+static int k3_ringacc_ring_pop_tail_io(struct k3_ring *ring, void *elem);
+
+static struct k3_ring_ops k3_ring_mode_msg_ops = {
+ .push_tail = k3_ringacc_ring_push_io,
+ .push_head = k3_ringacc_ring_push_head_io,
+ .pop_tail = k3_ringacc_ring_pop_tail_io,
+ .pop_head = k3_ringacc_ring_pop_io,
+};
+
+static int k3_ringacc_ring_push_head_proxy(struct k3_ring *ring, void *elem);
+static int k3_ringacc_ring_push_tail_proxy(struct k3_ring *ring, void *elem);
+static int k3_ringacc_ring_pop_head_proxy(struct k3_ring *ring, void *elem);
+static int k3_ringacc_ring_pop_tail_proxy(struct k3_ring *ring, void *elem);
+
+static struct k3_ring_ops k3_ring_mode_proxy_ops = {
+ .push_tail = k3_ringacc_ring_push_tail_proxy,
+ .push_head = k3_ringacc_ring_push_head_proxy,
+ .pop_tail = k3_ringacc_ring_pop_tail_proxy,
+ .pop_head = k3_ringacc_ring_pop_head_proxy,
+};
+
+static void k3_ringacc_ring_dump(struct k3_ring *ring)
+{
+ struct device *dev = ring->parent->dev;
+
+ dev_dbg(dev, "dump ring: %d\n", ring->ring_id);
+ dev_dbg(dev, "dump mem virt %p, dma %pad\n", ring->ring_mem_virt,
+ &ring->ring_mem_dma);
+ dev_dbg(dev, "dump elmsize %d, size %d, mode %d, proxy_id %d\n",
+ ring->elm_size, ring->size, ring->mode, ring->proxy_id);
+
+ dev_dbg(dev, "dump ring_rt_regs: db%08x\n", readl(&ring->rt->db));
+ dev_dbg(dev, "dump occ%08x\n", readl(&ring->rt->occ));
+ dev_dbg(dev, "dump indx%08x\n", readl(&ring->rt->indx));
+ dev_dbg(dev, "dump hwocc%08x\n", readl(&ring->rt->hwocc));
+ dev_dbg(dev, "dump hwindx%08x\n", readl(&ring->rt->hwindx));
+
+ if (ring->ring_mem_virt)
+ print_hex_dump_debug("dump ring_mem_virt ", DUMP_PREFIX_NONE,
+ 16, 1, ring->ring_mem_virt, 16 * 8, false);
+}
+
+struct k3_ring *k3_ringacc_request_ring(struct k3_ringacc *ringacc,
+ int id, u32 flags)
+{
+ int proxy_id = K3_RINGACC_PROXY_NOT_USED;
+
+ mutex_lock(&ringacc->req_lock);
+
+ if (id == K3_RINGACC_RING_ID_ANY) {
+ /* Request for any general purpose ring */
+ struct ti_sci_resource_desc *gp_rings =
+ &ringacc->rm_gp_range->desc[0];
+ unsigned long size;
+
+ size = gp_rings->start + gp_rings->num;
+ id = find_next_zero_bit(ringacc->rings_inuse, size,
+ gp_rings->start);
+ if (id == size)
+ goto error;
+ } else if (id < 0) {
+ goto error;
+ }
+
+ if (test_bit(id, ringacc->rings_inuse) &&
+ !(ringacc->rings[id].flags & K3_RING_FLAG_SHARED))
+ goto error;
+ else if (ringacc->rings[id].flags & K3_RING_FLAG_SHARED)
+ goto out;
+
+ if (flags & K3_RINGACC_RING_USE_PROXY) {
+ proxy_id = find_next_zero_bit(ringacc->proxy_inuse,
+ ringacc->num_proxies, 0);
+ if (proxy_id == ringacc->num_proxies)
+ goto error;
+ }
+
+ if (!try_module_get(ringacc->dev->driver->owner))
+ goto error;
+
+ if (proxy_id != K3_RINGACC_PROXY_NOT_USED) {
+ set_bit(proxy_id, ringacc->proxy_inuse);
+ ringacc->rings[id].proxy_id = proxy_id;
+ dev_dbg(ringacc->dev, "Giving ring#%d proxy#%d\n", id,
+ proxy_id);
+ } else {
+ dev_dbg(ringacc->dev, "Giving ring#%d\n", id);
+ }
+
+ set_bit(id, ringacc->rings_inuse);
+out:
+ ringacc->rings[id].use_count++;
+ mutex_unlock(&ringacc->req_lock);
+ return &ringacc->rings[id];
+
+error:
+ mutex_unlock(&ringacc->req_lock);
+ return NULL;
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_request_ring);
+
+static void k3_ringacc_ring_reset_sci(struct k3_ring *ring)
+{
+ struct k3_ringacc *ringacc = ring->parent;
+ int ret;
+
+ ret = ringacc->tisci_ring_ops->config(
+ ringacc->tisci,
+ TI_SCI_MSG_VALUE_RM_RING_COUNT_VALID,
+ ringacc->tisci_dev_id,
+ ring->ring_id,
+ 0,
+ 0,
+ ring->size,
+ 0,
+ 0,
+ 0);
+ if (ret)
+ dev_err(ringacc->dev, "TISCI reset ring fail (%d) ring_idx %d\n",
+ ret, ring->ring_id);
+}
+
+void k3_ringacc_ring_reset(struct k3_ring *ring)
+{
+ if (!ring || !(ring->flags & K3_RING_FLAG_BUSY))
+ return;
+
+ ring->occ = 0;
+ ring->free = 0;
+ ring->rindex = 0;
+ ring->windex = 0;
+
+ k3_ringacc_ring_reset_sci(ring);
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_ring_reset);
+
+static void k3_ringacc_ring_reconfig_qmode_sci(struct k3_ring *ring,
+ enum k3_ring_mode mode)
+{
+ struct k3_ringacc *ringacc = ring->parent;
+ int ret;
+
+ ret = ringacc->tisci_ring_ops->config(
+ ringacc->tisci,
+ TI_SCI_MSG_VALUE_RM_RING_MODE_VALID,
+ ringacc->tisci_dev_id,
+ ring->ring_id,
+ 0,
+ 0,
+ 0,
+ mode,
+ 0,
+ 0);
+ if (ret)
+ dev_err(ringacc->dev, "TISCI reconf qmode fail (%d) ring_idx %d\n",
+ ret, ring->ring_id);
+}
+
+void k3_ringacc_ring_reset_dma(struct k3_ring *ring, u32 occ)
+{
+ if (!ring || !(ring->flags & K3_RING_FLAG_BUSY))
+ return;
+
+ if (!ring->parent->dma_ring_reset_quirk)
+ goto reset;
+
+ if (!occ)
+ occ = readl(&ring->rt->occ);
+
+ if (occ) {
+ u32 db_ring_cnt, db_ring_cnt_cur;
+
+ dev_dbg(ring->parent->dev, "%s %u occ: %u\n", __func__,
+ ring->ring_id, occ);
+ /* TI-SCI ring reset */
+ k3_ringacc_ring_reset_sci(ring);
+
+ /*
+ * Setup the ring in ring/doorbell mode (if not already in this
+ * mode)
+ */
+ if (ring->mode != K3_RINGACC_RING_MODE_RING)
+ k3_ringacc_ring_reconfig_qmode_sci(
+ ring, K3_RINGACC_RING_MODE_RING);
+ /*
+ * Ring the doorbell 2**22 - ringOcc times.
+ * This will wrap the internal UDMAP ring state occupancy
+ * counter (which is 21-bits wide) to 0.
+ */
+ db_ring_cnt = (1U << 22) - occ;
+
+ while (db_ring_cnt != 0) {
+ /*
+ * Ring the doorbell with the maximum count each
+ * iteration if possible to minimize the total
+ * of writes
+ */
+ if (db_ring_cnt > K3_RINGACC_MAX_DB_RING_CNT)
+ db_ring_cnt_cur = K3_RINGACC_MAX_DB_RING_CNT;
+ else
+ db_ring_cnt_cur = db_ring_cnt;
+
+ writel(db_ring_cnt_cur, &ring->rt->db);
+ db_ring_cnt -= db_ring_cnt_cur;
+ }
+
+ /* Restore the original ring mode (if not ring mode) */
+ if (ring->mode != K3_RINGACC_RING_MODE_RING)
+ k3_ringacc_ring_reconfig_qmode_sci(ring, ring->mode);
+ }
+
+reset:
+ /* Reset the ring */
+ k3_ringacc_ring_reset(ring);
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_ring_reset_dma);
+
+static void k3_ringacc_ring_free_sci(struct k3_ring *ring)
+{
+ struct k3_ringacc *ringacc = ring->parent;
+ int ret;
+
+ ret = ringacc->tisci_ring_ops->config(
+ ringacc->tisci,
+ TI_SCI_MSG_VALUE_RM_ALL_NO_ORDER,
+ ringacc->tisci_dev_id,
+ ring->ring_id,
+ 0,
+ 0,
+ 0,
+ 0,
+ 0,
+ 0);
+ if (ret)
+ dev_err(ringacc->dev, "TISCI ring free fail (%d) ring_idx %d\n",
+ ret, ring->ring_id);
+}
+
+int k3_ringacc_ring_free(struct k3_ring *ring)
+{
+ struct k3_ringacc *ringacc;
+
+ if (!ring)
+ return -EINVAL;
+
+ ringacc = ring->parent;
+
+ dev_dbg(ring->parent->dev, "flags: 0x%08x\n", ring->flags);
+
+ if (!test_bit(ring->ring_id, ringacc->rings_inuse))
+ return -EINVAL;
+
+ mutex_lock(&ringacc->req_lock);
+
+ if (--ring->use_count)
+ goto out;
+
+ if (!(ring->flags & K3_RING_FLAG_BUSY))
+ goto no_init;
+
+ k3_ringacc_ring_free_sci(ring);
+
+ dma_free_coherent(ringacc->dev,
+ ring->size * (4 << ring->elm_size),
+ ring->ring_mem_virt, ring->ring_mem_dma);
+ ring->flags = 0;
+ ring->ops = NULL;
+ if (ring->proxy_id != K3_RINGACC_PROXY_NOT_USED) {
+ clear_bit(ring->proxy_id, ringacc->proxy_inuse);
+ ring->proxy = NULL;
+ ring->proxy_id = K3_RINGACC_PROXY_NOT_USED;
+ }
+
+no_init:
+ clear_bit(ring->ring_id, ringacc->rings_inuse);
+
+ module_put(ringacc->dev->driver->owner);
+
+out:
+ mutex_unlock(&ringacc->req_lock);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_ring_free);
+
+u32 k3_ringacc_get_ring_id(struct k3_ring *ring)
+{
+ if (!ring)
+ return -EINVAL;
+
+ return ring->ring_id;
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_get_ring_id);
+
+u32 k3_ringacc_get_tisci_dev_id(struct k3_ring *ring)
+{
+ if (!ring)
+ return -EINVAL;
+
+ return ring->parent->tisci_dev_id;
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_get_tisci_dev_id);
+
+int k3_ringacc_get_ring_irq_num(struct k3_ring *ring)
+{
+ int irq_num;
+
+ if (!ring)
+ return -EINVAL;
+
+ irq_num = ti_sci_inta_msi_get_virq(ring->parent->dev, ring->ring_id);
+ if (irq_num <= 0)
+ irq_num = -EINVAL;
+ return irq_num;
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_get_ring_irq_num);
+
+static int k3_ringacc_ring_cfg_sci(struct k3_ring *ring)
+{
+ struct k3_ringacc *ringacc = ring->parent;
+ u32 ring_idx;
+ int ret;
+
+ if (!ringacc->tisci)
+ return -EINVAL;
+
+ ring_idx = ring->ring_id;
+ ret = ringacc->tisci_ring_ops->config(
+ ringacc->tisci,
+ TI_SCI_MSG_VALUE_RM_ALL_NO_ORDER,
+ ringacc->tisci_dev_id,
+ ring_idx,
+ lower_32_bits(ring->ring_mem_dma),
+ upper_32_bits(ring->ring_mem_dma),
+ ring->size,
+ ring->mode,
+ ring->elm_size,
+ 0);
+ if (ret)
+ dev_err(ringacc->dev, "TISCI config ring fail (%d) ring_idx %d\n",
+ ret, ring_idx);
+
+ return ret;
+}
+
+int k3_ringacc_ring_cfg(struct k3_ring *ring, struct k3_ring_cfg *cfg)
+{
+ struct k3_ringacc *ringacc = ring->parent;
+ int ret = 0;
+
+ if (!ring || !cfg)
+ return -EINVAL;
+ if (cfg->elm_size > K3_RINGACC_RING_ELSIZE_256 ||
+ cfg->mode >= K3_RINGACC_RING_MODE_INVALID ||
+ cfg->size & ~K3_RINGACC_CFG_RING_SIZE_ELCNT_MASK ||
+ !test_bit(ring->ring_id, ringacc->rings_inuse))
+ return -EINVAL;
+
+ if (cfg->mode == K3_RINGACC_RING_MODE_MESSAGE &&
+ ring->proxy_id == K3_RINGACC_PROXY_NOT_USED &&
+ cfg->elm_size > K3_RINGACC_RING_ELSIZE_8) {
+ dev_err(ringacc->dev,
+ "Message mode must use proxy for %u element size\n",
+ 4 << ring->elm_size);
+ return -EINVAL;
+ }
+
+ /*
+ * In case of shared ring only the first user (master user) can
+ * configure the ring. The sequence should be by the client:
+ * ring = k3_ringacc_request_ring(ringacc, ring_id, 0); # master user
+ * k3_ringacc_ring_cfg(ring, cfg); # master configuration
+ * k3_ringacc_request_ring(ringacc, ring_id, K3_RING_FLAG_SHARED);
+ * k3_ringacc_request_ring(ringacc, ring_id, K3_RING_FLAG_SHARED);
+ */
+ if (ring->use_count != 1)
+ return 0;
+
+ ring->size = cfg->size;
+ ring->elm_size = cfg->elm_size;
+ ring->mode = cfg->mode;
+ ring->occ = 0;
+ ring->free = 0;
+ ring->rindex = 0;
+ ring->windex = 0;
+
+ if (ring->proxy_id != K3_RINGACC_PROXY_NOT_USED)
+ ring->proxy = ringacc->proxy_target_base +
+ ring->proxy_id * K3_RINGACC_PROXY_TARGET_STEP;
+
+ switch (ring->mode) {
+ case K3_RINGACC_RING_MODE_RING:
+ ring->ops = &k3_ring_mode_ring_ops;
+ break;
+ case K3_RINGACC_RING_MODE_MESSAGE:
+ if (ring->proxy)
+ ring->ops = &k3_ring_mode_proxy_ops;
+ else
+ ring->ops = &k3_ring_mode_msg_ops;
+ break;
+ default:
+ ring->ops = NULL;
+ ret = -EINVAL;
+ goto err_free_proxy;
+ };
+
+ ring->ring_mem_virt = dma_alloc_coherent(ringacc->dev,
+ ring->size * (4 << ring->elm_size),
+ &ring->ring_mem_dma, GFP_KERNEL);
+ if (!ring->ring_mem_virt) {
+ dev_err(ringacc->dev, "Failed to alloc ring mem\n");
+ ret = -ENOMEM;
+ goto err_free_ops;
+ }
+
+ ret = k3_ringacc_ring_cfg_sci(ring);
+
+ if (ret)
+ goto err_free_mem;
+
+ ring->flags |= K3_RING_FLAG_BUSY;
+ ring->flags |= (cfg->flags & K3_RINGACC_RING_SHARED) ?
+ K3_RING_FLAG_SHARED : 0;
+
+ k3_ringacc_ring_dump(ring);
+
+ return 0;
+
+err_free_mem:
+ dma_free_coherent(ringacc->dev,
+ ring->size * (4 << ring->elm_size),
+ ring->ring_mem_virt,
+ ring->ring_mem_dma);
+err_free_ops:
+ ring->ops = NULL;
+err_free_proxy:
+ ring->proxy = NULL;
+ return ret;
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_ring_cfg);
+
+u32 k3_ringacc_ring_get_size(struct k3_ring *ring)
+{
+ if (!ring || !(ring->flags & K3_RING_FLAG_BUSY))
+ return -EINVAL;
+
+ return ring->size;
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_ring_get_size);
+
+u32 k3_ringacc_ring_get_free(struct k3_ring *ring)
+{
+ if (!ring || !(ring->flags & K3_RING_FLAG_BUSY))
+ return -EINVAL;
+
+ if (!ring->free)
+ ring->free = ring->size - readl(&ring->rt->occ);
+
+ return ring->free;
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_ring_get_free);
+
+u32 k3_ringacc_ring_get_occ(struct k3_ring *ring)
+{
+ if (!ring || !(ring->flags & K3_RING_FLAG_BUSY))
+ return -EINVAL;
+
+ return readl(&ring->rt->occ);
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_ring_get_occ);
+
+u32 k3_ringacc_ring_is_full(struct k3_ring *ring)
+{
+ return !k3_ringacc_ring_get_free(ring);
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_ring_is_full);
+
+enum k3_ringacc_access_mode {
+ K3_RINGACC_ACCESS_MODE_PUSH_HEAD,
+ K3_RINGACC_ACCESS_MODE_POP_HEAD,
+ K3_RINGACC_ACCESS_MODE_PUSH_TAIL,
+ K3_RINGACC_ACCESS_MODE_POP_TAIL,
+ K3_RINGACC_ACCESS_MODE_PEEK_HEAD,
+ K3_RINGACC_ACCESS_MODE_PEEK_TAIL,
+};
+
+#define K3_RINGACC_PROXY_MODE(x) (((x) & 0x3) << 16)
+#define K3_RINGACC_PROXY_ELSIZE(x) (((x) & 0x7) << 24)
+static int k3_ringacc_ring_cfg_proxy(struct k3_ring *ring,
+ enum k3_ringacc_proxy_access_mode mode)
+{
+ u32 val;
+
+ val = ring->ring_id;
+ val |= K3_RINGACC_PROXY_MODE(mode);
+ val |= K3_RINGACC_PROXY_ELSIZE(ring->elm_size);
+ writel(val, &ring->proxy->control);
+ return 0;
+}
+
+static int k3_ringacc_ring_access_proxy(struct k3_ring *ring, void *elem,
+ enum k3_ringacc_access_mode access_mode)
+{
+ void __iomem *ptr;
+
+ ptr = (void __iomem *)&ring->proxy->data;
+
+ switch (access_mode) {
+ case K3_RINGACC_ACCESS_MODE_PUSH_HEAD:
+ case K3_RINGACC_ACCESS_MODE_POP_HEAD:
+ k3_ringacc_ring_cfg_proxy(ring, PROXY_ACCESS_MODE_HEAD);
+ break;
+ case K3_RINGACC_ACCESS_MODE_PUSH_TAIL:
+ case K3_RINGACC_ACCESS_MODE_POP_TAIL:
+ k3_ringacc_ring_cfg_proxy(ring, PROXY_ACCESS_MODE_TAIL);
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ ptr += k3_ringacc_ring_get_fifo_pos(ring);
+
+ switch (access_mode) {
+ case K3_RINGACC_ACCESS_MODE_POP_HEAD:
+ case K3_RINGACC_ACCESS_MODE_POP_TAIL:
+ dev_dbg(ring->parent->dev,
+ "proxy:memcpy_fromio(x): --> ptr(%p), mode:%d\n", ptr,
+ access_mode);
+ memcpy_fromio(elem, ptr, (4 << ring->elm_size));
+ ring->occ--;
+ break;
+ case K3_RINGACC_ACCESS_MODE_PUSH_TAIL:
+ case K3_RINGACC_ACCESS_MODE_PUSH_HEAD:
+ dev_dbg(ring->parent->dev,
+ "proxy:memcpy_toio(x): --> ptr(%p), mode:%d\n", ptr,
+ access_mode);
+ memcpy_toio(ptr, elem, (4 << ring->elm_size));
+ ring->free--;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ dev_dbg(ring->parent->dev, "proxy: free%d occ%d\n", ring->free,
+ ring->occ);
+ return 0;
+}
+
+static int k3_ringacc_ring_push_head_proxy(struct k3_ring *ring, void *elem)
+{
+ return k3_ringacc_ring_access_proxy(ring, elem,
+ K3_RINGACC_ACCESS_MODE_PUSH_HEAD);
+}
+
+static int k3_ringacc_ring_push_tail_proxy(struct k3_ring *ring, void *elem)
+{
+ return k3_ringacc_ring_access_proxy(ring, elem,
+ K3_RINGACC_ACCESS_MODE_PUSH_TAIL);
+}
+
+static int k3_ringacc_ring_pop_head_proxy(struct k3_ring *ring, void *elem)
+{
+ return k3_ringacc_ring_access_proxy(ring, elem,
+ K3_RINGACC_ACCESS_MODE_POP_HEAD);
+}
+
+static int k3_ringacc_ring_pop_tail_proxy(struct k3_ring *ring, void *elem)
+{
+ return k3_ringacc_ring_access_proxy(ring, elem,
+ K3_RINGACC_ACCESS_MODE_POP_HEAD);
+}
+
+static int k3_ringacc_ring_access_io(struct k3_ring *ring, void *elem,
+ enum k3_ringacc_access_mode access_mode)
+{
+ void __iomem *ptr;
+
+ switch (access_mode) {
+ case K3_RINGACC_ACCESS_MODE_PUSH_HEAD:
+ case K3_RINGACC_ACCESS_MODE_POP_HEAD:
+ ptr = (void __iomem *)&ring->fifos->head_data;
+ break;
+ case K3_RINGACC_ACCESS_MODE_PUSH_TAIL:
+ case K3_RINGACC_ACCESS_MODE_POP_TAIL:
+ ptr = (void __iomem *)&ring->fifos->tail_data;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ ptr += k3_ringacc_ring_get_fifo_pos(ring);
+
+ switch (access_mode) {
+ case K3_RINGACC_ACCESS_MODE_POP_HEAD:
+ case K3_RINGACC_ACCESS_MODE_POP_TAIL:
+ dev_dbg(ring->parent->dev,
+ "memcpy_fromio(x): --> ptr(%p), mode:%d\n", ptr,
+ access_mode);
+ memcpy_fromio(elem, ptr, (4 << ring->elm_size));
+ ring->occ--;
+ break;
+ case K3_RINGACC_ACCESS_MODE_PUSH_TAIL:
+ case K3_RINGACC_ACCESS_MODE_PUSH_HEAD:
+ dev_dbg(ring->parent->dev,
+ "memcpy_toio(x): --> ptr(%p), mode:%d\n", ptr,
+ access_mode);
+ memcpy_toio(ptr, elem, (4 << ring->elm_size));
+ ring->free--;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ dev_dbg(ring->parent->dev, "free%d index%d occ%d index%d\n", ring->free,
+ ring->windex, ring->occ, ring->rindex);
+ return 0;
+}
+
+static int k3_ringacc_ring_push_head_io(struct k3_ring *ring, void *elem)
+{
+ return k3_ringacc_ring_access_io(ring, elem,
+ K3_RINGACC_ACCESS_MODE_PUSH_HEAD);
+}
+
+static int k3_ringacc_ring_push_io(struct k3_ring *ring, void *elem)
+{
+ return k3_ringacc_ring_access_io(ring, elem,
+ K3_RINGACC_ACCESS_MODE_PUSH_TAIL);
+}
+
+static int k3_ringacc_ring_pop_io(struct k3_ring *ring, void *elem)
+{
+ return k3_ringacc_ring_access_io(ring, elem,
+ K3_RINGACC_ACCESS_MODE_POP_HEAD);
+}
+
+static int k3_ringacc_ring_pop_tail_io(struct k3_ring *ring, void *elem)
+{
+ return k3_ringacc_ring_access_io(ring, elem,
+ K3_RINGACC_ACCESS_MODE_POP_HEAD);
+}
+
+static int k3_ringacc_ring_push_mem(struct k3_ring *ring, void *elem)
+{
+ void *elem_ptr;
+
+ elem_ptr = k3_ringacc_get_elm_addr(ring, ring->windex);
+
+ memcpy(elem_ptr, elem, (4 << ring->elm_size));
+
+ ring->windex = (ring->windex + 1) % ring->size;
+ ring->free--;
+ writel(1, &ring->rt->db);
+
+ dev_dbg(ring->parent->dev, "ring_push_mem: free%d index%d\n",
+ ring->free, ring->windex);
+
+ return 0;
+}
+
+static int k3_ringacc_ring_pop_mem(struct k3_ring *ring, void *elem)
+{
+ void *elem_ptr;
+
+ elem_ptr = k3_ringacc_get_elm_addr(ring, ring->rindex);
+
+ memcpy(elem, elem_ptr, (4 << ring->elm_size));
+
+ ring->rindex = (ring->rindex + 1) % ring->size;
+ ring->occ--;
+ writel(-1, &ring->rt->db);
+
+ dev_dbg(ring->parent->dev, "ring_pop_mem: occ%d index%d pos_ptr%p\n",
+ ring->occ, ring->rindex, elem_ptr);
+ return 0;
+}
+
+int k3_ringacc_ring_push(struct k3_ring *ring, void *elem)
+{
+ int ret = -EOPNOTSUPP;
+
+ if (!ring || !(ring->flags & K3_RING_FLAG_BUSY))
+ return -EINVAL;
+
+ dev_dbg(ring->parent->dev, "ring_push: free%d index%d\n", ring->free,
+ ring->windex);
+
+ if (k3_ringacc_ring_is_full(ring))
+ return -ENOMEM;
+
+ if (ring->ops && ring->ops->push_tail)
+ ret = ring->ops->push_tail(ring, elem);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_ring_push);
+
+int k3_ringacc_ring_push_head(struct k3_ring *ring, void *elem)
+{
+ int ret = -EOPNOTSUPP;
+
+ if (!ring || !(ring->flags & K3_RING_FLAG_BUSY))
+ return -EINVAL;
+
+ dev_dbg(ring->parent->dev, "ring_push_head: free%d index%d\n",
+ ring->free, ring->windex);
+
+ if (k3_ringacc_ring_is_full(ring))
+ return -ENOMEM;
+
+ if (ring->ops && ring->ops->push_head)
+ ret = ring->ops->push_head(ring, elem);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_ring_push_head);
+
+int k3_ringacc_ring_pop(struct k3_ring *ring, void *elem)
+{
+ int ret = -EOPNOTSUPP;
+
+ if (!ring || !(ring->flags & K3_RING_FLAG_BUSY))
+ return -EINVAL;
+
+ if (!ring->occ)
+ ring->occ = k3_ringacc_ring_get_occ(ring);
+
+ dev_dbg(ring->parent->dev, "ring_pop: occ%d index%d\n", ring->occ,
+ ring->rindex);
+
+ if (!ring->occ)
+ return -ENODATA;
+
+ if (ring->ops && ring->ops->pop_head)
+ ret = ring->ops->pop_head(ring, elem);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_ring_pop);
+
+int k3_ringacc_ring_pop_tail(struct k3_ring *ring, void *elem)
+{
+ int ret = -EOPNOTSUPP;
+
+ if (!ring || !(ring->flags & K3_RING_FLAG_BUSY))
+ return -EINVAL;
+
+ if (!ring->occ)
+ ring->occ = k3_ringacc_ring_get_occ(ring);
+
+ dev_dbg(ring->parent->dev, "ring_pop_tail: occ%d index%d\n", ring->occ,
+ ring->rindex);
+
+ if (!ring->occ)
+ return -ENODATA;
+
+ if (ring->ops && ring->ops->pop_tail)
+ ret = ring->ops->pop_tail(ring, elem);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(k3_ringacc_ring_pop_tail);
+
+struct k3_ringacc *of_k3_ringacc_get_by_phandle(struct device_node *np,
+ const char *property)
+{
+ struct device_node *ringacc_np;
+ struct k3_ringacc *ringacc = ERR_PTR(-EPROBE_DEFER);
+ struct k3_ringacc *entry;
+
+ ringacc_np = of_parse_phandle(np, property, 0);
+ if (!ringacc_np)
+ return ERR_PTR(-ENODEV);
+
+ mutex_lock(&k3_ringacc_list_lock);
+ list_for_each_entry(entry, &k3_ringacc_list, list)
+ if (entry->dev->of_node == ringacc_np) {
+ ringacc = entry;
+ break;
+ }
+ mutex_unlock(&k3_ringacc_list_lock);
+ of_node_put(ringacc_np);
+
+ return ringacc;
+}
+EXPORT_SYMBOL_GPL(of_k3_ringacc_get_by_phandle);
+
+static int k3_ringacc_probe_dt(struct k3_ringacc *ringacc)
+{
+ struct device_node *node = ringacc->dev->of_node;
+ struct device *dev = ringacc->dev;
+ struct platform_device *pdev = to_platform_device(dev);
+ int ret;
+
+ if (!node) {
+ dev_err(dev, "device tree info unavailable\n");
+ return -ENODEV;
+ }
+
+ ret = of_property_read_u32(node, "ti,num-rings", &ringacc->num_rings);
+ if (ret) {
+ dev_err(dev, "ti,num-rings read failure %d\n", ret);
+ return ret;
+ }
+
+ ringacc->dma_ring_reset_quirk =
+ of_property_read_bool(node, "ti,dma-ring-reset-quirk");
+
+ ringacc->tisci = ti_sci_get_by_phandle(node, "ti,sci");
+ if (IS_ERR(ringacc->tisci)) {
+ ret = PTR_ERR(ringacc->tisci);
+ if (ret != -EPROBE_DEFER)
+ dev_err(dev, "ti,sci read fail %d\n", ret);
+ ringacc->tisci = NULL;
+ return ret;
+ }
+
+ ret = of_property_read_u32(node, "ti,sci-dev-id",
+ &ringacc->tisci_dev_id);
+ if (ret) {
+ dev_err(dev, "ti,sci-dev-id read fail %d\n", ret);
+ return ret;
+ }
+
+ pdev->id = ringacc->tisci_dev_id;
+
+ ringacc->rm_gp_range = devm_ti_sci_get_of_resource(ringacc->tisci, dev,
+ ringacc->tisci_dev_id,
+ "ti,sci-rm-range-gp-rings");
+ if (IS_ERR(ringacc->rm_gp_range)) {
+ dev_err(dev, "Failed to allocate MSI interrupts\n");
+ return PTR_ERR(ringacc->rm_gp_range);
+ }
+
+ return ti_sci_inta_msi_domain_alloc_irqs(ringacc->dev,
+ ringacc->rm_gp_range);
+}
+
+static int k3_ringacc_probe(struct platform_device *pdev)
+{
+ struct k3_ringacc *ringacc;
+ void __iomem *base_fifo, *base_rt;
+ struct device *dev = &pdev->dev;
+ struct resource *res;
+ int ret, i;
+
+ ringacc = devm_kzalloc(dev, sizeof(*ringacc), GFP_KERNEL);
+ if (!ringacc)
+ return -ENOMEM;
+
+ ringacc->dev = dev;
+ mutex_init(&ringacc->req_lock);
+
+ dev->msi_domain = of_msi_get_domain(dev, dev->of_node,
+ DOMAIN_BUS_TI_SCI_INTA_MSI);
+ if (!dev->msi_domain) {
+ dev_err(dev, "Failed to get MSI domain\n");
+ return -EPROBE_DEFER;
+ }
+
+ ret = k3_ringacc_probe_dt(ringacc);
+ if (ret)
+ return ret;
+
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "rt");
+ base_rt = devm_ioremap_resource(dev, res);
+ if (IS_ERR(base_rt))
+ return PTR_ERR(base_rt);
+
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "fifos");
+ base_fifo = devm_ioremap_resource(dev, res);
+ if (IS_ERR(base_fifo))
+ return PTR_ERR(base_fifo);
+
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "proxy_gcfg");
+ ringacc->proxy_gcfg = devm_ioremap_resource(dev, res);
+ if (IS_ERR(ringacc->proxy_gcfg))
+ return PTR_ERR(ringacc->proxy_gcfg);
+
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
+ "proxy_target");
+ ringacc->proxy_target_base = devm_ioremap_resource(dev, res);
+ if (IS_ERR(ringacc->proxy_target_base))
+ return PTR_ERR(ringacc->proxy_target_base);
+
+ ringacc->num_proxies = readl(&ringacc->proxy_gcfg->config) &
+ K3_RINGACC_PROXY_CFG_THREADS_MASK;
+
+ ringacc->rings = devm_kzalloc(dev,
+ sizeof(*ringacc->rings) *
+ ringacc->num_rings,
+ GFP_KERNEL);
+ ringacc->rings_inuse = devm_kcalloc(dev,
+ BITS_TO_LONGS(ringacc->num_rings),
+ sizeof(unsigned long), GFP_KERNEL);
+ ringacc->proxy_inuse = devm_kcalloc(dev,
+ BITS_TO_LONGS(ringacc->num_proxies),
+ sizeof(unsigned long), GFP_KERNEL);
+
+ if (!ringacc->rings || !ringacc->rings_inuse || !ringacc->proxy_inuse)
+ return -ENOMEM;
+
+ for (i = 0; i < ringacc->num_rings; i++) {
+ ringacc->rings[i].rt = base_rt +
+ K3_RINGACC_RT_REGS_STEP * i;
+ ringacc->rings[i].fifos = base_fifo +
+ K3_RINGACC_FIFO_REGS_STEP * i;
+ ringacc->rings[i].parent = ringacc;
+ ringacc->rings[i].ring_id = i;
+ ringacc->rings[i].proxy_id = K3_RINGACC_PROXY_NOT_USED;
+ }
+ dev_set_drvdata(dev, ringacc);
+
+ ringacc->tisci_ring_ops = &ringacc->tisci->ops.rm_ring_ops;
+
+ mutex_lock(&k3_ringacc_list_lock);
+ list_add_tail(&ringacc->list, &k3_ringacc_list);
+ mutex_unlock(&k3_ringacc_list_lock);
+
+ dev_info(dev, "Ring Accelerator probed rings:%u, gp-rings[%u,%u] sci-dev-id:%u\n",
+ ringacc->num_rings,
+ ringacc->rm_gp_range->desc[0].start,
+ ringacc->rm_gp_range->desc[0].num,
+ ringacc->tisci_dev_id);
+ dev_info(dev, "dma-ring-reset-quirk: %s\n",
+ ringacc->dma_ring_reset_quirk ? "enabled" : "disabled");
+ dev_info(dev, "RA Proxy rev. %08x, num_proxies:%u\n",
+ readl(&ringacc->proxy_gcfg->revision), ringacc->num_proxies);
+ return 0;
+}
+
+static int k3_ringacc_remove(struct platform_device *pdev)
+{
+ struct k3_ringacc *ringacc = dev_get_drvdata(&pdev->dev);
+
+ mutex_lock(&k3_ringacc_list_lock);
+ list_del(&ringacc->list);
+ mutex_unlock(&k3_ringacc_list_lock);
+ return 0;
+}
+
+/* Match table for of_platform binding */
+static const struct of_device_id k3_ringacc_of_match[] = {
+ { .compatible = "ti,am654-navss-ringacc", },
+ {},
+};
+MODULE_DEVICE_TABLE(of, k3_ringacc_of_match);
+
+static struct platform_driver k3_ringacc_driver = {
+ .probe = k3_ringacc_probe,
+ .remove = k3_ringacc_remove,
+ .driver = {
+ .name = "k3-ringacc",
+ .of_match_table = k3_ringacc_of_match,
+ },
+};
+module_platform_driver(k3_ringacc_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("TI Ringacc driver for K3 SOCs");
+MODULE_AUTHOR("Grygorii Strashko <[email protected]>");
diff --git a/include/linux/soc/ti/k3-ringacc.h b/include/linux/soc/ti/k3-ringacc.h
new file mode 100644
index 000000000000..26f73df0a524
--- /dev/null
+++ b/include/linux/soc/ti/k3-ringacc.h
@@ -0,0 +1,244 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * K3 Ring Accelerator (RA) subsystem interface
+ *
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
+ */
+
+#ifndef __SOC_TI_K3_RINGACC_API_H_
+#define __SOC_TI_K3_RINGACC_API_H_
+
+#include <linux/types.h>
+
+struct device_node;
+
+/**
+ * enum k3_ring_mode - &struct k3_ring_cfg mode
+ *
+ * RA ring operational modes
+ *
+ * @K3_RINGACC_RING_MODE_RING: Exposed Ring mode for SW direct access
+ * @K3_RINGACC_RING_MODE_MESSAGE: Messaging mode. Messaging mode requires
+ * that all accesses to the queue must go through this IP so that all
+ * accesses to the memory are controlled and ordered. This IP then
+ * controls the entire state of the queue, and SW has no directly control,
+ * such as through doorbells and cannot access the storage memory directly.
+ * This is particularly useful when more than one SW or HW entity can be
+ * the producer and/or consumer at the same time
+ * @K3_RINGACC_RING_MODE_CREDENTIALS: Credentials mode is message mode plus
+ * stores credentials with each message, requiring the element size to be
+ * doubled to fit the credentials. Any exposed memory should be protected
+ * by a firewall from unwanted access
+ */
+enum k3_ring_mode {
+ K3_RINGACC_RING_MODE_RING = 0,
+ K3_RINGACC_RING_MODE_MESSAGE,
+ K3_RINGACC_RING_MODE_CREDENTIALS,
+ K3_RINGACC_RING_MODE_INVALID
+};
+
+/**
+ * enum k3_ring_size - &struct k3_ring_cfg elm_size
+ *
+ * RA ring element's sizes in bytes.
+ */
+enum k3_ring_size {
+ K3_RINGACC_RING_ELSIZE_4 = 0,
+ K3_RINGACC_RING_ELSIZE_8,
+ K3_RINGACC_RING_ELSIZE_16,
+ K3_RINGACC_RING_ELSIZE_32,
+ K3_RINGACC_RING_ELSIZE_64,
+ K3_RINGACC_RING_ELSIZE_128,
+ K3_RINGACC_RING_ELSIZE_256,
+ K3_RINGACC_RING_ELSIZE_INVALID
+};
+
+struct k3_ringacc;
+struct k3_ring;
+
+/**
+ * enum k3_ring_cfg - RA ring configuration structure
+ *
+ * @size: Ring size, number of elements
+ * @elm_size: Ring element size
+ * @mode: Ring operational mode
+ * @flags: Ring configuration flags. Possible values:
+ * @K3_RINGACC_RING_SHARED: when set allows to request the same ring
+ * few times. It's usable when the same ring is used as Free Host PD ring
+ * for different flows, for example.
+ * Note: Locking should be done by consumer if required
+ */
+struct k3_ring_cfg {
+ u32 size;
+ enum k3_ring_size elm_size;
+ enum k3_ring_mode mode;
+#define K3_RINGACC_RING_SHARED BIT(1)
+ u32 flags;
+};
+
+#define K3_RINGACC_RING_ID_ANY (-1)
+
+/**
+ * of_k3_ringacc_get_by_phandle - find a RA by phandle property
+ * @np: device node
+ * @propname: property name containing phandle on RA node
+ *
+ * Returns pointer on the RA - struct k3_ringacc
+ * or -ENODEV if not found,
+ * or -EPROBE_DEFER if not yet registered
+ */
+struct k3_ringacc *of_k3_ringacc_get_by_phandle(struct device_node *np,
+ const char *property);
+
+#define K3_RINGACC_RING_USE_PROXY BIT(1)
+
+/**
+ * k3_ringacc_request_ring - request ring from ringacc
+ * @ringacc: pointer on ringacc
+ * @id: ring id or K3_RINGACC_RING_ID_ANY for any general purpose ring
+ * @flags:
+ * @K3_RINGACC_RING_USE_PROXY: if set - proxy will be allocated and
+ * used to access ring memory. Supported only for rings in
+ * Message/Credentials/Queue mode.
+ *
+ * Returns pointer on the Ring - struct k3_ring
+ * or NULL in case of failure.
+ */
+struct k3_ring *k3_ringacc_request_ring(struct k3_ringacc *ringacc,
+ int id, u32 flags);
+
+/**
+ * k3_ringacc_ring_reset - ring reset
+ * @ring: pointer on Ring
+ *
+ * Resets ring internal state ((hw)occ, (hw)idx).
+ */
+void k3_ringacc_ring_reset(struct k3_ring *ring);
+/**
+ * k3_ringacc_ring_reset_dma - ring reset for DMA rings
+ * @ring: pointer on Ring
+ *
+ * Resets ring internal state ((hw)occ, (hw)idx). Should be used for rings
+ * which are read by K3 UDMA, like TX or Free Host PD rings.
+ */
+void k3_ringacc_ring_reset_dma(struct k3_ring *ring, u32 occ);
+
+/**
+ * k3_ringacc_ring_free - ring free
+ * @ring: pointer on Ring
+ *
+ * Resets the ring and frees all allocated resources.
+ */
+int k3_ringacc_ring_free(struct k3_ring *ring);
+
+/**
+ * k3_ringacc_get_ring_id - Get the Ring ID
+ * @ring: pointer on ring
+ *
+ * Returns the Ring ID
+ */
+u32 k3_ringacc_get_ring_id(struct k3_ring *ring);
+
+/**
+ * k3_ringacc_get_ring_irq_num - Get the irq number for the ring
+ * @ring: pointer on ring
+ *
+ * Returns the interrupt number which can be used to request the interrupt
+ */
+int k3_ringacc_get_ring_irq_num(struct k3_ring *ring);
+
+/**
+ * k3_ringacc_ring_cfg - ring configure
+ * @ring: pointer on ring
+ * @cfg: Ring configuration parameters (see &struct k3_ring_cfg)
+ *
+ * Configures ring, including ring memory allocation.
+ * Returns 0 on success, errno otherwise.
+ */
+int k3_ringacc_ring_cfg(struct k3_ring *ring, struct k3_ring_cfg *cfg);
+
+/**
+ * k3_ringacc_ring_get_size - get ring size
+ * @ring: pointer on ring
+ *
+ * Returns ring size in number of elements.
+ */
+u32 k3_ringacc_ring_get_size(struct k3_ring *ring);
+
+/**
+ * k3_ringacc_ring_get_free - get free elements
+ * @ring: pointer on ring
+ *
+ * Returns number of free elements in the ring.
+ */
+u32 k3_ringacc_ring_get_free(struct k3_ring *ring);
+
+/**
+ * k3_ringacc_ring_get_occ - get ring occupancy
+ * @ring: pointer on ring
+ *
+ * Returns total number of valid entries on the ring
+ */
+u32 k3_ringacc_ring_get_occ(struct k3_ring *ring);
+
+/**
+ * k3_ringacc_ring_is_full - checks if ring is full
+ * @ring: pointer on ring
+ *
+ * Returns true if the ring is full
+ */
+u32 k3_ringacc_ring_is_full(struct k3_ring *ring);
+
+/**
+ * k3_ringacc_ring_push - push element to the ring tail
+ * @ring: pointer on ring
+ * @elem: pointer on ring element buffer
+ *
+ * Push one ring element to the ring tail. Size of the ring element is
+ * determined by ring configuration &struct k3_ring_cfg elm_size.
+ *
+ * Returns 0 on success, errno otherwise.
+ */
+int k3_ringacc_ring_push(struct k3_ring *ring, void *elem);
+
+/**
+ * k3_ringacc_ring_pop - pop element from the ring head
+ * @ring: pointer on ring
+ * @elem: pointer on ring element buffer
+ *
+ * Pop one ring element from the ring head. Size of the ring element is
+ * determined by ring configuration &struct k3_ring_cfg elm_size.
+ *
+ * Returns 0 on success, errno otherwise.
+ */
+int k3_ringacc_ring_pop(struct k3_ring *ring, void *elem);
+
+/**
+ * k3_ringacc_ring_push_head - push element to the ring head
+ * @ring: pointer on ring
+ * @elem: pointer on ring element buffer
+ *
+ * Push one ring element to the ring head. Size of the ring element is
+ * determined by ring configuration &struct k3_ring_cfg elm_size.
+ *
+ * Returns 0 on success, errno otherwise.
+ * Not Supported by ring modes: K3_RINGACC_RING_MODE_RING
+ */
+int k3_ringacc_ring_push_head(struct k3_ring *ring, void *elem);
+
+/**
+ * k3_ringacc_ring_pop_tail - pop element from the ring tail
+ * @ring: pointer on ring
+ * @elem: pointer on ring element buffer
+ *
+ * Pop one ring element from the ring tail. Size of the ring element is
+ * determined by ring configuration &struct k3_ring_cfg elm_size.
+ *
+ * Returns 0 on success, errno otherwise.
+ * Not Supported by ring modes: K3_RINGACC_RING_MODE_RING
+ */
+int k3_ringacc_ring_pop_tail(struct k3_ring *ring, void *elem);
+
+u32 k3_ringacc_get_tisci_dev_id(struct k3_ring *ring);
+
+#endif /* __SOC_TI_K3_RINGACC_API_H_ */
--
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2019-11-01 08:44:29

by Peter Ujfalusi

Subject: [PATCH v4 15/15] dmaengine: ti: k3-udma: Add glue layer for non DMAengine users

From: Grygorii Strashko <[email protected]>

Certain users can not use the DMAengine API right now due to missing
features in the core. A prime example is networking.

These users can use the glue layer interface to avoid misusing the DMAengine
API, and when the core gains the needed features they can be converted to
use the generic API.

Signed-off-by: Grygorii Strashko <[email protected]>
Signed-off-by: Peter Ujfalusi <[email protected]>
---
drivers/dma/ti/Kconfig | 9 +
drivers/dma/ti/Makefile | 1 +
drivers/dma/ti/k3-udma-glue.c | 1202 ++++++++++++++++++++++++++++++
drivers/dma/ti/k3-udma-private.c | 133 ++++
drivers/dma/ti/k3-udma.c | 63 +-
drivers/dma/ti/k3-udma.h | 31 +
include/linux/dma/k3-udma-glue.h | 134 ++++
7 files changed, 1572 insertions(+), 1 deletion(-)
create mode 100644 drivers/dma/ti/k3-udma-glue.c
create mode 100644 drivers/dma/ti/k3-udma-private.c
create mode 100644 include/linux/dma/k3-udma-glue.h

diff --git a/drivers/dma/ti/Kconfig b/drivers/dma/ti/Kconfig
index 04c98e215ba6..2421b600446e 100644
--- a/drivers/dma/ti/Kconfig
+++ b/drivers/dma/ti/Kconfig
@@ -48,6 +48,15 @@ config TI_K3_UDMA
Enable support for the TI UDMA (Unified DMA) controller. This
DMA engine is used in AM65x.

+config TI_K3_UDMA_GLUE_LAYER
+ tristate "Texas Instruments UDMA Glue layer for non DMAengine users"
+ depends on ARCH_K3 || COMPILE_TEST
+ depends on TI_K3_UDMA
+ default y
+ help
+ Say y here to support the K3 NAVSS DMA glue interface.
+ If unsure, say N.
+
config TI_K3_PSIL
bool

diff --git a/drivers/dma/ti/Makefile b/drivers/dma/ti/Makefile
index 9d787f009195..9a29a107e374 100644
--- a/drivers/dma/ti/Makefile
+++ b/drivers/dma/ti/Makefile
@@ -3,5 +3,6 @@ obj-$(CONFIG_TI_CPPI41) += cppi41.o
obj-$(CONFIG_TI_EDMA) += edma.o
obj-$(CONFIG_DMA_OMAP) += omap-dma.o
obj-$(CONFIG_TI_K3_UDMA) += k3-udma.o
+obj-$(CONFIG_TI_K3_UDMA_GLUE_LAYER) += k3-udma-glue.o
obj-$(CONFIG_TI_K3_PSIL) += k3-psil.o k3-psil-am654.o k3-psil-j721e.o
obj-$(CONFIG_TI_DMA_CROSSBAR) += dma-crossbar.o
diff --git a/drivers/dma/ti/k3-udma-glue.c b/drivers/dma/ti/k3-udma-glue.c
new file mode 100644
index 000000000000..c747d740a616
--- /dev/null
+++ b/drivers/dma/ti/k3-udma-glue.c
@@ -0,0 +1,1202 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * K3 NAVSS DMA glue interface
+ *
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
+ *
+ */
+
+#include <linux/atomic.h>
+#include <linux/delay.h>
+#include <linux/dma-mapping.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/soc/ti/k3-ringacc.h>
+#include <linux/dma/ti-cppi5.h>
+#include <linux/dma/k3-udma-glue.h>
+
+#include "k3-udma.h"
+#include "k3-psil-priv.h"
+
+struct k3_udma_glue_common {
+ struct device *dev;
+ struct udma_dev *udmax;
+ const struct udma_tisci_rm *tisci_rm;
+ struct k3_ringacc *ringacc;
+ u32 src_thread;
+ u32 dst_thread;
+
+ u32 hdesc_size;
+ bool epib;
+ u32 psdata_size;
+ u32 swdata_size;
+};
+
+struct k3_udma_glue_tx_channel {
+ struct k3_udma_glue_common common;
+
+ struct udma_tchan *udma_tchanx;
+ int udma_tchan_id;
+
+ struct k3_ring *ringtx;
+ struct k3_ring *ringtxcq;
+
+ bool psil_paired;
+
+ int virq;
+
+ atomic_t free_pkts;
+ bool tx_pause_on_err;
+ bool tx_filt_einfo;
+ bool tx_filt_pswords;
+ bool tx_supr_tdpkt;
+};
+
+/**
+ * k3_udma_glue_rx_flow - UDMA RX flow context data
+ *
+ * @udma_rflow: UDMA rflow backing this RX flow
+ * @udma_rflow_id: UDMA rflow ID
+ * @ringrx: RX ring
+ * @ringrxfdq: RX free descriptor queue (FDQ) ring
+ * @virq: IRQ number mapped for @ringrx
+ */
+struct k3_udma_glue_rx_flow {
+ struct udma_rflow *udma_rflow;
+ int udma_rflow_id;
+ struct k3_ring *ringrx;
+ struct k3_ring *ringrxfdq;
+
+ int virq;
+};
+
+struct k3_udma_glue_rx_channel {
+ struct k3_udma_glue_common common;
+
+ struct udma_rchan *udma_rchanx;
+ int udma_rchan_id;
+ bool remote;
+
+ bool psil_paired;
+
+ u32 swdata_size;
+ int flow_id_base;
+
+ struct k3_udma_glue_rx_flow *flows;
+ u32 flow_num;
+ u32 flows_ready;
+};
+
+#define K3_UDMAX_TDOWN_TIMEOUT_US 1000
+
+static int of_k3_udma_glue_parse(struct device_node *udmax_np,
+ struct k3_udma_glue_common *common)
+{
+ common->ringacc = of_k3_ringacc_get_by_phandle(udmax_np,
+ "ti,ringacc");
+ if (IS_ERR(common->ringacc))
+ return PTR_ERR(common->ringacc);
+
+ common->udmax = of_xudma_dev_get(udmax_np, NULL);
+ if (IS_ERR(common->udmax))
+ return PTR_ERR(common->udmax);
+
+ common->tisci_rm = xudma_dev_get_tisci_rm(common->udmax);
+
+ return 0;
+}
+
+static int of_k3_udma_glue_parse_chn(struct device_node *chn_np,
+ const char *name, struct k3_udma_glue_common *common,
+ bool tx_chn)
+{
+ struct psil_endpoint_config *ep_config;
+ struct of_phandle_args dma_spec;
+ u32 thread_id;
+ int ret = 0;
+ int index;
+
+ if (unlikely(!name))
+ return -EINVAL;
+
+ index = of_property_match_string(chn_np, "dma-names", name);
+ if (index < 0)
+ return index;
+
+ if (of_parse_phandle_with_args(chn_np, "dmas", "#dma-cells", index,
+ &dma_spec))
+ return -ENOENT;
+
+ thread_id = dma_spec.args[0];
+
+ if (tx_chn && !(thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)) {
+ ret = -EINVAL;
+ goto out_put_spec;
+ }
+
+ if (!tx_chn && (thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)) {
+ ret = -EINVAL;
+ goto out_put_spec;
+ }
+
+ /* get psil endpoint config */
+ ep_config = psil_get_ep_config(thread_id);
+ if (IS_ERR(ep_config)) {
+ dev_err(common->dev,
+ "No configuration for psi-l thread 0x%04x\n",
+ thread_id);
+ ret = PTR_ERR(ep_config);
+ goto out_put_spec;
+ }
+
+ common->epib = ep_config->needs_epib;
+ common->psdata_size = ep_config->psd_size;
+
+ if (tx_chn)
+ common->dst_thread = thread_id;
+ else
+ common->src_thread = thread_id;
+
+ ret = of_k3_udma_glue_parse(dma_spec.np, common);
+
+out_put_spec:
+ of_node_put(dma_spec.np);
+ return ret;
+}
+
+static void k3_udma_glue_dump_tx_chn(struct k3_udma_glue_tx_channel *tx_chn)
+{
+ struct device *dev = tx_chn->common.dev;
+
+ dev_dbg(dev, "dump_tx_chn:\n"
+ "udma_tchan_id: %d\n"
+ "src_thread: %08x\n"
+ "dst_thread: %08x\n",
+ tx_chn->udma_tchan_id,
+ tx_chn->common.src_thread,
+ tx_chn->common.dst_thread);
+}
+
+static void k3_udma_glue_dump_tx_rt_chn(struct k3_udma_glue_tx_channel *chn,
+ char *mark)
+{
+ struct device *dev = chn->common.dev;
+
+ dev_dbg(dev, "=== dump ===> %s\n", mark);
+ dev_dbg(dev, "0x%08X: %08X\n", UDMA_TCHAN_RT_CTL_REG,
+ xudma_tchanrt_read(chn->udma_tchanx, UDMA_TCHAN_RT_CTL_REG));
+ dev_dbg(dev, "0x%08X: %08X\n", UDMA_TCHAN_RT_PEER_RT_EN_REG,
+ xudma_tchanrt_read(chn->udma_tchanx,
+ UDMA_TCHAN_RT_PEER_RT_EN_REG));
+ dev_dbg(dev, "0x%08X: %08X\n", UDMA_TCHAN_RT_PCNT_REG,
+ xudma_tchanrt_read(chn->udma_tchanx, UDMA_TCHAN_RT_PCNT_REG));
+ dev_dbg(dev, "0x%08X: %08X\n", UDMA_TCHAN_RT_BCNT_REG,
+ xudma_tchanrt_read(chn->udma_tchanx, UDMA_TCHAN_RT_BCNT_REG));
+ dev_dbg(dev, "0x%08X: %08X\n", UDMA_TCHAN_RT_SBCNT_REG,
+ xudma_tchanrt_read(chn->udma_tchanx, UDMA_TCHAN_RT_SBCNT_REG));
+}
+
+static int k3_udma_glue_cfg_tx_chn(struct k3_udma_glue_tx_channel *tx_chn)
+{
+ const struct udma_tisci_rm *tisci_rm = tx_chn->common.tisci_rm;
+ struct ti_sci_msg_rm_udmap_tx_ch_cfg req;
+
+ memset(&req, 0, sizeof(req));
+
+ req.valid_params = TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_EINFO_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_PSWORDS_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_SUPR_TDPKT_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID;
+ req.nav_id = tisci_rm->tisci_dev_id;
+ req.index = tx_chn->udma_tchan_id;
+ if (tx_chn->tx_pause_on_err)
+ req.tx_pause_on_err = 1;
+ if (tx_chn->tx_filt_einfo)
+ req.tx_filt_einfo = 1;
+ if (tx_chn->tx_filt_pswords)
+ req.tx_filt_pswords = 1;
+ req.tx_chan_type = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
+ if (tx_chn->tx_supr_tdpkt)
+ req.tx_supr_tdpkt = 1;
+ req.tx_fetch_size = tx_chn->common.hdesc_size >> 2;
+ req.txcq_qnum = k3_ringacc_get_ring_id(tx_chn->ringtxcq);
+
+ return tisci_rm->tisci_udmap_ops->tx_ch_cfg(tisci_rm->tisci, &req);
+}
+
+struct k3_udma_glue_tx_channel *k3_udma_glue_request_tx_chn(struct device *dev,
+ const char *name, struct k3_udma_glue_tx_channel_cfg *cfg)
+{
+ struct k3_udma_glue_tx_channel *tx_chn;
+ int ret;
+
+ tx_chn = devm_kzalloc(dev, sizeof(*tx_chn), GFP_KERNEL);
+ if (!tx_chn)
+ return ERR_PTR(-ENOMEM);
+
+ tx_chn->common.dev = dev;
+ tx_chn->common.swdata_size = cfg->swdata_size;
+ tx_chn->tx_pause_on_err = cfg->tx_pause_on_err;
+ tx_chn->tx_filt_einfo = cfg->tx_filt_einfo;
+ tx_chn->tx_filt_pswords = cfg->tx_filt_pswords;
+ tx_chn->tx_supr_tdpkt = cfg->tx_supr_tdpkt;
+
+ /* parse of udmap channel */
+ ret = of_k3_udma_glue_parse_chn(dev->of_node, name,
+ &tx_chn->common, true);
+ if (ret)
+ goto err;
+
+ tx_chn->common.hdesc_size = cppi5_hdesc_calc_size(tx_chn->common.epib,
+ tx_chn->common.psdata_size,
+ tx_chn->common.swdata_size);
+
+ /* request and cfg UDMAP TX channel */
+ tx_chn->udma_tchanx = xudma_tchan_get(tx_chn->common.udmax, -1);
+ if (IS_ERR(tx_chn->udma_tchanx)) {
+ ret = PTR_ERR(tx_chn->udma_tchanx);
+ dev_err(dev, "UDMAX tchanx get err %d\n", ret);
+ goto err;
+ }
+ tx_chn->udma_tchan_id = xudma_tchan_get_id(tx_chn->udma_tchanx);
+
+ atomic_set(&tx_chn->free_pkts, cfg->txcq_cfg.size);
+
+ /* request and cfg rings */
+ tx_chn->ringtx = k3_ringacc_request_ring(tx_chn->common.ringacc,
+ tx_chn->udma_tchan_id, 0);
+ if (!tx_chn->ringtx) {
+ ret = -ENODEV;
+ dev_err(dev, "Failed to get TX ring %u\n",
+ tx_chn->udma_tchan_id);
+ goto err;
+ }
+
+ tx_chn->ringtxcq = k3_ringacc_request_ring(tx_chn->common.ringacc,
+ -1, 0);
+ if (!tx_chn->ringtxcq) {
+ ret = -ENODEV;
+ dev_err(dev, "Failed to get TXCQ ring\n");
+ goto err;
+ }
+
+ ret = k3_ringacc_ring_cfg(tx_chn->ringtx, &cfg->tx_cfg);
+ if (ret) {
+ dev_err(dev, "Failed to cfg ringtx %d\n", ret);
+ goto err;
+ }
+
+ ret = k3_ringacc_ring_cfg(tx_chn->ringtxcq, &cfg->txcq_cfg);
+ if (ret) {
+ dev_err(dev, "Failed to cfg ringtx %d\n", ret);
+ goto err;
+ }
+
+ /* request and cfg psi-l */
+ tx_chn->common.src_thread =
+ xudma_dev_get_psil_base(tx_chn->common.udmax) +
+ tx_chn->udma_tchan_id;
+
+ ret = k3_udma_glue_cfg_tx_chn(tx_chn);
+ if (ret) {
+ dev_err(dev, "Failed to cfg tchan %d\n", ret);
+ goto err;
+ }
+
+ ret = xudma_navss_psil_pair(tx_chn->common.udmax,
+ tx_chn->common.src_thread,
+ tx_chn->common.dst_thread);
+ if (ret) {
+ dev_err(dev, "PSI-L request err %d\n", ret);
+ goto err;
+ }
+
+ tx_chn->psil_paired = true;
+
+ /* reset TX RT registers */
+ k3_udma_glue_disable_tx_chn(tx_chn);
+
+ k3_udma_glue_dump_tx_chn(tx_chn);
+
+ return tx_chn;
+
+err:
+ k3_udma_glue_release_tx_chn(tx_chn);
+ return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_request_tx_chn);
+
+void k3_udma_glue_release_tx_chn(struct k3_udma_glue_tx_channel *tx_chn)
+{
+ if (tx_chn->psil_paired) {
+ xudma_navss_psil_unpair(tx_chn->common.udmax,
+ tx_chn->common.src_thread,
+ tx_chn->common.dst_thread);
+ tx_chn->psil_paired = false;
+ }
+
+ if (!IS_ERR_OR_NULL(tx_chn->udma_tchanx))
+ xudma_tchan_put(tx_chn->common.udmax,
+ tx_chn->udma_tchanx);
+
+ if (tx_chn->ringtxcq)
+ k3_ringacc_ring_free(tx_chn->ringtxcq);
+
+ if (tx_chn->ringtx)
+ k3_ringacc_ring_free(tx_chn->ringtx);
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_release_tx_chn);
+
+int k3_udma_glue_push_tx_chn(struct k3_udma_glue_tx_channel *tx_chn,
+ struct cppi5_host_desc_t *desc_tx,
+ dma_addr_t desc_dma)
+{
+ u32 ringtxcq_id;
+
+ if (!atomic_add_unless(&tx_chn->free_pkts, -1, 0))
+ return -ENOMEM;
+
+ ringtxcq_id = k3_ringacc_get_ring_id(tx_chn->ringtxcq);
+ cppi5_desc_set_retpolicy(&desc_tx->hdr, 0, ringtxcq_id);
+
+ return k3_ringacc_ring_push(tx_chn->ringtx, &desc_dma);
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_push_tx_chn);
+
+int k3_udma_glue_pop_tx_chn(struct k3_udma_glue_tx_channel *tx_chn,
+ dma_addr_t *desc_dma)
+{
+ int ret;
+
+ ret = k3_ringacc_ring_pop(tx_chn->ringtxcq, desc_dma);
+ if (!ret)
+ atomic_inc(&tx_chn->free_pkts);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_pop_tx_chn);
+
+int k3_udma_glue_enable_tx_chn(struct k3_udma_glue_tx_channel *tx_chn)
+{
+ u32 txrt_ctl;
+
+ txrt_ctl = UDMA_PEER_RT_EN_ENABLE;
+ xudma_tchanrt_write(tx_chn->udma_tchanx,
+ UDMA_TCHAN_RT_PEER_RT_EN_REG,
+ txrt_ctl);
+
+ txrt_ctl = xudma_tchanrt_read(tx_chn->udma_tchanx,
+ UDMA_TCHAN_RT_CTL_REG);
+ txrt_ctl |= UDMA_CHAN_RT_CTL_EN;
+ xudma_tchanrt_write(tx_chn->udma_tchanx, UDMA_TCHAN_RT_CTL_REG,
+ txrt_ctl);
+
+ k3_udma_glue_dump_tx_rt_chn(tx_chn, "txchn en");
+ return 0;
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_enable_tx_chn);
+
+void k3_udma_glue_disable_tx_chn(struct k3_udma_glue_tx_channel *tx_chn)
+{
+ k3_udma_glue_dump_tx_rt_chn(tx_chn, "txchn dis1");
+
+ xudma_tchanrt_write(tx_chn->udma_tchanx, UDMA_TCHAN_RT_CTL_REG, 0);
+
+ xudma_tchanrt_write(tx_chn->udma_tchanx,
+ UDMA_TCHAN_RT_PEER_RT_EN_REG, 0);
+ k3_udma_glue_dump_tx_rt_chn(tx_chn, "txchn dis2");
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_disable_tx_chn);
+
+void k3_udma_glue_tdown_tx_chn(struct k3_udma_glue_tx_channel *tx_chn,
+ bool sync)
+{
+ int i = 0;
+ u32 val;
+
+ k3_udma_glue_dump_tx_rt_chn(tx_chn, "txchn tdown1");
+
+ xudma_tchanrt_write(tx_chn->udma_tchanx, UDMA_TCHAN_RT_CTL_REG,
+ UDMA_CHAN_RT_CTL_EN | UDMA_CHAN_RT_CTL_TDOWN);
+
+ val = xudma_tchanrt_read(tx_chn->udma_tchanx, UDMA_TCHAN_RT_CTL_REG);
+
+ while (sync && (val & UDMA_CHAN_RT_CTL_EN)) {
+ val = xudma_tchanrt_read(tx_chn->udma_tchanx,
+ UDMA_TCHAN_RT_CTL_REG);
+ udelay(1);
+ if (i > K3_UDMAX_TDOWN_TIMEOUT_US) {
+ dev_err(tx_chn->common.dev, "TX tdown timeout\n");
+ break;
+ }
+ i++;
+ }
+
+ val = xudma_tchanrt_read(tx_chn->udma_tchanx,
+ UDMA_TCHAN_RT_PEER_RT_EN_REG);
+ if (sync && (val & UDMA_PEER_RT_EN_ENABLE))
+ dev_err(tx_chn->common.dev, "TX tdown peer not stopped\n");
+ k3_udma_glue_dump_tx_rt_chn(tx_chn, "txchn tdown2");
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_tdown_tx_chn);
+
+void k3_udma_glue_reset_tx_chn(struct k3_udma_glue_tx_channel *tx_chn,
+ void *data,
+ void (*cleanup)(void *data, dma_addr_t desc_dma))
+{
+ dma_addr_t desc_dma;
+ int occ_tx, i, ret;
+
+ /* reset TXCQ as it is not input for udma - expected to be empty */
+ if (tx_chn->ringtxcq)
+ k3_ringacc_ring_reset(tx_chn->ringtxcq);
+
+ /*
+ * The TXQ reset needs to be done in a special way as it is an input for
+ * udma and its state is cached by udma, so:
+ * 1) save the TXQ occupancy
+ * 2) clean up the TXQ and call the .cleanup() callback for each desc
+ * 3) reset TXQ in a special way
+ */
+ occ_tx = k3_ringacc_ring_get_occ(tx_chn->ringtx);
+ dev_dbg(tx_chn->common.dev, "TX reset occ_tx %u\n", occ_tx);
+
+ for (i = 0; i < occ_tx; i++) {
+ ret = k3_ringacc_ring_pop(tx_chn->ringtx, &desc_dma);
+ if (ret) {
+ dev_err(tx_chn->common.dev, "TX reset pop %d\n", ret);
+ break;
+ }
+ cleanup(data, desc_dma);
+ }
+
+ k3_ringacc_ring_reset_dma(tx_chn->ringtx, occ_tx);
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_reset_tx_chn);
+
+u32 k3_udma_glue_tx_get_hdesc_size(struct k3_udma_glue_tx_channel *tx_chn)
+{
+ return tx_chn->common.hdesc_size;
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_tx_get_hdesc_size);
+
+u32 k3_udma_glue_tx_get_txcq_id(struct k3_udma_glue_tx_channel *tx_chn)
+{
+ return k3_ringacc_get_ring_id(tx_chn->ringtxcq);
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_tx_get_txcq_id);
+
+int k3_udma_glue_tx_get_irq(struct k3_udma_glue_tx_channel *tx_chn)
+{
+ tx_chn->virq = k3_ringacc_get_ring_irq_num(tx_chn->ringtxcq);
+
+ return tx_chn->virq;
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_tx_get_irq);
+
+static int k3_udma_glue_cfg_rx_chn(struct k3_udma_glue_rx_channel *rx_chn)
+{
+ const struct udma_tisci_rm *tisci_rm = rx_chn->common.tisci_rm;
+ struct ti_sci_msg_rm_udmap_rx_ch_cfg req;
+ int ret;
+
+ memset(&req, 0, sizeof(req));
+
+ req.valid_params = TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_START_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_CNT_VALID;
+
+ req.nav_id = tisci_rm->tisci_dev_id;
+ req.index = rx_chn->udma_rchan_id;
+ req.rx_fetch_size = rx_chn->common.hdesc_size >> 2;
+ /*
+ * TODO: we can't support rxcq_qnum/RCHAN[a]_RCQ cfg with current sysfw
+ * and udmax impl, so just configure it to invalid value.
+ * req.rxcq_qnum = k3_ringacc_get_ring_id(rx_chn->flows[0].ringrx);
+ */
+ req.rxcq_qnum = 0xFFFF;
+ if (rx_chn->flow_num && rx_chn->flow_id_base != rx_chn->udma_rchan_id) {
+ /* Default flow + extra ones */
+ req.flowid_start = rx_chn->flow_id_base;
+ req.flowid_cnt = rx_chn->flow_num;
+ }
+ req.rx_chan_type = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
+
+ ret = tisci_rm->tisci_udmap_ops->rx_ch_cfg(tisci_rm->tisci, &req);
+ if (ret)
+ dev_err(rx_chn->common.dev, "rchan%d cfg failed %d\n",
+ rx_chn->udma_rchan_id, ret);
+
+ return ret;
+}
+
+static void k3_udma_glue_release_rx_flow(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_num)
+{
+ struct k3_udma_glue_rx_flow *flow = &rx_chn->flows[flow_num];
+
+ if (IS_ERR_OR_NULL(flow->udma_rflow))
+ return;
+
+ if (flow->ringrxfdq)
+ k3_ringacc_ring_free(flow->ringrxfdq);
+
+ if (flow->ringrx)
+ k3_ringacc_ring_free(flow->ringrx);
+
+ xudma_rflow_put(rx_chn->common.udmax, flow->udma_rflow);
+ flow->udma_rflow = NULL;
+ rx_chn->flows_ready--;
+}
+
+static int k3_udma_glue_cfg_rx_flow(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_idx,
+ struct k3_udma_glue_rx_flow_cfg *flow_cfg)
+{
+ struct k3_udma_glue_rx_flow *flow = &rx_chn->flows[flow_idx];
+ const struct udma_tisci_rm *tisci_rm = rx_chn->common.tisci_rm;
+ struct device *dev = rx_chn->common.dev;
+ struct ti_sci_msg_rm_udmap_flow_cfg req;
+ int rx_ring_id;
+ int rx_ringfdq_id;
+ int ret = 0;
+
+ flow->udma_rflow = xudma_rflow_get(rx_chn->common.udmax,
+ flow->udma_rflow_id);
+ if (IS_ERR(flow->udma_rflow)) {
+ ret = PTR_ERR(flow->udma_rflow);
+ dev_err(dev, "UDMAX rflow get err %d\n", ret);
+ goto err;
+ }
+
+ if (flow->udma_rflow_id != xudma_rflow_get_id(flow->udma_rflow)) {
+ xudma_rflow_put(rx_chn->common.udmax, flow->udma_rflow);
+ return -ENODEV;
+ }
+
+ /* request and cfg rings */
+ flow->ringrx = k3_ringacc_request_ring(rx_chn->common.ringacc,
+ flow_cfg->ring_rxq_id, 0);
+ if (!flow->ringrx) {
+ ret = -ENODEV;
+ dev_err(dev, "Failed to get RX ring\n");
+ goto err;
+ }
+
+ flow->ringrxfdq = k3_ringacc_request_ring(rx_chn->common.ringacc,
+ flow_cfg->ring_rxfdq0_id, 0);
+ if (!flow->ringrxfdq) {
+ ret = -ENODEV;
+ dev_err(dev, "Failed to get RXFDQ ring\n");
+ goto err;
+ }
+
+ ret = k3_ringacc_ring_cfg(flow->ringrx, &flow_cfg->rx_cfg);
+ if (ret) {
+ dev_err(dev, "Failed to cfg ringrx %d\n", ret);
+ goto err;
+ }
+
+ ret = k3_ringacc_ring_cfg(flow->ringrxfdq, &flow_cfg->rxfdq_cfg);
+ if (ret) {
+ dev_err(dev, "Failed to cfg ringrxfdq %d\n", ret);
+ goto err;
+ }
+
+ if (rx_chn->remote) {
+ rx_ring_id = TI_SCI_RESOURCE_NULL;
+ rx_ringfdq_id = TI_SCI_RESOURCE_NULL;
+ } else {
+ rx_ring_id = k3_ringacc_get_ring_id(flow->ringrx);
+ rx_ringfdq_id = k3_ringacc_get_ring_id(flow->ringrxfdq);
+ }
+
+ memset(&req, 0, sizeof(req));
+
+ req.valid_params =
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_EINFO_PRESENT_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_PSINFO_PRESENT_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_ERROR_HANDLING_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DESC_TYPE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_SRC_TAG_HI_SEL_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_SRC_TAG_LO_SEL_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_TAG_HI_SEL_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_TAG_LO_SEL_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ0_SZ0_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ1_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ2_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ3_QNUM_VALID;
+ req.nav_id = tisci_rm->tisci_dev_id;
+ req.flow_index = flow->udma_rflow_id;
+ if (rx_chn->common.epib)
+ req.rx_einfo_present = 1;
+ if (rx_chn->common.psdata_size)
+ req.rx_psinfo_present = 1;
+ if (flow_cfg->rx_error_handling)
+ req.rx_error_handling = 1;
+ req.rx_desc_type = 0;
+ req.rx_dest_qnum = rx_ring_id;
+ req.rx_src_tag_hi_sel = 0;
+ req.rx_src_tag_lo_sel = flow_cfg->src_tag_lo_sel;
+ req.rx_dest_tag_hi_sel = 0;
+ req.rx_dest_tag_lo_sel = 0;
+ req.rx_fdq0_sz0_qnum = rx_ringfdq_id;
+ req.rx_fdq1_qnum = rx_ringfdq_id;
+ req.rx_fdq2_qnum = rx_ringfdq_id;
+ req.rx_fdq3_qnum = rx_ringfdq_id;
+
+ ret = tisci_rm->tisci_udmap_ops->rx_flow_cfg(tisci_rm->tisci, &req);
+ if (ret) {
+ dev_err(dev, "flow%d config failed: %d\n", flow->udma_rflow_id,
+ ret);
+ goto err;
+ }
+
+ rx_chn->flows_ready++;
+ dev_dbg(dev, "flow%d config done. ready:%d\n",
+ flow->udma_rflow_id, rx_chn->flows_ready);
+
+ return 0;
+err:
+ k3_udma_glue_release_rx_flow(rx_chn, flow_idx);
+ return ret;
+}
+
+static void k3_udma_glue_dump_rx_chn(struct k3_udma_glue_rx_channel *chn)
+{
+ struct device *dev = chn->common.dev;
+
+ dev_dbg(dev, "dump_rx_chn:\n"
+ "udma_rchan_id: %d\n"
+ "src_thread: %08x\n"
+ "dst_thread: %08x\n"
+ "epib: %d\n"
+ "hdesc_size: %u\n"
+ "psdata_size: %u\n"
+ "swdata_size: %u\n"
+ "flow_id_base: %d\n"
+ "flow_num: %d\n",
+ chn->udma_rchan_id,
+ chn->common.src_thread,
+ chn->common.dst_thread,
+ chn->common.epib,
+ chn->common.hdesc_size,
+ chn->common.psdata_size,
+ chn->common.swdata_size,
+ chn->flow_id_base,
+ chn->flow_num);
+}
+
+static void k3_udma_glue_dump_rx_rt_chn(struct k3_udma_glue_rx_channel *chn,
+ char *mark)
+{
+ struct device *dev = chn->common.dev;
+
+ dev_dbg(dev, "=== dump ===> %s\n", mark);
+
+ dev_dbg(dev, "0x%08X: %08X\n", UDMA_RCHAN_RT_CTL_REG,
+ xudma_rchanrt_read(chn->udma_rchanx, UDMA_RCHAN_RT_CTL_REG));
+ dev_dbg(dev, "0x%08X: %08X\n", UDMA_RCHAN_RT_PEER_RT_EN_REG,
+ xudma_rchanrt_read(chn->udma_rchanx,
+ UDMA_RCHAN_RT_PEER_RT_EN_REG));
+ dev_dbg(dev, "0x%08X: %08X\n", UDMA_RCHAN_RT_PCNT_REG,
+ xudma_rchanrt_read(chn->udma_rchanx, UDMA_RCHAN_RT_PCNT_REG));
+ dev_dbg(dev, "0x%08X: %08X\n", UDMA_RCHAN_RT_BCNT_REG,
+ xudma_rchanrt_read(chn->udma_rchanx, UDMA_RCHAN_RT_BCNT_REG));
+ dev_dbg(dev, "0x%08X: %08X\n", UDMA_RCHAN_RT_SBCNT_REG,
+ xudma_rchanrt_read(chn->udma_rchanx, UDMA_RCHAN_RT_SBCNT_REG));
+}
+
+static int
+k3_udma_glue_allocate_rx_flows(struct k3_udma_glue_rx_channel *rx_chn,
+ struct k3_udma_glue_rx_channel_cfg *cfg)
+{
+ int ret;
+
+ /* default rflow */
+ if (cfg->flow_id_use_rxchan_id)
+ return 0;
+
+ /* not GP rflows */
+ if (rx_chn->flow_id_base != -1 &&
+ !xudma_rflow_is_gp(rx_chn->common.udmax, rx_chn->flow_id_base))
+ return 0;
+
+ /* Allocate range of GP rflows */
+ ret = xudma_alloc_gp_rflow_range(rx_chn->common.udmax,
+ rx_chn->flow_id_base,
+ rx_chn->flow_num);
+ if (ret < 0) {
+ dev_err(rx_chn->common.dev, "UDMAX reserve_rflow %d cnt:%d err: %d\n",
+ rx_chn->flow_id_base, rx_chn->flow_num, ret);
+ return ret;
+ }
+ rx_chn->flow_id_base = ret;
+
+ return 0;
+}
+
+static struct k3_udma_glue_rx_channel *
+k3_udma_glue_request_rx_chn_priv(struct device *dev, const char *name,
+ struct k3_udma_glue_rx_channel_cfg *cfg)
+{
+ struct k3_udma_glue_rx_channel *rx_chn;
+ int ret, i;
+
+ if (cfg->flow_id_num <= 0)
+ return ERR_PTR(-EINVAL);
+
+ if (cfg->flow_id_num != 1 &&
+ (cfg->def_flow_cfg || cfg->flow_id_use_rxchan_id))
+ return ERR_PTR(-EINVAL);
+
+ rx_chn = devm_kzalloc(dev, sizeof(*rx_chn), GFP_KERNEL);
+ if (!rx_chn)
+ return ERR_PTR(-ENOMEM);
+
+ rx_chn->common.dev = dev;
+ rx_chn->common.swdata_size = cfg->swdata_size;
+ rx_chn->remote = false;
+
+ /* parse of udmap channel */
+ ret = of_k3_udma_glue_parse_chn(dev->of_node, name,
+ &rx_chn->common, false);
+ if (ret)
+ goto err;
+
+ rx_chn->common.hdesc_size = cppi5_hdesc_calc_size(rx_chn->common.epib,
+ rx_chn->common.psdata_size,
+ rx_chn->common.swdata_size);
+
+ /* request and cfg UDMAP RX channel */
+ rx_chn->udma_rchanx = xudma_rchan_get(rx_chn->common.udmax, -1);
+ if (IS_ERR(rx_chn->udma_rchanx)) {
+ ret = PTR_ERR(rx_chn->udma_rchanx);
+ dev_err(dev, "UDMAX rchanx get err %d\n", ret);
+ goto err;
+ }
+ rx_chn->udma_rchan_id = xudma_rchan_get_id(rx_chn->udma_rchanx);
+
+ rx_chn->flow_num = cfg->flow_id_num;
+ rx_chn->flow_id_base = cfg->flow_id_base;
+
+ /* Use RX channel id as flow id: target dev can't generate flow_id */
+ if (cfg->flow_id_use_rxchan_id)
+ rx_chn->flow_id_base = rx_chn->udma_rchan_id;
+
+ rx_chn->flows = devm_kcalloc(dev, rx_chn->flow_num,
+ sizeof(*rx_chn->flows), GFP_KERNEL);
+ if (!rx_chn->flows) {
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ ret = k3_udma_glue_allocate_rx_flows(rx_chn, cfg);
+ if (ret)
+ goto err;
+
+ for (i = 0; i < rx_chn->flow_num; i++)
+ rx_chn->flows[i].udma_rflow_id = rx_chn->flow_id_base + i;
+
+ /* request and cfg psi-l */
+ rx_chn->common.dst_thread =
+ xudma_dev_get_psil_base(rx_chn->common.udmax) +
+ rx_chn->udma_rchan_id;
+
+ ret = k3_udma_glue_cfg_rx_chn(rx_chn);
+ if (ret) {
+ dev_err(dev, "Failed to cfg rchan %d\n", ret);
+ goto err;
+ }
+
+ /* init default RX flow only if flow_num = 1 */
+ if (cfg->def_flow_cfg) {
+ ret = k3_udma_glue_cfg_rx_flow(rx_chn, 0, cfg->def_flow_cfg);
+ if (ret)
+ goto err;
+ }
+
+ ret = xudma_navss_psil_pair(rx_chn->common.udmax,
+ rx_chn->common.src_thread,
+ rx_chn->common.dst_thread);
+ if (ret) {
+ dev_err(dev, "PSI-L request err %d\n", ret);
+ goto err;
+ }
+
+ rx_chn->psil_paired = true;
+
+ /* reset RX RT registers */
+ k3_udma_glue_disable_rx_chn(rx_chn);
+
+ k3_udma_glue_dump_rx_chn(rx_chn);
+
+ return rx_chn;
+
+err:
+ k3_udma_glue_release_rx_chn(rx_chn);
+ return ERR_PTR(ret);
+}
+
+static struct k3_udma_glue_rx_channel *
+k3_udma_glue_request_remote_rx_chn(struct device *dev, const char *name,
+ struct k3_udma_glue_rx_channel_cfg *cfg)
+{
+ struct k3_udma_glue_rx_channel *rx_chn;
+ int ret, i;
+
+ if (cfg->flow_id_num <= 0 ||
+ cfg->flow_id_use_rxchan_id ||
+ cfg->def_flow_cfg ||
+ cfg->flow_id_base < 0)
+ return ERR_PTR(-EINVAL);
+
+ /*
+ * A remote RX channel is under the control of a remote CPU core, so
+ * Linux can only request and manipulate the dedicated RX flows
+ */
+
+ rx_chn = devm_kzalloc(dev, sizeof(*rx_chn), GFP_KERNEL);
+ if (!rx_chn)
+ return ERR_PTR(-ENOMEM);
+
+ rx_chn->common.dev = dev;
+ rx_chn->common.swdata_size = cfg->swdata_size;
+ rx_chn->remote = true;
+ rx_chn->udma_rchan_id = -1;
+ rx_chn->flow_num = cfg->flow_id_num;
+ rx_chn->flow_id_base = cfg->flow_id_base;
+ rx_chn->psil_paired = false;
+
+ /* parse of udmap channel */
+ ret = of_k3_udma_glue_parse_chn(dev->of_node, name,
+ &rx_chn->common, false);
+ if (ret)
+ goto err;
+
+ rx_chn->common.hdesc_size = cppi5_hdesc_calc_size(rx_chn->common.epib,
+ rx_chn->common.psdata_size,
+ rx_chn->common.swdata_size);
+
+ rx_chn->flows = devm_kcalloc(dev, rx_chn->flow_num,
+ sizeof(*rx_chn->flows), GFP_KERNEL);
+ if (!rx_chn->flows) {
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ ret = k3_udma_glue_allocate_rx_flows(rx_chn, cfg);
+ if (ret)
+ goto err;
+
+ for (i = 0; i < rx_chn->flow_num; i++)
+ rx_chn->flows[i].udma_rflow_id = rx_chn->flow_id_base + i;
+
+ k3_udma_glue_dump_rx_chn(rx_chn);
+
+ return rx_chn;
+
+err:
+ k3_udma_glue_release_rx_chn(rx_chn);
+ return ERR_PTR(ret);
+}
+
+struct k3_udma_glue_rx_channel *
+k3_udma_glue_request_rx_chn(struct device *dev, const char *name,
+ struct k3_udma_glue_rx_channel_cfg *cfg)
+{
+ if (cfg->remote)
+ return k3_udma_glue_request_remote_rx_chn(dev, name, cfg);
+ else
+ return k3_udma_glue_request_rx_chn_priv(dev, name, cfg);
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_request_rx_chn);
+
+void k3_udma_glue_release_rx_chn(struct k3_udma_glue_rx_channel *rx_chn)
+{
+ int i;
+
+ if (IS_ERR_OR_NULL(rx_chn->common.udmax))
+ return;
+
+ if (rx_chn->psil_paired) {
+ xudma_navss_psil_unpair(rx_chn->common.udmax,
+ rx_chn->common.src_thread,
+ rx_chn->common.dst_thread);
+ rx_chn->psil_paired = false;
+ }
+
+ for (i = 0; i < rx_chn->flow_num; i++)
+ k3_udma_glue_release_rx_flow(rx_chn, i);
+
+ if (xudma_rflow_is_gp(rx_chn->common.udmax, rx_chn->flow_id_base))
+ xudma_free_gp_rflow_range(rx_chn->common.udmax,
+ rx_chn->flow_id_base,
+ rx_chn->flow_num);
+
+ if (!IS_ERR_OR_NULL(rx_chn->udma_rchanx))
+ xudma_rchan_put(rx_chn->common.udmax,
+ rx_chn->udma_rchanx);
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_release_rx_chn);
+
+int k3_udma_glue_rx_flow_init(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_idx,
+ struct k3_udma_glue_rx_flow_cfg *flow_cfg)
+{
+ if (flow_idx >= rx_chn->flow_num)
+ return -EINVAL;
+
+ return k3_udma_glue_cfg_rx_flow(rx_chn, flow_idx, flow_cfg);
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_rx_flow_init);
+
+u32 k3_udma_glue_rx_flow_get_fdq_id(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_idx)
+{
+ struct k3_udma_glue_rx_flow *flow;
+
+ if (flow_idx >= rx_chn->flow_num)
+ return -EINVAL;
+
+ flow = &rx_chn->flows[flow_idx];
+
+ return k3_ringacc_get_ring_id(flow->ringrxfdq);
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_rx_flow_get_fdq_id);
+
+u32 k3_udma_glue_rx_get_flow_id_base(struct k3_udma_glue_rx_channel *rx_chn)
+{
+ return rx_chn->flow_id_base;
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_rx_get_flow_id_base);
+
+int k3_udma_glue_rx_flow_enable(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_idx)
+{
+ struct k3_udma_glue_rx_flow *flow = &rx_chn->flows[flow_idx];
+ const struct udma_tisci_rm *tisci_rm = rx_chn->common.tisci_rm;
+ struct device *dev = rx_chn->common.dev;
+ struct ti_sci_msg_rm_udmap_flow_cfg req;
+ int rx_ring_id;
+ int rx_ringfdq_id;
+ int ret = 0;
+
+ if (!rx_chn->remote)
+ return -EINVAL;
+
+ rx_ring_id = k3_ringacc_get_ring_id(flow->ringrx);
+ rx_ringfdq_id = k3_ringacc_get_ring_id(flow->ringrxfdq);
+
+ memset(&req, 0, sizeof(req));
+
+ req.valid_params =
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ0_SZ0_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ1_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ2_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ3_QNUM_VALID;
+ req.nav_id = tisci_rm->tisci_dev_id;
+ req.flow_index = flow->udma_rflow_id;
+ req.rx_dest_qnum = rx_ring_id;
+ req.rx_fdq0_sz0_qnum = rx_ringfdq_id;
+ req.rx_fdq1_qnum = rx_ringfdq_id;
+ req.rx_fdq2_qnum = rx_ringfdq_id;
+ req.rx_fdq3_qnum = rx_ringfdq_id;
+
+ ret = tisci_rm->tisci_udmap_ops->rx_flow_cfg(tisci_rm->tisci, &req);
+ if (ret) {
+ dev_err(dev, "flow%d enable failed: %d\n", flow->udma_rflow_id,
+ ret);
+ }
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_rx_flow_enable);
+
+int k3_udma_glue_rx_flow_disable(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_idx)
+{
+ struct k3_udma_glue_rx_flow *flow = &rx_chn->flows[flow_idx];
+ const struct udma_tisci_rm *tisci_rm = rx_chn->common.tisci_rm;
+ struct device *dev = rx_chn->common.dev;
+ struct ti_sci_msg_rm_udmap_flow_cfg req;
+ int ret = 0;
+
+ if (!rx_chn->remote)
+ return -EINVAL;
+
+ memset(&req, 0, sizeof(req));
+ req.valid_params =
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ0_SZ0_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ1_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ2_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ3_QNUM_VALID;
+ req.nav_id = tisci_rm->tisci_dev_id;
+ req.flow_index = flow->udma_rflow_id;
+ req.rx_dest_qnum = TI_SCI_RESOURCE_NULL;
+ req.rx_fdq0_sz0_qnum = TI_SCI_RESOURCE_NULL;
+ req.rx_fdq1_qnum = TI_SCI_RESOURCE_NULL;
+ req.rx_fdq2_qnum = TI_SCI_RESOURCE_NULL;
+ req.rx_fdq3_qnum = TI_SCI_RESOURCE_NULL;
+
+ ret = tisci_rm->tisci_udmap_ops->rx_flow_cfg(tisci_rm->tisci, &req);
+ if (ret) {
+ dev_err(dev, "flow%d disable failed: %d\n", flow->udma_rflow_id,
+ ret);
+ }
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_rx_flow_disable);
+
+int k3_udma_glue_enable_rx_chn(struct k3_udma_glue_rx_channel *rx_chn)
+{
+ u32 rxrt_ctl;
+
+ if (rx_chn->remote)
+ return -EINVAL;
+
+ if (rx_chn->flows_ready < rx_chn->flow_num)
+ return -EINVAL;
+
+ rxrt_ctl = xudma_rchanrt_read(rx_chn->udma_rchanx,
+ UDMA_RCHAN_RT_CTL_REG);
+ rxrt_ctl |= UDMA_CHAN_RT_CTL_EN;
+ xudma_rchanrt_write(rx_chn->udma_rchanx, UDMA_RCHAN_RT_CTL_REG,
+ rxrt_ctl);
+
+ xudma_rchanrt_write(rx_chn->udma_rchanx,
+ UDMA_RCHAN_RT_PEER_RT_EN_REG,
+ UDMA_PEER_RT_EN_ENABLE);
+
+ k3_udma_glue_dump_rx_rt_chn(rx_chn, "rxrt en");
+ return 0;
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_enable_rx_chn);
+
+void k3_udma_glue_disable_rx_chn(struct k3_udma_glue_rx_channel *rx_chn)
+{
+ k3_udma_glue_dump_rx_rt_chn(rx_chn, "rxrt dis1");
+
+ xudma_rchanrt_write(rx_chn->udma_rchanx,
+ UDMA_RCHAN_RT_PEER_RT_EN_REG,
+ 0);
+ xudma_rchanrt_write(rx_chn->udma_rchanx, UDMA_RCHAN_RT_CTL_REG, 0);
+
+ k3_udma_glue_dump_rx_rt_chn(rx_chn, "rxrt dis2");
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_disable_rx_chn);
+
+void k3_udma_glue_tdown_rx_chn(struct k3_udma_glue_rx_channel *rx_chn,
+ bool sync)
+{
+ int i = 0;
+ u32 val;
+
+ if (rx_chn->remote)
+ return;
+
+ k3_udma_glue_dump_rx_rt_chn(rx_chn, "rxrt tdown1");
+
+ xudma_rchanrt_write(rx_chn->udma_rchanx, UDMA_RCHAN_RT_PEER_RT_EN_REG,
+ UDMA_PEER_RT_EN_ENABLE | UDMA_PEER_RT_EN_TEARDOWN);
+
+ val = xudma_rchanrt_read(rx_chn->udma_rchanx, UDMA_RCHAN_RT_CTL_REG);
+
+ while (sync && (val & UDMA_CHAN_RT_CTL_EN)) {
+ val = xudma_rchanrt_read(rx_chn->udma_rchanx,
+ UDMA_RCHAN_RT_CTL_REG);
+ udelay(1);
+ if (i > K3_UDMAX_TDOWN_TIMEOUT_US) {
+ dev_err(rx_chn->common.dev, "RX tdown timeout\n");
+ break;
+ }
+ i++;
+ }
+
+ val = xudma_rchanrt_read(rx_chn->udma_rchanx,
+ UDMA_RCHAN_RT_PEER_RT_EN_REG);
+ if (sync && (val & UDMA_PEER_RT_EN_ENABLE))
+ dev_err(rx_chn->common.dev, "RX tdown peer not stopped\n");
+ k3_udma_glue_dump_rx_rt_chn(rx_chn, "rxrt tdown2");
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_tdown_rx_chn);
+
+void k3_udma_glue_reset_rx_chn(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_num, void *data,
+ void (*cleanup)(void *data, dma_addr_t desc_dma), bool skip_fdq)
+{
+ struct k3_udma_glue_rx_flow *flow = &rx_chn->flows[flow_num];
+ struct device *dev = rx_chn->common.dev;
+ dma_addr_t desc_dma;
+ int occ_rx, i, ret;
+
+ /* reset RXCQ as it is not input for udma - expected to be empty */
+ occ_rx = k3_ringacc_ring_get_occ(flow->ringrx);
+ dev_dbg(dev, "RX reset flow %u occ_rx %u\n", flow_num, occ_rx);
+ if (flow->ringrx)
+ k3_ringacc_ring_reset(flow->ringrx);
+
+ /* Skip RX FDQ in case one FDQ is used for the set of flows */
+ if (skip_fdq)
+ return;
+
+ /*
+ * The RX FDQ reset needs to be done in a special way as it is an input
+ * for udma and its state is cached by udma, so:
+ * 1) save the RX FDQ occupancy
+ * 2) clean up the RX FDQ and call the .cleanup() callback for each desc
+ * 3) reset RX FDQ in a special way
+ */
+ occ_rx = k3_ringacc_ring_get_occ(flow->ringrxfdq);
+ dev_dbg(dev, "RX reset flow %u occ_rx_fdq %u\n", flow_num, occ_rx);
+
+ for (i = 0; i < occ_rx; i++) {
+ ret = k3_ringacc_ring_pop(flow->ringrxfdq, &desc_dma);
+ if (ret) {
+ dev_err(dev, "RX reset pop %d\n", ret);
+ break;
+ }
+ cleanup(data, desc_dma);
+ }
+
+ k3_ringacc_ring_reset_dma(flow->ringrxfdq, occ_rx);
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_reset_rx_chn);
+
+int k3_udma_glue_push_rx_chn(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_num, struct cppi5_host_desc_t *desc_rx,
+ dma_addr_t desc_dma)
+{
+ struct k3_udma_glue_rx_flow *flow = &rx_chn->flows[flow_num];
+
+ return k3_ringacc_ring_push(flow->ringrxfdq, &desc_dma);
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_push_rx_chn);
+
+int k3_udma_glue_pop_rx_chn(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_num, dma_addr_t *desc_dma)
+{
+ struct k3_udma_glue_rx_flow *flow = &rx_chn->flows[flow_num];
+
+ return k3_ringacc_ring_pop(flow->ringrx, desc_dma);
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_pop_rx_chn);
+
+int k3_udma_glue_rx_get_irq(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_num)
+{
+ struct k3_udma_glue_rx_flow *flow;
+
+ flow = &rx_chn->flows[flow_num];
+
+ flow->virq = k3_ringacc_get_ring_irq_num(flow->ringrx);
+
+ return flow->virq;
+}
+EXPORT_SYMBOL_GPL(k3_udma_glue_rx_get_irq);
diff --git a/drivers/dma/ti/k3-udma-private.c b/drivers/dma/ti/k3-udma-private.c
new file mode 100644
index 000000000000..0b8f3dd6b146
--- /dev/null
+++ b/drivers/dma/ti/k3-udma-private.c
@@ -0,0 +1,133 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
+ * Author: Peter Ujfalusi <[email protected]>
+ */
+
+int xudma_navss_psil_pair(struct udma_dev *ud, u32 src_thread, u32 dst_thread)
+{
+ return navss_psil_pair(ud, src_thread, dst_thread);
+}
+EXPORT_SYMBOL(xudma_navss_psil_pair);
+
+int xudma_navss_psil_unpair(struct udma_dev *ud, u32 src_thread, u32 dst_thread)
+{
+ return navss_psil_unpair(ud, src_thread, dst_thread);
+}
+EXPORT_SYMBOL(xudma_navss_psil_unpair);
+
+struct udma_dev *of_xudma_dev_get(struct device_node *np, const char *property)
+{
+ struct device_node *udma_node = np;
+ struct platform_device *pdev;
+ struct udma_dev *ud;
+
+ if (property) {
+ udma_node = of_parse_phandle(np, property, 0);
+ if (!udma_node) {
+ pr_err("UDMA node is not found\n");
+ return ERR_PTR(-ENODEV);
+ }
+ }
+
+ pdev = of_find_device_by_node(udma_node);
+ if (!pdev) {
+ pr_debug("UDMA device not found\n");
+ return ERR_PTR(-EPROBE_DEFER);
+ }
+
+ if (np != udma_node)
+ of_node_put(udma_node);
+
+ ud = platform_get_drvdata(pdev);
+ if (!ud) {
+ pr_debug("UDMA has not been probed\n");
+ return ERR_PTR(-EPROBE_DEFER);
+ }
+
+ return ud;
+}
+EXPORT_SYMBOL(of_xudma_dev_get);
+
+u32 xudma_dev_get_psil_base(struct udma_dev *ud)
+{
+ return ud->psil_base;
+}
+EXPORT_SYMBOL(xudma_dev_get_psil_base);
+
+struct udma_tisci_rm *xudma_dev_get_tisci_rm(struct udma_dev *ud)
+{
+ return &ud->tisci_rm;
+}
+EXPORT_SYMBOL(xudma_dev_get_tisci_rm);
+
+int xudma_alloc_gp_rflow_range(struct udma_dev *ud, int from, int cnt)
+{
+ return __udma_alloc_gp_rflow_range(ud, from, cnt);
+}
+EXPORT_SYMBOL(xudma_alloc_gp_rflow_range);
+
+int xudma_free_gp_rflow_range(struct udma_dev *ud, int from, int cnt)
+{
+ return __udma_free_gp_rflow_range(ud, from, cnt);
+}
+EXPORT_SYMBOL(xudma_free_gp_rflow_range);
+
+bool xudma_rflow_is_gp(struct udma_dev *ud, int id)
+{
+ return !test_bit(id, ud->rflow_gp_map);
+}
+EXPORT_SYMBOL(xudma_rflow_is_gp);
+
+#define XUDMA_GET_PUT_RESOURCE(res) \
+struct udma_##res *xudma_##res##_get(struct udma_dev *ud, int id) \
+{ \
+ return __udma_reserve_##res(ud, false, id); \
+} \
+EXPORT_SYMBOL(xudma_##res##_get); \
+ \
+void xudma_##res##_put(struct udma_dev *ud, struct udma_##res *p) \
+{ \
+ clear_bit(p->id, ud->res##_map); \
+} \
+EXPORT_SYMBOL(xudma_##res##_put)
+XUDMA_GET_PUT_RESOURCE(tchan);
+XUDMA_GET_PUT_RESOURCE(rchan);
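+
+/*
+ * For reference, XUDMA_GET_PUT_RESOURCE(tchan) above expands to:
+ *
+ *   struct udma_tchan *xudma_tchan_get(struct udma_dev *ud, int id)
+ *   {
+ *           return __udma_reserve_tchan(ud, false, id);
+ *   }
+ *   EXPORT_SYMBOL(xudma_tchan_get);
+ *
+ *   void xudma_tchan_put(struct udma_dev *ud, struct udma_tchan *p)
+ *   {
+ *           clear_bit(p->id, ud->tchan_map);
+ *   }
+ *   EXPORT_SYMBOL(xudma_tchan_put);
+ */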
+
+struct udma_rflow *xudma_rflow_get(struct udma_dev *ud, int id)
+{
+ return __udma_get_rflow(ud, id);
+}
+EXPORT_SYMBOL(xudma_rflow_get);
+
+void xudma_rflow_put(struct udma_dev *ud, struct udma_rflow *p)
+{
+ __udma_put_rflow(ud, p);
+}
+EXPORT_SYMBOL(xudma_rflow_put);
+
+#define XUDMA_GET_RESOURCE_ID(res) \
+int xudma_##res##_get_id(struct udma_##res *p) \
+{ \
+ return p->id; \
+} \
+EXPORT_SYMBOL(xudma_##res##_get_id)
+XUDMA_GET_RESOURCE_ID(tchan);
+XUDMA_GET_RESOURCE_ID(rchan);
+XUDMA_GET_RESOURCE_ID(rflow);
+
+/* Exported register access functions */
+#define XUDMA_RT_IO_FUNCTIONS(res) \
+u32 xudma_##res##rt_read(struct udma_##res *p, int reg) \
+{ \
+ return udma_##res##rt_read(p, reg); \
+} \
+EXPORT_SYMBOL(xudma_##res##rt_read); \
+ \
+void xudma_##res##rt_write(struct udma_##res *p, int reg, u32 val) \
+{ \
+ udma_##res##rt_write(p, reg, val); \
+} \
+EXPORT_SYMBOL(xudma_##res##rt_write)
+XUDMA_RT_IO_FUNCTIONS(tchan);
+XUDMA_RT_IO_FUNCTIONS(rchan);
diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
index 4a95096e2d81..e2d9caceb4a3 100644
--- a/drivers/dma/ti/k3-udma.c
+++ b/drivers/dma/ti/k3-udma.c
@@ -1050,6 +1050,64 @@ static irqreturn_t udma_udma_irq_handler(int irq, void *data)
return IRQ_HANDLED;
}

+/**
+ * __udma_alloc_gp_rflow_range - alloc range of GP RX flows
+ * @ud: UDMA device
+ * @from: Start the search from this flow id number
+ * @cnt: Number of consecutive flow ids to allocate
+ *
+ * Allocate a range of RX flow ids for future use; those flows can be
+ * requested only using an explicit flow id number. If @from is set to -1 it
+ * will try to find the first free range, if @from is a positive value it
+ * will force allocation only of the specified range of flows.
+ *
+ * Returns -ENOMEM if a free range can't be found,
+ * -EEXIST if the requested range is busy,
+ * -EINVAL if wrong input values are passed.
+ * Returns the first flow id of the range on success.
+ */
+static int __udma_alloc_gp_rflow_range(struct udma_dev *ud, int from, int cnt)
+{
+ int start, tmp_from;
+ DECLARE_BITMAP(tmp, K3_UDMA_MAX_RFLOWS);
+
+ tmp_from = from;
+ if (tmp_from < 0)
+ tmp_from = ud->rchan_cnt;
+ /* default flows can't be allocated; they are accessible only by id */
+ if (tmp_from < ud->rchan_cnt)
+ return -EINVAL;
+
+ if (tmp_from + cnt > ud->rflow_cnt)
+ return -EINVAL;
+
+ bitmap_or(tmp, ud->rflow_gp_map, ud->rflow_gp_map_allocated,
+ ud->rflow_cnt);
+
+ start = bitmap_find_next_zero_area(tmp,
+ ud->rflow_cnt,
+ tmp_from, cnt, 0);
+ if (start >= ud->rflow_cnt)
+ return -ENOMEM;
+
+ if (from >= 0 && start != from)
+ return -EEXIST;
+
+ bitmap_set(ud->rflow_gp_map_allocated, start, cnt);
+ return start;
+}
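+
+/*
+ * Example (illustrative numbers): with rchan_cnt = 8 and rflow_cnt = 48,
+ * __udma_alloc_gp_rflow_range(ud, -1, 4) searches from flow id 8 upwards
+ * and returns the start of the first run of 4 unallocated GP flows, while
+ * __udma_alloc_gp_rflow_range(ud, 16, 4) succeeds only if the first free
+ * run at or after flow id 16 starts exactly at 16, otherwise it returns
+ * -EEXIST.
+ */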
+
+static int __udma_free_gp_rflow_range(struct udma_dev *ud, int from, int cnt)
+{
+ if (from < ud->rchan_cnt)
+ return -EINVAL;
+ if (from + cnt > ud->rflow_cnt)
+ return -EINVAL;
+
+ bitmap_clear(ud->rflow_gp_map_allocated, from, cnt);
+ return 0;
+}
+
static struct udma_rflow *__udma_get_rflow(struct udma_dev *ud, int id)
{
/*
@@ -2936,7 +2994,7 @@ static struct udma_match_data am654_main_data = {

static struct udma_match_data am654_mcu_data = {
.psil_base = 0x6000,
- .enable_memcpy_support = true,
+ .enable_memcpy_support = false, /* MEM_TO_MEM is slow via MCU UDMA */
.have_acc32 = false,
.have_burst = false,
.statictr_z_mask = GENMASK(11, 0),
@@ -3358,6 +3416,9 @@ static struct platform_driver udma_driver = {

module_platform_driver(udma_driver);

+/* Private interfaces to UDMA */
+#include "k3-udma-private.c"
+
MODULE_ALIAS("platform:ti-udma");
MODULE_DESCRIPTION("TI K3 DMA driver for CPPI 5.0 compliant devices");
MODULE_AUTHOR("Peter Ujfalusi <[email protected]>");
diff --git a/drivers/dma/ti/k3-udma.h b/drivers/dma/ti/k3-udma.h
index 49780d9227ed..fec441dd4d7c 100644
--- a/drivers/dma/ti/k3-udma.h
+++ b/drivers/dma/ti/k3-udma.h
@@ -117,4 +117,35 @@ struct udma_tisci_rm {
struct ti_sci_resource *rm_ranges[RM_RANGE_LAST];
};

+/* Direct access to UDMA low level resources for the glue layer */
+int xudma_navss_psil_pair(struct udma_dev *ud, u32 src_thread, u32 dst_thread);
+int xudma_navss_psil_unpair(struct udma_dev *ud, u32 src_thread,
+ u32 dst_thread);
+
+struct udma_dev *of_xudma_dev_get(struct device_node *np, const char *property);
+void xudma_dev_put(struct udma_dev *ud);
+u32 xudma_dev_get_psil_base(struct udma_dev *ud);
+struct udma_tisci_rm *xudma_dev_get_tisci_rm(struct udma_dev *ud);
+
+int xudma_alloc_gp_rflow_range(struct udma_dev *ud, int from, int cnt);
+int xudma_free_gp_rflow_range(struct udma_dev *ud, int from, int cnt);
+
+struct udma_tchan *xudma_tchan_get(struct udma_dev *ud, int id);
+struct udma_rchan *xudma_rchan_get(struct udma_dev *ud, int id);
+struct udma_rflow *xudma_rflow_get(struct udma_dev *ud, int id);
+
+void xudma_tchan_put(struct udma_dev *ud, struct udma_tchan *p);
+void xudma_rchan_put(struct udma_dev *ud, struct udma_rchan *p);
+void xudma_rflow_put(struct udma_dev *ud, struct udma_rflow *p);
+
+int xudma_tchan_get_id(struct udma_tchan *p);
+int xudma_rchan_get_id(struct udma_rchan *p);
+int xudma_rflow_get_id(struct udma_rflow *p);
+
+u32 xudma_tchanrt_read(struct udma_tchan *tchan, int reg);
+void xudma_tchanrt_write(struct udma_tchan *tchan, int reg, u32 val);
+u32 xudma_rchanrt_read(struct udma_rchan *rchan, int reg);
+void xudma_rchanrt_write(struct udma_rchan *rchan, int reg, u32 val);
+bool xudma_rflow_is_gp(struct udma_dev *ud, int id);
+
#endif /* K3_UDMA_H_ */
diff --git a/include/linux/dma/k3-udma-glue.h b/include/linux/dma/k3-udma-glue.h
new file mode 100644
index 000000000000..3b83d14ee08a
--- /dev/null
+++ b/include/linux/dma/k3-udma-glue.h
@@ -0,0 +1,134 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
+ */
+
+#ifndef K3_UDMA_GLUE_H_
+#define K3_UDMA_GLUE_H_
+
+#include <linux/types.h>
+#include <linux/soc/ti/k3-ringacc.h>
+#include <linux/dma/ti-cppi5.h>
+
+struct k3_udma_glue_tx_channel_cfg {
+ struct k3_ring_cfg tx_cfg;
+ struct k3_ring_cfg txcq_cfg;
+
+ bool tx_pause_on_err;
+ bool tx_filt_einfo;
+ bool tx_filt_pswords;
+ bool tx_supr_tdpkt;
+ u32 swdata_size;
+};
+
+struct k3_udma_glue_tx_channel;
+
+struct k3_udma_glue_tx_channel *k3_udma_glue_request_tx_chn(struct device *dev,
+ const char *name, struct k3_udma_glue_tx_channel_cfg *cfg);
+
+void k3_udma_glue_release_tx_chn(struct k3_udma_glue_tx_channel *tx_chn);
+int k3_udma_glue_push_tx_chn(struct k3_udma_glue_tx_channel *tx_chn,
+ struct cppi5_host_desc_t *desc_tx,
+ dma_addr_t desc_dma);
+int k3_udma_glue_pop_tx_chn(struct k3_udma_glue_tx_channel *tx_chn,
+ dma_addr_t *desc_dma);
+int k3_udma_glue_enable_tx_chn(struct k3_udma_glue_tx_channel *tx_chn);
+void k3_udma_glue_disable_tx_chn(struct k3_udma_glue_tx_channel *tx_chn);
+void k3_udma_glue_tdown_tx_chn(struct k3_udma_glue_tx_channel *tx_chn,
+ bool sync);
+void k3_udma_glue_reset_tx_chn(struct k3_udma_glue_tx_channel *tx_chn,
+ void *data, void (*cleanup)(void *data, dma_addr_t desc_dma));
+u32 k3_udma_glue_tx_get_hdesc_size(struct k3_udma_glue_tx_channel *tx_chn);
+u32 k3_udma_glue_tx_get_txcq_id(struct k3_udma_glue_tx_channel *tx_chn);
+int k3_udma_glue_tx_get_irq(struct k3_udma_glue_tx_channel *tx_chn);
+
+enum {
+ K3_NAV_UDMAX_SRC_TAG_LO_KEEP = 0,
+ K3_NAV_UDMAX_SRC_TAG_LO_USE_FLOW_REG = 1,
+ K3_NAV_UDMAX_SRC_TAG_LO_USE_REMOTE_FLOW_ID = 2,
+ K3_NAV_UDMAX_SRC_TAG_LO_USE_REMOTE_SRC_TAG = 4,
+};
+
+/**
+ * k3_udma_glue_rx_flow_cfg - UDMA RX flow cfg
+ *
+ * @rx_cfg: RX ring configuration
+ * @rxfdq_cfg: RX free Host PD ring configuration
+ * @ring_rxq_id: RX ring id (or -1 for any)
+ * @ring_rxfdq0_id: RX free Host PD ring (FDQ) id (or -1 for any)
+ * @rx_error_handling: Rx Error Handling Mode (0 - drop, 1 - re-try)
+ * @src_tag_lo_sel: Rx Source Tag Low Byte Selector in Host PD
+ */
+struct k3_udma_glue_rx_flow_cfg {
+ struct k3_ring_cfg rx_cfg;
+ struct k3_ring_cfg rxfdq_cfg;
+ int ring_rxq_id;
+ int ring_rxfdq0_id;
+ bool rx_error_handling;
+ int src_tag_lo_sel;
+};
+
+/**
+ * k3_udma_glue_rx_channel_cfg - UDMA RX channel cfg
+ *
+ * @swdata_size: SW Data is present in Host PD of @swdata_size bytes
+ * @flow_id_base: first flow_id used by channel.
+ * if @flow_id_base = -1 - range of GP rflows will be
+ * allocated dynamically.
+ * @flow_id_num: number of RX flows used by channel
+ * @flow_id_use_rxchan_id: use RX channel id as flow id,
+ * used only if @flow_id_num = 1
+ * @remote: indication that the RX channel is remote - some remote CPU
+ * core owns and controls the RX channel. The Linux host is
+ * only allowed to attach and configure RX flows within the
+ * RX channel. If set, no RX channel operation will be
+ * performed by the K3 NAVSS DMA glue interface.
+ * @def_flow_cfg: default RX flow configuration,
+ * used only if @flow_id_num = 1
+ */
+struct k3_udma_glue_rx_channel_cfg {
+ u32 swdata_size;
+ int flow_id_base;
+ int flow_id_num;
+ bool flow_id_use_rxchan_id;
+ bool remote;
+
+ struct k3_udma_glue_rx_flow_cfg *def_flow_cfg;
+};
+
+struct k3_udma_glue_rx_channel;
+
+struct k3_udma_glue_rx_channel *k3_udma_glue_request_rx_chn(
+ struct device *dev,
+ const char *name,
+ struct k3_udma_glue_rx_channel_cfg *cfg);
+
+void k3_udma_glue_release_rx_chn(struct k3_udma_glue_rx_channel *rx_chn);
+int k3_udma_glue_enable_rx_chn(struct k3_udma_glue_rx_channel *rx_chn);
+void k3_udma_glue_disable_rx_chn(struct k3_udma_glue_rx_channel *rx_chn);
+void k3_udma_glue_tdown_rx_chn(struct k3_udma_glue_rx_channel *rx_chn,
+ bool sync);
+int k3_udma_glue_push_rx_chn(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_num, struct cppi5_host_desc_t *desc_rx,
+ dma_addr_t desc_dma);
+int k3_udma_glue_pop_rx_chn(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_num, dma_addr_t *desc_dma);
+int k3_udma_glue_rx_flow_init(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_idx, struct k3_udma_glue_rx_flow_cfg *flow_cfg);
+u32 k3_udma_glue_rx_flow_get_fdq_id(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_idx);
+u32 k3_udma_glue_rx_get_flow_id_base(struct k3_udma_glue_rx_channel *rx_chn);
+int k3_udma_glue_rx_get_irq(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_num);
+void k3_udma_glue_rx_put_irq(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_num);
+void k3_udma_glue_reset_rx_chn(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_num, void *data,
+ void (*cleanup)(void *data, dma_addr_t desc_dma),
+ bool skip_fdq);
+int k3_udma_glue_rx_flow_enable(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_idx);
+int k3_udma_glue_rx_flow_disable(struct k3_udma_glue_rx_channel *rx_chn,
+ u32 flow_idx);
+
+#endif /* K3_UDMA_GLUE_H_ */
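
For orientation, a minimal TX-side usage sketch of the API declared above
(the "tx0" dma-name, ring size/mode and swdata_size are placeholders, error
unwinding is trimmed):

static int example_tx_init(struct device *dev)
{
	struct k3_udma_glue_tx_channel_cfg cfg = { 0 };
	struct k3_udma_glue_tx_channel *tx_chn;
	int irq, ret;

	cfg.swdata_size = 16;			/* placeholder */
	cfg.tx_cfg.size = 128;
	cfg.tx_cfg.elm_size = K3_RINGACC_RING_ELSIZE_8;
	cfg.tx_cfg.mode = K3_RINGACC_RING_MODE_RING;
	cfg.txcq_cfg = cfg.tx_cfg;		/* completion ring, same geometry */

	tx_chn = k3_udma_glue_request_tx_chn(dev, "tx0", &cfg);
	if (IS_ERR(tx_chn))
		return PTR_ERR(tx_chn);

	irq = k3_udma_glue_tx_get_irq(tx_chn);	/* pass to request_irq() */
	if (irq <= 0)
		return irq;

	ret = k3_udma_glue_enable_tx_chn(tx_chn);
	if (ret)
		return ret;

	/*
	 * Datapath (per packet): prepare a cppi5 host descriptor sized
	 * k3_udma_glue_tx_get_hdesc_size(tx_chn) and queue it with
	 * k3_udma_glue_push_tx_chn(tx_chn, desc, desc_dma); on the TXCQ
	 * interrupt, k3_udma_glue_pop_tx_chn(tx_chn, &desc_dma) returns the
	 * completed descriptors.
	 */
	return 0;
}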
--
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2019-11-01 08:45:33

by Peter Ujfalusi

[permalink] [raw]
Subject: [PATCH v4 11/15] dmaengine: ti: New driver for K3 UDMA - split#3: alloc/free chan_resources

Split patch for review containing: channel resource allocation and free
functions.

DMA driver for
Texas Instruments K3 NAVSS Unified DMA – Peripheral Root Complex (UDMA-P)

The UDMA-P is intended to perform similar (but significantly upgraded) functions
as the packet-oriented DMA used on previous SoC devices. The UDMA-P module
supports the transmission and reception of various packet types. The UDMA-P is
architected to facilitate the segmentation and reassembly of SoC DMA data
structure compliant packets to/from smaller data blocks that are natively
compatible with the specific requirements of each connected peripheral. Multiple
Tx and Rx channels are provided within the DMA which allow multiple segmentation
or reassembly operations to be ongoing. The DMA controller maintains state
information for each of the channels which allows packet segmentation and
reassembly operations to be time division multiplexed between channels in order
to share the underlying DMA hardware. An external DMA scheduler is used to
control the ordering and rate at which this multiplexing occurs for Transmit
operations. The ordering and rate of Receive operations is indirectly controlled
by the order in which blocks are pushed into the DMA on the Rx PSI-L interface.

The UDMA-P also supports acting as both a UTC and UDMA-C for its internal
channels. Channels in the UDMA-P can be configured to be either Packet-Based or
Third-Party channels on a channel by channel basis.

The initial driver supports:
- MEM_TO_MEM (TR mode)
- DEV_TO_MEM (Packet / TR mode)
- MEM_TO_DEV (Packet / TR mode)
- Cyclic (Packet / TR mode)
- Metadata for descriptors
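
To illustrate how a DMAengine client is expected to drive the DEV_TO_MEM case,
a rough sketch (the "rx" dma-name, FIFO address, bus width and callback are
placeholders):

static int example_start_rx(struct device *dev, struct scatterlist *sgl,
			    unsigned int sg_len, dma_addr_t periph_fifo_addr,
			    dma_async_tx_callback rx_complete_cb, void *priv)
{
	struct dma_slave_config cfg = { 0 };
	struct dma_async_tx_descriptor *desc;
	struct dma_chan *chan;
	int ret;

	chan = dma_request_chan(dev, "rx");
	if (IS_ERR(chan))
		return PTR_ERR(chan);

	cfg.direction = DMA_DEV_TO_MEM;
	cfg.src_addr = periph_fifo_addr;	/* peripheral specific */
	cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
	cfg.src_maxburst = 1;
	ret = dmaengine_slave_config(chan, &cfg);
	if (ret)
		goto err;

	desc = dmaengine_prep_slave_sg(chan, sgl, sg_len, DMA_DEV_TO_MEM,
				       DMA_PREP_INTERRUPT);
	if (!desc) {
		ret = -ENOMEM;
		goto err;
	}
	desc->callback = rx_complete_cb;
	desc->callback_param = priv;
	dmaengine_submit(desc);
	dma_async_issue_pending(chan);

	return 0;
err:
	dma_release_channel(chan);
	return ret;
}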

Signed-off-by: Peter Ujfalusi <[email protected]>
---
drivers/dma/ti/k3-udma.c | 799 +++++++++++++++++++++++++++++++++++++++
1 file changed, 799 insertions(+)

diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
index e38c780cd20d..2ce22d2f203e 100644
--- a/drivers/dma/ti/k3-udma.c
+++ b/drivers/dma/ti/k3-udma.c
@@ -1050,6 +1050,805 @@ static irqreturn_t udma_udma_irq_handler(int irq, void *data)
return IRQ_HANDLED;
}

+static struct udma_rflow *__udma_get_rflow(struct udma_dev *ud, int id)
+{
+ /*
+ * Attempt to request rflow by ID can be made for any rflow
+ * if not in use with assumption that the caller knows what it is doing.
+ * TI-SCI FW will perform additional permission check anyway, it's
+ * safe
+ */
+
+ if (id < 0 || id >= ud->rflow_cnt)
+ return ERR_PTR(-ENOENT);
+
+ if (test_bit(id, ud->rflow_in_use))
+ return ERR_PTR(-ENOENT);
+
+ /* GP rflow has to be allocated first */
+ if (!test_bit(id, ud->rflow_gp_map) &&
+ !test_bit(id, ud->rflow_gp_map_allocated))
+ return ERR_PTR(-EINVAL);
+
+ dev_dbg(ud->dev, "get rflow%d\n", id);
+ set_bit(id, ud->rflow_in_use);
+ return &ud->rflows[id];
+}
+
+static void __udma_put_rflow(struct udma_dev *ud, struct udma_rflow *rflow)
+{
+ if (!test_bit(rflow->id, ud->rflow_in_use)) {
+ dev_err(ud->dev, "attempt to put unused rflow%d\n", rflow->id);
+ return;
+ }
+
+ dev_dbg(ud->dev, "put rflow%d\n", rflow->id);
+ clear_bit(rflow->id, ud->rflow_in_use);
+}
+
+#define UDMA_RESERVE_RESOURCE(res) \
+static struct udma_##res *__udma_reserve_##res(struct udma_dev *ud, \
+ enum udma_tp_level tpl, \
+ int id) \
+{ \
+ if (id >= 0) { \
+ if (test_bit(id, ud->res##_map)) { \
+ dev_err(ud->dev, "res##%d is in use\n", id); \
+ return ERR_PTR(-ENOENT); \
+ } \
+ } else { \
+ int start; \
+ \
+ if (tpl >= ud->match_data->tpl_levels) \
+ tpl = ud->match_data->tpl_levels - 1; \
+ \
+ start = ud->match_data->level_start_idx[tpl]; \
+ \
+ id = find_next_zero_bit(ud->res##_map, ud->res##_cnt, \
+ start); \
+ if (id == ud->res##_cnt) { \
+ return ERR_PTR(-ENOENT); \
+ } \
+ } \
+ \
+ set_bit(id, ud->res##_map); \
+ return &ud->res##s[id]; \
+}
+
+UDMA_RESERVE_RESOURCE(tchan);
+UDMA_RESERVE_RESOURCE(rchan);
+
+static int udma_get_tchan(struct udma_chan *uc)
+{
+ struct udma_dev *ud = uc->ud;
+
+ if (uc->tchan) {
+ dev_dbg(ud->dev, "chan%d: already have tchan%d allocated\n",
+ uc->id, uc->tchan->id);
+ return 0;
+ }
+
+ uc->tchan = __udma_reserve_tchan(ud, uc->channel_tpl, -1);
+ if (IS_ERR(uc->tchan))
+ return PTR_ERR(uc->tchan);
+
+ return 0;
+}
+
+static int udma_get_rchan(struct udma_chan *uc)
+{
+ struct udma_dev *ud = uc->ud;
+
+ if (uc->rchan) {
+ dev_dbg(ud->dev, "chan%d: already have rchan%d allocated\n",
+ uc->id, uc->rchan->id);
+ return 0;
+ }
+
+ uc->rchan = __udma_reserve_rchan(ud, uc->channel_tpl, -1);
+ if (IS_ERR(uc->rchan))
+ return PTR_ERR(uc->rchan);
+
+ return 0;
+}
+
+static int udma_get_chan_pair(struct udma_chan *uc)
+{
+ struct udma_dev *ud = uc->ud;
+ const struct udma_match_data *match_data = ud->match_data;
+ int chan_id, end;
+
+ if ((uc->tchan && uc->rchan) && uc->tchan->id == uc->rchan->id) {
+ dev_info(ud->dev, "chan%d: already have %d pair allocated\n",
+ uc->id, uc->tchan->id);
+ return 0;
+ }
+
+ if (uc->tchan) {
+ dev_err(ud->dev, "chan%d: already have tchan%d allocated\n",
+ uc->id, uc->tchan->id);
+ return -EBUSY;
+ } else if (uc->rchan) {
+ dev_err(ud->dev, "chan%d: already have rchan%d allocated\n",
+ uc->id, uc->rchan->id);
+ return -EBUSY;
+ }
+
+ /* Can be optimized, but let's have it like this for now */
+ end = min(ud->tchan_cnt, ud->rchan_cnt);
+ /* Try to use the highest TPL channel pair for MEM_TO_MEM channels */
+ chan_id = match_data->level_start_idx[match_data->tpl_levels - 1];
+ for (; chan_id < end; chan_id++) {
+ if (!test_bit(chan_id, ud->tchan_map) &&
+ !test_bit(chan_id, ud->rchan_map))
+ break;
+ }
+
+ if (chan_id == end)
+ return -ENOENT;
+
+ set_bit(chan_id, ud->tchan_map);
+ set_bit(chan_id, ud->rchan_map);
+ uc->tchan = &ud->tchans[chan_id];
+ uc->rchan = &ud->rchans[chan_id];
+
+ return 0;
+}
+
+static int udma_get_rflow(struct udma_chan *uc, int flow_id)
+{
+ struct udma_dev *ud = uc->ud;
+
+ if (uc->rflow) {
+ dev_dbg(ud->dev, "chan%d: already have rflow%d allocated\n",
+ uc->id, uc->rflow->id);
+ return 0;
+ }
+
+ if (!uc->rchan)
+ dev_warn(ud->dev, "chan%d: does not have rchan??\n", uc->id);
+
+ uc->rflow = __udma_get_rflow(ud, flow_id);
+ if (IS_ERR(uc->rflow))
+ return PTR_ERR(uc->rflow);
+
+ return 0;
+}
+
+static void udma_put_rchan(struct udma_chan *uc)
+{
+ struct udma_dev *ud = uc->ud;
+
+ if (uc->rchan) {
+ dev_dbg(ud->dev, "chan%d: put rchan%d\n", uc->id,
+ uc->rchan->id);
+ clear_bit(uc->rchan->id, ud->rchan_map);
+ uc->rchan = NULL;
+ }
+}
+
+static void udma_put_tchan(struct udma_chan *uc)
+{
+ struct udma_dev *ud = uc->ud;
+
+ if (uc->tchan) {
+ dev_dbg(ud->dev, "chan%d: put tchan%d\n", uc->id,
+ uc->tchan->id);
+ clear_bit(uc->tchan->id, ud->tchan_map);
+ uc->tchan = NULL;
+ }
+}
+
+static void udma_put_rflow(struct udma_chan *uc)
+{
+ struct udma_dev *ud = uc->ud;
+
+ if (uc->rflow) {
+ dev_dbg(ud->dev, "chan%d: put rflow%d\n", uc->id,
+ uc->rflow->id);
+ __udma_put_rflow(ud, uc->rflow);
+ uc->rflow = NULL;
+ }
+}
+
+static void udma_free_tx_resources(struct udma_chan *uc)
+{
+ if (!uc->tchan)
+ return;
+
+ k3_ringacc_ring_free(uc->tchan->t_ring);
+ k3_ringacc_ring_free(uc->tchan->tc_ring);
+ uc->tchan->t_ring = NULL;
+ uc->tchan->tc_ring = NULL;
+
+ udma_put_tchan(uc);
+}
+
+static int udma_alloc_tx_resources(struct udma_chan *uc)
+{
+ struct k3_ring_cfg ring_cfg;
+ struct udma_dev *ud = uc->ud;
+ int ret;
+
+ ret = udma_get_tchan(uc);
+ if (ret)
+ return ret;
+
+ uc->tchan->t_ring = k3_ringacc_request_ring(ud->ringacc,
+ uc->tchan->id, 0);
+ if (!uc->tchan->t_ring) {
+ ret = -EBUSY;
+ goto err_tx_ring;
+ }
+
+ uc->tchan->tc_ring = k3_ringacc_request_ring(ud->ringacc, -1, 0);
+ if (!uc->tchan->tc_ring) {
+ ret = -EBUSY;
+ goto err_txc_ring;
+ }
+
+ memset(&ring_cfg, 0, sizeof(ring_cfg));
+ ring_cfg.size = K3_UDMA_DEFAULT_RING_SIZE;
+ ring_cfg.elm_size = K3_RINGACC_RING_ELSIZE_8;
+ ring_cfg.mode = K3_RINGACC_RING_MODE_MESSAGE;
+
+ ret = k3_ringacc_ring_cfg(uc->tchan->t_ring, &ring_cfg);
+ ret |= k3_ringacc_ring_cfg(uc->tchan->tc_ring, &ring_cfg);
+
+ if (ret)
+ goto err_ringcfg;
+
+ return 0;
+
+err_ringcfg:
+ k3_ringacc_ring_free(uc->tchan->tc_ring);
+ uc->tchan->tc_ring = NULL;
+err_txc_ring:
+ k3_ringacc_ring_free(uc->tchan->t_ring);
+ uc->tchan->t_ring = NULL;
+err_tx_ring:
+ udma_put_tchan(uc);
+
+ return ret;
+}
+
+static void udma_free_rx_resources(struct udma_chan *uc)
+{
+ if (!uc->rchan)
+ return;
+
+ if (uc->dir != DMA_MEM_TO_MEM) {
+ k3_ringacc_ring_free(uc->rchan->fd_ring);
+ k3_ringacc_ring_free(uc->rchan->r_ring);
+ uc->rchan->fd_ring = NULL;
+ uc->rchan->r_ring = NULL;
+
+ udma_put_rflow(uc);
+ }
+
+ udma_put_rchan(uc);
+}
+
+static int udma_alloc_rx_resources(struct udma_chan *uc)
+{
+ struct k3_ring_cfg ring_cfg;
+ struct udma_dev *ud = uc->ud;
+ int fd_ring_id;
+ int ret;
+
+ ret = udma_get_rchan(uc);
+ if (ret)
+ return ret;
+
+ /* For MEM_TO_MEM we don't need rflow or rings */
+ if (uc->dir == DMA_MEM_TO_MEM)
+ return 0;
+
+ ret = udma_get_rflow(uc, uc->rchan->id);
+ if (ret) {
+ ret = -EBUSY;
+ goto err_rflow;
+ }
+
+ fd_ring_id = ud->tchan_cnt + ud->echan_cnt + uc->rchan->id;
+ uc->rchan->fd_ring = k3_ringacc_request_ring(ud->ringacc,
+ fd_ring_id, 0);
+ if (!uc->rchan->fd_ring) {
+ ret = -EBUSY;
+ goto err_rx_ring;
+ }
+
+ uc->rchan->r_ring = k3_ringacc_request_ring(ud->ringacc, -1, 0);
+ if (!uc->rchan->r_ring) {
+ ret = -EBUSY;
+ goto err_rxc_ring;
+ }
+
+ memset(&ring_cfg, 0, sizeof(ring_cfg));
+
+ if (uc->pkt_mode)
+ ring_cfg.size = SG_MAX_SEGMENTS;
+ else
+ ring_cfg.size = K3_UDMA_DEFAULT_RING_SIZE;
+
+ ring_cfg.elm_size = K3_RINGACC_RING_ELSIZE_8;
+ ring_cfg.mode = K3_RINGACC_RING_MODE_MESSAGE;
+
+ ret = k3_ringacc_ring_cfg(uc->rchan->fd_ring, &ring_cfg);
+ ring_cfg.size = K3_UDMA_DEFAULT_RING_SIZE;
+ ret |= k3_ringacc_ring_cfg(uc->rchan->r_ring, &ring_cfg);
+
+ if (ret)
+ goto err_ringcfg;
+
+ return 0;
+
+err_ringcfg:
+ k3_ringacc_ring_free(uc->rchan->r_ring);
+ uc->rchan->r_ring = NULL;
+err_rxc_ring:
+ k3_ringacc_ring_free(uc->rchan->fd_ring);
+ uc->rchan->fd_ring = NULL;
+err_rx_ring:
+ udma_put_rflow(uc);
+err_rflow:
+ udma_put_rchan(uc);
+
+ return ret;
+}
+
+static int udma_tisci_m2m_channel_config(struct udma_chan *uc)
+{
+ struct udma_dev *ud = uc->ud;
+ struct udma_tisci_rm *tisci_rm = &ud->tisci_rm;
+ const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
+ struct udma_tchan *tchan = uc->tchan;
+ struct udma_rchan *rchan = uc->rchan;
+ int ret = 0;
+
+ /* Non synchronized - mem to mem type of transfer */
+ int tc_ring = k3_ringacc_get_ring_id(tchan->tc_ring);
+ struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
+ struct ti_sci_msg_rm_udmap_rx_ch_cfg req_rx = { 0 };
+
+ req_tx.valid_params =
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_EINFO_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_PSWORDS_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_SUPR_TDPKT_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID;
+
+ req_tx.nav_id = tisci_rm->tisci_dev_id;
+ req_tx.index = tchan->id;
+ req_tx.tx_pause_on_err = 0;
+ req_tx.tx_filt_einfo = 0;
+ req_tx.tx_filt_pswords = 0;
+ req_tx.tx_chan_type = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_BCOPY_PBRR;
+ req_tx.tx_supr_tdpkt = 0;
+ req_tx.tx_fetch_size = sizeof(struct cppi5_desc_hdr_t) >> 2;
+ req_tx.txcq_qnum = tc_ring;
+
+ ret = tisci_ops->tx_ch_cfg(tisci_rm->tisci, &req_tx);
+ if (ret) {
+ dev_err(ud->dev, "tchan%d cfg failed %d\n", tchan->id, ret);
+ return ret;
+ }
+
+ req_rx.valid_params =
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_IGNORE_SHORT_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_IGNORE_LONG_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_START_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_CNT_VALID;
+
+ req_rx.nav_id = tisci_rm->tisci_dev_id;
+ req_rx.index = rchan->id;
+ req_rx.rx_fetch_size = sizeof(struct cppi5_desc_hdr_t) >> 2;
+ req_rx.rxcq_qnum = tc_ring;
+ req_rx.rx_pause_on_err = 0;
+ req_rx.rx_chan_type = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_BCOPY_PBRR;
+ req_rx.rx_ignore_short = 0;
+ req_rx.rx_ignore_long = 0;
+ req_rx.flowid_start = 0;
+ req_rx.flowid_cnt = 0;
+
+ ret = tisci_ops->rx_ch_cfg(tisci_rm->tisci, &req_rx);
+ if (ret)
+ dev_err(ud->dev, "rchan%d alloc failed %d\n", rchan->id, ret);
+
+ return ret;
+}
+
+static int udma_tisci_tx_channel_config(struct udma_chan *uc)
+{
+ struct udma_dev *ud = uc->ud;
+ struct udma_tisci_rm *tisci_rm = &ud->tisci_rm;
+ const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
+ struct udma_tchan *tchan = uc->tchan;
+ int tc_ring = k3_ringacc_get_ring_id(tchan->tc_ring);
+ struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
+ u32 mode, fetch_size;
+ int ret = 0;
+
+ if (uc->pkt_mode) {
+ mode = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
+ fetch_size = cppi5_hdesc_calc_size(uc->needs_epib, uc->psd_size,
+ 0);
+ } else {
+ mode = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_PBRR;
+ fetch_size = sizeof(struct cppi5_desc_hdr_t);
+ }
+
+ req_tx.valid_params =
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_EINFO_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_PSWORDS_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_SUPR_TDPKT_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID;
+
+ req_tx.nav_id = tisci_rm->tisci_dev_id;
+ req_tx.index = tchan->id;
+ req_tx.tx_pause_on_err = 0;
+ req_tx.tx_filt_einfo = 0;
+ req_tx.tx_filt_pswords = 0;
+ req_tx.tx_chan_type = mode;
+ req_tx.tx_supr_tdpkt = uc->notdpkt;
+ req_tx.tx_fetch_size = fetch_size >> 2;
+ req_tx.txcq_qnum = tc_ring;
+ if (uc->ep_type == PSIL_EP_PDMA_XY) {
+ /* wait for peer to complete the teardown for PDMAs */
+ req_tx.valid_params |=
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_TDTYPE_VALID;
+ req_tx.tx_tdtype = 1;
+ }
+
+ ret = tisci_ops->tx_ch_cfg(tisci_rm->tisci, &req_tx);
+ if (ret)
+ dev_err(ud->dev, "tchan%d cfg failed %d\n", tchan->id, ret);
+
+ return ret;
+}
+
+static int udma_tisci_rx_channel_config(struct udma_chan *uc)
+{
+ struct udma_dev *ud = uc->ud;
+ struct udma_tisci_rm *tisci_rm = &ud->tisci_rm;
+ const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
+ struct udma_rchan *rchan = uc->rchan;
+ int fd_ring = k3_ringacc_get_ring_id(rchan->fd_ring);
+ int rx_ring = k3_ringacc_get_ring_id(rchan->r_ring);
+ struct ti_sci_msg_rm_udmap_rx_ch_cfg req_rx = { 0 };
+ struct ti_sci_msg_rm_udmap_flow_cfg flow_req = { 0 };
+ u32 mode, fetch_size;
+ int ret = 0;
+
+ if (uc->pkt_mode) {
+ mode = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
+ fetch_size = cppi5_hdesc_calc_size(uc->needs_epib,
+ uc->psd_size, 0);
+ } else {
+ mode = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_PBRR;
+ fetch_size = sizeof(struct cppi5_desc_hdr_t);
+ }
+
+ req_rx.valid_params =
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_IGNORE_SHORT_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_IGNORE_LONG_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_START_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_CNT_VALID;
+
+ req_rx.nav_id = tisci_rm->tisci_dev_id;
+ req_rx.index = rchan->id;
+ req_rx.rx_fetch_size = fetch_size >> 2;
+ req_rx.rxcq_qnum = rx_ring;
+ req_rx.rx_pause_on_err = 0;
+ req_rx.rx_chan_type = mode;
+ req_rx.rx_ignore_short = 0;
+ req_rx.rx_ignore_long = 0;
+ req_rx.flowid_start = 0;
+ req_rx.flowid_cnt = 0;
+
+ ret = tisci_ops->rx_ch_cfg(tisci_rm->tisci, &req_rx);
+ if (ret) {
+ dev_err(ud->dev, "rchan%d cfg failed %d\n", rchan->id, ret);
+ return ret;
+ }
+
+ flow_req.valid_params =
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_EINFO_PRESENT_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_PSINFO_PRESENT_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_ERROR_HANDLING_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DESC_TYPE_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_SRC_TAG_HI_SEL_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_SRC_TAG_LO_SEL_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_TAG_HI_SEL_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_TAG_LO_SEL_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ0_SZ0_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ1_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ2_QNUM_VALID |
+ TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ3_QNUM_VALID;
+
+ flow_req.nav_id = tisci_rm->tisci_dev_id;
+ flow_req.flow_index = rchan->id;
+
+ if (uc->needs_epib)
+ flow_req.rx_einfo_present = 1;
+ else
+ flow_req.rx_einfo_present = 0;
+ if (uc->psd_size)
+ flow_req.rx_psinfo_present = 1;
+ else
+ flow_req.rx_psinfo_present = 0;
+ flow_req.rx_error_handling = 1;
+ flow_req.rx_desc_type = 0;
+ flow_req.rx_dest_qnum = rx_ring;
+ flow_req.rx_src_tag_hi_sel = 2;
+ flow_req.rx_src_tag_lo_sel = 4;
+ flow_req.rx_dest_tag_hi_sel = 5;
+ flow_req.rx_dest_tag_lo_sel = 4;
+ flow_req.rx_fdq0_sz0_qnum = fd_ring;
+ flow_req.rx_fdq1_qnum = fd_ring;
+ flow_req.rx_fdq2_qnum = fd_ring;
+ flow_req.rx_fdq3_qnum = fd_ring;
+
+ ret = tisci_ops->rx_flow_cfg(tisci_rm->tisci, &flow_req);
+
+ if (ret)
+ dev_err(ud->dev, "flow%d config failed: %d\n", rchan->id, ret);
+
+ return 0;
+}
+
+static int udma_alloc_chan_resources(struct dma_chan *chan)
+{
+ struct udma_chan *uc = to_udma_chan(chan);
+ struct udma_dev *ud = to_udma_dev(chan->device);
+ const struct udma_match_data *match_data = ud->match_data;
+ struct k3_ring *irq_ring;
+ u32 irq_udma_idx;
+ int ret;
+
+ if (uc->pkt_mode || uc->dir == DMA_MEM_TO_MEM) {
+ uc->use_dma_pool = true;
+ /* in case of MEM_TO_MEM we have maximum of two TRs */
+ if (uc->dir == DMA_MEM_TO_MEM) {
+ uc->hdesc_size = cppi5_trdesc_calc_size(
+ sizeof(struct cppi5_tr_type15_t), 2);
+ uc->pkt_mode = false;
+ }
+ }
+
+ if (uc->use_dma_pool) {
+ uc->hdesc_pool = dma_pool_create(uc->name, ud->ddev.dev,
+ uc->hdesc_size, ud->desc_align,
+ 0);
+ if (!uc->hdesc_pool) {
+ dev_err(ud->ddev.dev,
+ "Descriptor pool allocation failed\n");
+ uc->use_dma_pool = false;
+ return -ENOMEM;
+ }
+ }
+
+ /*
+ * Make sure that the completion is in a known state:
+ * No teardown, the channel is idle
+ */
+ reinit_completion(&uc->teardown_completed);
+ complete_all(&uc->teardown_completed);
+ uc->state = UDMA_CHAN_IS_IDLE;
+
+ switch (uc->dir) {
+ case DMA_MEM_TO_MEM:
+ /* Non synchronized - mem to mem type of transfer */
+ dev_dbg(uc->ud->dev, "%s: chan%d as MEM-to-MEM\n", __func__,
+ uc->id);
+
+ ret = udma_get_chan_pair(uc);
+ if (ret)
+ return ret;
+
+ ret = udma_alloc_tx_resources(uc);
+ if (ret)
+ return ret;
+
+ ret = udma_alloc_rx_resources(uc);
+ if (ret) {
+ udma_free_tx_resources(uc);
+ return ret;
+ }
+
+ uc->src_thread = ud->psil_base + uc->tchan->id;
+ uc->dst_thread = (ud->psil_base + uc->rchan->id) |
+ K3_PSIL_DST_THREAD_ID_OFFSET;
+
+ irq_ring = uc->tchan->tc_ring;
+ irq_udma_idx = uc->tchan->id;
+
+ ret = udma_tisci_m2m_channel_config(uc);
+ break;
+ case DMA_MEM_TO_DEV:
+ /* Slave transfer synchronized - mem to dev (TX) transfer */
+ dev_dbg(uc->ud->dev, "%s: chan%d as MEM-to-DEV\n", __func__,
+ uc->id);
+
+ ret = udma_alloc_tx_resources(uc);
+ if (ret) {
+ uc->remote_thread_id = -1;
+ return ret;
+ }
+
+ uc->src_thread = ud->psil_base + uc->tchan->id;
+ uc->dst_thread = uc->remote_thread_id;
+ uc->dst_thread |= K3_PSIL_DST_THREAD_ID_OFFSET;
+
+ irq_ring = uc->tchan->tc_ring;
+ irq_udma_idx = uc->tchan->id;
+
+ ret = udma_tisci_tx_channel_config(uc);
+ break;
+ case DMA_DEV_TO_MEM:
+ /* Slave transfer synchronized - dev to mem (RX) transfer */
+ dev_dbg(uc->ud->dev, "%s: chan%d as DEV-to-MEM\n", __func__,
+ uc->id);
+
+ ret = udma_alloc_rx_resources(uc);
+ if (ret) {
+ uc->remote_thread_id = -1;
+ return ret;
+ }
+
+ uc->src_thread = uc->remote_thread_id;
+ uc->dst_thread = (ud->psil_base + uc->rchan->id) |
+ K3_PSIL_DST_THREAD_ID_OFFSET;
+
+ irq_ring = uc->rchan->r_ring;
+ irq_udma_idx = match_data->rchan_oes_offset + uc->rchan->id;
+
+ ret = udma_tisci_rx_channel_config(uc);
+ break;
+ default:
+ /* Can not happen */
+ dev_err(uc->ud->dev, "%s: chan%d invalid direction (%u)\n",
+ __func__, uc->id, uc->dir);
+ return -EINVAL;
+ }
+
+ /* check if the channel configuration was successful */
+ if (ret)
+ goto err_res_free;
+
+ if (udma_is_chan_running(uc)) {
+ dev_warn(ud->dev, "chan%d: is running!\n", uc->id);
+ udma_stop(uc);
+ if (udma_is_chan_running(uc)) {
+ dev_err(ud->dev, "chan%d: won't stop!\n", uc->id);
+ goto err_res_free;
+ }
+ }
+
+ /* PSI-L pairing */
+ ret = navss_psil_pair(ud, uc->src_thread, uc->dst_thread);
+ if (ret) {
+ dev_err(ud->dev, "PSI-L pairing failed: 0x%04x -> 0x%04x\n",
+ uc->src_thread, uc->dst_thread);
+ goto err_res_free;
+ }
+
+ uc->psil_paired = true;
+
+ uc->irq_num_ring = k3_ringacc_get_ring_irq_num(irq_ring);
+ if (uc->irq_num_ring <= 0) {
+ dev_err(ud->dev, "Failed to get ring irq (index: %u)\n",
+ k3_ringacc_get_ring_id(irq_ring));
+ ret = -EINVAL;
+ goto err_psi_free;
+ }
+
+ ret = request_irq(uc->irq_num_ring, udma_ring_irq_handler,
+ IRQF_TRIGGER_HIGH, uc->name, uc);
+ if (ret) {
+ dev_err(ud->dev, "chan%d: ring irq request failed\n", uc->id);
+ goto err_irq_free;
+ }
+
+ /* Event from UDMA (TR events) only needed for slave TR mode channels */
+ if (is_slave_direction(uc->dir) && !uc->pkt_mode) {
+ uc->irq_num_udma = ti_sci_inta_msi_get_virq(ud->dev,
+ irq_udma_idx);
+ if (uc->irq_num_udma <= 0) {
+ dev_err(ud->dev, "Failed to get udma irq (index: %u)\n",
+ irq_udma_idx);
+ free_irq(uc->irq_num_ring, uc);
+ ret = -EINVAL;
+ goto err_irq_free;
+ }
+
+ ret = request_irq(uc->irq_num_udma, udma_udma_irq_handler, 0,
+ uc->name, uc);
+ if (ret) {
+ dev_err(ud->dev, "chan%d: UDMA irq request failed\n",
+ uc->id);
+ free_irq(uc->irq_num_ring, uc);
+ goto err_irq_free;
+ }
+ } else {
+ uc->irq_num_udma = 0;
+ }
+
+ udma_reset_rings(uc);
+
+ return 0;
+
+err_irq_free:
+ uc->irq_num_ring = 0;
+ uc->irq_num_udma = 0;
+err_psi_free:
+ navss_psil_unpair(ud, uc->src_thread, uc->dst_thread);
+ uc->psil_paired = false;
+err_res_free:
+ udma_free_tx_resources(uc);
+ udma_free_rx_resources(uc);
+
+ udma_reset_uchan(uc);
+
+ if (uc->use_dma_pool) {
+ dma_pool_destroy(uc->hdesc_pool);
+ uc->use_dma_pool = false;
+ }
+
+ return ret;
+}
+
+static void udma_free_chan_resources(struct dma_chan *chan)
+{
+ struct udma_chan *uc = to_udma_chan(chan);
+ struct udma_dev *ud = to_udma_dev(chan->device);
+
+ udma_terminate_all(chan);
+
+ if (uc->irq_num_ring > 0) {
+ free_irq(uc->irq_num_ring, uc);
+
+ uc->irq_num_ring = 0;
+ }
+ if (uc->irq_num_udma > 0) {
+ free_irq(uc->irq_num_udma, uc);
+
+ uc->irq_num_udma = 0;
+ }
+
+ /* Release PSI-L pairing */
+ if (uc->psil_paired) {
+ navss_psil_unpair(ud, uc->src_thread, uc->dst_thread);
+ uc->psil_paired = false;
+ }
+
+ vchan_free_chan_resources(&uc->vc);
+ tasklet_kill(&uc->vc.task);
+
+ udma_free_tx_resources(uc);
+ udma_free_rx_resources(uc);
+ udma_reset_uchan(uc);
+
+ if (uc->use_dma_pool) {
+ dma_pool_destroy(uc->hdesc_pool);
+ uc->use_dma_pool = false;
+ }
+}
+
static struct platform_driver udma_driver;

static bool udma_dma_filter_fn(struct dma_chan *chan, void *param)
--
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2019-11-01 08:46:01

by Peter Ujfalusi

[permalink] [raw]
Subject: [PATCH v4 10/15] dmaengine: ti: New driver for K3 UDMA - split#2: probe/remove, xlate and filter_fn

Split patch for review containing: module probe/remove functions, of_xlate
and filter_fn for slave channel requests.

DMA driver for
Texas Instruments K3 NAVSS Unified DMA – Peripheral Root Complex (UDMA-P)

The UDMA-P is intended to perform similar (but significantly upgraded) functions
as the packet-oriented DMA used on previous SoC devices. The UDMA-P module
supports the transmission and reception of various packet types. The UDMA-P is
architected to facilitate the segmentation and reassembly of SoC DMA data
structure compliant packets to/from smaller data blocks that are natively
compatible with the specific requirements of each connected peripheral. Multiple
Tx and Rx channels are provided within the DMA which allow multiple segmentation
or reassembly operations to be ongoing. The DMA controller maintains state
information for each of the channels which allows packet segmentation and
reassembly operations to be time division multiplexed between channels in order
to share the underlying DMA hardware. An external DMA scheduler is used to
control the ordering and rate at which this multiplexing occurs for Transmit
operations. The ordering and rate of Receive operations is indirectly controlled
by the order in which blocks are pushed into the DMA on the Rx PSI-L interface.

The UDMA-P also supports acting as both a UTC and UDMA-C for its internal
channels. Channels in the UDMA-P can be configured to be either Packet-Based or
Third-Party channels on a channel by channel basis.

The initial driver supports:
- MEM_TO_MEM (TR mode)
- DEV_TO_MEM (Packet / TR mode)
- MEM_TO_DEV (Packet / TR mode)
- Cyclic (Packet / TR mode)
- Metadata for descriptors
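
To illustrate how the of_xlate/filter_fn below are exercised, a (hypothetical)
client driver would request and configure such a slave channel through the
normal dmaengine API, roughly along these lines (the device structure, channel
name and FIFO address are made up for illustration):

/* Hypothetical client code, for illustration only */
#include <linux/device.h>
#include <linux/dmaengine.h>
#include <linux/err.h>

struct foo_dev {                        /* illustrative only */
        struct dma_chan *rx_chan;
        dma_addr_t fifo_phys;
};

static int foo_request_dma(struct foo_dev *fd, struct device *dev)
{
        struct dma_slave_config cfg = { 0 };
        int ret;

        /*
         * "rx" is resolved via the dmas/dma-names properties to a PSI-L
         * thread ID, which is then handled by udma_of_xlate() and
         * udma_dma_filter_fn() below.
         */
        fd->rx_chan = dma_request_chan(dev, "rx");
        if (IS_ERR(fd->rx_chan))
                return PTR_ERR(fd->rx_chan);

        cfg.direction = DMA_DEV_TO_MEM;
        cfg.src_addr = fd->fifo_phys;   /* peripheral FIFO, illustrative */
        cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
        cfg.src_maxburst = 1;

        ret = dmaengine_slave_config(fd->rx_chan, &cfg);
        if (ret)
                dma_release_channel(fd->rx_chan);

        return ret;
}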

Signed-off-by: Peter Ujfalusi <[email protected]>
---
drivers/dma/ti/k3-udma.c | 523 +++++++++++++++++++++++++++++++++++++++
1 file changed, 523 insertions(+)

diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
index 5ef715ca73a2..e38c780cd20d 100644
--- a/drivers/dma/ti/k3-udma.c
+++ b/drivers/dma/ti/k3-udma.c
@@ -1049,3 +1049,526 @@ static irqreturn_t udma_udma_irq_handler(int irq, void *data)

return IRQ_HANDLED;
}
+
+static struct platform_driver udma_driver;
+
+static bool udma_dma_filter_fn(struct dma_chan *chan, void *param)
+{
+ struct psil_endpoint_config *ep_config;
+ struct udma_chan *uc;
+ struct udma_dev *ud;
+ u32 *args;
+
+ if (chan->device->dev->driver != &udma_driver.driver)
+ return false;
+
+ uc = to_udma_chan(chan);
+ ud = uc->ud;
+ args = param;
+ uc->remote_thread_id = args[0];
+
+ if (uc->remote_thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)
+ uc->dir = DMA_MEM_TO_DEV;
+ else
+ uc->dir = DMA_DEV_TO_MEM;
+
+ ep_config = psil_get_ep_config(uc->remote_thread_id);
+ if (IS_ERR(ep_config)) {
+ dev_err(ud->dev, "No configuration for psi-l thread 0x%04x\n",
+ uc->remote_thread_id);
+ uc->dir = DMA_MEM_TO_MEM;
+ uc->remote_thread_id = -1;
+ return false;
+ }
+
+ uc->pkt_mode = ep_config->pkt_mode;
+ uc->channel_tpl = ep_config->channel_tpl;
+ uc->notdpkt = ep_config->notdpkt;
+ uc->ep_type = ep_config->ep_type;
+
+ if (uc->ep_type != PSIL_EP_NATIVE) {
+ const struct udma_match_data *match_data = ud->match_data;
+
+ if (match_data->have_acc32)
+ uc->enable_acc32 = ep_config->pdma_acc32;
+ if (match_data->have_burst)
+ uc->enable_burst = ep_config->pdma_burst;
+ }
+
+ uc->needs_epib = ep_config->needs_epib;
+ uc->psd_size = ep_config->psd_size;
+ uc->metadata_size = (uc->needs_epib ? CPPI5_INFO0_HDESC_EPIB_SIZE : 0) +
+ uc->psd_size;
+
+ if (uc->pkt_mode)
+ uc->hdesc_size = ALIGN(sizeof(struct cppi5_host_desc_t) +
+ uc->metadata_size, ud->desc_align);
+
+ dev_dbg(ud->dev, "chan%d: Remote thread: 0x%04x (%s)\n", uc->id,
+ uc->remote_thread_id, udma_get_dir_text(uc->dir));
+
+ return true;
+}
+
+static struct dma_chan *udma_of_xlate(struct of_phandle_args *dma_spec,
+ struct of_dma *ofdma)
+{
+ struct udma_dev *ud = ofdma->of_dma_data;
+ dma_cap_mask_t mask = ud->ddev.cap_mask;
+ struct dma_chan *chan;
+
+ if (dma_spec->args_count != 1)
+ return NULL;
+
+ chan = __dma_request_channel(&mask, udma_dma_filter_fn,
+ &dma_spec->args[0], ofdma->of_node);
+ if (!chan) {
+ dev_err(ud->dev, "get channel fail in %s.\n", __func__);
+ return ERR_PTR(-EINVAL);
+ }
+
+ return chan;
+}
+
+static struct udma_match_data am654_main_data = {
+ .psil_base = 0x1000,
+ .enable_memcpy_support = true,
+ .have_acc32 = false,
+ .have_burst = false,
+ .statictr_z_mask = GENMASK(11, 0),
+ .rchan_oes_offset = 0x2000,
+ .tpl_levels = 2,
+ .level_start_idx = {
+ [0] = 8, /* Normal channels */
+ [1] = 0, /* High Throughput channels */
+ },
+};
+
+static struct udma_match_data am654_mcu_data = {
+ .psil_base = 0x6000,
+ .enable_memcpy_support = false, /* MEM_TO_MEM is slow via MCU UDMA */
+ .have_acc32 = false,
+ .have_burst = false,
+ .statictr_z_mask = GENMASK(11, 0),
+ .rchan_oes_offset = 0x2000,
+ .tpl_levels = 2,
+ .level_start_idx = {
+ [0] = 2, /* Normal channels */
+ [1] = 0, /* High Throughput channels */
+ },
+};
+
+static struct udma_match_data j721e_main_data = {
+ .psil_base = 0x1000,
+ .enable_memcpy_support = true,
+ .have_acc32 = true,
+ .have_burst = true,
+ .statictr_z_mask = GENMASK(23, 0),
+ .rchan_oes_offset = 0x400,
+ .tpl_levels = 3,
+ .level_start_idx = {
+ [0] = 16, /* Normal channels */
+ [1] = 4, /* High Throughput channels */
+ [2] = 0, /* Ultra High Throughput channels */
+ },
+};
+
+static struct udma_match_data j721e_mcu_data = {
+ .psil_base = 0x6000,
+ .enable_memcpy_support = false, /* MEM_TO_MEM is slow via MCU UDMA */
+ .have_acc32 = true,
+ .have_burst = true,
+ .statictr_z_mask = GENMASK(23, 0),
+ .rchan_oes_offset = 0x400,
+ .tpl_levels = 2,
+ .level_start_idx = {
+ [0] = 2, /* Normal channels */
+ [1] = 0, /* High Throughput channels */
+ },
+};
+
+static const struct of_device_id udma_of_match[] = {
+ {
+ .compatible = "ti,am654-navss-main-udmap",
+ .data = &am654_main_data,
+ },
+ {
+ .compatible = "ti,am654-navss-mcu-udmap",
+ .data = &am654_mcu_data,
+ }, {
+ .compatible = "ti,j721e-navss-main-udmap",
+ .data = &j721e_main_data,
+ }, {
+ .compatible = "ti,j721e-navss-mcu-udmap",
+ .data = &j721e_mcu_data,
+ },
+ { /* Sentinel */ },
+};
+MODULE_DEVICE_TABLE(of, udma_of_match);
+
+static int udma_get_mmrs(struct platform_device *pdev, struct udma_dev *ud)
+{
+ struct resource *res;
+ int i;
+
+ for (i = 0; i < MMR_LAST; i++) {
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
+ mmr_names[i]);
+ ud->mmrs[i] = devm_ioremap_resource(&pdev->dev, res);
+ if (IS_ERR(ud->mmrs[i]))
+ return PTR_ERR(ud->mmrs[i]);
+ }
+
+ return 0;
+}
+
+static int udma_setup_resources(struct udma_dev *ud)
+{
+ struct device *dev = ud->dev;
+ int ch_count, ret, i, j;
+ u32 cap2, cap3;
+ struct ti_sci_resource_desc *rm_desc;
+ struct ti_sci_resource *rm_res, irq_res;
+ struct udma_tisci_rm *tisci_rm = &ud->tisci_rm;
+ static const char * const range_names[] = { "ti,sci-rm-range-tchan",
+ "ti,sci-rm-range-rchan",
+ "ti,sci-rm-range-rflow" };
+
+ cap2 = udma_read(ud->mmrs[MMR_GCFG], 0x28);
+ cap3 = udma_read(ud->mmrs[MMR_GCFG], 0x2c);
+
+ ud->rflow_cnt = cap3 & 0x3fff;
+ ud->tchan_cnt = cap2 & 0x1ff;
+ ud->echan_cnt = (cap2 >> 9) & 0x1ff;
+ ud->rchan_cnt = (cap2 >> 18) & 0x1ff;
+ ch_count = ud->tchan_cnt + ud->rchan_cnt;
+
+ ud->tchan_map = devm_kmalloc_array(dev, BITS_TO_LONGS(ud->tchan_cnt),
+ sizeof(unsigned long), GFP_KERNEL);
+ ud->tchans = devm_kcalloc(dev, ud->tchan_cnt, sizeof(*ud->tchans),
+ GFP_KERNEL);
+ ud->rchan_map = devm_kmalloc_array(dev, BITS_TO_LONGS(ud->rchan_cnt),
+ sizeof(unsigned long), GFP_KERNEL);
+ ud->rchans = devm_kcalloc(dev, ud->rchan_cnt, sizeof(*ud->rchans),
+ GFP_KERNEL);
+ ud->rflow_gp_map = devm_kmalloc_array(dev, BITS_TO_LONGS(ud->rflow_cnt),
+ sizeof(unsigned long),
+ GFP_KERNEL);
+ ud->rflow_gp_map_allocated = devm_kcalloc(dev,
+ BITS_TO_LONGS(ud->rflow_cnt),
+ sizeof(unsigned long),
+ GFP_KERNEL);
+ ud->rflow_in_use = devm_kcalloc(dev, BITS_TO_LONGS(ud->rflow_cnt),
+ sizeof(unsigned long),
+ GFP_KERNEL);
+ ud->rflows = devm_kcalloc(dev, ud->rflow_cnt, sizeof(*ud->rflows),
+ GFP_KERNEL);
+
+ if (!ud->tchan_map || !ud->rchan_map || !ud->rflow_gp_map ||
+ !ud->rflow_gp_map_allocated || !ud->tchans || !ud->rchans ||
+ !ud->rflows || !ud->rflow_in_use)
+ return -ENOMEM;
+
+ /*
+ * RX flows with the same Ids as RX channels are reserved to be used
+ * as default flows if remote HW can't generate flow_ids. Those
+ * RX flows can be requested only explicitly by id.
+ */
+ bitmap_set(ud->rflow_gp_map_allocated, 0, ud->rchan_cnt);
+
+ /* by default no GP rflows are assigned to Linux */
+ bitmap_set(ud->rflow_gp_map, 0, ud->rflow_cnt);
+
+ /* Get resource ranges from tisci */
+ for (i = 0; i < RM_RANGE_LAST; i++)
+ tisci_rm->rm_ranges[i] =
+ devm_ti_sci_get_of_resource(tisci_rm->tisci, dev,
+ tisci_rm->tisci_dev_id,
+ (char *)range_names[i]);
+
+ /* tchan ranges */
+ rm_res = tisci_rm->rm_ranges[RM_RANGE_TCHAN];
+ if (IS_ERR(rm_res)) {
+ bitmap_zero(ud->tchan_map, ud->tchan_cnt);
+ } else {
+ bitmap_fill(ud->tchan_map, ud->tchan_cnt);
+ for (i = 0; i < rm_res->sets; i++) {
+ rm_desc = &rm_res->desc[i];
+ bitmap_clear(ud->tchan_map, rm_desc->start,
+ rm_desc->num);
+ dev_dbg(dev, "ti-sci-res: tchan: %d:%d\n",
+ rm_desc->start, rm_desc->num);
+ }
+ }
+ irq_res.sets = rm_res->sets;
+
+ /* rchan and matching default flow ranges */
+ rm_res = tisci_rm->rm_ranges[RM_RANGE_RCHAN];
+ if (IS_ERR(rm_res)) {
+ bitmap_zero(ud->rchan_map, ud->rchan_cnt);
+ } else {
+ bitmap_fill(ud->rchan_map, ud->rchan_cnt);
+ for (i = 0; i < rm_res->sets; i++) {
+ rm_desc = &rm_res->desc[i];
+ bitmap_clear(ud->rchan_map, rm_desc->start,
+ rm_desc->num);
+ dev_dbg(dev, "ti-sci-res: rchan: %d:%d\n",
+ rm_desc->start, rm_desc->num);
+ }
+ }
+
+ irq_res.sets += rm_res->sets;
+ irq_res.desc = kcalloc(irq_res.sets, sizeof(*irq_res.desc), GFP_KERNEL);
+ rm_res = tisci_rm->rm_ranges[RM_RANGE_TCHAN];
+ for (i = 0; i < rm_res->sets; i++) {
+ irq_res.desc[i].start = rm_res->desc[i].start;
+ irq_res.desc[i].num = rm_res->desc[i].num;
+ }
+ rm_res = tisci_rm->rm_ranges[RM_RANGE_RCHAN];
+ for (j = 0; j < rm_res->sets; j++, i++) {
+ irq_res.desc[i].start = rm_res->desc[j].start +
+ ud->match_data->rchan_oes_offset;
+ irq_res.desc[i].num = rm_res->desc[j].num;
+ }
+ ret = ti_sci_inta_msi_domain_alloc_irqs(ud->dev, &irq_res);
+ kfree(irq_res.desc);
+ if (ret) {
+ dev_err(ud->dev, "Failed to allocate MSI interrupts\n");
+ return ret;
+ }
+
+ /* GP rflow ranges */
+ rm_res = tisci_rm->rm_ranges[RM_RANGE_RFLOW];
+ if (IS_ERR(rm_res)) {
+ /* all gp flows are assigned exclusively to Linux */
+ bitmap_clear(ud->rflow_gp_map, ud->rchan_cnt,
+ ud->rflow_cnt - ud->rchan_cnt);
+ } else {
+ for (i = 0; i < rm_res->sets; i++) {
+ rm_desc = &rm_res->desc[i];
+ bitmap_clear(ud->rflow_gp_map, rm_desc->start,
+ rm_desc->num);
+ dev_dbg(dev, "ti-sci-res: rflow: %d:%d\n",
+ rm_desc->start, rm_desc->num);
+ }
+ }
+
+ ch_count -= bitmap_weight(ud->tchan_map, ud->tchan_cnt);
+ ch_count -= bitmap_weight(ud->rchan_map, ud->rchan_cnt);
+ if (!ch_count)
+ return -ENODEV;
+
+ ud->channels = devm_kcalloc(dev, ch_count, sizeof(*ud->channels),
+ GFP_KERNEL);
+ if (!ud->channels)
+ return -ENOMEM;
+
+ dev_info(dev, "Channels: %d (tchan: %u, rchan: %u, gp-rflow: %u)\n",
+ ch_count,
+ ud->tchan_cnt - bitmap_weight(ud->tchan_map, ud->tchan_cnt),
+ ud->rchan_cnt - bitmap_weight(ud->rchan_map, ud->rchan_cnt),
+ ud->rflow_cnt - bitmap_weight(ud->rflow_gp_map,
+ ud->rflow_cnt));
+
+ return ch_count;
+}
+
+#define TI_UDMAC_BUSWIDTHS (BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) | \
+ BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) | \
+ BIT(DMA_SLAVE_BUSWIDTH_3_BYTES) | \
+ BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) | \
+ BIT(DMA_SLAVE_BUSWIDTH_8_BYTES))
+
+static int udma_probe(struct platform_device *pdev)
+{
+ struct device_node *navss_node = pdev->dev.parent->of_node;
+ struct device *dev = &pdev->dev;
+ struct udma_dev *ud;
+ const struct of_device_id *match;
+ int i, ret;
+ int ch_count;
+
+ ret = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(48));
+ if (ret)
+ dev_err(dev, "failed to set dma mask\n");
+
+ ud = devm_kzalloc(dev, sizeof(*ud), GFP_KERNEL);
+ if (!ud)
+ return -ENOMEM;
+
+ ret = udma_get_mmrs(pdev, ud);
+ if (ret)
+ return ret;
+
+ ud->tisci_rm.tisci = ti_sci_get_by_phandle(dev->of_node, "ti,sci");
+ if (IS_ERR(ud->tisci_rm.tisci))
+ return PTR_ERR(ud->tisci_rm.tisci);
+
+ ret = of_property_read_u32(dev->of_node, "ti,sci-dev-id",
+ &ud->tisci_rm.tisci_dev_id);
+ if (ret) {
+ dev_err(dev, "ti,sci-dev-id read failure %d\n", ret);
+ return ret;
+ }
+ pdev->id = ud->tisci_rm.tisci_dev_id;
+
+ ret = of_property_read_u32(navss_node, "ti,sci-dev-id",
+ &ud->tisci_rm.tisci_navss_dev_id);
+ if (ret) {
+ dev_err(dev, "NAVSS ti,sci-dev-id read failure %d\n", ret);
+ return ret;
+ }
+
+ ud->tisci_rm.tisci_udmap_ops = &ud->tisci_rm.tisci->ops.rm_udmap_ops;
+ ud->tisci_rm.tisci_psil_ops = &ud->tisci_rm.tisci->ops.rm_psil_ops;
+
+ ud->ringacc = of_k3_ringacc_get_by_phandle(dev->of_node, "ti,ringacc");
+ if (IS_ERR(ud->ringacc))
+ return PTR_ERR(ud->ringacc);
+
+ dev->msi_domain = of_msi_get_domain(dev, dev->of_node,
+ DOMAIN_BUS_TI_SCI_INTA_MSI);
+ if (!dev->msi_domain) {
+ dev_err(dev, "Failed to get MSI domain\n");
+ return -EPROBE_DEFER;
+ }
+
+ match = of_match_node(udma_of_match, dev->of_node);
+ if (!match) {
+ dev_err(dev, "No compatible match found\n");
+ return -ENODEV;
+ }
+ ud->match_data = match->data;
+
+ dma_cap_set(DMA_SLAVE, ud->ddev.cap_mask);
+ dma_cap_set(DMA_CYCLIC, ud->ddev.cap_mask);
+
+ ud->ddev.device_alloc_chan_resources = udma_alloc_chan_resources;
+ ud->ddev.device_config = udma_slave_config;
+ ud->ddev.device_prep_slave_sg = udma_prep_slave_sg;
+ ud->ddev.device_prep_dma_cyclic = udma_prep_dma_cyclic;
+ ud->ddev.device_issue_pending = udma_issue_pending;
+ ud->ddev.device_tx_status = udma_tx_status;
+ ud->ddev.device_pause = udma_pause;
+ ud->ddev.device_resume = udma_resume;
+ ud->ddev.device_terminate_all = udma_terminate_all;
+ ud->ddev.device_synchronize = udma_synchronize;
+
+ ud->ddev.device_free_chan_resources = udma_free_chan_resources;
+ ud->ddev.src_addr_widths = TI_UDMAC_BUSWIDTHS;
+ ud->ddev.dst_addr_widths = TI_UDMAC_BUSWIDTHS;
+ ud->ddev.directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV);
+ ud->ddev.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
+ ud->ddev.copy_align = DMAENGINE_ALIGN_8_BYTES;
+ ud->ddev.desc_metadata_modes = DESC_METADATA_CLIENT |
+ DESC_METADATA_ENGINE;
+ if (ud->match_data->enable_memcpy_support) {
+ dma_cap_set(DMA_MEMCPY, ud->ddev.cap_mask);
+ ud->ddev.device_prep_dma_memcpy = udma_prep_dma_memcpy;
+ ud->ddev.directions |= BIT(DMA_MEM_TO_MEM);
+ }
+
+ ud->ddev.dev = dev;
+ ud->dev = dev;
+ ud->psil_base = ud->match_data->psil_base;
+
+ INIT_LIST_HEAD(&ud->ddev.channels);
+ INIT_LIST_HEAD(&ud->desc_to_purge);
+
+ ch_count = udma_setup_resources(ud);
+ if (ch_count <= 0)
+ return ch_count;
+
+ spin_lock_init(&ud->lock);
+ INIT_WORK(&ud->purge_work, udma_purge_desc_work);
+
+ ud->desc_align = 64;
+ if (ud->desc_align < dma_get_cache_alignment())
+ ud->desc_align = dma_get_cache_alignment();
+
+ for (i = 0; i < ud->tchan_cnt; i++) {
+ struct udma_tchan *tchan = &ud->tchans[i];
+
+ tchan->id = i;
+ tchan->reg_rt = ud->mmrs[MMR_TCHANRT] + i * 0x1000;
+ }
+
+ for (i = 0; i < ud->rchan_cnt; i++) {
+ struct udma_rchan *rchan = &ud->rchans[i];
+
+ rchan->id = i;
+ rchan->reg_rt = ud->mmrs[MMR_RCHANRT] + i * 0x1000;
+ }
+
+ for (i = 0; i < ud->rflow_cnt; i++) {
+ struct udma_rflow *rflow = &ud->rflows[i];
+
+ rflow->id = i;
+ }
+
+ for (i = 0; i < ch_count; i++) {
+ struct udma_chan *uc = &ud->channels[i];
+
+ uc->ud = ud;
+ uc->vc.desc_free = udma_desc_free;
+ uc->id = i;
+ uc->remote_thread_id = -1;
+ uc->tchan = NULL;
+ uc->rchan = NULL;
+ uc->dir = DMA_MEM_TO_MEM;
+ uc->name = devm_kasprintf(dev, GFP_KERNEL, "%s chan%d",
+ dev_name(dev), i);
+
+ vchan_init(&uc->vc, &ud->ddev);
+ /* Use custom vchan completion handling */
+ tasklet_init(&uc->vc.task, udma_vchan_complete,
+ (unsigned long)&uc->vc);
+ init_completion(&uc->teardown_completed);
+ }
+
+ ret = dma_async_device_register(&ud->ddev);
+ if (ret) {
+ dev_err(dev, "failed to register slave DMA engine: %d\n", ret);
+ return ret;
+ }
+
+ platform_set_drvdata(pdev, ud);
+
+ ret = of_dma_controller_register(dev->of_node, udma_of_xlate, ud);
+ if (ret) {
+ dev_err(dev, "failed to register of_dma controller\n");
+ dma_async_device_unregister(&ud->ddev);
+ }
+
+ return ret;
+}
+
+static int udma_remove(struct platform_device *pdev)
+{
+ struct udma_dev *ud = platform_get_drvdata(pdev);
+
+ of_dma_controller_free(pdev->dev.of_node);
+ dma_async_device_unregister(&ud->ddev);
+
+ /* Make sure that we did proper cleanup */
+ cancel_work_sync(&ud->purge_work);
+ udma_purge_desc_work(&ud->purge_work);
+
+ return 0;
+}
+
+static struct platform_driver udma_driver = {
+ .driver = {
+ .name = "ti-udma",
+ .of_match_table = udma_of_match,
+ },
+ .probe = udma_probe,
+ .remove = udma_remove,
+};
+
+module_platform_driver(udma_driver);
+
+MODULE_ALIAS("platform:ti-udma");
+MODULE_DESCRIPTION("TI K3 DMA driver for CPPI 5.0 compliant devices");
+MODULE_AUTHOR("Peter Ujfalusi <[email protected]>");
+MODULE_LICENSE("GPL v2");
--
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2019-11-01 09:35:55

by Peter Ujfalusi

[permalink] [raw]
Subject: [PATCH v4 14/15] dmaengine: ti: New driver for K3 UDMA - split#6: Kconfig and Makefile

Split patch for review containing:
Kconfig and Makefile changes

DMA driver for
Texas Instruments K3 NAVSS Unified DMA – Peripheral Root Complex (UDMA-P)

The UDMA-P is intended to perform similar (but significantly upgraded) functions
as the packet-oriented DMA used on previous SoC devices. The UDMA-P module
supports the transmission and reception of various packet types. The UDMA-P is
architected to facilitate the segmentation and reassembly of SoC DMA data
structure compliant packets to/from smaller data blocks that are natively
compatible with the specific requirements of each connected peripheral. Multiple
Tx and Rx channels are provided within the DMA which allow multiple segmentation
or reassembly operations to be ongoing. The DMA controller maintains state
information for each of the channels which allows packet segmentation and
reassembly operations to be time division multiplexed between channels in order
to share the underlying DMA hardware. An external DMA scheduler is used to
control the ordering and rate at which this multiplexing occurs for Transmit
operations. The ordering and rate of Receive operations is indirectly controlled
by the order in which blocks are pushed into the DMA on the Rx PSI-L interface.

The UDMA-P also supports acting as both a UTC and UDMA-C for its internal
channels. Channels in the UDMA-P can be configured to be either Packet-Based or
Third-Party channels on a channel by channel basis.

The initial driver supports:
- MEM_TO_MEM (TR mode)
- DEV_TO_MEM (Packet / TR mode)
- MEM_TO_DEV (Packet / TR mode)
- Cyclic (Packet / TR mode)
- Metadata for descriptors
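
Since TI_K3_UDMA selects its dmaengine framework and K3 helper dependencies
itself, a defconfig that wants the driver built in only needs roughly the
following fragment (a sketch; the option names are the ones from the hunk
below):

# TI_K3_UDMA selects DMA_ENGINE, DMA_VIRTUAL_CHANNELS,
# TI_K3_RINGACC and TI_K3_PSIL by itself
CONFIG_ARCH_K3=y
CONFIG_TI_SCI_PROTOCOL=y
CONFIG_TI_SCI_INTA_IRQCHIP=y
CONFIG_TI_K3_UDMA=y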

Signed-off-by: Peter Ujfalusi <[email protected]>
---
drivers/dma/ti/Kconfig | 14 ++++++++++++++
drivers/dma/ti/Makefile | 1 +
2 files changed, 15 insertions(+)

diff --git a/drivers/dma/ti/Kconfig b/drivers/dma/ti/Kconfig
index 72f3d2728178..04c98e215ba6 100644
--- a/drivers/dma/ti/Kconfig
+++ b/drivers/dma/ti/Kconfig
@@ -34,6 +34,20 @@ config DMA_OMAP
Enable support for the TI sDMA (System DMA or DMA4) controller. This
DMA engine is found on OMAP and DRA7xx parts.

+config TI_K3_UDMA
+ tristate "Texas Instruments UDMA support"
+ depends on ARCH_K3 || COMPILE_TEST
+ depends on TI_SCI_PROTOCOL
+ depends on TI_SCI_INTA_IRQCHIP
+ select DMA_ENGINE
+ select DMA_VIRTUAL_CHANNELS
+ select TI_K3_RINGACC
+ select TI_K3_PSIL
+ default y
+ help
+ Enable support for the TI UDMA (Unified DMA) controller. This
+ DMA engine is used in AM65x and J721E SoCs.
+
config TI_K3_PSIL
bool

diff --git a/drivers/dma/ti/Makefile b/drivers/dma/ti/Makefile
index f8d912ad7eaf..9d787f009195 100644
--- a/drivers/dma/ti/Makefile
+++ b/drivers/dma/ti/Makefile
@@ -2,5 +2,6 @@
obj-$(CONFIG_TI_CPPI41) += cppi41.o
obj-$(CONFIG_TI_EDMA) += edma.o
obj-$(CONFIG_DMA_OMAP) += omap-dma.o
+obj-$(CONFIG_TI_K3_UDMA) += k3-udma.o
obj-$(CONFIG_TI_K3_PSIL) += k3-psil.o k3-psil-am654.o k3-psil-j721e.o
obj-$(CONFIG_TI_DMA_CROSSBAR) += dma-crossbar.o
--
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2019-11-01 09:36:04

by Peter Ujfalusi

[permalink] [raw]
Subject: [PATCH v4 07/15] dmaengine: ti: k3 PSI-L remote endpoint configuration

In K3 architecture the DMA operates within threads. One end of the thread
is UDMAP, the other is on the peripheral side.

The UDMAP channel configuration depends on the needs of the remote
endpoint and it can differ from peripheral to peripheral.

This patch adds a database for am654 and j721e and a small API to fetch the
PSI-L endpoint configuration from the database, which should only be used by
the DMA driver(s).

Another API is added for native peripherals to allow them to pass a new
configuration for the threads they are using, which is needed to handle
changes caused, for example, by different firmware loaded for the peripheral.
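
To illustrate the second API, a (hypothetical) native peripheral driver could
update the configuration of one of its threads after loading new firmware
roughly like this (the device, the "tx0" dma-names entry and the sizes are
made up; only the API and structure added below are real):

#include <linux/device.h>
#include <linux/dma/k3-psil.h>

/* Hypothetical native peripheral updating its TX thread after a firmware
 * change; "tx0" must match an entry in the device's dma-names property.
 */
static int foo_update_psil_config(struct device *dev)
{
        struct psil_endpoint_config cfg = {
                .ep_type = PSIL_EP_NATIVE,
                .pkt_mode = 1,
                .needs_epib = 1,
                .psd_size = 32,         /* firmware specific, illustrative */
        };

        return psil_set_new_ep_config(dev, "tx0", &cfg);
}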

Signed-off-by: Peter Ujfalusi <[email protected]>
---
drivers/dma/ti/Kconfig | 3 +
drivers/dma/ti/Makefile | 1 +
drivers/dma/ti/k3-psil-am654.c | 172 ++++++++++++++++++++++++++
drivers/dma/ti/k3-psil-j721e.c | 219 +++++++++++++++++++++++++++++++++
drivers/dma/ti/k3-psil-priv.h | 39 ++++++
drivers/dma/ti/k3-psil.c | 97 +++++++++++++++
include/linux/dma/k3-psil.h | 47 +++++++
7 files changed, 578 insertions(+)
create mode 100644 drivers/dma/ti/k3-psil-am654.c
create mode 100644 drivers/dma/ti/k3-psil-j721e.c
create mode 100644 drivers/dma/ti/k3-psil-priv.h
create mode 100644 drivers/dma/ti/k3-psil.c
create mode 100644 include/linux/dma/k3-psil.h

diff --git a/drivers/dma/ti/Kconfig b/drivers/dma/ti/Kconfig
index d507c24fbf31..72f3d2728178 100644
--- a/drivers/dma/ti/Kconfig
+++ b/drivers/dma/ti/Kconfig
@@ -34,5 +34,8 @@ config DMA_OMAP
Enable support for the TI sDMA (System DMA or DMA4) controller. This
DMA engine is found on OMAP and DRA7xx parts.

+config TI_K3_PSIL
+ bool
+
config TI_DMA_CROSSBAR
bool
diff --git a/drivers/dma/ti/Makefile b/drivers/dma/ti/Makefile
index 113e59ec9c32..f8d912ad7eaf 100644
--- a/drivers/dma/ti/Makefile
+++ b/drivers/dma/ti/Makefile
@@ -2,4 +2,5 @@
obj-$(CONFIG_TI_CPPI41) += cppi41.o
obj-$(CONFIG_TI_EDMA) += edma.o
obj-$(CONFIG_DMA_OMAP) += omap-dma.o
+obj-$(CONFIG_TI_K3_PSIL) += k3-psil.o k3-psil-am654.o k3-psil-j721e.o
obj-$(CONFIG_TI_DMA_CROSSBAR) += dma-crossbar.o
diff --git a/drivers/dma/ti/k3-psil-am654.c b/drivers/dma/ti/k3-psil-am654.c
new file mode 100644
index 000000000000..edd7fff36f44
--- /dev/null
+++ b/drivers/dma/ti/k3-psil-am654.c
@@ -0,0 +1,172 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
+ * Author: Peter Ujfalusi <[email protected]>
+ */
+
+#include <linux/kernel.h>
+
+#include "k3-psil-priv.h"
+
+#define PSIL_PDMA_XY_TR(x) \
+ { \
+ .thread_id = x, \
+ .ep_config = { \
+ .ep_type = PSIL_EP_PDMA_XY, \
+ }, \
+ }
+
+#define PSIL_PDMA_XY_PKT(x) \
+ { \
+ .thread_id = x, \
+ .ep_config = { \
+ .ep_type = PSIL_EP_PDMA_XY, \
+ .pkt_mode = 1, \
+ }, \
+ }
+
+#define PSIL_ETHERNET(x) \
+ { \
+ .thread_id = x, \
+ .ep_config = { \
+ .ep_type = PSIL_EP_NATIVE, \
+ .pkt_mode = 1, \
+ .needs_epib = 1, \
+ .psd_size = 16, \
+ }, \
+ }
+
+#define PSIL_SA2UL(x, tx) \
+ { \
+ .thread_id = x, \
+ .ep_config = { \
+ .ep_type = PSIL_EP_NATIVE, \
+ .pkt_mode = 1, \
+ .needs_epib = 1, \
+ .psd_size = 64, \
+ .notdpkt = tx, \
+ }, \
+ }
+
+/* PSI-L source thread IDs, used for RX (DMA_DEV_TO_MEM) */
+struct psil_ep am654_src_ep_map[] = {
+ /* SA2UL */
+ PSIL_SA2UL(0x4000, 0),
+ PSIL_SA2UL(0x4001, 0),
+ /* PRU_ICSSG0 */
+ PSIL_ETHERNET(0x4100),
+ PSIL_ETHERNET(0x4101),
+ PSIL_ETHERNET(0x4102),
+ PSIL_ETHERNET(0x4103),
+ /* PRU_ICSSG1 */
+ PSIL_ETHERNET(0x4200),
+ PSIL_ETHERNET(0x4201),
+ PSIL_ETHERNET(0x4202),
+ PSIL_ETHERNET(0x4203),
+ /* PRU_ICSSG2 */
+ PSIL_ETHERNET(0x4300),
+ PSIL_ETHERNET(0x4301),
+ PSIL_ETHERNET(0x4302),
+ PSIL_ETHERNET(0x4303),
+ /* PDMA0 - McASPs */
+ PSIL_PDMA_XY_TR(0x4400),
+ PSIL_PDMA_XY_TR(0x4401),
+ PSIL_PDMA_XY_TR(0x4402),
+ /* PDMA1 - SPI0-4 */
+ PSIL_PDMA_XY_PKT(0x4500),
+ PSIL_PDMA_XY_PKT(0x4501),
+ PSIL_PDMA_XY_PKT(0x4502),
+ PSIL_PDMA_XY_PKT(0x4503),
+ PSIL_PDMA_XY_PKT(0x4504),
+ PSIL_PDMA_XY_PKT(0x4505),
+ PSIL_PDMA_XY_PKT(0x4506),
+ PSIL_PDMA_XY_PKT(0x4507),
+ PSIL_PDMA_XY_PKT(0x4508),
+ PSIL_PDMA_XY_PKT(0x4509),
+ PSIL_PDMA_XY_PKT(0x450a),
+ PSIL_PDMA_XY_PKT(0x450b),
+ PSIL_PDMA_XY_PKT(0x450c),
+ PSIL_PDMA_XY_PKT(0x450d),
+ PSIL_PDMA_XY_PKT(0x450e),
+ PSIL_PDMA_XY_PKT(0x450f),
+ PSIL_PDMA_XY_PKT(0x4510),
+ PSIL_PDMA_XY_PKT(0x4511),
+ PSIL_PDMA_XY_PKT(0x4512),
+ PSIL_PDMA_XY_PKT(0x4513),
+ /* PDMA1 - USART0-2 */
+ PSIL_PDMA_XY_PKT(0x4514),
+ PSIL_PDMA_XY_PKT(0x4515),
+ PSIL_PDMA_XY_PKT(0x4516),
+ /* CPSW0 */
+ PSIL_ETHERNET(0x7000),
+ /* MCU_PDMA0 - ADCs */
+ PSIL_PDMA_XY_TR(0x7100),
+ PSIL_PDMA_XY_TR(0x7101),
+ PSIL_PDMA_XY_TR(0x7102),
+ PSIL_PDMA_XY_TR(0x7103),
+ /* MCU_PDMA1 - MCU_SPI0-2 */
+ PSIL_PDMA_XY_PKT(0x7200),
+ PSIL_PDMA_XY_PKT(0x7201),
+ PSIL_PDMA_XY_PKT(0x7202),
+ PSIL_PDMA_XY_PKT(0x7203),
+ PSIL_PDMA_XY_PKT(0x7204),
+ PSIL_PDMA_XY_PKT(0x7205),
+ PSIL_PDMA_XY_PKT(0x7206),
+ PSIL_PDMA_XY_PKT(0x7207),
+ PSIL_PDMA_XY_PKT(0x7208),
+ PSIL_PDMA_XY_PKT(0x7209),
+ PSIL_PDMA_XY_PKT(0x720a),
+ PSIL_PDMA_XY_PKT(0x720b),
+ /* MCU_PDMA1 - MCU_USART0 */
+ PSIL_PDMA_XY_PKT(0x7212),
+};
+
+/* PSI-L destination thread IDs, used for TX (DMA_MEM_TO_DEV) */
+struct psil_ep am654_dst_ep_map[] = {
+ /* SA2UL */
+ PSIL_SA2UL(0xc000, 1),
+ /* PRU_ICSSG0 */
+ PSIL_ETHERNET(0xc100),
+ PSIL_ETHERNET(0xc101),
+ PSIL_ETHERNET(0xc102),
+ PSIL_ETHERNET(0xc103),
+ PSIL_ETHERNET(0xc104),
+ PSIL_ETHERNET(0xc105),
+ PSIL_ETHERNET(0xc106),
+ PSIL_ETHERNET(0xc107),
+ /* PRU_ICSSG1 */
+ PSIL_ETHERNET(0xc200),
+ PSIL_ETHERNET(0xc201),
+ PSIL_ETHERNET(0xc202),
+ PSIL_ETHERNET(0xc203),
+ PSIL_ETHERNET(0xc204),
+ PSIL_ETHERNET(0xc205),
+ PSIL_ETHERNET(0xc206),
+ PSIL_ETHERNET(0xc207),
+ /* PRU_ICSSG2 */
+ PSIL_ETHERNET(0xc300),
+ PSIL_ETHERNET(0xc301),
+ PSIL_ETHERNET(0xc302),
+ PSIL_ETHERNET(0xc303),
+ PSIL_ETHERNET(0xc304),
+ PSIL_ETHERNET(0xc305),
+ PSIL_ETHERNET(0xc306),
+ PSIL_ETHERNET(0xc307),
+ /* CPSW0 */
+ PSIL_ETHERNET(0xf000),
+ PSIL_ETHERNET(0xf001),
+ PSIL_ETHERNET(0xf002),
+ PSIL_ETHERNET(0xf003),
+ PSIL_ETHERNET(0xf004),
+ PSIL_ETHERNET(0xf005),
+ PSIL_ETHERNET(0xf006),
+ PSIL_ETHERNET(0xf007),
+};
+
+struct psil_ep_map am654_ep_map = {
+ .name = "am654",
+ .src = am654_src_ep_map,
+ .src_count = ARRAY_SIZE(am654_src_ep_map),
+ .dst = am654_dst_ep_map,
+ .dst_count = ARRAY_SIZE(am654_dst_ep_map),
+};
diff --git a/drivers/dma/ti/k3-psil-j721e.c b/drivers/dma/ti/k3-psil-j721e.c
new file mode 100644
index 000000000000..86e1ff57e197
--- /dev/null
+++ b/drivers/dma/ti/k3-psil-j721e.c
@@ -0,0 +1,219 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
+ * Author: Peter Ujfalusi <[email protected]>
+ */
+
+#include <linux/kernel.h>
+
+#include "k3-psil-priv.h"
+
+#define PSIL_PDMA_XY_TR(x) \
+ { \
+ .thread_id = x, \
+ .ep_config = { \
+ .ep_type = PSIL_EP_PDMA_XY, \
+ }, \
+ }
+
+#define PSIL_PDMA_XY_PKT(x) \
+ { \
+ .thread_id = x, \
+ .ep_config = { \
+ .ep_type = PSIL_EP_PDMA_XY, \
+ .pkt_mode = 1, \
+ }, \
+ }
+
+#define PSIL_PDMA_MCASP(x) \
+ { \
+ .thread_id = x, \
+ .ep_config = { \
+ .ep_type = PSIL_EP_PDMA_XY, \
+ .pdma_acc32 = 1, \
+ .pdma_burst = 1, \
+ }, \
+ }
+
+#define PSIL_ETHERNET(x) \
+ { \
+ .thread_id = x, \
+ .ep_config = { \
+ .ep_type = PSIL_EP_NATIVE, \
+ .pkt_mode = 1, \
+ .needs_epib = 1, \
+ .psd_size = 16, \
+ }, \
+ }
+
+#define PSIL_SA2UL(x, tx) \
+ { \
+ .thread_id = x, \
+ .ep_config = { \
+ .ep_type = PSIL_EP_NATIVE, \
+ .pkt_mode = 1, \
+ .needs_epib = 1, \
+ .psd_size = 64, \
+ .notdpkt = tx, \
+ }, \
+ }
+
+/* PSI-L source thread IDs, used for RX (DMA_DEV_TO_MEM) */
+struct psil_ep j721e_src_ep_map[] = {
+ /* SA2UL */
+ PSIL_SA2UL(0x4000, 0),
+ PSIL_SA2UL(0x4001, 0),
+ /* PRU_ICSSG0 */
+ PSIL_ETHERNET(0x4100),
+ PSIL_ETHERNET(0x4101),
+ PSIL_ETHERNET(0x4102),
+ PSIL_ETHERNET(0x4103),
+ /* PRU_ICSSG1 */
+ PSIL_ETHERNET(0x4200),
+ PSIL_ETHERNET(0x4201),
+ PSIL_ETHERNET(0x4202),
+ PSIL_ETHERNET(0x4203),
+ /* PDMA6 (PSIL_PDMA_MCASP_G0) - McASP0-2 */
+ PSIL_PDMA_MCASP(0x4400),
+ PSIL_PDMA_MCASP(0x4401),
+ PSIL_PDMA_MCASP(0x4402),
+ /* PDMA7 (PSIL_PDMA_MCASP_G1) - McASP3-11 */
+ PSIL_PDMA_MCASP(0x4500),
+ PSIL_PDMA_MCASP(0x4501),
+ PSIL_PDMA_MCASP(0x4502),
+ PSIL_PDMA_MCASP(0x4503),
+ PSIL_PDMA_MCASP(0x4504),
+ PSIL_PDMA_MCASP(0x4505),
+ PSIL_PDMA_MCASP(0x4506),
+ PSIL_PDMA_MCASP(0x4507),
+ PSIL_PDMA_MCASP(0x4508),
+ /* PDMA8 (PDMA_MISC_G0) - SPI0-1 */
+ PSIL_PDMA_XY_PKT(0x4600),
+ PSIL_PDMA_XY_PKT(0x4601),
+ PSIL_PDMA_XY_PKT(0x4602),
+ PSIL_PDMA_XY_PKT(0x4603),
+ PSIL_PDMA_XY_PKT(0x4604),
+ PSIL_PDMA_XY_PKT(0x4605),
+ PSIL_PDMA_XY_PKT(0x4606),
+ PSIL_PDMA_XY_PKT(0x4607),
+ /* PDMA9 (PDMA_MISC_G1) - SPI2-3 */
+ PSIL_PDMA_XY_PKT(0x460c),
+ PSIL_PDMA_XY_PKT(0x460d),
+ PSIL_PDMA_XY_PKT(0x460e),
+ PSIL_PDMA_XY_PKT(0x460f),
+ PSIL_PDMA_XY_PKT(0x4610),
+ PSIL_PDMA_XY_PKT(0x4611),
+ PSIL_PDMA_XY_PKT(0x4612),
+ PSIL_PDMA_XY_PKT(0x4613),
+ /* PDMA10 (PDMA_MISC_G2) - SPI4-5 */
+ PSIL_PDMA_XY_PKT(0x4618),
+ PSIL_PDMA_XY_PKT(0x4619),
+ PSIL_PDMA_XY_PKT(0x461a),
+ PSIL_PDMA_XY_PKT(0x461b),
+ PSIL_PDMA_XY_PKT(0x461c),
+ PSIL_PDMA_XY_PKT(0x461d),
+ PSIL_PDMA_XY_PKT(0x461e),
+ PSIL_PDMA_XY_PKT(0x461f),
+ /* PDMA11 (PDMA_MISC_G3) */
+ PSIL_PDMA_XY_PKT(0x4624),
+ PSIL_PDMA_XY_PKT(0x4625),
+ PSIL_PDMA_XY_PKT(0x4626),
+ PSIL_PDMA_XY_PKT(0x4627),
+ PSIL_PDMA_XY_PKT(0x4628),
+ PSIL_PDMA_XY_PKT(0x4629),
+ PSIL_PDMA_XY_PKT(0x4630),
+ PSIL_PDMA_XY_PKT(0x463a),
+ /* PDMA13 (PDMA_USART_G0) - UART0-1 */
+ PSIL_PDMA_XY_PKT(0x4700),
+ PSIL_PDMA_XY_PKT(0x4701),
+ /* PDMA14 (PDMA_USART_G1) - UART2-3 */
+ PSIL_PDMA_XY_PKT(0x4702),
+ PSIL_PDMA_XY_PKT(0x4703),
+ /* PDMA15 (PDMA_USART_G2) - UART4-9 */
+ PSIL_PDMA_XY_PKT(0x4704),
+ PSIL_PDMA_XY_PKT(0x4705),
+ PSIL_PDMA_XY_PKT(0x4706),
+ PSIL_PDMA_XY_PKT(0x4707),
+ PSIL_PDMA_XY_PKT(0x4708),
+ PSIL_PDMA_XY_PKT(0x4709),
+ /* CPSW9 */
+ PSIL_ETHERNET(0x4a00),
+ /* CPSW0 */
+ PSIL_ETHERNET(0x7000),
+ /* MCU_PDMA0 (MCU_PDMA_MISC_G0) - SPI0 */
+ PSIL_PDMA_XY_PKT(0x7100),
+ PSIL_PDMA_XY_PKT(0x7101),
+ PSIL_PDMA_XY_PKT(0x7102),
+ PSIL_PDMA_XY_PKT(0x7103),
+ /* MCU_PDMA1 (MCU_PDMA_MISC_G1) - SPI1-2 */
+ PSIL_PDMA_XY_PKT(0x7200),
+ PSIL_PDMA_XY_PKT(0x7201),
+ PSIL_PDMA_XY_PKT(0x7202),
+ PSIL_PDMA_XY_PKT(0x7203),
+ PSIL_PDMA_XY_PKT(0x7204),
+ PSIL_PDMA_XY_PKT(0x7205),
+ PSIL_PDMA_XY_PKT(0x7206),
+ PSIL_PDMA_XY_PKT(0x7207),
+ /* MCU_PDMA2 (MCU_PDMA_MISC_G2) - UART0 */
+ PSIL_PDMA_XY_PKT(0x7300),
+ /* MCU_PDMA_ADC - ADC0-1 */
+ PSIL_PDMA_XY_TR(0x7400),
+ PSIL_PDMA_XY_TR(0x7401),
+ PSIL_PDMA_XY_TR(0x7402),
+ PSIL_PDMA_XY_TR(0x7403),
+ /* SA2UL */
+ PSIL_SA2UL(0x7500, 0),
+ PSIL_SA2UL(0x7501, 0),
+};
+
+/* PSI-L destination thread IDs, used for TX (DMA_MEM_TO_DEV) */
+struct psil_ep j721e_dst_ep_map[] = {
+ /* SA2UL */
+ PSIL_SA2UL(0xc000, 1),
+ /* PRU_ICSSG0 */
+ PSIL_ETHERNET(0xc100),
+ PSIL_ETHERNET(0xc101),
+ PSIL_ETHERNET(0xc102),
+ PSIL_ETHERNET(0xc103),
+ PSIL_ETHERNET(0xc104),
+ PSIL_ETHERNET(0xc105),
+ PSIL_ETHERNET(0xc106),
+ PSIL_ETHERNET(0xc107),
+ /* PRU_ICSSG1 */
+ PSIL_ETHERNET(0xc200),
+ PSIL_ETHERNET(0xc201),
+ PSIL_ETHERNET(0xc202),
+ PSIL_ETHERNET(0xc203),
+ PSIL_ETHERNET(0xc204),
+ PSIL_ETHERNET(0xc205),
+ PSIL_ETHERNET(0xc206),
+ PSIL_ETHERNET(0xc207),
+ /* CPSW9 */
+ PSIL_ETHERNET(0xca00),
+ PSIL_ETHERNET(0xca01),
+ PSIL_ETHERNET(0xca02),
+ PSIL_ETHERNET(0xca03),
+ PSIL_ETHERNET(0xca04),
+ PSIL_ETHERNET(0xca05),
+ PSIL_ETHERNET(0xca06),
+ PSIL_ETHERNET(0xca07),
+ /* CPSW0 */
+ PSIL_ETHERNET(0xf000),
+ PSIL_ETHERNET(0xf001),
+ PSIL_ETHERNET(0xf002),
+ PSIL_ETHERNET(0xf003),
+ PSIL_ETHERNET(0xf004),
+ PSIL_ETHERNET(0xf005),
+ PSIL_ETHERNET(0xf006),
+ PSIL_ETHERNET(0xf007),
+ /* SA2UL */
+ PSIL_SA2UL(0xf500, 1),
+};
+
+struct psil_ep_map j721e_ep_map = {
+ .name = "j721e",
+ .src = j721e_src_ep_map,
+ .src_count = ARRAY_SIZE(j721e_src_ep_map),
+ .dst = j721e_dst_ep_map,
+ .dst_count = ARRAY_SIZE(j721e_dst_ep_map),
+};
diff --git a/drivers/dma/ti/k3-psil-priv.h b/drivers/dma/ti/k3-psil-priv.h
new file mode 100644
index 000000000000..f74420653d8a
--- /dev/null
+++ b/drivers/dma/ti/k3-psil-priv.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
+ */
+
+#ifndef K3_PSIL_PRIV_H_
+#define K3_PSIL_PRIV_H_
+
+#include <linux/dma/k3-psil.h>
+
+struct psil_ep {
+ u32 thread_id;
+ struct psil_endpoint_config ep_config;
+};
+
+/**
+ * struct psil_ep_map - PSI-L thread ID configuration maps
+ * @name: Name of the map, set it to the name of the SoC
+ * @src: Array of source PSI-L thread configurations
+ * @src_count: Number of entries in the src array
+ * @dst: Array of destination PSI-L thread configurations
+ * @dst_count: Number of entries in the dst array
+ *
+ * In case of symmetric configuration for a matching src/dst thread (for example
+ * 0x4400 and 0xc400) only the src configuration can be present. If no dst
+ * configuration found the code will look for (dst_thread_id & ~0x8000) to find
+ * the symmetric match.
+ */
+struct psil_ep_map {
+ char *name;
+ struct psil_ep *src;
+ int src_count;
+ struct psil_ep *dst;
+ int dst_count;
+};
+
+struct psil_endpoint_config *psil_get_ep_config(u32 thread_id);
+
+#endif /* K3_PSIL_PRIV_H_ */
diff --git a/drivers/dma/ti/k3-psil.c b/drivers/dma/ti/k3-psil.c
new file mode 100644
index 000000000000..e610022f09f4
--- /dev/null
+++ b/drivers/dma/ti/k3-psil.c
@@ -0,0 +1,97 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
+ * Author: Peter Ujfalusi <[email protected]>
+ */
+
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+
+#include "k3-psil-priv.h"
+
+extern struct psil_ep_map am654_ep_map;
+extern struct psil_ep_map j721e_ep_map;
+
+static DEFINE_MUTEX(ep_map_mutex);
+static struct psil_ep_map *soc_ep_map;
+
+struct psil_endpoint_config *psil_get_ep_config(u32 thread_id)
+{
+ int i;
+
+ mutex_lock(&ep_map_mutex);
+ if (!soc_ep_map) {
+ if (of_machine_is_compatible("ti,am654")) {
+ soc_ep_map = &am654_ep_map;
+ } else if (of_machine_is_compatible("ti,j721e")) {
+ soc_ep_map = &j721e_ep_map;
+ } else {
+ pr_err("PSIL: No compatible machine found for map\n");
+ mutex_unlock(&ep_map_mutex);
+ return ERR_PTR(-ENOTSUPP);
+ }
+ pr_debug("%s: Using map for %s\n", __func__, soc_ep_map->name);
+ }
+ mutex_unlock(&ep_map_mutex);
+
+ if (thread_id & K3_PSIL_DST_THREAD_ID_OFFSET && soc_ep_map->dst) {
+ /* check in destination thread map */
+ for (i = 0; i < soc_ep_map->dst_count; i++) {
+ if (soc_ep_map->dst[i].thread_id == thread_id)
+ return &soc_ep_map->dst[i].ep_config;
+ }
+ }
+
+ thread_id &= ~K3_PSIL_DST_THREAD_ID_OFFSET;
+ if (soc_ep_map->src) {
+ for (i = 0; i < soc_ep_map->src_count; i++) {
+ if (soc_ep_map->src[i].thread_id == thread_id)
+ return &soc_ep_map->src[i].ep_config;
+ }
+ }
+
+ return ERR_PTR(-ENOENT);
+}
+EXPORT_SYMBOL(psil_get_ep_config);
+
+int psil_set_new_ep_config(struct device *dev, const char *name,
+ struct psil_endpoint_config *ep_config)
+{
+ struct psil_endpoint_config *dst_ep_config;
+ struct of_phandle_args dma_spec;
+ u32 thread_id;
+ int index;
+
+ if (!dev || !dev->of_node)
+ return -EINVAL;
+
+ index = of_property_match_string(dev->of_node, "dma-names", name);
+ if (index < 0)
+ return index;
+
+ if (of_parse_phandle_with_args(dev->of_node, "dmas", "#dma-cells",
+ index, &dma_spec))
+ return -ENOENT;
+
+ thread_id = dma_spec.args[0];
+
+ dst_ep_config = psil_get_ep_config(thread_id);
+ if (IS_ERR(dst_ep_config)) {
+ pr_err("PSIL: thread ID 0x%04x not defined in map\n",
+ thread_id);
+ of_node_put(dma_spec.np);
+ return PTR_ERR(dst_ep_config);
+ }
+
+ memcpy(dst_ep_config, ep_config, sizeof(*dst_ep_config));
+
+ of_node_put(dma_spec.np);
+ return 0;
+}
+EXPORT_SYMBOL(psil_set_new_ep_config);
+
+MODULE_DESCRIPTION("TI K3 PSI-L endpoint database");
+MODULE_AUTHOR("Peter Ujfalusi <[email protected]>");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/dma/k3-psil.h b/include/linux/dma/k3-psil.h
new file mode 100644
index 000000000000..16e9c8c6f839
--- /dev/null
+++ b/include/linux/dma/k3-psil.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
+ */
+
+#ifndef K3_PSIL_H_
+#define K3_PSIL_H_
+
+#include <linux/types.h>
+
+#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000
+
+struct device;
+
+/* Channel Throughput Levels */
+enum udma_tp_level {
+ UDMA_TP_NORMAL = 0,
+ UDMA_TP_HIGH = 1,
+ UDMA_TP_ULTRAHIGH = 2,
+ UDMA_TP_LAST,
+};
+
+enum psil_endpoint_type {
+ PSIL_EP_NATIVE = 0,
+ PSIL_EP_PDMA_XY,
+ PSIL_EP_PDMA_MCAN,
+ PSIL_EP_PDMA_AASRC,
+};
+
+struct psil_endpoint_config {
+ enum psil_endpoint_type ep_type;
+
+ unsigned pkt_mode:1;
+ unsigned notdpkt:1;
+ unsigned needs_epib:1;
+ u32 psd_size;
+ enum udma_tp_level channel_tpl;
+
+ /* PDMA properties, valid for PSIL_EP_PDMA_* */
+ unsigned pdma_acc32:1;
+ unsigned pdma_burst:1;
+};
+
+int psil_set_new_ep_config(struct device *dev, const char *name,
+ struct psil_endpoint_config *ep_config);
+
+#endif /* K3_PSIL_H_ */
--
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2019-11-05 07:53:26

by Tero Kristo

[permalink] [raw]
Subject: Re: [PATCH v4 07/15] dmaengine: ti: k3 PSI-L remote endpoint configuration

On 01/11/2019 10:41, Peter Ujfalusi wrote:
> In K3 architecture the DMA operates within threads. One end of the thread
> is UDMAP, the other is on the peripheral side.
>
> The UDMAP channel configuration depends on the needs of the remote
> endpoint and it can differ from peripheral to peripheral.
>
> This patch adds a database for am654 and j721e and a small API to fetch the
> PSI-L endpoint configuration from the database, which should only be used by
> the DMA driver(s).
>
> Another API is added for native peripherals to allow them to pass a new
> configuration for the threads they are using, which is needed to handle
> changes caused, for example, by different firmware loaded for the peripheral.
>
> Signed-off-by: Peter Ujfalusi <[email protected]>
> ---
> drivers/dma/ti/Kconfig | 3 +
> drivers/dma/ti/Makefile | 1 +
> drivers/dma/ti/k3-psil-am654.c | 172 ++++++++++++++++++++++++++
> drivers/dma/ti/k3-psil-j721e.c | 219 +++++++++++++++++++++++++++++++++
> drivers/dma/ti/k3-psil-priv.h | 39 ++++++
> drivers/dma/ti/k3-psil.c | 97 +++++++++++++++
> include/linux/dma/k3-psil.h | 47 +++++++
> 7 files changed, 578 insertions(+)
> create mode 100644 drivers/dma/ti/k3-psil-am654.c
> create mode 100644 drivers/dma/ti/k3-psil-j721e.c
> create mode 100644 drivers/dma/ti/k3-psil-priv.h
> create mode 100644 drivers/dma/ti/k3-psil.c
> create mode 100644 include/linux/dma/k3-psil.h
>
> diff --git a/drivers/dma/ti/Kconfig b/drivers/dma/ti/Kconfig
> index d507c24fbf31..72f3d2728178 100644
> --- a/drivers/dma/ti/Kconfig
> +++ b/drivers/dma/ti/Kconfig
> @@ -34,5 +34,8 @@ config DMA_OMAP
> Enable support for the TI sDMA (System DMA or DMA4) controller. This
> DMA engine is found on OMAP and DRA7xx parts.
>
> +config TI_K3_PSIL
> + bool
> +
> config TI_DMA_CROSSBAR
> bool
> diff --git a/drivers/dma/ti/Makefile b/drivers/dma/ti/Makefile
> index 113e59ec9c32..f8d912ad7eaf 100644
> --- a/drivers/dma/ti/Makefile
> +++ b/drivers/dma/ti/Makefile
> @@ -2,4 +2,5 @@
> obj-$(CONFIG_TI_CPPI41) += cppi41.o
> obj-$(CONFIG_TI_EDMA) += edma.o
> obj-$(CONFIG_DMA_OMAP) += omap-dma.o
> +obj-$(CONFIG_TI_K3_PSIL) += k3-psil.o k3-psil-am654.o k3-psil-j721e.o
> obj-$(CONFIG_TI_DMA_CROSSBAR) += dma-crossbar.o
> diff --git a/drivers/dma/ti/k3-psil-am654.c b/drivers/dma/ti/k3-psil-am654.c
> new file mode 100644
> index 000000000000..edd7fff36f44
> --- /dev/null
> +++ b/drivers/dma/ti/k3-psil-am654.c
> @@ -0,0 +1,172 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + * Author: Peter Ujfalusi <[email protected]>
> + */
> +
> +#include <linux/kernel.h>
> +
> +#include "k3-psil-priv.h"
> +
> +#define PSIL_PDMA_XY_TR(x) \
> + { \
> + .thread_id = x, \
> + .ep_config = { \
> + .ep_type = PSIL_EP_PDMA_XY, \
> + }, \
> + }
> +
> +#define PSIL_PDMA_XY_PKT(x) \
> + { \
> + .thread_id = x, \
> + .ep_config = { \
> + .ep_type = PSIL_EP_PDMA_XY, \
> + .pkt_mode = 1, \
> + }, \
> + }
> +
> +#define PSIL_ETHERNET(x) \
> + { \
> + .thread_id = x, \
> + .ep_config = { \
> + .ep_type = PSIL_EP_NATIVE, \
> + .pkt_mode = 1, \
> + .needs_epib = 1, \
> + .psd_size = 16, \
> + }, \
> + }
> +
> +#define PSIL_SA2UL(x, tx) \
> + { \
> + .thread_id = x, \
> + .ep_config = { \
> + .ep_type = PSIL_EP_NATIVE, \
> + .pkt_mode = 1, \
> + .needs_epib = 1, \
> + .psd_size = 64, \
> + .notdpkt = tx, \
> + }, \
> + }
> +
> +/* PSI-L source thread IDs, used for RX (DMA_DEV_TO_MEM) */
> +struct psil_ep am654_src_ep_map[] = {
> + /* SA2UL */
> + PSIL_SA2UL(0x4000, 0),
> + PSIL_SA2UL(0x4001, 0),
> + /* PRU_ICSSG0 */
> + PSIL_ETHERNET(0x4100),
> + PSIL_ETHERNET(0x4101),
> + PSIL_ETHERNET(0x4102),
> + PSIL_ETHERNET(0x4103),
> + /* PRU_ICSSG1 */
> + PSIL_ETHERNET(0x4200),
> + PSIL_ETHERNET(0x4201),
> + PSIL_ETHERNET(0x4202),
> + PSIL_ETHERNET(0x4203),
> + /* PRU_ICSSG2 */
> + PSIL_ETHERNET(0x4300),
> + PSIL_ETHERNET(0x4301),
> + PSIL_ETHERNET(0x4302),
> + PSIL_ETHERNET(0x4303),
> + /* PDMA0 - McASPs */
> + PSIL_PDMA_XY_TR(0x4400),
> + PSIL_PDMA_XY_TR(0x4401),
> + PSIL_PDMA_XY_TR(0x4402),
> + /* PDMA1 - SPI0-4 */
> + PSIL_PDMA_XY_PKT(0x4500),
> + PSIL_PDMA_XY_PKT(0x4501),
> + PSIL_PDMA_XY_PKT(0x4502),
> + PSIL_PDMA_XY_PKT(0x4503),
> + PSIL_PDMA_XY_PKT(0x4504),
> + PSIL_PDMA_XY_PKT(0x4505),
> + PSIL_PDMA_XY_PKT(0x4506),
> + PSIL_PDMA_XY_PKT(0x4507),
> + PSIL_PDMA_XY_PKT(0x4508),
> + PSIL_PDMA_XY_PKT(0x4509),
> + PSIL_PDMA_XY_PKT(0x450a),
> + PSIL_PDMA_XY_PKT(0x450b),
> + PSIL_PDMA_XY_PKT(0x450c),
> + PSIL_PDMA_XY_PKT(0x450d),
> + PSIL_PDMA_XY_PKT(0x450e),
> + PSIL_PDMA_XY_PKT(0x450f),
> + PSIL_PDMA_XY_PKT(0x4510),
> + PSIL_PDMA_XY_PKT(0x4511),
> + PSIL_PDMA_XY_PKT(0x4512),
> + PSIL_PDMA_XY_PKT(0x4513),
> + /* PDMA1 - USART0-2 */
> + PSIL_PDMA_XY_PKT(0x4514),
> + PSIL_PDMA_XY_PKT(0x4515),
> + PSIL_PDMA_XY_PKT(0x4516),
> + /* CPSW0 */
> + PSIL_ETHERNET(0x7000),
> + /* MCU_PDMA0 - ADCs */
> + PSIL_PDMA_XY_TR(0x7100),
> + PSIL_PDMA_XY_TR(0x7101),
> + PSIL_PDMA_XY_TR(0x7102),
> + PSIL_PDMA_XY_TR(0x7103),
> + /* MCU_PDMA1 - MCU_SPI0-2 */
> + PSIL_PDMA_XY_PKT(0x7200),
> + PSIL_PDMA_XY_PKT(0x7201),
> + PSIL_PDMA_XY_PKT(0x7202),
> + PSIL_PDMA_XY_PKT(0x7203),
> + PSIL_PDMA_XY_PKT(0x7204),
> + PSIL_PDMA_XY_PKT(0x7205),
> + PSIL_PDMA_XY_PKT(0x7206),
> + PSIL_PDMA_XY_PKT(0x7207),
> + PSIL_PDMA_XY_PKT(0x7208),
> + PSIL_PDMA_XY_PKT(0x7209),
> + PSIL_PDMA_XY_PKT(0x720a),
> + PSIL_PDMA_XY_PKT(0x720b),
> + /* MCU_PDMA1 - MCU_USART0 */
> + PSIL_PDMA_XY_PKT(0x7212),
> +};
> +
> +/* PSI-L destination thread IDs, used for TX (DMA_MEM_TO_DEV) */
> +struct psil_ep am654_dst_ep_map[] = {
> + /* SA2UL */
> + PSIL_SA2UL(0xc000, 1),
> + /* PRU_ICSSG0 */
> + PSIL_ETHERNET(0xc100),
> + PSIL_ETHERNET(0xc101),
> + PSIL_ETHERNET(0xc102),
> + PSIL_ETHERNET(0xc103),
> + PSIL_ETHERNET(0xc104),
> + PSIL_ETHERNET(0xc105),
> + PSIL_ETHERNET(0xc106),
> + PSIL_ETHERNET(0xc107),
> + /* PRU_ICSSG1 */
> + PSIL_ETHERNET(0xc200),
> + PSIL_ETHERNET(0xc201),
> + PSIL_ETHERNET(0xc202),
> + PSIL_ETHERNET(0xc203),
> + PSIL_ETHERNET(0xc204),
> + PSIL_ETHERNET(0xc205),
> + PSIL_ETHERNET(0xc206),
> + PSIL_ETHERNET(0xc207),
> + /* PRU_ICSSG2 */
> + PSIL_ETHERNET(0xc300),
> + PSIL_ETHERNET(0xc301),
> + PSIL_ETHERNET(0xc302),
> + PSIL_ETHERNET(0xc303),
> + PSIL_ETHERNET(0xc304),
> + PSIL_ETHERNET(0xc305),
> + PSIL_ETHERNET(0xc306),
> + PSIL_ETHERNET(0xc307),
> + /* CPSW0 */
> + PSIL_ETHERNET(0xf000),
> + PSIL_ETHERNET(0xf001),
> + PSIL_ETHERNET(0xf002),
> + PSIL_ETHERNET(0xf003),
> + PSIL_ETHERNET(0xf004),
> + PSIL_ETHERNET(0xf005),
> + PSIL_ETHERNET(0xf006),
> + PSIL_ETHERNET(0xf007),
> +};
> +
> +struct psil_ep_map am654_ep_map = {
> + .name = "am654",
> + .src = am654_src_ep_map,
> + .src_count = ARRAY_SIZE(am654_src_ep_map),
> + .dst = am654_dst_ep_map,
> + .dst_count = ARRAY_SIZE(am654_dst_ep_map),
> +};
> diff --git a/drivers/dma/ti/k3-psil-j721e.c b/drivers/dma/ti/k3-psil-j721e.c
> new file mode 100644
> index 000000000000..86e1ff57e197
> --- /dev/null
> +++ b/drivers/dma/ti/k3-psil-j721e.c
> @@ -0,0 +1,219 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + * Author: Peter Ujfalusi <[email protected]>
> + */
> +
> +#include <linux/kernel.h>
> +
> +#include "k3-psil-priv.h"
> +
> +#define PSIL_PDMA_XY_TR(x) \
> + { \
> + .thread_id = x, \
> + .ep_config = { \
> + .ep_type = PSIL_EP_PDMA_XY, \
> + }, \
> + }
> +
> +#define PSIL_PDMA_XY_PKT(x) \
> + { \
> + .thread_id = x, \
> + .ep_config = { \
> + .ep_type = PSIL_EP_PDMA_XY, \
> + .pkt_mode = 1, \
> + }, \
> + }
> +
> +#define PSIL_PDMA_MCASP(x) \
> + { \
> + .thread_id = x, \
> + .ep_config = { \
> + .ep_type = PSIL_EP_PDMA_XY, \
> + .pdma_acc32 = 1, \
> + .pdma_burst = 1, \
> + }, \
> + }
> +
> +#define PSIL_ETHERNET(x) \
> + { \
> + .thread_id = x, \
> + .ep_config = { \
> + .ep_type = PSIL_EP_NATIVE, \
> + .pkt_mode = 1, \
> + .needs_epib = 1, \
> + .psd_size = 16, \
> + }, \
> + }
> +
> +#define PSIL_SA2UL(x, tx) \
> + { \
> + .thread_id = x, \
> + .ep_config = { \
> + .ep_type = PSIL_EP_NATIVE, \
> + .pkt_mode = 1, \
> + .needs_epib = 1, \
> + .psd_size = 64, \
> + .notdpkt = tx, \
> + }, \
> + }
> +
> +/* PSI-L source thread IDs, used for RX (DMA_DEV_TO_MEM) */
> +struct psil_ep j721e_src_ep_map[] = {
> + /* SA2UL */
> + PSIL_SA2UL(0x4000, 0),
> + PSIL_SA2UL(0x4001, 0),
> + /* PRU_ICSSG0 */
> + PSIL_ETHERNET(0x4100),
> + PSIL_ETHERNET(0x4101),
> + PSIL_ETHERNET(0x4102),
> + PSIL_ETHERNET(0x4103),
> + /* PRU_ICSSG1 */
> + PSIL_ETHERNET(0x4200),
> + PSIL_ETHERNET(0x4201),
> + PSIL_ETHERNET(0x4202),
> + PSIL_ETHERNET(0x4203),
> + /* PDMA6 (PSIL_PDMA_MCASP_G0) - McASP0-2 */
> + PSIL_PDMA_MCASP(0x4400),
> + PSIL_PDMA_MCASP(0x4401),
> + PSIL_PDMA_MCASP(0x4402),
> + /* PDMA7 (PSIL_PDMA_MCASP_G1) - McASP3-11 */
> + PSIL_PDMA_MCASP(0x4500),
> + PSIL_PDMA_MCASP(0x4501),
> + PSIL_PDMA_MCASP(0x4502),
> + PSIL_PDMA_MCASP(0x4503),
> + PSIL_PDMA_MCASP(0x4504),
> + PSIL_PDMA_MCASP(0x4505),
> + PSIL_PDMA_MCASP(0x4506),
> + PSIL_PDMA_MCASP(0x4507),
> + PSIL_PDMA_MCASP(0x4508),
> + /* PDMA8 (PDMA_MISC_G0) - SPI0-1 */
> + PSIL_PDMA_XY_PKT(0x4600),
> + PSIL_PDMA_XY_PKT(0x4601),
> + PSIL_PDMA_XY_PKT(0x4602),
> + PSIL_PDMA_XY_PKT(0x4603),
> + PSIL_PDMA_XY_PKT(0x4604),
> + PSIL_PDMA_XY_PKT(0x4605),
> + PSIL_PDMA_XY_PKT(0x4606),
> + PSIL_PDMA_XY_PKT(0x4607),
> + /* PDMA9 (PDMA_MISC_G1) - SPI2-3 */
> + PSIL_PDMA_XY_PKT(0x460c),
> + PSIL_PDMA_XY_PKT(0x460d),
> + PSIL_PDMA_XY_PKT(0x460e),
> + PSIL_PDMA_XY_PKT(0x460f),
> + PSIL_PDMA_XY_PKT(0x4610),
> + PSIL_PDMA_XY_PKT(0x4611),
> + PSIL_PDMA_XY_PKT(0x4612),
> + PSIL_PDMA_XY_PKT(0x4613),
> + /* PDMA10 (PDMA_MISC_G2) - SPI4-5 */
> + PSIL_PDMA_XY_PKT(0x4618),
> + PSIL_PDMA_XY_PKT(0x4619),
> + PSIL_PDMA_XY_PKT(0x461a),
> + PSIL_PDMA_XY_PKT(0x461b),
> + PSIL_PDMA_XY_PKT(0x461c),
> + PSIL_PDMA_XY_PKT(0x461d),
> + PSIL_PDMA_XY_PKT(0x461e),
> + PSIL_PDMA_XY_PKT(0x461f),
> + /* PDMA11 (PDMA_MISC_G3) */
> + PSIL_PDMA_XY_PKT(0x4624),
> + PSIL_PDMA_XY_PKT(0x4625),
> + PSIL_PDMA_XY_PKT(0x4626),
> + PSIL_PDMA_XY_PKT(0x4627),
> + PSIL_PDMA_XY_PKT(0x4628),
> + PSIL_PDMA_XY_PKT(0x4629),
> + PSIL_PDMA_XY_PKT(0x4630),
> + PSIL_PDMA_XY_PKT(0x463a),
> + /* PDMA13 (PDMA_USART_G0) - UART0-1 */
> + PSIL_PDMA_XY_PKT(0x4700),
> + PSIL_PDMA_XY_PKT(0x4701),
> + /* PDMA14 (PDMA_USART_G1) - UART2-3 */
> + PSIL_PDMA_XY_PKT(0x4702),
> + PSIL_PDMA_XY_PKT(0x4703),
> + /* PDMA15 (PDMA_USART_G2) - UART4-9 */
> + PSIL_PDMA_XY_PKT(0x4704),
> + PSIL_PDMA_XY_PKT(0x4705),
> + PSIL_PDMA_XY_PKT(0x4706),
> + PSIL_PDMA_XY_PKT(0x4707),
> + PSIL_PDMA_XY_PKT(0x4708),
> + PSIL_PDMA_XY_PKT(0x4709),
> + /* CPSW9 */
> + PSIL_ETHERNET(0x4a00),
> + /* CPSW0 */
> + PSIL_ETHERNET(0x7000),
> + /* MCU_PDMA0 (MCU_PDMA_MISC_G0) - SPI0 */
> + PSIL_PDMA_XY_PKT(0x7100),
> + PSIL_PDMA_XY_PKT(0x7101),
> + PSIL_PDMA_XY_PKT(0x7102),
> + PSIL_PDMA_XY_PKT(0x7103),
> + /* MCU_PDMA1 (MCU_PDMA_MISC_G1) - SPI1-2 */
> + PSIL_PDMA_XY_PKT(0x7200),
> + PSIL_PDMA_XY_PKT(0x7201),
> + PSIL_PDMA_XY_PKT(0x7202),
> + PSIL_PDMA_XY_PKT(0x7203),
> + PSIL_PDMA_XY_PKT(0x7204),
> + PSIL_PDMA_XY_PKT(0x7205),
> + PSIL_PDMA_XY_PKT(0x7206),
> + PSIL_PDMA_XY_PKT(0x7207),
> + /* MCU_PDMA2 (MCU_PDMA_MISC_G2) - UART0 */
> + PSIL_PDMA_XY_PKT(0x7300),
> + /* MCU_PDMA_ADC - ADC0-1 */
> + PSIL_PDMA_XY_TR(0x7400),
> + PSIL_PDMA_XY_TR(0x7401),
> + PSIL_PDMA_XY_TR(0x7402),
> + PSIL_PDMA_XY_TR(0x7403),
> + /* SA2UL */
> + PSIL_SA2UL(0x7500, 0),
> + PSIL_SA2UL(0x7501, 0),
> +};
> +
> +/* PSI-L destination thread IDs, used for TX (DMA_MEM_TO_DEV) */
> +struct psil_ep j721e_dst_ep_map[] = {
> + /* SA2UL */
> + PSIL_SA2UL(0xc000, 1),
> + /* PRU_ICSSG0 */
> + PSIL_ETHERNET(0xc100),
> + PSIL_ETHERNET(0xc101),
> + PSIL_ETHERNET(0xc102),
> + PSIL_ETHERNET(0xc103),
> + PSIL_ETHERNET(0xc104),
> + PSIL_ETHERNET(0xc105),
> + PSIL_ETHERNET(0xc106),
> + PSIL_ETHERNET(0xc107),
> + /* PRU_ICSSG1 */
> + PSIL_ETHERNET(0xc200),
> + PSIL_ETHERNET(0xc201),
> + PSIL_ETHERNET(0xc202),
> + PSIL_ETHERNET(0xc203),
> + PSIL_ETHERNET(0xc204),
> + PSIL_ETHERNET(0xc205),
> + PSIL_ETHERNET(0xc206),
> + PSIL_ETHERNET(0xc207),
> + /* CPSW9 */
> + PSIL_ETHERNET(0xca00),
> + PSIL_ETHERNET(0xca01),
> + PSIL_ETHERNET(0xca02),
> + PSIL_ETHERNET(0xca03),
> + PSIL_ETHERNET(0xca04),
> + PSIL_ETHERNET(0xca05),
> + PSIL_ETHERNET(0xca06),
> + PSIL_ETHERNET(0xca07),
> + /* CPSW0 */
> + PSIL_ETHERNET(0xf000),
> + PSIL_ETHERNET(0xf001),
> + PSIL_ETHERNET(0xf002),
> + PSIL_ETHERNET(0xf003),
> + PSIL_ETHERNET(0xf004),
> + PSIL_ETHERNET(0xf005),
> + PSIL_ETHERNET(0xf006),
> + PSIL_ETHERNET(0xf007),
> + /* SA2UL */
> + PSIL_SA2UL(0xf500, 1),
> +};
> +
> +struct psil_ep_map j721e_ep_map = {
> + .name = "j721e",
> + .src = j721e_src_ep_map,
> + .src_count = ARRAY_SIZE(j721e_src_ep_map),
> + .dst = j721e_dst_ep_map,
> + .dst_count = ARRAY_SIZE(j721e_dst_ep_map),
> +};
> diff --git a/drivers/dma/ti/k3-psil-priv.h b/drivers/dma/ti/k3-psil-priv.h
> new file mode 100644
> index 000000000000..f74420653d8a
> --- /dev/null
> +++ b/drivers/dma/ti/k3-psil-priv.h
> @@ -0,0 +1,39 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + */
> +
> +#ifndef K3_PSIL_PRIV_H_
> +#define K3_PSIL_PRIV_H_
> +
> +#include <linux/dma/k3-psil.h>
> +
> +struct psil_ep {
> + u32 thread_id;
> + struct psil_endpoint_config ep_config;
> +};
> +
> +/**
> + * struct psil_ep_map - PSI-L thread ID configuration maps
> + * @name: Name of the map, set it to the name of the SoC
> + * @src: Array of source PSI-L thread configurations
> + * @src_count: Number of entries in the src array
> + * @dst: Array of destination PSI-L thread configurations
> + * @dst_count: Number of entries in the dst array
> + *
> + * In case of symmetric configuration for a matching src/dst thread (for example
> + * 0x4400 and 0xc400) only the src configuration can be present. If no dst
> + * configuration found the code will look for (dst_thread_id & ~0x8000) to find
> + * the symmetric match.
> + */
> +struct psil_ep_map {
> + char *name;
> + struct psil_ep *src;
> + int src_count;
> + struct psil_ep *dst;
> + int dst_count;
> +};
> +
> +struct psil_endpoint_config *psil_get_ep_config(u32 thread_id);
> +
> +#endif /* K3_PSIL_PRIV_H_ */
> diff --git a/drivers/dma/ti/k3-psil.c b/drivers/dma/ti/k3-psil.c
> new file mode 100644
> index 000000000000..e610022f09f4
> --- /dev/null
> +++ b/drivers/dma/ti/k3-psil.c
> @@ -0,0 +1,97 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + * Author: Peter Ujfalusi <[email protected]>
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/device.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +#include <linux/of.h>
> +
> +#include "k3-psil-priv.h"
> +
> +extern struct psil_ep_map am654_ep_map;
> +extern struct psil_ep_map j721e_ep_map;
> +
> +static DEFINE_MUTEX(ep_map_mutex);
> +static struct psil_ep_map *soc_ep_map;

So, you are only protecting the high level soc_ep_map pointer. You
don't need to protect the database itself via some use counting or
similar, or are you doing that within the DMA driver?

-Tero

> +
> +struct psil_endpoint_config *psil_get_ep_config(u32 thread_id)
> +{
> + int i;
> +
> + mutex_lock(&ep_map_mutex);
> + if (!soc_ep_map) {
> + if (of_machine_is_compatible("ti,am654")) {
> + soc_ep_map = &am654_ep_map;
> + } else if (of_machine_is_compatible("ti,j721e")) {
> + soc_ep_map = &j721e_ep_map;
> + } else {
> + pr_err("PSIL: No compatible machine found for map\n");
> + return ERR_PTR(-ENOTSUPP);
> + }
> + pr_debug("%s: Using map for %s\n", __func__, soc_ep_map->name);
> + }
> + mutex_unlock(&ep_map_mutex);
> +
> + if (thread_id & K3_PSIL_DST_THREAD_ID_OFFSET && soc_ep_map->dst) {
> + /* check in destination thread map */
> + for (i = 0; i < soc_ep_map->dst_count; i++) {
> + if (soc_ep_map->dst[i].thread_id == thread_id)
> + return &soc_ep_map->dst[i].ep_config;
> + }
> + }
> +
> + thread_id &= ~K3_PSIL_DST_THREAD_ID_OFFSET;
> + if (soc_ep_map->src) {
> + for (i = 0; i < soc_ep_map->src_count; i++) {
> + if (soc_ep_map->src[i].thread_id == thread_id)
> + return &soc_ep_map->src[i].ep_config;
> + }
> + }
> +
> + return ERR_PTR(-ENOENT);
> +}
> +EXPORT_SYMBOL(psil_get_ep_config);
> +
> +int psil_set_new_ep_config(struct device *dev, const char *name,
> + struct psil_endpoint_config *ep_config)
> +{
> + struct psil_endpoint_config *dst_ep_config;
> + struct of_phandle_args dma_spec;
> + u32 thread_id;
> + int index;
> +
> + if (!dev || !dev->of_node)
> + return -EINVAL;
> +
> + index = of_property_match_string(dev->of_node, "dma-names", name);
> + if (index < 0)
> + return index;
> +
> + if (of_parse_phandle_with_args(dev->of_node, "dmas", "#dma-cells",
> + index, &dma_spec))
> + return -ENOENT;
> +
> + thread_id = dma_spec.args[0];
> +
> + dst_ep_config = psil_get_ep_config(thread_id);
> + if (IS_ERR(dst_ep_config)) {
> + pr_err("PSIL: thread ID 0x%04x not defined in map\n",
> + thread_id);
> + of_node_put(dma_spec.np);
> + return PTR_ERR(dst_ep_config);
> + }
> +
> + memcpy(dst_ep_config, ep_config, sizeof(*dst_ep_config));
> +
> + of_node_put(dma_spec.np);
> + return 0;
> +}
> +EXPORT_SYMBOL(psil_set_new_ep_config);
> +
> +MODULE_DESCRIPTION("TI K3 PSI-L endpoint database");
> +MODULE_AUTHOR("Peter Ujfalusi <[email protected]>");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/linux/dma/k3-psil.h b/include/linux/dma/k3-psil.h
> new file mode 100644
> index 000000000000..16e9c8c6f839
> --- /dev/null
> +++ b/include/linux/dma/k3-psil.h
> @@ -0,0 +1,47 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + */
> +
> +#ifndef K3_PSIL_H_
> +#define K3_PSIL_H_
> +
> +#include <linux/types.h>
> +
> +#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000
> +
> +struct device;
> +
> +/* Channel Throughput Levels */
> +enum udma_tp_level {
> + UDMA_TP_NORMAL = 0,
> + UDMA_TP_HIGH = 1,
> + UDMA_TP_ULTRAHIGH = 2,
> + UDMA_TP_LAST,
> +};
> +
> +enum psil_endpoint_type {
> + PSIL_EP_NATIVE = 0,
> + PSIL_EP_PDMA_XY,
> + PSIL_EP_PDMA_MCAN,
> + PSIL_EP_PDMA_AASRC,
> +};
> +
> +struct psil_endpoint_config {
> + enum psil_endpoint_type ep_type;
> +
> + unsigned pkt_mode:1;
> + unsigned notdpkt:1;
> + unsigned needs_epib:1;
> + u32 psd_size;
> + enum udma_tp_level channel_tpl;
> +
> + /* PDMA properties, valid for PSIL_EP_PDMA_* */
> + unsigned pdma_acc32:1;
> + unsigned pdma_burst:1;
> +};
> +
> +int psil_set_new_ep_config(struct device *dev, const char *name,
> + struct psil_endpoint_config *ep_config);
> +
> +#endif /* K3_PSIL_H_ */
>


2019-11-05 08:16:34

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [PATCH v4 07/15] dmaengine: ti: k3 PSI-L remote endpoint configuration



On 05/11/2019 9.49, Tero Kristo wrote:
> On 01/11/2019 10:41, Peter Ujfalusi wrote:
>> In K3 architecture the DMA operates within threads. One end of the thread
>> is UDMAP, the other is on the peripheral side.
>>
>> The UDMAP channel configuration depends on the needs of the remote
>> endpoint and it can be differ from peripheral to peripheral.
>>
>> This patch adds database for am654 and j721e and small API to fetch the
>> PSI-L endpoint configuration from the database which should only used by
>> the DMA driver(s).
>>
>> Another API is added for native peripherals to give possibility to
>> pass new
>> configuration for the threads they are using, which is needed to be
>> able to
>> handle changes caused by different firmware loaded for the peripheral for
>> example.
>>
>> Signed-off-by: Peter Ujfalusi <[email protected]>
>> ---
>>   drivers/dma/ti/Kconfig         |   3 +
>>   drivers/dma/ti/Makefile        |   1 +
>>   drivers/dma/ti/k3-psil-am654.c | 172 ++++++++++++++++++++++++++
>>   drivers/dma/ti/k3-psil-j721e.c | 219 +++++++++++++++++++++++++++++++++
>>   drivers/dma/ti/k3-psil-priv.h  |  39 ++++++
>>   drivers/dma/ti/k3-psil.c       |  97 +++++++++++++++
>>   include/linux/dma/k3-psil.h    |  47 +++++++
>>   7 files changed, 578 insertions(+)
>>   create mode 100644 drivers/dma/ti/k3-psil-am654.c
>>   create mode 100644 drivers/dma/ti/k3-psil-j721e.c
>>   create mode 100644 drivers/dma/ti/k3-psil-priv.h
>>   create mode 100644 drivers/dma/ti/k3-psil.c
>>   create mode 100644 include/linux/dma/k3-psil.h

...

>> diff --git a/drivers/dma/ti/k3-psil.c b/drivers/dma/ti/k3-psil.c
>> new file mode 100644
>> index 000000000000..e610022f09f4
>> --- /dev/null
>> +++ b/drivers/dma/ti/k3-psil.c
>> @@ -0,0 +1,97 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + *  Copyright (C) 2019 Texas Instruments Incorporated -
>> http://www.ti.com
>> + *  Author: Peter Ujfalusi <[email protected]>
>> + */
>> +
>> +#include <linux/kernel.h>
>> +#include <linux/device.h>
>> +#include <linux/module.h>
>> +#include <linux/mutex.h>
>> +#include <linux/of.h>
>> +
>> +#include "k3-psil-priv.h"
>> +
>> +extern struct psil_ep_map am654_ep_map;
>> +extern struct psil_ep_map j721e_ep_map;
>> +
>> +static DEFINE_MUTEX(ep_map_mutex);
>> +static struct psil_ep_map *soc_ep_map;
>
> So, you are only protecting the high level soc_ep_map pointer. You
> don't need to protect the database itself via some use counting or
> similar, or are you doing that within the DMA driver?

That's correct, I protect only the soc_ep_map.
The DMA drivers can look up threads concurrently; I just need to make
sure that the soc_ep_map is configured when the first
psil_get_ep_config() call comes.
After that the DMA drivers are free to look up things.

The ep_config update will be coming from the DMA client driver(s) and
not from the DMA driver. The client driver knows how its PSI-L
endpoint is configured, so it can update the default configuration
_before_ it requests a DMA channel.
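Just to illustrate (untested sketch, the function name and the config
values are made up), a native client could do something like this before
requesting its channel:

#include <linux/err.h>
#include <linux/dma/k3-psil.h>
#include <linux/dmaengine.h>

/* Hypothetical client: adjust the "rx" thread's default configuration
 * to match the firmware currently loaded into the peripheral, then
 * request the channel as usual.
 */
static int foo_setup_rx_dma(struct device *dev)
{
	struct psil_endpoint_config ep_config = {
		.ep_type = PSIL_EP_NATIVE,
		.pkt_mode = 1,
		.needs_epib = 1,
		.psd_size = 64,		/* made up value */
	};
	struct dma_chan *chan;
	int ret;

	ret = psil_set_new_ep_config(dev, "rx", &ep_config);
	if (ret)
		return ret;

	chan = dma_request_chan(dev, "rx");
	if (IS_ERR(chan))
		return PTR_ERR(chan);

	/* a real driver would keep and use the channel */
	dma_release_channel(chan);
	return 0;
}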

>
> -Tero
>
>> +
>> +struct psil_endpoint_config *psil_get_ep_config(u32 thread_id)
>> +{
>> +    int i;
>> +
>> +    mutex_lock(&ep_map_mutex);
>> +    if (!soc_ep_map) {
>> +        if (of_machine_is_compatible("ti,am654")) {
>> +            soc_ep_map = &am654_ep_map;
>> +        } else if (of_machine_is_compatible("ti,j721e")) {
>> +            soc_ep_map = &j721e_ep_map;
>> +        } else {
>> +            pr_err("PSIL: No compatible machine found for map\n");
>> +            return ERR_PTR(-ENOTSUPP);
>> +        }
>> +        pr_debug("%s: Using map for %s\n", __func__, soc_ep_map->name);
>> +    }
>> +    mutex_unlock(&ep_map_mutex);
>> +
>> +    if (thread_id & K3_PSIL_DST_THREAD_ID_OFFSET && soc_ep_map->dst) {
>> +        /* check in destination thread map */
>> +        for (i = 0; i < soc_ep_map->dst_count; i++) {
>> +            if (soc_ep_map->dst[i].thread_id == thread_id)
>> +                return &soc_ep_map->dst[i].ep_config;
>> +        }
>> +    }
>> +
>> +    thread_id &= ~K3_PSIL_DST_THREAD_ID_OFFSET;
>> +    if (soc_ep_map->src) {
>> +        for (i = 0; i < soc_ep_map->src_count; i++) {
>> +            if (soc_ep_map->src[i].thread_id == thread_id)
>> +                return &soc_ep_map->src[i].ep_config;
>> +        }
>> +    }
>> +
>> +    return ERR_PTR(-ENOENT);
>> +}
>> +EXPORT_SYMBOL(psil_get_ep_config);
>> +
>> +int psil_set_new_ep_config(struct device *dev, const char *name,
>> +               struct psil_endpoint_config *ep_config)
>> +{
>> +    struct psil_endpoint_config *dst_ep_config;
>> +    struct of_phandle_args dma_spec;
>> +    u32 thread_id;
>> +    int index;
>> +
>> +    if (!dev || !dev->of_node)
>> +        return -EINVAL;
>> +
>> +    index = of_property_match_string(dev->of_node, "dma-names", name);
>> +    if (index < 0)
>> +        return index;
>> +
>> +    if (of_parse_phandle_with_args(dev->of_node, "dmas", "#dma-cells",
>> +                       index, &dma_spec))
>> +        return -ENOENT;
>> +
>> +    thread_id = dma_spec.args[0];
>> +
>> +    dst_ep_config = psil_get_ep_config(thread_id);
>> +    if (IS_ERR(dst_ep_config)) {
>> +        pr_err("PSIL: thread ID 0x%04x not defined in map\n",
>> +               thread_id);
>> +        of_node_put(dma_spec.np);
>> +        return PTR_ERR(dst_ep_config);
>> +    }
>> +
>> +    memcpy(dst_ep_config, ep_config, sizeof(*dst_ep_config));
>> +
>> +    of_node_put(dma_spec.np);
>> +    return 0;
>> +}
>> +EXPORT_SYMBOL(psil_set_new_ep_config);
>> +
>> +MODULE_DESCRIPTION("TI K3 PSI-L endpoint database");
>> +MODULE_AUTHOR("Peter Ujfalusi <[email protected]>");
>> +MODULE_LICENSE("GPL v2");
>> diff --git a/include/linux/dma/k3-psil.h b/include/linux/dma/k3-psil.h
>> new file mode 100644
>> index 000000000000..16e9c8c6f839
>> --- /dev/null
>> +++ b/include/linux/dma/k3-psil.h
>> @@ -0,0 +1,47 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + *  Copyright (C) 2019 Texas Instruments Incorporated -
>> http://www.ti.com
>> + */
>> +
>> +#ifndef K3_PSIL_H_
>> +#define K3_PSIL_H_
>> +
>> +#include <linux/types.h>
>> +
>> +#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000
>> +
>> +struct device;
>> +
>> +/* Channel Throughput Levels */
>> +enum udma_tp_level {
>> +    UDMA_TP_NORMAL = 0,
>> +    UDMA_TP_HIGH = 1,
>> +    UDMA_TP_ULTRAHIGH = 2,
>> +    UDMA_TP_LAST,
>> +};
>> +
>> +enum psil_endpoint_type {
>> +    PSIL_EP_NATIVE = 0,
>> +    PSIL_EP_PDMA_XY,
>> +    PSIL_EP_PDMA_MCAN,
>> +    PSIL_EP_PDMA_AASRC,
>> +};
>> +
>> +struct psil_endpoint_config {
>> +    enum psil_endpoint_type ep_type;
>> +
>> +    unsigned pkt_mode:1;
>> +    unsigned notdpkt:1;
>> +    unsigned needs_epib:1;
>> +    u32 psd_size;
>> +    enum udma_tp_level channel_tpl;
>> +
>> +    /* PDMA properties, valid for PSIL_EP_PDMA_* */
>> +    unsigned pdma_acc32:1;
>> +    unsigned pdma_burst:1;
>> +};
>> +
>> +int psil_set_new_ep_config(struct device *dev, const char *name,
>> +               struct psil_endpoint_config *ep_config);
>> +
>> +#endif /* K3_PSIL_H_ */
>>
>

- Péter


2019-11-05 10:04:46

by Grygorii Strashko

[permalink] [raw]
Subject: Re: [PATCH v4 07/15] dmaengine: ti: k3 PSI-L remote endpoint configuration

Hi Peter,

On 01/11/2019 10:41, Peter Ujfalusi wrote:
> In K3 architecture the DMA operates within threads. One end of the thread
> is UDMAP, the other is on the peripheral side.
>
> The UDMAP channel configuration depends on the needs of the remote
> endpoint and it can be differ from peripheral to peripheral.
>
> This patch adds database for am654 and j721e and small API to fetch the
> PSI-L endpoint configuration from the database which should only used by
> the DMA driver(s).
>
> Another API is added for native peripherals to give possibility to pass new
> configuration for the threads they are using, which is needed to be able to
> handle changes caused by different firmware loaded for the peripheral for
> example.

I have no objection to this approach, but ...

>
> Signed-off-by: Peter Ujfalusi <[email protected]>
> ---
> drivers/dma/ti/Kconfig | 3 +
> drivers/dma/ti/Makefile | 1 +
> drivers/dma/ti/k3-psil-am654.c | 172 ++++++++++++++++++++++++++
> drivers/dma/ti/k3-psil-j721e.c | 219 +++++++++++++++++++++++++++++++++
> drivers/dma/ti/k3-psil-priv.h | 39 ++++++
> drivers/dma/ti/k3-psil.c | 97 +++++++++++++++
> include/linux/dma/k3-psil.h | 47 +++++++
> 7 files changed, 578 insertions(+)
> create mode 100644 drivers/dma/ti/k3-psil-am654.c
> create mode 100644 drivers/dma/ti/k3-psil-j721e.c
> create mode 100644 drivers/dma/ti/k3-psil-priv.h
> create mode 100644 drivers/dma/ti/k3-psil.c
> create mode 100644 include/linux/dma/k3-psil.h
>

[...]

> diff --git a/include/linux/dma/k3-psil.h b/include/linux/dma/k3-psil.h
> new file mode 100644
> index 000000000000..16e9c8c6f839
> --- /dev/null
> +++ b/include/linux/dma/k3-psil.h
> @@ -0,0 +1,47 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com
> + */
> +
> +#ifndef K3_PSIL_H_
> +#define K3_PSIL_H_
> +
> +#include <linux/types.h>
> +
> +#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000
> +
> +struct device;
> +
> +/* Channel Throughput Levels */
> +enum udma_tp_level {
> + UDMA_TP_NORMAL = 0,
> + UDMA_TP_HIGH = 1,
> + UDMA_TP_ULTRAHIGH = 2,
> + UDMA_TP_LAST,
> +};
> +
> +enum psil_endpoint_type {
> + PSIL_EP_NATIVE = 0,
> + PSIL_EP_PDMA_XY,
> + PSIL_EP_PDMA_MCAN,
> + PSIL_EP_PDMA_AASRC,
> +};
> +
> +struct psil_endpoint_config {
> + enum psil_endpoint_type ep_type;
> +
> + unsigned pkt_mode:1;
> + unsigned notdpkt:1;
> + unsigned needs_epib:1;
> + u32 psd_size;
> + enum udma_tp_level channel_tpl;
> +
> + /* PDMA properties, valid for PSIL_EP_PDMA_* */
> + unsigned pdma_acc32:1;
> + unsigned pdma_burst:1;
> +};
> +
> +int psil_set_new_ep_config(struct device *dev, const char *name,
> + struct psil_endpoint_config *ep_config);
> +
> +#endif /* K3_PSIL_H_ */
>

I see no user of this public interface right now, so I think it is better
to drop it until there is a real user for it.

--
Best regards,
grygorii

2019-11-05 10:29:23

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [PATCH v4 07/15] dmaengine: ti: k3 PSI-L remote endpoint configuration



On 05/11/2019 12.00, Grygorii Strashko wrote:
> Hi Peter,
>
> On 01/11/2019 10:41, Peter Ujfalusi wrote:
>> In K3 architecture the DMA operates within threads. One end of the thread
>> is UDMAP, the other is on the peripheral side.
>>
>> The UDMAP channel configuration depends on the needs of the remote
>> endpoint and it can be differ from peripheral to peripheral.
>>
>> This patch adds database for am654 and j721e and small API to fetch the
>> PSI-L endpoint configuration from the database which should only used by
>> the DMA driver(s).
>>
>> Another API is added for native peripherals to give possibility to
>> pass new
>> configuration for the threads they are using, which is needed to be
>> able to
>> handle changes caused by different firmware loaded for the peripheral for
>> example.
>
> I have no objection to this approach, but ...
>
>>
>> Signed-off-by: Peter Ujfalusi <[email protected]>
>> ---
>>   drivers/dma/ti/Kconfig         |   3 +
>>   drivers/dma/ti/Makefile        |   1 +
>>   drivers/dma/ti/k3-psil-am654.c | 172 ++++++++++++++++++++++++++
>>   drivers/dma/ti/k3-psil-j721e.c | 219 +++++++++++++++++++++++++++++++++
>>   drivers/dma/ti/k3-psil-priv.h  |  39 ++++++
>>   drivers/dma/ti/k3-psil.c       |  97 +++++++++++++++
>>   include/linux/dma/k3-psil.h    |  47 +++++++
>>   7 files changed, 578 insertions(+)
>>   create mode 100644 drivers/dma/ti/k3-psil-am654.c
>>   create mode 100644 drivers/dma/ti/k3-psil-j721e.c
>>   create mode 100644 drivers/dma/ti/k3-psil-priv.h
>>   create mode 100644 drivers/dma/ti/k3-psil.c
>>   create mode 100644 include/linux/dma/k3-psil.h
>>
>
> [...]
>
>> diff --git a/include/linux/dma/k3-psil.h b/include/linux/dma/k3-psil.h
>> new file mode 100644
>> index 000000000000..16e9c8c6f839
>> --- /dev/null
>> +++ b/include/linux/dma/k3-psil.h
>> @@ -0,0 +1,47 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + *  Copyright (C) 2019 Texas Instruments Incorporated -
>> http://www.ti.com
>> + */
>> +
>> +#ifndef K3_PSIL_H_
>> +#define K3_PSIL_H_
>> +
>> +#include <linux/types.h>
>> +
>> +#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000
>> +
>> +struct device;
>> +
>> +/* Channel Throughput Levels */
>> +enum udma_tp_level {
>> +    UDMA_TP_NORMAL = 0,
>> +    UDMA_TP_HIGH = 1,
>> +    UDMA_TP_ULTRAHIGH = 2,
>> +    UDMA_TP_LAST,
>> +};
>> +
>> +enum psil_endpoint_type {
>> +    PSIL_EP_NATIVE = 0,
>> +    PSIL_EP_PDMA_XY,
>> +    PSIL_EP_PDMA_MCAN,
>> +    PSIL_EP_PDMA_AASRC,
>> +};
>> +
>> +struct psil_endpoint_config {
>> +    enum psil_endpoint_type ep_type;
>> +
>> +    unsigned pkt_mode:1;
>> +    unsigned notdpkt:1;
>> +    unsigned needs_epib:1;
>> +    u32 psd_size;
>> +    enum udma_tp_level channel_tpl;
>> +
>> +    /* PDMA properties, valid for PSIL_EP_PDMA_* */
>> +    unsigned pdma_acc32:1;
>> +    unsigned pdma_burst:1;
>> +};
>> +
>> +int psil_set_new_ep_config(struct device *dev, const char *name,
>> +               struct psil_endpoint_config *ep_config);
>> +
>> +#endif /* K3_PSIL_H_ */
>>
>
> I see no user of this public interface right now, so I think it is better
> to drop it until there is a real user for it.

The same argument is valid for the glue layer ;)

This is only going to be used by native PSI-L devices, and
psil_endpoint_config is going to be extended to facilitate their needs
and give information to the DMA driver on how to set things up.

I would rather add the support from the start and avoid the churn later on.

The point is that the PSI-L endpoint configuration is part of the PSI-L
peripheral and, depending on various factors, these configurations might
differ from the default one. For example if we want to merge the two
physical rx channels for sa2ul (so they use the same rflow), or other
things we (I) can not foresee yet.
Or if a different firmware is loaded for them and it affects their PSI-L
configuration.

- Péter


2019-11-05 11:29:46

by Grygorii Strashko

[permalink] [raw]
Subject: Re: [PATCH v4 07/15] dmaengine: ti: k3 PSI-L remote endpoint configuration



On 05/11/2019 12:27, Peter Ujfalusi wrote:
>
>
> On 05/11/2019 12.00, Grygorii Strashko wrote:
>> Hi Peter,
>>
>> On 01/11/2019 10:41, Peter Ujfalusi wrote:
>>> In K3 architecture the DMA operates within threads. One end of the thread
>>> is UDMAP, the other is on the peripheral side.
>>>
>>> The UDMAP channel configuration depends on the needs of the remote
>>> endpoint and it can be differ from peripheral to peripheral.
>>>
>>> This patch adds database for am654 and j721e and small API to fetch the
>>> PSI-L endpoint configuration from the database which should only used by
>>> the DMA driver(s).
>>>
>>> Another API is added for native peripherals to give possibility to
>>> pass new
>>> configuration for the threads they are using, which is needed to be
>>> able to
>>> handle changes caused by different firmware loaded for the peripheral for
>>> example.
>>
>> I have no objection to this approach, but ...
>>
>>>
>>> Signed-off-by: Peter Ujfalusi <[email protected]>
>>> ---
>>>   drivers/dma/ti/Kconfig         |   3 +
>>>   drivers/dma/ti/Makefile        |   1 +
>>>   drivers/dma/ti/k3-psil-am654.c | 172 ++++++++++++++++++++++++++
>>>   drivers/dma/ti/k3-psil-j721e.c | 219 +++++++++++++++++++++++++++++++++
>>>   drivers/dma/ti/k3-psil-priv.h  |  39 ++++++
>>>   drivers/dma/ti/k3-psil.c       |  97 +++++++++++++++
>>>   include/linux/dma/k3-psil.h    |  47 +++++++
>>>   7 files changed, 578 insertions(+)
>>>   create mode 100644 drivers/dma/ti/k3-psil-am654.c
>>>   create mode 100644 drivers/dma/ti/k3-psil-j721e.c
>>>   create mode 100644 drivers/dma/ti/k3-psil-priv.h
>>>   create mode 100644 drivers/dma/ti/k3-psil.c
>>>   create mode 100644 include/linux/dma/k3-psil.h
>>>
>>
>> [...]
>>
>>> diff --git a/include/linux/dma/k3-psil.h b/include/linux/dma/k3-psil.h
>>> new file mode 100644
>>> index 000000000000..16e9c8c6f839
>>> --- /dev/null
>>> +++ b/include/linux/dma/k3-psil.h
>>> @@ -0,0 +1,47 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> +/*
>>> + *  Copyright (C) 2019 Texas Instruments Incorporated -
>>> http://www.ti.com
>>> + */
>>> +
>>> +#ifndef K3_PSIL_H_
>>> +#define K3_PSIL_H_
>>> +
>>> +#include <linux/types.h>
>>> +
>>> +#define K3_PSIL_DST_THREAD_ID_OFFSET 0x8000
>>> +
>>> +struct device;
>>> +
>>> +/* Channel Throughput Levels */
>>> +enum udma_tp_level {
>>> +    UDMA_TP_NORMAL = 0,
>>> +    UDMA_TP_HIGH = 1,
>>> +    UDMA_TP_ULTRAHIGH = 2,
>>> +    UDMA_TP_LAST,
>>> +};
>>> +
>>> +enum psil_endpoint_type {
>>> +    PSIL_EP_NATIVE = 0,
>>> +    PSIL_EP_PDMA_XY,
>>> +    PSIL_EP_PDMA_MCAN,
>>> +    PSIL_EP_PDMA_AASRC,
>>> +};
>>> +
>>> +struct psil_endpoint_config {
>>> +    enum psil_endpoint_type ep_type;
>>> +
>>> +    unsigned pkt_mode:1;
>>> +    unsigned notdpkt:1;
>>> +    unsigned needs_epib:1;
>>> +    u32 psd_size;
>>> +    enum udma_tp_level channel_tpl;
>>> +
>>> +    /* PDMA properties, valid for PSIL_EP_PDMA_* */
>>> +    unsigned pdma_acc32:1;
>>> +    unsigned pdma_burst:1;
>>> +};
>>> +
>>> +int psil_set_new_ep_config(struct device *dev, const char *name,
>>> +               struct psil_endpoint_config *ep_config);
>>> +
>>> +#endif /* K3_PSIL_H_ */
>>>
>>
>> I see no user of this public interface right now, so I think it is better
>> to drop it until there is a real user for it.
>
> The same argument is valid for the glue layer ;)
>
> This is only going to be used by native PSI-L devices and the
> psil_endpoint_config is going to be extended to facilitate their needs
> to give information to the DMA driver on how to set things up.
>
> I would rather avoid churn later on than adding the support from the start.
>
> The point is that the PSI-L endpoint configuration is part of the PSI-L
> peripheral and based on factors these configurations might differ from
> the default one. For example if we want to merge the two physical rx
> channel for sa2ul (so they use the same rflow) or other things we (I)
> can not foresee yet.
> Or if different firmware is loaded for them and it affects their PSI-L
> configuration.

Ok. It's up to you.

otherwise:
Reviewed-by: Grygorii Strashko <[email protected]>

--
Best regards,
grygorii

2019-11-11 04:22:39

by Vinod Koul

[permalink] [raw]
Subject: Re: [PATCH v4 02/15] soc: ti: k3: add navss ringacc driver

On 01-11-19, 10:41, Peter Ujfalusi wrote:
> From: Grygorii Strashko <[email protected]>

> +config TI_K3_RINGACC
> + tristate "K3 Ring accelerator Sub System"
> + depends on ARCH_K3 || COMPILE_TEST
> + depends on TI_SCI_INTA_IRQCHIP
> + default y

You want to get an earful from Linus? We don't do default y on new stuff,
never :)

> +struct k3_ring_rt_regs {
> + u32 resv_16[4];
> + u32 db; /* RT Ring N Doorbell Register */
> + u32 resv_4[1];
> + u32 occ; /* RT Ring N Occupancy Register */
> + u32 indx; /* RT Ring N Current Index Register */
> + u32 hwocc; /* RT Ring N Hardware Occupancy Register */
> + u32 hwindx; /* RT Ring N Current Index Register */

nice comments, how about moving them up into kernel-doc style? (here and
other places as well)


> +struct k3_ring *k3_ringacc_request_ring(struct k3_ringacc *ringacc,
> + int id, u32 flags)
> +{
> + int proxy_id = K3_RINGACC_PROXY_NOT_USED;
> +
> + mutex_lock(&ringacc->req_lock);
> +
> + if (id == K3_RINGACC_RING_ID_ANY) {
> + /* Request for any general purpose ring */
> + struct ti_sci_resource_desc *gp_rings =
> + &ringacc->rm_gp_range->desc[0];
> + unsigned long size;
> +
> + size = gp_rings->start + gp_rings->num;
> + id = find_next_zero_bit(ringacc->rings_inuse, size,
> + gp_rings->start);
> + if (id == size)
> + goto error;
> + } else if (id < 0) {
> + goto error;
> + }
> +
> + if (test_bit(id, ringacc->rings_inuse) &&
> + !(ringacc->rings[id].flags & K3_RING_FLAG_SHARED))
> + goto error;
> + else if (ringacc->rings[id].flags & K3_RING_FLAG_SHARED)
> + goto out;
> +
> + if (flags & K3_RINGACC_RING_USE_PROXY) {
> + proxy_id = find_next_zero_bit(ringacc->proxy_inuse,
> + ringacc->num_proxies, 0);
> + if (proxy_id == ringacc->num_proxies)
> + goto error;
> + }
> +
> + if (!try_module_get(ringacc->dev->driver->owner))
> + goto error;

should this not be one of the first things to do?

> +
> + if (proxy_id != K3_RINGACC_PROXY_NOT_USED) {
> + set_bit(proxy_id, ringacc->proxy_inuse);
> + ringacc->rings[id].proxy_id = proxy_id;
> + dev_dbg(ringacc->dev, "Giving ring#%d proxy#%d\n", id,
> + proxy_id);
> + } else {
> + dev_dbg(ringacc->dev, "Giving ring#%d\n", id);
> + }

How about removing the else and doing a common print?

--
~Vinod

2019-11-11 04:43:58

by Vinod Koul

[permalink] [raw]
Subject: Re: [PATCH v4 05/15] dmaengine: Add support for reporting DMA cached data amount

On 01-11-19, 10:41, Peter Ujfalusi wrote:
> A DMA hardware can have big cache or FIFO and the amount of data sitting in
> the DMA fabric can be an interest for the clients.
>
> For example in audio we want to know the delay in the data flow and in case
> the DMA have significantly large FIFO/cache, it can affect the latenc/delay
>
> Signed-off-by: Peter Ujfalusi <[email protected]>
> Reviewed-by: Tero Kristo <[email protected]>
> ---
> drivers/dma/dmaengine.h | 8 ++++++++
> include/linux/dmaengine.h | 2 ++
> 2 files changed, 10 insertions(+)
>
> diff --git a/drivers/dma/dmaengine.h b/drivers/dma/dmaengine.h
> index 501c0b063f85..b0b97475707a 100644
> --- a/drivers/dma/dmaengine.h
> +++ b/drivers/dma/dmaengine.h
> @@ -77,6 +77,7 @@ static inline enum dma_status dma_cookie_status(struct dma_chan *chan,
> state->last = complete;
> state->used = used;
> state->residue = 0;
> + state->in_flight_bytes = 0;
> }
> return dma_async_is_complete(cookie, complete, used);
> }
> @@ -87,6 +88,13 @@ static inline void dma_set_residue(struct dma_tx_state *state, u32 residue)
> state->residue = residue;
> }
>
> +static inline void dma_set_in_flight_bytes(struct dma_tx_state *state,
> + u32 in_flight_bytes)
> +{
> + if (state)
> + state->in_flight_bytes = in_flight_bytes;
> +}
> +
> struct dmaengine_desc_callback {
> dma_async_tx_callback callback;
> dma_async_tx_callback_result callback_result;
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 0e8b426bbde9..c4c5219030a6 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -682,11 +682,13 @@ static inline struct dma_async_tx_descriptor *txd_next(struct dma_async_tx_descr
> * @residue: the remaining number of bytes left to transmit
> * on the selected transfer for states DMA_IN_PROGRESS and
> * DMA_PAUSED if this is implemented in the driver, else 0
> + * @in_flight_bytes: amount of data in bytes cached by the DMA.
> */
> struct dma_tx_state {
> dma_cookie_t last;
> dma_cookie_t used;
> u32 residue;
> + u32 in_flight_bytes;

Should we add this here or use the dmaengine_result()

--
~Vinod

2019-11-11 04:48:14

by Vinod Koul

[permalink] [raw]
Subject: Re: [PATCH v4 07/15] dmaengine: ti: k3 PSI-L remote endpoint configuration

On 01-11-19, 10:41, Peter Ujfalusi wrote:

> --- /dev/null
> +++ b/drivers/dma/ti/k3-psil.c
> @@ -0,0 +1,97 @@
> +// SPDX-License-Identifier: GPL-2.0

...

> +extern struct psil_ep_map am654_ep_map;
> +extern struct psil_ep_map j721e_ep_map;
> +
> +static DEFINE_MUTEX(ep_map_mutex);
> +static struct psil_ep_map *soc_ep_map;
> +
> +struct psil_endpoint_config *psil_get_ep_config(u32 thread_id)
> +{
> + int i;
> +
> + mutex_lock(&ep_map_mutex);
> + if (!soc_ep_map) {
> + if (of_machine_is_compatible("ti,am654")) {
> + soc_ep_map = &am654_ep_map;
> + } else if (of_machine_is_compatible("ti,j721e")) {
> + soc_ep_map = &j721e_ep_map;
> + } else {
> + pr_err("PSIL: No compatible machine found for map\n");
> + return ERR_PTR(-ENOTSUPP);
> + }
> + pr_debug("%s: Using map for %s\n", __func__, soc_ep_map->name);
> + }
> + mutex_unlock(&ep_map_mutex);
> +
> + if (thread_id & K3_PSIL_DST_THREAD_ID_OFFSET && soc_ep_map->dst) {
> + /* check in destination thread map */
> + for (i = 0; i < soc_ep_map->dst_count; i++) {
> + if (soc_ep_map->dst[i].thread_id == thread_id)
> + return &soc_ep_map->dst[i].ep_config;
> + }
> + }
> +
> + thread_id &= ~K3_PSIL_DST_THREAD_ID_OFFSET;
> + if (soc_ep_map->src) {
> + for (i = 0; i < soc_ep_map->src_count; i++) {
> + if (soc_ep_map->src[i].thread_id == thread_id)
> + return &soc_ep_map->src[i].ep_config;
> + }
> + }
> +
> + return ERR_PTR(-ENOENT);
> +}
> +EXPORT_SYMBOL(psil_get_ep_config);

This doesn't match the license of this module, we need it to be
EXPORT_SYMBOL_GPL
--
~Vinod

2019-11-11 05:34:36

by Vinod Koul

[permalink] [raw]
Subject: Re: [PATCH v4 10/15] dmaengine: ti: New driver for K3 UDMA - split#2: probe/remove, xlate and filter_fn

On 01-11-19, 10:41, Peter Ujfalusi wrote:

> +static bool udma_dma_filter_fn(struct dma_chan *chan, void *param)
> +{
> + struct psil_endpoint_config *ep_config;
> + struct udma_chan *uc;
> + struct udma_dev *ud;
> + u32 *args;
> +
> + if (chan->device->dev->driver != &udma_driver.driver)
> + return false;
> +
> + uc = to_udma_chan(chan);
> + ud = uc->ud;
> + args = param;
> + uc->remote_thread_id = args[0];
> +
> + if (uc->remote_thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)
> + uc->dir = DMA_MEM_TO_DEV;
> + else
> + uc->dir = DMA_DEV_TO_MEM;

Can you explain this a bit?

> +static int udma_remove(struct platform_device *pdev)
> +{
> + struct udma_dev *ud = platform_get_drvdata(pdev);
> +
> + of_dma_controller_free(pdev->dev.of_node);
> + dma_async_device_unregister(&ud->ddev);
> +
> + /* Make sure that we did proper cleanup */
> + cancel_work_sync(&ud->purge_work);
> + udma_purge_desc_work(&ud->purge_work);

kill the vchan tasklets at it too please
--
~Vinod

2019-11-11 06:08:01

by Vinod Koul

[permalink] [raw]
Subject: Re: [PATCH v4 11/15] dmaengine: ti: New driver for K3 UDMA - split#3: alloc/free chan_resources

On 01-11-19, 10:41, Peter Ujfalusi wrote:
> Split patch for review containing: channel rsource allocation and free

s/rsource/resource


> +static int udma_tisci_tx_channel_config(struct udma_chan *uc)
> +{
> + struct udma_dev *ud = uc->ud;
> + struct udma_tisci_rm *tisci_rm = &ud->tisci_rm;
> + const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
> + struct udma_tchan *tchan = uc->tchan;
> + int tc_ring = k3_ringacc_get_ring_id(tchan->tc_ring);
> + struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
> + u32 mode, fetch_size;
> + int ret = 0;
> +
> + if (uc->pkt_mode) {
> + mode = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
> + fetch_size = cppi5_hdesc_calc_size(uc->needs_epib, uc->psd_size,
> + 0);
> + } else {
> + mode = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_PBRR;
> + fetch_size = sizeof(struct cppi5_desc_hdr_t);
> + }
> +
> + req_tx.valid_params =
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_EINFO_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_PSWORDS_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_SUPR_TDPKT_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID;

A bunch of these are repeated; you can define a COMMON_VALID_PARAMS and use
that plus the specific ones.
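E.g. something along these lines (just a sketch, the macro name is made
up):

#define UDMA_TISCI_CH_COMMON_VALID_PARAMS (			\
	TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |	\
	TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |		\
	TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |		\
	TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID)

	/* the tx side would then only add its specific flags */
	req_tx.valid_params = UDMA_TISCI_CH_COMMON_VALID_PARAMS |
		TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_EINFO_VALID |
		TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_PSWORDS_VALID |
		TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_SUPR_TDPKT_VALID;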

> +
> + req_tx.nav_id = tisci_rm->tisci_dev_id;
> + req_tx.index = tchan->id;
> + req_tx.tx_pause_on_err = 0;
> + req_tx.tx_filt_einfo = 0;
> + req_tx.tx_filt_pswords = 0;

I think the initialization to 0 is superfluous.

> + req_tx.tx_chan_type = mode;
> + req_tx.tx_supr_tdpkt = uc->notdpkt;
> + req_tx.tx_fetch_size = fetch_size >> 2;
> + req_tx.txcq_qnum = tc_ring;
> + if (uc->ep_type == PSIL_EP_PDMA_XY) {
> + /* wait for peer to complete the teardown for PDMAs */
> + req_tx.valid_params |=
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_TDTYPE_VALID;
> + req_tx.tx_tdtype = 1;
> + }
> +
> + ret = tisci_ops->tx_ch_cfg(tisci_rm->tisci, &req_tx);
> + if (ret)
> + dev_err(ud->dev, "tchan%d cfg failed %d\n", tchan->id, ret);
> +
> + return ret;
> +}
> +
> +static int udma_tisci_rx_channel_config(struct udma_chan *uc)
> +{
> + struct udma_dev *ud = uc->ud;
> + struct udma_tisci_rm *tisci_rm = &ud->tisci_rm;
> + const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
> + struct udma_rchan *rchan = uc->rchan;
> + int fd_ring = k3_ringacc_get_ring_id(rchan->fd_ring);
> + int rx_ring = k3_ringacc_get_ring_id(rchan->r_ring);
> + struct ti_sci_msg_rm_udmap_rx_ch_cfg req_rx = { 0 };
> + struct ti_sci_msg_rm_udmap_flow_cfg flow_req = { 0 };
> + u32 mode, fetch_size;
> + int ret = 0;
> +
> + if (uc->pkt_mode) {
> + mode = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
> + fetch_size = cppi5_hdesc_calc_size(uc->needs_epib,
> + uc->psd_size, 0);
> + } else {
> + mode = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_PBRR;
> + fetch_size = sizeof(struct cppi5_desc_hdr_t);
> + }
> +
> + req_rx.valid_params =
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_IGNORE_SHORT_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_IGNORE_LONG_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_START_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_CNT_VALID;
> +
> + req_rx.nav_id = tisci_rm->tisci_dev_id;
> + req_rx.index = rchan->id;
> + req_rx.rx_fetch_size = fetch_size >> 2;
> + req_rx.rxcq_qnum = rx_ring;
> + req_rx.rx_pause_on_err = 0;
> + req_rx.rx_chan_type = mode;
> + req_rx.rx_ignore_short = 0;
> + req_rx.rx_ignore_long = 0;
> + req_rx.flowid_start = 0;
> + req_rx.flowid_cnt = 0;
> +
> + ret = tisci_ops->rx_ch_cfg(tisci_rm->tisci, &req_rx);
> + if (ret) {
> + dev_err(ud->dev, "rchan%d cfg failed %d\n", rchan->id, ret);
> + return ret;
> + }
> +
> + flow_req.valid_params =
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_EINFO_PRESENT_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_PSINFO_PRESENT_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_ERROR_HANDLING_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DESC_TYPE_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_QNUM_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_SRC_TAG_HI_SEL_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_SRC_TAG_LO_SEL_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_TAG_HI_SEL_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_TAG_LO_SEL_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ0_SZ0_QNUM_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ1_QNUM_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ2_QNUM_VALID |
> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ3_QNUM_VALID;
> +
> + flow_req.nav_id = tisci_rm->tisci_dev_id;
> + flow_req.flow_index = rchan->id;
> +
> + if (uc->needs_epib)
> + flow_req.rx_einfo_present = 1;
> + else
> + flow_req.rx_einfo_present = 0;
> + if (uc->psd_size)
> + flow_req.rx_psinfo_present = 1;
> + else
> + flow_req.rx_psinfo_present = 0;
> + flow_req.rx_error_handling = 1;
> + flow_req.rx_desc_type = 0;
> + flow_req.rx_dest_qnum = rx_ring;
> + flow_req.rx_src_tag_hi_sel = 2;
> + flow_req.rx_src_tag_lo_sel = 4;
> + flow_req.rx_dest_tag_hi_sel = 5;
> + flow_req.rx_dest_tag_lo_sel = 4;

Can we get rid of the magic numbers here and elsewhere, or at least comment
on what these mean?
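Even just giving them names would help, e.g. (placeholder names,
illustration only; the meaning of the selector values comes from the
UDMAP documentation):

/* placeholder names, illustration only */
#define UDMA_RFLOW_SRC_TAG_HI_SEL_DEFAULT	2
#define UDMA_RFLOW_SRC_TAG_LO_SEL_DEFAULT	4
#define UDMA_RFLOW_DST_TAG_HI_SEL_DEFAULT	5
#define UDMA_RFLOW_DST_TAG_LO_SEL_DEFAULT	4

	flow_req.rx_src_tag_hi_sel = UDMA_RFLOW_SRC_TAG_HI_SEL_DEFAULT;
	flow_req.rx_src_tag_lo_sel = UDMA_RFLOW_SRC_TAG_LO_SEL_DEFAULT;
	flow_req.rx_dest_tag_hi_sel = UDMA_RFLOW_DST_TAG_HI_SEL_DEFAULT;
	flow_req.rx_dest_tag_lo_sel = UDMA_RFLOW_DST_TAG_LO_SEL_DEFAULT;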

> +static int udma_alloc_chan_resources(struct dma_chan *chan)
> +{
> + struct udma_chan *uc = to_udma_chan(chan);
> + struct udma_dev *ud = to_udma_dev(chan->device);
> + const struct udma_match_data *match_data = ud->match_data;
> + struct k3_ring *irq_ring;
> + u32 irq_udma_idx;
> + int ret;
> +
> + if (uc->pkt_mode || uc->dir == DMA_MEM_TO_MEM) {
> + uc->use_dma_pool = true;
> + /* in case of MEM_TO_MEM we have maximum of two TRs */
> + if (uc->dir == DMA_MEM_TO_MEM) {
> + uc->hdesc_size = cppi5_trdesc_calc_size(
> + sizeof(struct cppi5_tr_type15_t), 2);
> + uc->pkt_mode = false;
> + }
> + }
> +
> + if (uc->use_dma_pool) {
> + uc->hdesc_pool = dma_pool_create(uc->name, ud->ddev.dev,
> + uc->hdesc_size, ud->desc_align,
> + 0);
> + if (!uc->hdesc_pool) {
> + dev_err(ud->ddev.dev,
> + "Descriptor pool allocation failed\n");
> + uc->use_dma_pool = false;
> + return -ENOMEM;
> + }
> + }
> +
> + /*
> + * Make sure that the completion is in a known state:
> + * No teardown, the channel is idle
> + */
> + reinit_completion(&uc->teardown_completed);
> + complete_all(&uc->teardown_completed);

should we not complete first and then do reinit to bring a clean state?

> + uc->state = UDMA_CHAN_IS_IDLE;
> +
> + switch (uc->dir) {
> + case DMA_MEM_TO_MEM:

Can you explain why the allocation should be channel dependent? Shouldn't
these things be done in the prep_ calls?

I looked ahead and checked the prep_ calls and we can use any direction,
so this somehow doesn't make sense!
--
~Vinod

2019-11-11 06:13:08

by Vinod Koul

[permalink] [raw]
Subject: Re: [PATCH v4 14/15] dmaengine: ti: New driver for K3 UDMA - split#6: Kconfig and Makefile

> +config TI_K3_UDMA
> + tristate "Texas Instruments UDMA support"
> + depends on ARCH_K3 || COMPILE_TEST
> + depends on TI_SCI_PROTOCOL
> + depends on TI_SCI_INTA_IRQCHIP
> + select DMA_ENGINE
> + select DMA_VIRTUAL_CHANNELS
> + select TI_K3_RINGACC
> + select TI_K3_PSIL
> + default y

Again no default y!

--
~Vinod

2019-11-11 06:14:02

by Vinod Koul

[permalink] [raw]
Subject: Re: [PATCH v4 15/15] dmaengine: ti: k3-udma: Add glue layer for non DMAengine users

On 01-11-19, 10:41, Peter Ujfalusi wrote:
> From: Grygorii Strashko <[email protected]>
>
> Certain users can not use right now the DMAengine API due to missing
> features in the core. Prime example is Networking.
>
> These users can use the glue layer interface to avoid misuse of DMAengine
> API and when the core gains the needed features they can be converted to
> use generic API.

Can you add some notes on what features this layer implements?

--
~Vinod

2019-11-11 07:42:02

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [PATCH v4 02/15] soc: ti: k3: add navss ringacc driver



On 11/11/2019 6.21, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>> From: Grygorii Strashko <[email protected]>
>
>> +config TI_K3_RINGACC
>> + tristate "K3 Ring accelerator Sub System"
>> + depends on ARCH_K3 || COMPILE_TEST
>> + depends on TI_SCI_INTA_IRQCHIP
>> + default y
>
> You want to get an earful from Linus? We don't do default y on new stuff,
> never :)

OK

>> +struct k3_ring_rt_regs {
>> + u32 resv_16[4];
>> + u32 db; /* RT Ring N Doorbell Register */
>> + u32 resv_4[1];
>> + u32 occ; /* RT Ring N Occupancy Register */
>> + u32 indx; /* RT Ring N Current Index Register */
>> + u32 hwocc; /* RT Ring N Hardware Occupancy Register */
>> + u32 hwindx; /* RT Ring N Current Index Register */
>
> nice comments, how about moving them up into kernel-doc style? (here and
> other places as well)

Sure, I'll convert the comments.
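Something along these lines (quick sketch, the member descriptions are
taken from the existing comments):

/**
 * struct k3_ring_rt_regs - The RA realtime Control/Status Registers region
 * @resv_16: Reserved
 * @db: RT Ring N Doorbell Register
 * @resv_4: Reserved
 * @occ: RT Ring N Occupancy Register
 * @indx: RT Ring N Current Index Register
 * @hwocc: RT Ring N Hardware Occupancy Register
 * @hwindx: RT Ring N Hardware Current Index Register
 */
struct k3_ring_rt_regs {
	u32	resv_16[4];
	u32	db;
	u32	resv_4[1];
	u32	occ;
	u32	indx;
	u32	hwocc;
	u32	hwindx;
};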

>> +struct k3_ring *k3_ringacc_request_ring(struct k3_ringacc *ringacc,
>> + int id, u32 flags)
>> +{
>> + int proxy_id = K3_RINGACC_PROXY_NOT_USED;
>> +
>> + mutex_lock(&ringacc->req_lock);
>> +
>> + if (id == K3_RINGACC_RING_ID_ANY) {
>> + /* Request for any general purpose ring */
>> + struct ti_sci_resource_desc *gp_rings =
>> + &ringacc->rm_gp_range->desc[0];
>> + unsigned long size;
>> +
>> + size = gp_rings->start + gp_rings->num;
>> + id = find_next_zero_bit(ringacc->rings_inuse, size,
>> + gp_rings->start);
>> + if (id == size)
>> + goto error;
>> + } else if (id < 0) {
>> + goto error;
>> + }
>> +
>> + if (test_bit(id, ringacc->rings_inuse) &&
>> + !(ringacc->rings[id].flags & K3_RING_FLAG_SHARED))
>> + goto error;
>> + else if (ringacc->rings[id].flags & K3_RING_FLAG_SHARED)
>> + goto out;
>> +
>> + if (flags & K3_RINGACC_RING_USE_PROXY) {
>> + proxy_id = find_next_zero_bit(ringacc->proxy_inuse,
>> + ringacc->num_proxies, 0);
>> + if (proxy_id == ringacc->num_proxies)
>> + goto error;
>> + }
>> +
>> + if (!try_module_get(ringacc->dev->driver->owner))
>> + goto error;
>
> should this not be one of the first things to do?

I'll move it.

>
>> +
>> + if (proxy_id != K3_RINGACC_PROXY_NOT_USED) {
>> + set_bit(proxy_id, ringacc->proxy_inuse);
>> + ringacc->rings[id].proxy_id = proxy_id;
>> + dev_dbg(ringacc->dev, "Giving ring#%d proxy#%d\n", id,
>> + proxy_id);
>> + } else {
>> + dev_dbg(ringacc->dev, "Giving ring#%d\n", id);
>> + }
>
> How about removing the else and doing a common print?

When the proxy is used we want to print that as well; I think it is
cleaner to have separate prints for the two cases.

- Péter


2019-11-11 08:01:31

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [PATCH v4 05/15] dmaengine: Add support for reporting DMA cached data amount



On 11/11/2019 6.39, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>> A DMA hardware can have big cache or FIFO and the amount of data sitting in
>> the DMA fabric can be an interest for the clients.
>>
>> For example in audio we want to know the delay in the data flow and in case
>> the DMA have significantly large FIFO/cache, it can affect the latenc/delay
>>
>> Signed-off-by: Peter Ujfalusi <[email protected]>
>> Reviewed-by: Tero Kristo <[email protected]>
>> ---
>> drivers/dma/dmaengine.h | 8 ++++++++
>> include/linux/dmaengine.h | 2 ++
>> 2 files changed, 10 insertions(+)
>>
>> diff --git a/drivers/dma/dmaengine.h b/drivers/dma/dmaengine.h
>> index 501c0b063f85..b0b97475707a 100644
>> --- a/drivers/dma/dmaengine.h
>> +++ b/drivers/dma/dmaengine.h
>> @@ -77,6 +77,7 @@ static inline enum dma_status dma_cookie_status(struct dma_chan *chan,
>> state->last = complete;
>> state->used = used;
>> state->residue = 0;
>> + state->in_flight_bytes = 0;
>> }
>> return dma_async_is_complete(cookie, complete, used);
>> }
>> @@ -87,6 +88,13 @@ static inline void dma_set_residue(struct dma_tx_state *state, u32 residue)
>> state->residue = residue;
>> }
>>
>> +static inline void dma_set_in_flight_bytes(struct dma_tx_state *state,
>> + u32 in_flight_bytes)
>> +{
>> + if (state)
>> + state->in_flight_bytes = in_flight_bytes;
>> +}
>> +
>> struct dmaengine_desc_callback {
>> dma_async_tx_callback callback;
>> dma_async_tx_callback_result callback_result;
>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>> index 0e8b426bbde9..c4c5219030a6 100644
>> --- a/include/linux/dmaengine.h
>> +++ b/include/linux/dmaengine.h
>> @@ -682,11 +682,13 @@ static inline struct dma_async_tx_descriptor *txd_next(struct dma_async_tx_descr
>> * @residue: the remaining number of bytes left to transmit
>> * on the selected transfer for states DMA_IN_PROGRESS and
>> * DMA_PAUSED if this is implemented in the driver, else 0
>> + * @in_flight_bytes: amount of data in bytes cached by the DMA.
>> */
>> struct dma_tx_state {
>> dma_cookie_t last;
>> dma_cookie_t used;
>> u32 residue;
>> + u32 in_flight_bytes;
>
> Should we add this here or use the dmaengine_result()

Ideally, at the time dmaengine_result is used (in the tx completion callback)
there should be nothing in flight ;)

The reason why it is added to dma_tx_state is that clients can check the
number of cached bytes at any time while the DMA is running.
Audio needs this for cyclic transfers and UART also needs to know it.
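E.g. something like this on the client side (fragment only; chan and
cookie are the client's channel and descriptor cookie):

	struct dma_tx_state state;
	enum dma_status status;
	u32 delay_bytes;

	status = dmaengine_tx_status(chan, cookie, &state);
	if (status == DMA_IN_PROGRESS || status == DMA_PAUSED)
		/* not yet transferred + sitting in the DMA fabric */
		delay_bytes = state.residue + state.in_flight_bytes;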

- Péter


2019-11-11 08:49:05

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [PATCH v4 07/15] dmaengine: ti: k3 PSI-L remote endpoint configuration



On 11/11/2019 6.47, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>
>> --- /dev/null
>> +++ b/drivers/dma/ti/k3-psil.c
>> @@ -0,0 +1,97 @@
>> +// SPDX-License-Identifier: GPL-2.0
>
> ...
>
>> +extern struct psil_ep_map am654_ep_map;
>> +extern struct psil_ep_map j721e_ep_map;
>> +
>> +static DEFINE_MUTEX(ep_map_mutex);
>> +static struct psil_ep_map *soc_ep_map;
>> +
>> +struct psil_endpoint_config *psil_get_ep_config(u32 thread_id)
>> +{
>> + int i;
>> +
>> + mutex_lock(&ep_map_mutex);
>> + if (!soc_ep_map) {
>> + if (of_machine_is_compatible("ti,am654")) {
>> + soc_ep_map = &am654_ep_map;
>> + } else if (of_machine_is_compatible("ti,j721e")) {
>> + soc_ep_map = &j721e_ep_map;
>> + } else {
>> + pr_err("PSIL: No compatible machine found for map\n");
>> + return ERR_PTR(-ENOTSUPP);
>> + }
>> + pr_debug("%s: Using map for %s\n", __func__, soc_ep_map->name);
>> + }
>> + mutex_unlock(&ep_map_mutex);
>> +
>> + if (thread_id & K3_PSIL_DST_THREAD_ID_OFFSET && soc_ep_map->dst) {
>> + /* check in destination thread map */
>> + for (i = 0; i < soc_ep_map->dst_count; i++) {
>> + if (soc_ep_map->dst[i].thread_id == thread_id)
>> + return &soc_ep_map->dst[i].ep_config;
>> + }
>> + }
>> +
>> + thread_id &= ~K3_PSIL_DST_THREAD_ID_OFFSET;
>> + if (soc_ep_map->src) {
>> + for (i = 0; i < soc_ep_map->src_count; i++) {
>> + if (soc_ep_map->src[i].thread_id == thread_id)
>> + return &soc_ep_map->src[i].ep_config;
>> + }
>> + }
>> +
>> + return ERR_PTR(-ENOENT);
>> +}
>> +EXPORT_SYMBOL(psil_get_ep_config);
>
> This doesn't match the license of this module, we need it to be
> EXPORT_SYMBOL_GPL

Oops, will fix it.


- Péter


2019-11-11 09:16:10

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [PATCH v4 10/15] dmaengine: ti: New driver for K3 UDMA - split#2: probe/remove, xlate and filter_fn



On 11/11/2019 7.33, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>
>> +static bool udma_dma_filter_fn(struct dma_chan *chan, void *param)
>> +{
>> + struct psil_endpoint_config *ep_config;
>> + struct udma_chan *uc;
>> + struct udma_dev *ud;
>> + u32 *args;
>> +
>> + if (chan->device->dev->driver != &udma_driver.driver)
>> + return false;
>> +
>> + uc = to_udma_chan(chan);
>> + ud = uc->ud;
>> + args = param;
>> + uc->remote_thread_id = args[0];
>> +
>> + if (uc->remote_thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)
>> + uc->dir = DMA_MEM_TO_DEV;
>> + else
>> + uc->dir = DMA_DEV_TO_MEM;
>
> Can you explain this a bit?

The UDMAP in K3 works between two PSI-L endpoints. The source and
destination need to be paired to allow data flow.
Source thread IDs are in the range 0x0000 - 0x7fff, while destination
thread IDs are in the range 0x8000 - 0xffff.

If the remote thread ID has bit 15 set (0x8000) then the transfer
is MEM_TO_DEV and I need to pick one unused tchan for it. If the remote
is the source then it can be handled by an rchan.

dmas = <&main_udmap 0xc400>, <&main_udmap 0x4400>;
dma-names = "tx", "rx";

0xc400 is a destination thread ID, so it is MEM_TO_DEV
0x4400 is a source thread ID, so it is DEV_TO_MEM

Even in the MEM_TO_MEM case I need to pair two UDMAP channels:
UDMAP source threads start at offset 0x1000, UDMAP destination
threads start at 0x9000+.

Changing direction at runtime is hardly possible as it would involve
tearing down the channel, removing interrupts, destroying the rings,
removing the PSI-L pairing and redoing everything.
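
To illustrate the direction decision (sketch only, the helper name is
made up):

	/* Destination thread IDs have bit 15 (0x8000) set */
	static enum dma_transfer_direction psil_thread_to_dir(u32 thread_id)
	{
		if (thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)
			return DMA_MEM_TO_DEV;

		return DMA_DEV_TO_MEM;
	}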

>> +static int udma_remove(struct platform_device *pdev)
>> +{
>> + struct udma_dev *ud = platform_get_drvdata(pdev);
>> +
>> + of_dma_controller_free(pdev->dev.of_node);
>> + dma_async_device_unregister(&ud->ddev);
>> +
>> + /* Make sure that we did proper cleanup */
>> + cancel_work_sync(&ud->purge_work);
>> + udma_purge_desc_work(&ud->purge_work);
>
> kill the vchan tasklets at it too please

Oh, I have missed that, I'll add it.
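
Something along these lines (sketch only; assuming the device keeps its
channels in ud->channels[] with ud->ch_count entries):

	int i;

	for (i = 0; i < ud->ch_count; i++)
		tasklet_kill(&ud->channels[i].vc.task);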

- Péter


2019-11-11 09:40:59

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [PATCH v4 11/15] dmaengine: ti: New driver for K3 UDMA - split#3: alloc/free chan_resources



On 11/11/2019 8.06, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>> Split patch for review containing: channel rsource allocation and free
>
> s/rsource/resource

I'll try to remember to fix up this temporary commit message; in the
end these split patches are going to be squashed into one commit when
things are ready to be applied.

>> +static int udma_tisci_tx_channel_config(struct udma_chan *uc)
>> +{
>> + struct udma_dev *ud = uc->ud;
>> + struct udma_tisci_rm *tisci_rm = &ud->tisci_rm;
>> + const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
>> + struct udma_tchan *tchan = uc->tchan;
>> + int tc_ring = k3_ringacc_get_ring_id(tchan->tc_ring);
>> + struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
>> + u32 mode, fetch_size;
>> + int ret = 0;
>> +
>> + if (uc->pkt_mode) {
>> + mode = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
>> + fetch_size = cppi5_hdesc_calc_size(uc->needs_epib, uc->psd_size,
>> + 0);
>> + } else {
>> + mode = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_PBRR;
>> + fetch_size = sizeof(struct cppi5_desc_hdr_t);
>> + }
>> +
>> + req_tx.valid_params =
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_EINFO_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_PSWORDS_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_SUPR_TDPKT_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID;
>
> bunch of these are repeat, you can define a COMMON_VALID_PARAMS and use
> that + specific ones..

OK, I'll try to sanitize these a bit.
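
For example (the define name is only illustrative):

	#define TISCI_UDMA_CH_COMMON_VALID_PARAMS (			\
		TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |	\
		TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |		\
		TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |		\
		TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID)

	/* the tx config then only adds the tx specific flags */
	req_tx.valid_params = TISCI_UDMA_CH_COMMON_VALID_PARAMS |
		TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_EINFO_VALID |
		TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_FILT_PSWORDS_VALID |
		TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_SUPR_TDPKT_VALID;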

>> +
>> + req_tx.nav_id = tisci_rm->tisci_dev_id;
>> + req_tx.index = tchan->id;
>> + req_tx.tx_pause_on_err = 0;
>> + req_tx.tx_filt_einfo = 0;
>> + req_tx.tx_filt_pswords = 0;
>
> i think initialization to 0 is superfluous

Indeed, I'll remove these.

>> + req_tx.tx_chan_type = mode;
>> + req_tx.tx_supr_tdpkt = uc->notdpkt;
>> + req_tx.tx_fetch_size = fetch_size >> 2;
>> + req_tx.txcq_qnum = tc_ring;
>> + if (uc->ep_type == PSIL_EP_PDMA_XY) {
>> + /* wait for peer to complete the teardown for PDMAs */
>> + req_tx.valid_params |=
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_TX_TDTYPE_VALID;
>> + req_tx.tx_tdtype = 1;
>> + }
>> +
>> + ret = tisci_ops->tx_ch_cfg(tisci_rm->tisci, &req_tx);
>> + if (ret)
>> + dev_err(ud->dev, "tchan%d cfg failed %d\n", tchan->id, ret);
>> +
>> + return ret;
>> +}
>> +
>> +static int udma_tisci_rx_channel_config(struct udma_chan *uc)
>> +{
>> + struct udma_dev *ud = uc->ud;
>> + struct udma_tisci_rm *tisci_rm = &ud->tisci_rm;
>> + const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
>> + struct udma_rchan *rchan = uc->rchan;
>> + int fd_ring = k3_ringacc_get_ring_id(rchan->fd_ring);
>> + int rx_ring = k3_ringacc_get_ring_id(rchan->r_ring);
>> + struct ti_sci_msg_rm_udmap_rx_ch_cfg req_rx = { 0 };
>> + struct ti_sci_msg_rm_udmap_flow_cfg flow_req = { 0 };
>> + u32 mode, fetch_size;
>> + int ret = 0;
>> +
>> + if (uc->pkt_mode) {
>> + mode = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
>> + fetch_size = cppi5_hdesc_calc_size(uc->needs_epib,
>> + uc->psd_size, 0);
>> + } else {
>> + mode = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_PBRR;
>> + fetch_size = sizeof(struct cppi5_desc_hdr_t);
>> + }
>> +
>> + req_rx.valid_params =
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_PAUSE_ON_ERR_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_FETCH_SIZE_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_CQ_QNUM_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_CHAN_TYPE_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_IGNORE_SHORT_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_IGNORE_LONG_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_START_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_CH_RX_FLOWID_CNT_VALID;
>> +
>> + req_rx.nav_id = tisci_rm->tisci_dev_id;
>> + req_rx.index = rchan->id;
>> + req_rx.rx_fetch_size = fetch_size >> 2;
>> + req_rx.rxcq_qnum = rx_ring;
>> + req_rx.rx_pause_on_err = 0;
>> + req_rx.rx_chan_type = mode;
>> + req_rx.rx_ignore_short = 0;
>> + req_rx.rx_ignore_long = 0;
>> + req_rx.flowid_start = 0;
>> + req_rx.flowid_cnt = 0;
>> +
>> + ret = tisci_ops->rx_ch_cfg(tisci_rm->tisci, &req_rx);
>> + if (ret) {
>> + dev_err(ud->dev, "rchan%d cfg failed %d\n", rchan->id, ret);
>> + return ret;
>> + }
>> +
>> + flow_req.valid_params =
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_EINFO_PRESENT_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_PSINFO_PRESENT_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_ERROR_HANDLING_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DESC_TYPE_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_QNUM_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_SRC_TAG_HI_SEL_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_SRC_TAG_LO_SEL_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_TAG_HI_SEL_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_DEST_TAG_LO_SEL_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ0_SZ0_QNUM_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ1_QNUM_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ2_QNUM_VALID |
>> + TI_SCI_MSG_VALUE_RM_UDMAP_FLOW_FDQ3_QNUM_VALID;
>> +
>> + flow_req.nav_id = tisci_rm->tisci_dev_id;
>> + flow_req.flow_index = rchan->id;
>> +
>> + if (uc->needs_epib)
>> + flow_req.rx_einfo_present = 1;
>> + else
>> + flow_req.rx_einfo_present = 0;
>> + if (uc->psd_size)
>> + flow_req.rx_psinfo_present = 1;
>> + else
>> + flow_req.rx_psinfo_present = 0;
>> + flow_req.rx_error_handling = 1;
>> + flow_req.rx_desc_type = 0;
>> + flow_req.rx_dest_qnum = rx_ring;
>> + flow_req.rx_src_tag_hi_sel = 2;
>> + flow_req.rx_src_tag_lo_sel = 4;
>> + flow_req.rx_dest_tag_hi_sel = 5;
>> + flow_req.rx_dest_tag_lo_sel = 4;
>
> can we get rid of magic numbers here and elsewhere, or at least comment
> on what these mean..

True, I'll clean it up.

>> +static int udma_alloc_chan_resources(struct dma_chan *chan)
>> +{
>> + struct udma_chan *uc = to_udma_chan(chan);
>> + struct udma_dev *ud = to_udma_dev(chan->device);
>> + const struct udma_match_data *match_data = ud->match_data;
>> + struct k3_ring *irq_ring;
>> + u32 irq_udma_idx;
>> + int ret;
>> +
>> + if (uc->pkt_mode || uc->dir == DMA_MEM_TO_MEM) {
>> + uc->use_dma_pool = true;
>> + /* in case of MEM_TO_MEM we have maximum of two TRs */
>> + if (uc->dir == DMA_MEM_TO_MEM) {
>> + uc->hdesc_size = cppi5_trdesc_calc_size(
>> + sizeof(struct cppi5_tr_type15_t), 2);
>> + uc->pkt_mode = false;
>> + }
>> + }
>> +
>> + if (uc->use_dma_pool) {
>> + uc->hdesc_pool = dma_pool_create(uc->name, ud->ddev.dev,
>> + uc->hdesc_size, ud->desc_align,
>> + 0);
>> + if (!uc->hdesc_pool) {
>> + dev_err(ud->ddev.dev,
>> + "Descriptor pool allocation failed\n");
>> + uc->use_dma_pool = false;
>> + return -ENOMEM;
>> + }
>> + }
>> +
>> + /*
>> + * Make sure that the completion is in a known state:
>> + * No teardown, the channel is idle
>> + */
>> + reinit_completion(&uc->teardown_completed);
>> + complete_all(&uc->teardown_completed);
>
> should we not complete first and then do reinit to bring a clean state?

The reason it is done like this is that udma_synchronize() checks the
completion, and if the client requests the channel and then calls
terminate_all_sync() without issuing any transfer, no one would ever
mark the completion as completed.
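
Simplified sketch of the dependency (not the actual function body, and
the timeout value here is arbitrary):

	static void udma_synchronize(struct dma_chan *chan)
	{
		struct udma_chan *uc = to_udma_chan(chan);

		/*
		 * For a channel which never issued a transfer the
		 * complete_all() done at alloc time is the only thing
		 * letting this return without waiting for the timeout.
		 */
		wait_for_completion_timeout(&uc->teardown_completed,
					    msecs_to_jiffies(100));
	}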

>> + uc->state = UDMA_CHAN_IS_IDLE;
>> +
>> + switch (uc->dir) {
>> + case DMA_MEM_TO_MEM:
>
> can you explain why a allocation should be channel dependent, shouldn't
> these things be done in prep_ calls?

A channel cannot change direction: it is either MEM_TO_DEV, DEV_TO_MEM
or MEM_TO_MEM, and the direction is set when the channel is requested.

> I looked ahead and checked the prep_ calls and we can use any direction
> so this somehow doesn't make sense!

In the prep callbacks I'm checking whether the requested direction
matches the channel direction.

I just cannot change the channel direction at runtime.

- Péter


2019-11-11 10:30:55

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [PATCH v4 14/15] dmaengine: ti: New driver for K3 UDMA - split#6: Kconfig and Makefile



On 11/11/2019 8.11, Vinod Koul wrote:
>> +config TI_K3_UDMA
>> + tristate "Texas Instruments UDMA support"
>> + depends on ARCH_K3 || COMPILE_TEST
>> + depends on TI_SCI_PROTOCOL
>> + depends on TI_SCI_INTA_IRQCHIP
>> + select DMA_ENGINE
>> + select DMA_VIRTUAL_CHANNELS
>> + select TI_K3_RINGACC
>> + select TI_K3_PSIL
>> + default y
>
> Again no default y!

Removed

>

- Péter


2019-11-11 10:32:48

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [PATCH v4 15/15] dmaengine: ti: k3-udma: Add glue layer for non DMAengine users



On 11/11/2019 8.12, Vinod Koul wrote:
> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>> From: Grygorii Strashko <[email protected]>
>>
>> Certain users can not use right now the DMAengine API due to missing
>> features in the core. Prime example is Networking.
>>
>> These users can use the glue layer interface to avoid misuse of DMAengine
>> API and when the core gains the needed features they can be converted to
>> use generic API.
>
> Can you add some notes on what all features does this layer implement..

In the commit message or in the code?

- Péter


2019-11-12 05:35:46

by Vinod Koul

[permalink] [raw]
Subject: Re: [PATCH v4 10/15] dmaengine: ti: New driver for K3 UDMA - split#2: probe/remove, xlate and filter_fn

On 11-11-19, 11:16, Peter Ujfalusi wrote:
>
>
> On 11/11/2019 7.33, Vinod Koul wrote:
> > On 01-11-19, 10:41, Peter Ujfalusi wrote:
> >
> >> +static bool udma_dma_filter_fn(struct dma_chan *chan, void *param)
> >> +{
> >> + struct psil_endpoint_config *ep_config;
> >> + struct udma_chan *uc;
> >> + struct udma_dev *ud;
> >> + u32 *args;
> >> +
> >> + if (chan->device->dev->driver != &udma_driver.driver)
> >> + return false;
> >> +
> >> + uc = to_udma_chan(chan);
> >> + ud = uc->ud;
> >> + args = param;
> >> + uc->remote_thread_id = args[0];
> >> +
> >> + if (uc->remote_thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)
> >> + uc->dir = DMA_MEM_TO_DEV;
> >> + else
> >> + uc->dir = DMA_DEV_TO_MEM;
> >
> > Can you explain this a bit?
>
> The UDMAP in K3 works between two PSI-L endpoints. The source and
> destination need to be paired to allow data flow.
> Source thread IDs are in the range 0x0000 - 0x7fff, while destination
> thread IDs are in the range 0x8000 - 0xffff.
>
> If the remote thread ID has bit 15 set (0x8000) then the transfer
> is MEM_TO_DEV and I need to pick one unused tchan for it. If the remote
> is the source then it can be handled by an rchan.
>
> dmas = <&main_udmap 0xc400>, <&main_udmap 0x4400>;
> dma-names = "tx", "rx";
>
> 0xc400 is a destination thread ID, so it is MEM_TO_DEV
> 0x4400 is a source thread ID, so it is DEV_TO_MEM
>
> Even in the MEM_TO_MEM case I need to pair two UDMAP channels:
> UDMAP source threads start at offset 0x1000, UDMAP destination
> threads start at 0x9000+.

Okay so a channel is set for a direction until teardown. Also this and
other patch comments are quite useful, can we add them here?

> Changing direction at runtime is hardly possible as it would involve
> tearing down the channel, removing interrupts, destroying the rings,
> removing the PSI-L pairing and redoing everything.

okay I would expect the prep_ to check for direction and reject the call
if direction is different.

--
~Vinod

2019-11-12 05:38:30

by Vinod Koul

[permalink] [raw]
Subject: Re: [PATCH v4 15/15] dmaengine: ti: k3-udma: Add glue layer for non DMAengine users

On 11-11-19, 12:31, Peter Ujfalusi wrote:
>
>
> On 11/11/2019 8.12, Vinod Koul wrote:
> > On 01-11-19, 10:41, Peter Ujfalusi wrote:
> >> From: Grygorii Strashko <[email protected]>
> >>
> >> Certain users can not use right now the DMAengine API due to missing
> >> features in the core. Prime example is Networking.
> >>
> >> These users can use the glue layer interface to avoid misuse of DMAengine
> >> API and when the core gains the needed features they can be converted to
> >> use generic API.
> >
> > Can you add some notes on what all features does this layer implement..
>
> In the commit message or in the code?

commit here so that we know what to expect.

Thanks
--
~Vinod

2019-11-12 07:23:17

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [PATCH v4 10/15] dmaengine: ti: New driver for K3 UDMA - split#2: probe/remove, xlate and filter_fn



On 12/11/2019 7.34, Vinod Koul wrote:
> On 11-11-19, 11:16, Peter Ujfalusi wrote:
>>
>>
>> On 11/11/2019 7.33, Vinod Koul wrote:
>>> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>>>
>>>> +static bool udma_dma_filter_fn(struct dma_chan *chan, void *param)
>>>> +{
>>>> + struct psil_endpoint_config *ep_config;
>>>> + struct udma_chan *uc;
>>>> + struct udma_dev *ud;
>>>> + u32 *args;
>>>> +
>>>> + if (chan->device->dev->driver != &udma_driver.driver)
>>>> + return false;
>>>> +
>>>> + uc = to_udma_chan(chan);
>>>> + ud = uc->ud;
>>>> + args = param;
>>>> + uc->remote_thread_id = args[0];
>>>> +
>>>> + if (uc->remote_thread_id & K3_PSIL_DST_THREAD_ID_OFFSET)
>>>> + uc->dir = DMA_MEM_TO_DEV;
>>>> + else
>>>> + uc->dir = DMA_DEV_TO_MEM;
>>>
>>> Can you explain this a bit?
>>
>> The UDMAP in K3 works between two PSI-L endpoints. The source and
>> destination need to be paired to allow data flow.
>> Source thread IDs are in the range 0x0000 - 0x7fff, while destination
>> thread IDs are in the range 0x8000 - 0xffff.
>>
>> If the remote thread ID has bit 15 set (0x8000) then the transfer
>> is MEM_TO_DEV and I need to pick one unused tchan for it. If the remote
>> is the source then it can be handled by an rchan.
>>
>> dmas = <&main_udmap 0xc400>, <&main_udmap 0x4400>;
>> dma-names = "tx", "rx";
>>
>> 0xc400 is a destination thread ID, so it is MEM_TO_DEV
>> 0x4400 is a source thread ID, so it is DEV_TO_MEM
>>
>> Even in the MEM_TO_MEM case I need to pair two UDMAP channels:
>> UDMAP source threads start at offset 0x1000, UDMAP destination
>> threads start at 0x9000+.
>
> Okay so a channel is set for a direction until teardown. Also this and
> other patch comments are quite useful, can we add them here?

The direction checks in the prep callbacks do print the reason why a
transfer is rejected when the direction does not match.

Having said that, I can add a comment to the udma_alloc_chan_resources()
function about this restriction, or better, a dev_dbg() to say that the
given channel is allocated for a given direction.
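
Something like this (sketch):

	dev_dbg(ud->dev, "chan%d: allocated for %s\n", uc->id,
		udma_get_dir_text(uc->dir));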

>> Changing direction at runtime is hardly possible as it would involve
>> tearing down the channel, removing interrupts, destroying the rings,
>> removing the PSI-L pairing and redoing everything.
>
> okay I would expect the prep_ to check for direction and reject the call
> if direction is different.

They do check, udma_prep_slave_sg() and udma_prep_dma_cyclic():
	if (dir != uc->dir) {
		dev_err(chan->device->dev,
			"%s: chan%d is for %s, not supporting %s\n",
			__func__, uc->id, udma_get_dir_text(uc->dir),
			udma_get_dir_text(dir));
		return NULL;
	}

udma_prep_dma_memcpy():
	if (uc->dir != DMA_MEM_TO_MEM) {
		dev_err(chan->device->dev,
			"%s: chan%d is for %s, not supporting %s\n",
			__func__, uc->id, udma_get_dir_text(uc->dir),
			udma_get_dir_text(DMA_MEM_TO_MEM));
		return NULL;
	}

>

- Péter


2019-11-12 07:25:33

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [PATCH v4 15/15] dmaengine: ti: k3-udma: Add glue layer for non DMAengine users



On 12/11/2019 7.37, Vinod Koul wrote:
> On 11-11-19, 12:31, Peter Ujfalusi wrote:
>>
>>
>> On 11/11/2019 8.12, Vinod Koul wrote:
>>> On 01-11-19, 10:41, Peter Ujfalusi wrote:
>>>> From: Grygorii Strashko <[email protected]>
>>>>
>>>> Certain users can not use right now the DMAengine API due to missing
>>>> features in the core. Prime example is Networking.
>>>>
>>>> These users can use the glue layer interface to avoid misuse of DMAengine
>>>> API and when the core gains the needed features they can be converted to
>>>> use generic API.
>>>
>>> Can you add some notes on what all features does this layer implement..
>>
>> In the commit message or in the code?
>
> commit here so that we know what to expect.

Can you check whether the v5 commit message is sufficient? If not, I can
make it more verbose for v6.

- Péter
