2015-11-30 05:16:31

by Vignesh Raghavendra

[permalink] [raw]
Subject: [PATCH v4 0/5] Add memory mapped read support for ti-qspi

Changes since v3:
Rework to introduce spi_flash_read_message struct.
Support different opcode/addr/data formats as per Brian's suggestion
here: https://lkml.org/lkml/2015/11/11/454

Changes since v2:
Remove mmap_lock_mutex.
Optimize enable/disable of mmap mode.

Changes since v1:
Introduce API in SPI core that MTD flash driver can call for mmap read
instead of directly calling spi-master driver callback. This API makes
sure that SPI core msg queue is locked during mmap transfers.
v1: https://lkml.org/lkml/2015/9/4/103


Cover letter:

This patch series adds support for memory mapped read port of ti-qspi.
ti-qspi has a special memory mapped port through which SPI flash
memories can be accessed directly via SoC specific memory region.

First patch adds a method to pass flash specific information like read
opcode, dummy bytes etc and to request mmap read. Second patch
implements mmap read method in ti-qspi driver. Patch 3 adapts m25p80 to
use mmap read method before trying normal SPI transfer. Patch 4 and 5
add memory map region DT entries for DRA7xx and AM43xx SoCs.

This patch series is based on the discussions here:
http://www.spinics.net/lists/linux-spi/msg04796.html

Tested on DRA74 EVM and AM437x-SK.
Read performance increases from ~100kB/s to ~2.5MB/s.


Vignesh R (5):
spi: introduce accelerated read support for spi flash devices
spi: spi-ti-qspi: add mmap mode read support
mtd: devices: m25p80: add support for mmap read request
ARM: dts: DRA7: add entry for qspi mmap region
ARM: dts: AM4372: add entry for qspi mmap region

Documentation/devicetree/bindings/spi/ti_qspi.txt | 19 +++-
arch/arm/boot/dts/am4372.dtsi | 4 +-
arch/arm/boot/dts/dra7.dtsi | 7 +-
drivers/mtd/devices/m25p80.c | 20 +++++
drivers/spi/spi-ti-qspi.c | 101 ++++++++++++++++++++--
drivers/spi/spi.c | 45 ++++++++++
include/linux/spi/spi.h | 41 +++++++++
7 files changed, 225 insertions(+), 12 deletions(-)

--
2.6.3


2015-11-30 05:17:33

by Vignesh Raghavendra

[permalink] [raw]
Subject: [PATCH v4 1/5] spi: introduce accelerated read support for spi flash devices

In addition to providing direct access to SPI bus, some spi controller
hardwares (like ti-qspi) provide special port (like memory mapped port)
that are optimized to improve SPI flash read performance.
This means the controller can automatically send the SPI signals
required to read data from the SPI flash device.
For this, SPI controller needs to know flash specific information like
read command to use, dummy bytes and address width.

Introduce spi_flash_read() interface to support accelerated read
over SPI flash devices. SPI master drivers can implement this callback to
support interfaces such as memory mapped read etc. m25p80 flash driver
and other flash drivers can call this make use of such interfaces. The
interface should only be used with SPI flashes and cannot be used with
other SPI devices.

Signed-off-by: Vignesh R <[email protected]>
---
drivers/spi/spi.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/spi/spi.h | 41 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 86 insertions(+)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index e2415be209d5..cc2b54139352 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1134,6 +1134,7 @@ static void __spi_pump_messages(struct spi_master *master, bool in_kthread)
}
}

+ mutex_lock(&master->bus_lock_mutex);
trace_spi_message_start(master->cur_msg);

if (master->prepare_message) {
@@ -1143,6 +1144,7 @@ static void __spi_pump_messages(struct spi_master *master, bool in_kthread)
"failed to prepare message: %d\n", ret);
master->cur_msg->status = ret;
spi_finalize_current_message(master);
+ mutex_unlock(&master->bus_lock_mutex);
return;
}
master->cur_msg_prepared = true;
@@ -1152,6 +1154,7 @@ static void __spi_pump_messages(struct spi_master *master, bool in_kthread)
if (ret) {
master->cur_msg->status = ret;
spi_finalize_current_message(master);
+ mutex_unlock(&master->bus_lock_mutex);
return;
}

@@ -1159,8 +1162,10 @@ static void __spi_pump_messages(struct spi_master *master, bool in_kthread)
if (ret) {
dev_err(&master->dev,
"failed to transfer one message from queue\n");
+ mutex_unlock(&master->bus_lock_mutex);
return;
}
+ mutex_unlock(&master->bus_lock_mutex);
}

/**
@@ -2327,6 +2332,46 @@ int spi_async_locked(struct spi_device *spi, struct spi_message *message)
EXPORT_SYMBOL_GPL(spi_async_locked);


+int spi_flash_read(struct spi_device *spi,
+ struct spi_flash_read_message *msg)
+
+{
+ struct spi_master *master = spi->master;
+ int ret;
+
+ if ((msg->opcode_nbits == SPI_NBITS_DUAL ||
+ msg->addr_nbits == SPI_NBITS_DUAL) &&
+ !(spi->mode & (SPI_TX_DUAL | SPI_TX_QUAD)))
+ return -EINVAL;
+ if ((msg->opcode_nbits == SPI_NBITS_QUAD ||
+ msg->addr_nbits == SPI_NBITS_QUAD) &&
+ !(spi->mode & SPI_TX_QUAD))
+ return -EINVAL;
+ if (msg->data_nbits == SPI_NBITS_DUAL &&
+ !(spi->mode & (SPI_RX_DUAL | SPI_RX_QUAD)))
+ return -EINVAL;
+ if (msg->data_nbits == SPI_NBITS_QUAD &&
+ !(spi->mode & SPI_RX_QUAD))
+ return -EINVAL;
+
+ if (master->auto_runtime_pm) {
+ ret = pm_runtime_get_sync(master->dev.parent);
+ if (ret < 0) {
+ dev_err(&master->dev, "Failed to power device: %d\n",
+ ret);
+ return ret;
+ }
+ }
+ mutex_lock(&master->bus_lock_mutex);
+ ret = master->spi_flash_read(spi, msg);
+ mutex_unlock(&master->bus_lock_mutex);
+ if (master->auto_runtime_pm)
+ pm_runtime_put(master->dev.parent);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(spi_flash_read);
+
/*-------------------------------------------------------------------------*/

/* Utility methods for SPI master protocol drivers, layered on
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index cce80e6dc7d1..246d7d519f3f 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -25,6 +25,7 @@
struct dma_chan;
struct spi_master;
struct spi_transfer;
+struct spi_flash_read_message;

/*
* INTERFACES between SPI master-side drivers and SPI infrastructure.
@@ -361,6 +362,8 @@ static inline void spi_unregister_driver(struct spi_driver *sdrv)
* @handle_err: the subsystem calls the driver to handle an error that occurs
* in the generic implementation of transfer_one_message().
* @unprepare_message: undo any work done by prepare_message().
+ * @spi_flash_read: to support spi-controller hardwares that provide
+ * accelerated interface to read from flash devices.
* @cs_gpios: Array of GPIOs to use as chip select lines; one per CS
* number. Any individual value may be -ENOENT for CS lines that
* are not GPIOs (driven by the SPI controller itself).
@@ -507,6 +510,8 @@ struct spi_master {
struct spi_message *message);
int (*unprepare_message)(struct spi_master *master,
struct spi_message *message);
+ int (*spi_flash_read)(struct spi_device *spi,
+ struct spi_flash_read_message *msg);

/*
* These hooks are for drivers that use a generic implementation
@@ -999,6 +1004,42 @@ static inline ssize_t spi_w8r16be(struct spi_device *spi, u8 cmd)
return be16_to_cpu(result);
}

+/**
+ * struct spi_flash_read_message - flash specific information for
+ * spi-masters that provide accelerated flash read interfaces
+ * @buf: buffer to read data
+ * @from: offset within the flash from where data is to be read
+ * @len: length of data to be read
+ * @retlen: actual length of data read
+ * @read_opcode: read_opcode to be used to communicate with flash
+ * @addr_width: number of address bytes
+ * @dummy_bytes: number of dummy bytes
+ * @opcode_nbits: number of lines to send opcode
+ * @addr_nbits: number of lines to send address
+ * @data_nbits: number of lines for data
+ */
+struct spi_flash_read_message {
+ void *buf;
+ loff_t from;
+ size_t len;
+ size_t retlen;
+ u8 read_opcode;
+ u8 addr_width;
+ u8 dummy_bytes;
+ u8 opcode_nbits;
+ u8 addr_nbits;
+ u8 data_nbits;
+};
+
+/* SPI core interface for flash read support */
+static inline bool spi_flash_read_supported(struct spi_device *spi)
+{
+ return spi->master->spi_flash_read ? true : false;
+}
+
+int spi_flash_read(struct spi_device *spi,
+ struct spi_flash_read_message *msg);
+
/*---------------------------------------------------------------------------*/

/*
--
2.6.3

2015-11-30 05:17:37

by Vignesh Raghavendra

[permalink] [raw]
Subject: [PATCH v4 2/5] spi: spi-ti-qspi: add mmap mode read support

ti-qspi controller provides mmap port to read data from SPI flashes.
mmap port is enabled in QSPI_SPI_SWITCH_REG. ctrl module register may
also need to be accessed for some SoCs. The QSPI_SPI_SETUP_REGx needs to
be populated with flash specific information like read opcode, read
mode(quad, dual, normal), address width and dummy bytes. Once,
controller is in mmap mode, the whole flash memory is available as a
memory region at SoC specific address. This region can be accessed using
normal memcpy() (or mem-to-mem dma copy). The ti-qspi controller hardware
will internally communicate with SPI flash over SPI bus and get the
requested data.

Implement spi_flash_read() callback to support mmap read over SPI
flash devices. With this, the read throughput increases from ~100kB/s to
~2.5 MB/s.

Signed-off-by: Vignesh R <[email protected]>
---

drivers/spi/spi-ti-qspi.c | 101 ++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 94 insertions(+), 7 deletions(-)

diff --git a/drivers/spi/spi-ti-qspi.c b/drivers/spi/spi-ti-qspi.c
index 64318fcfacf2..cd4e63f45e65 100644
--- a/drivers/spi/spi-ti-qspi.c
+++ b/drivers/spi/spi-ti-qspi.c
@@ -56,6 +56,7 @@ struct ti_qspi {
u32 dc;

bool ctrl_mod;
+ bool mmap_enabled;
};

#define QSPI_PID (0x0)
@@ -65,11 +66,8 @@ struct ti_qspi {
#define QSPI_SPI_CMD_REG (0x48)
#define QSPI_SPI_STATUS_REG (0x4c)
#define QSPI_SPI_DATA_REG (0x50)
-#define QSPI_SPI_SETUP0_REG (0x54)
+#define QSPI_SPI_SETUP_REG(n) ((0x54 + 4 * n))
#define QSPI_SPI_SWITCH_REG (0x64)
-#define QSPI_SPI_SETUP1_REG (0x58)
-#define QSPI_SPI_SETUP2_REG (0x5c)
-#define QSPI_SPI_SETUP3_REG (0x60)
#define QSPI_SPI_DATA_REG_1 (0x68)
#define QSPI_SPI_DATA_REG_2 (0x6c)
#define QSPI_SPI_DATA_REG_3 (0x70)
@@ -109,6 +107,16 @@ struct ti_qspi {

#define QSPI_AUTOSUSPEND_TIMEOUT 2000

+#define MEM_CS_EN(n) ((n + 1) << 8)
+
+#define MM_SWITCH 0x1
+
+#define QSPI_SETUP_RD_NORMAL (0x0 << 12)
+#define QSPI_SETUP_RD_DUAL (0x1 << 12)
+#define QSPI_SETUP_RD_QUAD (0x3 << 12)
+#define QSPI_SETUP_ADDR_SHIFT 8
+#define QSPI_SETUP_DUMMY_SHIFT 10
+
static inline unsigned long ti_qspi_read(struct ti_qspi *qspi,
unsigned long reg)
{
@@ -366,6 +374,78 @@ static int qspi_transfer_msg(struct ti_qspi *qspi, struct spi_transfer *t)
return 0;
}

+static void ti_qspi_enable_memory_map(struct spi_device *spi)
+{
+ struct ti_qspi *qspi = spi_master_get_devdata(spi->master);
+ u32 val;
+
+ ti_qspi_write(qspi, MM_SWITCH, QSPI_SPI_SWITCH_REG);
+ if (qspi->ctrl_mod) {
+ val = readl(qspi->ctrl_base);
+ val |= MEM_CS_EN(spi->chip_select);
+ writel(val, qspi->ctrl_base);
+ /* dummy readl to ensure bus sync */
+ readl(qspi->ctrl_base);
+ }
+ qspi->mmap_enabled = true;
+}
+
+static void ti_qspi_disable_memory_map(struct spi_device *spi)
+{
+ struct ti_qspi *qspi = spi_master_get_devdata(spi->master);
+ u32 val;
+
+ ti_qspi_write(qspi, 0, QSPI_SPI_SWITCH_REG);
+ if (qspi->ctrl_mod) {
+ val = readl(qspi->ctrl_base);
+ val &= ~MEM_CS_EN(spi->chip_select);
+ writel(val, qspi->ctrl_base);
+ }
+ qspi->mmap_enabled = false;
+}
+
+static void ti_qspi_setup_mmap_read(struct spi_device *spi,
+ struct spi_flash_read_message *msg)
+{
+ struct ti_qspi *qspi = spi_master_get_devdata(spi->master);
+ u32 memval = msg->read_opcode;
+
+ switch (msg->data_nbits) {
+ case SPI_NBITS_QUAD:
+ memval |= QSPI_SETUP_RD_QUAD;
+ break;
+ case SPI_NBITS_DUAL:
+ memval |= QSPI_SETUP_RD_DUAL;
+ break;
+ default:
+ memval |= QSPI_SETUP_RD_NORMAL;
+ break;
+ }
+ memval |= ((msg->addr_width - 1) << QSPI_SETUP_ADDR_SHIFT |
+ msg->dummy_bytes << QSPI_SETUP_DUMMY_SHIFT);
+ ti_qspi_write(qspi, memval,
+ QSPI_SPI_SETUP_REG(spi->chip_select));
+}
+
+static int ti_qspi_spi_flash_read(struct spi_device *spi,
+ struct spi_flash_read_message *msg)
+{
+ struct ti_qspi *qspi = spi_master_get_devdata(spi->master);
+ int ret = 0;
+
+ mutex_lock(&qspi->list_lock);
+
+ if (!qspi->mmap_enabled)
+ ti_qspi_enable_memory_map(spi);
+ ti_qspi_setup_mmap_read(spi, msg);
+ memcpy_fromio(msg->buf, qspi->mmap_base + msg->from, msg->len);
+ msg->retlen = msg->len;
+
+ mutex_unlock(&qspi->list_lock);
+
+ return ret;
+}
+
static int ti_qspi_start_transfer_one(struct spi_master *master,
struct spi_message *m)
{
@@ -398,6 +478,9 @@ static int ti_qspi_start_transfer_one(struct spi_master *master,

mutex_lock(&qspi->list_lock);

+ if (qspi->mmap_enabled)
+ ti_qspi_disable_memory_map(spi);
+
list_for_each_entry(t, &m->transfers, transfer_list) {
qspi->cmd |= QSPI_WLEN(t->bits_per_word);

@@ -521,12 +604,16 @@ static int ti_qspi_probe(struct platform_device *pdev)
}

if (res_mmap) {
- qspi->mmap_base = devm_ioremap_resource(&pdev->dev, res_mmap);
+ qspi->mmap_base = devm_ioremap_resource(&pdev->dev,
+ res_mmap);
+ master->spi_flash_read = ti_qspi_spi_flash_read;
if (IS_ERR(qspi->mmap_base)) {
- ret = PTR_ERR(qspi->mmap_base);
- goto free_master;
+ dev_err(&pdev->dev,
+ "falling back to PIO mode\n");
+ master->spi_flash_read = NULL;
}
}
+ qspi->mmap_enabled = false;

qspi->fclk = devm_clk_get(&pdev->dev, "fck");
if (IS_ERR(qspi->fclk)) {
--
2.6.3

2015-11-30 05:17:28

by Vignesh Raghavendra

[permalink] [raw]
Subject: [PATCH v4 3/5] mtd: devices: m25p80: add support for mmap read request

Certain spi controllers may provide accelerated interface to read from
m25p80 type flash devices. This interface provides better read
performance than regular SPI interface.
Call spi_flash_read(), if supported, to make use of such interface.

Signed-off-by: Vignesh R <[email protected]>
---

v4:
* Use spi_flash_read_message struct to pass args.
* support passing of opcode/addr/data nbits.

drivers/mtd/devices/m25p80.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/drivers/mtd/devices/m25p80.c b/drivers/mtd/devices/m25p80.c
index fe9ceb7b5405..00094a668c62 100644
--- a/drivers/mtd/devices/m25p80.c
+++ b/drivers/mtd/devices/m25p80.c
@@ -131,6 +131,26 @@ static int m25p80_read(struct spi_nor *nor, loff_t from, size_t len,
/* convert the dummy cycles to the number of bytes */
dummy /= 8;

+ if (spi_flash_read_supported(spi)) {
+ struct spi_flash_read_message msg;
+ int ret;
+
+ msg.buf = buf;
+ msg.from = from;
+ msg.len = len;
+ msg.read_opcode = nor->read_opcode;
+ msg.addr_width = nor->addr_width;
+ msg.dummy_bytes = dummy;
+ /* TODO: Support other combinations */
+ msg.opcode_nbits = SPI_NBITS_SINGLE;
+ msg.addr_nbits = SPI_NBITS_SINGLE;
+ msg.data_nbits = m25p80_rx_nbits(nor);
+
+ ret = spi_flash_read(spi, &msg);
+ *retlen = msg.retlen;
+ return ret;
+ }
+
spi_message_init(&m);
memset(t, 0, (sizeof t));

--
2.6.3

2015-11-30 05:17:41

by Vignesh Raghavendra

[permalink] [raw]
Subject: [PATCH v4 4/5] ARM: dts: DRA7: add entry for qspi mmap region

Add qspi memory mapped region entries for DRA7xx based SoCs. Also,
update the binding documents for the controller to document this change.

Acked-by: Rob Herring <[email protected]>
Signed-off-by: Vignesh R <[email protected]>
---

v4: No changes.

Documentation/devicetree/bindings/spi/ti_qspi.txt | 14 ++++++++++++++
arch/arm/boot/dts/dra7.dtsi | 7 +++++--
2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/spi/ti_qspi.txt b/Documentation/devicetree/bindings/spi/ti_qspi.txt
index 601a360531a5..334aa3f32cbc 100644
--- a/Documentation/devicetree/bindings/spi/ti_qspi.txt
+++ b/Documentation/devicetree/bindings/spi/ti_qspi.txt
@@ -26,3 +26,17 @@ qspi: qspi@4b300000 {
spi-max-frequency = <25000000>;
ti,hwmods = "qspi";
};
+
+For dra7xx:
+qspi: qspi@4b300000 {
+ compatible = "ti,dra7xxx-qspi";
+ reg = <0x4b300000 0x100>,
+ <0x5c000000 0x4000000>,
+ <0x4a002558 0x4>;
+ reg-names = "qspi_base", "qspi_mmap",
+ "qspi_ctrlmod";
+ #address-cells = <1>;
+ #size-cells = <0>;
+ spi-max-frequency = <48000000>;
+ ti,hwmods = "qspi";
+};
diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
index fe99231cbde5..debe7523643d 100644
--- a/arch/arm/boot/dts/dra7.dtsi
+++ b/arch/arm/boot/dts/dra7.dtsi
@@ -1153,8 +1153,11 @@

qspi: qspi@4b300000 {
compatible = "ti,dra7xxx-qspi";
- reg = <0x4b300000 0x100>;
- reg-names = "qspi_base";
+ reg = <0x4b300000 0x100>,
+ <0x5c000000 0x4000000>,
+ <0x4a002558 0x4>;
+ reg-names = "qspi_base", "qspi_mmap",
+ "qspi_ctrlmod";
#address-cells = <1>;
#size-cells = <0>;
ti,hwmods = "qspi";
--
2.6.3

2015-11-30 05:17:46

by Vignesh Raghavendra

[permalink] [raw]
Subject: [PATCH v4 5/5] ARM: dts: AM4372: add entry for qspi mmap region

Add qspi memory mapped region entries for AM43xx based SoCs. Also,
update the binding documents for the controller to document this change.

Acked-by: Rob Herring <[email protected]>
Signed-off-by: Vignesh R <[email protected]>
---

v4: No changes.

Documentation/devicetree/bindings/spi/ti_qspi.txt | 5 +++--
arch/arm/boot/dts/am4372.dtsi | 4 +++-
2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/spi/ti_qspi.txt b/Documentation/devicetree/bindings/spi/ti_qspi.txt
index 334aa3f32cbc..5a1542eda387 100644
--- a/Documentation/devicetree/bindings/spi/ti_qspi.txt
+++ b/Documentation/devicetree/bindings/spi/ti_qspi.txt
@@ -17,9 +17,10 @@ Recommended properties:

Example:

+For am4372:
qspi: qspi@4b300000 {
- compatible = "ti,dra7xxx-qspi";
- reg = <0x47900000 0x100>, <0x30000000 0x3ffffff>;
+ compatible = "ti,am4372-qspi";
+ reg = <0x47900000 0x100>, <0x30000000 0x4000000>;
reg-names = "qspi_base", "qspi_mmap";
#address-cells = <1>;
#size-cells = <0>;
diff --git a/arch/arm/boot/dts/am4372.dtsi b/arch/arm/boot/dts/am4372.dtsi
index d83ff9c9701e..e32d164102d1 100644
--- a/arch/arm/boot/dts/am4372.dtsi
+++ b/arch/arm/boot/dts/am4372.dtsi
@@ -963,7 +963,9 @@

qspi: qspi@47900000 {
compatible = "ti,am4372-qspi";
- reg = <0x47900000 0x100>;
+ reg = <0x47900000 0x100>,
+ <0x30000000 0x4000000>;
+ reg-names = "qspi_base", "qspi_mmap";
#address-cells = <1>;
#size-cells = <0>;
ti,hwmods = "qspi";
--
2.6.3

2015-11-30 22:01:11

by Tony Lindgren

[permalink] [raw]
Subject: Re: [PATCH v4 4/5] ARM: dts: DRA7: add entry for qspi mmap region

* Vignesh R <[email protected]> [151129 21:16]:
> Add qspi memory mapped region entries for DRA7xx based SoCs. Also,
> update the binding documents for the controller to document this change.
>
> Acked-by: Rob Herring <[email protected]>
> Signed-off-by: Vignesh R <[email protected]>
> ---
>
> v4: No changes.

OK I'll apply patches 4 and 5 of this series into omap-for-v4.5/dt thanks.
The rest should get merged via the MTD tree.

Regards,

Tony

2015-11-30 22:34:20

by Tony Lindgren

[permalink] [raw]
Subject: Re: [PATCH v4 4/5] ARM: dts: DRA7: add entry for qspi mmap region

* Vignesh R <[email protected]> [151129 21:16]:
> Add qspi memory mapped region entries for DRA7xx based SoCs. Also,
> update the binding documents for the controller to document this change.
>
> Acked-by: Rob Herring <[email protected]>
> Signed-off-by: Vignesh R <[email protected]>
...
> --- a/Documentation/devicetree/bindings/spi/ti_qspi.txt
> +++ b/Documentation/devicetree/bindings/spi/ti_qspi.txt
> @@ -26,3 +26,17 @@ qspi: qspi@4b300000 {
> spi-max-frequency = <25000000>;
> ti,hwmods = "qspi";
> };
> +
> +For dra7xx:
> +qspi: qspi@4b300000 {
> + compatible = "ti,dra7xxx-qspi";
> + reg = <0x4b300000 0x100>,
> + <0x5c000000 0x4000000>,
> + <0x4a002558 0x4>;
> + reg-names = "qspi_base", "qspi_mmap",
> + "qspi_ctrlmod";
> + #address-cells = <1>;
> + #size-cells = <0>;
> + spi-max-frequency = <48000000>;
> + ti,hwmods = "qspi";
> +};

Actually none of the IO areas above are within the same interconnect target:

0x4b300000 QSPI0 address space in L3 main interconnect
0x5c000000 QSPI1 address space in L3 main interconnect
0x4a002558 CTRL_CORE_CONTROL_IO_2 in System Control Module (SCM) in L4_CFG

The first two address spaces should be two separate instances of this driver.
The CTRL_CORE_CONTROL_IO_2 needs seems like a shared clock register that
needs to be accessed using the clock framework most likely.

So not applying, this will be impossible to move to some real interconnect
driver until these issues are fixed.

Regards,

Tony

2015-11-30 22:34:57

by Tony Lindgren

[permalink] [raw]
Subject: Re: [PATCH v4 4/5] ARM: dts: DRA7: add entry for qspi mmap region

* Tony Lindgren <[email protected]> [151130 14:03]:
> * Vignesh R <[email protected]> [151129 21:16]:
> > Add qspi memory mapped region entries for DRA7xx based SoCs. Also,
> > update the binding documents for the controller to document this change.
> >
> > Acked-by: Rob Herring <[email protected]>
> > Signed-off-by: Vignesh R <[email protected]>
> > ---
> >
> > v4: No changes.
>
> OK I'll apply patches 4 and 5 of this series into omap-for-v4.5/dt thanks.
> The rest should get merged via the MTD tree.

Actually ddropping them after looking at the IO ranges like I commented.

Regards,

Tony

2015-11-30 22:36:29

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH v4 2/5] spi: spi-ti-qspi: add mmap mode read support


Hi,

Vignesh R <[email protected]> writes:
> ti-qspi controller provides mmap port to read data from SPI flashes.
> mmap port is enabled in QSPI_SPI_SWITCH_REG. ctrl module register may
> also need to be accessed for some SoCs. The QSPI_SPI_SETUP_REGx needs to
> be populated with flash specific information like read opcode, read
> mode(quad, dual, normal), address width and dummy bytes. Once,
> controller is in mmap mode, the whole flash memory is available as a
> memory region at SoC specific address. This region can be accessed using
> normal memcpy() (or mem-to-mem dma copy). The ti-qspi controller hardware
> will internally communicate with SPI flash over SPI bus and get the
> requested data.
>
> Implement spi_flash_read() callback to support mmap read over SPI
> flash devices. With this, the read throughput increases from ~100kB/s to
> ~2.5 MB/s.
>
> Signed-off-by: Vignesh R <[email protected]>
> ---
>
> drivers/spi/spi-ti-qspi.c | 101 ++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 94 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/spi/spi-ti-qspi.c b/drivers/spi/spi-ti-qspi.c
> index 64318fcfacf2..cd4e63f45e65 100644
> --- a/drivers/spi/spi-ti-qspi.c
> +++ b/drivers/spi/spi-ti-qspi.c
> @@ -56,6 +56,7 @@ struct ti_qspi {
> u32 dc;
>
> bool ctrl_mod;
> + bool mmap_enabled;
> };
>
> #define QSPI_PID (0x0)
> @@ -65,11 +66,8 @@ struct ti_qspi {
> #define QSPI_SPI_CMD_REG (0x48)
> #define QSPI_SPI_STATUS_REG (0x4c)
> #define QSPI_SPI_DATA_REG (0x50)
> -#define QSPI_SPI_SETUP0_REG (0x54)
> +#define QSPI_SPI_SETUP_REG(n) ((0x54 + 4 * n))
> #define QSPI_SPI_SWITCH_REG (0x64)
> -#define QSPI_SPI_SETUP1_REG (0x58)
> -#define QSPI_SPI_SETUP2_REG (0x5c)
> -#define QSPI_SPI_SETUP3_REG (0x60)
> #define QSPI_SPI_DATA_REG_1 (0x68)
> #define QSPI_SPI_DATA_REG_2 (0x6c)
> #define QSPI_SPI_DATA_REG_3 (0x70)
> @@ -109,6 +107,16 @@ struct ti_qspi {
>
> #define QSPI_AUTOSUSPEND_TIMEOUT 2000
>
> +#define MEM_CS_EN(n) ((n + 1) << 8)
> +
> +#define MM_SWITCH 0x1
> +
> +#define QSPI_SETUP_RD_NORMAL (0x0 << 12)
> +#define QSPI_SETUP_RD_DUAL (0x1 << 12)
> +#define QSPI_SETUP_RD_QUAD (0x3 << 12)
> +#define QSPI_SETUP_ADDR_SHIFT 8
> +#define QSPI_SETUP_DUMMY_SHIFT 10
> +
> static inline unsigned long ti_qspi_read(struct ti_qspi *qspi,
> unsigned long reg)
> {
> @@ -366,6 +374,78 @@ static int qspi_transfer_msg(struct ti_qspi *qspi, struct spi_transfer *t)
> return 0;
> }
>
> +static void ti_qspi_enable_memory_map(struct spi_device *spi)
> +{
> + struct ti_qspi *qspi = spi_master_get_devdata(spi->master);
> + u32 val;
> +
> + ti_qspi_write(qspi, MM_SWITCH, QSPI_SPI_SWITCH_REG);
> + if (qspi->ctrl_mod) {
> + val = readl(qspi->ctrl_base);
> + val |= MEM_CS_EN(spi->chip_select);
> + writel(val, qspi->ctrl_base);
> + /* dummy readl to ensure bus sync */
> + readl(qspi->ctrl_base);
> + }
> + qspi->mmap_enabled = true;
> +}
> +
> +static void ti_qspi_disable_memory_map(struct spi_device *spi)
> +{
> + struct ti_qspi *qspi = spi_master_get_devdata(spi->master);
> + u32 val;
> +
> + ti_qspi_write(qspi, 0, QSPI_SPI_SWITCH_REG);
> + if (qspi->ctrl_mod) {
> + val = readl(qspi->ctrl_base);
> + val &= ~MEM_CS_EN(spi->chip_select);
> + writel(val, qspi->ctrl_base);
> + }
> + qspi->mmap_enabled = false;
> +}
> +
> +static void ti_qspi_setup_mmap_read(struct spi_device *spi,
> + struct spi_flash_read_message *msg)
> +{
> + struct ti_qspi *qspi = spi_master_get_devdata(spi->master);
> + u32 memval = msg->read_opcode;
> +
> + switch (msg->data_nbits) {
> + case SPI_NBITS_QUAD:
> + memval |= QSPI_SETUP_RD_QUAD;
> + break;
> + case SPI_NBITS_DUAL:
> + memval |= QSPI_SETUP_RD_DUAL;
> + break;
> + default:
> + memval |= QSPI_SETUP_RD_NORMAL;
> + break;
> + }
> + memval |= ((msg->addr_width - 1) << QSPI_SETUP_ADDR_SHIFT |
> + msg->dummy_bytes << QSPI_SETUP_DUMMY_SHIFT);
> + ti_qspi_write(qspi, memval,
> + QSPI_SPI_SETUP_REG(spi->chip_select));
> +}
> +
> +static int ti_qspi_spi_flash_read(struct spi_device *spi,
> + struct spi_flash_read_message *msg)
> +{
> + struct ti_qspi *qspi = spi_master_get_devdata(spi->master);
> + int ret = 0;
> +
> + mutex_lock(&qspi->list_lock);
> +
> + if (!qspi->mmap_enabled)
> + ti_qspi_enable_memory_map(spi);
> + ti_qspi_setup_mmap_read(spi, msg);
> + memcpy_fromio(msg->buf, qspi->mmap_base + msg->from, msg->len);
> + msg->retlen = msg->len;

the way I have always expected this to work was that spi controller
would setup the mmap region (using ranges?) and pass the base address to
the SPI NOR flash instead, so that could call standard
write[bwl]/read[bwl] functions.

I mean, when we're dealing with AXI, AHB, PCI, OCP, whatever we
completely ignore these details, why should SPI be different ? If it's
memory mapped, the SW view of the thing is a piece of memory and that
should be accessible with standard {read,write}[bwl]() calls.

I really think $subject is not a good way forward because it gives too
much responsibility to the SPI controller driver; note that this driver
is the one actually accessing the memory map region, instead of simply
setting it up and passing it along.

So the way I see it, the DTS should be like so:

qspi@XYZ {
reg = <XYZ foo>;
[...]
ranges = <0 0 0x30000000 $size>;

flash@0,0 {
compatible = "mp2580";
reg = <0 0 $flash_size>;
};
};


if you have more than one device sitting on this SPI bus using different
chip selects, that's easy too, just change your ranges property:

qspi@XYZ {
reg = <XYZ foo>;
[...]
ranges = <0 0 0x30000000 0x1000
1 0 0x30001000 0x1000
2 0 0x30002000 0x1000>;

flash@0,0 {
[...]
};

flash@1,0 {
[...]
};

flash@2,0 {
[...]
};
};

and so on. From ti-qspi perspective, you should just setup the memory
map and from mp25p80 you would check if your reg property pointed to an
address that looks like memory, then ioremap it and use tradicional
{read,write}[bwl]() accessors. Any reasons why that wasn't done the way
pointed out above ?

--
balbi


Attachments:
signature.asc (818.00 B)

2015-12-01 04:46:07

by Vignesh Raghavendra

[permalink] [raw]
Subject: Re: [PATCH v4 4/5] ARM: dts: DRA7: add entry for qspi mmap region



On 12/01/2015 04:04 AM, Tony Lindgren wrote:
> * Vignesh R <[email protected]> [151129 21:16]:
>> Add qspi memory mapped region entries for DRA7xx based SoCs. Also,
>> update the binding documents for the controller to document this change.
>>
>> Acked-by: Rob Herring <[email protected]>
>> Signed-off-by: Vignesh R <[email protected]>
> ...
>> --- a/Documentation/devicetree/bindings/spi/ti_qspi.txt
>> +++ b/Documentation/devicetree/bindings/spi/ti_qspi.txt
>> @@ -26,3 +26,17 @@ qspi: qspi@4b300000 {
>> spi-max-frequency = <25000000>;
>> ti,hwmods = "qspi";
>> };
>> +
>> +For dra7xx:
>> +qspi: qspi@4b300000 {
>> + compatible = "ti,dra7xxx-qspi";
>> + reg = <0x4b300000 0x100>,
>> + <0x5c000000 0x4000000>,
>> + <0x4a002558 0x4>;
>> + reg-names = "qspi_base", "qspi_mmap",
>> + "qspi_ctrlmod";
>> + #address-cells = <1>;
>> + #size-cells = <0>;
>> + spi-max-frequency = <48000000>;
>> + ti,hwmods = "qspi";
>> +};
>
> Actually none of the IO areas above are within the same interconnect target:
>
> 0x4b300000 QSPI0 address space in L3 main interconnect
> 0x5c000000 QSPI1 address space in L3 main interconnect


First address range (configuration port: 0x4b300000) corresponds to QSPI
registers space. These registers alone are sufficient for generic SPI
communication (serial flash as well as non-flash SPI devices).

In order to speed up SPI flash reads SFI_MM_IF(SPI memory mapped
interface) is provided by QSPI IP. This _cannot_ be used with non-flash
devices.
The second address range (0x5c000000) corresponds to memory-mapped
region of SFI_MM_IF, through which SPI flash memories can be read as if
though they were RAM addresses (i.e using readl/memcpy). The SFI_MM_IF
block that takes care of communicating over SPI bus and getting the data
from flash device.
But SFI_MM_IF block needs to know some flash specific information(such
as read opcode, address bytes, dummy bytes, mode). This information must
first be populated in QSPI_SPI_SETUP*_REG(0x4B300054-60) before
accessing SFI_MM_IF address range via readl.
Both addresses space belong to same instance of the driver, one
corresponds to register space and other is memory-mapped region.
Therefore both regions are claimed by the same driver.

> 0x4a002558 CTRL_CORE_CONTROL_IO_2 in System Control Module (SCM) in L4_CFG
>
> The first two address spaces should be two separate instances of this driver.

Not actually two instances.

> The CTRL_CORE_CONTROL_IO_2 needs seems like a shared clock register that
> needs to be accessed using the clock framework most likely.
>

Not shared clock.
The CTRL_CORE_CONTROL_IO_2[10:8] QSPI_MEMMAPPED_CS bit fields provides a
functionality for remapping the previously described address space which
starts at 0x5C000000 L3_MAIN address to one of the four supported chip
selects.
How about using syscon to access CTRL_CORE_CONTROL_IO_2?

--
Regards
Vignesh

2015-12-01 07:45:18

by Vignesh Raghavendra

[permalink] [raw]
Subject: Re: [PATCH v4 2/5] spi: spi-ti-qspi: add mmap mode read support

Hi Felipe,

On 12/01/2015 04:05 AM, Balbi, Felipe wrote:
>
> Hi,
>
> Vignesh R <[email protected]> writes:
[...]
>> +}
>> +
>> +static int ti_qspi_spi_flash_read(struct spi_device *spi,
>> + struct spi_flash_read_message *msg)
>> +{
>> + struct ti_qspi *qspi = spi_master_get_devdata(spi->master);
>> + int ret = 0;
>> +
>> + mutex_lock(&qspi->list_lock);
>> +
>> + if (!qspi->mmap_enabled)
>> + ti_qspi_enable_memory_map(spi);
>> + ti_qspi_setup_mmap_read(spi, msg);
>> + memcpy_fromio(msg->buf, qspi->mmap_base + msg->from, msg->len);
>> + msg->retlen = msg->len;
>
> the way I have always expected this to work was that spi controller
> would setup the mmap region (using ranges?) and pass the base address to
> the SPI NOR flash instead, so that could call standard
> write[bwl]/read[bwl] functions.
>
> I mean, when we're dealing with AXI, AHB, PCI, OCP, whatever we
> completely ignore these details, why should SPI be different ? If it's
> memory mapped, the SW view of the thing is a piece of memory and that
> should be accessible with standard {read,write}[bwl]() calls.
>

This is just an acceleration provided to improve flash read speeds.
Whenever there is an access to QSPI memory map region, there is a
SFI_MM_IF block in QSPI IP that generates SPI bus signals in order fetch
the data from flash. This SFI_MM_IF must first be configured with flash
specific information like read opcode, read mode, dummy bytes etc (which
may vary from flash to flash), by writing to QSPI_SPI_SETUP*_REG also,
SFI_MM_IF needs to be selected by writing to QSPI_SPI_SWITCH_REG.
IMO, there has to be a call from spi-nor to ti-qspi before using
standard {read,write}[bwl]() calls for populating flash info, power mgmt
and locking SPI bus.

> I really think $subject is not a good way forward because it gives too
> much responsibility to the SPI controller driver; note that this driver
> is the one actually accessing the memory map region, instead of simply
> setting it up and passing it along.
>

How would you propose to setup mmap transfers while taking care of SPI
bus locking and passing of flash info to ti-qspi?


> So the way I see it, the DTS should be like so:
>
> qspi@XYZ {
> reg = <XYZ foo>;
> [...]
> ranges = <0 0 0x30000000 $size>;
>
> flash@0,0 {
> compatible = "mp2580";
> reg = <0 0 $flash_size>;
> };
> };
>
>
> if you have more than one device sitting on this SPI bus using different
> chip selects, that's easy too, just change your ranges property:
>
> qspi@XYZ {
> reg = <XYZ foo>;
> [...]
> ranges = <0 0 0x30000000 0x1000
> 1 0 0x30001000 0x1000
> 2 0 0x30002000 0x1000>;
>
> flash@0,0 {
> [...]
> };
>
> flash@1,0 {
> [...]
> };
>
> flash@2,0 {
> [...]
> };
> };
>

No, even if there are multiple slaves, all slaves map to the same start
address (0x30000000 in above example). Based on the chip-select line
that is asserted (selected by writing to a particular CTRL_MODULE
register field), the corresponding slave responds. Different slaves
cannot be mapped to different address ranges inside mmap address space.
The ranges property will always be the same for all slaves and all
chip-selects.

> and so on. From ti-qspi perspective, you should just setup the memory
> map and from mp25p80 you would check if your reg property pointed to an
> address that looks like memory, then ioremap it and use tradicional
> {read,write}[bwl]() accessors. Any reasons why that wasn't done the way
> pointed out above ?
>

There might be a SPI controller that provides accelerated interface for
SPI flash read not as a memory mapping but some-other way. Brian Norris
has pointed out that there is at least one other controller which
provides such acceleration w/o memory mapping[1] May be Brian can
explain that better?


[1]https://lkml.org/lkml/2015/11/10/618

--
Regards
Vignesh

2015-12-01 16:40:00

by Tony Lindgren

[permalink] [raw]
Subject: Re: [PATCH v4 4/5] ARM: dts: DRA7: add entry for qspi mmap region

* Vignesh R <[email protected]> [151130 20:46]:
> On 12/01/2015 04:04 AM, Tony Lindgren wrote:
> >
> > Actually none of the IO areas above are within the same interconnect target:
> >
> > 0x4b300000 QSPI0 address space in L3 main interconnect
> > 0x5c000000 QSPI1 address space in L3 main interconnect
>
>
> First address range (configuration port: 0x4b300000) corresponds to QSPI
> registers space. These registers alone are sufficient for generic SPI
> communication (serial flash as well as non-flash SPI devices).

OK

> In order to speed up SPI flash reads SFI_MM_IF(SPI memory mapped
> interface) is provided by QSPI IP. This _cannot_ be used with non-flash
> devices.

OK

> The second address range (0x5c000000) corresponds to memory-mapped
> region of SFI_MM_IF, through which SPI flash memories can be read as if
> though they were RAM addresses (i.e using readl/memcpy). The SFI_MM_IF
> block that takes care of communicating over SPI bus and getting the data
> from flash device.

OK

> But SFI_MM_IF block needs to know some flash specific information(such
> as read opcode, address bytes, dummy bytes, mode). This information must
> first be populated in QSPI_SPI_SETUP*_REG(0x4B300054-60) before
> accessing SFI_MM_IF address range via readl.
> Both addresses space belong to same instance of the driver, one
> corresponds to register space and other is memory-mapped region.
> Therefore both regions are claimed by the same driver.

OK

> > 0x4a002558 CTRL_CORE_CONTROL_IO_2 in System Control Module (SCM) in L4_CFG
> >
> > The first two address spaces should be two separate instances of this driver.
>
> Not actually two instances.

OK. They are both on L3 main so that won't cause any issues for separate
interconnect driver instances. As they are still separate targets flushing
a posted write to one area will not flush anything to the other.

> > The CTRL_CORE_CONTROL_IO_2 needs seems like a shared clock register that
> > needs to be accessed using the clock framework most likely.
> >
>
> Not shared clock.
> The CTRL_CORE_CONTROL_IO_2[10:8] QSPI_MEMMAPPED_CS bit fields provides a
> functionality for remapping the previously described address space which
> starts at 0x5C000000 L3_MAIN address to one of the four supported chip
> selects.
> How about using syscon to access CTRL_CORE_CONTROL_IO_2?

A separate driver implementing some Linux generic framework would be idael :)

But if that does not fit, yeah then syscon makes sense as that IO range
will be on separate interconnect driver (and clock and possibly power domain)
instances eventually.

Regards,

Tony

2015-12-02 19:43:26

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v4 1/5] spi: introduce accelerated read support for spi flash devices

On Mon, Nov 30, 2015 at 10:45:11AM +0530, Vignesh R wrote:
> In addition to providing direct access to SPI bus, some spi controller
> hardwares (like ti-qspi) provide special port (like memory mapped port)
> that are optimized to improve SPI flash read performance.

I'm reasonably OK with this from the SPI side but I'd really like to see
people working on MTD say that this makes sense.


Attachments:
(No filename) (389.00 B)
signature.asc (473.00 B)
Download all attachments

2015-12-03 09:42:30

by Cyrille Pitchen

[permalink] [raw]
Subject: Re: [PATCH v4 3/5] mtd: devices: m25p80: add support for mmap read request

Hi Vignesh,

Le 30/11/2015 06:15, Vignesh R a ?crit :
> Certain spi controllers may provide accelerated interface to read from
> m25p80 type flash devices. This interface provides better read
> performance than regular SPI interface.
> Call spi_flash_read(), if supported, to make use of such interface.
>
> Signed-off-by: Vignesh R <[email protected]>
> ---
>
> v4:
> * Use spi_flash_read_message struct to pass args.
> * support passing of opcode/addr/data nbits.
>
> drivers/mtd/devices/m25p80.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/drivers/mtd/devices/m25p80.c b/drivers/mtd/devices/m25p80.c
> index fe9ceb7b5405..00094a668c62 100644
> --- a/drivers/mtd/devices/m25p80.c
> +++ b/drivers/mtd/devices/m25p80.c
> @@ -131,6 +131,26 @@ static int m25p80_read(struct spi_nor *nor, loff_t from, size_t len,
> /* convert the dummy cycles to the number of bytes */
> dummy /= 8;
>
> + if (spi_flash_read_supported(spi)) {
> + struct spi_flash_read_message msg;
> + int ret;
> +
> + msg.buf = buf;
> + msg.from = from;
> + msg.len = len;
> + msg.read_opcode = nor->read_opcode;
> + msg.addr_width = nor->addr_width;
> + msg.dummy_bytes = dummy;
> + /* TODO: Support other combinations */
> + msg.opcode_nbits = SPI_NBITS_SINGLE;
> + msg.addr_nbits = SPI_NBITS_SINGLE;
> + msg.data_nbits = m25p80_rx_nbits(nor);

I wanted to let you know that the support of other SPI protocols has already
been implemented by this series:
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-September/371170.html

patches 2 & 3 handle the probing of the (Q)SPI memory in spi_nor_scan() and
select the right protocols for register read/write, memory read, memory write,
and memory erase operations. The choice is done according to both the memory
and SPI controller capabilities.

patch 4 updates the m25p80 driver to take advantage of the bandwidth increase
allowed by QSPI protocols. For instance, the Atmel QSPI controller, like TI
one, maps the SPI NOR memory to the system bus. Yesterday I ran mtd_speedtest
to compare Fast Read 1-1-1 (0x0b) and Fast Read 1-1-4 (0x6b). The SPI clock
frequency was limited to 83MHz. The QSPI memory is a Micron n25q128a13.

I only had to change the mode argument of spi_nor_scan() from SPI_NOR_QUAD to
SPI_NOR_FAST in the atmel_quadspi driver (based on fsl-quadspi driver from
Freescale since Atmel's driver also uses the system bus mapping for memory
write operations) to force the spi-nor framework to choose the SPI 1-1-1
protocol instead of the SPI-1-1-4.

1 - Fast Read 1-1-1

mtd_speedtest: testing eraseblock write speed
mtd_speedtest: eraseblock read speed is 9319 KiB/s
[...]
mtd_speedtest: testing page read speed
mtd_speedtest: page read speed is 6649 KiB/s
[...]
mtd_speedtest: testing 2 page read speed
mtd_speedtest: 2 page read speed is 7757 KiB/s

2 - Fast Read 1-1-4

mtd_speedtest: testing eraseblock read speed
mtd_speedtest: eraseblock read speed is 30117 KiB/s
[...]
mtd_speedtest: testing page read speed
mtd_speedtest: page read speed is 13096 KiB/s
[...]
mtd_speedtest: testing 2 page read speed
mtd_speedtest: 2 page read speed is 18224 KiB/s

So the performance improvements are:
eraseblock read speed (65536 bytes) : +223%
page read speed (512 bytes) : +97%
2 page read speed (1024 bytes) : +135%


That's why I believe that you could take advantage of these performance
improvements in the TI (Q)SPI controller driver.


Also please note that you may have to add in the struct spi_flash_read_message
two other fields for the number of option/mode cycles and their value.
Option/mode cycles are sent between the address and dummy cycles. They are
used by some memory manufacturers such as Spansion, Micron and Macronix to
enter/leave their continuous read mode.
So if uninitialized dummy cycles are interpreted by the QSPI memory as
option/mode cycles, depending on the actual value, the memory may enter by
mistake in continuous mode. Once in continuous mode, the command op code byte
must not be sent and is not expected by the memory: the memory implicitly uses
the read op code sent in the SPI message which triggered the memory to enter
continuous read mode. Next SPI messages start from the address cycles until
the right option/mode cycles are sent to leave the continuous read mode.

Currently the mtd layer doesn't use this feature but it should be aware of it.
That's why I believe we'll have to address this point in spi_nor_scan() and the
"regular" m25p80() sooner or later.

> +
> + ret = spi_flash_read(spi, &msg);
> + *retlen = msg.retlen;
> + return ret;
> + }
> +
> spi_message_init(&m);
> memset(t, 0, (sizeof t));
>
>

Best regards,

Cyrille

2015-12-03 10:22:14

by Vignesh Raghavendra

[permalink] [raw]
Subject: Re: [PATCH v4 4/5] ARM: dts: DRA7: add entry for qspi mmap region



On 12/01/2015 10:09 PM, Tony Lindgren wrote:
> * Vignesh R <[email protected]> [151130 20:46]:
>> On 12/01/2015 04:04 AM, Tony Lindgren wrote:
>>>
>>> Actually none of the IO areas above are within the same interconnect target:
>>>
>>> 0x4b300000 QSPI0 address space in L3 main interconnect
>>> 0x5c000000 QSPI1 address space in L3 main interconnect
>>
>>
>> First address range (configuration port: 0x4b300000) corresponds to QSPI
>> registers space. These registers alone are sufficient for generic SPI
>> communication (serial flash as well as non-flash SPI devices).
>
> OK
>
>> In order to speed up SPI flash reads SFI_MM_IF(SPI memory mapped
>> interface) is provided by QSPI IP. This _cannot_ be used with non-flash
>> devices.
>
> OK
>
>> The second address range (0x5c000000) corresponds to memory-mapped
>> region of SFI_MM_IF, through which SPI flash memories can be read as if
>> though they were RAM addresses (i.e using readl/memcpy). The SFI_MM_IF
>> block that takes care of communicating over SPI bus and getting the data
>> from flash device.
>
> OK
>
>> But SFI_MM_IF block needs to know some flash specific information(such
>> as read opcode, address bytes, dummy bytes, mode). This information must
>> first be populated in QSPI_SPI_SETUP*_REG(0x4B300054-60) before
>> accessing SFI_MM_IF address range via readl.
>> Both addresses space belong to same instance of the driver, one
>> corresponds to register space and other is memory-mapped region.
>> Therefore both regions are claimed by the same driver.
>
> OK
>
>>> 0x4a002558 CTRL_CORE_CONTROL_IO_2 in System Control Module (SCM) in L4_CFG
>>>
>>> The first two address spaces should be two separate instances of this driver.
>>
>> Not actually two instances.
>
> OK. They are both on L3 main so that won't cause any issues for separate
> interconnect driver instances. As they are still separate targets flushing
> a posted write to one area will not flush anything to the other.
>

I didn't quite understand what you meant by interconnect driver instance.
qspi_base and qspi_mmap region are tightly bound to each other and both
needs to be accessed by ti-qspi driver (though different targets).
Besides qspi_mmap region is only used to read data, there will not be
any write accesses to this target. Are you saying this binding is not
viable?

>>> The CTRL_CORE_CONTROL_IO_2 needs seems like a shared clock register that
>>> needs to be accessed using the clock framework most likely.
>>>
>>
>> Not shared clock.
>> The CTRL_CORE_CONTROL_IO_2[10:8] QSPI_MEMMAPPED_CS bit fields provides a
>> functionality for remapping the previously described address space which
>> starts at 0x5C000000 L3_MAIN address to one of the four supported chip
>> selects.
>> How about using syscon to access CTRL_CORE_CONTROL_IO_2?
>
> A separate driver implementing some Linux generic framework would be idael :)
>
> But if that does not fit, yeah then syscon makes sense as that IO range
> will be on separate interconnect driver (and clock and possibly power domain)
> instances eventually.
>

I will go ahead with syscon for accessing CTRL_CORE_CONTROL_IO_2 register.

--
Regards
Vignesh

2015-12-03 11:24:30

by Vignesh Raghavendra

[permalink] [raw]
Subject: Re: [PATCH v4 3/5] mtd: devices: m25p80: add support for mmap read request

Hi,

On 12/03/2015 03:12 PM, Cyrille Pitchen wrote:
> Hi Vignesh,
>
> Le 30/11/2015 06:15, Vignesh R a ?crit :
>> Certain spi controllers may provide accelerated interface to read from
>> m25p80 type flash devices. This interface provides better read
>> performance than regular SPI interface.
>> Call spi_flash_read(), if supported, to make use of such interface.
>>
>> Signed-off-by: Vignesh R <[email protected]>
>> ---
>>
>> v4:
>> * Use spi_flash_read_message struct to pass args.
>> * support passing of opcode/addr/data nbits.
>>
>> drivers/mtd/devices/m25p80.c | 20 ++++++++++++++++++++
>> 1 file changed, 20 insertions(+)
>>
>> diff --git a/drivers/mtd/devices/m25p80.c b/drivers/mtd/devices/m25p80.c
>> index fe9ceb7b5405..00094a668c62 100644
>> --- a/drivers/mtd/devices/m25p80.c
>> +++ b/drivers/mtd/devices/m25p80.c
>> @@ -131,6 +131,26 @@ static int m25p80_read(struct spi_nor *nor, loff_t from, size_t len,
>> /* convert the dummy cycles to the number of bytes */
>> dummy /= 8;
>>
>> + if (spi_flash_read_supported(spi)) {
>> + struct spi_flash_read_message msg;
>> + int ret;
>> +
>> + msg.buf = buf;
>> + msg.from = from;
>> + msg.len = len;
>> + msg.read_opcode = nor->read_opcode;
>> + msg.addr_width = nor->addr_width;
>> + msg.dummy_bytes = dummy;
>> + /* TODO: Support other combinations */
>> + msg.opcode_nbits = SPI_NBITS_SINGLE;
>> + msg.addr_nbits = SPI_NBITS_SINGLE;
>> + msg.data_nbits = m25p80_rx_nbits(nor);
>
> I wanted to let you know that the support of other SPI protocols has already
> been implemented by this series:
> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-September/371170.html
>
> patches 2 & 3 handle the probing of the (Q)SPI memory in spi_nor_scan() and
> select the right protocols for register read/write, memory read, memory write,
> and memory erase operations. The choice is done according to both the memory
> and SPI controller capabilities.
>
> patch 4 updates the m25p80 driver to take advantage of the bandwidth increase
> allowed by QSPI protocols. For instance, the Atmel QSPI controller, like TI
> one, maps the SPI NOR memory to the system bus. Yesterday I ran mtd_speedtest
> to compare Fast Read 1-1-1 (0x0b) and Fast Read 1-1-4 (0x6b). The SPI clock
> frequency was limited to 83MHz. The QSPI memory is a Micron n25q128a13.
>
> I only had to change the mode argument of spi_nor_scan() from SPI_NOR_QUAD to
> SPI_NOR_FAST in the atmel_quadspi driver (based on fsl-quadspi driver from
> Freescale since Atmel's driver also uses the system bus mapping for memory
> write operations) to force the spi-nor framework to choose the SPI 1-1-1
> protocol instead of the SPI-1-1-4.
>
> 1 - Fast Read 1-1-1
>
> mtd_speedtest: testing eraseblock write speed
> mtd_speedtest: eraseblock read speed is 9319 KiB/s
> [...]
> mtd_speedtest: testing page read speed
> mtd_speedtest: page read speed is 6649 KiB/s
> [...]
> mtd_speedtest: testing 2 page read speed
> mtd_speedtest: 2 page read speed is 7757 KiB/s
>
> 2 - Fast Read 1-1-4
>
> mtd_speedtest: testing eraseblock read speed
> mtd_speedtest: eraseblock read speed is 30117 KiB/s
> [...]
> mtd_speedtest: testing page read speed
> mtd_speedtest: page read speed is 13096 KiB/s
> [...]
> mtd_speedtest: testing 2 page read speed
> mtd_speedtest: 2 page read speed is 18224 KiB/s
>
> So the performance improvements are:
> eraseblock read speed (65536 bytes) : +223%
> page read speed (512 bytes) : +97%
> 2 page read speed (1024 bytes) : +135%
>
>
> That's why I believe that you could take advantage of these performance
> improvements in the TI (Q)SPI controller driver.
>

Well, I based my patches on linux-next as per Brian's suggestion. If
patches to support other flash protocol modes are accepted, I would be
happy to rebase and make use of other modes. It would just be the matter
of populating msg.{opcode/addr}_n_bits correctly.


>
> Also please note that you may have to add in the struct spi_flash_read_message
> two other fields for the number of option/mode cycles and their value.
> Option/mode cycles are sent between the address and dummy cycles. They are
> used by some memory manufacturers such as Spansion, Micron and Macronix to
> enter/leave their continuous read mode.
> So if uninitialized dummy cycles are interpreted by the QSPI memory as
> option/mode cycles, depending on the actual value, the memory may enter by
> mistake in continuous mode. Once in continuous mode, the command op code byte
> must not be sent and is not expected by the memory: the memory implicitly uses
> the read op code sent in the SPI message which triggered the memory to enter
> continuous read mode. Next SPI messages start from the address cycles until
> the right option/mode cycles are sent to leave the continuous read mode.
>
> Currently the mtd layer doesn't use this feature but it should be aware of it.
> That's why I believe we'll have to address this point in spi_nor_scan() and the
> "regular" m25p80() sooner or later.
>

Above feature can be added incrementally over this series, (as and when
m25p80 is updated) moreover ti-qspi controller does not support
option/mode cycles in memory-mapped mode, so I wont be able to test this
feature anyway.


--
Regards
Vignesh

2015-12-10 05:05:09

by Vignesh Raghavendra

[permalink] [raw]
Subject: Re: [PATCH v4 4/5] ARM: dts: DRA7: add entry for qspi mmap region



On 12/03/2015 03:51 PM, Vignesh R wrote:
>
>
> On 12/01/2015 10:09 PM, Tony Lindgren wrote:
>> * Vignesh R <[email protected]> [151130 20:46]:
>>> On 12/01/2015 04:04 AM, Tony Lindgren wrote:
>>>>
...
>>
>> OK. They are both on L3 main so that won't cause any issues for separate
>> interconnect driver instances. As they are still separate targets flushing
>> a posted write to one area will not flush anything to the other.
>>
>
> I didn't quite understand what you meant by interconnect driver instance.
> qspi_base and qspi_mmap region are tightly bound to each other and both
> needs to be accessed by ti-qspi driver (though different targets).
> Besides qspi_mmap region is only used to read data, there will not be
> any write accesses to this target. Are you saying this binding is not
> viable?
>

As I stated above qspi_base and qspi_mmap region are tightly bound,
there is no way to use qspi_mmap region w/o accessing qspi_base. So I am
planning to keep them as it is. I will move qspi_ctrlmod to use syscon.
Something like:

qspi: qspi@4b300000 {
compatible = "ti,dra7xxx-qspi";
reg = <0x4b300000 0x100>,
<0x5c000000 0x4000000>,
reg-names = "qspi_base", "qspi_mmap";
syscon-chipselects = <&scm_conf 0x558>;
#address-cells = <1>;
#size-cells = <0>;
spi-max-frequency = <48000000>;
ti,hwmods = "qspi";
};

Do you think this is not viable in future?


--
Regards
Vignesh

2015-12-10 17:44:16

by Tony Lindgren

[permalink] [raw]
Subject: Re: [PATCH v4 4/5] ARM: dts: DRA7: add entry for qspi mmap region

* Vignesh R <[email protected]> [151209 21:05]:
>
>
> On 12/03/2015 03:51 PM, Vignesh R wrote:
> >
> >
> > On 12/01/2015 10:09 PM, Tony Lindgren wrote:
> >> * Vignesh R <[email protected]> [151130 20:46]:
> >>> On 12/01/2015 04:04 AM, Tony Lindgren wrote:
> >>>>
> ...
> >>
> >> OK. They are both on L3 main so that won't cause any issues for separate
> >> interconnect driver instances. As they are still separate targets flushing
> >> a posted write to one area will not flush anything to the other.
> >>
> >
> > I didn't quite understand what you meant by interconnect driver instance.
> > qspi_base and qspi_mmap region are tightly bound to each other and both
> > needs to be accessed by ti-qspi driver (though different targets).
> > Besides qspi_mmap region is only used to read data, there will not be
> > any write accesses to this target. Are you saying this binding is not
> > viable?
> >
>
> As I stated above qspi_base and qspi_mmap region are tightly bound,
> there is no way to use qspi_mmap region w/o accessing qspi_base. So I am
> planning to keep them as it is. I will move qspi_ctrlmod to use syscon.
> Something like:
>
> qspi: qspi@4b300000 {
> compatible = "ti,dra7xxx-qspi";
> reg = <0x4b300000 0x100>,
> <0x5c000000 0x4000000>,
> reg-names = "qspi_base", "qspi_mmap";
> syscon-chipselects = <&scm_conf 0x558>;
> #address-cells = <1>;
> #size-cells = <0>;
> spi-max-frequency = <48000000>;
> ti,hwmods = "qspi";
> };
>
> Do you think this is not viable in future?

That's OK, they are on the same interconnect instance. And with the
syscon you're not directly tinkering with the SCM registers. So
yeah, that should work in the long run too.

The absolute addresses will get changed to just offsets from the
interconnect target. But if you use the Linux standard functions like
platform_get_resource_byname + devm_request_mem_region + devm_ioremap
then no changes are needed to your driver later on.

Thanks,

Tony