2020-12-13 03:20:16

by Sowjanya Komatineni

[permalink] [raw]
Subject: [PATCH v3 0/9] Add Tegra Quad SPI driver

This series adds Tegra210, Tegra186, and Tegra194 Quad SPI driver and
enables Quad SPI on Jetson Nano and Jetson Xavier NX.

QSPI controller is available on Tegra210, Tegra186 and Tegra194.

Tegra186 and Tegra194 has additional feature of combined sequence mode
where command, address and data can all be transferred in a single transfer.

Combined sequence mode is useful with DMA mode transfer.

This series does not have combined sequence mode feature as Tegra186/Tegra194
GPCDMA driver is not upstreamed yet.

This series includes
- dt-binding document
- QSPI driver for Tegra210/Tegra186/Tegra194
- Enables QSPI on Jetson Nano and Jetson Xavier NX.

Delta between patch versions:
[v3]: v2 has some mixed patches sent out accidentally.
v3 sends proper patches with fixes mentioned in v2.
[v2]: below v1 feedback
- Added SPI_MASTER_USES_HW_DUMMY_CYCLES flag for controllers supporting
hardware dummy cycles and skips dummy bytes transfer from software for
these controllers.
- Updated dt-binding doc with tx/rx tap delay properties.
- Added qspi_out clock to dt-binding doc which will be used later with
ddr mode support.
- All other v1 feedback on some cleanup.



Sowjanya Komatineni (9):
dt-bindings: clock: tegra: Add clock ID TEGRA210_CLK_QSPI_PM
dt-bindings: spi: Add Tegra Quad SPI device tree binding
MAINTAINERS: Add Tegra Quad SPI driver section
spi: tegra210-quad: Add support for Tegra210 QSPI controller
spi: spi-mem: Allow masters to transfer dummy cycles directly by
hardware
spi: tegra210-quad: Add support for hardware dummy cycles
arm64: tegra: Enable QSPI on Jetson Nano
arm64: tegra: Add QSPI nodes on Tegra194
arm64: tegra: Enable QSPI on Jetson Xavier NX

.../bindings/spi/nvidia,tegra210-quad.yaml | 130 ++
MAINTAINERS | 8 +
.../dts/nvidia/tegra194-p3509-0000+p3668-0000.dts | 12 +
arch/arm64/boot/dts/nvidia/tegra194.dtsi | 24 +
arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts | 12 +
arch/arm64/boot/dts/nvidia/tegra210.dtsi | 5 +-
drivers/spi/Kconfig | 9 +
drivers/spi/Makefile | 1 +
drivers/spi/spi-mem.c | 18 +-
drivers/spi/spi-tegra210-quad.c | 1407 ++++++++++++++++++++
include/dt-bindings/clock/tegra210-car.h | 2 +-
include/linux/spi/spi.h | 8 +
12 files changed, 1626 insertions(+), 10 deletions(-)
create mode 100644 Documentation/devicetree/bindings/spi/nvidia,tegra210-quad.yaml
create mode 100644 drivers/spi/spi-tegra210-quad.c

--
2.7.4


2020-12-13 03:20:54

by Sowjanya Komatineni

[permalink] [raw]
Subject: [PATCH v3 3/9] MAINTAINERS: Add Tegra Quad SPI driver section

Add maintainers and mailing list entries to Tegra Quad SPI driver
section.

Signed-off-by: Sowjanya Komatineni <[email protected]>
---
MAINTAINERS | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5b20bab..19db61f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17447,6 +17447,14 @@ M: Laxman Dewangan <[email protected]>
S: Supported
F: drivers/spi/spi-tegra*

+TEGRA QUAD SPI DRIVER
+M: Thierry Reding <[email protected]>
+M: Jonathan Hunter <[email protected]>
+M: Sowjanya Komatineni <[email protected]>
+L: [email protected]
+S: Maintained
+F: drivers/spi/spi-tegra210-quad.c
+
TEGRA VIDEO DRIVER
M: Thierry Reding <[email protected]>
M: Jonathan Hunter <[email protected]>
--
2.7.4

2020-12-13 03:21:44

by Sowjanya Komatineni

[permalink] [raw]
Subject: [PATCH v3 5/9] spi: spi-mem: Allow masters to transfer dummy cycles directly by hardware

This patch adds a flag SPI_MASTER_USES_HW_DUMMY_CYCLES for the controllers
that support transfer of dummy cycles by the hardware directly.

For controller with this flag set, spi-mem driver will skip dummy bytes
transfer in the spi message.

Controller drivers can get the number of dummy cycles from spi_message.

Signed-off-by: Sowjanya Komatineni <[email protected]>
---
drivers/spi/spi-mem.c | 18 +++++++++++-------
include/linux/spi/spi.h | 8 ++++++++
2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
index f3a3f19..38a523b 100644
--- a/drivers/spi/spi-mem.c
+++ b/drivers/spi/spi-mem.c
@@ -350,13 +350,17 @@ int spi_mem_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
}

if (op->dummy.nbytes) {
- memset(tmpbuf + op->addr.nbytes + 1, 0xff, op->dummy.nbytes);
- xfers[xferpos].tx_buf = tmpbuf + op->addr.nbytes + 1;
- xfers[xferpos].len = op->dummy.nbytes;
- xfers[xferpos].tx_nbits = op->dummy.buswidth;
- spi_message_add_tail(&xfers[xferpos], &msg);
- xferpos++;
- totalxferlen += op->dummy.nbytes;
+ if (ctlr->flags & SPI_MASTER_USES_HW_DUMMY_CYCLES) {
+ msg.dummy_cycles = (op->dummy.nbytes * 8) / op->dummy.buswidth;
+ } else {
+ memset(tmpbuf + op->addr.nbytes + 1, 0xff, op->dummy.nbytes);
+ xfers[xferpos].tx_buf = tmpbuf + op->addr.nbytes + 1;
+ xfers[xferpos].len = op->dummy.nbytes;
+ xfers[xferpos].tx_nbits = op->dummy.buswidth;
+ spi_message_add_tail(&xfers[xferpos], &msg);
+ xferpos++;
+ totalxferlen += op->dummy.nbytes;
+ }
}

if (op->data.nbytes) {
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index aa09fdc..2024149 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -512,6 +512,8 @@ struct spi_controller {

#define SPI_MASTER_GPIO_SS BIT(5) /* GPIO CS must select slave */

+#define SPI_MASTER_USES_HW_DUMMY_CYCLES BIT(6) /* HW dummy bytes transfer */
+
/* flag indicating this is an SPI slave controller */
bool slave;

@@ -1022,6 +1024,12 @@ struct spi_message {
unsigned actual_length;
int status;

+ /*
+ * dummy cycles in the message transfer. This is used by the controller
+ * drivers supports transfer of dummy cycles directly by the hardware.
+ */
+ u8 dummy_cycles;
+
/* for optional use by whatever driver currently owns the
* spi_message ... between calls to spi_async and then later
* complete(), that's the spi_controller controller driver.
--
2.7.4

2020-12-13 03:21:44

by Sowjanya Komatineni

[permalink] [raw]
Subject: [PATCH v3 6/9] spi: tegra210-quad: Add support for hardware dummy cycles

Tegra Quad SPI controller hardware supports sending dummy cycles
after address bytes.

This patch adds this support.

Signed-off-by: Sowjanya Komatineni <[email protected]>
---
drivers/spi/spi-tegra210-quad.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-tegra210-quad.c b/drivers/spi/spi-tegra210-quad.c
index 624f395..1d1b125 100644
--- a/drivers/spi/spi-tegra210-quad.c
+++ b/drivers/spi/spi-tegra210-quad.c
@@ -124,6 +124,13 @@
#define QSPI_DMA_TIMEOUT (msecs_to_jiffies(1000))
#define DEFAULT_QSPI_DMA_BUF_LEN (64 * 1024)

+enum transfer_phase {
+ CMD_BYTE_XFER = 0,
+ ADDR_BYTES_XFER,
+ DATA_BYTES_XFER,
+ MAX_XFERS,
+};
+
struct tegra_qspi_client_data {
int tx_clk_tap_delay;
int rx_clk_tap_delay;
@@ -857,6 +864,8 @@ static int tegra_qspi_start_transfer_one(struct spi_device *spi,

tqspi->command1_reg = command1;

+ tegra_qspi_writel(tqspi, QSPI_NUM_DUMMY_CYCLE(tqspi->dummy_cycles), QSPI_MISC_REG);
+
ret = tegra_qspi_flush_fifos(tqspi, false);
if (ret < 0)
return ret;
@@ -977,7 +986,7 @@ static int tegra_qspi_transfer_one_message(struct spi_master *master, struct spi
struct spi_device *spi = msg->spi;
struct spi_transfer *xfer;
bool is_first_msg = true;
- int ret;
+ int ret, xfer_phase = 0;

msg->status = 0;
msg->actual_length = 0;
@@ -987,6 +996,15 @@ static int tegra_qspi_transfer_one_message(struct spi_master *master, struct spi
list_for_each_entry(xfer, &msg->transfers, transfer_list) {
u32 cmd1;

+ /*
+ * Program dummy clock cycles in Tegra QSPI register only
+ * during address transfer phase.
+ */
+ if (xfer_phase == ADDR_BYTES_XFER)
+ tqspi->dummy_cycles = msg->dummy_cycles;
+ else
+ tqspi->dummy_cycles = 0;
+
reinit_completion(&tqspi->xfer_completion);

cmd1 = tegra_qspi_setup_transfer_one(spi, xfer, is_first_msg);
@@ -1018,6 +1036,7 @@ static int tegra_qspi_transfer_one_message(struct spi_master *master, struct spi
}

msg->actual_length += xfer->len;
+ xfer_phase++;

complete_xfer:
if (ret < 0) {
@@ -1203,6 +1222,7 @@ static int tegra_qspi_probe(struct platform_device *pdev)
master->mode_bits = SPI_MODE_0 | SPI_MODE_3 | SPI_CS_HIGH |
SPI_TX_DUAL | SPI_RX_DUAL | SPI_TX_QUAD | SPI_RX_QUAD;
master->bits_per_word_mask = SPI_BPW_MASK(32) | SPI_BPW_MASK(16) | SPI_BPW_MASK(8);
+ master->flags = SPI_MASTER_USES_HW_DUMMY_CYCLES;
master->setup = tegra_qspi_setup;
master->cleanup = tegra_qspi_cleanup;
master->transfer_one_message = tegra_qspi_transfer_one_message;
--
2.7.4

2020-12-13 10:18:25

by Sowjanya Komatineni

[permalink] [raw]
Subject: [PATCH v3 1/9] dt-bindings: clock: tegra: Add clock ID TEGRA210_CLK_QSPI_PM

Tegra210 QSPI clock output has divider DIV2_SEL which will be enabled
when using DDR interface mode.

This patch adds clock ID for this to dt-binding.

Signed-off-by: Sowjanya Komatineni <[email protected]>
---
include/dt-bindings/clock/tegra210-car.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/dt-bindings/clock/tegra210-car.h b/include/dt-bindings/clock/tegra210-car.h
index ab8b8a7..9cfcc3b 100644
--- a/include/dt-bindings/clock/tegra210-car.h
+++ b/include/dt-bindings/clock/tegra210-car.h
@@ -307,7 +307,7 @@
#define TEGRA210_CLK_AUDIO4 275
#define TEGRA210_CLK_SPDIF 276
/* 277 */
-/* 278 */
+#define TEGRA210_CLK_QSPI_PM 278
/* 279 */
/* 280 */
#define TEGRA210_CLK_SOR0_LVDS 281 /* deprecated */
--
2.7.4

2020-12-13 16:21:46

by Sowjanya Komatineni

[permalink] [raw]
Subject: Re: [PATCH v3 5/9] spi: spi-mem: Allow masters to transfer dummy cycles directly by hardware


On 12/12/20 2:57 AM, Boris Brezillon wrote:
> On Fri, 11 Dec 2020 13:15:59 -0800
> Sowjanya Komatineni <[email protected]> wrote:
>
>> This patch adds a flag SPI_MASTER_USES_HW_DUMMY_CYCLES for the controllers
>> that support transfer of dummy cycles by the hardware directly.
> Hm, not sure this is a good idea. I mean, if we expect regular SPI
> devices to use this feature, then why not, but if it's just for
> spi-mem, I'd recommend implementing a driver-specific exec_op() instead
> of using the default one.

dummy cycles programming is SPI device specific.

Transfer of dummy bytes by SW or HW controller can be depending on
features supported by controller.

Adding controller driver specific exec_op() Just for skipping dummy
bytes transfer will have so much of redundant code pretty much what all
spi_mem_exec_op does.

So in v1, I handled this in controller driver by skipping SW transfer of
dummy bytes during dummy phase and programming dummy cycles in
controller register to allow HW to transfer.

Based on v1 feedback discussion, added this flag
SPI_MASTER_USES_HW_DUMMY_CYCLES which can be used by controllers
supporting HW dummy bytes transfer and updated spi_mem_exec_op to skip
SW dummy bytes.

This helps other controllers supporting HW transfer of dummy bytes as
well just to set the flag and use dummy cycles directly.

> If we go for those core changes, we should at least add a
> ctrl->max_dummy_cycles field so the core can fallback to regular writes
> when the number of dummy cycles in the spi_mem_op exceeds what the
> controller can do.
Yes makes sense. Will add this once we decide on keeping this flag to
identify controllers supporting HW transfer of dummy bytes Vs SW transfer.
>> For controller with this flag set, spi-mem driver will skip dummy bytes
>> transfer in the spi message.
>>
>> Controller drivers can get the number of dummy cycles from spi_message.
>>
>> Signed-off-by: Sowjanya Komatineni <[email protected]>
>> ---
>> drivers/spi/spi-mem.c | 18 +++++++++++-------
>> include/linux/spi/spi.h | 8 ++++++++
>> 2 files changed, 19 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
>> index f3a3f19..38a523b 100644
>> --- a/drivers/spi/spi-mem.c
>> +++ b/drivers/spi/spi-mem.c
>> @@ -350,13 +350,17 @@ int spi_mem_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
>> }
>>
>> if (op->dummy.nbytes) {
>> - memset(tmpbuf + op->addr.nbytes + 1, 0xff, op->dummy.nbytes);
>> - xfers[xferpos].tx_buf = tmpbuf + op->addr.nbytes + 1;
>> - xfers[xferpos].len = op->dummy.nbytes;
>> - xfers[xferpos].tx_nbits = op->dummy.buswidth;
>> - spi_message_add_tail(&xfers[xferpos], &msg);
>> - xferpos++;
>> - totalxferlen += op->dummy.nbytes;
>> + if (ctlr->flags & SPI_MASTER_USES_HW_DUMMY_CYCLES) {
>> + msg.dummy_cycles = (op->dummy.nbytes * 8) / op->dummy.buswidth;
>> + } else {
>> + memset(tmpbuf + op->addr.nbytes + 1, 0xff, op->dummy.nbytes);
>> + xfers[xferpos].tx_buf = tmpbuf + op->addr.nbytes + 1;
>> + xfers[xferpos].len = op->dummy.nbytes;
>> + xfers[xferpos].tx_nbits = op->dummy.buswidth;
>> + spi_message_add_tail(&xfers[xferpos], &msg);
>> + xferpos++;
>> + totalxferlen += op->dummy.nbytes;
>> + }
>> }
>>
>> if (op->data.nbytes) {
>> diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
>> index aa09fdc..2024149 100644
>> --- a/include/linux/spi/spi.h
>> +++ b/include/linux/spi/spi.h
>> @@ -512,6 +512,8 @@ struct spi_controller {
>>
>> #define SPI_MASTER_GPIO_SS BIT(5) /* GPIO CS must select slave */
>>
>> +#define SPI_MASTER_USES_HW_DUMMY_CYCLES BIT(6) /* HW dummy bytes transfer */
>> +
>> /* flag indicating this is an SPI slave controller */
>> bool slave;
>>
>> @@ -1022,6 +1024,12 @@ struct spi_message {
>> unsigned actual_length;
>> int status;
>>
>> + /*
>> + * dummy cycles in the message transfer. This is used by the controller
>> + * drivers supports transfer of dummy cycles directly by the hardware.
>> + */
>> + u8 dummy_cycles;
>> +
>> /* for optional use by whatever driver currently owns the
>> * spi_message ... between calls to spi_async and then later
>> * complete(), that's the spi_controller controller driver.

2020-12-13 18:42:04

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v3 6/9] spi: tegra210-quad: Add support for hardware dummy cycles

On Fri, 11 Dec 2020 13:16:00 -0800
Sowjanya Komatineni <[email protected]> wrote:

> Tegra Quad SPI controller hardware supports sending dummy cycles
> after address bytes.
>
> This patch adds this support.
>
> Signed-off-by: Sowjanya Komatineni <[email protected]>
> ---
> drivers/spi/spi-tegra210-quad.c | 22 +++++++++++++++++++++-
> 1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/spi/spi-tegra210-quad.c b/drivers/spi/spi-tegra210-quad.c
> index 624f395..1d1b125 100644
> --- a/drivers/spi/spi-tegra210-quad.c
> +++ b/drivers/spi/spi-tegra210-quad.c
> @@ -124,6 +124,13 @@
> #define QSPI_DMA_TIMEOUT (msecs_to_jiffies(1000))
> #define DEFAULT_QSPI_DMA_BUF_LEN (64 * 1024)
>
> +enum transfer_phase {
> + CMD_BYTE_XFER = 0,
> + ADDR_BYTES_XFER,
> + DATA_BYTES_XFER,
> + MAX_XFERS,
> +};
> +
> struct tegra_qspi_client_data {
> int tx_clk_tap_delay;
> int rx_clk_tap_delay;
> @@ -857,6 +864,8 @@ static int tegra_qspi_start_transfer_one(struct spi_device *spi,
>
> tqspi->command1_reg = command1;
>
> + tegra_qspi_writel(tqspi, QSPI_NUM_DUMMY_CYCLE(tqspi->dummy_cycles), QSPI_MISC_REG);
> +
> ret = tegra_qspi_flush_fifos(tqspi, false);
> if (ret < 0)
> return ret;
> @@ -977,7 +986,7 @@ static int tegra_qspi_transfer_one_message(struct spi_master *master, struct spi
> struct spi_device *spi = msg->spi;
> struct spi_transfer *xfer;
> bool is_first_msg = true;
> - int ret;
> + int ret, xfer_phase = 0;
>
> msg->status = 0;
> msg->actual_length = 0;
> @@ -987,6 +996,15 @@ static int tegra_qspi_transfer_one_message(struct spi_master *master, struct spi
> list_for_each_entry(xfer, &msg->transfers, transfer_list) {
> u32 cmd1;
>
> + /*
> + * Program dummy clock cycles in Tegra QSPI register only
> + * during address transfer phase.
> + */
> + if (xfer_phase == ADDR_BYTES_XFER)
> + tqspi->dummy_cycles = msg->dummy_cycles;
> + else
> + tqspi->dummy_cycles = 0;

That's fragile. You're trying to guess the phase (which is clearly a
spi-mem concept) from the position of the transfer in the list. What
happens if a spi-mem operation has no address bytes but requires dummy
cycles after the command? What happens if we patch spi_mem_exec_op() to
merge the cmd and address bytes in a single transfer (that's an option
I considered at some point when designing the framework before deciding
it was not worth the extra complexity)?

Besides, I keep thinking the regular transfer path should not assume
it's being passed spi-mem operations, if it is, that means you should
overload the default exec_op(). The more I look at it the less I like
this idea of adding a dummy_cycles field to spi_message. I'm pretty
sure we can find other ways to avoid code duplication if that's your
main concern.

> +
> reinit_completion(&tqspi->xfer_completion);
>
> cmd1 = tegra_qspi_setup_transfer_one(spi, xfer, is_first_msg);
> @@ -1018,6 +1036,7 @@ static int tegra_qspi_transfer_one_message(struct spi_master *master, struct spi
> }
>
> msg->actual_length += xfer->len;
> + xfer_phase++;
>
> complete_xfer:
> if (ret < 0) {
> @@ -1203,6 +1222,7 @@ static int tegra_qspi_probe(struct platform_device *pdev)
> master->mode_bits = SPI_MODE_0 | SPI_MODE_3 | SPI_CS_HIGH |
> SPI_TX_DUAL | SPI_RX_DUAL | SPI_TX_QUAD | SPI_RX_QUAD;
> master->bits_per_word_mask = SPI_BPW_MASK(32) | SPI_BPW_MASK(16) | SPI_BPW_MASK(8);
> + master->flags = SPI_MASTER_USES_HW_DUMMY_CYCLES;
> master->setup = tegra_qspi_setup;
> master->cleanup = tegra_qspi_cleanup;
> master->transfer_one_message = tegra_qspi_transfer_one_message;

2020-12-13 18:44:09

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v3 5/9] spi: spi-mem: Allow masters to transfer dummy cycles directly by hardware

On Sun, 13 Dec 2020 10:54:26 +0100
Boris Brezillon <[email protected]> wrote:

> On Sat, 12 Dec 2020 09:28:50 -0800
> Sowjanya Komatineni <[email protected]> wrote:
>
> > On 12/12/20 2:57 AM, Boris Brezillon wrote:
> > > On Fri, 11 Dec 2020 13:15:59 -0800
> > > Sowjanya Komatineni <[email protected]> wrote:
> > >
> > >> This patch adds a flag SPI_MASTER_USES_HW_DUMMY_CYCLES for the controllers
> > >> that support transfer of dummy cycles by the hardware directly.
> > > Hm, not sure this is a good idea. I mean, if we expect regular SPI
> > > devices to use this feature, then why not, but if it's just for
> > > spi-mem, I'd recommend implementing a driver-specific exec_op() instead
> > > of using the default one.
> >
> > dummy cycles programming is SPI device specific.
> >
> > Transfer of dummy bytes by SW or HW controller can be depending on
> > features supported by controller.
> >
> > Adding controller driver specific exec_op() Just for skipping dummy
> > bytes transfer will have so much of redundant code pretty much what all
> > spi_mem_exec_op does.
> >
> > So in v1, I handled this in controller driver by skipping SW transfer of
> > dummy bytes during dummy phase and programming dummy cycles in
> > controller register to allow HW to transfer.
> >
> > Based on v1 feedback discussion, added this flag
> > SPI_MASTER_USES_HW_DUMMY_CYCLES which can be used by controllers
> > supporting HW dummy bytes transfer and updated spi_mem_exec_op to skip
> > SW dummy bytes.
> >
> > This helps other controllers supporting HW transfer of dummy bytes as
> > well just to set the flag and use dummy cycles directly.
>
> Except saying a spi_message has X dummy cycle is not precise enough.
> Where are those dummy cycles in the transfer sequence? spi-mem has well
> defined sequencing (cmd[+addr][+dummy][+data]) so we know exacly where
> dummy cycles are, but trying to retro-fit the dummy-cycle concept in
> the generic spi_message is confusing IMHO. If want to avoid code
> duplication, I'm pretty sure the driver can be reworked so the
> spi_transfer/exec_op() path can share most of the logic (that probably
> implies declaring a tegra_qspi_op).

Something like that might also do the trick:

--->8---

diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
index ef53290b7d24..8b0476f37fbb 100644
--- a/drivers/spi/spi-mem.c
+++ b/drivers/spi/spi-mem.c
@@ -353,6 +353,7 @@ int spi_mem_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
xfers[xferpos].tx_buf = tmpbuf + op->addr.nbytes + 1;
xfers[xferpos].len = op->dummy.nbytes;
xfers[xferpos].tx_nbits = op->dummy.buswidth;
+ xfers[xferpos].dummy_data = 1;
spi_message_add_tail(&xfers[xferpos], &msg);
xferpos++;
totalxferlen += op->dummy.nbytes;
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index 99380c0825db..ecf7989318c5 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -807,6 +807,10 @@ extern void spi_res_release(struct spi_controller *ctlr,
* transfer. If 0 the default (from @spi_device) is used.
* @bits_per_word: select a bits_per_word other than the device default
* for this transfer. If 0 the default (from @spi_device) is used.
+ * @dummy_data: set to 1 for a dummy transfer (a transfer whose data is
+ * ignored). Controllers that are able to issue dummy cycles can ignore
+ * tx_buf, for those that can't tx_buf will contain dummy bytes. The
+ * number of dummy cycles to issue is (len * tx_bits) / 8.
* @cs_change: affects chipselect after this transfer completes
* @cs_change_delay: delay between cs deassert and assert when
* @cs_change is set and @spi_transfer is not the last in @spi_message
@@ -919,6 +923,7 @@ struct spi_transfer {
struct sg_table tx_sg;
struct sg_table rx_sg;

+ unsigned dummy_data:1;
unsigned cs_change:1;
unsigned tx_nbits:3;
unsigned rx_nbits:3;

2020-12-13 19:06:09

by Sowjanya Komatineni

[permalink] [raw]
Subject: Re: [PATCH v3 5/9] spi: spi-mem: Allow masters to transfer dummy cycles directly by hardware


On 12/13/20 3:28 AM, Boris Brezillon wrote:
> On Sun, 13 Dec 2020 10:54:26 +0100
> Boris Brezillon <[email protected]> wrote:
>
>> On Sat, 12 Dec 2020 09:28:50 -0800
>> Sowjanya Komatineni <[email protected]> wrote:
>>
>>> On 12/12/20 2:57 AM, Boris Brezillon wrote:
>>>> On Fri, 11 Dec 2020 13:15:59 -0800
>>>> Sowjanya Komatineni <[email protected]> wrote:
>>>>
>>>>> This patch adds a flag SPI_MASTER_USES_HW_DUMMY_CYCLES for the controllers
>>>>> that support transfer of dummy cycles by the hardware directly.
>>>> Hm, not sure this is a good idea. I mean, if we expect regular SPI
>>>> devices to use this feature, then why not, but if it's just for
>>>> spi-mem, I'd recommend implementing a driver-specific exec_op() instead
>>>> of using the default one.
>>> dummy cycles programming is SPI device specific.
>>>
>>> Transfer of dummy bytes by SW or HW controller can be depending on
>>> features supported by controller.
>>>
>>> Adding controller driver specific exec_op() Just for skipping dummy
>>> bytes transfer will have so much of redundant code pretty much what all
>>> spi_mem_exec_op does.
>>>
>>> So in v1, I handled this in controller driver by skipping SW transfer of
>>> dummy bytes during dummy phase and programming dummy cycles in
>>> controller register to allow HW to transfer.
>>>
>>> Based on v1 feedback discussion, added this flag
>>> SPI_MASTER_USES_HW_DUMMY_CYCLES which can be used by controllers
>>> supporting HW dummy bytes transfer and updated spi_mem_exec_op to skip
>>> SW dummy bytes.
>>>
>>> This helps other controllers supporting HW transfer of dummy bytes as
>>> well just to set the flag and use dummy cycles directly.
>> Except saying a spi_message has X dummy cycle is not precise enough.
>> Where are those dummy cycles in the transfer sequence? spi-mem has well
>> defined sequencing (cmd[+addr][+dummy][+data]) so we know exacly where
>> dummy cycles are, but trying to retro-fit the dummy-cycle concept in
>> the generic spi_message is confusing IMHO. If want to avoid code
>> duplication, I'm pretty sure the driver can be reworked so the
>> spi_transfer/exec_op() path can share most of the logic (that probably
>> implies declaring a tegra_qspi_op).
> Something like that might also do the trick:
>
> --->8---
>
> diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
> index ef53290b7d24..8b0476f37fbb 100644
> --- a/drivers/spi/spi-mem.c
> +++ b/drivers/spi/spi-mem.c
> @@ -353,6 +353,7 @@ int spi_mem_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
> xfers[xferpos].tx_buf = tmpbuf + op->addr.nbytes + 1;
> xfers[xferpos].len = op->dummy.nbytes;
> xfers[xferpos].tx_nbits = op->dummy.buswidth;
> + xfers[xferpos].dummy_data = 1;
> spi_message_add_tail(&xfers[xferpos], &msg);
> xferpos++;
> totalxferlen += op->dummy.nbytes;
> diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
> index 99380c0825db..ecf7989318c5 100644
> --- a/include/linux/spi/spi.h
> +++ b/include/linux/spi/spi.h
> @@ -807,6 +807,10 @@ extern void spi_res_release(struct spi_controller *ctlr,
> * transfer. If 0 the default (from @spi_device) is used.
> * @bits_per_word: select a bits_per_word other than the device default
> * for this transfer. If 0 the default (from @spi_device) is used.
> + * @dummy_data: set to 1 for a dummy transfer (a transfer whose data is
> + * ignored). Controllers that are able to issue dummy cycles can ignore
> + * tx_buf, for those that can't tx_buf will contain dummy bytes. The
> + * number of dummy cycles to issue is (len * tx_bits) / 8.
> * @cs_change: affects chipselect after this transfer completes
> * @cs_change_delay: delay between cs deassert and assert when
> * @cs_change is set and @spi_transfer is not the last in @spi_message
> @@ -919,6 +923,7 @@ struct spi_transfer {
> struct sg_table tx_sg;
> struct sg_table rx_sg;
>
> + unsigned dummy_data:1;
> unsigned cs_change:1;
> unsigned tx_nbits:3;
> unsigned rx_nbits:3;

Thanks Boris.

Sorry was thinking of spi flash device only as we only support quad spi
flash on Tegra QSPI interface.

But to make it more generic where spi message preparation can happen
from any client driver, agree order of transfers may vary.

Also having controller driver implement exec_op callback is also not
useful considering cases where spi message transfers dont' go thru spi_mem.

Yes adding dummy_data field to indicate transfer is dummy bytes transfer
helps for any types of message transfers.

Tegra QSPI controller dummy cycles need be programmed with transfer
after which dummy cycles are needed.

So, will have v4 to add dummy_data to spi_transfer and will update
controller driver to convert dummy bytes to dummy cycles and program
dummy cycles with its previous transfer and skip dummy transfer buffer.

Thanks

Sowjanya


2020-12-14 02:25:52

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v3 5/9] spi: spi-mem: Allow masters to transfer dummy cycles directly by hardware

On Fri, 11 Dec 2020 13:15:59 -0800
Sowjanya Komatineni <[email protected]> wrote:

> This patch adds a flag SPI_MASTER_USES_HW_DUMMY_CYCLES for the controllers
> that support transfer of dummy cycles by the hardware directly.

Hm, not sure this is a good idea. I mean, if we expect regular SPI
devices to use this feature, then why not, but if it's just for
spi-mem, I'd recommend implementing a driver-specific exec_op() instead
of using the default one.

If we go for those core changes, we should at least add a
ctrl->max_dummy_cycles field so the core can fallback to regular writes
when the number of dummy cycles in the spi_mem_op exceeds what the
controller can do.

>
> For controller with this flag set, spi-mem driver will skip dummy bytes
> transfer in the spi message.
>
> Controller drivers can get the number of dummy cycles from spi_message.
>
> Signed-off-by: Sowjanya Komatineni <[email protected]>
> ---
> drivers/spi/spi-mem.c | 18 +++++++++++-------
> include/linux/spi/spi.h | 8 ++++++++
> 2 files changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
> index f3a3f19..38a523b 100644
> --- a/drivers/spi/spi-mem.c
> +++ b/drivers/spi/spi-mem.c
> @@ -350,13 +350,17 @@ int spi_mem_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
> }
>
> if (op->dummy.nbytes) {
> - memset(tmpbuf + op->addr.nbytes + 1, 0xff, op->dummy.nbytes);
> - xfers[xferpos].tx_buf = tmpbuf + op->addr.nbytes + 1;
> - xfers[xferpos].len = op->dummy.nbytes;
> - xfers[xferpos].tx_nbits = op->dummy.buswidth;
> - spi_message_add_tail(&xfers[xferpos], &msg);
> - xferpos++;
> - totalxferlen += op->dummy.nbytes;
> + if (ctlr->flags & SPI_MASTER_USES_HW_DUMMY_CYCLES) {
> + msg.dummy_cycles = (op->dummy.nbytes * 8) / op->dummy.buswidth;
> + } else {
> + memset(tmpbuf + op->addr.nbytes + 1, 0xff, op->dummy.nbytes);
> + xfers[xferpos].tx_buf = tmpbuf + op->addr.nbytes + 1;
> + xfers[xferpos].len = op->dummy.nbytes;
> + xfers[xferpos].tx_nbits = op->dummy.buswidth;
> + spi_message_add_tail(&xfers[xferpos], &msg);
> + xferpos++;
> + totalxferlen += op->dummy.nbytes;
> + }
> }
>
> if (op->data.nbytes) {
> diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
> index aa09fdc..2024149 100644
> --- a/include/linux/spi/spi.h
> +++ b/include/linux/spi/spi.h
> @@ -512,6 +512,8 @@ struct spi_controller {
>
> #define SPI_MASTER_GPIO_SS BIT(5) /* GPIO CS must select slave */
>
> +#define SPI_MASTER_USES_HW_DUMMY_CYCLES BIT(6) /* HW dummy bytes transfer */
> +
> /* flag indicating this is an SPI slave controller */
> bool slave;
>
> @@ -1022,6 +1024,12 @@ struct spi_message {
> unsigned actual_length;
> int status;
>
> + /*
> + * dummy cycles in the message transfer. This is used by the controller
> + * drivers supports transfer of dummy cycles directly by the hardware.
> + */
> + u8 dummy_cycles;
> +
> /* for optional use by whatever driver currently owns the
> * spi_message ... between calls to spi_async and then later
> * complete(), that's the spi_controller controller driver.

2020-12-14 11:38:17

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v3 5/9] spi: spi-mem: Allow masters to transfer dummy cycles directly by hardware

On Sat, 12 Dec 2020 09:28:50 -0800
Sowjanya Komatineni <[email protected]> wrote:

> On 12/12/20 2:57 AM, Boris Brezillon wrote:
> > On Fri, 11 Dec 2020 13:15:59 -0800
> > Sowjanya Komatineni <[email protected]> wrote:
> >
> >> This patch adds a flag SPI_MASTER_USES_HW_DUMMY_CYCLES for the controllers
> >> that support transfer of dummy cycles by the hardware directly.
> > Hm, not sure this is a good idea. I mean, if we expect regular SPI
> > devices to use this feature, then why not, but if it's just for
> > spi-mem, I'd recommend implementing a driver-specific exec_op() instead
> > of using the default one.
>
> dummy cycles programming is SPI device specific.
>
> Transfer of dummy bytes by SW or HW controller can be depending on
> features supported by controller.
>
> Adding controller driver specific exec_op() Just for skipping dummy
> bytes transfer will have so much of redundant code pretty much what all
> spi_mem_exec_op does.
>
> So in v1, I handled this in controller driver by skipping SW transfer of
> dummy bytes during dummy phase and programming dummy cycles in
> controller register to allow HW to transfer.
>
> Based on v1 feedback discussion, added this flag
> SPI_MASTER_USES_HW_DUMMY_CYCLES which can be used by controllers
> supporting HW dummy bytes transfer and updated spi_mem_exec_op to skip
> SW dummy bytes.
>
> This helps other controllers supporting HW transfer of dummy bytes as
> well just to set the flag and use dummy cycles directly.

Except saying a spi_message has X dummy cycle is not precise enough.
Where are those dummy cycles in the transfer sequence? spi-mem has well
defined sequencing (cmd[+addr][+dummy][+data]) so we know exacly where
dummy cycles are, but trying to retro-fit the dummy-cycle concept in
the generic spi_message is confusing IMHO. If want to avoid code
duplication, I'm pretty sure the driver can be reworked so the
spi_transfer/exec_op() path can share most of the logic (that probably
implies declaring a tegra_qspi_op).

>
> > If we go for those core changes, we should at least add a
> > ctrl->max_dummy_cycles field so the core can fallback to regular writes
> > when the number of dummy cycles in the spi_mem_op exceeds what the
> > controller can do.
> Yes makes sense. Will add this once we decide on keeping this flag to
> identify controllers supporting HW transfer of dummy bytes Vs SW transfer.
> >> For controller with this flag set, spi-mem driver will skip dummy bytes
> >> transfer in the spi message.
> >>
> >> Controller drivers can get the number of dummy cycles from spi_message.
> >>
> >> Signed-off-by: Sowjanya Komatineni <[email protected]>
> >> ---
> >> drivers/spi/spi-mem.c | 18 +++++++++++-------
> >> include/linux/spi/spi.h | 8 ++++++++
> >> 2 files changed, 19 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
> >> index f3a3f19..38a523b 100644
> >> --- a/drivers/spi/spi-mem.c
> >> +++ b/drivers/spi/spi-mem.c
> >> @@ -350,13 +350,17 @@ int spi_mem_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
> >> }
> >>
> >> if (op->dummy.nbytes) {
> >> - memset(tmpbuf + op->addr.nbytes + 1, 0xff, op->dummy.nbytes);
> >> - xfers[xferpos].tx_buf = tmpbuf + op->addr.nbytes + 1;
> >> - xfers[xferpos].len = op->dummy.nbytes;
> >> - xfers[xferpos].tx_nbits = op->dummy.buswidth;
> >> - spi_message_add_tail(&xfers[xferpos], &msg);
> >> - xferpos++;
> >> - totalxferlen += op->dummy.nbytes;
> >> + if (ctlr->flags & SPI_MASTER_USES_HW_DUMMY_CYCLES) {
> >> + msg.dummy_cycles = (op->dummy.nbytes * 8) / op->dummy.buswidth;
> >> + } else {
> >> + memset(tmpbuf + op->addr.nbytes + 1, 0xff, op->dummy.nbytes);
> >> + xfers[xferpos].tx_buf = tmpbuf + op->addr.nbytes + 1;
> >> + xfers[xferpos].len = op->dummy.nbytes;
> >> + xfers[xferpos].tx_nbits = op->dummy.buswidth;
> >> + spi_message_add_tail(&xfers[xferpos], &msg);
> >> + xferpos++;
> >> + totalxferlen += op->dummy.nbytes;
> >> + }
> >> }
> >>
> >> if (op->data.nbytes) {
> >> diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
> >> index aa09fdc..2024149 100644
> >> --- a/include/linux/spi/spi.h
> >> +++ b/include/linux/spi/spi.h
> >> @@ -512,6 +512,8 @@ struct spi_controller {
> >>
> >> #define SPI_MASTER_GPIO_SS BIT(5) /* GPIO CS must select slave */
> >>
> >> +#define SPI_MASTER_USES_HW_DUMMY_CYCLES BIT(6) /* HW dummy bytes transfer */
> >> +
> >> /* flag indicating this is an SPI slave controller */
> >> bool slave;
> >>
> >> @@ -1022,6 +1024,12 @@ struct spi_message {
> >> unsigned actual_length;
> >> int status;
> >>
> >> + /*
> >> + * dummy cycles in the message transfer. This is used by the controller
> >> + * drivers supports transfer of dummy cycles directly by the hardware.
> >> + */
> >> + u8 dummy_cycles;
> >> +
> >> /* for optional use by whatever driver currently owns the
> >> * spi_message ... between calls to spi_async and then later
> >> * complete(), that's the spi_controller controller driver.

2020-12-14 16:28:37

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v3 5/9] spi: spi-mem: Allow masters to transfer dummy cycles directly by hardware

On Sat, Dec 12, 2020 at 11:57:15AM +0100, Boris Brezillon wrote:
> Sowjanya Komatineni <[email protected]> wrote:

> > This patch adds a flag SPI_MASTER_USES_HW_DUMMY_CYCLES for the controllers
> > that support transfer of dummy cycles by the hardware directly.

> Hm, not sure this is a good idea. I mean, if we expect regular SPI
> devices to use this feature, then why not, but if it's just for
> spi-mem, I'd recommend implementing a driver-specific exec_op() instead
> of using the default one.

I *have* seen other high speed devices which had padding bits in the
transfer (see regmap's pad_bits feature), I think that corresponds to
flash dummy bits but haven't checked that the hardware support lines up.
I'm not sure it's ever been seen as something that we particularly
needed to speed up with hardware offload though.

> If we go for those core changes, we should at least add a
> ctrl->max_dummy_cycles field so the core can fallback to regular writes
> when the number of dummy cycles in the spi_mem_op exceeds what the
> controller can do.

That seems sensible if there's a risk of controllers being too limited,
which knowing hardware seems likely.


Attachments:
(No filename) (1.17 kB)
signature.asc (499.00 B)
Download all attachments

2020-12-15 16:09:37

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH v3 1/9] dt-bindings: clock: tegra: Add clock ID TEGRA210_CLK_QSPI_PM

On Fri, 11 Dec 2020 13:15:55 -0800, Sowjanya Komatineni wrote:
> Tegra210 QSPI clock output has divider DIV2_SEL which will be enabled
> when using DDR interface mode.
>
> This patch adds clock ID for this to dt-binding.
>
> Signed-off-by: Sowjanya Komatineni <[email protected]>
> ---
> include/dt-bindings/clock/tegra210-car.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>

Acked-by: Rob Herring <[email protected]>