2018-09-13 00:41:47

by David Lechner

[permalink] [raw]
Subject: [PATCH v2 0/4] spi: introduce SPI_CS_WORD mode flag

This series introduces a new SPI mode flag, SPI_CS_WORD, that indicates that
the chip select line should be toggled after each word sent. This series
includes examples of how this can be implemented for both an SPI controller
and an SPI device.

The motivation here is to take advantage of DMA transfers with an analog/digital
convert chip. This chip requires that the chip select line be toggled after
each word send in order to drive the internal circuitry of the chip.

The way we were accomplishing this was, for example, to read 16 channels, we
would create 16 SPI _transfer_s in one SPI _message_. Although this works, it
uses quite a bit of CPU because we have to do work at the end of each transfer
to get the next transfer ready. When you are polling the chip at 100Hz, this
CPU usage adds up.

The SPI controller being used has DMA support, but only on the _transfer_ level
and not on the _message_ level. So, to take advantage of DMA, we need to read
all of the A/DC channels in a single _transfer_. The SPI controller is capable
automatically toggling the chip select line during a DMA transfer, so we are
adding a new SPI flag in order to take advantage of this.

I have tested both the default software implementation and the spi-davinci
implementation with the A/DC driver in this series.

v2 changes:
- dropped patch "spi: spi-bitbang: change flags from u8 to u16" that has already
been applied
- new patch "spi: add software implementation for SPI_CS_WORD" that provides
a default implementation of SPI_CS_WORD for controllers

David Lechner (4):
spi: add new SPI_CS_WORD flag
spi: add software implementation for SPI_CS_WORD
iio: adc: ti-ads7950: use SPI_CS_WORD to reduce CPU usage
spi: spi-davinci: Add support for SPI_CS_WORD

drivers/iio/adc/ti-ads7950.c | 53 +++++++++++++++++++++---------------
drivers/spi/spi-davinci.c | 11 ++++++--
drivers/spi/spi.c | 32 +++++++++++++++++++++-
include/linux/spi/spi.h | 2 +-
4 files changed, 71 insertions(+), 27 deletions(-)

--
2.17.1



2018-09-13 00:40:32

by David Lechner

[permalink] [raw]
Subject: [PATCH v2 3/4] iio: adc: ti-ads7950: use SPI_CS_WORD to reduce CPU usage

This changes how the SPI message for the triggered buffer is setup in
the TI ADS7950 A/DC driver. By using the SPI_CS_WORD flag, we can read
multiple samples in a single SPI transfer. If the SPI controller
supports DMA transfers, we can see a significant reduction in CPU usage.

For example, on an ARM9 system running at 456MHz reading just 4 channels
at 100Hz: before this change, top shows the CPU usage of the IRQ thread
of this driver to be ~7.7%. After this change, the CPU usage drops to
~3.8%.

Signed-off-by: David Lechner <[email protected]>
---

It was brought up in v1 that changing the endianness *could* possible break
users who are taking shortcuts by making assumptions on the data format instead
of using the full ABI to determine the format. Since this only *might* be a
problem I would like to make this change anyway to avoid a bunch of byte
swapping. If it turns out that it really is a problem instead of *might be*
a problem, then we can fix it later.

drivers/iio/adc/ti-ads7950.c | 53 +++++++++++++++++++++---------------
1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/drivers/iio/adc/ti-ads7950.c b/drivers/iio/adc/ti-ads7950.c
index a5bd5944bc66..0ad63592cc3c 100644
--- a/drivers/iio/adc/ti-ads7950.c
+++ b/drivers/iio/adc/ti-ads7950.c
@@ -51,7 +51,7 @@

struct ti_ads7950_state {
struct spi_device *spi;
- struct spi_transfer ring_xfer[TI_ADS7950_MAX_CHAN + 2];
+ struct spi_transfer ring_xfer;
struct spi_transfer scan_single_xfer[3];
struct spi_message ring_msg;
struct spi_message scan_single_msg;
@@ -65,11 +65,11 @@ struct ti_ads7950_state {
* DMA (thus cache coherency maintenance) requires the
* transfer buffers to live in their own cache lines.
*/
- __be16 rx_buf[TI_ADS7950_MAX_CHAN + TI_ADS7950_TIMESTAMP_SIZE]
+ u16 rx_buf[TI_ADS7950_MAX_CHAN + 2 + TI_ADS7950_TIMESTAMP_SIZE]
____cacheline_aligned;
- __be16 tx_buf[TI_ADS7950_MAX_CHAN];
- __be16 single_tx;
- __be16 single_rx;
+ u16 tx_buf[TI_ADS7950_MAX_CHAN + 2];
+ u16 single_tx;
+ u16 single_rx;

};

@@ -108,7 +108,7 @@ enum ti_ads7950_id {
.realbits = bits, \
.storagebits = 16, \
.shift = 12 - (bits), \
- .endianness = IIO_BE, \
+ .endianness = IIO_CPU, \
}, \
}

@@ -249,23 +249,14 @@ static int ti_ads7950_update_scan_mode(struct iio_dev *indio_dev,
len = 0;
for_each_set_bit(i, active_scan_mask, indio_dev->num_channels) {
cmd = TI_ADS7950_CR_WRITE | TI_ADS7950_CR_CHAN(i) | st->settings;
- st->tx_buf[len++] = cpu_to_be16(cmd);
+ st->tx_buf[len++] = cmd;
}

/* Data for the 1st channel is not returned until the 3rd transfer */
- len += 2;
- for (i = 0; i < len; i++) {
- if ((i + 2) < len)
- st->ring_xfer[i].tx_buf = &st->tx_buf[i];
- if (i >= 2)
- st->ring_xfer[i].rx_buf = &st->rx_buf[i - 2];
- st->ring_xfer[i].len = 2;
- st->ring_xfer[i].cs_change = 1;
- }
- /* make sure last transfer's cs_change is not set */
- st->ring_xfer[len - 1].cs_change = 0;
+ st->tx_buf[len++] = 0;
+ st->tx_buf[len++] = 0;

- spi_message_init_with_transfers(&st->ring_msg, st->ring_xfer, len);
+ st->ring_xfer.len = len * 2;

return 0;
}
@@ -281,7 +272,7 @@ static irqreturn_t ti_ads7950_trigger_handler(int irq, void *p)
if (ret < 0)
goto out;

- iio_push_to_buffers_with_timestamp(indio_dev, st->rx_buf,
+ iio_push_to_buffers_with_timestamp(indio_dev, &st->rx_buf[2],
iio_get_time_ns(indio_dev));

out:
@@ -298,13 +289,13 @@ static int ti_ads7950_scan_direct(struct iio_dev *indio_dev, unsigned int ch)
mutex_lock(&indio_dev->mlock);

cmd = TI_ADS7950_CR_WRITE | TI_ADS7950_CR_CHAN(ch) | st->settings;
- st->single_tx = cpu_to_be16(cmd);
+ st->single_tx = cmd;

ret = spi_sync(st->spi, &st->scan_single_msg);
if (ret)
goto out;

- ret = be16_to_cpu(st->single_rx);
+ ret = st->single_rx;

out:
mutex_unlock(&indio_dev->mlock);
@@ -378,6 +369,14 @@ static int ti_ads7950_probe(struct spi_device *spi)
const struct ti_ads7950_chip_info *info;
int ret;

+ spi->bits_per_word = 16;
+ spi->mode |= SPI_CS_WORD;
+ ret = spi_setup(spi);
+ if (ret < 0) {
+ dev_err(&spi->dev, "Error in spi setup\n");
+ return ret;
+ }
+
indio_dev = devm_iio_device_alloc(&spi->dev, sizeof(*st));
if (!indio_dev)
return -ENOMEM;
@@ -398,6 +397,16 @@ static int ti_ads7950_probe(struct spi_device *spi)
indio_dev->num_channels = info->num_channels;
indio_dev->info = &ti_ads7950_info;

+ /* build spi ring message */
+ spi_message_init(&st->ring_msg);
+
+ st->ring_xfer.tx_buf = &st->tx_buf[0];
+ st->ring_xfer.rx_buf = &st->rx_buf[0];
+ /* len will be set later */
+ st->ring_xfer.cs_change = true;
+
+ spi_message_add_tail(&st->ring_xfer, &st->ring_msg);
+
/*
* Setup default message. The sample is read at the end of the first
* transfer, then it takes one full cycle to convert the sample and one
--
2.17.1


2018-09-13 00:40:35

by David Lechner

[permalink] [raw]
Subject: [PATCH v2 4/4] spi: spi-davinci: Add support for SPI_CS_WORD

This adds support for the SPI_CS_WORD flag to the TI DaVinci SPI
driver. This mode can be used as long as we are using the hardware
chip select and not a GPIO chip select.

Signed-off-by: David Lechner <[email protected]>
---
drivers/spi/spi-davinci.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c
index d502cf504deb..8f7dcbc53c57 100644
--- a/drivers/spi/spi-davinci.c
+++ b/drivers/spi/spi-davinci.c
@@ -230,7 +230,8 @@ static void davinci_spi_chipselect(struct spi_device *spi, int value)
!(spi->mode & SPI_CS_HIGH));
} else {
if (value == BITBANG_CS_ACTIVE) {
- spidat1 |= SPIDAT1_CSHOLD_MASK;
+ if (!(spi->mode & SPI_CS_WORD))
+ spidat1 |= SPIDAT1_CSHOLD_MASK;
spidat1 &= ~(0x1 << chip_sel);
}
}
@@ -440,8 +441,12 @@ static int davinci_spi_setup(struct spi_device *spi)
return retval;
}

- if (internal_cs)
+ if (internal_cs) {
set_io_bits(dspi->base + SPIPC0, 1 << spi->chip_select);
+ } else if (spi->mode & SPI_CS_WORD) {
+ dev_err(&spi->dev, "SPI_CS_WORD can't be use with GPIO CS\n");
+ return -EINVAL;
+ }
}

if (spi->mode & SPI_READY)
@@ -976,7 +981,7 @@ static int davinci_spi_probe(struct platform_device *pdev)
dspi->prescaler_limit = pdata->prescaler_limit;
dspi->version = pdata->version;

- dspi->bitbang.flags = SPI_NO_CS | SPI_LSB_FIRST | SPI_LOOP;
+ dspi->bitbang.flags = SPI_NO_CS | SPI_LSB_FIRST | SPI_LOOP | SPI_CS_WORD;
if (dspi->version == SPI_VERSION_2)
dspi->bitbang.flags |= SPI_READY;

--
2.17.1


2018-09-13 00:40:41

by David Lechner

[permalink] [raw]
Subject: [PATCH v2 2/4] spi: add software implementation for SPI_CS_WORD

This adds a default software implementation for the SPI_CS_WORD flag for
controllers that don't have such a feature.

The SPI_CS_WORD flag indicates that the CS line should be toggled
between each word sent, not just between each transfer. The
implementation works by using existing functions to split transfers into
one-word-sized transfers and sets the cs_change bit for each of the
new transfers.

Signed-off-by: David Lechner <[email protected]>
---
drivers/spi/spi.c | 31 +++++++++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 9da0bc5a036c..84518ed58872 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -2783,8 +2783,10 @@ int spi_setup(struct spi_device *spi)
return -EINVAL;
/* help drivers fail *cleanly* when they need options
* that aren't supported with their current controller
+ * SPI_CS_WORD has a fallback software implementation,
+ * so it is ignored here.
*/
- bad_bits = spi->mode & ~spi->controller->mode_bits;
+ bad_bits = spi->mode & ~(spi->controller->mode_bits | SPI_CS_WORD);
ugly_bits = bad_bits &
(SPI_TX_DUAL | SPI_TX_QUAD | SPI_RX_DUAL | SPI_RX_QUAD);
if (ugly_bits) {
@@ -2838,6 +2840,33 @@ static int __spi_validate(struct spi_device *spi, struct spi_message *message)
if (list_empty(&message->transfers))
return -EINVAL;

+ /* If an SPI controller does not support toggling the CS line on each
+ * transfer (indicated by the SPI_CS_WORD flag), we can emulate it by
+ * splitting transfers into one-word transfers and ensuring that
+ * cs_change is set for each transfer.
+ */
+ if ((spi->mode & SPI_CS_WORD) && !(ctlr->mode_bits & SPI_CS_WORD)) {
+ size_t maxsize;
+ int ret;
+
+ maxsize = (spi->bits_per_word + 7) / 8;
+
+ /* spi_split_transfers_maxsize() requires message->spi */
+ message->spi = spi;
+
+ ret = spi_split_transfers_maxsize(ctlr, message, maxsize,
+ GFP_KERNEL);
+ if (ret)
+ return ret;
+
+ list_for_each_entry(xfer, &message->transfers, transfer_list) {
+ /* don't change cs_change on the last entry in the list */
+ if (list_is_last(&xfer->transfer_list, &message->transfers))
+ break;
+ xfer->cs_change = 1;
+ }
+ }
+
/* Half-duplex links include original MicroWire, and ones with
* only one data pin like SPI_3WIRE (switches direction) or where
* either MOSI or MISO is missing. They can also be caused by
--
2.17.1


2018-09-13 00:42:23

by David Lechner

[permalink] [raw]
Subject: [PATCH v2 1/4] spi: add new SPI_CS_WORD flag

This adds a new SPI mode flag, SPI_CS_WORD, that is used to indicate
that a SPI device requires the chip select to be toggled after each
word that is transferred.

Signed-off-by: David Lechner <[email protected]>
---
include/linux/spi/spi.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index d698f9db3484..7bb36145e2ba 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -163,6 +163,7 @@ struct spi_device {
#define SPI_TX_QUAD 0x200 /* transmit with 4 wires */
#define SPI_RX_DUAL 0x400 /* receive with 2 wires */
#define SPI_RX_QUAD 0x800 /* receive with 4 wires */
+#define SPI_CS_WORD 0x1000 /* toggle cs after each word */
int irq;
void *controller_state;
void *controller_data;
@@ -177,7 +178,6 @@ struct spi_device {
* the controller talks to each chip, like:
* - memory packing (12 bit samples into low bits, others zeroed)
* - priority
- * - drop chipselect after each word
* - chipselect delays
* - ...
*/
--
2.17.1


2018-09-13 14:14:53

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] spi: spi-davinci: Add support for SPI_CS_WORD

Hi David,

On Thu, Sep 13, 2018 at 2:40 AM David Lechner <[email protected]> wrote:
> This adds support for the SPI_CS_WORD flag to the TI DaVinci SPI
> driver. This mode can be used as long as we are using the hardware
> chip select and not a GPIO chip select.
>
> Signed-off-by: David Lechner <[email protected]>
> ---
> drivers/spi/spi-davinci.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c
> index d502cf504deb..8f7dcbc53c57 100644
> --- a/drivers/spi/spi-davinci.c
> +++ b/drivers/spi/spi-davinci.c
> @@ -230,7 +230,8 @@ static void davinci_spi_chipselect(struct spi_device *spi, int value)
> !(spi->mode & SPI_CS_HIGH));
> } else {
> if (value == BITBANG_CS_ACTIVE) {
> - spidat1 |= SPIDAT1_CSHOLD_MASK;
> + if (!(spi->mode & SPI_CS_WORD))
> + spidat1 |= SPIDAT1_CSHOLD_MASK;
> spidat1 &= ~(0x1 << chip_sel);
> }
> }
> @@ -440,8 +441,12 @@ static int davinci_spi_setup(struct spi_device *spi)
> return retval;
> }
>
> - if (internal_cs)
> + if (internal_cs) {
> set_io_bits(dspi->base + SPIPC0, 1 << spi->chip_select);
> + } else if (spi->mode & SPI_CS_WORD) {
> + dev_err(&spi->dev, "SPI_CS_WORD can't be use with GPIO CS\n");
> + return -EINVAL;

Does the SPI core fall back to splitting the transfer in this case?

> + }

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2018-09-13 14:27:20

by David Lechner

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] spi: spi-davinci: Add support for SPI_CS_WORD

On 09/13/2018 08:44 AM, Geert Uytterhoeven wrote:
> Hi David,
>
> On Thu, Sep 13, 2018 at 2:40 AM David Lechner <[email protected]> wrote:
>> This adds support for the SPI_CS_WORD flag to the TI DaVinci SPI
>> driver. This mode can be used as long as we are using the hardware
>> chip select and not a GPIO chip select.
>>
>> Signed-off-by: David Lechner <[email protected]>
>> ---
>> drivers/spi/spi-davinci.c | 11 ++++++++---
>> 1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c
>> index d502cf504deb..8f7dcbc53c57 100644
>> --- a/drivers/spi/spi-davinci.c
>> +++ b/drivers/spi/spi-davinci.c
>> @@ -230,7 +230,8 @@ static void davinci_spi_chipselect(struct spi_device *spi, int value)
>> !(spi->mode & SPI_CS_HIGH));
>> } else {
>> if (value == BITBANG_CS_ACTIVE) {
>> - spidat1 |= SPIDAT1_CSHOLD_MASK;
>> + if (!(spi->mode & SPI_CS_WORD))
>> + spidat1 |= SPIDAT1_CSHOLD_MASK;
>> spidat1 &= ~(0x1 << chip_sel);
>> }
>> }
>> @@ -440,8 +441,12 @@ static int davinci_spi_setup(struct spi_device *spi)
>> return retval;
>> }
>>
>> - if (internal_cs)
>> + if (internal_cs) {
>> set_io_bits(dspi->base + SPIPC0, 1 << spi->chip_select);
>> + } else if (spi->mode & SPI_CS_WORD) {
>> + dev_err(&spi->dev, "SPI_CS_WORD can't be use with GPIO CS\n");
>> + return -EINVAL;
>
> Does the SPI core fall back to splitting the transfer in this case?

Hmm... it doesn't look like it.

I suppose it might be best to modify the SPI core to say:

if ((spi->mode & SPI_CS_WORD) && (!(ctlr->mode_bits & SPI_CS_WORD) ||
gpio_is_valid(spi->cs_gpio)) {

instead of:

if ((spi->mode & SPI_CS_WORD) && !(ctlr->mode_bits & SPI_CS_WORD)) {

Then we could drop the error above.

>
>> + }
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds
>


2018-09-16 11:31:07

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] spi: add new SPI_CS_WORD flag

On Wed, 12 Sep 2018 19:39:17 -0500
David Lechner <[email protected]> wrote:

> This adds a new SPI mode flag, SPI_CS_WORD, that is used to indicate
> that a SPI device requires the chip select to be toggled after each
> word that is transferred.
>
> Signed-off-by: David Lechner <[email protected]>
Just a general patch ordering / combining comment.

Seems odd to introduce a flag that a driver might use in a patch
preceding any implementations!

I would have combined this with the next patch so the software fallback
would be in place when the ability to turn it on is added.

Jonathan
> ---
> include/linux/spi/spi.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
> index d698f9db3484..7bb36145e2ba 100644
> --- a/include/linux/spi/spi.h
> +++ b/include/linux/spi/spi.h
> @@ -163,6 +163,7 @@ struct spi_device {
> #define SPI_TX_QUAD 0x200 /* transmit with 4 wires */
> #define SPI_RX_DUAL 0x400 /* receive with 2 wires */
> #define SPI_RX_QUAD 0x800 /* receive with 4 wires */
> +#define SPI_CS_WORD 0x1000 /* toggle cs after each word */
> int irq;
> void *controller_state;
> void *controller_data;
> @@ -177,7 +178,6 @@ struct spi_device {
> * the controller talks to each chip, like:
> * - memory packing (12 bit samples into low bits, others zeroed)
> * - priority
> - * - drop chipselect after each word
> * - chipselect delays
> * - ...
> */


2018-09-16 11:34:21

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] spi: add software implementation for SPI_CS_WORD

On Wed, 12 Sep 2018 19:39:18 -0500
David Lechner <[email protected]> wrote:

> This adds a default software implementation for the SPI_CS_WORD flag for
> controllers that don't have such a feature.
>
> The SPI_CS_WORD flag indicates that the CS line should be toggled
> between each word sent, not just between each transfer. The
> implementation works by using existing functions to split transfers into
> one-word-sized transfers and sets the cs_change bit for each of the
> new transfers.
>
> Signed-off-by: David Lechner <[email protected]>
Looks good to me but I'm not that familiar with this code..

Acked-by: Jonathan Cameron <[email protected]>

> ---
> drivers/spi/spi.c | 31 +++++++++++++++++++++++++++++++-
> 1 file changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
> index 9da0bc5a036c..84518ed58872 100644
> --- a/drivers/spi/spi.c
> +++ b/drivers/spi/spi.c
> @@ -2783,8 +2783,10 @@ int spi_setup(struct spi_device *spi)
> return -EINVAL;
> /* help drivers fail *cleanly* when they need options
> * that aren't supported with their current controller
> + * SPI_CS_WORD has a fallback software implementation,
> + * so it is ignored here.
> */
> - bad_bits = spi->mode & ~spi->controller->mode_bits;
> + bad_bits = spi->mode & ~(spi->controller->mode_bits | SPI_CS_WORD);
> ugly_bits = bad_bits &
> (SPI_TX_DUAL | SPI_TX_QUAD | SPI_RX_DUAL | SPI_RX_QUAD);
> if (ugly_bits) {
> @@ -2838,6 +2840,33 @@ static int __spi_validate(struct spi_device *spi, struct spi_message *message)
> if (list_empty(&message->transfers))
> return -EINVAL;
>
> + /* If an SPI controller does not support toggling the CS line on each
> + * transfer (indicated by the SPI_CS_WORD flag), we can emulate it by
> + * splitting transfers into one-word transfers and ensuring that
> + * cs_change is set for each transfer.
> + */
> + if ((spi->mode & SPI_CS_WORD) && !(ctlr->mode_bits & SPI_CS_WORD)) {
> + size_t maxsize;
> + int ret;
> +
> + maxsize = (spi->bits_per_word + 7) / 8;
> +
> + /* spi_split_transfers_maxsize() requires message->spi */
> + message->spi = spi;
> +
> + ret = spi_split_transfers_maxsize(ctlr, message, maxsize,
> + GFP_KERNEL);
> + if (ret)
> + return ret;
> +
> + list_for_each_entry(xfer, &message->transfers, transfer_list) {
> + /* don't change cs_change on the last entry in the list */
> + if (list_is_last(&xfer->transfer_list, &message->transfers))
> + break;
> + xfer->cs_change = 1;
> + }
> + }
> +
> /* Half-duplex links include original MicroWire, and ones with
> * only one data pin like SPI_3WIRE (switches direction) or where
> * either MOSI or MISO is missing. They can also be caused by


2018-09-16 11:43:00

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] iio: adc: ti-ads7950: use SPI_CS_WORD to reduce CPU usage

On Wed, 12 Sep 2018 19:39:19 -0500
David Lechner <[email protected]> wrote:

> This changes how the SPI message for the triggered buffer is setup in
> the TI ADS7950 A/DC driver. By using the SPI_CS_WORD flag, we can read
> multiple samples in a single SPI transfer. If the SPI controller
> supports DMA transfers, we can see a significant reduction in CPU usage.
>
> For example, on an ARM9 system running at 456MHz reading just 4 channels
> at 100Hz: before this change, top shows the CPU usage of the IRQ thread
> of this driver to be ~7.7%. After this change, the CPU usage drops to
> ~3.8%.
>
> Signed-off-by: David Lechner <[email protected]>

Hi David,

I've managed to forget why we are changing any of the endian related code
at all. The change SPI_CS_WORD result in changes between words which is
fine but it doesn't change any ordering within words? So as such why
do we no longer need to do the big endian conversions?

Jonathan

> ---
>
> It was brought up in v1 that changing the endianness *could* possible break
> users who are taking shortcuts by making assumptions on the data format instead
> of using the full ABI to determine the format. Since this only *might* be a
> problem I would like to make this change anyway to avoid a bunch of byte
> swapping. If it turns out that it really is a problem instead of *might be*
> a problem, then we can fix it later.
>
> drivers/iio/adc/ti-ads7950.c | 53 +++++++++++++++++++++---------------
> 1 file changed, 31 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/iio/adc/ti-ads7950.c b/drivers/iio/adc/ti-ads7950.c
> index a5bd5944bc66..0ad63592cc3c 100644
> --- a/drivers/iio/adc/ti-ads7950.c
> +++ b/drivers/iio/adc/ti-ads7950.c
> @@ -51,7 +51,7 @@
>
> struct ti_ads7950_state {
> struct spi_device *spi;
> - struct spi_transfer ring_xfer[TI_ADS7950_MAX_CHAN + 2];
> + struct spi_transfer ring_xfer;
> struct spi_transfer scan_single_xfer[3];
> struct spi_message ring_msg;
> struct spi_message scan_single_msg;
> @@ -65,11 +65,11 @@ struct ti_ads7950_state {
> * DMA (thus cache coherency maintenance) requires the
> * transfer buffers to live in their own cache lines.
> */
> - __be16 rx_buf[TI_ADS7950_MAX_CHAN + TI_ADS7950_TIMESTAMP_SIZE]
> + u16 rx_buf[TI_ADS7950_MAX_CHAN + 2 + TI_ADS7950_TIMESTAMP_SIZE]
> ____cacheline_aligned;
> - __be16 tx_buf[TI_ADS7950_MAX_CHAN];
> - __be16 single_tx;
> - __be16 single_rx;
> + u16 tx_buf[TI_ADS7950_MAX_CHAN + 2];
> + u16 single_tx;
> + u16 single_rx;
>
> };
>
> @@ -108,7 +108,7 @@ enum ti_ads7950_id {
> .realbits = bits, \
> .storagebits = 16, \
> .shift = 12 - (bits), \
> - .endianness = IIO_BE, \
> + .endianness = IIO_CPU, \
> }, \
> }
>
> @@ -249,23 +249,14 @@ static int ti_ads7950_update_scan_mode(struct iio_dev *indio_dev,
> len = 0;
> for_each_set_bit(i, active_scan_mask, indio_dev->num_channels) {
> cmd = TI_ADS7950_CR_WRITE | TI_ADS7950_CR_CHAN(i) | st->settings;
> - st->tx_buf[len++] = cpu_to_be16(cmd);
> + st->tx_buf[len++] = cmd;
> }
>
> /* Data for the 1st channel is not returned until the 3rd transfer */
> - len += 2;
> - for (i = 0; i < len; i++) {
> - if ((i + 2) < len)
> - st->ring_xfer[i].tx_buf = &st->tx_buf[i];
> - if (i >= 2)
> - st->ring_xfer[i].rx_buf = &st->rx_buf[i - 2];
> - st->ring_xfer[i].len = 2;
> - st->ring_xfer[i].cs_change = 1;
> - }
> - /* make sure last transfer's cs_change is not set */
> - st->ring_xfer[len - 1].cs_change = 0;
> + st->tx_buf[len++] = 0;
> + st->tx_buf[len++] = 0;
>
> - spi_message_init_with_transfers(&st->ring_msg, st->ring_xfer, len);
> + st->ring_xfer.len = len * 2;
>
> return 0;
> }
> @@ -281,7 +272,7 @@ static irqreturn_t ti_ads7950_trigger_handler(int irq, void *p)
> if (ret < 0)
> goto out;
>
> - iio_push_to_buffers_with_timestamp(indio_dev, st->rx_buf,
> + iio_push_to_buffers_with_timestamp(indio_dev, &st->rx_buf[2],
> iio_get_time_ns(indio_dev));
>
> out:
> @@ -298,13 +289,13 @@ static int ti_ads7950_scan_direct(struct iio_dev *indio_dev, unsigned int ch)
> mutex_lock(&indio_dev->mlock);
>
> cmd = TI_ADS7950_CR_WRITE | TI_ADS7950_CR_CHAN(ch) | st->settings;
> - st->single_tx = cpu_to_be16(cmd);
> + st->single_tx = cmd;
>
> ret = spi_sync(st->spi, &st->scan_single_msg);
> if (ret)
> goto out;
>
> - ret = be16_to_cpu(st->single_rx);
> + ret = st->single_rx;
>
> out:
> mutex_unlock(&indio_dev->mlock);
> @@ -378,6 +369,14 @@ static int ti_ads7950_probe(struct spi_device *spi)
> const struct ti_ads7950_chip_info *info;
> int ret;
>
> + spi->bits_per_word = 16;
> + spi->mode |= SPI_CS_WORD;
> + ret = spi_setup(spi);
> + if (ret < 0) {
> + dev_err(&spi->dev, "Error in spi setup\n");
> + return ret;
> + }
> +
> indio_dev = devm_iio_device_alloc(&spi->dev, sizeof(*st));
> if (!indio_dev)
> return -ENOMEM;
> @@ -398,6 +397,16 @@ static int ti_ads7950_probe(struct spi_device *spi)
> indio_dev->num_channels = info->num_channels;
> indio_dev->info = &ti_ads7950_info;
>
> + /* build spi ring message */
> + spi_message_init(&st->ring_msg);
> +
> + st->ring_xfer.tx_buf = &st->tx_buf[0];
> + st->ring_xfer.rx_buf = &st->rx_buf[0];
> + /* len will be set later */
> + st->ring_xfer.cs_change = true;
> +
> + spi_message_add_tail(&st->ring_xfer, &st->ring_msg);
> +
> /*
> * Setup default message. The sample is read at the end of the first
> * transfer, then it takes one full cycle to convert the sample and one


2018-09-16 16:24:52

by David Lechner

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] iio: adc: ti-ads7950: use SPI_CS_WORD to reduce CPU usage

On 09/16/2018 06:41 AM, Jonathan Cameron wrote:
> On Wed, 12 Sep 2018 19:39:19 -0500
> David Lechner <[email protected]> wrote:
>
>> This changes how the SPI message for the triggered buffer is setup in
>> the TI ADS7950 A/DC driver. By using the SPI_CS_WORD flag, we can read
>> multiple samples in a single SPI transfer. If the SPI controller
>> supports DMA transfers, we can see a significant reduction in CPU usage.
>>
>> For example, on an ARM9 system running at 456MHz reading just 4 channels
>> at 100Hz: before this change, top shows the CPU usage of the IRQ thread
>> of this driver to be ~7.7%. After this change, the CPU usage drops to
>> ~3.8%.
>>
>> Signed-off-by: David Lechner <[email protected]>
>
> Hi David,
>
> I've managed to forget why we are changing any of the endian related code
> at all. The change SPI_CS_WORD result in changes between words which is
> fine but it doesn't change any ordering within words? So as such why
> do we no longer need to do the big endian conversions?
>

The big-endian stuff was cargo culted from another driver when this driver
was originally written. It used an SPI word size of 8 bits and big-endian
byte ordering to effectively emulate 16 bit words.

Now, in order to inject a CS toggle between each word, we need to use the
correct word size, otherwise we would get a CS toggle half way through
each word 16-bit. The SPI subsystem uses CPU byte ordering for multi-byte
words. So, the data we get back from the SPI is going to be CPU endian now
no matter what. Converting that to big endian will just add overhead on
little endian systems.

2018-09-17 08:34:51

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] iio: adc: ti-ads7950: use SPI_CS_WORD to reduce CPU usage

On Sun, 16 Sep 2018 11:24:16 -0500
David Lechner <[email protected]> wrote:

> On 09/16/2018 06:41 AM, Jonathan Cameron wrote:
> > On Wed, 12 Sep 2018 19:39:19 -0500
> > David Lechner <[email protected]> wrote:
> >
> >> This changes how the SPI message for the triggered buffer is setup in
> >> the TI ADS7950 A/DC driver. By using the SPI_CS_WORD flag, we can read
> >> multiple samples in a single SPI transfer. If the SPI controller
> >> supports DMA transfers, we can see a significant reduction in CPU usage.
> >>
> >> For example, on an ARM9 system running at 456MHz reading just 4 channels
> >> at 100Hz: before this change, top shows the CPU usage of the IRQ thread
> >> of this driver to be ~7.7%. After this change, the CPU usage drops to
> >> ~3.8%.
> >>
> >> Signed-off-by: David Lechner <[email protected]>
> >
> > Hi David,
> >
> > I've managed to forget why we are changing any of the endian related code
> > at all. The change SPI_CS_WORD result in changes between words which is
> > fine but it doesn't change any ordering within words? So as such why
> > do we no longer need to do the big endian conversions?
> >
>
> The big-endian stuff was cargo culted from another driver when this driver
> was originally written. It used an SPI word size of 8 bits and big-endian
> byte ordering to effectively emulate 16 bit words.
>
> Now, in order to inject a CS toggle between each word, we need to use the
> correct word size, otherwise we would get a CS toggle half way through
> each word 16-bit. The SPI subsystem uses CPU byte ordering for multi-byte
> words. So, the data we get back from the SPI is going to be CPU endian now
> no matter what. Converting that to big endian will just add overhead on
> little endian systems.

Cool. Thanks for the explanation. If you are rerolling put that in the
patch description.

Reviewed-by: Jonathan Cameron <[email protected]>

I'm kind of assuming Mark will want to take this through the SPI tree
if he is happy with it.

Mark, shout if you want to do it another way.

Thanks,

Jonathan


2018-09-17 21:18:25

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] spi: introduce SPI_CS_WORD mode flag

On Wed, Sep 12, 2018 at 07:39:16PM -0500, David Lechner wrote:
> This series introduces a new SPI mode flag, SPI_CS_WORD, that indicates that
> the chip select line should be toggled after each word sent. This series
> includes examples of how this can be implemented for both an SPI controller
> and an SPI device.

The following changes since commit 5b394b2ddf0347bef56e50c69a58773c94343ff3:

Linux 4.19-rc1 (2018-08-26 14:11:59 -0700)

are available in the Git repository at:

https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git tags/spi-cs-word

for you to fetch changes up to cbaa62e0094a840fecc853910e0c0454529cec03:

spi: add software implementation for SPI_CS_WORD (2018-09-17 14:14:18 -0700)

----------------------------------------------------------------
spi: Provide SPI_CS_WORD

This provides a SPI operation mode which changes chip select after every
word, used by some devices such as ADCs and DACs.

----------------------------------------------------------------
David Lechner (2):
spi: add new SPI_CS_WORD flag
spi: add software implementation for SPI_CS_WORD

drivers/spi/spi.c | 31 ++++++++++++++++++++++++++++++-
include/linux/spi/spi.h | 2 +-
2 files changed, 31 insertions(+), 2 deletions(-)


Attachments:
(No filename) (1.26 kB)
signature.asc (499.00 B)
Download all attachments

2018-09-17 21:19:35

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] spi: spi-davinci: Add support for SPI_CS_WORD

On Thu, Sep 13, 2018 at 09:26:48AM -0500, David Lechner wrote:
> On 09/13/2018 08:44 AM, Geert Uytterhoeven wrote:

> I suppose it might be best to modify the SPI core to say:

> if ((spi->mode & SPI_CS_WORD) && (!(ctlr->mode_bits & SPI_CS_WORD) ||
> gpio_is_valid(spi->cs_gpio)) {

> instead of:

> if ((spi->mode & SPI_CS_WORD) && !(ctlr->mode_bits & SPI_CS_WORD)) {

> Then we could drop the error above.

Yes, that makes sense - the same thing is going to apply to any
controller with this support.


Attachments:
(No filename) (528.00 B)
signature.asc (499.00 B)
Download all attachments

2018-09-17 21:24:35

by Mark Brown

[permalink] [raw]
Subject: Applied "spi: add software implementation for SPI_CS_WORD" to the spi tree

The patch

spi: add software implementation for SPI_CS_WORD

has been applied to the spi tree at

https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

From cbaa62e0094a840fecc853910e0c0454529cec03 Mon Sep 17 00:00:00 2001
From: David Lechner <[email protected]>
Date: Wed, 12 Sep 2018 19:39:18 -0500
Subject: [PATCH] spi: add software implementation for SPI_CS_WORD

This adds a default software implementation for the SPI_CS_WORD flag for
controllers that don't have such a feature.

The SPI_CS_WORD flag indicates that the CS line should be toggled
between each word sent, not just between each transfer. The
implementation works by using existing functions to split transfers into
one-word-sized transfers and sets the cs_change bit for each of the
new transfers.

Signed-off-by: David Lechner <[email protected]>
Acked-by: Jonathan Cameron <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
---
drivers/spi/spi.c | 31 ++++++++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index ec395a6baf9c..be73d236919f 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -2774,8 +2774,10 @@ int spi_setup(struct spi_device *spi)
return -EINVAL;
/* help drivers fail *cleanly* when they need options
* that aren't supported with their current controller
+ * SPI_CS_WORD has a fallback software implementation,
+ * so it is ignored here.
*/
- bad_bits = spi->mode & ~spi->controller->mode_bits;
+ bad_bits = spi->mode & ~(spi->controller->mode_bits | SPI_CS_WORD);
ugly_bits = bad_bits &
(SPI_TX_DUAL | SPI_TX_QUAD | SPI_RX_DUAL | SPI_RX_QUAD);
if (ugly_bits) {
@@ -2829,6 +2831,33 @@ static int __spi_validate(struct spi_device *spi, struct spi_message *message)
if (list_empty(&message->transfers))
return -EINVAL;

+ /* If an SPI controller does not support toggling the CS line on each
+ * transfer (indicated by the SPI_CS_WORD flag), we can emulate it by
+ * splitting transfers into one-word transfers and ensuring that
+ * cs_change is set for each transfer.
+ */
+ if ((spi->mode & SPI_CS_WORD) && !(ctlr->mode_bits & SPI_CS_WORD)) {
+ size_t maxsize;
+ int ret;
+
+ maxsize = (spi->bits_per_word + 7) / 8;
+
+ /* spi_split_transfers_maxsize() requires message->spi */
+ message->spi = spi;
+
+ ret = spi_split_transfers_maxsize(ctlr, message, maxsize,
+ GFP_KERNEL);
+ if (ret)
+ return ret;
+
+ list_for_each_entry(xfer, &message->transfers, transfer_list) {
+ /* don't change cs_change on the last entry in the list */
+ if (list_is_last(&xfer->transfer_list, &message->transfers))
+ break;
+ xfer->cs_change = 1;
+ }
+ }
+
/* Half-duplex links include original MicroWire, and ones with
* only one data pin like SPI_3WIRE (switches direction) or where
* either MOSI or MISO is missing. They can also be caused by
--
2.19.0