Hi all,
this series tries to solve a long time issue of compatibility between the
MTD and SPI sub-systems about whether we should use DMA-safe memory.
This issue is visible espcecially when using a UBI file-system on a SPI
NOR memory accessed through a SPI controller behind the m25p80 driver.
The SPI sub-system has already implemented a work-around on its side,
based on the spi_map_buf() function. However this function has its own
limitation too. Especially, even if it builds a 'struct scatterlist' from
a vmalloc'ed buffer, calling dma_map_sg() is still not safe on all
architectures. Especially, on ARM cores using either VIPT or VIVT data
caches, dma_map_sg() doesn't take the cache aliases issue into account.
Then numerous crashes were reported for such architectures.
If think it's high time to provide a reliable solution. So let's try to
work together!
The proposed solution here is based on an former series from Vignesh R and
relies on a bounce buffer.
For this series, I will need some pieces of advice on how to implement
a reliable [spi_nor_]is_dma_safe() function. The proposed implementation
in patch 1 is based on the tests performed by spi_map_buf() but I was
wondring whether it would be more cautious to consider:
DMA-safe <=> allocated by kmalloc'ed.
Actually, it's better using the bounce buffer when not needed than not
using it when we should.
Also the bounce buffer solution in spi-nor is designed so it could be
used as an helper so spi flash controller drivers other than m25p80.c
could now easily add support to DMA transfers for (Fast) Read and/or
Page Program operations.
I've implemented and tested it on a sama5d2 xplained board + Macronix
mx25l25673g SPI NOR memory reading from and writing into some UBI
file-system. I've also tested with mtd_debug to write then read back
a raw data into the SPI NOR memory, later checking the data integrity with
sha1sum.
For the atmel-quadspi.c driver, DMA memcpy() transfers are enabled only if
the "dmacap,mempcy" boolean property is set in the device-tree.
I found this name in some other device-trees already using it for a
boolean property.
Best regards,
Cyrille
Cyrille Pitchen (3):
mtd: spi-nor: add optional DMA-safe bounce buffer for data transfer
dt-bindings: mtd: atmel-quadspi: add an optional property
'dmacap,memcpy'
mtd: atmel-quadspi: add support of DMA memcpy()
.../devicetree/bindings/mtd/atmel-quadspi.txt | 5 +
drivers/mtd/devices/m25p80.c | 4 +-
drivers/mtd/spi-nor/atmel-quadspi.c | 132 +++++++++++++++++++-
drivers/mtd/spi-nor/spi-nor.c | 136 +++++++++++++++++++--
include/linux/mtd/spi-nor.h | 8 ++
5 files changed, 273 insertions(+), 12 deletions(-)
--
2.11.0
This patch takes advantage of the new bounce buffer helper from the
spi-nor framework to add support of memcpy() operations using the DMA
controller.
Since the number of DMA channels is limited and to avoid changing how
those DMA channels are used in existing boards, this new DMA memcpy()
feature is enabled only if the "dmacap,memcpy" boolean property is set in
the device-tree.
Signed-off-by: Cyrille Pitchen <[email protected]>
---
drivers/mtd/spi-nor/atmel-quadspi.c | 132 +++++++++++++++++++++++++++++++++++-
1 file changed, 129 insertions(+), 3 deletions(-)
diff --git a/drivers/mtd/spi-nor/atmel-quadspi.c b/drivers/mtd/spi-nor/atmel-quadspi.c
index 6c5708bacad8..5443e4dba416 100644
--- a/drivers/mtd/spi-nor/atmel-quadspi.c
+++ b/drivers/mtd/spi-nor/atmel-quadspi.c
@@ -22,6 +22,8 @@
#include <linux/kernel.h>
#include <linux/clk.h>
+#include <linux/dma-mapping.h>
+#include <linux/dmaengine.h>
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/delay.h>
@@ -36,6 +38,8 @@
#include <linux/io.h>
#include <linux/gpio.h>
+#define QSPI_DMA_THRESHOLD 32
+
/* QSPI register offsets */
#define QSPI_CR 0x0000 /* Control Register */
#define QSPI_MR 0x0004 /* Mode Register */
@@ -161,6 +165,11 @@ struct atmel_qspi {
struct spi_nor nor;
u32 clk_rate;
struct completion cmd_completion;
+
+ /* DMA transfers */
+ struct dma_chan *chan;
+ dma_addr_t dma_mem;
+ struct completion transfer_complete;
};
struct atmel_qspi_command {
@@ -197,11 +206,96 @@ static inline void qspi_writel(struct atmel_qspi *aq, u32 reg, u32 value)
writel_relaxed(value, aq->regs + reg);
}
+static void atmel_qspi_dma_callback(void *param)
+{
+ struct atmel_qspi *aq = param;
+
+ complete(&aq->transfer_complete);
+}
+
+static int atmel_qspi_run_dma_transfer(struct atmel_qspi *aq,
+ const struct atmel_qspi_command *cmd)
+{
+ struct device *dev = &aq->pdev->dev;
+ enum dma_data_direction dir;
+ struct dma_async_tx_descriptor *desc;
+ dma_addr_t src, dst, dma_addr;
+ dma_cookie_t cookie;
+ u32 offset;
+ void *ptr;
+ int err = 0;
+
+ if (cmd->tx_buf) {
+ ptr = (void *)cmd->tx_buf;
+ dir = DMA_TO_DEVICE;
+ } else if (cmd->rx_buf) {
+ ptr = cmd->rx_buf;
+ dir = DMA_FROM_DEVICE;
+ } else {
+ return -EINVAL;
+ }
+
+ dma_addr = dma_map_single(dev, ptr, cmd->buf_len, dir);
+ if (dma_mapping_error(dev, dma_addr))
+ return -ENOMEM;
+
+ offset = cmd->enable.bits.address ? cmd->address : 0;
+ if (cmd->tx_buf) {
+ dst = aq->dma_mem + offset;
+ src = dma_addr;
+ } else {
+ dst = dma_addr;
+ src = aq->dma_mem + offset;
+ }
+
+ desc = dmaengine_prep_dma_memcpy(aq->chan, dst, src, cmd->buf_len,
+ DMA_CTRL_ACK | DMA_PREP_INTERRUPT);
+ if (!desc) {
+ err = -ENOMEM;
+ goto unmap;
+ }
+
+ reinit_completion(&aq->transfer_complete);
+ desc->callback = atmel_qspi_dma_callback;
+ desc->callback_param = aq;
+
+ cookie = dmaengine_submit(desc);
+ err = dma_submit_error(cookie);
+ if (err) {
+ err = -EIO;
+ goto unmap;
+ }
+
+ dma_async_issue_pending(aq->chan);
+ err = wait_for_completion_timeout(&aq->transfer_complete,
+ msecs_to_jiffies(cmd->buf_len));
+ if (err <= 0) {
+ dmaengine_terminate_sync(aq->chan);
+ err = -ETIMEDOUT;
+ goto unmap;
+ }
+
+ err = 0;
+
+ unmap:
+ dma_unmap_single(dev, dma_addr, cmd->buf_len, dir);
+ return err;
+}
+
static int atmel_qspi_run_transfer(struct atmel_qspi *aq,
const struct atmel_qspi_command *cmd)
{
void __iomem *ahb_mem;
+ /* Try DMA transfer first. */
+ if (aq->chan && cmd->buf_len >= QSPI_DMA_THRESHOLD) {
+ int err = atmel_qspi_run_dma_transfer(aq, cmd);
+
+ /* If the DMA transfer has started, stop here. */
+ if (err != -ENOMEM)
+ return err;
+ }
+
/* Then fallback to a PIO transfer (memcpy() DOES NOT work!) */
ahb_mem = aq->mem;
if (cmd->enable.bits.address)
@@ -604,7 +698,7 @@ static irqreturn_t atmel_qspi_interrupt(int irq, void *dev_id)
static int atmel_qspi_probe(struct platform_device *pdev)
{
- const struct spi_nor_hwcaps hwcaps = {
+ struct spi_nor_hwcaps hwcaps = {
.mask = SNOR_HWCAPS_READ |
SNOR_HWCAPS_READ_FAST |
SNOR_HWCAPS_READ_1_1_2 |
@@ -657,19 +751,44 @@ static int atmel_qspi_probe(struct platform_device *pdev)
goto exit;
}
+ aq->dma_mem = (dma_addr_t)res->start;
+
+ /* Reserve a DMA channel for memcpy(), only if requested */
+ if (of_property_read_bool(np, "dmacap,memcpy")) {
+ dma_cap_mask_t mask;
+
+ dma_cap_zero(mask);
+ dma_cap_set(DMA_MEMCPY, mask);
+
+ aq->chan = dma_request_chan_by_mask(&mask);
+ if (IS_ERR(aq->chan)) {
+ if (PTR_ERR(aq->chan) == -EPROBE_DEFER) {
+ err = -EPROBE_DEFER;
+ goto exit;
+ }
+ dev_warn(&pdev->dev, "no available DMA channel\n");
+ aq->chan = NULL;
+ } else {
+ hwcaps.mask |= SNOR_HWCAPS_RD_BOUNCE |
+ SNOR_HWCAPS_WR_BOUNCE;
+ }
+ }
+
+ init_completion(&aq->transfer_complete);
+
/* Get the peripheral clock */
aq->clk = devm_clk_get(&pdev->dev, NULL);
if (IS_ERR(aq->clk)) {
dev_err(&pdev->dev, "missing peripheral clock\n");
err = PTR_ERR(aq->clk);
- goto exit;
+ goto release_dma_chan;
}
/* Enable the peripheral clock */
err = clk_prepare_enable(aq->clk);
if (err) {
dev_err(&pdev->dev, "failed to enable the peripheral clock\n");
- goto exit;
+ goto release_dma_chan;
}
/* Request the IRQ */
@@ -721,6 +840,9 @@ static int atmel_qspi_probe(struct platform_device *pdev)
disable_clk:
clk_disable_unprepare(aq->clk);
+release_dma_chan:
+ if (aq->chan)
+ dma_release_channel(aq->chan);
exit:
of_node_put(child);
@@ -734,6 +856,10 @@ static int atmel_qspi_remove(struct platform_device *pdev)
mtd_device_unregister(&aq->nor.mtd);
qspi_writel(aq, QSPI_CR, QSPI_CR_QSPIDIS);
clk_disable_unprepare(aq->clk);
+
+ if (aq->chan)
+ dma_release_channel(aq->chan);
+
return 0;
}
--
2.11.0
The optional 'dmacap,memcpy' DT property tells the Atmel QSPI controller
driver to reserve some DMA channel then to use it to perform DMA
memcpy() during data transfers. This feature relies on the generic
bounce buffer helper from spi-nor.c.
Signed-off-by: Cyrille Pitchen <[email protected]>
---
Documentation/devicetree/bindings/mtd/atmel-quadspi.txt | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt b/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt
index b93c1e2f25dd..002d3f0a445b 100644
--- a/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt
+++ b/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt
@@ -12,6 +12,10 @@ Required properties:
- #address-cells: Should be <1>.
- #size-cells: Should be <0>.
+Optional properties:
+- dmacap,memcpy: Reserve a DMA channel to perform DMA memcpy() between the
+ system memory and the QSPI mapped memory.
+
Example:
spi@f0020000 {
@@ -24,6 +28,7 @@ spi@f0020000 {
#size-cells = <0>;
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_spi0_default>;
+ dmacap,memcpy;
m25p80@0 {
...
--
2.11.0
This patch has two purposes:
1 - To fix the compatible issue between the MTD and SPI sub-systems
The MTD sub-system has no particular requirement about the memory areas it
uses. Especially, ubifs is well known for using vmalloc'ed buffers, which
then are not DMA-safe. There are reasons behind that, so we have to deal
with it.
On the other hand, the SPI sub-system clearly states in the kernel doc for
'struct spi-transfer' (include/linux/spi/spi.h) that both .tx_buf and
.rx_buf must point into "dma-safe memory". This requirement has not been
taken into account by the m25p80.c driver, at the border between MTD and
SPI, for a long time now. So it's high time to fix this issue.
2 - To help other SPI flash controller drivers to perform DMA transfers
Those controller drivers suffer the same issue as those behind the
m25p80.c driver in the SPI sub-system: They may be provided by the MTD
sub-system with buffers not suited to DMA operations. We want to avoid
each controller to implement its own bounce buffer so we offer them some
optional bounce buffer, allocated and managed by the spi-nor framework
itself, as an helper to add support to DMA transfers.
Then the patch adds two hardware capabilities for SPI flash controllers,
SNOR_HWCAPS_WR_BOUNCE and SNOR_HWCAPS_RD_BOUNCE.
SPI flash controller drivers are supposed to use them to request the
spi-nor framework to allocate an optional bounce buffer in some
DMA-safe memory area then to use it in some cases during (Fast) Read
and/or Page Program operations.
More precisely, the bounce buffer is used if and only if two conditions
are met:
1 - The SPI flash controller driver has declared the
SNOR_HWCAPS_RD_BOUNCE, resp. SNOR_HWCAPS_WR_BOUNCE for (Fast) Read,
resp. Page Program operations.
2 - The buffer provided by the above MTD layer is not already in a
DMA-safe area.
This policy avoid using the bounce buffer when not explicitly requested
or when not needed, hence limiting the performance penalty.
Besides, the bounce buffer is allocated once for all at the very first
time it is actually needed. This means that as long as all buffers
provided by the above MTD layer are allocated in some DMA-safe areas, the
bounce buffer itself is never allocated.
Finally, the bounce buffer size is limited to 256KiB, the currently known
maximum erase sector size. This tradeoff should still provide good
performances even if new memory parts come with even larger erase sector
sizes, limiting the memory footprint at the same time.
Signed-off-by: Cyrille Pitchen <[email protected]>
---
drivers/mtd/devices/m25p80.c | 4 +-
drivers/mtd/spi-nor/spi-nor.c | 136 +++++++++++++++++++++++++++++++++++++++---
include/linux/mtd/spi-nor.h | 8 +++
3 files changed, 139 insertions(+), 9 deletions(-)
diff --git a/drivers/mtd/devices/m25p80.c b/drivers/mtd/devices/m25p80.c
index a4e18f6aaa33..60878c62a654 100644
--- a/drivers/mtd/devices/m25p80.c
+++ b/drivers/mtd/devices/m25p80.c
@@ -239,7 +239,9 @@ static int m25p_probe(struct spi_device *spi)
struct spi_nor_hwcaps hwcaps = {
.mask = SNOR_HWCAPS_READ |
SNOR_HWCAPS_READ_FAST |
- SNOR_HWCAPS_PP,
+ SNOR_HWCAPS_PP |
+ SNOR_HWCAPS_RD_BOUNCE |
+ SNOR_HWCAPS_WR_BOUNCE,
};
char *flash_name;
int ret;
diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
index 8bafd462f0ae..59f9fbd45234 100644
--- a/drivers/mtd/spi-nor/spi-nor.c
+++ b/drivers/mtd/spi-nor/spi-nor.c
@@ -14,8 +14,10 @@
#include <linux/errno.h>
#include <linux/module.h>
#include <linux/device.h>
+#include <linux/highmem.h>
#include <linux/mutex.h>
#include <linux/math64.h>
+#include <linux/mm.h>
#include <linux/sizes.h>
#include <linux/slab.h>
@@ -1232,6 +1234,56 @@ static const struct flash_info spi_nor_ids[] = {
{ },
};
+static bool spi_nor_is_dma_safe(const void *buf)
+{
+ if (is_vmalloc_addr(buf))
+ return false;
+
+#ifdef CONFIG_HIGHMEM
+ if ((unsigned long)buf >= PKMAP_BASE &&
+ (unsigned long)buf < (PKMAP_BASE + (LAST_PKMAP * PAGE_SIZE)))
+ return false;
+#endif
+
+ return true;
+}
+
+static int spi_nor_get_bounce_buffer(struct spi_nor *nor,
+ u_char **buffer,
+ size_t *buffer_size)
+{
+ const struct flash_info *info = nor->info;
+ /*
+ * Limit the size of the bounce buffer to 256KB: this is currently
+ * the largest known erase sector size (> page size) and should be
+ * enough to still reach good performances if some day new memory
+ * parts use even larger erase sector sizes.
+ */
+ size_t size = min_t(size_t, info->sector_size, SZ_256K);
+
+ /*
+ * Allocate the bounce buffer once for all at the first time it is
+ * actually needed. This prevents wasting some precious memory
+ * in cases where it would never be needed.
+ */
+ if (unlikely(!nor->bounce_buffer)) {
+ nor->bounce_buffer = devm_kmalloc(nor->dev, size, GFP_KERNEL);
+
+ /*
+ * The SPI flash controller driver has required and expects to
+ * use the DMA-safe bounce buffer, so we can't recover from
+ * this allocation failure.
+ */
+ if (!nor->bounce_buffer)
+ return -ENOMEM;
+ }
+
+ *buffer = nor->bounce_buffer;
+ *buffer_size = size;
+
+ return 0;
+}
+
static const struct flash_info *spi_nor_read_id(struct spi_nor *nor)
{
int tmp;
@@ -1260,6 +1312,10 @@ static int spi_nor_read(struct mtd_info *mtd, loff_t from, size_t len,
size_t *retlen, u_char *buf)
{
struct spi_nor *nor = mtd_to_spi_nor(mtd);
+ bool use_bounce = (nor->flags & SNOR_F_USE_RD_BOUNCE) &&
+ !spi_nor_is_dma_safe(buf);
+ u_char *buffer = buf;
+ size_t buffer_size = 0;
int ret;
dev_dbg(nor->dev, "from 0x%08x, len %zd\n", (u32)from, len);
@@ -1268,13 +1324,23 @@ static int spi_nor_read(struct mtd_info *mtd, loff_t from, size_t len,
if (ret)
return ret;
+ if (use_bounce) {
+ ret = spi_nor_get_bounce_buffer(nor, &buffer, &buffer_size);
+ if (ret < 0)
+ goto read_err;
+ }
+
while (len) {
loff_t addr = from;
+ size_t length = len;
if (nor->flags & SNOR_F_S3AN_ADDR_DEFAULT)
addr = spi_nor_s3an_addr_convert(nor, addr);
- ret = nor->read(nor, addr, len, buf);
+ if (use_bounce && length > buffer_size)
+ length = buffer_size;
+
+ ret = nor->read(nor, addr, length, buffer);
if (ret == 0) {
/* We shouldn't see 0-length reads */
ret = -EIO;
@@ -1283,7 +1349,11 @@ static int spi_nor_read(struct mtd_info *mtd, loff_t from, size_t len,
if (ret < 0)
goto read_err;
- WARN_ON(ret > len);
+ WARN_ON(ret > length);
+ if (use_bounce)
+ memcpy(buf, nor->bounce_buffer, ret);
+ else
+ buffer += ret;
*retlen += ret;
buf += ret;
from += ret;
@@ -1300,7 +1370,10 @@ static int sst_write(struct mtd_info *mtd, loff_t to, size_t len,
size_t *retlen, const u_char *buf)
{
struct spi_nor *nor = mtd_to_spi_nor(mtd);
- size_t actual;
+ bool use_bounce = (nor->flags & SNOR_F_USE_WR_BOUNCE) &&
+ !spi_nor_is_dma_safe(buf);
+ u_char *buffer = NULL;
+ size_t actual = 0, buffer_size = 0;
int ret;
dev_dbg(nor->dev, "to 0x%08x, len %zd\n", (u32)to, len);
@@ -1309,6 +1382,12 @@ static int sst_write(struct mtd_info *mtd, loff_t to, size_t len,
if (ret)
return ret;
+ if (use_bounce) {
+ ret = spi_nor_get_bounce_buffer(nor, &buffer, &buffer_size);
+ if (ret < 0)
+ goto sst_write_err;
+ }
+
write_enable(nor);
nor->sst_write_second = false;
@@ -1318,8 +1397,13 @@ static int sst_write(struct mtd_info *mtd, loff_t to, size_t len,
if (actual) {
nor->program_opcode = SPINOR_OP_BP;
+ if (use_bounce)
+ buffer[0] = buf[0];
+ else
+ buffer = (u_char *)buf;
+
/* write one byte. */
- ret = nor->write(nor, to, 1, buf);
+ ret = nor->write(nor, to, 1, buffer);
if (ret < 0)
goto sst_write_err;
WARN(ret != 1, "While writing 1 byte written %i bytes\n",
@@ -1334,8 +1418,15 @@ static int sst_write(struct mtd_info *mtd, loff_t to, size_t len,
for (; actual < len - 1; actual += 2) {
nor->program_opcode = SPINOR_OP_AAI_WP;
+ if (use_bounce) {
+ buffer[0] = buf[actual];
+ buffer[1] = buf[actual + 1];
+ } else {
+ buffer = (u_char *)buf + actual;
+ }
+
/* write two bytes. */
- ret = nor->write(nor, to, 2, buf + actual);
+ ret = nor->write(nor, to, 2, buffer);
if (ret < 0)
goto sst_write_err;
WARN(ret != 2, "While writing 2 bytes written %i bytes\n",
@@ -1358,7 +1449,13 @@ static int sst_write(struct mtd_info *mtd, loff_t to, size_t len,
write_enable(nor);
nor->program_opcode = SPINOR_OP_BP;
- ret = nor->write(nor, to, 1, buf + actual);
+
+ if (use_bounce)
+ buffer[0] = buf[actual];
+ else
+ buffer = (u_char *)buf + actual;
+
+ ret = nor->write(nor, to, 1, buffer);
if (ret < 0)
goto sst_write_err;
WARN(ret != 1, "While writing 1 byte written %i bytes\n",
@@ -1384,7 +1481,10 @@ static int spi_nor_write(struct mtd_info *mtd, loff_t to, size_t len,
size_t *retlen, const u_char *buf)
{
struct spi_nor *nor = mtd_to_spi_nor(mtd);
- size_t page_offset, page_remain, i;
+ bool use_bounce = (nor->flags & SNOR_F_USE_WR_BOUNCE) &&
+ !spi_nor_is_dma_safe(buf);
+ u_char *buffer = NULL;
+ size_t page_offset, page_remain, i, buffer_size = 0;
ssize_t ret;
dev_dbg(nor->dev, "to 0x%08x, len %zd\n", (u32)to, len);
@@ -1393,6 +1493,12 @@ static int spi_nor_write(struct mtd_info *mtd, loff_t to, size_t len,
if (ret)
return ret;
+ if (use_bounce) {
+ ret = spi_nor_get_bounce_buffer(nor, &buffer, &buffer_size);
+ if (ret < 0)
+ goto write_err;
+ }
+
for (i = 0; i < len; ) {
ssize_t written;
loff_t addr = to + i;
@@ -1419,8 +1525,17 @@ static int spi_nor_write(struct mtd_info *mtd, loff_t to, size_t len,
if (nor->flags & SNOR_F_S3AN_ADDR_DEFAULT)
addr = spi_nor_s3an_addr_convert(nor, addr);
+ if (use_bounce) {
+ if (page_remain > buffer_size)
+ page_remain = buffer_size;
+
+ memcpy(nor->bounce_buffer, buf + i, page_remain);
+ } else {
+ buffer = (u_char *)buf + i;
+ }
+
write_enable(nor);
- ret = nor->write(nor, addr, page_remain, buf + i);
+ ret = nor->write(nor, addr, page_remain, buffer);
if (ret < 0)
goto write_err;
written = ret;
@@ -2814,6 +2929,11 @@ int spi_nor_scan(struct spi_nor *nor, const char *name,
if (info->flags & SPI_S3AN)
nor->flags |= SNOR_F_READY_XSR_RDY;
+ if (hwcaps->mask & SNOR_HWCAPS_RD_BOUNCE)
+ nor->flags |= SNOR_F_USE_RD_BOUNCE;
+ if (hwcaps->mask & SNOR_HWCAPS_WR_BOUNCE)
+ nor->flags |= SNOR_F_USE_WR_BOUNCE;
+
/* Parse the Serial Flash Discoverable Parameters table. */
ret = spi_nor_init_params(nor, info, ¶ms);
if (ret)
diff --git a/include/linux/mtd/spi-nor.h b/include/linux/mtd/spi-nor.h
index de36969eb359..9f4218990fc7 100644
--- a/include/linux/mtd/spi-nor.h
+++ b/include/linux/mtd/spi-nor.h
@@ -233,6 +233,8 @@ enum spi_nor_option_flags {
SNOR_F_S3AN_ADDR_DEFAULT = BIT(3),
SNOR_F_READY_XSR_RDY = BIT(4),
SNOR_F_USE_CLSR = BIT(5),
+ SNOR_F_USE_RD_BOUNCE = BIT(6),
+ SNOR_F_USE_WR_BOUNCE = BIT(7),
};
/**
@@ -259,6 +261,7 @@ struct flash_info;
* @write_proto: the SPI protocol for write operations
* @reg_proto the SPI protocol for read_reg/write_reg/erase operations
* @cmd_buf: used by the write_reg
+ * @bounce_buffer: an optional kmalloc'ed buffer as DMA transfer helper
* @prepare: [OPTIONAL] do some preparations for the
* read/write/erase/lock/unlock operations
* @unprepare: [OPTIONAL] do some post work after the
@@ -294,6 +297,7 @@ struct spi_nor {
bool sst_write_second;
u32 flags;
u8 cmd_buf[SPI_NOR_MAX_CMD_SIZE];
+ void *bounce_buffer;
int (*prepare)(struct spi_nor *nor, enum spi_nor_ops ops);
void (*unprepare)(struct spi_nor *nor, enum spi_nor_ops ops);
@@ -386,6 +390,10 @@ struct spi_nor_hwcaps {
#define SNOR_HWCAPS_PP_1_8_8 BIT(21)
#define SNOR_HWCAPS_PP_8_8_8 BIT(22)
+/* Bounce buffer helper, for DMA transfer for instance. */
+#define SNOR_HWCAPS_WR_BOUNCE BIT(30)
+#define SNOR_HWCAPS_RD_BOUNCE BIT(31)
+
/**
* spi_nor_scan() - scan the SPI NOR
* @nor: the spi_nor structure
--
2.11.0
Hi Cyrille,
Thanks for doing this series! One comment below.
On 24-Dec-17 10:06 AM, Cyrille Pitchen wrote:
[...]
> diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
> index 8bafd462f0ae..59f9fbd45234 100644
> --- a/drivers/mtd/spi-nor/spi-nor.c
> +++ b/drivers/mtd/spi-nor/spi-nor.c
> @@ -14,8 +14,10 @@
> #include <linux/errno.h>
> #include <linux/module.h>
> #include <linux/device.h>
> +#include <linux/highmem.h>
> #include <linux/mutex.h>
> #include <linux/math64.h>
> +#include <linux/mm.h>
> #include <linux/sizes.h>
> #include <linux/slab.h>
>
> @@ -1232,6 +1234,56 @@ static const struct flash_info spi_nor_ids[] = {
> { },
> };
>
> +static bool spi_nor_is_dma_safe(const void *buf)
> +{
> + if (is_vmalloc_addr(buf))
> + return false;
> +
> +#ifdef CONFIG_HIGHMEM
> + if ((unsigned long)buf >= PKMAP_BASE &&
> + (unsigned long)buf < (PKMAP_BASE + (LAST_PKMAP * PAGE_SIZE)))
> + return false;
> +#endif
> +
> + return true;
> +}
> +
Better way would be to use virt_addr_valid():
static bool spi_nor_is_dma_safe(const void *buf)
{
return virt_addr_valid(buf);
}
Regards
Vignesh
Hi Vignesh
Le 26/12/2017 à 14:42, Vignesh R a écrit :
> Hi Cyrille,
>
> Thanks for doing this series! One comment below.
>
> On 24-Dec-17 10:06 AM, Cyrille Pitchen wrote:
> [...]
>> diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
>> index 8bafd462f0ae..59f9fbd45234 100644
>> --- a/drivers/mtd/spi-nor/spi-nor.c
>> +++ b/drivers/mtd/spi-nor/spi-nor.c
>> @@ -14,8 +14,10 @@
>> #include <linux/errno.h>
>> #include <linux/module.h>
>> #include <linux/device.h>
>> +#include <linux/highmem.h>
>> #include <linux/mutex.h>
>> #include <linux/math64.h>
>> +#include <linux/mm.h>
>> #include <linux/sizes.h>
>> #include <linux/slab.h>
>>
>> @@ -1232,6 +1234,56 @@ static const struct flash_info spi_nor_ids[] = {
>> { },
>> };
>>
>> +static bool spi_nor_is_dma_safe(const void *buf)
>> +{
>> + if (is_vmalloc_addr(buf))
>> + return false;
>> +
>> +#ifdef CONFIG_HIGHMEM
>> + if ((unsigned long)buf >= PKMAP_BASE &&
>> + (unsigned long)buf < (PKMAP_BASE + (LAST_PKMAP * PAGE_SIZE)))
>> + return false;
>> +#endif
>> +
>> + return true;
>> +}
>> +
>
> Better way would be to use virt_addr_valid():
> static bool spi_nor_is_dma_safe(const void *buf)
> {
> return virt_addr_valid(buf);
> }
>
> Regards
> Vignesh
>
Thanks for the advice :)
https://patchwork.kernel.org/patch/9768341/
Maybe I could check both virt_addr_valid() and object_is_on_stack() too ?
Best regards,
Cyrille
On Sun, 2017-12-24 at 05:36 +0100, Cyrille Pitchen wrote:
> this series tries to solve a long time issue of compatibility between the
> MTD and SPI sub-systems about whether we should use DMA-safe memory.
Can this should replace SoC specific fixes like:
c687c46e9e45
spi: spi-ti-qspi: Use bounce buffer if read buffer is not DMA'ble
7094576ccdc3
spi: atmel: fix corrupted data issue on SAM9 family SoCs
Or, since this only fixes instances of DMA-unsafe buffers used in
access to SPI NOR flash chips, and since there are other SPI master
interface users, those chip specific fixes in some/all spi master
drivers are still needed to fix transfers not originated via spi-nor?
Or are all the previous fixes necessary because of spi-nor flash (via
ubifs or jffs2) and once that's fixed via this series there are no more
originators of dma-unsafe buffers?
> The SPI sub-system has already implemented a work-around on its side,
> based on the spi_map_buf() function. However this function has its own
> limitation too. Especially, even if it builds a 'struct scatterlist' from
> a vmalloc'ed buffer, calling dma_map_sg() is still not safe on all
> architectures. Especially, on ARM cores using either VIPT or VIVT data
> caches, dma_map_sg() doesn't take the cache aliases issue into account.
> Then numerous crashes were reported for such architectures.
Could the above issue also be fixed via calls to
flush_kernel_vmap_range() and invalidate_kernel_vmap_range() to keep
the data cache valid? Or is there a reason that won't work?
On Sun, 2017-12-24 at 05:36 +0100, Cyrille Pitchen wrote:
>
> Then the patch adds two hardware capabilities for SPI flash controllers,
> SNOR_HWCAPS_WR_BOUNCE and SNOR_HWCAPS_RD_BOUNCE.
Are there any drivers for which a bounce buffer is NOT needed when the
tx/rx buffer is not in DMA safe memory? Maybe it would make more sense
to invert the sense of these flags, so that they indicate the driver
does not need DMA safe buffers, if that is the uncommon/non-existent
case, so that fewer drivers need to be modified to to be fixed?
> +static bool spi_nor_is_dma_safe(const void *buf)
> +{
> + if (is_vmalloc_addr(buf))
> + return false;
> +
> +#ifdef CONFIG_HIGHMEM
> + if ((unsigned long)buf >= PKMAP_BASE &&
> + (unsigned long)buf < (PKMAP_BASE + (LAST_PKMAP * PAGE_SIZE)))
> + return false;
> +#endif
It looks like:
(unsigned long)addr >= PKMAP_ADDR(0) &&
(unsigned long)addr < PKMAP_ADDR(LAST_PKMAP)
is the expression used in the highmem code. But really, isn't this
begging for is_highmem_addr() in include/linux/mm.h that can always
return false when highmem is not enabled?
In order to be safe, this must be called when nor->lock is held, right?
Otherwise it could race against two callers allocating the buffer at
the same time. That should probably be noted in the kerneldoc comments
for this function, which should also be written.
> +static int spi_nor_get_bounce_buffer(struct spi_nor *nor,
> + u_char **buffer,
> + size_t *buffer_size)
> +{
> +
> + *buffer = nor->bounce_buffer;
> + *buffer_size = size;
So the buffer is returned via the parameter, and also via a field
inside nor. Seems redundant. Consider address could be returned via
the function return value coupled with PTR_ERR() for the error cases.
Or not return address at all since it's available via nor-
>bounce_buffer.
> {
> struct spi_nor *nor = mtd_to_spi_nor(mtd);
> + bool use_bounce = (nor->flags & SNOR_F_USE_RD_BOUNCE) &&
> + !spi_nor_is_dma_safe(buf);
> + u_char *buffer = buf;
> + size_t buffer_size = 0;
> int ret;
>
> dev_dbg(nor->dev, "from 0x%08x, len %zd\n", (u32)from, len);
> @@ -1268,13 +1324,23 @@ static int spi_nor_read(struct mtd_info *mtd, loff_t from, size_t len,
> if (ret)
> return ret;
>
> + if (use_bounce) {
> + ret = spi_nor_get_bounce_buffer(nor, &buffer, &buffer_size);
> + if (ret < 0)
> + goto read_err;
> + }
This pattern, check if bounce is enabled, check if address is dma-
unsafe, get bounce buffer, seems to be very common. Could it be
refactored into one helper?
u_char *buffer = spi_nor_check_bounce(nor, buf, len, &buffer_size);
if (IS_ERR(buffer))
return PTR_ERR(buffer);
// buffer = nor->bounce_buffer or buf, whichever is correct
// buffer_size = len or bounce buffer size, whichever is correct
Could spi_nor_read_sfdp_dma_unsafe() also use this buffer?
On Sun, Dec 24, 2017 at 05:36:05AM +0100, Cyrille Pitchen wrote:
> The optional 'dmacap,memcpy' DT property tells the Atmel QSPI controller
> driver to reserve some DMA channel then to use it to perform DMA
> memcpy() during data transfers. This feature relies on the generic
> bounce buffer helper from spi-nor.c.
>
> Signed-off-by: Cyrille Pitchen <[email protected]>
> ---
> Documentation/devicetree/bindings/mtd/atmel-quadspi.txt | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt b/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt
> index b93c1e2f25dd..002d3f0a445b 100644
> --- a/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt
> +++ b/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt
> @@ -12,6 +12,10 @@ Required properties:
> - #address-cells: Should be <1>.
> - #size-cells: Should be <0>.
>
> +Optional properties:
> +- dmacap,memcpy: Reserve a DMA channel to perform DMA memcpy() between the
> + system memory and the QSPI mapped memory.
How is this a h/w property? Why would I not want to always enable DMA if
possible?
Furthermore, you are reusing a property, but giving it a different
meaning. The current definition is an indication whether a DMA
controller supports memcpy operations or not. It is not a flag for
clients to use memcpy channels.
Why don't you use "dmas" property to point to the DMA controller.
> +
> Example:
>
> spi@f0020000 {
> @@ -24,6 +28,7 @@ spi@f0020000 {
> #size-cells = <0>;
> pinctrl-names = "default";
> pinctrl-0 = <&pinctrl_spi0_default>;
> + dmacap,memcpy;
>
> m25p80@0 {
> ...
> --
> 2.11.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe devicetree" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Dec 26, 2017 at 06:45:28PM +0000, Trent Piepho wrote:
> Or, since this only fixes instances of DMA-unsafe buffers used in
> access to SPI NOR flash chips, and since there are other SPI master
> interface users, those chip specific fixes in some/all spi master
> drivers are still needed to fix transfers not originated via spi-nor?
SPI client drivers are *supposed* to use DMA safe memory already. How
often that happens in cases where it matters is a separate question, we
definitely have users with smaller transfers that don't do the right
thing but they're normally done using PIO anyway.
On Wed, 2017-12-27 at 10:36 +0000, Mark Brown wrote:
> On Tue, Dec 26, 2017 at 06:45:28PM +0000, Trent Piepho wrote:
>
> > Or, since this only fixes instances of DMA-unsafe buffers used in
> > access to SPI NOR flash chips, and since there are other SPI master
> > interface users, those chip specific fixes in some/all spi master
> > drivers are still needed to fix transfers not originated via spi-nor?
>
> SPI client drivers are *supposed* to use DMA safe memory already. How
> often that happens in cases where it matters is a separate question, we
> definitely have users with smaller transfers that don't do the right
> thing but they're normally done using PIO anyway.
I wonder what the end goal is here?
A random collection of spi master drivers will accept DMA-unsafe
buffers in some way. In some cases a framework like spi-nor provides
the fixup to spi-nor master drivers (none so far) and in other cases
(atmel-quadspi), the spi-nor master driver has its own fixes.
Generic spi masters like spi-atmel, spi-ti-qspi, and spi-davinci will
have their fixes for certain cases.
Perhaps spi flash drivers like m25p80 will have fixes too?
Some spi clients, like spidev, will have internal bounce buffers,
rather than userspace addresses or commands in stack variables, so that
they follow the rules about DMA safe buffers.
What exactly is caught as DMA unsafe and what is not will of course
vary greatly from driver to driver. Some drivers will catch highmem
memory while other drivers will only detect vmalloc memory. Some will
only catch an unsafe buffer if a specific SoC known to the driver to
have an aliasing cache is enabled. Some will check buffers that arrive
via the spi_flash_read interface but not via generic spi transfers,
while others will check all spi transfer buffers.
Obviously, I don't think this path will lead to a desirable end. Maybe
the basic assumption, that clients should provide DMA safe buffers,
should be revisited. Experience has shown that it's too much to ask
for and spi clients will never get it right. It would be better to try
to fix this at some common point between the clients and masters so it
can be done once and for all.
Hi Rob,
+ Ludovic Desroches, maintainer of the DMA controller drivers for AT91 SoCs.
Le 27/12/2017 à 00:23, Rob Herring a écrit :
> On Sun, Dec 24, 2017 at 05:36:05AM +0100, Cyrille Pitchen wrote:
>> The optional 'dmacap,memcpy' DT property tells the Atmel QSPI controller
>> driver to reserve some DMA channel then to use it to perform DMA
>> memcpy() during data transfers. This feature relies on the generic
>> bounce buffer helper from spi-nor.c.
>>
>> Signed-off-by: Cyrille Pitchen <[email protected]>
>> ---
>> Documentation/devicetree/bindings/mtd/atmel-quadspi.txt | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt b/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt
>> index b93c1e2f25dd..002d3f0a445b 100644
>> --- a/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt
>> +++ b/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt
>> @@ -12,6 +12,10 @@ Required properties:
>> - #address-cells: Should be <1>.
>> - #size-cells: Should be <0>.
>>
>> +Optional properties:
>> +- dmacap,memcpy: Reserve a DMA channel to perform DMA memcpy() between the
>> + system memory and the QSPI mapped memory.
>
> How is this a h/w property? Why would I not want to always enable DMA if
> possible?
The number of DMA channels is limited for a given SoC. This number may be
lower than the number of enabled controllers (spi, i2c, qspi, aes, sha,
des, sdmmc, usart, ...).
So we use a DT property to explicitly tell the matching drivers to request
and reserved the DMA channels they need. This policy is not driver or even
SoC specific but board specific. It's very common to reserved DMA channels
for the most used or most performance dependent controllers, setting the
relevant properties in the device-tree then restricting remaining
controllers to their PIO mode.
>
> Furthermore, you are reusing a property, but giving it a different
> meaning. The current definition is an indication whether a DMA
> controller supports memcpy operations or not. It is not a flag for
> clients to use memcpy channels.
>
I don't mind changing the name. I thought it would be better to use some
existing one than creating another. However I was not confident on whether
"dmacap,memcpy" was actually a good candidate for what I do in the DMA
patch for the atmel-quadspi.c driver.
Actually, I was relying on your feedbacks :)
> Why don't you use "dmas" property to point to the DMA controller.
I didn't use the "dmas" property because I thought it would not have been
consistent with how this property is used in all other nodes of the sama5*
device-trees. The flags provided in the dma-cells are designed for "memory
to peripheral" or "peripheral to memory" DMA transfers only.
However, here I need to perform some "memory to memory" transfer: the SPI
NOR flash memory is mapped into the AHB bus of SAMA5D2 SoCs. Besides,
once in Serial Memory Mode, data can no longer be transfered through the
TDR or RDR registers of the QSPI controller, only through its AHB mapped
memory window.
So I cannot configured the DMA channel for "peripheral to memory" or
"memory to peripheral" like for other controllers embedded in the SoC but
only for "memory to memory".
Maybe we could extend the flags supported by the "dmas" property but I
guess it may require some little rework in the at_xdmac_xlate() function
of the at_xdmac.c driver.
Or maybe no change at all is required at the at_xdmac.c driver side: we
just don't care about the provided flags in the "dmas" property, especially
the "peripheral id". They would be ignored anyway when the atmel-quadspi.c
driver later calls dmaengine_prep_dma_memcpy(). So I could simply set the
dma cells to 0 in the device-tree?
Ludovic, what do you think about that ?
Best regards,
Cyrille
>
>> +
>> Example:
>>
>> spi@f0020000 {
>> @@ -24,6 +28,7 @@ spi@f0020000 {
>> #size-cells = <0>;
>> pinctrl-names = "default";
>> pinctrl-0 = <&pinctrl_spi0_default>;
>> + dmacap,memcpy;
>>
>> m25p80@0 {
>> ...
>> --
>> 2.11.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe devicetree" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
Hi Trent,
Le 27/12/2017 à 21:15, Trent Piepho a écrit :
> On Wed, 2017-12-27 at 10:36 +0000, Mark Brown wrote:
>> On Tue, Dec 26, 2017 at 06:45:28PM +0000, Trent Piepho wrote:
>>
>>> Or, since this only fixes instances of DMA-unsafe buffers used in
>>> access to SPI NOR flash chips, and since there are other SPI master
>>> interface users, those chip specific fixes in some/all spi master
>>> drivers are still needed to fix transfers not originated via spi-nor?
>>
>> SPI client drivers are *supposed* to use DMA safe memory already. How
>> often that happens in cases where it matters is a separate question, we
>> definitely have users with smaller transfers that don't do the right
>> thing but they're normally done using PIO anyway.
>
> I wonder what the end goal is here?
>
> A random collection of spi master drivers will accept DMA-unsafe
> buffers in some way. In some cases a framework like spi-nor provides
> the fixup to spi-nor master drivers (none so far) and in other cases
> (atmel-quadspi), the spi-nor master driver has its own fixes.
>
At the spi-nor side, atmel-quadspi is also an example showing how the
bounce buffer feature should be used by other spi-flash drivers.
Actually, the m25p80 driver taken aside, no spi-flash driver is currently
able to perform DMA safe transfers.
Some patches were proposed but all rejected because they were doing wrong
things: calling dma_map_single() even if the buffer is not guaranteed to be
contiguous in physical memory or not being aware of the data cache aliasing
issue on some architectures.
So this series offers a common helper solution for all drivers in spi-nor.
I don't want each spi-flash driver to implement its own custom solution.
> Generic spi masters like spi-atmel, spi-ti-qspi, and spi-davinci will
> have their fixes for certain cases.
>
If UBIFS was the only reason for those drivers to implement their own fixes
then those fixes would no longer be needed with this series. However if
other spi-clients also provide the SPI sub-system with DMA-unsafe buffers
then maybe those fixes are still needed. I think Mark knows better than
anyone else about the SPI sub-system.
Besides, another reason to allocate the bounce buffer from spi-nor is that
we can choose a suited size as a trade-off between performance and memory
footprint.
> Perhaps spi flash drivers like m25p80 will have fixes too?
>
patch 1 of the series enables the bounce buffer for both read and write
operations in the m25p80 driver, in order to be compliant with buffer
constraints expressed in the kernel-doc of 'struct spi_transfer'.
> Some spi clients, like spidev, will have internal bounce buffers,
> rather than userspace addresses or commands in stack variables, so that
> they follow the rules about DMA safe buffers.
>
> What exactly is caught as DMA unsafe and what is not will of course
> vary greatly from driver to driver. Some drivers will catch highmem
> memory while other drivers will only detect vmalloc memory. Some will
> only catch an unsafe buffer if a specific SoC known to the driver to
> have an aliasing cache is enabled. Some will check buffers that arrive
> via the spi_flash_read interface but not via generic spi transfers,
> while others will check all spi transfer buffers.
>
That's why I've asked for pieces of advice on how to implement a relevant
[spi_nor_]is_dma_safe() function and Vignesh has provided good suggestion!
> Obviously, I don't think this path will lead to a desirable end. Maybe
Here you seem to only take the m25p80 and SPI sub-system cases into
account. However, at the spi-nor side, we currently have to other solution
to let spi-nor flash drivers implement DMA transfers safely.
Best regards,
Cyrille
> the basic assumption, that clients should provide DMA safe buffers,
> should be revisited. Experience has shown that it's too much to ask
> for and spi clients will never get it right. It would be better to try
> to fix this at some common point between the clients and masters so it
> can be done once and for all.
>
Hi Trent,
Le 26/12/2017 à 20:43, Trent Piepho a écrit :
> On Sun, 2017-12-24 at 05:36 +0100, Cyrille Pitchen wrote:
>>
>> Then the patch adds two hardware capabilities for SPI flash controllers,
>> SNOR_HWCAPS_WR_BOUNCE and SNOR_HWCAPS_RD_BOUNCE.
>
> Are there any drivers for which a bounce buffer is NOT needed when the
> tx/rx buffer is not in DMA safe memory? Maybe it would make more sense
> to invert the sense of these flags, so that they indicate the driver
> does not need DMA safe buffers, if that is the uncommon/non-existent
> case, so that fewer drivers need to be modified to to be fixed?
>
It doesn't sound safe for a first step. I don't know if some of the
spi-flash controllers are embedded inside systems with small memory and
don't care about DMA transfers. Maybe they don't plan to use anything else
but PIO transfers. Then why taking the risk to exhaust the memory on systems
that would not use the bounce buffer anyway?
Later, if we see that most spi-flash drivers actually need this bounce
buffer, I agree with you: we could invert the logic of the flags but
currently I guess this is too soon and to be safe I would have to add the
negative flags to all other spi-flash drivers. I would like a smooth and
safe transition to avoid unexpected and unwanted side effects.
About the memory loss when forcing the SNOR_HWCAPS_*_BOUNCE in m25p80.c, I
justify it because the m25p80 has to be compliant with the SPI sub-system
requirements but currently is not. However I've taken care not to allocate
the bounce buffer as long as we use only DMA-safe buffers.
Here at the MTD side, the main (only ?) source of DMA-unsafe buffers is
the UBIFS (JFFS2 too ?) layer. Then I've assumed that systems using such a
file-system should already have enough system memory.
However I wanted to take all other use cases into account.
>> +static bool spi_nor_is_dma_safe(const void *buf)
>> +{
>> + if (is_vmalloc_addr(buf))
>> + return false;
>> +
>> +#ifdef CONFIG_HIGHMEM
>> + if ((unsigned long)buf >= PKMAP_BASE &&
>> + (unsigned long)buf < (PKMAP_BASE + (LAST_PKMAP * PAGE_SIZE)))
>> + return false;
>> +#endif
>
> It looks like:
>
> (unsigned long)addr >= PKMAP_ADDR(0) &&
> (unsigned long)addr < PKMAP_ADDR(LAST_PKMAP)
>
> is the expression used in the highmem code. But really, isn't this
> begging for is_highmem_addr() in include/linux/mm.h that can always
> return false when highmem is not enabled?
>
Vignesh has suggested to call virt_addr_valid() instead.
I think Boris has also told me about this function.
So it might be the right solution. What do you think about their proposal?
> In order to be safe, this must be called when nor->lock is held, right?
Indeed, in all cases (spi_nor_read(), sst_write() and spi_nor_write()),
spi_nor_get_bounce_buffer() is always called with nor->lock held, precisely
to avoid races and allocating the bounce buffer twice by mistake.
> Otherwise it could race against two callers allocating the buffer at
> the same time. That should probably be noted in the kerneldoc comments
> for this function, which should also be written.
>
I can add kernel-doc.
>> +static int spi_nor_get_bounce_buffer(struct spi_nor *nor,
>> + u_char **buffer,
>> + size_t *buffer_size)
>> +{
>
>> +
>> + *buffer = nor->bounce_buffer;
>> + *buffer_size = size;
>
> So the buffer is returned via the parameter, and also via a field
> inside nor. Seems redundant. Consider address could be returned via
> the function return value coupled with PTR_ERR() for the error cases.
> Or not return address at all since it's available via nor-
>> bounce_buffer.
>
Why not. It would require more lines though. I guess it's purely a matter of taste.
u_char *buffer = buf;
if (use_bounce) {
buffer = spi_nor_get_bounce_buffer(nor, &buffer_size);
if (IS_ERR(buffer)) {
err = PTR_ERR(buffer);
goto read_err;
}
}
>> {
>> struct spi_nor *nor = mtd_to_spi_nor(mtd);
>> + bool use_bounce = (nor->flags & SNOR_F_USE_RD_BOUNCE) &&
>> + !spi_nor_is_dma_safe(buf);
>> + u_char *buffer = buf;
>> + size_t buffer_size = 0;
>> int ret;
>>
>> dev_dbg(nor->dev, "from 0x%08x, len %zd\n", (u32)from, len);
>> @@ -1268,13 +1324,23 @@ static int spi_nor_read(struct mtd_info *mtd, loff_t from, size_t len,
>> if (ret)
>> return ret;
>>
>> + if (use_bounce) {
>> + ret = spi_nor_get_bounce_buffer(nor, &buffer, &buffer_size);
>> + if (ret < 0)
>> + goto read_err;
>> + }
>
> This pattern, check if bounce is enabled, check if address is dma-
> unsafe, get bounce buffer, seems to be very common. Could it be
> refactored into one helper?
>
> u_char *buffer = spi_nor_check_bounce(nor, buf, len, &buffer_size);
The conditions that define the value of 'use_bounce' also depend on the type
of operation, read or write, hence on the two different flags
SNOR_F_USE_RD_BOUNCE and SNOR_F_USE_WR_BOUNCE.
Besides, 'use_bounce' is also tested later in spi_nor_read(), sst_write()
and sst_write().
So I don't really see how the spi_nor_check_bounce() function you propose
could be that different from spi_nor_get_bounce_buffer().
> if (IS_ERR(buffer))
> return PTR_ERR(buffer);
> // buffer = nor->bounce_buffer or buf, whichever is correct
> // buffer_size = len or bounce buffer size, whichever is correct
>
> Could spi_nor_read_sfdp_dma_unsafe() also use this buffer?
>
I didn't use the bounce buffer in spi_nor_read_sfdp_dma_unsafe() on
purpose: the bounce buffer, when needed, is allocated once for all to limit
performance loss. However, to avoid increasing the memory footprint, if not
absolutely needed the bounce buffer is not allocated at all.
For a SFDP compliant memory, spi_nor_parse_sfdp() is always called during
spi_nor_scan() even if later, only DMA-safe buffers are provided. In such a
case, allocating the bounce buffer would have been a waste of memory.
Best regards,
Cyrille
On Thu, 2017-12-28 at 11:39 +0100, Cyrille Pitchen wrote:
> Le 26/12/2017 à 20:43, Trent Piepho a écrit :
> > On Sun, 2017-12-24 at 05:36 +0100, Cyrille Pitchen wrote:
> > >
> > > Then the patch adds two hardware capabilities for SPI flash controllers,
> > > SNOR_HWCAPS_WR_BOUNCE and SNOR_HWCAPS_RD_BOUNCE.
> >
> > Are there any drivers for which a bounce buffer is NOT needed when the
> > tx/rx buffer is not in DMA safe memory? Maybe it would make more sense
> > to invert the sense of these flags, so that they indicate the driver
> > does not need DMA safe buffers, if that is the uncommon/non-existent
> > case, so that fewer drivers need to be modified to to be fixed?
> >
>
> It doesn't sound safe for a first step. I don't know if some of the
> spi-flash controllers are embedded inside systems with small memory and
> don't care about DMA transfers. Maybe they don't plan to use anything else
> but PIO transfers. Then why taking the risk to exhaust the memory on systems
> that would not use the bounce buffer anyway?
This would certainly be the case when the driver does not even support
DMA in the first place.
This also makes me wonder, how inefficient does this become when it
uses a bounce buffer for small transfer that would not use DMA anyway?
In the spi_flash_read() interface for spi masters, there is a master
method spi_flash_can_dma() callback used by the spi core to tell if
each transfer can be DMAed.
Should something like that be used here, to ask the master if it would
use dma on the buffer?
This might also prevent allocation of the bounce buffer if the only DMA
unsafe transfers are tiny control ops with stack variables that
wouldn't use DMA, e.g. the stuff spi_nor_read_sfdp_dma_unsafe() does.
> About the memory loss when forcing the SNOR_HWCAPS_*_BOUNCE in m25p80.c, I
> justify it because the m25p80 has to be compliant with the SPI sub-system
> requirements but currently is not. However I've taken care not to allocate
> the bounce buffer as long as we use only DMA-safe buffers.
Another possibility to reduce memory usage: make the buffer smaller
when first allocated by being just enough for the needed space. Grow
it (by powers of two?) until it reaches the max allowed size. No
reason to use a 256 kB buffer if DMA unsafe operations never get that
big.
> Here at the MTD side, the main (only ?) source of DMA-unsafe buffers is
> the UBIFS (JFFS2 too ?) layer. Then I've assumed that systems using such a
> file-system should already have enough system memory.
I saw a note in one of the existing DMA fixes that JFFS2 was the source
of the unsafe buffers, so probably there too.
>
> Vignesh has suggested to call virt_addr_valid() instead.
> I think Boris has also told me about this function.
> So it might be the right solution. What do you think about their proposal?
Not sure what exactly the differences are between these methods. The
fact that each of the many existing DMA fixes uses slightly different
code to detect what is unsafe speaks to the difficulty of this problem!
virt_addr_valid() is already used by spi-ti-qspi. spi core uses for
the buffer map helper, but that code path is for buffers which are NOT
vmalloc or highmem, but are still not virt_addr_valid() for some other
reason.
> > > +static int spi_nor_get_bounce_buffer(struct spi_nor *nor,
> > > + u_char **buffer,
> > > + size_t *buffer_size)
> > > +{
> > > +
> > > + *buffer = nor->bounce_buffer;
> > > + *buffer_size = size;
> >
> > So the buffer is returned via the parameter, and also via a field
> > inside nor. Seems redundant. Consider address could be returned via
> > the function return value coupled with PTR_ERR() for the error cases.
> > Or not return address at all since it's available via nor-
> > > bounce_buffer.
>
> Why not. It would require more lines though. I guess it's purely a matter of taste.
Well, also consider that you don't need to even return the buffer
pointer at all, since it's available as nor->bounce_buffer. Which it
is used as in spi_nor_write() and spi_nor_read().
> > This pattern, check if bounce is enabled, check if address is dma-
> > unsafe, get bounce buffer, seems to be very common. Could it be
> > refactored into one helper?
> >
> > u_char *buffer = spi_nor_check_bounce(nor, buf, len, &buffer_size);
>
> The conditions that define the value of 'use_bounce' also depend on the type
> of operation, read or write, hence on the two different flags
> SNOR_F_USE_RD_BOUNCE and SNOR_F_USE_WR_BOUNCE.
Just pass one of those flags as an argument to tell what direction it
is in. Though I wonder if using a bounce buffer for only one direction
will ever be used.
>
> Besides, 'use_bounce' is also tested later in spi_nor_read(), sst_write()
> and sst_write().
>
> So I don't really see how the spi_nor_check_bounce() function you propose
> could be that different from spi_nor_get_bounce_buffer().
>
> > if (IS_ERR(buffer))
> > return PTR_ERR(buffer);
> > // buffer = nor->bounce_buffer or buf, whichever is correct
> > // buffer_size = len or bounce buffer size, whichever is correct
> >
> > Could spi_nor_read_sfdp_dma_unsafe() also use this buffer?
> >
>
> I didn't use the bounce buffer in spi_nor_read_sfdp_dma_unsafe() on
> purpose: the bounce buffer, when needed, is allocated once for all to limit
> performance loss. However, to avoid increasing the memory footprint, if not
> absolutely needed the bounce buffer is not allocated at all.
spi-nor tries to provide a common implementation of DMA bounce buffers,
yet spi-nor itself has two different DMA bounce buffer
implementations.
I think the real answer for spi_nor_read_sfdp_dma_unsafe() is that it
shouldn't be written that way and the function should go away. The two
call sites should just kmalloc the struct they read into instead of
placing it on the stack. The dma_unsafe wrapper kmallocs a buffer
anyway, so it's not like there is any more allocation going on.
On Wednesday 27 December 2017 12:15 AM, Trent Piepho wrote:
> On Sun, 2017-12-24 at 05:36 +0100, Cyrille Pitchen wrote:
>> this series tries to solve a long time issue of compatibility between the
>> MTD and SPI sub-systems about whether we should use DMA-safe memory.
>
> Can this should replace SoC specific fixes like:
>
> c687c46e9e45
> spi: spi-ti-qspi: Use bounce buffer if read buffer is not DMA'ble
>
Yes, I plan to revert this patch once m25p80 starts using bounce buffers.
I am interested in knowing which other SPI clients end up passing non
DMA'able buffers. AFAIK, its UBIFS and JFFS2. Most other SPI devices on
TI boards that I have dealt with like Touch, Sensors (IIO), GPIO
expanders etc all provide DMA'able buffers.
--
Regards
Vignesh
On Friday 29 December 2017 12:24 AM, Trent Piepho wrote:
> On Thu, 2017-12-28 at 11:39 +0100, Cyrille Pitchen wrote:
>> Le 26/12/2017 à 20:43, Trent Piepho a écrit :
>> > On Sun, 2017-12-24 at 05:36 +0100, Cyrille Pitchen wrote:
>> > >
>> > > Then the patch adds two hardware capabilities for SPI flash controllers,
>> > > SNOR_HWCAPS_WR_BOUNCE and SNOR_HWCAPS_RD_BOUNCE.
>> >
>> > Are there any drivers for which a bounce buffer is NOT needed when the
>> > tx/rx buffer is not in DMA safe memory? Maybe it would make more sense
>> > to invert the sense of these flags, so that they indicate the driver
>> > does not need DMA safe buffers, if that is the uncommon/non-existent
>> > case, so that fewer drivers need to be modified to to be fixed?
>> >
>>
>> It doesn't sound safe for a first step. I don't know if some of the
>> spi-flash controllers are embedded inside systems with small memory and
>> don't care about DMA transfers. Maybe they don't plan to use anything else
>> but PIO transfers. Then why taking the risk to exhaust the memory on systems
>> that would not use the bounce buffer anyway?
>
> This would certainly be the case when the driver does not even support
> DMA in the first place.
>
> This also makes me wonder, how inefficient does this become when it
> uses a bounce buffer for small transfer that would not use DMA anyway?
> In the spi_flash_read() interface for spi masters, there is a master
> method spi_flash_can_dma() callback used by the spi core to tell if
> each transfer can be DMAed.
>
> Should something like that be used here, to ask the master if it would
> use dma on the buffer?
>
> This might also prevent allocation of the bounce buffer if the only DMA
> unsafe transfers are tiny control ops with stack variables that
> wouldn't use DMA, e.g. the stuff spi_nor_read_sfdp_dma_unsafe() does.
>
>
>> About the memory loss when forcing the SNOR_HWCAPS_*_BOUNCE in m25p80.c, I
>> justify it because the m25p80 has to be compliant with the SPI sub-system
>> requirements but currently is not. However I've taken care not to allocate
>> the bounce buffer as long as we use only DMA-safe buffers.
>
> Another possibility to reduce memory usage: make the buffer smaller
> when first allocated by being just enough for the needed space. Grow
> it (by powers of two?) until it reaches the max allowed size. No
> reason to use a 256 kB buffer if DMA unsafe operations never get that
> big.
>
>
>> Here at the MTD side, the main (only ?) source of DMA-unsafe buffers is
>> the UBIFS (JFFS2 too ?) layer. Then I've assumed that systems using such a
>> file-system should already have enough system memory.
>
> I saw a note in one of the existing DMA fixes that JFFS2 was the source
> of the unsafe buffers, so probably there too.
>
>
>>
>> Vignesh has suggested to call virt_addr_valid() instead.
>> I think Boris has also told me about this function.
>> So it might be the right solution. What do you think about their proposal?
>
> Not sure what exactly the differences are between these methods. The
> fact that each of the many existing DMA fixes uses slightly different
> code to detect what is unsafe speaks to the difficulty of this problem!
My understanding based on Documentation/DMA-API-HOWTO.txt and
Documentation/arm/memory.txt is that
virt_addr_valid() will guarantee that address is in range of
PAGE_OFFSET to high_memory-1 (Kernel direct-mapped RAM region) which is
address range of buffers that are DMA'able.
> virt_addr_valid() is already used by spi-ti-qspi. spi core uses for
> the buffer map helper, but that code path is for buffers which are NOT
> vmalloc or highmem, but are still not virt_addr_valid() for some other
> reason.
>
if (vmalloced_buf || kmap_buf) {
/* Handle vmalloc'd or kmap'd buffers */
...
} else if (virt_addr_valid(buf)) {
/* Handle kmalloc'd and such buffers */
...
} else {
/* Error if none of the above */
return -EINVAL;
}
--
Regards
Vignesh
On Fri, 2017-12-29 at 15:46 +0530, Vignesh R wrote:
> On Friday 29 December 2017 12:24 AM, Trent Piepho wrote:
> >
> > > Vignesh has suggested to call virt_addr_valid() instead.
> > > I think Boris has also told me about this function.
> > > So it might be the right solution. What do you think about their proposal?
> >
> > Not sure what exactly the differences are between these methods. The
> > fact that each of the many existing DMA fixes uses slightly different
> > code to detect what is unsafe speaks to the difficulty of this problem!
>
> My understanding based on Documentation/DMA-API-HOWTO.txt and
> Documentation/arm/memory.txt is that
> virt_addr_valid() will guarantee that address is in range of
> PAGE_OFFSET to high_memory-1 (Kernel direct-mapped RAM region) which is
> address range of buffers that are DMA'able.
There's code in gpmi-nand.c that does:
/* first try to map the upper buffer directly */
if (virt_addr_valid(this->upper_buf) &&
!object_is_on_stack(this->upper_buf)) {
sg_init_one(sgl, this->upper_buf, this->upper_len);
So whoever wrote that thought that stack objects needed an additional
test beyond virt_addr_valid. But it does appear to be far more common
to depend on just virt_addr_valid, so perhaps the code in gpmi-nand is
in error.
> > virt_addr_valid() is already used by spi-ti-qspi. spi core uses for
> > the buffer map helper, but that code path is for buffers which are NOT
> > vmalloc or highmem, but are still not virt_addr_valid() for some other
> > reason.
> >
>
> if (vmalloced_buf || kmap_buf) {
> /* Handle vmalloc'd or kmap'd buffers */
> ...
This stuff does get DMAed. So I have to wonder, if spi.c thinks it can
use DMA with vmalloc or highmem, couldn't spi-not do the same instead
of the bounce buffer?
> } else if (virt_addr_valid(buf)) {
> /* Handle kmalloc'd and such buffers */
> ...
> } else {
> /* Error if none of the above */
So what is this case here for? It's some class that does not have a
valid virtual address and yet is not vmalloc or highmem.
> return -EINVAL;
> }
>
On Friday 29 December 2017 11:33 PM, Trent Piepho wrote:
> On Fri, 2017-12-29 at 15:46 +0530, Vignesh R wrote:
>> On Friday 29 December 2017 12:24 AM, Trent Piepho wrote:
>> >
>> > > Vignesh has suggested to call virt_addr_valid() instead.
>> > > I think Boris has also told me about this function.
>> > > So it might be the right solution. What do you think about their proposal?
>> >
>> > Not sure what exactly the differences are between these methods. The
>> > fact that each of the many existing DMA fixes uses slightly different
>> > code to detect what is unsafe speaks to the difficulty of this problem!
>>
>> My understanding based on Documentation/DMA-API-HOWTO.txt and
>> Documentation/arm/memory.txt is that
>> virt_addr_valid() will guarantee that address is in range of
>> PAGE_OFFSET to high_memory-1 (Kernel direct-mapped RAM region) which is
>> address range of buffers that are DMA'able.
>
> There's code in gpmi-nand.c that does:
>
> /* first try to map the upper buffer directly */
> if (virt_addr_valid(this->upper_buf) &&
> !object_is_on_stack(this->upper_buf)) {
> sg_init_one(sgl, this->upper_buf, this->upper_len);
>
> So whoever wrote that thought that stack objects needed an additional
> test beyond virt_addr_valid. But it does appear to be far more common
> to depend on just virt_addr_valid, so perhaps the code in gpmi-nand is
> in error.
>
The test in gpmi-nand.c is right. Enabling CONFIG_DMA_API_DEBUG does
warn about DMA'ing into stack objects (in lib/dma-debug.c). So other
places needs to be fixed, I guess.
>> > virt_addr_valid() is already used by spi-ti-qspi. spi core uses for
>> > the buffer map helper, but that code path is for buffers which are NOT
>> > vmalloc or highmem, but are still not virt_addr_valid() for some other
>> > reason.
>> >
>>
>> if (vmalloced_buf || kmap_buf) {
>> /* Handle vmalloc'd or kmap'd buffers */
>> ...
> This stuff does get DMAed. So I have to wonder, if spi.c thinks it can
> use DMA with vmalloc or highmem, couldn't spi-not do the same instead
> of the bounce buffer?
SPI core does try to DMA into underlying physical pages of vmalloc'd
buffer. But this has two problems:
1. Does not work well with VIVT caches[1].
2. On ARM LPAE systems, vmalloc'd buffers can be from highmem region
that are not addressable using 32 bit addresses and is backed by LPAE.
So, a 32 bit DMA cannot access these buffers.
Both these issues lead to random crashes and errors with UBIFS and JFFS2
flash file systems which this patch series tries to address using bounce
buffer
[1] https://patchwork.kernel.org/patch/9579553/
--
Regards
Vignesh
Hi Cyrille,
On Wed, Dec 27, 2017 at 10:40:00PM +0100, Cyrille Pitchen wrote:
> Hi Rob,
>
> + Ludovic Desroches, maintainer of the DMA controller drivers for AT91 SoCs.
>
> Le 27/12/2017 ? 00:23, Rob Herring a ?crit :
> > On Sun, Dec 24, 2017 at 05:36:05AM +0100, Cyrille Pitchen wrote:
> >> The optional 'dmacap,memcpy' DT property tells the Atmel QSPI controller
> >> driver to reserve some DMA channel then to use it to perform DMA
> >> memcpy() during data transfers. This feature relies on the generic
> >> bounce buffer helper from spi-nor.c.
> >>
> >> Signed-off-by: Cyrille Pitchen <[email protected]>
> >> ---
> >> Documentation/devicetree/bindings/mtd/atmel-quadspi.txt | 5 +++++
> >> 1 file changed, 5 insertions(+)
> >>
> >> diff --git a/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt b/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt
> >> index b93c1e2f25dd..002d3f0a445b 100644
> >> --- a/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt
> >> +++ b/Documentation/devicetree/bindings/mtd/atmel-quadspi.txt
> >> @@ -12,6 +12,10 @@ Required properties:
> >> - #address-cells: Should be <1>.
> >> - #size-cells: Should be <0>.
> >>
> >> +Optional properties:
> >> +- dmacap,memcpy: Reserve a DMA channel to perform DMA memcpy() between the
> >> + system memory and the QSPI mapped memory.
> >
> > How is this a h/w property? Why would I not want to always enable DMA if
> > possible?
>
> The number of DMA channels is limited for a given SoC. This number may be
> lower than the number of enabled controllers (spi, i2c, qspi, aes, sha,
> des, sdmmc, usart, ...).
>
> So we use a DT property to explicitly tell the matching drivers to request
> and reserved the DMA channels they need. This policy is not driver or even
> SoC specific but board specific. It's very common to reserved DMA channels
> for the most used or most performance dependent controllers, setting the
> relevant properties in the device-tree then restricting remaining
> controllers to their PIO mode.
>
> >
> > Furthermore, you are reusing a property, but giving it a different
> > meaning. The current definition is an indication whether a DMA
> > controller supports memcpy operations or not. It is not a flag for
> > clients to use memcpy channels.
> >
>
> I don't mind changing the name. I thought it would be better to use some
> existing one than creating another. However I was not confident on whether
> "dmacap,memcpy" was actually a good candidate for what I do in the DMA
> patch for the atmel-quadspi.c driver.
>
> Actually, I was relying on your feedbacks :)
>
> > Why don't you use "dmas" property to point to the DMA controller.
>
> I didn't use the "dmas" property because I thought it would not have been
> consistent with how this property is used in all other nodes of the sama5*
> device-trees. The flags provided in the dma-cells are designed for "memory
> to peripheral" or "peripheral to memory" DMA transfers only.
>
> However, here I need to perform some "memory to memory" transfer: the SPI
> NOR flash memory is mapped into the AHB bus of SAMA5D2 SoCs. Besides,
> once in Serial Memory Mode, data can no longer be transfered through the
> TDR or RDR registers of the QSPI controller, only through its AHB mapped
> memory window.
>
> So I cannot configured the DMA channel for "peripheral to memory" or
> "memory to peripheral" like for other controllers embedded in the SoC but
> only for "memory to memory".
>
> Maybe we could extend the flags supported by the "dmas" property but I
> guess it may require some little rework in the at_xdmac_xlate() function
> of the at_xdmac.c driver.
>
I have no objection about changing the at_xdmac_xlate() behavior.
>From your cover-letter, I understand that the issue you're trying to
solve is not only related to the Atmel devices so you may also have to update
the xlate functions of the other DMA controllers.
> Or maybe no change at all is required at the at_xdmac.c driver side: we
> just don't care about the provided flags in the "dmas" property, especially
> the "peripheral id". They would be ignored anyway when the atmel-quadspi.c
> driver later calls dmaengine_prep_dma_memcpy(). So I could simply set the
> dma cells to 0 in the device-tree?
>
> Ludovic, what do you think about that ?
It may work but I won't do this. Usually, channels requested through the xlate
function have usually their capaiblities set to DMA_SLAVE and not DMA_MEMCPY.
In the at_xdmac case, it won't be an issue but if you have a controller
which has channels which can support only mem-to-mem or peripheral, it
won't work.
Regards
Ludovic
On Tue, 2018-01-02 at 11:22 +0100, Ludovic Desroches wrote:
> On Wed, Dec 27, 2017 at 10:40:00PM +0100, Cyrille Pitchen wrote:
>
> > Or maybe no change at all is required at the at_xdmac.c driver side: we
> > just don't care about the provided flags in the "dmas" property, especially
> > the "peripheral id". They would be ignored anyway when the atmel-quadspi.c
> > driver later calls dmaengine_prep_dma_memcpy(). So I could simply set the
> > dma cells to 0 in the device-tree?
> >
> > Ludovic, what do you think about that ?
>
> It may work but I won't do this. Usually, channels requested through the xlate
> function have usually their capaiblities set to DMA_SLAVE and not DMA_MEMCPY.
> In the at_xdmac case, it won't be an issue but if you have a controller
> which has channels which can support only mem-to-mem or peripheral, it
> won't work.
Maybe one could create an "AT91_XDMAC_DT_" macro to indicate a memcpy
channel. There are still unused bits for another flag. It also looks
like at_xdma uses peripheral id 0x3f for memcpy transfers (will that
work with memcpy DMA on multiple channels at the same time?). So
perhaps perid 0x3f could be the indication of wanting a memcpy channel,
rather than another flag bit. But however it's done, one writes:
dmas = <&dma0 AT91_XDMAC_DT_MEMCPY>; dma-names = "rx-tx";
I think one could have the quadspi driver automatically fill in the dma
cell in the dma specifier if it is not present in the device tree. So
one could write "dmas = <&dma0>" and the driver adds the
AT91_XDMAC_DT_MEMCPY cell before xlating. I'm not sure if that's a
good idea or not.
On Tue, Jan 02, 2018 at 07:18:58PM +0000, Trent Piepho wrote:
> On Tue, 2018-01-02 at 11:22 +0100, Ludovic Desroches wrote:
> > On Wed, Dec 27, 2017 at 10:40:00PM +0100, Cyrille Pitchen wrote:
> >
> > > Or maybe no change at all is required at the at_xdmac.c driver side: we
> > > just don't care about the provided flags in the "dmas" property, especially
> > > the "peripheral id". They would be ignored anyway when the atmel-quadspi.c
> > > driver later calls dmaengine_prep_dma_memcpy(). So I could simply set the
> > > dma cells to 0 in the device-tree?
> > >
> > > Ludovic, what do you think about that ?
> >
> > It may work but I won't do this. Usually, channels requested through the xlate
> > function have usually their capaiblities set to DMA_SLAVE and not DMA_MEMCPY.
> > In the at_xdmac case, it won't be an issue but if you have a controller
> > which has channels which can support only mem-to-mem or peripheral, it
> > won't work.
>
> Maybe one could create an "AT91_XDMAC_DT_" macro to indicate a memcpy
> channel. There are still unused bits for another flag. It also looks
> like at_xdma uses peripheral id 0x3f for memcpy transfers (will that
> work with memcpy DMA on multiple channels at the same time?). So
> perhaps perid 0x3f could be the indication of wanting a memcpy channel,
> rather than another flag bit. But however it's done, one writes:
>
> dmas = <&dma0 AT91_XDMAC_DT_MEMCPY>; dma-names = "rx-tx";
>
If have no objection about doing that, my concerns are:
- most (all ?) of the dma controllers used the xlate function to provide
slave channel. Does it have to provide slave channel or can we
use it for all kind of channel? From my point of view, we can do it,
just need the confirmation.
- this set of patches if focused on the atmel qspi controller but other
ones may be interested in doing the same thing so they would have to
update the behavior of the xlate function of the DMA controller they
are using. So having the request of a DMA_MEMCPY channel inside the
spi/qspi controller doesn't seem to be a wrong idea. Moreover, it may
be confusing for the user who don't know the context: why do I have to
use memcpy and not slave as usal?
Honestly I have no opinion about the way to do it. Both have pros and
cons.
> I think one could have the quadspi driver automatically fill in the dma
> cell in the dma specifier if it is not present in the device tree. So
> one could write "dmas = <&dma0>" and the driver adds the
> AT91_XDMAC_DT_MEMCPY cell before xlating. I'm not sure if that's a
> good idea or not.
I don't think so, there is enough black magic, let's try to not add more
:p
Regards
Ludovic
On Wed, Dec 27, 2017 at 08:15:11PM +0000, Trent Piepho wrote:
> A random collection of spi master drivers will accept DMA-unsafe
> buffers in some way. In some cases a framework like spi-nor provides
> the fixup to spi-nor master drivers (none so far) and in other cases
> (atmel-quadspi), the spi-nor master driver has its own fixes.
Everything really should be DMA safe. The thing with skimping on the
DMA safety is that things generally won't explode every single time but
they might explode.
> What exactly is caught as DMA unsafe and what is not will of course
> vary greatly from driver to driver. Some drivers will catch highmem
> memory while other drivers will only detect vmalloc memory. Some will
> only catch an unsafe buffer if a specific SoC known to the driver to
> have an aliasing cache is enabled. Some will check buffers that arrive
> via the spi_flash_read interface but not via generic spi transfers,
> while others will check all spi transfer buffers.
As I keep saying here the real fix would be to have a simple API that we
could use to tell if we're dealing with something DMA safe. Right now
it's a mess because you need to pass that information around through
every layer that references memory which is a pain.
On Wed, Dec 27, 2017 at 10:40:00PM +0100, Cyrille Pitchen wrote:
> Le 27/12/2017 ? 00:23, Rob Herring a ?crit :
> > On Sun, Dec 24, 2017 at 05:36:05AM +0100, Cyrille Pitchen wrote:
> >> +Optional properties:
> >> +- dmacap,memcpy: Reserve a DMA channel to perform DMA memcpy() between the
> >> + system memory and the QSPI mapped memory.
> > How is this a h/w property? Why would I not want to always enable DMA if
> > possible?
> The number of DMA channels is limited for a given SoC. This number may be
> lower than the number of enabled controllers (spi, i2c, qspi, aes, sha,
> des, sdmmc, usart, ...).
> So we use a DT property to explicitly tell the matching drivers to request
> and reserved the DMA channels they need. This policy is not driver or even
> SoC specific but board specific. It's very common to reserved DMA channels
> for the most used or most performance dependent controllers, setting the
> relevant properties in the device-tree then restricting remaining
> controllers to their PIO mode.
Why can't we just time share the DMA channels at runtime, why do we need
this static allocation?
On Tue, 26 Dec 2017 14:59:00 +0100
Cyrille Pitchen <[email protected]> wrote:
> Hi Vignesh
>
> Le 26/12/2017 à 14:42, Vignesh R a écrit :
> > Hi Cyrille,
> >
> > Thanks for doing this series! One comment below.
> >
> > On 24-Dec-17 10:06 AM, Cyrille Pitchen wrote:
> > [...]
> >> diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
> >> index 8bafd462f0ae..59f9fbd45234 100644
> >> --- a/drivers/mtd/spi-nor/spi-nor.c
> >> +++ b/drivers/mtd/spi-nor/spi-nor.c
> >> @@ -14,8 +14,10 @@
> >> #include <linux/errno.h>
> >> #include <linux/module.h>
> >> #include <linux/device.h>
> >> +#include <linux/highmem.h>
> >> #include <linux/mutex.h>
> >> #include <linux/math64.h>
> >> +#include <linux/mm.h>
> >> #include <linux/sizes.h>
> >> #include <linux/slab.h>
> >>
> >> @@ -1232,6 +1234,56 @@ static const struct flash_info spi_nor_ids[] = {
> >> { },
> >> };
> >>
> >> +static bool spi_nor_is_dma_safe(const void *buf)
> >> +{
> >> + if (is_vmalloc_addr(buf))
> >> + return false;
> >> +
> >> +#ifdef CONFIG_HIGHMEM
> >> + if ((unsigned long)buf >= PKMAP_BASE &&
> >> + (unsigned long)buf < (PKMAP_BASE + (LAST_PKMAP * PAGE_SIZE)))
> >> + return false;
> >> +#endif
> >> +
> >> + return true;
> >> +}
> >> +
> >
> > Better way would be to use virt_addr_valid():
> > static bool spi_nor_is_dma_safe(const void *buf)
> > {
> > return virt_addr_valid(buf);
> > }
> >
> > Regards
> > Vignesh
> >
>
> Thanks for the advice :)
>
> https://patchwork.kernel.org/patch/9768341/
> Maybe I could check both virt_addr_valid() and object_is_on_stack() too ?
Yep, see the explanation given here [1].
[1]http://elixir.free-electrons.com/linux/v4.15-rc6/source/Documentation/DMA-API-HOWTO.txt#L132
Hi Cyrille,
On Sun, 24 Dec 2017 05:36:04 +0100
Cyrille Pitchen <[email protected]> wrote:
> This patch has two purposes:
>
> 1 - To fix the compatible issue between the MTD and SPI sub-systems
>
> The MTD sub-system has no particular requirement about the memory areas it
> uses. Especially, ubifs is well known for using vmalloc'ed buffers, which
> then are not DMA-safe. There are reasons behind that, so we have to deal
> with it.
Well, the only reason is that it's easier to deal with
virtually contiguous memory region, and since the LEB/PEB size can be
quite big (especially for NAND devices) we have to allocate it with
vmalloc() to prevent memory fragmentation.
The solution would be to allocate an array of ubi->min_io_size buffers
using kzalloc() and write/read to/from the MTD device using this
granularity, but this approach would require quite a few changes and
that's not the topic here.
>
> On the other hand, the SPI sub-system clearly states in the kernel doc for
> 'struct spi-transfer' (include/linux/spi/spi.h) that both .tx_buf and
> .rx_buf must point into "dma-safe memory". This requirement has not been
> taken into account by the m25p80.c driver, at the border between MTD and
> SPI, for a long time now. So it's high time to fix this issue.
I agree, even if I guess the MTD layer is not the only offender and
having this bounce buffer logic at the SPI level would be even better
IMO. But let's solve the problem in the SPI-NOR layer for now.
>
> 2 - To help other SPI flash controller drivers to perform DMA transfers
>
> Those controller drivers suffer the same issue as those behind the
> m25p80.c driver in the SPI sub-system: They may be provided by the MTD
> sub-system with buffers not suited to DMA operations. We want to avoid
> each controller to implement its own bounce buffer so we offer them some
> optional bounce buffer, allocated and managed by the spi-nor framework
> itself, as an helper to add support to DMA transfers.
>
> Then the patch adds two hardware capabilities for SPI flash controllers,
> SNOR_HWCAPS_WR_BOUNCE and SNOR_HWCAPS_RD_BOUNCE.
Do you have good reasons to handle the read/write path independently? I
mean, if you need a bounce buffer for one of them it's likely that
you'll need it for both.
>
> SPI flash controller drivers are supposed to use them to request the
> spi-nor framework to allocate an optional bounce buffer in some
> DMA-safe memory area then to use it in some cases during (Fast) Read
> and/or Page Program operations.
>
> More precisely, the bounce buffer is used if and only if two conditions
> are met:
> 1 - The SPI flash controller driver has declared the
> SNOR_HWCAPS_RD_BOUNCE, resp. SNOR_HWCAPS_WR_BOUNCE for (Fast) Read,
> resp. Page Program operations.
> 2 - The buffer provided by the above MTD layer is not already in a
> DMA-safe area.
>
> This policy avoid using the bounce buffer when not explicitly requested
> or when not needed, hence limiting the performance penalty.
>
> Besides, the bounce buffer is allocated once for all at the very first
> time it is actually needed.
Hm, I think it would be better to allocate the buffer at detection/probe
time, when you have all the information you need (sector_size?). My
fear is that you might not be able to kmalloc() a large buffer the
first time a read/write operation is performed, and that means the
operation might randomly fail/succeed depending on when the first IO
operation is done. If you do it at probe time, you'll be able to detect
the allocation failure early and stop the MTD dev registration when
this is the case.
> This means that as long as all buffers
> provided by the above MTD layer are allocated in some DMA-safe areas, the
> bounce buffer itself is never allocated.
>
> Finally, the bounce buffer size is limited to 256KiB, the currently known
> maximum erase sector size. This tradeoff should still provide good
> performances even if new memory parts come with even larger erase sector
> sizes, limiting the memory footprint at the same time.
>
> Signed-off-by: Cyrille Pitchen <[email protected]>
> ---
> drivers/mtd/devices/m25p80.c | 4 +-
> drivers/mtd/spi-nor/spi-nor.c | 136 +++++++++++++++++++++++++++++++++++++++---
> include/linux/mtd/spi-nor.h | 8 +++
> 3 files changed, 139 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/mtd/devices/m25p80.c b/drivers/mtd/devices/m25p80.c
> index a4e18f6aaa33..60878c62a654 100644
> --- a/drivers/mtd/devices/m25p80.c
> +++ b/drivers/mtd/devices/m25p80.c
> @@ -239,7 +239,9 @@ static int m25p_probe(struct spi_device *spi)
> struct spi_nor_hwcaps hwcaps = {
> .mask = SNOR_HWCAPS_READ |
> SNOR_HWCAPS_READ_FAST |
> - SNOR_HWCAPS_PP,
> + SNOR_HWCAPS_PP |
> + SNOR_HWCAPS_RD_BOUNCE |
> + SNOR_HWCAPS_WR_BOUNCE,
> };
> char *flash_name;
> int ret;
> diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
> index 8bafd462f0ae..59f9fbd45234 100644
> --- a/drivers/mtd/spi-nor/spi-nor.c
> +++ b/drivers/mtd/spi-nor/spi-nor.c
> @@ -14,8 +14,10 @@
> #include <linux/errno.h>
> #include <linux/module.h>
> #include <linux/device.h>
> +#include <linux/highmem.h>
> #include <linux/mutex.h>
> #include <linux/math64.h>
> +#include <linux/mm.h>
> #include <linux/sizes.h>
> #include <linux/slab.h>
>
> @@ -1232,6 +1234,56 @@ static const struct flash_info spi_nor_ids[] = {
> { },
> };
>
> +static bool spi_nor_is_dma_safe(const void *buf)
> +{
> + if (is_vmalloc_addr(buf))
> + return false;
> +
> +#ifdef CONFIG_HIGHMEM
> + if ((unsigned long)buf >= PKMAP_BASE &&
> + (unsigned long)buf < (PKMAP_BASE + (LAST_PKMAP * PAGE_SIZE)))
> + return false;
> +#endif
> +
> + return true;
> +}
> +
> +static int spi_nor_get_bounce_buffer(struct spi_nor *nor,
> + u_char **buffer,
> + size_t *buffer_size)
> +{
> + const struct flash_info *info = nor->info;
> + /*
> + * Limit the size of the bounce buffer to 256KB: this is currently
> + * the largest known erase sector size (> page size) and should be
> + * enough to still reach good performances if some day new memory
> + * parts use even larger erase sector sizes.
> + */
> + size_t size = min_t(size_t, info->sector_size, SZ_256K);
Wow! That's a huge max size for a buffer allocated with kmalloc. Are
you sure you shouldn't shrink it a bit? Don't know what the usual
sector_size is, but AFAIU sector_size is the ->erasesize, and read/write
operations are done at a ->writesize or ->writebufsize granularity.
> +
> + /*
> + * Allocate the bounce buffer once for all at the first time it is
> + * actually needed. This prevents wasting some precious memory
> + * in cases where it would never be needed.
> + */
> + if (unlikely(!nor->bounce_buffer)) {
I've been told that unlikely/likely() specifiers are not so useful
these days and are sometime doing worse than when nothing is specified.
I must admit I never went as far as evaluating it myself, but I don't
think it's really needed here (the time spent doing this check is
likely to be negligible compared to the IO operation).
> + nor->bounce_buffer = devm_kmalloc(nor->dev, size, GFP_KERNEL);
Nope, devm_kmalloc() is not DMA-safe (see this thread [1]).
> +
> + /*
> + * The SPI flash controller driver has required and expects to
> + * use the DMA-safe bounce buffer, so we can't recover from
> + * this allocation failure.
> + */
> + if (!nor->bounce_buffer)
> + return -ENOMEM;
> + }
> +
> + *buffer = nor->bounce_buffer;
> + *buffer_size = size;
> +
> + return 0;
> +}
> +
Regards,
Boris
[1]http://linux-arm-kernel.infradead.narkive.com/vyJqy0RQ/question-devm-kmalloc-for-dma