2020-05-12 08:59:55

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v4 00/38] DRM: fix struct sg_table nents vs. orig_nents misuse

Dear All,

During the Exynos DRM GEM rework and fixing the issues in the.
drm_prime_sg_to_page_addr_arrays() function [1] I've noticed that most
drivers in DRM framework incorrectly use nents and orig_nents entries of
the struct sg_table.

In case of the most DMA-mapping implementations exchanging those two
entries or using nents for all loops on the scatterlist is harmless,
because they both have the same value. There exists however a DMA-mapping
implementations, for which such incorrect usage breaks things. The nents
returned by dma_map_sg() might be lower than the nents passed as its
parameter and this is perfectly fine. DMA framework or IOMMU is allowed
to join consecutive chunks while mapping if such operation is supported
by the underlying HW (bus, bridge, IOMMU, etc). Example of the case
where dma_map_sg() might return 1 'DMA' chunk for the 4 'physical' pages
is described here [2]

The DMA-mapping framework documentation [3] states that dma_map_sg()
returns the numer of the created entries in the DMA address space.
However the subsequent calls to dma_sync_sg_for_{device,cpu} and
dma_unmap_sg must be called with the original number of entries passed to
dma_map_sg. The common pattern in DRM drivers were to assign the
dma_map_sg() return value to sg_table->nents and use that value for
the subsequent calls to dma_sync_sg_* or dma_unmap_sg functions. Also
the code iterated over nents times to access the pages stored in the
processed scatterlist, while it should use orig_nents as the numer of
the page entries.

I've tried to identify all such incorrect usage of sg_table->nents and
this is a result of my research. It looks that the incorrect pattern has
been copied over the many drivers mainly in the DRM subsystem. Too bad in
most cases it even worked correctly if the system used a simple, linear
DMA-mapping implementation, for which swapping nents and orig_nents
doesn't make any difference. To avoid similar issues in the future, I've
introduced a common wrappers for DMA-mapping calls, which operate directly
on the sg_table objects. I've also added wrappers for iterating over the
scatterlists stored in the sg_table objects and applied them where
possible. This, together with some common DRM prime helpers, allowed me
to almost get rid of all nents/orig_nents usage in the drivers. I hope
that such change makes the code robust, easier to follow and copy/paste
safe.

The biggest TODO is DRM/i915 driver and I don't feel brave enough to fix
it fully. The driver creatively uses sg_table->orig_nents to store the
size of the allocate scatterlist and ignores the number of the entries
returned by dma_map_sg function. In this patchset I only fixed the
sg_table objects exported by dmabuf related functions. I hope that I
didn't break anything there.

Patches are based on top of Linux next-20200511.

Best regards,
Marek Szyprowski


References:

[1] https://lkml.org/lkml/2020/3/27/555.
[2] https://lkml.org/lkml/2020/3/29/65
[3] Documentation/DMA-API-HOWTO.txt


Changelog:

v4:
- added for_each_sgtable_* wrappers and applied where possible
- added drm_prime_get_contiguous_size() and applied where possible
- applied drm_prime_sg_to_page_addr_arrays() where possible to remove page
extraction from sg_table objects
- added documentation for the introduced wrappers
- improved patches description a bit

v3: https://lore.kernel.org/dri-devel/[email protected]/
- introduce dma_*_sgtable_* wrappers and use them in all patches

v2: https://lore.kernel.org/linux-iommu/[email protected]/T/
- dropped most of the changes to drm/i915
- added fixes for rcar-du, xen, media and ion
- fixed a few issues pointed by kbuild test robot
- added wide cc: list for each patch

v1: https://lore.kernel.org/linux-iommu/[email protected]/T/
- initial version


Patch summary:

Marek Szyprowski (38):
dma-mapping: add generic helpers for mapping sgtable objects
scatterlist: add generic wrappers for iterating over sgtable objects
iommu: add generic helper for mapping sgtable objects
drm: prime: add common helper to check scatterlist contiguity
drm: prime: use sgtable iterators in
drm_prime_sg_to_page_addr_arrays()
drm: core: fix common struct sg_table related issues
drm: amdgpu: fix common struct sg_table related issues
drm: armada: fix common struct sg_table related issues
drm: etnaviv: fix common struct sg_table related issues
drm: exynos: use common helper for a scatterlist contiguity check
drm: exynos: fix common struct sg_table related issues
drm: i915: fix common struct sg_table related issues
drm: lima: fix common struct sg_table related issues
drm: mediatek: use common helper for a scatterlist contiguity check
drm: mediatek: use common helper for extracting pages array
drm: msm: fix common struct sg_table related issues
drm: omapdrm: use common helper for extracting pages array
drm: omapdrm: fix common struct sg_table related issues
drm: panfrost: fix common struct sg_table related issues
drm: radeon: fix common struct sg_table related issues
drm: rockchip: use common helper for a scatterlist contiguity check
drm: rockchip: fix common struct sg_table related issues
drm: tegra: fix common struct sg_table related issues
drm: v3d: fix common struct sg_table related issues
drm: virtio: fix common struct sg_table related issues
drm: vmwgfx: fix common struct sg_table related issues
xen: gntdev: fix common struct sg_table related issues
drm: host1x: fix common struct sg_table related issues
drm: rcar-du: fix common struct sg_table related issues
dmabuf: fix common struct sg_table related issues
staging: ion: remove dead code
staging: ion: fix common struct sg_table related issues
staging: tegra-vde: fix common struct sg_table related issues
misc: fastrpc: fix common struct sg_table related issues
rapidio: fix common struct sg_table related issues
samples: vfio-mdev/mbochs: fix common struct sg_table related issues
media: pci: fix common ALSA DMA-mapping related codes
videobuf2: use sgtable-based scatterlist wrappers

drivers/dma-buf/heaps/heap-helpers.c | 13 ++--
drivers/dma-buf/udmabuf.c | 7 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 6 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 9 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 8 +-
drivers/gpu/drm/armada/armada_gem.c | 12 +--
drivers/gpu/drm/drm_cache.c | 2 +-
drivers/gpu/drm/drm_gem_cma_helper.c | 23 +-----
drivers/gpu/drm/drm_gem_shmem_helper.c | 14 ++--
drivers/gpu/drm/drm_prime.c | 86 ++++++++++++----------
drivers/gpu/drm/etnaviv/etnaviv_gem.c | 12 ++-
drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 13 +---
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 10 +--
drivers/gpu/drm/exynos/exynos_drm_gem.c | 23 +-----
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 11 +--
drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c | 7 +-
drivers/gpu/drm/lima/lima_gem.c | 11 ++-
drivers/gpu/drm/lima/lima_vm.c | 5 +-
drivers/gpu/drm/mediatek/mtk_drm_gem.c | 34 ++-------
drivers/gpu/drm/msm/msm_gem.c | 13 ++--
drivers/gpu/drm/msm/msm_gpummu.c | 14 ++--
drivers/gpu/drm/msm/msm_iommu.c | 2 +-
drivers/gpu/drm/omapdrm/omap_gem.c | 20 ++---
drivers/gpu/drm/panfrost/panfrost_gem.c | 4 +-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 7 +-
drivers/gpu/drm/radeon/radeon_ttm.c | 11 ++-
drivers/gpu/drm/rcar-du/rcar_du_vsp.c | 3 +-
drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 42 +++--------
drivers/gpu/drm/tegra/gem.c | 27 +++----
drivers/gpu/drm/tegra/plane.c | 15 ++--
drivers/gpu/drm/v3d/v3d_mmu.c | 17 ++---
drivers/gpu/drm/virtio/virtgpu_object.c | 36 +++++----
drivers/gpu/drm/virtio/virtgpu_vq.c | 12 ++-
drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 17 +----
drivers/gpu/host1x/job.c | 22 ++----
.../media/common/videobuf2/videobuf2-dma-contig.c | 41 +++++------
drivers/media/common/videobuf2/videobuf2-dma-sg.c | 32 +++-----
drivers/media/common/videobuf2/videobuf2-vmalloc.c | 12 +--
drivers/media/pci/cx23885/cx23885-alsa.c | 2 +-
drivers/media/pci/cx25821/cx25821-alsa.c | 2 +-
drivers/media/pci/cx88/cx88-alsa.c | 2 +-
drivers/media/pci/saa7134/saa7134-alsa.c | 2 +-
drivers/media/platform/vsp1/vsp1_drm.c | 8 +-
drivers/misc/fastrpc.c | 4 +-
drivers/rapidio/devices/rio_mport_cdev.c | 8 +-
drivers/staging/android/ion/ion.c | 25 +++----
drivers/staging/android/ion/ion.h | 1 -
drivers/staging/android/ion/ion_heap.c | 53 ++++---------
drivers/staging/android/ion/ion_system_heap.c | 2 +-
drivers/staging/media/tegra-vde/iommu.c | 4 +-
drivers/xen/gntdev-dmabuf.c | 13 ++--
include/drm/drm_prime.h | 2 +
include/linux/dma-mapping.h | 79 ++++++++++++++++++++
include/linux/iommu.h | 16 ++++
include/linux/scatterlist.h | 50 ++++++++++++-
samples/vfio-mdev/mbochs.c | 3 +-
56 files changed, 452 insertions(+), 477 deletions(-)

--
1.9.1


2020-05-12 09:07:02

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v4 01/38] dma-mapping: add generic helpers for mapping sgtable objects

struct sg_table is a common structure used for describing a memory
buffer. It consists of a scatterlist with memory pages and DMA addresses
(sgl entry), as well as the number of scatterlist entries: CPU pages
(orig_nents entry) and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg
function.

To avoid such issues, lets introduce a common wrappers operating directly
on the struct sg_table objects, which take care of the proper use of
the nents and orig_nents entries.

Signed-off-by: Marek Szyprowski <[email protected]>
---
For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread:
https://lore.kernel.org/dri-devel/[email protected]/T/
---
include/linux/dma-mapping.h | 79 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 79 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index b43116a..88f01cc 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -609,6 +609,85 @@ static inline void dma_sync_single_range_for_device(struct device *dev,
return dma_sync_single_for_device(dev, addr + offset, size, dir);
}

+/**
+ * dma_map_sgtable - Map the given buffer for the DMA operations
+ * @dev: The device to perform a DMA operation
+ * @sgt: The sg_table object describing the buffer
+ * @dir: DMA direction
+ * @attrs: Optional DMA attributes for the map operation
+ *
+ * Maps a buffer described by a scatterlist stored in the given sg_table
+ * object for the @dir DMA operation by the @dev device. After success
+ * the ownership for the buffer is transferred to the DMA domain. One has
+ * to call dma_sync_sgtable_for_cpu() or dma_unmap_sgtable() to move the
+ * ownership of the buffer back to the CPU domain before touching the
+ * buffer by the CPU.
+ * Returns 0 on success or -EINVAL on error during mapping the buffer.
+ */
+static inline int dma_map_sgtable(struct device *dev, struct sg_table *sgt,
+ enum dma_data_direction dir, unsigned long attrs)
+{
+ int n = dma_map_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);
+
+ if (n > 0) {
+ sgt->nents = n;
+ return 0;
+ }
+ return -EINVAL;
+}
+
+/**
+ * dma_unmap_sgtable - Unmap the given buffer for the DMA operations
+ * @dev: The device to perform a DMA operation
+ * @sgt: The sg_table object describing the buffer
+ * @dir: DMA direction
+ * @attrs: Optional DMA attributes for the map operation
+ *
+ * Unmaps a buffer described by a scatterlist stored in the given sg_table
+ * object for the @dir DMA operation by the @dev device. After this function
+ * the ownership of the buffer is transferred back to the CPU domain.
+ */
+static inline void dma_unmap_sgtable(struct device *dev, struct sg_table *sgt,
+ enum dma_data_direction dir, unsigned long attrs)
+{
+ dma_unmap_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);
+}
+
+/**
+ * dma_sync_sgtable_for_cpu - Synchronize the given buffer for the CPU access
+ * @dev: The device to perform a DMA operation
+ * @sgt: The sg_table object describing the buffer
+ * @dir: DMA direction
+ *
+ * Performs the needed cache synchronization and moves the ownership of the
+ * buffer back to the CPU domain, so it is safe to perform any access to it
+ * by the CPU. Before doing any further DMA operations, one has to transfer
+ * the ownership of the buffer back to the DMA domain by calling the
+ * dma_sync_sgtable_for_device().
+ */
+static inline void dma_sync_sgtable_for_cpu(struct device *dev,
+ struct sg_table *sgt, enum dma_data_direction dir)
+{
+ dma_sync_sg_for_cpu(dev, sgt->sgl, sgt->orig_nents, dir);
+}
+
+/**
+ * dma_sync_sgtable_for_device - Synchronize the given buffer for the DMA
+ * @dev: The device to perform a DMA operation
+ * @sgt: The sg_table object describing the buffer
+ * @dir: DMA direction
+ *
+ * Performs the needed cache synchronization and moves the ownership of the
+ * buffer back to the DMA domain, so it is safe to perform the DMA operation.
+ * Once finished, one has to call dma_sync_sgtable_for_cpu() or
+ * dma_unmap_sgtable().
+ */
+static inline void dma_sync_sgtable_for_device(struct device *dev,
+ struct sg_table *sgt, enum dma_data_direction dir)
+{
+ dma_sync_sg_for_device(dev, sgt->sgl, sgt->orig_nents, dir);
+}
+
#define dma_map_single(d, a, s, r) dma_map_single_attrs(d, a, s, r, 0)
#define dma_unmap_single(d, a, s, r) dma_unmap_single_attrs(d, a, s, r, 0)
#define dma_map_sg(d, s, n, r) dma_map_sg_attrs(d, s, n, r, 0)
--
1.9.1

2020-05-12 09:07:05

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v4 06/38] drm: core: fix common struct sg_table related issues

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski <[email protected]>
---
For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread:
https://lore.kernel.org/dri-devel/[email protected]/T/
---
drivers/gpu/drm/drm_cache.c | 2 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 14 +++++++++-----
drivers/gpu/drm/drm_prime.c | 11 ++++++-----
3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 03e01b0..0fe3c49 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -127,7 +127,7 @@ static void drm_cache_flush_clflush(struct page *pages[],
struct sg_page_iter sg_iter;

mb(); /*CLFLUSH is ordered only by using memory barriers*/
- for_each_sg_page(st->sgl, &sg_iter, st->nents, 0)
+ for_each_sgtable_page(st, &sg_iter, 0)
drm_clflush_page(sg_page_iter_page(&sg_iter));
mb(); /*Make sure that all cache line entry is flushed*/

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index df31e57..00a43e8 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -117,8 +117,8 @@ void drm_gem_shmem_free_object(struct drm_gem_object *obj)
kvfree(shmem->pages);
} else {
if (shmem->sgt) {
- dma_unmap_sg(obj->dev->dev, shmem->sgt->sgl,
- shmem->sgt->nents, DMA_BIDIRECTIONAL);
+ dma_unmap_sgtable(obj->dev->dev, shmem->sgt,
+ DMA_BIDIRECTIONAL, 0);
sg_free_table(shmem->sgt);
kfree(shmem->sgt);
}
@@ -395,8 +395,7 @@ void drm_gem_shmem_purge_locked(struct drm_gem_object *obj)

WARN_ON(!drm_gem_shmem_is_purgeable(shmem));

- dma_unmap_sg(obj->dev->dev, shmem->sgt->sgl,
- shmem->sgt->nents, DMA_BIDIRECTIONAL);
+ dma_unmap_sgtable(obj->dev->dev, shmem->sgt, DMA_BIDIRECTIONAL, 0);
sg_free_table(shmem->sgt);
kfree(shmem->sgt);
shmem->sgt = NULL;
@@ -623,12 +622,17 @@ struct sg_table *drm_gem_shmem_get_pages_sgt(struct drm_gem_object *obj)
goto err_put_pages;
}
/* Map the pages for use by the h/w. */
- dma_map_sg(obj->dev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL);
+ ret = dma_map_sgtable(obj->dev->dev, sgt, DMA_BIDIRECTIONAL, 0);
+ if (ret)
+ goto err_free_sgt;

shmem->sgt = sgt;

return sgt;

+err_free_sgt:
+ sg_free_table(sgt);
+ kfree(sgt);
err_put_pages:
drm_gem_shmem_put_pages(shmem);
return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index dfdf4d4..5ed22dd 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -617,6 +617,7 @@ struct sg_table *drm_gem_map_dma_buf(struct dma_buf_attachment *attach,
{
struct drm_gem_object *obj = attach->dmabuf->priv;
struct sg_table *sgt;
+ int ret;

if (WARN_ON(dir == DMA_NONE))
return ERR_PTR(-EINVAL);
@@ -626,11 +627,12 @@ struct sg_table *drm_gem_map_dma_buf(struct dma_buf_attachment *attach,
else
sgt = obj->dev->driver->gem_prime_get_sg_table(obj);

- if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC)) {
+ ret = dma_map_sgtable(attach->dev, sgt, dir,
+ DMA_ATTR_SKIP_CPU_SYNC);
+ if (ret) {
sg_free_table(sgt);
kfree(sgt);
- sgt = ERR_PTR(-ENOMEM);
+ sgt = ERR_PTR(ret);
}

return sgt;
@@ -652,8 +654,7 @@ void drm_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
if (!sgt)
return;

- dma_unmap_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC);
+ dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sgt);
kfree(sgt);
}
--
1.9.1

2020-05-12 09:07:17

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v4 29/38] drm: rcar-du: fix common struct sg_table related issues

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

dma_map_sgtable() function returns zero or an error code, so adjust the
return value check for the vsp1_du_map_sg() function.

Signed-off-by: Marek Szyprowski <[email protected]>
---
For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread:
https://lore.kernel.org/dri-devel/[email protected]/T/
---
drivers/gpu/drm/rcar-du/rcar_du_vsp.c | 3 +--
drivers/media/platform/vsp1/vsp1_drm.c | 8 ++++----
2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_vsp.c b/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
index 5e4faf2..2fc1816 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
@@ -197,9 +197,8 @@ int rcar_du_vsp_map_fb(struct rcar_du_vsp *vsp, struct drm_framebuffer *fb,
goto fail;

ret = vsp1_du_map_sg(vsp->vsp, sgt);
- if (!ret) {
+ if (ret) {
sg_free_table(sgt);
- ret = -ENOMEM;
goto fail;
}
}
diff --git a/drivers/media/platform/vsp1/vsp1_drm.c b/drivers/media/platform/vsp1/vsp1_drm.c
index a4a45d6..86d5e3f 100644
--- a/drivers/media/platform/vsp1/vsp1_drm.c
+++ b/drivers/media/platform/vsp1/vsp1_drm.c
@@ -912,8 +912,8 @@ int vsp1_du_map_sg(struct device *dev, struct sg_table *sgt)
* skip cache sync. This will need to be revisited when support for
* non-coherent buffers will be added to the DU driver.
*/
- return dma_map_sg_attrs(vsp1->bus_master, sgt->sgl, sgt->nents,
- DMA_TO_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
+ return dma_map_sgtable(vsp1->bus_master, sgt, DMA_TO_DEVICE,
+ DMA_ATTR_SKIP_CPU_SYNC);
}
EXPORT_SYMBOL_GPL(vsp1_du_map_sg);

@@ -921,8 +921,8 @@ void vsp1_du_unmap_sg(struct device *dev, struct sg_table *sgt)
{
struct vsp1_device *vsp1 = dev_get_drvdata(dev);

- dma_unmap_sg_attrs(vsp1->bus_master, sgt->sgl, sgt->nents,
- DMA_TO_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
+ dma_unmap_sgtable(vsp1->bus_master, sgt, DMA_TO_DEVICE,
+ DMA_ATTR_SKIP_CPU_SYNC);
}
EXPORT_SYMBOL_GPL(vsp1_du_unmap_sg);

--
1.9.1

2020-05-12 09:07:40

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v4 02/38] scatterlist: add generic wrappers for iterating over sgtable objects

struct sg_table is a common structure used for describing a memory
buffer. It consists of a scatterlist with memory pages and DMA addresses
(sgl entry), as well as the number of scatterlist entries: CPU pages
(orig_nents entry) and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling the scatterlist iterating functions with a wrong number
of the entries.

To avoid such issues, lets introduce a common wrappers operating directly
on the struct sg_table objects, which take care of the proper use of
the nents and orig_nents entries.

While touching this, lets clarify some ambiguities in the comments for
the existing for_each helpers.

Signed-off-by: Marek Szyprowski <[email protected]>
---
For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread:
https://lore.kernel.org/dri-devel/[email protected]/T/
---
include/linux/scatterlist.h | 50 ++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 47 insertions(+), 3 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 6eec50f..4f922af 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -151,6 +151,20 @@ static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
#define for_each_sg(sglist, sg, nr, __i) \
for (__i = 0, sg = (sglist); __i < (nr); __i++, sg = sg_next(sg))

+/*
+ * Loop over each sg element in the given sg_table object.
+ */
+#define for_each_sgtable_sg(sgt, sg, i) \
+ for_each_sg(sgt->sgl, sg, sgt->orig_nents, i)
+
+/*
+ * Loop over each sg element in the given *DMA mapped* sg_table object.
+ * Please use sg_dma_address(sg) and sg_dma_len(sg) to extract DMA addresses
+ * of the each element.
+ */
+#define for_each_sgtable_dma_sg(sgt, sg, i) \
+ for_each_sg(sgt->sgl, sg, sgt->nents, i)
+
/**
* sg_chain - Chain two sglists together
* @prv: First scatterlist
@@ -401,9 +415,10 @@ static inline struct page *sg_page_iter_page(struct sg_page_iter *piter)
* @sglist: sglist to iterate over
* @piter: page iterator to hold current page, sg, sg_pgoffset
* @nents: maximum number of sg entries to iterate over
- * @pgoffset: starting page offset
+ * @pgoffset: starting page offset (in pages)
*
* Callers may use sg_page_iter_page() to get each page pointer.
+ * In each loop it operates on PAGE_SIZE unit.
*/
#define for_each_sg_page(sglist, piter, nents, pgoffset) \
for (__sg_page_iter_start((piter), (sglist), (nents), (pgoffset)); \
@@ -412,18 +427,47 @@ static inline struct page *sg_page_iter_page(struct sg_page_iter *piter)
/**
* for_each_sg_dma_page - iterate over the pages of the given sg list
* @sglist: sglist to iterate over
- * @dma_iter: page iterator to hold current page
+ * @dma_iter: DMA page iterator to hold current page
* @dma_nents: maximum number of sg entries to iterate over, this is the value
* returned from dma_map_sg
- * @pgoffset: starting page offset
+ * @pgoffset: starting page offset (in pages)
*
* Callers may use sg_page_iter_dma_address() to get each page's DMA address.
+ * In each loop it operates on PAGE_SIZE unit.
*/
#define for_each_sg_dma_page(sglist, dma_iter, dma_nents, pgoffset) \
for (__sg_page_iter_start(&(dma_iter)->base, sglist, dma_nents, \
pgoffset); \
__sg_page_iter_dma_next(dma_iter);)

+/**
+ * for_each_sgtable_page - iterate over all pages in the sg_table object
+ * @sgt: sg_table object to iterate over
+ * @piter: page iterator to hold current page
+ * @pgoffset: starting page offset (in pages)
+ *
+ * Iterates over the all memory pages in the buffer described by
+ * a scatterlist stored in the given sg_table object.
+ * See also for_each_sg_page(). In each loop it operates on PAGE_SIZE unit.
+ */
+#define for_each_sgtable_page(sgt, piter, pgoffset) \
+ for_each_sg_page(sgt->sgl, piter, sgt->orig_nents, pgoffset)
+
+/**
+ * for_each_sgtable_dma_page - iterate over the DMA mapped sg_table object
+ * @sgt: sg_table object to iterate over
+ * @dma_iter: DMA page iterator to hold current page
+ * @pgoffset: starting page offset (in pages)
+ *
+ * Iterates over the all DMA mapped pages in the buffer described by
+ * a scatterlist stored in the given sg_table object.
+ * See also for_each_sg_dma_page(). In each loop it operates on PAGE_SIZE
+ * unit.
+ */
+#define for_each_sgtable_dma_page(sgt, dma_iter, pgoffset) \
+ for_each_sg_dma_page(sgt->sgl, dma_iter, sgt->nents, pgoffset)
+
+
/*
* Mapping sg iterator
*
--
1.9.1

2020-05-12 09:07:41

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v4 20/38] drm: radeon: fix common struct sg_table related issues

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski <[email protected]>
Reviewed-by: Christian König <[email protected]>
---
For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread:
https://lore.kernel.org/dri-devel/[email protected]/T/
---
drivers/gpu/drm/radeon/radeon_ttm.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 5d50c9e..0e3eb0d 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -481,7 +481,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
{
struct radeon_device *rdev = radeon_get_rdev(ttm->bdev);
struct radeon_ttm_tt *gtt = (void *)ttm;
- unsigned pinned = 0, nents;
+ unsigned pinned = 0;
int r;

int write = !(gtt->userflags & RADEON_GEM_USERPTR_READONLY);
@@ -521,9 +521,8 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
if (r)
goto release_sg;

- r = -ENOMEM;
- nents = dma_map_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
- if (nents == 0)
+ r = dma_map_sgtable(rdev->dev, ttm->sg, direction, 0);
+ if (r)
goto release_sg;

drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
@@ -554,9 +553,9 @@ static void radeon_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
return;

/* free the sg table and pages again */
- dma_unmap_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
+ dma_unmap_sgtable(rdev->dev, ttm->sg, direction, 0);

- for_each_sg_page(ttm->sg->sgl, &sg_iter, ttm->sg->nents, 0) {
+ for_each_sgtable_page(ttm->sg, &sg_iter, 0) {
struct page *page = sg_page_iter_page(&sg_iter);
if (!(gtt->userflags & RADEON_GEM_USERPTR_READONLY))
set_page_dirty(page);
--
1.9.1

2020-05-12 09:07:41

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v4 21/38] drm: rockchip: use common helper for a scatterlist contiguity check

Use common helper for checking the contiguity of the imported dma-buf.

Signed-off-by: Marek Szyprowski <[email protected]>
---
For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread:
https://lore.kernel.org/dri-devel/[email protected]/T/
---
drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 19 +------------------
1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
index 0d18846..21f8cb2 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
@@ -460,23 +460,6 @@ struct sg_table *rockchip_gem_prime_get_sg_table(struct drm_gem_object *obj)
return sgt;
}

-static unsigned long rockchip_sg_get_contiguous_size(struct sg_table *sgt,
- int count)
-{
- struct scatterlist *s;
- dma_addr_t expected = sg_dma_address(sgt->sgl);
- unsigned int i;
- unsigned long size = 0;
-
- for_each_sg(sgt->sgl, s, count, i) {
- if (sg_dma_address(s) != expected)
- break;
- expected = sg_dma_address(s) + sg_dma_len(s);
- size += sg_dma_len(s);
- }
- return size;
-}
-
static int
rockchip_gem_iommu_map_sg(struct drm_device *drm,
struct dma_buf_attachment *attach,
@@ -498,7 +481,7 @@ static unsigned long rockchip_sg_get_contiguous_size(struct sg_table *sgt,
if (!count)
return -EINVAL;

- if (rockchip_sg_get_contiguous_size(sg, count) < attach->dmabuf->size) {
+ if (drm_prime_get_contiguous_size(sg) < attach->dmabuf->size) {
DRM_ERROR("failed to map sg_table to contiguous linear address.\n");
dma_unmap_sg(drm->dev, sg->sgl, sg->nents,
DMA_BIDIRECTIONAL);
--
1.9.1

2020-05-12 09:07:57

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v4 12/38] drm: i915: fix common struct sg_table related issues

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

This driver creatively uses sg_table->orig_nents to store the size of the
allocated scatterlist and ignores the number of the entries returned by
dma_map_sg function. The sg_table->orig_nents is (mis)used to properly
free the (over)allocated scatterlist.

This patch only introduces the common DMA-mapping wrappers operating
directly on the struct sg_table objects to the dmabuf related functions,
so the other drivers, which might share buffers with i915 could rely on
the properly set nents and orig_nents values.

Signed-off-by: Marek Szyprowski <[email protected]>
---
For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread:
https://lore.kernel.org/dri-devel/[email protected]/T/
---
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 11 +++--------
drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c | 7 +++----
2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 7db5a79..6c67810 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -48,12 +48,9 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme
src = sg_next(src);
}

- if (!dma_map_sg_attrs(attachment->dev,
- st->sgl, st->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC)) {
- ret = -ENOMEM;
+ ret = dma_map_sgtable(attachment->dev, st, dir, DMA_ATTR_SKIP_CPU_SYNC);
+ if (ret)
goto err_free_sg;
- }

return st;

@@ -73,9 +70,7 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment,
{
struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);

- dma_unmap_sg_attrs(attachment->dev,
- sg->sgl, sg->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC);
+ dma_unmap_sgtable(attachment->dev, sg, dir, DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sg);
kfree(sg);

diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
index debaf7b..be30b27 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
@@ -28,10 +28,9 @@ static struct sg_table *mock_map_dma_buf(struct dma_buf_attachment *attachment,
sg = sg_next(sg);
}

- if (!dma_map_sg(attachment->dev, st->sgl, st->nents, dir)) {
- err = -ENOMEM;
+ err = dma_map_sgtable(attachment->dev, st, dir, 0);
+ if (err)
goto err_st;
- }

return st;

@@ -46,7 +45,7 @@ static void mock_unmap_dma_buf(struct dma_buf_attachment *attachment,
struct sg_table *st,
enum dma_data_direction dir)
{
- dma_unmap_sg(attachment->dev, st->sgl, st->nents, dir);
+ dma_unmap_sgtable(attachment->dev, st, dir, 0);
sg_free_table(st);
kfree(st);
}
--
1.9.1

2020-05-12 09:07:59

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v4 15/38] drm: mediatek: use common helper for extracting pages array

Use common helper for converting a sg_table object into struct
page pointer array.

Signed-off-by: Marek Szyprowski <[email protected]>
---
For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread:
https://lore.kernel.org/dri-devel/[email protected]/T/
---
drivers/gpu/drm/mediatek/mtk_drm_gem.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
index 6c34c06..14fcd48 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
@@ -233,9 +233,7 @@ void *mtk_drm_gem_prime_vmap(struct drm_gem_object *obj)
{
struct mtk_drm_gem_obj *mtk_gem = to_mtk_gem_obj(obj);
struct sg_table *sgt;
- struct sg_page_iter iter;
unsigned int npages;
- unsigned int i = 0;

if (mtk_gem->kvaddr)
return mtk_gem->kvaddr;
@@ -249,11 +247,8 @@ void *mtk_drm_gem_prime_vmap(struct drm_gem_object *obj)
if (!mtk_gem->pages)
goto out;

- for_each_sg_page(sgt->sgl, &iter, sgt->orig_nents, 0) {
- mtk_gem->pages[i++] = sg_page_iter_page(&iter);
- if (i > npages)
- break;
- }
+ drm_prime_sg_to_page_addr_arrays(sgt, mtk_gem->pages, NULL, npages);
+
mtk_gem->kvaddr = vmap(mtk_gem->pages, npages, VM_MAP,
pgprot_writecombine(PAGE_KERNEL));

--
1.9.1

2020-05-12 09:08:27

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v4 05/38] drm: prime: use sgtable iterators in drm_prime_sg_to_page_addr_arrays()

Replace the current hand-crafted code for extracting pages and DMA
addresses from the given scatterlist by the much more robust
code based on the generic scatterlist iterators and recently
introduced sg_table-based wrappers. The resulting code is simple and
easy to understand, so the comment describing the old code is no
longer needed.

Signed-off-by: Marek Szyprowski <[email protected]>
---
For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread:
https://lore.kernel.org/dri-devel/[email protected]/T/
---
drivers/gpu/drm/drm_prime.c | 47 ++++++++++++++-------------------------------
1 file changed, 14 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 1d2e5fe..dfdf4d4 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -985,45 +985,26 @@ struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
dma_addr_t *addrs, int max_entries)
{
- unsigned count;
- struct scatterlist *sg;
- struct page *page;
- u32 page_len, page_index;
- dma_addr_t addr;
- u32 dma_len, dma_index;
+ struct sg_dma_page_iter dma_iter;
+ struct sg_page_iter page_iter;
+ struct page **p = pages;
+ dma_addr_t *a = addrs;

- /*
- * Scatterlist elements contains both pages and DMA addresses, but
- * one shoud not assume 1:1 relation between them. The sg->length is
- * the size of the physical memory chunk described by the sg->page,
- * while sg_dma_len(sg) is the size of the DMA (IO virtual) chunk
- * described by the sg_dma_address(sg).
- */
- page_index = 0;
- dma_index = 0;
- for_each_sg(sgt->sgl, sg, sgt->nents, count) {
- page_len = sg->length;
- page = sg_page(sg);
- dma_len = sg_dma_len(sg);
- addr = sg_dma_address(sg);
-
- while (pages && page_len > 0) {
- if (WARN_ON(page_index >= max_entries))
+ if (pages) {
+ for_each_sgtable_page(sgt, &page_iter, 0) {
+ if (p - pages >= max_entries)
return -1;
- pages[page_index] = page;
- page++;
- page_len -= PAGE_SIZE;
- page_index++;
+ *p++ = sg_page_iter_page(&page_iter);
}
- while (addrs && dma_len > 0) {
- if (WARN_ON(dma_index >= max_entries))
+ }
+ if (addrs) {
+ for_each_sgtable_dma_page(sgt, &dma_iter, 0) {
+ if (a - addrs >= max_entries)
return -1;
- addrs[dma_index] = addr;
- addr += PAGE_SIZE;
- dma_len -= PAGE_SIZE;
- dma_index++;
+ *a++ = sg_page_iter_dma_address(&dma_iter);
}
}
+
return 0;
}
EXPORT_SYMBOL(drm_prime_sg_to_page_addr_arrays);
--
1.9.1

2020-05-12 09:08:28

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v4 04/38] drm: prime: add common helper to check scatterlist contiguity

It is a common operation done by DRM drivers to check the contiguity
of the DMA-mapped buffer described by a scatterlist in the
sg_table object. Let's add a common helper for this operation.

Signed-off-by: Marek Szyprowski <[email protected]>
---
For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread:
https://lore.kernel.org/dri-devel/[email protected]/T/
---
drivers/gpu/drm/drm_gem_cma_helper.c | 23 +++--------------------
drivers/gpu/drm/drm_prime.c | 26 ++++++++++++++++++++++++++
include/drm/drm_prime.h | 2 ++
3 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_cma_helper.c b/drivers/gpu/drm/drm_gem_cma_helper.c
index 12e98fb..9f2d13e 100644
--- a/drivers/gpu/drm/drm_gem_cma_helper.c
+++ b/drivers/gpu/drm/drm_gem_cma_helper.c
@@ -471,26 +471,9 @@ struct drm_gem_object *
{
struct drm_gem_cma_object *cma_obj;

- if (sgt->nents != 1) {
- /* check if the entries in the sg_table are contiguous */
- dma_addr_t next_addr = sg_dma_address(sgt->sgl);
- struct scatterlist *s;
- unsigned int i;
-
- for_each_sg(sgt->sgl, s, sgt->nents, i) {
- /*
- * sg_dma_address(s) is only valid for entries
- * that have sg_dma_len(s) != 0
- */
- if (!sg_dma_len(s))
- continue;
-
- if (sg_dma_address(s) != next_addr)
- return ERR_PTR(-EINVAL);
-
- next_addr = sg_dma_address(s) + sg_dma_len(s);
- }
- }
+ /* check if the entries in the sg_table are contiguous */
+ if (drm_prime_get_contiguous_size(sgt) < attach->dmabuf->size)
+ return ERR_PTR(-EINVAL);

/* Create a CMA GEM buffer. */
cma_obj = __drm_gem_cma_create(dev, attach->dmabuf->size);
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 282774e..1d2e5fe 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -826,6 +826,32 @@ struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_page
EXPORT_SYMBOL(drm_prime_pages_to_sg);

/**
+ * drm_prime_get_contiguous_size - returns the contiguous size of the buffer
+ * @sgt: sg_table describing the buffer to check
+ *
+ * This helper calculates the contiguous size in the DMA address space
+ * of the the buffer described by the provided sg_table.
+ *
+ * This is useful for implementing
+ * &drm_gem_object_funcs.gem_prime_import_sg_table.
+ */
+unsigned long drm_prime_get_contiguous_size(struct sg_table *sgt)
+{
+ dma_addr_t expected = sg_dma_address(sgt->sgl);
+ struct sg_dma_page_iter dma_iter;
+ unsigned long size = 0;
+
+ for_each_sgtable_dma_page(sgt, &dma_iter, 0) {
+ if (sg_page_iter_dma_address(&dma_iter) != expected)
+ break;
+ expected += PAGE_SIZE;
+ size += PAGE_SIZE;
+ }
+ return size;
+}
+EXPORT_SYMBOL(drm_prime_get_contiguous_size);
+
+/**
* drm_gem_prime_export - helper library implementation of the export callback
* @obj: GEM object to export
* @flags: flags like DRM_CLOEXEC and DRM_RDWR
diff --git a/include/drm/drm_prime.h b/include/drm/drm_prime.h
index 9af7422..47ef116 100644
--- a/include/drm/drm_prime.h
+++ b/include/drm/drm_prime.h
@@ -92,6 +92,8 @@ void drm_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj,
int flags);

+unsigned long drm_prime_get_contiguous_size(struct sg_table *sgt);
+
/* helper functions for importing */
struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev,
struct dma_buf *dma_buf,
--
1.9.1

2020-05-12 12:20:33

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v4 01/38] dma-mapping: add generic helpers for mapping sgtable objects

On Tue, May 12, 2020 at 11:00:21AM +0200, Marek Szyprowski wrote:
> struct sg_table is a common structure used for describing a memory
> buffer. It consists of a scatterlist with memory pages and DMA addresses
> (sgl entry), as well as the number of scatterlist entries: CPU pages
> (orig_nents entry) and DMA mapped pages (nents entry).
>
> It turned out that it was a common mistake to misuse nents and orig_nents
> entries, calling DMA-mapping functions with a wrong number of entries or
> ignoring the number of mapped entries returned by the dma_map_sg
> function.
>
> To avoid such issues, lets introduce a common wrappers operating directly
> on the struct sg_table objects, which take care of the proper use of
> the nents and orig_nents entries.
>
> Signed-off-by: Marek Szyprowski <[email protected]>
> ---
> For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
> vs. orig_nents misuse' thread:
> https://lore.kernel.org/dri-devel/[email protected]/T/
> ---
> include/linux/dma-mapping.h | 79 +++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 79 insertions(+)
>
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index b43116a..88f01cc 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -609,6 +609,85 @@ static inline void dma_sync_single_range_for_device(struct device *dev,
> return dma_sync_single_for_device(dev, addr + offset, size, dir);
> }
>
> +/**
> + * dma_map_sgtable - Map the given buffer for the DMA operations
> + * @dev: The device to perform a DMA operation
> + * @sgt: The sg_table object describing the buffer
> + * @dir: DMA direction
> + * @attrs: Optional DMA attributes for the map operation
> + *
> + * Maps a buffer described by a scatterlist stored in the given sg_table
> + * object for the @dir DMA operation by the @dev device. After success
> + * the ownership for the buffer is transferred to the DMA domain. One has
> + * to call dma_sync_sgtable_for_cpu() or dma_unmap_sgtable() to move the
> + * ownership of the buffer back to the CPU domain before touching the
> + * buffer by the CPU.
> + * Returns 0 on success or -EINVAL on error during mapping the buffer.
> + */
> +static inline int dma_map_sgtable(struct device *dev, struct sg_table *sgt,
> + enum dma_data_direction dir, unsigned long attrs)
> +{
> + int n = dma_map_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);
> +
> + if (n > 0) {
> + sgt->nents = n;
> + return 0;
> + }
> + return -EINVAL;

Nit: code tend to be a tad easier to read if the the exceptional
condition is inside the branch block, so:

if (n <= 0)
return -EINVAL;
sgt->nents = n;
return 0;

Otherwise this looks good to me:

Reviewed-by: Christoph Hellwig <[email protected]>

2020-05-12 12:21:44

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v4 00/38] DRM: fix struct sg_table nents vs. orig_nents misuse

Thanks for doing the work. I'd love to pull 1-3 into an immutable
branch as soon possible, as I also have some pending work touching
that area.

2020-05-12 13:02:52

by Marek Szyprowski

[permalink] [raw]
Subject: Re: [PATCH v4 01/38] dma-mapping: add generic helpers for mapping sgtable objects

Hi Christoph,

On 12.05.2020 14:18, Christoph Hellwig wrote:
> On Tue, May 12, 2020 at 11:00:21AM +0200, Marek Szyprowski wrote:
>> struct sg_table is a common structure used for describing a memory
>> buffer. It consists of a scatterlist with memory pages and DMA addresses
>> (sgl entry), as well as the number of scatterlist entries: CPU pages
>> (orig_nents entry) and DMA mapped pages (nents entry).
>>
>> It turned out that it was a common mistake to misuse nents and orig_nents
>> entries, calling DMA-mapping functions with a wrong number of entries or
>> ignoring the number of mapped entries returned by the dma_map_sg
>> function.
>>
>> To avoid such issues, lets introduce a common wrappers operating directly
>> on the struct sg_table objects, which take care of the proper use of
>> the nents and orig_nents entries.
>>
>> Signed-off-by: Marek Szyprowski <[email protected]>
>> ---
>> For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
>> vs. orig_nents misuse' thread:
>> https://lore.kernel.org/dri-devel/[email protected]/T/
>> ---
>> include/linux/dma-mapping.h | 79 +++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 79 insertions(+)
>>
>> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
>> index b43116a..88f01cc 100644
>> --- a/include/linux/dma-mapping.h
>> +++ b/include/linux/dma-mapping.h
>> @@ -609,6 +609,85 @@ static inline void dma_sync_single_range_for_device(struct device *dev,
>> return dma_sync_single_for_device(dev, addr + offset, size, dir);
>> }
>>
>> +/**
>> + * dma_map_sgtable - Map the given buffer for the DMA operations
>> + * @dev: The device to perform a DMA operation
>> + * @sgt: The sg_table object describing the buffer
>> + * @dir: DMA direction
>> + * @attrs: Optional DMA attributes for the map operation
>> + *
>> + * Maps a buffer described by a scatterlist stored in the given sg_table
>> + * object for the @dir DMA operation by the @dev device. After success
>> + * the ownership for the buffer is transferred to the DMA domain. One has
>> + * to call dma_sync_sgtable_for_cpu() or dma_unmap_sgtable() to move the
>> + * ownership of the buffer back to the CPU domain before touching the
>> + * buffer by the CPU.
>> + * Returns 0 on success or -EINVAL on error during mapping the buffer.
>> + */
>> +static inline int dma_map_sgtable(struct device *dev, struct sg_table *sgt,
>> + enum dma_data_direction dir, unsigned long attrs)
>> +{
>> + int n = dma_map_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);
>> +
>> + if (n > 0) {
>> + sgt->nents = n;
>> + return 0;
>> + }
>> + return -EINVAL;
> Nit: code tend to be a tad easier to read if the the exceptional
> condition is inside the branch block, so:
>
> if (n <= 0)
> return -EINVAL;
> sgt->nents = n;
> return 0;
>
Indeed this version looks much better. I will resend it in a few minutes.
> Otherwise this looks good to me:
>
> Reviewed-by: Christoph Hellwig <[email protected]>
>
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2020-05-12 13:12:01

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v4 01/38] dma-mapping: add generic helpers for mapping sgtable objects

On Tue, May 12, 2020 at 03:00:31PM +0200, Marek Szyprowski wrote:
> > if (n <= 0)
> > return -EINVAL;
> > sgt->nents = n;
> > return 0;
> >
> Indeed this version looks much better. I will resend it in a few minutes.

I could also just fix it up when applying the patch.

2020-05-12 13:21:37

by Marek Szyprowski

[permalink] [raw]
Subject: Re: [PATCH v4 01/38] dma-mapping: add generic helpers for mapping sgtable objects

Hi Christoph,

On 12.05.2020 15:09, Christoph Hellwig wrote:
> On Tue, May 12, 2020 at 03:00:31PM +0200, Marek Szyprowski wrote:
>>> if (n <= 0)
>>> return -EINVAL;
>>> sgt->nents = n;
>>> return 0;
>>>
>> Indeed this version looks much better. I will resend it in a few minutes.
> I could also just fix it up when applying the patch.
Fine for me.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2020-05-13 20:43:13

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v4 02/38] scatterlist: add generic wrappers for iterating over sgtable objects

On 2020-05-12 10:00 am, Marek Szyprowski wrote:
> struct sg_table is a common structure used for describing a memory
> buffer. It consists of a scatterlist with memory pages and DMA addresses
> (sgl entry), as well as the number of scatterlist entries: CPU pages
> (orig_nents entry) and DMA mapped pages (nents entry).
>
> It turned out that it was a common mistake to misuse nents and orig_nents
> entries, calling the scatterlist iterating functions with a wrong number
> of the entries.
>
> To avoid such issues, lets introduce a common wrappers operating directly
> on the struct sg_table objects, which take care of the proper use of
> the nents and orig_nents entries.
>
> While touching this, lets clarify some ambiguities in the comments for
> the existing for_each helpers.

Reviewed-by: Robin Murphy <[email protected]>

> Signed-off-by: Marek Szyprowski <[email protected]>
> ---
> For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
> vs. orig_nents misuse' thread:
> https://lore.kernel.org/dri-devel/[email protected]/T/
> ---
> include/linux/scatterlist.h | 50 ++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 47 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index 6eec50f..4f922af 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -151,6 +151,20 @@ static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
> #define for_each_sg(sglist, sg, nr, __i) \
> for (__i = 0, sg = (sglist); __i < (nr); __i++, sg = sg_next(sg))
>
> +/*
> + * Loop over each sg element in the given sg_table object.
> + */
> +#define for_each_sgtable_sg(sgt, sg, i) \
> + for_each_sg(sgt->sgl, sg, sgt->orig_nents, i)
> +
> +/*
> + * Loop over each sg element in the given *DMA mapped* sg_table object.
> + * Please use sg_dma_address(sg) and sg_dma_len(sg) to extract DMA addresses
> + * of the each element.
> + */
> +#define for_each_sgtable_dma_sg(sgt, sg, i) \
> + for_each_sg(sgt->sgl, sg, sgt->nents, i)
> +
> /**
> * sg_chain - Chain two sglists together
> * @prv: First scatterlist
> @@ -401,9 +415,10 @@ static inline struct page *sg_page_iter_page(struct sg_page_iter *piter)
> * @sglist: sglist to iterate over
> * @piter: page iterator to hold current page, sg, sg_pgoffset
> * @nents: maximum number of sg entries to iterate over
> - * @pgoffset: starting page offset
> + * @pgoffset: starting page offset (in pages)
> *
> * Callers may use sg_page_iter_page() to get each page pointer.
> + * In each loop it operates on PAGE_SIZE unit.
> */
> #define for_each_sg_page(sglist, piter, nents, pgoffset) \
> for (__sg_page_iter_start((piter), (sglist), (nents), (pgoffset)); \
> @@ -412,18 +427,47 @@ static inline struct page *sg_page_iter_page(struct sg_page_iter *piter)
> /**
> * for_each_sg_dma_page - iterate over the pages of the given sg list
> * @sglist: sglist to iterate over
> - * @dma_iter: page iterator to hold current page
> + * @dma_iter: DMA page iterator to hold current page
> * @dma_nents: maximum number of sg entries to iterate over, this is the value
> * returned from dma_map_sg
> - * @pgoffset: starting page offset
> + * @pgoffset: starting page offset (in pages)
> *
> * Callers may use sg_page_iter_dma_address() to get each page's DMA address.
> + * In each loop it operates on PAGE_SIZE unit.
> */
> #define for_each_sg_dma_page(sglist, dma_iter, dma_nents, pgoffset) \
> for (__sg_page_iter_start(&(dma_iter)->base, sglist, dma_nents, \
> pgoffset); \
> __sg_page_iter_dma_next(dma_iter);)
>
> +/**
> + * for_each_sgtable_page - iterate over all pages in the sg_table object
> + * @sgt: sg_table object to iterate over
> + * @piter: page iterator to hold current page
> + * @pgoffset: starting page offset (in pages)
> + *
> + * Iterates over the all memory pages in the buffer described by
> + * a scatterlist stored in the given sg_table object.
> + * See also for_each_sg_page(). In each loop it operates on PAGE_SIZE unit.
> + */
> +#define for_each_sgtable_page(sgt, piter, pgoffset) \
> + for_each_sg_page(sgt->sgl, piter, sgt->orig_nents, pgoffset)
> +
> +/**
> + * for_each_sgtable_dma_page - iterate over the DMA mapped sg_table object
> + * @sgt: sg_table object to iterate over
> + * @dma_iter: DMA page iterator to hold current page
> + * @pgoffset: starting page offset (in pages)
> + *
> + * Iterates over the all DMA mapped pages in the buffer described by
> + * a scatterlist stored in the given sg_table object.
> + * See also for_each_sg_dma_page(). In each loop it operates on PAGE_SIZE
> + * unit.
> + */
> +#define for_each_sgtable_dma_page(sgt, dma_iter, pgoffset) \
> + for_each_sg_dma_page(sgt->sgl, dma_iter, sgt->nents, pgoffset)
> +
> +
> /*
> * Mapping sg iterator
> *
>

2020-05-13 20:43:20

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v4 01/38] dma-mapping: add generic helpers for mapping sgtable objects

On 2020-05-12 10:00 am, Marek Szyprowski wrote:
> struct sg_table is a common structure used for describing a memory
> buffer. It consists of a scatterlist with memory pages and DMA addresses
> (sgl entry), as well as the number of scatterlist entries: CPU pages
> (orig_nents entry) and DMA mapped pages (nents entry).
>
> It turned out that it was a common mistake to misuse nents and orig_nents
> entries, calling DMA-mapping functions with a wrong number of entries or
> ignoring the number of mapped entries returned by the dma_map_sg
> function.
>
> To avoid such issues, lets introduce a common wrappers operating directly

Nit: "let's"

> on the struct sg_table objects, which take care of the proper use of
> the nents and orig_nents entries.

A few more documentation nitpicks below, but either way the
implementation itself (modulo Christoph's fixup) looks good;

Reviewed-by: Robin Murphy <[email protected]>

> Signed-off-by: Marek Szyprowski <[email protected]>
> ---
> For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
> vs. orig_nents misuse' thread:
> https://lore.kernel.org/dri-devel/[email protected]/T/
> ---
> include/linux/dma-mapping.h | 79 +++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 79 insertions(+)
>
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index b43116a..88f01cc 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -609,6 +609,85 @@ static inline void dma_sync_single_range_for_device(struct device *dev,
> return dma_sync_single_for_device(dev, addr + offset, size, dir);
> }
>
> +/**
> + * dma_map_sgtable - Map the given buffer for the DMA operations

Either "for DMA operations", "for the DMA operation", or "for a DMA
operation", depending on the exact context. Or at that point, perhaps
just "for DMA".

> + * @dev: The device to perform a DMA operation

That doesn't quite parse, maybe "the device performing the DMA
operation", or "the device for which to perform the DMA operation",
depending on whether "DMA operation" means the mapping or the actual
hardware access?

> + * @sgt: The sg_table object describing the buffer
> + * @dir: DMA direction
> + * @attrs: Optional DMA attributes for the map operation
> + *
> + * Maps a buffer described by a scatterlist stored in the given sg_table
> + * object for the @dir DMA operation by the @dev device. After success
> + * the ownership for the buffer is transferred to the DMA domain. One has
> + * to call dma_sync_sgtable_for_cpu() or dma_unmap_sgtable() to move the
> + * ownership of the buffer back to the CPU domain before touching the
> + * buffer by the CPU.
> + * Returns 0 on success or -EINVAL on error during mapping the buffer.

Maybe make that a proper "Return:" section?

> + */
> +static inline int dma_map_sgtable(struct device *dev, struct sg_table *sgt,
> + enum dma_data_direction dir, unsigned long attrs)
> +{
> + int n = dma_map_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);
> +
> + if (n > 0) {
> + sgt->nents = n;
> + return 0;
> + }
> + return -EINVAL;
> +}
> +
> +/**
> + * dma_unmap_sgtable - Unmap the given buffer for the DMA operations
> + * @dev: The device to perform a DMA operation

Same two points as before.

> + * @sgt: The sg_table object describing the buffer
> + * @dir: DMA direction
> + * @attrs: Optional DMA attributes for the map operation

Presumably "the unmap operation", although it *is* true that some
attributes are expected to match those originally passed to
dma_map_sgtable()... not sure if kerneldoc can can stretch to that level
of detail concisely ;)

> + *
> + * Unmaps a buffer described by a scatterlist stored in the given sg_table
> + * object for the @dir DMA operation by the @dev device. After this function
> + * the ownership of the buffer is transferred back to the CPU domain.
> + */
> +static inline void dma_unmap_sgtable(struct device *dev, struct sg_table *sgt,
> + enum dma_data_direction dir, unsigned long attrs)
> +{
> + dma_unmap_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);
> +}
> +
> +/**
> + * dma_sync_sgtable_for_cpu - Synchronize the given buffer for the CPU access

s/the CPU/CPU/

> + * @dev: The device to perform a DMA operation

As before.

> + * @sgt: The sg_table object describing the buffer
> + * @dir: DMA direction
> + *
> + * Performs the needed cache synchronization and moves the ownership of the
> + * buffer back to the CPU domain, so it is safe to perform any access to it
> + * by the CPU. Before doing any further DMA operations, one has to transfer
> + * the ownership of the buffer back to the DMA domain by calling the
> + * dma_sync_sgtable_for_device().
> + */
> +static inline void dma_sync_sgtable_for_cpu(struct device *dev,
> + struct sg_table *sgt, enum dma_data_direction dir)
> +{
> + dma_sync_sg_for_cpu(dev, sgt->sgl, sgt->orig_nents, dir);
> +}
> +
> +/**
> + * dma_sync_sgtable_for_device - Synchronize the given buffer for the DMA

That one doesn't even

> + * @dev: The device to perform a DMA operation

As before.

But of course, many thanks for taking the effort to add such complete
documentation in the first place :)

Cheers,
Robin.

> + * @sgt: The sg_table object describing the buffer
> + * @dir: DMA direction
> + *
> + * Performs the needed cache synchronization and moves the ownership of the
> + * buffer back to the DMA domain, so it is safe to perform the DMA operation.
> + * Once finished, one has to call dma_sync_sgtable_for_cpu() or
> + * dma_unmap_sgtable().
> + */
> +static inline void dma_sync_sgtable_for_device(struct device *dev,
> + struct sg_table *sgt, enum dma_data_direction dir)
> +{
> + dma_sync_sg_for_device(dev, sgt->sgl, sgt->orig_nents, dir);
> +}
> +
> #define dma_map_single(d, a, s, r) dma_map_single_attrs(d, a, s, r, 0)
> #define dma_unmap_single(d, a, s, r) dma_unmap_single_attrs(d, a, s, r, 0)
> #define dma_map_sg(d, s, n, r) dma_map_sg_attrs(d, s, n, r, 0)
>