2020-04-28 13:24:42

by Marek Szyprowski

[permalink] [raw]
Subject: [RFC 00/17] DRM: fix struct sg_table nents vs. orig_nents misuse

Dear All,

During the Exynos DRM GEM rework and fixing the issues in the
drm_prime_sg_to_page_addr_arrays() function [1] I've noticed that most
drivers in DRM framework incorrectly use nents and orig_nents entries of
the struct sg_table.

In case of the most DMA-mapping implementations exchanging those two
entries or using nents for all loops on the scatterlist is harmless,
because they both have the same value. There exists however a DMA-mapping
implementations, for which such incorrect usage breaks things. The nents
returned by dma_map_sg() might be lower than the nents passed as its
parameter and this is perfectly fine. DMA framework or IOMMU is allowed
to join consecutive chunks while mapping if such operation is supported
by the underlying HW (bus, bridge, IOMMU, etc). Example of the case
where dma_map_sg() might return 1 'DMA' chunk for the 4 'physical' pages
is described here [2]

The DMA-mapping framework documentation [3] states that dma_map_sg()
returns the numer of the created entries in the DMA address space.
However the subsequent calls to dma_sync_sg_for_{device,cpu} and
dma_unmap_sg must be called with the original number of entries passed to
dma_map_sg. The common pattern in DRM drivers were to assign the
dma_map_sg() return value to sg_table->nents and use that value for
the subsequent calls to dma_sync_sg_* or dma_unmap_sg functions. Also
the code iterated over nents times to access the pages stored in the
processed scatterlist, while it should use orig_nents as the numer of
the page entries.

I've tried to identify all such incorrect usage of sg_table->nents in the
DRM subsystem and this is a result of my research. It looks that the
incorrect pattern has been copied over the many drivers. Too bad in most
cases it even worked correctly if the system used simple, linear
DMA-mapping implementation, for which swapping nents and orig_nents
doesn't make any difference.

I wonder what to do to avoid such misuse in the future, as this case
clearly shows that the current sg_table structure is a bit hard to
understand. I have the following ideas and I would like to know your
opinion before I prepare more patches and check sg_table usage all over
the kernel:

1. introduce a dma_{map,sync,unmap}_sgtable() wrappers, which will use
a proper sg_table entries and call respective DMA-mapping functions
and adapt current code to it

2. rename nents and orig_nents to nr_pages, nr_dmas to clearly state
which one refers to which part of the scatterlist; I'm open for
other names for those entries

I will appreciate your comments.

Patches are based on top of Linux next-20200428. I've reduced the
recipients list mainly to mailing lists, the next version I will try to
send to the maintainers of the respective drivers.

Best regards,
Marek Szyprowski


References:

[1] https://lkml.org/lkml/2020/3/27/555
[2] https://lkml.org/lkml/2020/3/29/65
[3] Documentation/DMA-API-HOWTO.txt


Patch summary:

Marek Szyprowski (17):
drm: core: fix sg_table nents vs. orig_nents usage
drm: amdgpu: fix sg_table nents vs. orig_nents usage
drm: armada: fix sg_table nents vs. orig_nents usage
drm: etnaviv: fix sg_table nents vs. orig_nents usage
drm: exynos: fix sg_table nents vs. orig_nents usage
drm: i915: fix sg_table nents vs. orig_nents usage
drm: lima: fix sg_table nents vs. orig_nents usage
drm: msm: fix sg_table nents vs. orig_nents usage
drm: panfrost: fix sg_table nents vs. orig_nents usage
drm: radeon: fix sg_table nents vs. orig_nents usage
drm: rockchip: fix sg_table nents vs. orig_nents usage
drm: tegra: fix sg_table nents vs. orig_nents usage
drm: virtio: fix sg_table nents vs. orig_nents usage
drm: vmwgfx: fix sg_table nents vs. orig_nents usage
drm: xen: fix sg_table nents vs. orig_nents usage
drm: host1x: fix sg_table nents vs. orig_nents usage
dmabuf: fix sg_table nents vs. orig_nents usage

drivers/dma-buf/heaps/heap-helpers.c | 7 ++++---
drivers/dma-buf/udmabuf.c | 5 +++--
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 7 ++++---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 ++++----
drivers/gpu/drm/armada/armada_gem.c | 14 ++++++++-----
drivers/gpu/drm/drm_cache.c | 2 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 7 ++++---
drivers/gpu/drm/drm_prime.c | 9 +++++----
drivers/gpu/drm/etnaviv/etnaviv_gem.c | 10 ++++++----
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 7 ++++---
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 13 ++++++------
drivers/gpu/drm/i915/gem/i915_gem_internal.c | 4 ++--
drivers/gpu/drm/i915/gem/i915_gem_region.c | 4 ++--
drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 5 +++--
drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 10 +++++-----
drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c | 5 +++--
drivers/gpu/drm/i915/gt/intel_ggtt.c | 12 ++++++------
drivers/gpu/drm/i915/i915_gem_gtt.c | 12 +++++++-----
drivers/gpu/drm/i915/i915_scatterlist.c | 4 ++--
drivers/gpu/drm/i915/selftests/scatterlist.c | 8 ++++----
drivers/gpu/drm/lima/lima_gem.c | 4 ++--
drivers/gpu/drm/msm/msm_gem.c | 8 ++++----
drivers/gpu/drm/msm/msm_iommu.c | 3 ++-
drivers/gpu/drm/panfrost/panfrost_gem.c | 3 ++-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 4 +++-
drivers/gpu/drm/radeon/radeon_ttm.c | 10 +++++-----
drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 15 +++++++-------
drivers/gpu/drm/tegra/gem.c | 25 ++++++++++++------------
drivers/gpu/drm/tegra/plane.c | 13 ++++++------
drivers/gpu/drm/virtio/virtgpu_object.c | 11 ++++++-----
drivers/gpu/drm/virtio/virtgpu_vq.c | 8 ++++----
drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 6 +++---
drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +-
drivers/gpu/host1x/job.c | 17 ++++++++--------
34 files changed, 154 insertions(+), 128 deletions(-)

--
1.9.1


2020-04-28 13:24:52

by Marek Szyprowski

[permalink] [raw]
Subject: [RFC 05/17] drm: exynos: fix sg_table nents vs. orig_nents misuse

The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
numer of the created entries in the DMA address space. However the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski <[email protected]>
---
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
index fcee33a..e27715c 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
@@ -396,7 +396,7 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,

out:
dma_unmap_sg(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt->sgl,
- g2d_userptr->sgt->nents, DMA_BIDIRECTIONAL);
+ g2d_userptr->sgt->orig_nents, DMA_BIDIRECTIONAL);

pages = frame_vector_pages(g2d_userptr->vec);
if (!IS_ERR(pages)) {
@@ -511,8 +511,9 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct g2d_data *g2d,

g2d_userptr->sgt = sgt;

- if (!dma_map_sg(to_dma_dev(g2d->drm_dev), sgt->sgl, sgt->nents,
- DMA_BIDIRECTIONAL)) {
+ sgt->nents = dma_map_sg(to_dma_dev(g2d->drm_dev), sgt->sgl,
+ sgt->orig_nents, DMA_BIDIRECTIONAL)
+ if (!sgt->nents) {
DRM_DEV_ERROR(g2d->dev, "failed to map sgt with dma region.\n");
ret = -ENOMEM;
goto err_sg_free_table;
--
1.9.1

2020-04-28 13:25:45

by Marek Szyprowski

[permalink] [raw]
Subject: [RFC 12/17] drm: tegra: fix sg_table nents vs. orig_nents misuse

The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
numer of the created entries in the DMA address space. However the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski <[email protected]>
---
drivers/gpu/drm/tegra/gem.c | 25 +++++++++++++------------
drivers/gpu/drm/tegra/plane.c | 13 +++++++------
2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 6237681..5710ab4 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -98,8 +98,8 @@ static struct sg_table *tegra_bo_pin(struct device *dev, struct host1x_bo *bo,
* the SG table needs to be copied to avoid overwriting any
* other potential users of the original SG table.
*/
- err = sg_alloc_table_from_sg(sgt, obj->sgt->sgl, obj->sgt->nents,
- GFP_KERNEL);
+ err = sg_alloc_table_from_sg(sgt, obj->sgt->sgl,
+ obj->sgt->orig_nents, GFP_KERNEL);
if (err < 0)
goto free;
} else {
@@ -197,7 +197,7 @@ static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
bo->iova = bo->mm->start;

bo->size = iommu_map_sg(tegra->domain, bo->iova, bo->sgt->sgl,
- bo->sgt->nents, prot);
+ bo->sgt->orig_nents, prot);
if (!bo->size) {
dev_err(tegra->drm->dev, "failed to map buffer\n");
err = -ENOMEM;
@@ -264,7 +264,7 @@ static struct tegra_bo *tegra_bo_alloc_object(struct drm_device *drm,
static void tegra_bo_free(struct drm_device *drm, struct tegra_bo *bo)
{
if (bo->pages) {
- dma_unmap_sg(drm->dev, bo->sgt->sgl, bo->sgt->nents,
+ dma_unmap_sg(drm->dev, bo->sgt->sgl, bo->sgt->orig_nents,
DMA_FROM_DEVICE);
drm_gem_put_pages(&bo->gem, bo->pages, true, true);
sg_free_table(bo->sgt);
@@ -290,9 +290,9 @@ static int tegra_bo_get_pages(struct drm_device *drm, struct tegra_bo *bo)
goto put_pages;
}

- err = dma_map_sg(drm->dev, bo->sgt->sgl, bo->sgt->nents,
- DMA_FROM_DEVICE);
- if (err == 0) {
+ bo->sgt->nents = dma_map_sg(drm->dev, bo->sgt->sgl, bo->sgt->orig_nents,
+ DMA_FROM_DEVICE);
+ if (bo->sgt->nents == 0) {
err = -EFAULT;
goto free_sgt;
}
@@ -571,7 +571,8 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
goto free;
}

- if (dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir) == 0)
+ sgt->nents = dma_map_sg(attach->dev, sgt->sgl, sgt->orig_nents, dir);
+ if (sgt->nents == 0)
goto free;

return sgt;
@@ -590,7 +591,7 @@ static void tegra_gem_prime_unmap_dma_buf(struct dma_buf_attachment *attach,
struct tegra_bo *bo = to_tegra_bo(gem);

if (bo->pages)
- dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
+ dma_unmap_sg(attach->dev, sgt->sgl, sgt->orig_nents, dir);

sg_free_table(sgt);
kfree(sgt);
@@ -609,7 +610,7 @@ static int tegra_gem_prime_begin_cpu_access(struct dma_buf *buf,
struct drm_device *drm = gem->dev;

if (bo->pages)
- dma_sync_sg_for_cpu(drm->dev, bo->sgt->sgl, bo->sgt->nents,
+ dma_sync_sg_for_cpu(drm->dev, bo->sgt->sgl, bo->sgt->orig_nents,
DMA_FROM_DEVICE);

return 0;
@@ -623,8 +624,8 @@ static int tegra_gem_prime_end_cpu_access(struct dma_buf *buf,
struct drm_device *drm = gem->dev;

if (bo->pages)
- dma_sync_sg_for_device(drm->dev, bo->sgt->sgl, bo->sgt->nents,
- DMA_TO_DEVICE);
+ dma_sync_sg_for_device(drm->dev, bo->sgt->sgl,
+ bo->sgt->orig_nents, DMA_TO_DEVICE);

return 0;
}
diff --git a/drivers/gpu/drm/tegra/plane.c b/drivers/gpu/drm/tegra/plane.c
index 9ccfb56..3982bf8 100644
--- a/drivers/gpu/drm/tegra/plane.c
+++ b/drivers/gpu/drm/tegra/plane.c
@@ -130,9 +130,10 @@ static int tegra_dc_pin(struct tegra_dc *dc, struct tegra_plane_state *state)
}

if (sgt) {
- err = dma_map_sg(dc->dev, sgt->sgl, sgt->nents,
- DMA_TO_DEVICE);
- if (err == 0) {
+ sgt->nents = dma_map_sg(dc->dev, sgt->sgl,
+ sgt->orig_nents,
+ DMA_TO_DEVICE);
+ if (sgt->nents == 0) {
err = -ENOMEM;
goto unpin;
}
@@ -143,7 +144,7 @@ static int tegra_dc_pin(struct tegra_dc *dc, struct tegra_plane_state *state)
* map its SG table to a single contiguous chunk of
* I/O virtual memory.
*/
- if (err > 1) {
+ if (sgt->nents > 1) {
err = -EINVAL;
goto unpin;
}
@@ -165,7 +166,7 @@ static int tegra_dc_pin(struct tegra_dc *dc, struct tegra_plane_state *state)
struct sg_table *sgt = state->sgt[i];

if (sgt)
- dma_unmap_sg(dc->dev, sgt->sgl, sgt->nents,
+ dma_unmap_sg(dc->dev, sgt->sgl, sgt->orig_nents,
DMA_TO_DEVICE);

host1x_bo_unpin(dc->dev, &bo->base, sgt);
@@ -185,7 +186,7 @@ static void tegra_dc_unpin(struct tegra_dc *dc, struct tegra_plane_state *state)
struct sg_table *sgt = state->sgt[i];

if (sgt)
- dma_unmap_sg(dc->dev, sgt->sgl, sgt->nents,
+ dma_unmap_sg(dc->dev, sgt->sgl, sgt->orig_nents,
DMA_TO_DEVICE);

host1x_bo_unpin(dc->dev, &bo->base, sgt);
--
1.9.1

2020-04-28 13:26:25

by Marek Szyprowski

[permalink] [raw]
Subject: [RFC 09/17] drm: panfrost: fix sg_table nents vs. orig_nents misuse

The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
numer of the created entries in the DMA address space. However the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski <[email protected]>
---
drivers/gpu/drm/panfrost/panfrost_gem.c | 3 ++-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 4 +++-
2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 17b654e..22fec7c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -42,7 +42,8 @@ static void panfrost_gem_free_object(struct drm_gem_object *obj)
for (i = 0; i < n_sgt; i++) {
if (bo->sgts[i].sgl) {
dma_unmap_sg(pfdev->dev, bo->sgts[i].sgl,
- bo->sgts[i].nents, DMA_BIDIRECTIONAL);
+ bo->sgts[i].orig_nents,
+ DMA_BIDIRECTIONAL);
sg_free_table(&bo->sgts[i]);
}
}
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index ed28aeb..2d9b1f9 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -517,7 +517,9 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
if (ret)
goto err_pages;

- if (!dma_map_sg(pfdev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL)) {
+ sgt->nents = dma_map_sg(pfdev->dev, sgt->sgl, sgt->orig_nents,
+ DMA_BIDIRECTIONAL);
+ if (!sgt->nents) {
ret = -EINVAL;
goto err_map;
}
--
1.9.1

2020-04-28 14:06:38

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC 00/17] DRM: fix struct sg_table nents vs. orig_nents misuse

On Tue, Apr 28, 2020 at 03:19:48PM +0200, Marek Szyprowski wrote:
> 1. introduce a dma_{map,sync,unmap}_sgtable() wrappers, which will use
> a proper sg_table entries and call respective DMA-mapping functions
> and adapt current code to it

That sounds reasonable to me. Those could be pretty trivial wrappers.

>
>
> 2. rename nents and orig_nents to nr_pages, nr_dmas to clearly state
> which one refers to which part of the scatterlist; I'm open for
> other names for those entries

nr_cpu_ents and nr_dma_ents might be better names, but it still would be
a whole lot of churn for little gain. I think just good wrappers like
suggested above might be more helpful.

2020-04-28 15:34:09

by Daniel Vetter

[permalink] [raw]
Subject: Re: [RFC 00/17] DRM: fix struct sg_table nents vs. orig_nents misuse

On Tue, Apr 28, 2020 at 04:02:57PM +0200, Christoph Hellwig wrote:
> On Tue, Apr 28, 2020 at 03:19:48PM +0200, Marek Szyprowski wrote:
> > 1. introduce a dma_{map,sync,unmap}_sgtable() wrappers, which will use
> > a proper sg_table entries and call respective DMA-mapping functions
> > and adapt current code to it
>
> That sounds reasonable to me. Those could be pretty trivial wrappers.
>
> >
> >
> > 2. rename nents and orig_nents to nr_pages, nr_dmas to clearly state
> > which one refers to which part of the scatterlist; I'm open for
> > other names for those entries
>
> nr_cpu_ents and nr_dma_ents might be better names, but it still would be
> a whole lot of churn for little gain. I think just good wrappers like
> suggested above might be more helpful.

I guess long-term we could aim for both? I.e. roll out better wrappers
first, once that's soaked through the tree, rename the last offenders.

Personally I like nr_cpu_ents and nr_dma_ents, that's about as clear as it
gets.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2020-04-28 16:06:06

by Robin Murphy

[permalink] [raw]
Subject: Re: [RFC 00/17] DRM: fix struct sg_table nents vs. orig_nents misuse

On 2020-04-28 4:32 pm, Daniel Vetter wrote:
> On Tue, Apr 28, 2020 at 04:02:57PM +0200, Christoph Hellwig wrote:
>> On Tue, Apr 28, 2020 at 03:19:48PM +0200, Marek Szyprowski wrote:
>>> 1. introduce a dma_{map,sync,unmap}_sgtable() wrappers, which will use
>>> a proper sg_table entries and call respective DMA-mapping functions
>>> and adapt current code to it
>>
>> That sounds reasonable to me. Those could be pretty trivial wrappers.
>>
>>>
>>>
>>> 2. rename nents and orig_nents to nr_pages, nr_dmas to clearly state
>>> which one refers to which part of the scatterlist; I'm open for
>>> other names for those entries
>>
>> nr_cpu_ents and nr_dma_ents might be better names, but it still would be
>> a whole lot of churn for little gain. I think just good wrappers like
>> suggested above might be more helpful.
>
> I guess long-term we could aim for both? I.e. roll out better wrappers
> first, once that's soaked through the tree, rename the last offenders.

Yes, that's what I was thinking too - most of these uses are just
passing them in and out of the DMA APIs, and thus would be subsumed into
the wrappers anyway, then in the relatively few remaining places where
the table is actually iterated for one reason or the other, renaming
would stand to help review and maintenance in terms of making it far
more obvious when the implementation and the intent don't match.

Robin.