2020-03-25 09:09:14

by Shane Francis

Subject: [PATCH v4 0/3] AMDGPU / RADEON / DRM Fix mapping of user pages

This patch set fixes a bug in amdgpu / radeon drm that results in
a crash when dma_map_sg combines elements within a scatterlist table.

There are 2 shortfalls in the current kernel.

1) AMDGPU / RADEON assume that the number of scatterlist entries
returned by dma_map_sg equals the number requested. This may not be
the case with the newer dma-iommu implementation.

2) drm_prime does not fetch the length of each scatterlist entry
via the correct dma macro. This can result in an incorrect length
(>0) being used in places where dma_map_sg has updated the table
elements.

The sg_dma_len macro gives the length of the sg item
after dma_map_sg.
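
For illustration only (not part of the series): a minimal userspace sketch of shortfall 1. It assumes a mapping backend that coalesces every chunk into one DMA segment. struct model_sg, model_dma_map_sg() and the two check helpers are hypothetical names made up for this sketch; they are not kernel APIs and only model the return-value convention (0 means failure, anything else is the number of DMA segments used).

```c
#include <assert.h>

/* Hypothetical stand-in for struct scatterlist; not the kernel definition. */
struct model_sg {
	unsigned long phys;        /* CPU-side chunk start */
	unsigned int length;       /* CPU-side length, models sg->length */
	unsigned long dma_address; /* device-side start, set by the mapping */
	unsigned int dma_length;   /* device-side length, models sg_dma_len() */
};

/*
 * Models an IOMMU-backed mapping that places all chunks back to back in
 * device address space: the merged segment is stored in entry 0 and every
 * other entry gets a DMA length of 0. Returns the number of DMA segments
 * used, or 0 on failure.
 */
static int model_dma_map_sg(struct model_sg *sgl, int nents,
			    unsigned long iova_base)
{
	unsigned int total = 0;
	int i;

	if (nents <= 0)
		return 0;

	for (i = 0; i < nents; i++) {
		total += sgl[i].length;
		sgl[i].dma_address = 0;
		sgl[i].dma_length = 0;
	}
	sgl[0].dma_address = iova_base;
	sgl[0].dma_length = total;
	return 1;
}

/* Check used before this series: any coalescing is treated as an error. */
static int old_check_ok(int mapped, int requested)
{
	return mapped == requested;
}

/* Check used by patches 2/3 and 3/3: only a return of 0 is an error. */
static int new_check_ok(int mapped)
{
	return mapped != 0;
}
```

With three input chunks the model returns 1, so the old check rejects a mapping that actually succeeded, while the new check accepts it.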

Example Crash :
> [drm:amdgpu_ttm_backend_bind [amdgpu]] *ERROR* failed to pin userptr

This happens in OpenCL applications, causing them to crash or hang, on
either amdgpu-pro or ROCm OpenCL implementations

I have verified this fixes the above on kernel 5.5 and 5.5rc using an
AMD Vega 64 GPU

Shane Francis (3):
drm/prime: use dma length macro when mapping sg to arrays
drm/amdgpu: fix scatter-gather mapping with user pages
drm/radeon: fix scatter-gather mapping with user pages

drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
drivers/gpu/drm/drm_prime.c | 2 +-
drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)

--
2.26.0


2020-03-25 09:09:47

by Shane Francis

Subject: [PATCH v4 3/3] drm/radeon: fix scatter-gather mapping with user pages

Calls to dma_map_sg may return fewer segments / entries than requested
if they fall on page boundaries. The old implementation did not
support this use case.

Signed-off-by: Shane Francis <[email protected]>
---
drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 3b92311d30b9..b3380ffab4c2 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -528,7 +528,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)

r = -ENOMEM;
nents = dma_map_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
- if (nents != ttm->sg->nents)
+ if (nents == 0)
goto release_sg;

drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
--
2.26.0

2020-03-25 09:09:56

by Shane Francis

Subject: [PATCH v4 1/3] drm/prime: use dma length macro when mapping sg

As dma_map_sg can reorganize scatter-gather lists in a
way that can cause some later segments to be empty we should
always use the sg_dma_len macro to fetch the actual length.

This could now be 0 and not need to be mapped to a page or
address array

Signed-off-by: Shane Francis <[email protected]>
---
drivers/gpu/drm/drm_prime.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 86d9b0e45c8c..1de2cde2277c 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -967,7 +967,7 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,

index = 0;
for_each_sg(sgt->sgl, sg, sgt->nents, count) {
- len = sg->length;
+ len = sg_dma_len(sg);
page = sg_page(sg);
addr = sg_dma_address(sg);

--
2.26.0

2020-03-25 09:10:29

by Shane Francis

Subject: [PATCH v4 2/3] drm/amdgpu: fix scatter-gather mapping with user pages

Calls to dma_map_sg may return fewer segments / entries than requested
if they fall on page boundaries. The old implementation did not
support this use case.

Signed-off-by: Shane Francis <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index dee446278417..c6e9885c071f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -974,7 +974,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
/* Map SG to device */
r = -ENOMEM;
nents = dma_map_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
- if (nents != ttm->sg->nents)
+ if (nents == 0)
goto release_sg;

/* convert SG to linear array of pages and dma addresses */
--
2.26.0

2020-03-25 13:12:16

by Michael J. Ruhl

Subject: RE: [PATCH v4 0/3] AMDGPU / RADEON / DRM Fix mapping of user pages

>-----Original Message-----
>From: dri-devel <[email protected]> On Behalf Of
>Shane Francis
>Sent: Wednesday, March 25, 2020 5:08 AM
>To: [email protected]
>Cc: [email protected]; [email protected]; [email protected];
>[email protected]; [email protected];
>[email protected]
>Subject: [PATCH v4 0/3] AMDGPU / RADEON / DRM Fix mapping of user pages
>
>This patch set is to fix a bug in amdgpu / radeon drm that results in
>a crash when dma_map_sg combines elemnets within a scatterlist table.

s/elemnets/elements

>There are 2 shortfalls in the current kernel.
>
>1) AMDGPU / RADEON assumes that the requested and created scatterlist
> table lengths using from dma_map_sg are equal. This may not be the
> case using the newer dma-iommu implementation
>
>2) drm_prime does not fetch the length of the scatterlist
> via the correct dma macro, this can use the incorrect length
> being used (>0) in places where dma_map_sg has updated the table
> elements.
>
> The sg_dma_len macro is representative of the length of the sg item
> after dma_map_sg
>
>Example Crash :
>> [drm:amdgpu_ttm_backend_bind [amdgpu]] *ERROR* failed to pin userptr
>
>This happens in OpenCL applications, causing them to crash or hang, on
>either amdgpu-pro or ROCm OpenCL implementations
>
>I have verified this fixes the above on kernel 5.5 and 5.5rc using an
>AMD Vega 64 GPU
>
>Shane Francis (3):
> drm/prime: use dma length macro when mapping sg to arrays
> drm/amdgpu: fix scatter-gather mapping with user pages
> drm/radeon: fix scatter-gather mapping with user pages
>
> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
> drivers/gpu/drm/drm_prime.c | 2 +-
> drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
> 3 files changed, 3 insertions(+), 3 deletions(-)
>
>--
>2.26.0
>
>_______________________________________________
>dri-devel mailing list
>[email protected]
>https://lists.freedesktop.org/mailman/listinfo/dri-devel

2020-03-25 13:57:51

by Michael J. Ruhl

Subject: RE: [PATCH v4 1/3] drm/prime: use dma length macro when mapping sg

>-----Original Message-----
>From: dri-devel <[email protected]> On Behalf Of
>Shane Francis
>Sent: Wednesday, March 25, 2020 5:08 AM
>To: [email protected]
>Cc: [email protected]; [email protected]; [email protected];
>[email protected]; [email protected];
>[email protected]
>Subject: [PATCH v4 1/3] drm/prime: use dma length macro when mapping sg
>
>As dma_map_sg can reorganize scatter-gather lists in a
>way that can cause some later segments to be empty we should
>always use the sg_dma_len macro to fetch the actual length.
>
>This could now be 0 and not need to be mapped to a page or
>address array
>Signed-off-by: Shane Francis <[email protected]>
>---
> drivers/gpu/drm/drm_prime.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>index 86d9b0e45c8c..1de2cde2277c 100644
>--- a/drivers/gpu/drm/drm_prime.c
>+++ b/drivers/gpu/drm/drm_prime.c
>@@ -967,7 +967,7 @@ int drm_prime_sg_to_page_addr_arrays(struct
>sg_table *sgt, struct page **pages,
>
> index = 0;
> for_each_sg(sgt->sgl, sg, sgt->nents, count) {
>- len = sg->length;
>+ len = sg_dma_len(sg);
> page = sg_page(sg);
> addr = sg_dma_address(sg);

This looks correct to me.

Reviewed-by: Michael J. Ruhl <[email protected]>

M

>
>--
>2.26.0
>

2020-03-25 14:01:13

by Michael J. Ruhl

Subject: RE: [PATCH v4 2/3] drm/amdgpu: fix scatter-gather mapping with user pages

>-----Original Message-----
>From: dri-devel <[email protected]> On Behalf Of
>Shane Francis
>Sent: Wednesday, March 25, 2020 5:08 AM
>To: [email protected]
>Cc: [email protected]; [email protected]; [email protected];
>[email protected]; [email protected];
>[email protected]
>Subject: [PATCH v4 2/3] drm/amdgpu: fix scatter-gather mapping with user
>pages
>
>Calls to dma_map_sg may return segments / entries than requested

"may return less segments/entries" ?
^^^
>if they fall on page bounderies. The old implementation did not
>support this use case.
>
>Signed-off-by: Shane Francis <[email protected]>
>---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>index dee446278417..c6e9885c071f 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>@@ -974,7 +974,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt
>*ttm)
> /* Map SG to device */
> r = -ENOMEM;
> nents = dma_map_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents,
>direction);
>- if (nents != ttm->sg->nents)
>+ if (nents == 0)
> goto release_sg;

this looks correct to me.

Reviewed-by: Michael J. Ruhl <[email protected]>

> /* convert SG to linear array of pages and dma addresses */
>--
>2.26.0
>

2020-03-25 14:21:05

by Michael J. Ruhl

Subject: RE: [PATCH v4 3/3] drm/radeon: fix scatter-gather mapping with user pages

>-----Original Message-----
>From: dri-devel <[email protected]> On Behalf Of
>Shane Francis
>Sent: Wednesday, March 25, 2020 5:08 AM
>To: [email protected]
>Cc: [email protected]; [email protected]; [email protected];
>[email protected]; [email protected];
>[email protected]
>Subject: [PATCH v4 3/3] drm/radeon: fix scatter-gather mapping with user
>pages
>
>Calls to dma_map_sg may return segments / entries than requested

"may return less segment..." ?
^^^

>if they fall on page bounderies. The old implementation did not
>support this use case.
>
>Signed-off-by: Shane Francis <[email protected]>
>---
> drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c
>b/drivers/gpu/drm/radeon/radeon_ttm.c
>index 3b92311d30b9..b3380ffab4c2 100644
>--- a/drivers/gpu/drm/radeon/radeon_ttm.c
>+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
>@@ -528,7 +528,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt
>*ttm)
>
> r = -ENOMEM;
> nents = dma_map_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents,
>direction);
>- if (nents != ttm->sg->nents)
>+ if (nents == 0)
> goto release_sg;

This looks correct to me.

Reviewed-by: Michael J. Ruhl <[email protected]>

M
> drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
>--
>2.26.0
>

2020-03-25 15:56:34

by Shane Francis

Subject: Re: [PATCH v4 3/3] drm/radeon: fix scatter-gather mapping with user pages

> >-----Original Message-----
> >From: dri-devel <[email protected]> On Behalf Of
> >Shane Francis
> >Sent: Wednesday, March 25, 2020 5:08 AM
> >To: [email protected]
> >Cc: [email protected]; [email protected]; [email protected];
> >[email protected]; [email protected];
> >[email protected]
> >Subject: [PATCH v4 3/3] drm/radeon: fix scatter-gather mapping with user
> >pages
> >
> >Calls to dma_map_sg may return segments / entries than requested
>
> "may return less segment..." ?
> ^^^

I will reword / fix the highlighted issues with the text and send an updated
patch set later today.

>
> >if they fall on page bounderies. The old implementation did not
> >support this use case.
> >
> >Signed-off-by: Shane Francis <[email protected]>
> >---
> > drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c
> >b/drivers/gpu/drm/radeon/radeon_ttm.c
> >index 3b92311d30b9..b3380ffab4c2 100644
> >--- a/drivers/gpu/drm/radeon/radeon_ttm.c
> >+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> >@@ -528,7 +528,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt
> >*ttm)
> >
> > r = -ENOMEM;
> > nents = dma_map_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents,
> >direction);
> >- if (nents != ttm->sg->nents)
> >+ if (nents == 0)
> > goto release_sg;
>
> This looks correct to me.
>
> Reviewed-by: Michael J. Ruhl <[email protected]>
>
> M
> > drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
> >--
> >2.26.0
> >

2020-03-25 16:09:09

by Alex Deucher

Subject: Re: [PATCH v4 3/3] drm/radeon: fix scatter-gather mapping with user pages

On Wed, Mar 25, 2020 at 11:54 AM Shane Francis <[email protected]> wrote:
>
> > >-----Original Message-----
> > >From: dri-devel <[email protected]> On Behalf Of
> > >Shane Francis
> > >Sent: Wednesday, March 25, 2020 5:08 AM
> > >To: [email protected]
> > >Cc: [email protected]; [email protected]; [email protected];
> > >[email protected]; [email protected];
> > >[email protected]
> > >Subject: [PATCH v4 3/3] drm/radeon: fix scatter-gather mapping with user
> > >pages
> > >
> > >Calls to dma_map_sg may return segments / entries than requested
> >
> > "may return less segment..." ?
> > ^^^
>
> I will reword / fix the highlighted issues with the text and send a updated
> patch set later today.

I'll fix it up locally when I apply it. Thanks!

Alex

>
> >
> > >if they fall on page bounderies. The old implementation did not
> > >support this use case.
> > >
> > >Signed-off-by: Shane Francis <[email protected]>
> > >---
> > > drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > >diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c
> > >b/drivers/gpu/drm/radeon/radeon_ttm.c
> > >index 3b92311d30b9..b3380ffab4c2 100644
> > >--- a/drivers/gpu/drm/radeon/radeon_ttm.c
> > >+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> > >@@ -528,7 +528,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt
> > >*ttm)
> > >
> > > r = -ENOMEM;
> > > nents = dma_map_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents,
> > >direction);
> > >- if (nents != ttm->sg->nents)
> > >+ if (nents == 0)
> > > goto release_sg;
> >
> > This looks correct to me.
> >
> > Reviewed-by: Michael J. Ruhl <[email protected]>
> >
> > M
> > > drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
> > >--
> > >2.26.0
> > >

2020-03-27 07:57:00

by Marek Szyprowski

Subject: Re: [v4,1/3] drm/prime: use dma length macro when mapping sg

Hi All,

On 2020-03-25 10:07, Shane Francis wrote:
> As dma_map_sg can reorganize scatter-gather lists in a
> way that can cause some later segments to be empty we should
> always use the sg_dma_len macro to fetch the actual length.
>
> This could now be 0 and not need to be mapped to a page or
> address array
>
> Signed-off-by: Shane Francis <[email protected]>
> Reviewed-by: Michael J. Ruhl <[email protected]>
This patch landed in linux-next 20200326 and it causes a kernel panic on
various Exynos SoC based boards.
> ---
> drivers/gpu/drm/drm_prime.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 86d9b0e45c8c..1de2cde2277c 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -967,7 +967,7 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
>
> index = 0;
> for_each_sg(sgt->sgl, sg, sgt->nents, count) {
> - len = sg->length;
> + len = sg_dma_len(sg);
> page = sg_page(sg);
> addr = sg_dma_address(sg);
>

Sorry, but this code is wrong :(

The scatterlist elements (sg) describe memory chunks both in physical memory
and in the DMA (IO virtual) space. However, in general you cannot assume a
1:1 mapping between them. If you access sg_page(sg) (basically
sg->page), you must match it with sg->length. When you access
sg_dma_address(sg) (again, in most cases it is sg->dma_address), then
you must match it with sg_dma_len(sg). The sg->dma_address might not be
the dma address of the sg->page.

In some cases (when an IOMMU is available and performs aggregation of the
scatterlist chunks, plus a few other minor requirements), the whole
scatterlist might be mapped into contiguous DMA address space and recorded
only in the first sg element.
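
For illustration only: a minimal userspace sketch of what goes wrong when one length drives both walks, assuming the fully merged case described above. struct model_sg and walk_with_dma_len_only() are hypothetical names for this sketch, page frame numbers stand in for struct page pointers, and all lengths are assumed to be multiples of the page size.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for a mapped scatterlist entry. */
struct model_sg {
	unsigned long pfn;         /* first page frame of the CPU-side chunk */
	unsigned int length;       /* CPU-side length in bytes */
	unsigned long dma_address; /* device-side start */
	unsigned int dma_length;   /* device-side length in bytes */
};

/*
 * Walks the list with a single length (the DMA length) for both the page
 * array and the address array. Returns the number of entries written.
 */
static int walk_with_dma_len_only(const struct model_sg *sgl, int nents,
				  unsigned long *pfns, unsigned long *addrs)
{
	int i, index = 0;

	for (i = 0; i < nents; i++) {
		unsigned int len = sgl[i].dma_length;
		unsigned long pfn = sgl[i].pfn;
		unsigned long addr = sgl[i].dma_address;

		while (len > 0) {
			pfns[index] = pfn++;   /* assumes pages are contiguous */
			addrs[index] = addr;
			addr += MODEL_PAGE_SIZE;
			len -= MODEL_PAGE_SIZE;
			index++;
		}
	}
	return index;
}
```

For two one-page chunks at page frames 100 and 500 that were merged into a single 8192-byte DMA segment, this walk yields the right addresses but records page frames 100 and 101; the page at frame 500 is never recorded.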

The proper way to iterate over a scatterlist to get both the pages and
the DMA addresses assigned to them is:

int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
                                     dma_addr_t *addrs, int max_entries)
{
        unsigned count;
        struct scatterlist *sg;
        struct page *page;
        u32 page_len, page_index;
        dma_addr_t addr;
        u32 dma_len, dma_index;

        page_index = 0;
        dma_index = 0;
        for_each_sg(sgt->sgl, sg, sgt->nents, count) {
                page_len = sg->length;
                page = sg_page(sg);
                dma_len = sg_dma_len(sg);
                addr = sg_dma_address(sg);

                while (pages && page_len > 0) {
                        if (WARN_ON(page_index >= max_entries))
                                return -1;
                        pages[page_index] = page;
                        page++;
                        page_len -= PAGE_SIZE;
                        page_index++;
                }

                while (addrs && dma_len > 0) {
                        if (WARN_ON(dma_index >= max_entries))
                                return -1;
                        addrs[dma_index] = addr;
                        addr += PAGE_SIZE;
                        dma_len -= PAGE_SIZE;
                        dma_index++;
                }
        }

        return 0;
}
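
For illustration only: a userspace model of the two independent walks used in the function above, so the behaviour can be checked outside the kernel. struct model_sg and walk_split() are hypothetical names for this sketch, page frame numbers stand in for struct page pointers, and all lengths are assumed to be multiples of the page size (an unaligned length would make the unsigned subtraction wrap).

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for a mapped scatterlist entry. */
struct model_sg {
	unsigned long pfn;         /* first page frame of the CPU-side chunk */
	unsigned int length;       /* CPU-side length in bytes */
	unsigned long dma_address; /* device-side start */
	unsigned int dma_length;   /* device-side length in bytes */
};

/*
 * Fills pfns[] from (pfn, length) and addrs[] from (dma_address,
 * dma_length), each with its own index. Either array may be NULL.
 * Returns 0 on success, -1 if an array would overflow max_entries.
 */
static int walk_split(const struct model_sg *sgl, int nents,
		      unsigned long *pfns, unsigned long *addrs,
		      int max_entries)
{
	int i, page_index = 0, dma_index = 0;

	for (i = 0; i < nents; i++) {
		unsigned int page_len = sgl[i].length;
		unsigned int dma_len = sgl[i].dma_length;
		unsigned long pfn = sgl[i].pfn;
		unsigned long addr = sgl[i].dma_address;

		/* CPU side: pair the page with the CPU-side length. */
		while (pfns && page_len > 0) {
			if (page_index >= max_entries)
				return -1;
			pfns[page_index++] = pfn++;
			page_len -= MODEL_PAGE_SIZE;
		}

		/* Device side: pair the DMA address with the DMA length. */
		while (addrs && dma_len > 0) {
			if (dma_index >= max_entries)
				return -1;
			addrs[dma_index++] = addr;
			addr += MODEL_PAGE_SIZE;
			dma_len -= MODEL_PAGE_SIZE;
		}
	}
	return 0;
}
```

For two one-page chunks at page frames 100 and 500 merged into one 8192-byte DMA segment, this records page frames 100 and 500 and two consecutive device addresses.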

I will send a patch in a few minutes with the above fixed code.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2020-03-27 08:14:09

by Christian König

Subject: Re: [v4,1/3] drm/prime: use dma length macro when mapping sg

Am 27.03.20 um 08:54 schrieb Marek Szyprowski:
> Hi All,
>
> On 2020-03-25 10:07, Shane Francis wrote:
>> As dma_map_sg can reorganize scatter-gather lists in a
>> way that can cause some later segments to be empty we should
>> always use the sg_dma_len macro to fetch the actual length.
>>
>> This could now be 0 and not need to be mapped to a page or
>> address array
>>
>> Signed-off-by: Shane Francis <[email protected]>
>> Reviewed-by: Michael J. Ruhl <[email protected]>
> This patch landed in linux-next 20200326 and it causes a kernel panic on
> various Exynos SoC based boards.
>> ---
>> drivers/gpu/drm/drm_prime.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>> index 86d9b0e45c8c..1de2cde2277c 100644
>> --- a/drivers/gpu/drm/drm_prime.c
>> +++ b/drivers/gpu/drm/drm_prime.c
>> @@ -967,7 +967,7 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
>>
>> index = 0;
>> for_each_sg(sgt->sgl, sg, sgt->nents, count) {
>> - len = sg->length;
>> + len = sg_dma_len(sg);
>> page = sg_page(sg);
>> addr = sg_dma_address(sg);
>>
> Sorry, but this code is wrong :(

Well it is at least better than before because it makes most drivers
work correctly again.

See, we only fill the pages array because some drivers (like Exynos) are
still buggy and require this.

Accessing the pages of a DMA-buf imported sg_table is illegal and
should be fixed in the drivers.

> [SNIP]
>
> I will send a patch in a few minutes with the above fixed code.

That is certainly a good idea for now, but could we also put an item
on somebody's todo list to fix Exynos?

Thanks in advance,
Christian.

>
> Best regards

2020-03-27 08:29:04

by Marek Szyprowski

Subject: Re: [v4,1/3] drm/prime: use dma length macro when mapping sg

On 2020-03-27 08:54, Marek Szyprowski wrote:
> On 2020-03-25 10:07, Shane Francis wrote:
>> As dma_map_sg can reorganize scatter-gather lists in a
>> way that can cause some later segments to be empty we should
>> always use the sg_dma_len macro to fetch the actual length.
>>
>> This could now be 0 and not need to be mapped to a page or
>> address array
>>
>> Signed-off-by: Shane Francis <[email protected]>
>> Reviewed-by: Michael J. Ruhl <[email protected]>
> This patch landed in linux-next 20200326 and it causes a kernel panic
> on various Exynos SoC based boards.
>> ---
>>   drivers/gpu/drm/drm_prime.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>> index 86d9b0e45c8c..1de2cde2277c 100644
>> --- a/drivers/gpu/drm/drm_prime.c
>> +++ b/drivers/gpu/drm/drm_prime.c
>> @@ -967,7 +967,7 @@ int drm_prime_sg_to_page_addr_arrays(struct
>> sg_table *sgt, struct page **pages,
>>         index = 0;
>>       for_each_sg(sgt->sgl, sg, sgt->nents, count) {
>> -        len = sg->length;
>> +        len = sg_dma_len(sg);
>>           page = sg_page(sg);
>>           addr = sg_dma_address(sg);
>
> Sorry, but this code is wrong :(
>
> The scatterlist elements (sg) describes memory chunks in physical
> memory and in the DMA (IO virtual) space. However in general, you
> cannot assume 1:1 mapping between them. If you access sg_page(sg)
> (basically sg->page), you must match it with sg->length. When you
> access sg_dma_address(sg) (again, in most cases it is
> sg->dma_address), then you must match it with sg_dma_len(sg). The
> sg->dma_address might not be the dma address of the sg->page.
>
> In some cases (when IOMMU is available, it performs aggregation of the
> scatterlist chunks and a few other, minor requirements), the whole
> scatterlist might be mapped into contiguous DMA address space and
> filled only to the first sg element.
>
> The proper way to iterate over a scatterlists to get both the pages
> and the DMA addresses assigned to them is:
>
> int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page
> **pages,
>                                      dma_addr_t *addrs, int max_entries)
> {
>         unsigned count;
>         struct scatterlist *sg;
>         struct page *page;
>         u32 page_len, page_index;
>         dma_addr_t addr;
>         u32 dma_len, dma_index;
>
>         page_index = 0;
>         dma_index = 0;
>         for_each_sg(sgt->sgl, sg, sgt->nents, count) {
>                 page_len = sg->length;
>                 page = sg_page(sg);
>                 dma_len = sg_dma_len(sg);
>                 addr = sg_dma_address(sg);
>
>                 while (pages && page_len > 0) {
>                         if (WARN_ON(page_index >= max_entries))
>                                 return -1;
>                         pages[page_index] = page;
>                         page++;
>                         page_len -= PAGE_SIZE;
>                         page_index++;
>                 }
>
>                 while (addrs && dma_len > 0) {
>                         if (WARN_ON(dma_index >= max_entries))
>                                 return -1;
>                         addrs[dma_index] = addr;
>                         addr += PAGE_SIZE;
>                         dma_len -= PAGE_SIZE;
>                         dma_index++;
>                 }
>         }
>
>         return 0;
> }
>
> I will send a patch in a few minutes with the above fixed code.

Here is the fix: https://patchwork.freedesktop.org/patch/359081/

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2020-03-27 09:11:45

by Marek Szyprowski

Subject: Re: [v4,1/3] drm/prime: use dma length macro when mapping sg

Hi Christian,

On 2020-03-27 09:11, Christian König wrote:
> Am 27.03.20 um 08:54 schrieb Marek Szyprowski:
>> On 2020-03-25 10:07, Shane Francis wrote:
>>> As dma_map_sg can reorganize scatter-gather lists in a
>>> way that can cause some later segments to be empty we should
>>> always use the sg_dma_len macro to fetch the actual length.
>>>
>>> This could now be 0 and not need to be mapped to a page or
>>> address array
>>>
>>> Signed-off-by: Shane Francis <[email protected]>
>>> Reviewed-by: Michael J. Ruhl <[email protected]>
>> This patch landed in linux-next 20200326 and it causes a kernel panic on
>> various Exynos SoC based boards.
>>> ---
>>>    drivers/gpu/drm/drm_prime.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>>> index 86d9b0e45c8c..1de2cde2277c 100644
>>> --- a/drivers/gpu/drm/drm_prime.c
>>> +++ b/drivers/gpu/drm/drm_prime.c
>>> @@ -967,7 +967,7 @@ int drm_prime_sg_to_page_addr_arrays(struct
>>> sg_table *sgt, struct page **pages,
>>>           index = 0;
>>>        for_each_sg(sgt->sgl, sg, sgt->nents, count) {
>>> -        len = sg->length;
>>> +        len = sg_dma_len(sg);
>>>            page = sg_page(sg);
>>>            addr = sg_dma_address(sg);
>> Sorry, but this code is wrong :(
>
> Well it is at least better than before because it makes most drivers
> work correctly again.

Well, I'm not sure that a half-broken fix should be considered as a fix ;)

Anyway, I just got the comment from Shane, that my patch is fixing the
issues with amdgpu and radeon, while still working fine for exynos, so
it is indeed a proper fix.

> See we only fill the pages array because some drivers (like Exynos)
> are still buggy and require this.

The Exynos driver uses this pages array internally.

>
> Accessing the pages of an DMA-buf imported sg_table is illegal and
> should be fixed in the drivers.

True, but in the meantime we should avoid breaking stuff which worked fine
for ages.

>
>> [SNIP]
>>
>> I will send a patch in a few minutes with the above fixed code.
>
> That is certainly a good idea for now, but could we also put on
> somebodies todo list an item to fix Exynos?

Yes, I can take a look into removing the use of the internal pages
array. It is used mainly for implementing vmap for fbdev emulation, but
this can be handled in a different way.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2020-03-30 08:19:02

by Marek Szyprowski

Subject: Re: [v4,1/3] drm/prime: use dma length macro when mapping sg

Hi

On 2020-03-27 10:10, Marek Szyprowski wrote:
> Hi Christian,
>
> On 2020-03-27 09:11, Christian König wrote:
>> Am 27.03.20 um 08:54 schrieb Marek Szyprowski:
>>> On 2020-03-25 10:07, Shane Francis wrote:
>>>> As dma_map_sg can reorganize scatter-gather lists in a
>>>> way that can cause some later segments to be empty we should
>>>> always use the sg_dma_len macro to fetch the actual length.
>>>>
>>>> This could now be 0 and not need to be mapped to a page or
>>>> address array
>>>>
>>>> Signed-off-by: Shane Francis <[email protected]>
>>>> Reviewed-by: Michael J. Ruhl <[email protected]>
>>> This patch landed in linux-next 20200326 and it causes a kernel
>>> panic on
>>> various Exynos SoC based boards.
>>>> ---
>>>>    drivers/gpu/drm/drm_prime.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>>>> index 86d9b0e45c8c..1de2cde2277c 100644
>>>> --- a/drivers/gpu/drm/drm_prime.c
>>>> +++ b/drivers/gpu/drm/drm_prime.c
>>>> @@ -967,7 +967,7 @@ int drm_prime_sg_to_page_addr_arrays(struct
>>>> sg_table *sgt, struct page **pages,
>>>>           index = 0;
>>>>        for_each_sg(sgt->sgl, sg, sgt->nents, count) {
>>>> -        len = sg->length;
>>>> +        len = sg_dma_len(sg);
>>>>            page = sg_page(sg);
>>>>            addr = sg_dma_address(sg);
>>> Sorry, but this code is wrong :(
>>
>> Well it is at least better than before because it makes most drivers
>> work correctly again.
>
> Well, I'm not sure that a half-broken fix should be considered as a
> fix ;)
>
> Anyway, I just got the comment from Shane, that my patch is fixing the
> issues with amdgpu and radeon, while still working fine for exynos, so
> it is indeed a proper fix.

Today I've noticed that this patch went to final v5.6 without even a day
of testing in linux-next, so v5.6 is broken on Exynos and probably a few
other ARM archs, which rely on the drm_prime_sg_to_page_addr_arrays
function.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2020-03-30 09:41:43

by Shane Francis

Subject: Re: [v4,1/3] drm/prime: use dma length macro when mapping sg

On Mon, Mar 30, 2020 at 9:18 AM Marek Szyprowski
<[email protected]> wrote:
> Today I've noticed that this patch went to final v5.6 without even a day
> of testing in linux-next, so v5.6 is broken on Exynos and probably a few
> other ARM archs, which rely on the drm_prime_sg_to_page_addr_arrays
> function.
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>

Not sure what the full merge pipeline is here, but my original patch
was not sent to the stable mailing list, so I would assume that it
went through the normal release gates / checks? If there was a fault
in the way I submitted the patches I do apologise; however, I did follow
the normal guidelines as far as I understood them.

Personally I did validate this patch on systems with Intel and AMD GFX,
unfortunately I do not have any ARM based dev kits that are able to run
mainline (was never able to get an Nvidia TK1 to boot outside of L4T)

2020-03-30 12:34:20

by Shane Francis

Subject: Re: [v4,1/3] drm/prime: use dma length macro when mapping sg

On Mon, Mar 30, 2020 at 9:18 AM Marek Szyprowski
<[email protected]> wrote:
> Today I've noticed that this patch went to final v5.6 without even a day
> of testing in linux-next, so v5.6 is broken on Exynos and probably a few
> other ARM archs, which rely on the drm_prime_sg_to_page_addr_arrays
> function.
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>

2020-03-30 12:46:49

by Shane Francis

Subject: Re: [v4,1/3] drm/prime: use dma length macro when mapping sg

On Mon, Mar 30, 2020 at 9:18 AM Marek Szyprowski
<[email protected]> wrote:

> Today I've noticed that this patch went to final v5.6 without even a day
> of testing in linux-next, so v5.6 is broken on Exynos and probably a few
> other ARM archs, which rely on the drm_prime_sg_to_page_addr_arrays
> function.
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>

FYI These changes are now in the 5.5-stable queue.

2020-03-30 13:59:22

by Alex Deucher

Subject: Re: [v4,1/3] drm/prime: use dma length macro when mapping sg

On Mon, Mar 30, 2020 at 4:18 AM Marek Szyprowski
<[email protected]> wrote:
>
> Hi
>
> On 2020-03-27 10:10, Marek Szyprowski wrote:
> > Hi Christian,
> >
> > On 2020-03-27 09:11, Christian König wrote:
> >> Am 27.03.20 um 08:54 schrieb Marek Szyprowski:
> >>> On 2020-03-25 10:07, Shane Francis wrote:
> >>>> As dma_map_sg can reorganize scatter-gather lists in a
> >>>> way that can cause some later segments to be empty we should
> >>>> always use the sg_dma_len macro to fetch the actual length.
> >>>>
> >>>> This could now be 0 and not need to be mapped to a page or
> >>>> address array
> >>>>
> >>>> Signed-off-by: Shane Francis <[email protected]>
> >>>> Reviewed-by: Michael J. Ruhl <[email protected]>
> >>> This patch landed in linux-next 20200326 and it causes a kernel
> >>> panic on
> >>> various Exynos SoC based boards.
> >>>> ---
> >>>> drivers/gpu/drm/drm_prime.c | 2 +-
> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> >>>> index 86d9b0e45c8c..1de2cde2277c 100644
> >>>> --- a/drivers/gpu/drm/drm_prime.c
> >>>> +++ b/drivers/gpu/drm/drm_prime.c
> >>>> @@ -967,7 +967,7 @@ int drm_prime_sg_to_page_addr_arrays(struct
> >>>> sg_table *sgt, struct page **pages,
> >>>> index = 0;
> >>>> for_each_sg(sgt->sgl, sg, sgt->nents, count) {
> >>>> - len = sg->length;
> >>>> + len = sg_dma_len(sg);
> >>>> page = sg_page(sg);
> >>>> addr = sg_dma_address(sg);
> >>> Sorry, but this code is wrong :(
> >>
> >> Well it is at least better than before because it makes most drivers
> >> work correctly again.
> >
> > Well, I'm not sure that a half-broken fix should be considered as a
> > fix ;)
> >
> > Anyway, I just got the comment from Shane, that my patch is fixing the
> > issues with amdgpu and radeon, while still working fine for exynos, so
> > it is indeed a proper fix.
>
> Today I've noticed that this patch went to final v5.6 without even a day
> of testing in linux-next, so v5.6 is broken on Exynos and probably a few
> other ARM archs, which rely on the drm_prime_sg_to_page_addr_arrays
> function.

Please commit your patch and cc stable.

Alex


>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>

2020-03-31 05:26:50

by Marek Szyprowski

[permalink] [raw]
Subject: Re: [v4,1/3] drm/prime: use dma length macro when mapping sg

Hi Alex,

On 2020-03-30 15:23, Alex Deucher wrote:
> On Mon, Mar 30, 2020 at 4:18 AM Marek Szyprowski
> <[email protected]> wrote:
>> Hi
>>
>> On 2020-03-27 10:10, Marek Szyprowski wrote:
>>> Hi Christian,
>>>
>>> On 2020-03-27 09:11, Christian König wrote:
>>>> Am 27.03.20 um 08:54 schrieb Marek Szyprowski:
>>>>> On 2020-03-25 10:07, Shane Francis wrote:
>>>>>> As dma_map_sg can reorganize scatter-gather lists in a
>>>>>> way that can cause some later segments to be empty we should
>>>>>> always use the sg_dma_len macro to fetch the actual length.
>>>>>>
>>>>>> This could now be 0 and not need to be mapped to a page or
>>>>>> address array
>>>>>>
>>>>>> Signed-off-by: Shane Francis <[email protected]>
>>>>>> Reviewed-by: Michael J. Ruhl <[email protected]>
>>>>> This patch landed in linux-next 20200326 and it causes a kernel
>>>>> panic on
>>>>> various Exynos SoC based boards.
>>>>>> ---
>>>>>> drivers/gpu/drm/drm_prime.c | 2 +-
>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>>>>>> index 86d9b0e45c8c..1de2cde2277c 100644
>>>>>> --- a/drivers/gpu/drm/drm_prime.c
>>>>>> +++ b/drivers/gpu/drm/drm_prime.c
>>>>>> @@ -967,7 +967,7 @@ int drm_prime_sg_to_page_addr_arrays(struct
>>>>>> sg_table *sgt, struct page **pages,
>>>>>> index = 0;
>>>>>> for_each_sg(sgt->sgl, sg, sgt->nents, count) {
>>>>>> - len = sg->length;
>>>>>> + len = sg_dma_len(sg);
>>>>>> page = sg_page(sg);
>>>>>> addr = sg_dma_address(sg);
>>>>> Sorry, but this code is wrong :(
>>>> Well it is at least better than before because it makes most drivers
>>>> work correctly again.
>>> Well, I'm not sure that a half-broken fix should be considered as a
>>> fix ;)
>>>
>>> Anyway, I just got the comment from Shane, that my patch is fixing the
>>> issues with amdgpu and radeon, while still working fine for exynos, so
>>> it is indeed a proper fix.
>> Today I've noticed that this patch went to final v5.6 without even a day
>> of testing in linux-next, so v5.6 is broken on Exynos and probably a few
>> other ARM archs, which rely on the drm_prime_sg_to_page_addr_arrays
>> function.
> Please commit your patch and cc stable.

I've already done that: https://lkml.org/lkml/2020/3/27/555

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2020-03-31 14:11:18

by Alex Deucher

[permalink] [raw]
Subject: Re: [v4,1/3] drm/prime: use dma length macro when mapping sg

On Tue, Mar 31, 2020 at 1:25 AM Marek Szyprowski
<[email protected]> wrote:
>
> Hi Alex,
>
> On 2020-03-30 15:23, Alex Deucher wrote:
> > On Mon, Mar 30, 2020 at 4:18 AM Marek Szyprowski
> > <[email protected]> wrote:
> >> Hi
> >>
> >> On 2020-03-27 10:10, Marek Szyprowski wrote:
> >>> Hi Christian,
> >>>
> >>> On 2020-03-27 09:11, Christian König wrote:
> >>>> Am 27.03.20 um 08:54 schrieb Marek Szyprowski:
> >>>>> On 2020-03-25 10:07, Shane Francis wrote:
> >>>>>> As dma_map_sg can reorganize scatter-gather lists in a
> >>>>>> way that can cause some later segments to be empty we should
> >>>>>> always use the sg_dma_len macro to fetch the actual length.
> >>>>>>
> >>>>>> This could now be 0 and not need to be mapped to a page or
> >>>>>> address array
> >>>>>>
> >>>>>> Signed-off-by: Shane Francis <[email protected]>
> >>>>>> Reviewed-by: Michael J. Ruhl <[email protected]>
> >>>>> This patch landed in linux-next 20200326 and it causes a kernel
> >>>>> panic on
> >>>>> various Exynos SoC based boards.
> >>>>>> ---
> >>>>>> drivers/gpu/drm/drm_prime.c | 2 +-
> >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> >>>>>> index 86d9b0e45c8c..1de2cde2277c 100644
> >>>>>> --- a/drivers/gpu/drm/drm_prime.c
> >>>>>> +++ b/drivers/gpu/drm/drm_prime.c
> >>>>>> @@ -967,7 +967,7 @@ int drm_prime_sg_to_page_addr_arrays(struct
> >>>>>> sg_table *sgt, struct page **pages,
> >>>>>> index = 0;
> >>>>>> for_each_sg(sgt->sgl, sg, sgt->nents, count) {
> >>>>>> - len = sg->length;
> >>>>>> + len = sg_dma_len(sg);
> >>>>>> page = sg_page(sg);
> >>>>>> addr = sg_dma_address(sg);
> >>>>> Sorry, but this code is wrong :(
> >>>> Well it is at least better than before because it makes most drivers
> >>>> work correctly again.
> >>> Well, I'm not sure that a half-broken fix should be considered as a
> >>> fix ;)
> >>>
> >>> Anyway, I just got the comment from Shane, that my patch is fixing the
> >>> issues with amdgpu and radeon, while still working fine for exynos, so
> >>> it is indeed a proper fix.
> >> Today I've noticed that this patch went to final v5.6 without even a day
> >> of testing in linux-next, so v5.6 is broken on Exynos and probably a few
> >> other ARM archs, which rely on the drm_prime_sg_to_page_addr_arrays
> >> function.
> > Please commit your patch and cc stable.
>
> I've already done that: https://lkml.org/lkml/2020/3/27/555

Do you have drm-misc commit rights or do you need someone to commit
this for you?

Alex


>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>

2020-03-31 14:31:07

by Marek Szyprowski

[permalink] [raw]
Subject: Re: [v4,1/3] drm/prime: use dma length macro when mapping sg

Hi Alex,

On 2020-03-31 16:10, Alex Deucher wrote:
> On Tue, Mar 31, 2020 at 1:25 AM Marek Szyprowski
> <[email protected]> wrote:
>> Hi Alex,
>>
>> On 2020-03-30 15:23, Alex Deucher wrote:
>>> On Mon, Mar 30, 2020 at 4:18 AM Marek Szyprowski
>>> <[email protected]> wrote:
>>>> Hi
>>>>
>>>> On 2020-03-27 10:10, Marek Szyprowski wrote:
>>>>> Hi Christian,
>>>>>
>>>>> On 2020-03-27 09:11, Christian König wrote:
>>>>>> Am 27.03.20 um 08:54 schrieb Marek Szyprowski:
>>>>>>> On 2020-03-25 10:07, Shane Francis wrote:
>>>>>>>> As dma_map_sg can reorganize scatter-gather lists in a
>>>>>>>> way that can cause some later segments to be empty we should
>>>>>>>> always use the sg_dma_len macro to fetch the actual length.
>>>>>>>>
>>>>>>>> This could now be 0 and not need to be mapped to a page or
>>>>>>>> address array
>>>>>>>>
>>>>>>>> Signed-off-by: Shane Francis <[email protected]>
>>>>>>>> Reviewed-by: Michael J. Ruhl <[email protected]>
>>>>>>> This patch landed in linux-next 20200326 and it causes a kernel
>>>>>>> panic on
>>>>>>> various Exynos SoC based boards.
>>>>>>>> ---
>>>>>>>> drivers/gpu/drm/drm_prime.c | 2 +-
>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>>>>>>>> index 86d9b0e45c8c..1de2cde2277c 100644
>>>>>>>> --- a/drivers/gpu/drm/drm_prime.c
>>>>>>>> +++ b/drivers/gpu/drm/drm_prime.c
>>>>>>>> @@ -967,7 +967,7 @@ int drm_prime_sg_to_page_addr_arrays(struct
>>>>>>>> sg_table *sgt, struct page **pages,
>>>>>>>> index = 0;
>>>>>>>> for_each_sg(sgt->sgl, sg, sgt->nents, count) {
>>>>>>>> - len = sg->length;
>>>>>>>> + len = sg_dma_len(sg);
>>>>>>>> page = sg_page(sg);
>>>>>>>> addr = sg_dma_address(sg);
>>>>>>> Sorry, but this code is wrong :(
>>>>>> Well it is at least better than before because it makes most drivers
>>>>>> work correctly again.
>>>>> Well, I'm not sure that a half-broken fix should be considered as a
>>>>> fix ;)
>>>>>
>>>>> Anyway, I just got the comment from Shane, that my patch is fixing the
>>>>> issues with amdgpu and radeon, while still working fine for exynos, so
>>>>> it is indeed a proper fix.
>>>> Today I've noticed that this patch went to final v5.6 without even a day
>>>> of testing in linux-next, so v5.6 is broken on Exynos and probably a few
>>>> other ARM archs, which rely on the drm_prime_sg_to_page_addr_arrays
>>>> function.
>>> Please commit your patch and cc stable.
>> I've already done that: https://lkml.org/lkml/2020/3/27/555
> Do you have drm-misc commit rights or do you need someone to commit
> this for you?

I have no access to drm-misc.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
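
For readers following the sg->length vs. sg_dma_len() disagreement quoted
above, here is a minimal userspace sketch of one way to read the objection.
It is not taken from the patch linked in the thread. struct mock_sg,
MOCK_PAGE_SIZE, example_sgl and the three fill_* helpers are hypothetical
stand-ins invented for illustration; they are not kernel APIs. The sketch
assumes that a DMA mapping step may merge entries, so the CPU-side lengths
and the DMA-side lengths of a table stop matching entry by entry.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_PAGE_SIZE 4096u

/* Hypothetical userspace stand-in for struct scatterlist (not the kernel type). */
struct mock_sg {
	unsigned long pfn;      /* first CPU page of the entry, in the role of sg_page() */
	unsigned int length;    /* CPU-side length in bytes, in the role of sg->length */
	unsigned long dma_addr; /* in the role of sg_dma_address() */
	unsigned int dma_len;   /* in the role of sg_dma_len(); 0 once merged away */
};

/*
 * Assumed example: two single-page CPU entries (page frames 10 and 50) that
 * a mock IOMMU merged into one contiguous 8 KiB DMA segment.
 */
const struct mock_sg example_sgl[2] = {
	{ 10, MOCK_PAGE_SIZE, 0x100000UL, 2 * MOCK_PAGE_SIZE },
	{ 50, MOCK_PAGE_SIZE, 0, 0 },
};

/* Collect CPU pages: walk every original entry and use the CPU-side length. */
size_t fill_pages(const struct mock_sg *sgl, size_t orig_nents,
		  unsigned long *pages, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < orig_nents; i++) {
		unsigned int len = sgl[i].length;
		unsigned long pfn = sgl[i].pfn;

		while (len > 0 && n < max) {
			pages[n++] = pfn++;
			len -= (len < MOCK_PAGE_SIZE) ? len : MOCK_PAGE_SIZE;
		}
	}
	return n;
}

/* Collect DMA addresses: walk only the mapped entries and use the DMA-side length. */
size_t fill_addrs(const struct mock_sg *sgl, size_t mapped_nents,
		  unsigned long *addrs, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < mapped_nents; i++) {
		unsigned int len = sgl[i].dma_len;
		unsigned long addr = sgl[i].dma_addr;

		while (len > 0 && n < max) {
			addrs[n++] = addr;
			addr += MOCK_PAGE_SIZE;
			len -= (len < MOCK_PAGE_SIZE) ? len : MOCK_PAGE_SIZE;
		}
	}
	return n;
}

/* Mixed walk: CPU pages sized by the DMA-side length, as in the quoted hunk. */
size_t fill_pages_mixed(const struct mock_sg *sgl, size_t nents,
			unsigned long *pages, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < nents; i++) {
		unsigned int len = sgl[i].dma_len;
		unsigned long pfn = sgl[i].pfn;

		while (len > 0 && n < max) {
			pages[n++] = pfn++;
			len -= (len < MOCK_PAGE_SIZE) ? len : MOCK_PAGE_SIZE;
		}
	}
	return n;
}
```

With example_sgl, fill_pages() over both original entries yields page
frames 10 and 50, and fill_addrs() over the single mapped entry yields
0x100000 and 0x101000. fill_pages_mixed() yields 10 and 11 instead: page 50
is lost and page 11, which was never part of the buffer, takes its place.
Keeping the two walks separate avoids pairing a CPU page with a length that
describes the DMA side.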