2023-05-01 20:53:20

by Rob Clark

Subject: [PATCH] drm/msm: Set max segment size earlier

From: Rob Clark <[email protected]>

Fixes the following splat on a6xx gen2+ (a640, a650, a660 families).
a6xx gen1 has smaller GMU allocations, so they fit under the default
64K max segment size.

------------[ cut here ]------------
DMA-API: msm_dpu ae01000.display-controller: mapping sg segment longer than device claims to support [len=126976] [max=65536]
WARNING: CPU: 5 PID: 9 at kernel/dma/debug.c:1160 debug_dma_map_sg+0x288/0x314
Modules linked in:
CPU: 5 PID: 9 Comm: kworker/u16:0 Not tainted 6.3.0-rc2-debug+ #629
Hardware name: Google Villager (rev1+) with LTE (DT)
Workqueue: events_unbound deferred_probe_work_func
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : debug_dma_map_sg+0x288/0x314
lr : debug_dma_map_sg+0x288/0x314
sp : ffffffc00809b560
x29: ffffffc00809b560 x28: 0000000000000060 x27: 0000000000000000
x26: 0000000000010000 x25: 0000000000000004 x24: 0000000000000004
x23: ffffffffffffffff x22: ffffffdb31693cc0 x21: ffffff8080935800
x20: ffffff8087417400 x19: ffffff8087a45010 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000010000
x14: 0000000000000001 x13: ffffffffffffffff x12: ffffffffffffffff
x11: 0000000000000000 x10: 000000000000000a x9 : ffffffdb2ff05e14
x8 : ffffffdb31275000 x7 : ffffffdb2ff08908 x6 : 0000000000000000
x5 : 0000000000000001 x4 : ffffffdb2ff08a74 x3 : ffffffdb31275008
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff80803a9a80
Call trace:
debug_dma_map_sg+0x288/0x314
__dma_map_sg_attrs+0x80/0xe4
dma_map_sgtable+0x30/0x4c
get_pages+0x1d4/0x1e4
msm_gem_pin_pages_locked+0xbc/0xf8
msm_gem_pin_vma_locked+0x58/0xa0
msm_gem_get_and_pin_iova_range+0x98/0xac
a6xx_gmu_memory_alloc+0x7c/0x128
a6xx_gmu_init+0x16c/0x9b0
a6xx_gpu_init+0x38c/0x3e4
adreno_bind+0x214/0x264
component_bind_all+0x128/0x1f8
msm_drm_bind+0x2b8/0x608
try_to_bring_up_aggregate_device+0x88/0x1a4
__component_add+0xec/0x13c
component_add+0x1c/0x28
dp_display_probe+0x3f8/0x43c
platform_probe+0x70/0xc4
really_probe+0x148/0x280
__driver_probe_device+0xc8/0xe0
driver_probe_device+0x44/0x100
__device_attach_driver+0x64/0xdc
bus_for_each_drv+0xb0/0xd8
__device_attach+0xd8/0x168
device_initial_probe+0x1c/0x28
bus_probe_device+0x44/0xb0
deferred_probe_work_func+0xc8/0xe0
process_one_work+0x2e0/0x488
process_scheduled_works+0x4c/0x50
worker_thread+0x218/0x274
kthread+0xf0/0x100
ret_from_fork+0x10/0x20
irq event stamp: 293712
hardirqs last enabled at (293711): [<ffffffdb2ff0893c>] vprintk_emit+0x160/0x25c
hardirqs last disabled at (293712): [<ffffffdb30b48130>] el1_dbg+0x24/0x80
softirqs last enabled at (279520): [<ffffffdb2fe10420>] __do_softirq+0x21c/0x4bc
softirqs last disabled at (279515): [<ffffffdb2fe16708>] ____do_softirq+0x18/0x24
---[ end trace 0000000000000000 ]---

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_drv.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 3a74b5653e96..6dec1a3534f2 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -440,27 +440,27 @@ static int msm_drm_init(struct device *dev, const struct drm_driver *drv)
 	fs_reclaim_acquire(GFP_KERNEL);
 	might_lock(&priv->lru.lock);
 	fs_reclaim_release(GFP_KERNEL);

 	drm_mode_config_init(ddev);

 	ret = msm_init_vram(ddev);
 	if (ret)
 		goto err_drm_dev_put;

+	dma_set_max_seg_size(dev, UINT_MAX);
+
 	/* Bind all our sub-components: */
 	ret = component_bind_all(dev, ddev);
 	if (ret)
 		goto err_drm_dev_put;

-	dma_set_max_seg_size(dev, UINT_MAX);
-
 	msm_gem_shrinker_init(ddev);

 	if (priv->kms_init) {
 		ret = priv->kms_init(ddev);
 		if (ret) {
 			DRM_DEV_ERROR(dev, "failed to load kms\n");
 			priv->kms = NULL;
 			goto err_msm_uninit;
 		}
 		kms = priv->kms;
--
2.39.2
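
For readers following the ordering argument, here is a minimal userspace sketch of why moving the call matters. It is not kernel code. `struct fake_dev`, `seg_too_long()`, `fake_bind_all()` and `init_warns()` are hypothetical names invented for illustration, and the check is a simplified model of the DMA-API debug warning. Only the two numbers (len=126976, max=65536) and the use of UINT_MAX come from the patch and splat above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Default limit reported in the splat ([max=65536]). */
#define DEFAULT_MAX_SEG_SIZE 65536u
/* Segment length reported in the splat ([len=126976]), i.e. 31 * 4096. */
#define GMU_SEG_LEN 126976u

/* Hypothetical stand-in for the per-device DMA parameters. */
struct fake_dev {
	unsigned int max_seg_size;
};

/* Simplified model of the check behind the DMA-API warning. */
static int seg_too_long(const struct fake_dev *dev, unsigned int len)
{
	assert(dev != NULL);
	return len > dev->max_seg_size;
}

/* Models the bind step, during which the GMU buffer gets mapped. */
static int fake_bind_all(const struct fake_dev *dev)
{
	return seg_too_long(dev, GMU_SEG_LEN);
}

/*
 * Returns 1 if the modelled warning fires. set_limit_first selects the
 * patched order (raise the limit, then bind) over the old one.
 */
static int init_warns(int set_limit_first)
{
	struct fake_dev dev = { .max_seg_size = DEFAULT_MAX_SEG_SIZE };
	int warned;

	if (set_limit_first)
		dev.max_seg_size = UINT_MAX;

	warned = fake_bind_all(&dev);

	if (!set_limit_first)
		dev.max_seg_size = UINT_MAX; /* too late for the bind-time mapping */

	return warned;
}
```

In this model, `init_warns(0)` corresponds to the old order, where the 124K mapping is checked against the 64K default, and `init_warns(1)` corresponds to the order after this patch.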


2023-05-01 20:59:44

by Dmitry Baryshkov

Subject: Re: [PATCH] drm/msm: Set max segment size earlier

On 01/05/2023 23:44, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> Fixes the following splat on a6xx gen2+ (a640, a650, a660 families),
> a6xx gen1 has smaller GMU allocations so they fit under the default
> 64K max segment size.
>
> ------------[ cut here ]------------
> DMA-API: msm_dpu ae01000.display-controller: mapping sg segment longer than device claims to support [len=126976] [max=65536]
> WARNING: CPU: 5 PID: 9 at kernel/dma/debug.c:1160 debug_dma_map_sg+0x288/0x314
> Modules linked in:
> CPU: 5 PID: 9 Comm: kworker/u16:0 Not tainted 6.3.0-rc2-debug+ #629
> Hardware name: Google Villager (rev1+) with LTE (DT)
> Workqueue: events_unbound deferred_probe_work_func
> pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : debug_dma_map_sg+0x288/0x314
> lr : debug_dma_map_sg+0x288/0x314
> sp : ffffffc00809b560
> x29: ffffffc00809b560 x28: 0000000000000060 x27: 0000000000000000
> x26: 0000000000010000 x25: 0000000000000004 x24: 0000000000000004
> x23: ffffffffffffffff x22: ffffffdb31693cc0 x21: ffffff8080935800
> x20: ffffff8087417400 x19: ffffff8087a45010 x18: 0000000000000000
> x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000010000
> x14: 0000000000000001 x13: ffffffffffffffff x12: ffffffffffffffff
> x11: 0000000000000000 x10: 000000000000000a x9 : ffffffdb2ff05e14
> x8 : ffffffdb31275000 x7 : ffffffdb2ff08908 x6 : 0000000000000000
> x5 : 0000000000000001 x4 : ffffffdb2ff08a74 x3 : ffffffdb31275008
> x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff80803a9a80
> Call trace:
> debug_dma_map_sg+0x288/0x314
> __dma_map_sg_attrs+0x80/0xe4
> dma_map_sgtable+0x30/0x4c
> get_pages+0x1d4/0x1e4
> msm_gem_pin_pages_locked+0xbc/0xf8
> msm_gem_pin_vma_locked+0x58/0xa0
> msm_gem_get_and_pin_iova_range+0x98/0xac
> a6xx_gmu_memory_alloc+0x7c/0x128
> a6xx_gmu_init+0x16c/0x9b0
> a6xx_gpu_init+0x38c/0x3e4
> adreno_bind+0x214/0x264
> component_bind_all+0x128/0x1f8
> msm_drm_bind+0x2b8/0x608
> try_to_bring_up_aggregate_device+0x88/0x1a4
> __component_add+0xec/0x13c
> component_add+0x1c/0x28
> dp_display_probe+0x3f8/0x43c
> platform_probe+0x70/0xc4
> really_probe+0x148/0x280
> __driver_probe_device+0xc8/0xe0
> driver_probe_device+0x44/0x100
> __device_attach_driver+0x64/0xdc
> bus_for_each_drv+0xb0/0xd8
> __device_attach+0xd8/0x168
> device_initial_probe+0x1c/0x28
> bus_probe_device+0x44/0xb0
> deferred_probe_work_func+0xc8/0xe0
> process_one_work+0x2e0/0x488
> process_scheduled_works+0x4c/0x50
> worker_thread+0x218/0x274
> kthread+0xf0/0x100
> ret_from_fork+0x10/0x20
> irq event stamp: 293712
> hardirqs last enabled at (293711): [<ffffffdb2ff0893c>] vprintk_emit+0x160/0x25c
> hardirqs last disabled at (293712): [<ffffffdb30b48130>] el1_dbg+0x24/0x80
> softirqs last enabled at (279520): [<ffffffdb2fe10420>] __do_softirq+0x21c/0x4bc
> softirqs last disabled at (279515): [<ffffffdb2fe16708>] ____do_softirq+0x18/0x24
> ---[ end trace 0000000000000000 ]---
>
> Signed-off-by: Rob Clark <[email protected]>

I think this should be:

Fixes: db735fc4036b ("drm/msm: Set dma maximum segment size for mdss")

Reviewed-by: Dmitry Baryshkov <[email protected]>

> ---
> drivers/gpu/drm/msm/msm_drv.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 3a74b5653e96..6dec1a3534f2 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -440,27 +440,27 @@ static int msm_drm_init(struct device *dev, const struct drm_driver *drv)
> fs_reclaim_acquire(GFP_KERNEL);
> might_lock(&priv->lru.lock);
> fs_reclaim_release(GFP_KERNEL);
>
> drm_mode_config_init(ddev);
>
> ret = msm_init_vram(ddev);
> if (ret)
> goto err_drm_dev_put;
>
> + dma_set_max_seg_size(dev, UINT_MAX);
> +
> /* Bind all our sub-components: */
> ret = component_bind_all(dev, ddev);
> if (ret)
> goto err_drm_dev_put;
>
> - dma_set_max_seg_size(dev, UINT_MAX);
> -
> msm_gem_shrinker_init(ddev);
>
> if (priv->kms_init) {
> ret = priv->kms_init(ddev);
> if (ret) {
> DRM_DEV_ERROR(dev, "failed to load kms\n");
> priv->kms = NULL;
> goto err_msm_uninit;
> }
> kms = priv->kms;

--
With best wishes
Dmitry

2023-05-01 21:23:39

by Rob Clark

Subject: Re: [PATCH] drm/msm: Set max segment size earlier

On Mon, May 1, 2023 at 1:56 PM Dmitry Baryshkov
<[email protected]> wrote:
>
> On 01/05/2023 23:44, Rob Clark wrote:
> > From: Rob Clark <[email protected]>
> >
> > Fixes the following splat on a6xx gen2+ (a640, a650, a660 families),
> > a6xx gen1 has smaller GMU allocations so they fit under the default
> > 64K max segment size.
> >
> > ------------[ cut here ]------------
> > DMA-API: msm_dpu ae01000.display-controller: mapping sg segment longer than device claims to support [len=126976] [max=65536]
> > WARNING: CPU: 5 PID: 9 at kernel/dma/debug.c:1160 debug_dma_map_sg+0x288/0x314
> > Modules linked in:
> > CPU: 5 PID: 9 Comm: kworker/u16:0 Not tainted 6.3.0-rc2-debug+ #629
> > Hardware name: Google Villager (rev1+) with LTE (DT)
> > Workqueue: events_unbound deferred_probe_work_func
> > pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > pc : debug_dma_map_sg+0x288/0x314
> > lr : debug_dma_map_sg+0x288/0x314
> > sp : ffffffc00809b560
> > x29: ffffffc00809b560 x28: 0000000000000060 x27: 0000000000000000
> > x26: 0000000000010000 x25: 0000000000000004 x24: 0000000000000004
> > x23: ffffffffffffffff x22: ffffffdb31693cc0 x21: ffffff8080935800
> > x20: ffffff8087417400 x19: ffffff8087a45010 x18: 0000000000000000
> > x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000010000
> > x14: 0000000000000001 x13: ffffffffffffffff x12: ffffffffffffffff
> > x11: 0000000000000000 x10: 000000000000000a x9 : ffffffdb2ff05e14
> > x8 : ffffffdb31275000 x7 : ffffffdb2ff08908 x6 : 0000000000000000
> > x5 : 0000000000000001 x4 : ffffffdb2ff08a74 x3 : ffffffdb31275008
> > x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff80803a9a80
> > Call trace:
> > debug_dma_map_sg+0x288/0x314
> > __dma_map_sg_attrs+0x80/0xe4
> > dma_map_sgtable+0x30/0x4c
> > get_pages+0x1d4/0x1e4
> > msm_gem_pin_pages_locked+0xbc/0xf8
> > msm_gem_pin_vma_locked+0x58/0xa0
> > msm_gem_get_and_pin_iova_range+0x98/0xac
> > a6xx_gmu_memory_alloc+0x7c/0x128
> > a6xx_gmu_init+0x16c/0x9b0
> > a6xx_gpu_init+0x38c/0x3e4
> > adreno_bind+0x214/0x264
> > component_bind_all+0x128/0x1f8
> > msm_drm_bind+0x2b8/0x608
> > try_to_bring_up_aggregate_device+0x88/0x1a4
> > __component_add+0xec/0x13c
> > component_add+0x1c/0x28
> > dp_display_probe+0x3f8/0x43c
> > platform_probe+0x70/0xc4
> > really_probe+0x148/0x280
> > __driver_probe_device+0xc8/0xe0
> > driver_probe_device+0x44/0x100
> > __device_attach_driver+0x64/0xdc
> > bus_for_each_drv+0xb0/0xd8
> > __device_attach+0xd8/0x168
> > device_initial_probe+0x1c/0x28
> > bus_probe_device+0x44/0xb0
> > deferred_probe_work_func+0xc8/0xe0
> > process_one_work+0x2e0/0x488
> > process_scheduled_works+0x4c/0x50
> > worker_thread+0x218/0x274
> > kthread+0xf0/0x100
> > ret_from_fork+0x10/0x20
> > irq event stamp: 293712
> > hardirqs last enabled at (293711): [<ffffffdb2ff0893c>] vprintk_emit+0x160/0x25c
> > hardirqs last disabled at (293712): [<ffffffdb30b48130>] el1_dbg+0x24/0x80
> > softirqs last enabled at (279520): [<ffffffdb2fe10420>] __do_softirq+0x21c/0x4bc
> > softirqs last disabled at (279515): [<ffffffdb2fe16708>] ____do_softirq+0x18/0x24
> > ---[ end trace 0000000000000000 ]---
> >
> > Signed-off-by: Rob Clark <[email protected]>
>
> I think this should be:
>
> Fixes: db735fc4036b ("drm/msm: Set dma maximum segment size for mdss")

Yeah, or perhaps that commit just didn't fix the issue hard enough. I
was thinking that it was introduced by memory allocations out of
hw_init(), but actually this has been here the whole time (on newer
a6xx gens). There was an internal bug about it, but somehow it didn't
get routed to me.

BR,
-R

> Reviewed-by: Dmitry Baryshkov <[email protected]>
>
> > ---
> > drivers/gpu/drm/msm/msm_drv.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > index 3a74b5653e96..6dec1a3534f2 100644
> > --- a/drivers/gpu/drm/msm/msm_drv.c
> > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > @@ -440,27 +440,27 @@ static int msm_drm_init(struct device *dev, const struct drm_driver *drv)
> > fs_reclaim_acquire(GFP_KERNEL);
> > might_lock(&priv->lru.lock);
> > fs_reclaim_release(GFP_KERNEL);
> >
> > drm_mode_config_init(ddev);
> >
> > ret = msm_init_vram(ddev);
> > if (ret)
> > goto err_drm_dev_put;
> >
> > + dma_set_max_seg_size(dev, UINT_MAX);
> > +
> > /* Bind all our sub-components: */
> > ret = component_bind_all(dev, ddev);
> > if (ret)
> > goto err_drm_dev_put;
> >
> > - dma_set_max_seg_size(dev, UINT_MAX);
> > -
> > msm_gem_shrinker_init(ddev);
> >
> > if (priv->kms_init) {
> > ret = priv->kms_init(ddev);
> > if (ret) {
> > DRM_DEV_ERROR(dev, "failed to load kms\n");
> > priv->kms = NULL;
> > goto err_msm_uninit;
> > }
> > kms = priv->kms;
>
> --
> With best wishes
> Dmitry
>