2019-04-15 12:41:30

by Paul Kocialkowski

[permalink] [raw]
Subject: [PATCH v5 0/4] drm/vc4: Binner BO management improvements

Changes since v4:
* Used a kref on the binner bo instead of firstopen/lastclose;
* Added a mutex to prevent race conditions;
* Took care of enabling the OOM interrupt when we have a binner BO allocated.

Changes since v3:
* Split changes into more commits when possible;
* Reworked binner bo alloc condition as discussed.

Changes since v2:
* Removed deprecated sentence about fristopen;
* Added collected Reviewed-By tags.

Changes since v1:
* Squashed the two final patches into one.

Paul Kocialkowski (4):
drm/vc4: Reformat and export binner bo allocation helper
drm/vc4: Check for V3D before binner bo alloc
drm/vc4: Check for the binner bo before handling OOM interrupt
drm/vc4: Allocate binner bo when starting to use the V3D

drivers/gpu/drm/vc4/vc4_bo.c | 30 ++++++++++++++++++
drivers/gpu/drm/vc4/vc4_drv.c | 17 +++++++++++
drivers/gpu/drm/vc4/vc4_drv.h | 10 ++++++
drivers/gpu/drm/vc4/vc4_irq.c | 9 ++++--
drivers/gpu/drm/vc4/vc4_v3d.c | 57 ++++++++++++++++++++++++-----------
5 files changed, 104 insertions(+), 19 deletions(-)

--
2.21.0


2019-04-15 12:40:30

by Paul Kocialkowski

[permalink] [raw]
Subject: [PATCH v5 3/4] drm/vc4: Check for the binner bo before handling OOM interrupt

Since the OOM interrupt directly deals with the binner bo, it doesn't
make sense to try and handle it without a binner buffer registered.

Signed-off-by: Paul Kocialkowski <[email protected]>
---
drivers/gpu/drm/vc4/vc4_irq.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
index ffd0a4388752..723dc86b4511 100644
--- a/drivers/gpu/drm/vc4/vc4_irq.c
+++ b/drivers/gpu/drm/vc4/vc4_irq.c
@@ -64,6 +64,9 @@ vc4_overflow_mem_work(struct work_struct *work)
struct vc4_exec_info *exec;
unsigned long irqflags;

+ if (!bo)
+ return;
+
bin_bo_slot = vc4_v3d_get_bin_slot(vc4);
if (bin_bo_slot < 0) {
DRM_ERROR("Couldn't allocate binner overflow mem\n");
--
2.21.0

2019-04-15 12:40:50

by Paul Kocialkowski

[permalink] [raw]
Subject: [PATCH v5 1/4] drm/vc4: Reformat and export binner bo allocation helper

Since we'll be using the binner bo allocation helper in other parts
of the driver, reformat it with a vc4_v3d prefix and pass the vc4 dev
directly to match other functions.

Make the function visible to the whole driver too.

Signed-off-by: Paul Kocialkowski <[email protected]>
---
drivers/gpu/drm/vc4/vc4_drv.h | 1 +
drivers/gpu/drm/vc4/vc4_v3d.c | 11 +++++------
2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index e61734af059b..1f0f529169a1 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -833,6 +833,7 @@ void vc4_plane_async_set_fb(struct drm_plane *plane,
extern struct platform_driver vc4_v3d_driver;
extern const struct of_device_id vc4_v3d_dt_match[];
int vc4_v3d_get_bin_slot(struct vc4_dev *vc4);
+int vc4_v3d_bin_bo_alloc(struct vc4_dev *vc4);
int vc4_v3d_pm_get(struct vc4_dev *vc4);
void vc4_v3d_pm_put(struct vc4_dev *vc4);

diff --git a/drivers/gpu/drm/vc4/vc4_v3d.c b/drivers/gpu/drm/vc4/vc4_v3d.c
index a4b6859e3af6..9c73fad77449 100644
--- a/drivers/gpu/drm/vc4/vc4_v3d.c
+++ b/drivers/gpu/drm/vc4/vc4_v3d.c
@@ -213,7 +213,7 @@ int vc4_v3d_get_bin_slot(struct vc4_dev *vc4)
}

/**
- * vc4_allocate_bin_bo() - allocates the memory that will be used for
+ * vc4_v3d_bin_bo_alloc() - allocates the memory that will be used for
* tile binning.
*
* The binner has a limitation that the addresses in the tile state
@@ -234,9 +234,8 @@ int vc4_v3d_get_bin_slot(struct vc4_dev *vc4)
* overall CMA pool before they make scenes complicated enough to run
* out of bin space.
*/
-static int vc4_allocate_bin_bo(struct drm_device *drm)
+int vc4_v3d_bin_bo_alloc(struct vc4_dev *vc4)
{
- struct vc4_dev *vc4 = to_vc4_dev(drm);
struct vc4_v3d *v3d = vc4->v3d;
uint32_t size = 16 * 1024 * 1024;
int ret = 0;
@@ -251,7 +250,7 @@ static int vc4_allocate_bin_bo(struct drm_device *drm)
INIT_LIST_HEAD(&list);

while (true) {
- struct vc4_bo *bo = vc4_bo_create(drm, size, true,
+ struct vc4_bo *bo = vc4_bo_create(vc4->dev, size, true,
VC4_BO_TYPE_BIN);

if (IS_ERR(bo)) {
@@ -333,7 +332,7 @@ static int vc4_v3d_runtime_resume(struct device *dev)
struct vc4_dev *vc4 = v3d->vc4;
int ret;

- ret = vc4_allocate_bin_bo(vc4->dev);
+ ret = vc4_v3d_bin_bo_alloc(vc4);
if (ret)
return ret;

@@ -403,7 +402,7 @@ static int vc4_v3d_bind(struct device *dev, struct device *master, void *data)
if (ret != 0)
return ret;

- ret = vc4_allocate_bin_bo(drm);
+ ret = vc4_v3d_bin_bo_alloc(vc4);
if (ret) {
clk_disable_unprepare(v3d->clk);
return ret;
--
2.21.0

2019-04-15 12:41:42

by Paul Kocialkowski

[permalink] [raw]
Subject: [PATCH v5 4/4] drm/vc4: Allocate binner bo when starting to use the V3D

The binner bo is not required until the V3D is in use, so avoid
allocating it at probe and do it on the first non-dumb BO allocation.
Keep track of which clients are using the V3D and liberate the buffer
when there is none left (through a kref) and protect it with a mutex to
avoid race conditions.

The Out-Of-Memory (OOM) interrupt also gets some tweaking, to avoid
enabling it before having allocated a binner bo.

We also want to keep the BO alive during runtime suspend/resume to avoid
failing to allocate it at resume. This happens when the CMA pool is
full at that point and results in a hard crash.

Signed-off-by: Paul Kocialkowski <[email protected]>
---
drivers/gpu/drm/vc4/vc4_bo.c | 30 ++++++++++++++++++++++
drivers/gpu/drm/vc4/vc4_drv.c | 17 +++++++++++++
drivers/gpu/drm/vc4/vc4_drv.h | 9 +++++++
drivers/gpu/drm/vc4/vc4_irq.c | 6 +++--
drivers/gpu/drm/vc4/vc4_v3d.c | 47 +++++++++++++++++++++++++----------
5 files changed, 94 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
index 88ebd681d7eb..5a5276cce9a2 100644
--- a/drivers/gpu/drm/vc4/vc4_bo.c
+++ b/drivers/gpu/drm/vc4/vc4_bo.c
@@ -799,6 +799,28 @@ vc4_prime_import_sg_table(struct drm_device *dev,
return obj;
}

+static int vc4_grab_bin_bo(struct drm_device *dev,
+ struct drm_file *file_priv)
+{
+ struct vc4_file *vc4file = file_priv->driver_priv;
+ struct vc4_dev *vc4 = to_vc4_dev(dev);
+
+ if (!vc4->v3d)
+ return -ENODEV;
+
+ if (vc4file->bin_bo_kref)
+ return 0;
+
+ mutex_lock(&vc4->bin_bo_lock);
+ vc4file->bin_bo_kref = vc4_v3d_bin_bo_get(vc4);
+ mutex_unlock(&vc4->bin_bo_lock);
+
+ if (!vc4file->bin_bo_kref)
+ return -ENOMEM;
+
+ return 0;
+}
+
int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
{
@@ -806,6 +828,10 @@ int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
struct vc4_bo *bo = NULL;
int ret;

+ ret = vc4_grab_bin_bo(dev, file_priv);
+ if (ret)
+ return ret;
+
/*
* We can't allocate from the BO cache, because the BOs don't
* get zeroed, and that might leak data between users.
@@ -865,6 +891,10 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
return -EINVAL;
}

+ ret = vc4_grab_bin_bo(dev, file_priv);
+ if (ret)
+ return ret;
+
bo = vc4_bo_create(dev, args->size, true, VC4_BO_TYPE_V3D_SHADER);
if (IS_ERR(bo))
return PTR_ERR(bo);
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index d840b52b9805..b09f771384be 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -126,10 +126,25 @@ static int vc4_open(struct drm_device *dev, struct drm_file *file)
return 0;
}

+static void vc4_close_bin_bo_release(struct kref *ref)
+{
+ struct vc4_dev *vc4 = container_of(ref, struct vc4_dev, bin_bo_kref);
+
+ return vc4_v3d_bin_bo_free(vc4);
+}
+
static void vc4_close(struct drm_device *dev, struct drm_file *file)
{
+ struct vc4_dev *vc4 = to_vc4_dev(dev);
struct vc4_file *vc4file = file->driver_priv;

+ mutex_lock(&vc4->bin_bo_lock);
+
+ if (vc4file->bin_bo_kref)
+ kref_put(vc4file->bin_bo_kref, vc4_close_bin_bo_release);
+
+ mutex_unlock(&vc4->bin_bo_lock);
+
vc4_perfmon_close_file(vc4file);
kfree(vc4file);
}
@@ -274,6 +289,8 @@ static int vc4_drm_bind(struct device *dev)
drm->dev_private = vc4;
INIT_LIST_HEAD(&vc4->debugfs_list);

+ mutex_init(&vc4->bin_bo_lock);
+
ret = vc4_bo_cache_init(drm);
if (ret)
goto dev_put;
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 1f0f529169a1..69dbda8e89ba 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -216,6 +216,11 @@ struct vc4_dev {
* the minor is available (after drm_dev_register()).
*/
struct list_head debugfs_list;
+
+ /* Mutex for binner bo allocation. */
+ struct mutex bin_bo_lock;
+ /* Reference count for our binner bo. */
+ struct kref bin_bo_kref;
};

static inline struct vc4_dev *
@@ -594,6 +599,8 @@ struct vc4_file {
struct idr idr;
struct mutex lock;
} perfmon;
+
+ struct kref *bin_bo_kref;
};

static inline struct vc4_exec_info *
@@ -833,7 +840,9 @@ void vc4_plane_async_set_fb(struct drm_plane *plane,
extern struct platform_driver vc4_v3d_driver;
extern const struct of_device_id vc4_v3d_dt_match[];
int vc4_v3d_get_bin_slot(struct vc4_dev *vc4);
+struct kref *vc4_v3d_bin_bo_get(struct vc4_dev *vc4);
int vc4_v3d_bin_bo_alloc(struct vc4_dev *vc4);
+void vc4_v3d_bin_bo_free(struct vc4_dev *vc4);
int vc4_v3d_pm_get(struct vc4_dev *vc4);
void vc4_v3d_pm_put(struct vc4_dev *vc4);

diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
index 723dc86b4511..e20e46483346 100644
--- a/drivers/gpu/drm/vc4/vc4_irq.c
+++ b/drivers/gpu/drm/vc4/vc4_irq.c
@@ -252,8 +252,10 @@ vc4_irq_postinstall(struct drm_device *dev)
if (!vc4->v3d)
return 0;

- /* Enable both the render done and out of memory interrupts. */
- V3D_WRITE(V3D_INTENA, V3D_DRIVER_IRQS);
+ /* Enable the render done interrupts. The out-of-memory interrupt is
+ * enabled as soon as we have a binner BO allocated.
+ */
+ V3D_WRITE(V3D_INTENA, V3D_INT_FLDONE | V3D_INT_FRDONE);

return 0;
}
diff --git a/drivers/gpu/drm/vc4/vc4_v3d.c b/drivers/gpu/drm/vc4/vc4_v3d.c
index a77c8d6c7d5b..3501a83a0e91 100644
--- a/drivers/gpu/drm/vc4/vc4_v3d.c
+++ b/drivers/gpu/drm/vc4/vc4_v3d.c
@@ -212,6 +212,23 @@ int vc4_v3d_get_bin_slot(struct vc4_dev *vc4)
return -ENOMEM;
}

+struct kref *vc4_v3d_bin_bo_get(struct vc4_dev *vc4)
+{
+ struct kref *bin_bo_kref = &vc4->bin_bo_kref;
+ int ret;
+
+ if (vc4->bin_bo) {
+ kref_get(bin_bo_kref);
+ return bin_bo_kref;
+ }
+
+ ret = vc4_v3d_bin_bo_alloc(vc4);
+ if (ret)
+ return NULL;
+
+ return bin_bo_kref;
+}
+
/**
* vc4_v3d_bin_bo_alloc() - allocates the memory that will be used for
* tile binning.
@@ -294,6 +311,14 @@ int vc4_v3d_bin_bo_alloc(struct vc4_dev *vc4)
WARN_ON_ONCE(sizeof(vc4->bin_alloc_used) * 8 !=
bo->base.base.size / vc4->bin_alloc_size);

+ kref_init(&vc4->bin_bo_kref);
+
+ /* Enable the out-of-memory interrupt to set our
+ * newly-allocated binner BO, potentially from an
+ * already-pending-but-masked interupt.
+ */
+ V3D_WRITE(V3D_INTENA, V3D_INT_OUTOMEM);
+
break;
}

@@ -313,6 +338,15 @@ int vc4_v3d_bin_bo_alloc(struct vc4_dev *vc4)
return ret;
}

+void vc4_v3d_bin_bo_free(struct vc4_dev *vc4)
+{
+ if (!vc4->bin_bo)
+ return;
+
+ drm_gem_object_put_unlocked(&vc4->bin_bo->base.base);
+ vc4->bin_bo = NULL;
+}
+
#ifdef CONFIG_PM
static int vc4_v3d_runtime_suspend(struct device *dev)
{
@@ -321,9 +355,6 @@ static int vc4_v3d_runtime_suspend(struct device *dev)

vc4_irq_uninstall(vc4->dev);

- drm_gem_object_put_unlocked(&vc4->bin_bo->base.base);
- vc4->bin_bo = NULL;
-
clk_disable_unprepare(v3d->clk);

return 0;
@@ -335,10 +366,6 @@ static int vc4_v3d_runtime_resume(struct device *dev)
struct vc4_dev *vc4 = v3d->vc4;
int ret;

- ret = vc4_v3d_bin_bo_alloc(vc4);
- if (ret)
- return ret;
-
ret = clk_prepare_enable(v3d->clk);
if (ret != 0)
return ret;
@@ -405,12 +432,6 @@ static int vc4_v3d_bind(struct device *dev, struct device *master, void *data)
if (ret != 0)
return ret;

- ret = vc4_v3d_bin_bo_alloc(vc4);
- if (ret) {
- clk_disable_unprepare(v3d->clk);
- return ret;
- }
-
/* Reset the binner overflow address/size at setup, to be sure
* we don't reuse an old one.
*/
--
2.21.0

2019-04-15 12:41:50

by Paul Kocialkowski

[permalink] [raw]
Subject: [PATCH v5 2/4] drm/vc4: Check for V3D before binner bo alloc

Check that we have a V3D device registered before attempting to
allocate a binner buffer object.

Signed-off-by: Paul Kocialkowski <[email protected]>
---
drivers/gpu/drm/vc4/vc4_v3d.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/vc4/vc4_v3d.c b/drivers/gpu/drm/vc4/vc4_v3d.c
index 9c73fad77449..a77c8d6c7d5b 100644
--- a/drivers/gpu/drm/vc4/vc4_v3d.c
+++ b/drivers/gpu/drm/vc4/vc4_v3d.c
@@ -241,6 +241,9 @@ int vc4_v3d_bin_bo_alloc(struct vc4_dev *vc4)
int ret = 0;
struct list_head list;

+ if (!v3d)
+ return -ENODEV;
+
/* We may need to try allocating more than once to get a BO
* that doesn't cross 256MB. Track the ones we've allocated
* that failed so far, so that we can free them when we've got
--
2.21.0

2019-04-15 20:50:35

by Eric Anholt

[permalink] [raw]
Subject: Re: [PATCH v5 3/4] drm/vc4: Check for the binner bo before handling OOM interrupt

Paul Kocialkowski <[email protected]> writes:

> Since the OOM interrupt directly deals with the binner bo, it doesn't
> make sense to try and handle it without a binner buffer registered.
>
> Signed-off-by: Paul Kocialkowski <[email protected]>
> ---
> drivers/gpu/drm/vc4/vc4_irq.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
> index ffd0a4388752..723dc86b4511 100644
> --- a/drivers/gpu/drm/vc4/vc4_irq.c
> +++ b/drivers/gpu/drm/vc4/vc4_irq.c
> @@ -64,6 +64,9 @@ vc4_overflow_mem_work(struct work_struct *work)
> struct vc4_exec_info *exec;
> unsigned long irqflags;
>
> + if (!bo)
> + return;
> +
> bin_bo_slot = vc4_v3d_get_bin_slot(vc4);
> if (bin_bo_slot < 0) {
> DRM_ERROR("Couldn't allocate binner overflow mem\n");
> --
> 2.21.0

I don't think this is going to be race-free. You're checking outside of
a lock, then proceeding to use it even if (in patch 4) the bin BO was in
the process of being freed during the file close path. Can we put all
of the overflow process here under the same lock as freeing?


Attachments:
signature.asc (847.00 B)

2019-04-15 20:51:54

by Eric Anholt

[permalink] [raw]
Subject: Re: [PATCH v5 4/4] drm/vc4: Allocate binner bo when starting to use the V3D

Paul Kocialkowski <[email protected]> writes:

> The binner bo is not required until the V3D is in use, so avoid
> allocating it at probe and do it on the first non-dumb BO allocation.
> Keep track of which clients are using the V3D and liberate the buffer
> when there is none left (through a kref) and protect it with a mutex to
> avoid race conditions.
>
> The Out-Of-Memory (OOM) interrupt also gets some tweaking, to avoid
> enabling it before having allocated a binner bo.

I love your solution to this!

> We also want to keep the BO alive during runtime suspend/resume to avoid
> failing to allocate it at resume. This happens when the CMA pool is
> full at that point and results in a hard crash.
>
> Signed-off-by: Paul Kocialkowski <[email protected]>
> ---
> drivers/gpu/drm/vc4/vc4_bo.c | 30 ++++++++++++++++++++++
> drivers/gpu/drm/vc4/vc4_drv.c | 17 +++++++++++++
> drivers/gpu/drm/vc4/vc4_drv.h | 9 +++++++
> drivers/gpu/drm/vc4/vc4_irq.c | 6 +++--
> drivers/gpu/drm/vc4/vc4_v3d.c | 47 +++++++++++++++++++++++++----------
> 5 files changed, 94 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
> index 88ebd681d7eb..5a5276cce9a2 100644
> --- a/drivers/gpu/drm/vc4/vc4_bo.c
> +++ b/drivers/gpu/drm/vc4/vc4_bo.c
> @@ -799,6 +799,28 @@ vc4_prime_import_sg_table(struct drm_device *dev,
> return obj;
> }
>
> +static int vc4_grab_bin_bo(struct drm_device *dev,
> + struct drm_file *file_priv)
> +{
> + struct vc4_file *vc4file = file_priv->driver_priv;
> + struct vc4_dev *vc4 = to_vc4_dev(dev);
> +
> + if (!vc4->v3d)
> + return -ENODEV;
> +
> + if (vc4file->bin_bo_kref)
> + return 0;
> +
> + mutex_lock(&vc4->bin_bo_lock);
> + vc4file->bin_bo_kref = vc4_v3d_bin_bo_get(vc4);
> + mutex_unlock(&vc4->bin_bo_lock);

Interesting. I think I would have stored a bool for whether he had the
kref, instead of a pointer to vc4->bin_bo_kref. Either way is fine with
me, though.

> +
> + if (!vc4file->bin_bo_kref)
> + return -ENOMEM;
> +
> + return 0;
> +}

> diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
> index d840b52b9805..b09f771384be 100644
> --- a/drivers/gpu/drm/vc4/vc4_drv.c
> +++ b/drivers/gpu/drm/vc4/vc4_drv.c
> @@ -126,10 +126,25 @@ static int vc4_open(struct drm_device *dev, struct drm_file *file)
> return 0;
> }
>
> +static void vc4_close_bin_bo_release(struct kref *ref)
> +{
> + struct vc4_dev *vc4 = container_of(ref, struct vc4_dev, bin_bo_kref);
> +
> + return vc4_v3d_bin_bo_free(vc4);
> +}

Could we have this be the function that vc4_v3d.c publishes instead of
_free, and then we don't need two separate functions for the free path?

> static void vc4_close(struct drm_device *dev, struct drm_file *file)
> {
> + struct vc4_dev *vc4 = to_vc4_dev(dev);
> struct vc4_file *vc4file = file->driver_priv;
>
> + mutex_lock(&vc4->bin_bo_lock);
> +
> + if (vc4file->bin_bo_kref)
> + kref_put(vc4file->bin_bo_kref, vc4_close_bin_bo_release);
> +
> + mutex_unlock(&vc4->bin_bo_lock);
> +
> vc4_perfmon_close_file(vc4file);
> kfree(vc4file);
> }


> @@ -313,6 +338,15 @@ int vc4_v3d_bin_bo_alloc(struct vc4_dev *vc4)
> return ret;
> }
>
> +void vc4_v3d_bin_bo_free(struct vc4_dev *vc4)
> +{
> + if (!vc4->bin_bo)
> + return;
> +
> + drm_gem_object_put_unlocked(&vc4->bin_bo->base.base);
> + vc4->bin_bo = NULL;

Hmm. I'm thinking:

We free the bin BO from the DRM fd close operation, but we could have
jobs still being executed after the close if userspace didn't happen to
wait on them before finishing (closing doesn't wait for execution to
complete, and we couldn't reliably wait during close even if we wanted
to). I think to resolve that we need the vc4_exec_info to keep a ref on
the bin BO as well (at least, if they had a binning component to their
job).

> #ifdef CONFIG_PM
> static int vc4_v3d_runtime_suspend(struct device *dev)
> {
> @@ -321,9 +355,6 @@ static int vc4_v3d_runtime_suspend(struct device *dev)
>
> vc4_irq_uninstall(vc4->dev);
>
> - drm_gem_object_put_unlocked(&vc4->bin_bo->base.base);
> - vc4->bin_bo = NULL;
> -
> clk_disable_unprepare(v3d->clk);
>
> return 0;
> @@ -335,10 +366,6 @@ static int vc4_v3d_runtime_resume(struct device *dev)
> struct vc4_dev *vc4 = v3d->vc4;
> int ret;
>
> - ret = vc4_v3d_bin_bo_alloc(vc4);
> - if (ret)
> - return ret;
> -
> ret = clk_prepare_enable(v3d->clk);
> if (ret != 0)
> return ret;


Attachments:
signature.asc (847.00 B)

2019-04-23 08:33:02

by Paul Kocialkowski

[permalink] [raw]
Subject: Re: [PATCH v5 3/4] drm/vc4: Check for the binner bo before handling OOM interrupt

Hi,

On Mon, 2019-04-15 at 13:48 -0700, Eric Anholt wrote:
> Paul Kocialkowski <[email protected]> writes:
>
> > Since the OOM interrupt directly deals with the binner bo, it doesn't
> > make sense to try and handle it without a binner buffer registered.
> >
> > Signed-off-by: Paul Kocialkowski <[email protected]>
> > ---
> > drivers/gpu/drm/vc4/vc4_irq.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
> > index ffd0a4388752..723dc86b4511 100644
> > --- a/drivers/gpu/drm/vc4/vc4_irq.c
> > +++ b/drivers/gpu/drm/vc4/vc4_irq.c
> > @@ -64,6 +64,9 @@ vc4_overflow_mem_work(struct work_struct *work)
> > struct vc4_exec_info *exec;
> > unsigned long irqflags;
> >
> > + if (!bo)
> > + return;
> > +
> > bin_bo_slot = vc4_v3d_get_bin_slot(vc4);
> > if (bin_bo_slot < 0) {
> > DRM_ERROR("Couldn't allocate binner overflow mem\n");
> > --
> > 2.21.0
>
> I don't think this is going to be race-free. You're checking outside of
> a lock, then proceeding to use it even if (in patch 4) the bin BO was in
> the process of being freed during the file close path. Can we put all
> of the overflow process here under the same lock as freeing?

Definitely, sorry I missed that.

Cheers,

Paul

--
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

2019-04-23 08:52:42

by Paul Kocialkowski

[permalink] [raw]
Subject: Re: [PATCH v5 3/4] drm/vc4: Check for the binner bo before handling OOM interrupt

On Tue, 2019-04-23 at 10:30 +0200, Paul Kocialkowski wrote:
> Hi,
>
> On Mon, 2019-04-15 at 13:48 -0700, Eric Anholt wrote:
> > Paul Kocialkowski <[email protected]> writes:
> >
> > > Since the OOM interrupt directly deals with the binner bo, it doesn't
> > > make sense to try and handle it without a binner buffer registered.
> > >
> > > Signed-off-by: Paul Kocialkowski <[email protected]>
> > > ---
> > > drivers/gpu/drm/vc4/vc4_irq.c | 3 +++
> > > 1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
> > > index ffd0a4388752..723dc86b4511 100644
> > > --- a/drivers/gpu/drm/vc4/vc4_irq.c
> > > +++ b/drivers/gpu/drm/vc4/vc4_irq.c
> > > @@ -64,6 +64,9 @@ vc4_overflow_mem_work(struct work_struct *work)
> > > struct vc4_exec_info *exec;
> > > unsigned long irqflags;
> > >
> > > + if (!bo)
> > > + return;
> > > +
> > > bin_bo_slot = vc4_v3d_get_bin_slot(vc4);
> > > if (bin_bo_slot < 0) {
> > > DRM_ERROR("Couldn't allocate binner overflow mem\n");
> > > --
> > > 2.21.0
> >
> > I don't think this is going to be race-free. You're checking outside of
> > a lock, then proceeding to use it even if (in patch 4) the bin BO was in
> > the process of being freed during the file close path. Can we put all
> > of the overflow process here under the same lock as freeing?
>
> Definitely, sorry I missed that.

That being said, I think it should be fixed in 4/4 since this patch is
not problematic on its own (we haven't introduced the mutex/kref yet).

Cheers,

Paul

--
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

2019-04-23 12:59:12

by Paul Kocialkowski

[permalink] [raw]
Subject: Re: [PATCH v5 4/4] drm/vc4: Allocate binner bo when starting to use the V3D

Hi,

On Mon, 2019-04-15 at 13:50 -0700, Eric Anholt wrote:
> Paul Kocialkowski <[email protected]> writes:
>
> > The binner bo is not required until the V3D is in use, so avoid
> > allocating it at probe and do it on the first non-dumb BO allocation.
> > Keep track of which clients are using the V3D and liberate the buffer
> > when there is none left (through a kref) and protect it with a mutex to
> > avoid race conditions.
> >
> > The Out-Of-Memory (OOM) interrupt also gets some tweaking, to avoid
> > enabling it before having allocated a binner bo.
>
> I love your solution to this!
>
> > We also want to keep the BO alive during runtime suspend/resume to avoid
> > failing to allocate it at resume. This happens when the CMA pool is
> > full at that point and results in a hard crash.
> >
> > Signed-off-by: Paul Kocialkowski <[email protected]>
> > ---
> > drivers/gpu/drm/vc4/vc4_bo.c | 30 ++++++++++++++++++++++
> > drivers/gpu/drm/vc4/vc4_drv.c | 17 +++++++++++++
> > drivers/gpu/drm/vc4/vc4_drv.h | 9 +++++++
> > drivers/gpu/drm/vc4/vc4_irq.c | 6 +++--
> > drivers/gpu/drm/vc4/vc4_v3d.c | 47 +++++++++++++++++++++++++----------
> > 5 files changed, 94 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
> > index 88ebd681d7eb..5a5276cce9a2 100644
> > --- a/drivers/gpu/drm/vc4/vc4_bo.c
> > +++ b/drivers/gpu/drm/vc4/vc4_bo.c
> > @@ -799,6 +799,28 @@ vc4_prime_import_sg_table(struct drm_device *dev,
> > return obj;
> > }
> >
> > +static int vc4_grab_bin_bo(struct drm_device *dev,
> > + struct drm_file *file_priv)
> > +{
> > + struct vc4_file *vc4file = file_priv->driver_priv;
> > + struct vc4_dev *vc4 = to_vc4_dev(dev);
> > +
> > + if (!vc4->v3d)
> > + return -ENODEV;
> > +
> > + if (vc4file->bin_bo_kref)
> > + return 0;
> > +
> > + mutex_lock(&vc4->bin_bo_lock);
> > + vc4file->bin_bo_kref = vc4_v3d_bin_bo_get(vc4);
> > + mutex_unlock(&vc4->bin_bo_lock);
>
> Interesting. I think I would have stored a bool for whether he had the
> kref, instead of a pointer to vc4->bin_bo_kref. Either way is fine with
> me, though.

Heh, I've changed it to a bool for v6 anyway, it makes the code look
more symetrical.

> > +
> > + if (!vc4file->bin_bo_kref)
> > + return -ENOMEM;
> > +
> > + return 0;
> > +}
> > diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
> > index d840b52b9805..b09f771384be 100644
> > --- a/drivers/gpu/drm/vc4/vc4_drv.c
> > +++ b/drivers/gpu/drm/vc4/vc4_drv.c
> > @@ -126,10 +126,25 @@ static int vc4_open(struct drm_device *dev, struct drm_file *file)
> > return 0;
> > }
> >
> > +static void vc4_close_bin_bo_release(struct kref *ref)
> > +{
> > + struct vc4_dev *vc4 = container_of(ref, struct vc4_dev, bin_bo_kref);
> > +
> > + return vc4_v3d_bin_bo_free(vc4);
> > +}
>
> Could we have this be the function that vc4_v3d.c publishes instead of
> _free, and then we don't need two separate functions for the free path?

I've made get/put helpers for v6 that shall address this as well.

> > static void vc4_close(struct drm_device *dev, struct drm_file *file)
> > {
> > + struct vc4_dev *vc4 = to_vc4_dev(dev);
> > struct vc4_file *vc4file = file->driver_priv;
> >
> > + mutex_lock(&vc4->bin_bo_lock);
> > +
> > + if (vc4file->bin_bo_kref)
> > + kref_put(vc4file->bin_bo_kref, vc4_close_bin_bo_release);
> > +
> > + mutex_unlock(&vc4->bin_bo_lock);
> > +
> > vc4_perfmon_close_file(vc4file);
> > kfree(vc4file);
> > }
> > @@ -313,6 +338,15 @@ int vc4_v3d_bin_bo_alloc(struct vc4_dev *vc4)
> > return ret;
> > }
> >
> > +void vc4_v3d_bin_bo_free(struct vc4_dev *vc4)
> > +{
> > + if (!vc4->bin_bo)
> > + return;
> > +
> > + drm_gem_object_put_unlocked(&vc4->bin_bo->base.base);
> > + vc4->bin_bo = NULL;
>
> Hmm. I'm thinking:
>
> We free the bin BO from the DRM fd close operation, but we could have
> jobs still being executed after the close if userspace didn't happen to
> wait on them before finishing (closing doesn't wait for execution to
> complete, and we couldn't reliably wait during close even if we wanted
> to). I think to resolve that we need the vc4_exec_info to keep a ref on
> the bin BO as well (at least, if they had a binning component to their
> job).

Oh good catch! So since we assign binner slots to exec jobs when
getting a tile_binning_mode_config_packet, we need to make sure that's
kept alive as well.

Here is something more I noticed when reworking the code: there are
subsequent calls in the ioctl handlers that may fall after we got a
reference to the binner BO. So we need to make sure to put in the
failure path since the matching legit put won't happen if the ioctl
errored out.

Cheers,

Paul

> > #ifdef CONFIG_PM
> > static int vc4_v3d_runtime_suspend(struct device *dev)
> > {
> > @@ -321,9 +355,6 @@ static int vc4_v3d_runtime_suspend(struct device *dev)
> >
> > vc4_irq_uninstall(vc4->dev);
> >
> > - drm_gem_object_put_unlocked(&vc4->bin_bo->base.base);
> > - vc4->bin_bo = NULL;
> > -
> > clk_disable_unprepare(v3d->clk);
> >
> > return 0;
> > @@ -335,10 +366,6 @@ static int vc4_v3d_runtime_resume(struct device *dev)
> > struct vc4_dev *vc4 = v3d->vc4;
> > int ret;
> >
> > - ret = vc4_v3d_bin_bo_alloc(vc4);
> > - if (ret)
> > - return ret;
> > -
> > ret = clk_prepare_enable(v3d->clk);
> > if (ret != 0)
> > return ret;
--
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com