2021-09-03 18:45:03

by Rob Clark

Subject: [PATCH v3 0/9] dma-fence: Deadline awareness

From: Rob Clark <[email protected]>

This series adds deadline awareness to fences, so realtime deadlines
such as vblank can be communicated to the fence signaler for power/
frequency management decisions.

This is partially inspired by a trick i915 does, but implemented
via dma-fence for a couple of reasons:

1) To continue to be able to use the atomic helpers
2) To support cases where display and gpu are different drivers

This iteration adds a dma-fence ioctl to set a deadline (both to
support IGT tests and compositors which delay decisions about which
client buffer to display), and a sw_sync ioctl to read back the
deadline. IGT tests utilizing these can be found at:

https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/fence-deadline
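
For illustration, a compositor holding a sync_file fd for a client
buffer could use the new ioctl roughly like this (a hypothetical
sketch against the uapi added in patch 8/9; set_fence_deadline() and
its arguments are made up for the example, and error handling is
omitted):

  #include <time.h>
  #include <sys/ioctl.h>
  #include <linux/sync_file.h>   /* with this series applied */

  static void set_fence_deadline(int fence_fd, struct timespec deadline)
  {
          struct sync_set_deadline ts = {
                  .tv_sec  = deadline.tv_sec,
                  .tv_nsec = deadline.tv_nsec,
                  /* .pad must be zero, which the initializer gives us */
          };

          /* hint the fence signaler about the upcoming vblank deadline */
          ioctl(fence_fd, SYNC_IOC_SET_DEADLINE, &ts);
  }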


v1: https://patchwork.freedesktop.org/series/93035/
v2: Move filtering out of later deadlines to the fence implementation
to avoid increasing the size of dma_fence
v3: Add support in fence-array and fence-chain; Add some uabi to
support igt tests and userspace compositors.

Rob Clark (9):
dma-fence: Add deadline awareness
drm/vblank: Add helper to get next vblank time
drm/atomic-helper: Set fence deadline for vblank
drm/scheduler: Add fence deadline support
drm/msm: Add deadline based boost support
dma-buf/fence-array: Add fence deadline support
dma-buf/fence-chain: Add fence deadline support
dma-buf/sync_file: Add SET_DEADLINE ioctl
dma-buf/sw_sync: Add fence deadline support

drivers/dma-buf/dma-fence-array.c | 11 ++++
drivers/dma-buf/dma-fence-chain.c | 13 +++++
drivers/dma-buf/dma-fence.c | 20 +++++++
drivers/dma-buf/sw_sync.c | 58 +++++++++++++++++++
drivers/dma-buf/sync_debug.h | 2 +
drivers/dma-buf/sync_file.c | 19 +++++++
drivers/gpu/drm/drm_atomic_helper.c | 36 ++++++++++++
drivers/gpu/drm/drm_vblank.c | 32 +++++++++++
drivers/gpu/drm/msm/msm_fence.c | 76 +++++++++++++++++++++++++
drivers/gpu/drm/msm/msm_fence.h | 20 +++++++
drivers/gpu/drm/msm/msm_gpu.h | 1 +
drivers/gpu/drm/msm/msm_gpu_devfreq.c | 20 +++++++
drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++
drivers/gpu/drm/scheduler/sched_main.c | 2 +-
include/drm/drm_vblank.h | 1 +
include/drm/gpu_scheduler.h | 8 +++
include/linux/dma-fence.h | 16 ++++++
include/uapi/linux/sync_file.h | 20 +++++++
18 files changed, 388 insertions(+), 1 deletion(-)

--
2.31.1


2021-09-03 18:45:16

by Rob Clark

Subject: [PATCH v3 2/9] drm/vblank: Add helper to get next vblank time

From: Rob Clark <[email protected]>
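
Add a helper to calculate the expected time of the next vblank, based
on the timestamp of the previous vblank plus the frame duration. This
is used in the next patch to set fence deadlines for atomic page
flips.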

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/drm_vblank.c | 32 ++++++++++++++++++++++++++++++++
include/drm/drm_vblank.h | 1 +
2 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index b701cda86d0c..ec2732664b95 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -980,6 +980,38 @@ u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc,
}
EXPORT_SYMBOL(drm_crtc_vblank_count_and_time);

+/**
+ * drm_crtc_next_vblank_time - calculate the time of the next vblank
+ * @crtc: the crtc for which to calculate next vblank time
+ * @vblanktime: pointer to time to receive the next vblank timestamp.
+ *
+ * Calculate the expected time of the next vblank based on the time of the
+ * previous vblank and the frame duration.
+ */
+int drm_crtc_next_vblank_time(struct drm_crtc *crtc, ktime_t *vblanktime)
+{
+ unsigned int pipe = drm_crtc_index(crtc);
+ struct drm_vblank_crtc *vblank = &crtc->dev->vblank[pipe];
+ u64 count;
+
+ if (!vblank->framedur_ns)
+ return -EINVAL;
+
+ count = drm_vblank_count_and_time(crtc->dev, pipe, vblanktime);
+
+ /*
+ * If we don't get a valid count, then we probably also don't
+ * have a valid time:
+ */
+ if (!count)
+ return -EINVAL;
+
+ *vblanktime = ktime_add(*vblanktime, ns_to_ktime(vblank->framedur_ns));
+
+ return 0;
+}
+EXPORT_SYMBOL(drm_crtc_next_vblank_time);
+
static void send_vblank_event(struct drm_device *dev,
struct drm_pending_vblank_event *e,
u64 seq, ktime_t now)
diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
index 733a3e2d1d10..a63bc2c92f3c 100644
--- a/include/drm/drm_vblank.h
+++ b/include/drm/drm_vblank.h
@@ -230,6 +230,7 @@ bool drm_dev_has_vblank(const struct drm_device *dev);
u64 drm_crtc_vblank_count(struct drm_crtc *crtc);
u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc,
ktime_t *vblanktime);
+int drm_crtc_next_vblank_time(struct drm_crtc *crtc, ktime_t *vblanktime);
void drm_crtc_send_vblank_event(struct drm_crtc *crtc,
struct drm_pending_vblank_event *e);
void drm_crtc_arm_vblank_event(struct drm_crtc *crtc,
--
2.31.1

2021-09-03 18:45:25

by Rob Clark

Subject: [PATCH v3 3/9] drm/atomic-helper: Set fence deadline for vblank

From: Rob Clark <[email protected]>

For an atomic commit updating a single CRTC (i.e. a pageflip), calculate
the next vblank time and inform the fence(s) of that deadline.

v2: Comment typo fix (danvet)

Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Daniel Vetter <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/drm_atomic_helper.c | 36 +++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 2c0c6ec92820..3322dafd675f 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1407,6 +1407,40 @@ void drm_atomic_helper_commit_modeset_enables(struct drm_device *dev,
}
EXPORT_SYMBOL(drm_atomic_helper_commit_modeset_enables);

+/*
+ * For atomic updates which touch just a single CRTC, calculate the time of the
+ * next vblank, and inform all the fences of the deadline.
+ */
+static void set_fence_deadline(struct drm_device *dev,
+ struct drm_atomic_state *state)
+{
+ struct drm_crtc *crtc, *wait_crtc = NULL;
+ struct drm_crtc_state *new_crtc_state;
+ struct drm_plane *plane;
+ struct drm_plane_state *new_plane_state;
+ ktime_t vbltime;
+ int i;
+
+ for_each_new_crtc_in_state (state, crtc, new_crtc_state, i) {
+ if (wait_crtc)
+ return;
+ wait_crtc = crtc;
+ }
+
+ /* If no CRTCs updated, then nothing to do: */
+ if (!wait_crtc)
+ return;
+
+ if (drm_crtc_next_vblank_time(wait_crtc, &vbltime))
+ return;
+
+ for_each_new_plane_in_state (state, plane, new_plane_state, i) {
+ if (!new_plane_state->fence)
+ continue;
+ dma_fence_set_deadline(new_plane_state->fence, vbltime);
+ }
+}
+
/**
* drm_atomic_helper_wait_for_fences - wait for fences stashed in plane state
* @dev: DRM device
@@ -1436,6 +1470,8 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
struct drm_plane_state *new_plane_state;
int i, ret;

+ set_fence_deadline(dev, state);
+
for_each_new_plane_in_state(state, plane, new_plane_state, i) {
if (!new_plane_state->fence)
continue;
--
2.31.1

2021-09-03 18:45:29

by Rob Clark

Subject: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

From: Rob Clark <[email protected]>

As the finished fence is the one that is exposed to userspace, and
therefore the one that other operations, like atomic update, would
block on, we need to propagate the deadline from the finished
fence to the actual hw fence.

v2: Split into drm_sched_fence_set_parent() (ckoenig)

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
drivers/gpu/drm/scheduler/sched_main.c | 2 +-
include/drm/gpu_scheduler.h | 8 ++++++
3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
index bcea035cf4c6..4fc41a71d1c7 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
dma_fence_put(&fence->scheduled);
}

+static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
+ ktime_t deadline)
+{
+ struct drm_sched_fence *fence = to_drm_sched_fence(f);
+ unsigned long flags;
+
+ spin_lock_irqsave(&fence->lock, flags);
+
+ /* If we already have an earlier deadline, keep it: */
+ if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
+ ktime_before(fence->deadline, deadline)) {
+ spin_unlock_irqrestore(&fence->lock, flags);
+ return;
+ }
+
+ fence->deadline = deadline;
+ set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
+
+ spin_unlock_irqrestore(&fence->lock, flags);
+
+ if (fence->parent)
+ dma_fence_set_deadline(fence->parent, deadline);
+}
+
static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
.get_driver_name = drm_sched_fence_get_driver_name,
.get_timeline_name = drm_sched_fence_get_timeline_name,
@@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
.get_driver_name = drm_sched_fence_get_driver_name,
.get_timeline_name = drm_sched_fence_get_timeline_name,
.release = drm_sched_fence_release_finished,
+ .set_deadline = drm_sched_fence_set_deadline_finished,
};

struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
@@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
}
EXPORT_SYMBOL(to_drm_sched_fence);

+void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
+ struct dma_fence *fence)
+{
+ s_fence->parent = dma_fence_get(fence);
+ if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
+ &s_fence->finished.flags))
+ dma_fence_set_deadline(fence, s_fence->deadline);
+}
+
struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
void *owner)
{
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 595e47ff7d06..27bf0ac0625f 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -978,7 +978,7 @@ static int drm_sched_main(void *param)
drm_sched_fence_scheduled(s_fence);

if (!IS_ERR_OR_NULL(fence)) {
- s_fence->parent = dma_fence_get(fence);
+ drm_sched_fence_set_parent(s_fence, fence);
r = dma_fence_add_callback(fence, &sched_job->cb,
drm_sched_job_done_cb);
if (r == -ENOENT)
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 7f77a455722c..158ddd662469 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -238,6 +238,12 @@ struct drm_sched_fence {
*/
struct dma_fence finished;

+ /**
+ * @deadline: deadline set on &drm_sched_fence.finished which
+ * potentially needs to be propagated to &drm_sched_fence.parent
+ */
+ ktime_t deadline;
+
/**
* @parent: the fence returned by &drm_sched_backend_ops.run_job
* when scheduling the job on hardware. We signal the
@@ -505,6 +511,8 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
enum drm_sched_priority priority);
bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);

+void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
+ struct dma_fence *fence);
struct drm_sched_fence *drm_sched_fence_alloc(
struct drm_sched_entity *s_entity, void *owner);
void drm_sched_fence_init(struct drm_sched_fence *fence,
--
2.31.1

2021-09-03 18:46:33

by Rob Clark

Subject: [PATCH v3 5/9] drm/msm: Add deadline based boost support

From: Rob Clark <[email protected]>
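
Track the next fence deadline per fence context, and arm a timer to
expire 3ms before that deadline. If the fence has not yet signaled
when the timer fires, boost the GPU frequency via devfreq to improve
the chance of meeting the deadline.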

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_fence.c | 76 +++++++++++++++++++++++++++
drivers/gpu/drm/msm/msm_fence.h | 20 +++++++
drivers/gpu/drm/msm/msm_gpu.h | 1 +
drivers/gpu/drm/msm/msm_gpu_devfreq.c | 20 +++++++
4 files changed, 117 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index f2cece542c3f..67c2a96e1c85 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -8,6 +8,37 @@

#include "msm_drv.h"
#include "msm_fence.h"
+#include "msm_gpu.h"
+
+static inline bool fence_completed(struct msm_fence_context *fctx, uint32_t fence);
+
+static struct msm_gpu *fctx2gpu(struct msm_fence_context *fctx)
+{
+ struct msm_drm_private *priv = fctx->dev->dev_private;
+ return priv->gpu;
+}
+
+static enum hrtimer_restart deadline_timer(struct hrtimer *t)
+{
+ struct msm_fence_context *fctx = container_of(t,
+ struct msm_fence_context, deadline_timer);
+
+ kthread_queue_work(fctx2gpu(fctx)->worker, &fctx->deadline_work);
+
+ return HRTIMER_NORESTART;
+}
+
+static void deadline_work(struct kthread_work *work)
+{
+ struct msm_fence_context *fctx = container_of(work,
+ struct msm_fence_context, deadline_work);
+
+ /* If deadline fence has already passed, nothing to do: */
+ if (fence_completed(fctx, fctx->next_deadline_fence))
+ return;
+
+ msm_devfreq_boost(fctx2gpu(fctx), 2);
+}


struct msm_fence_context *
@@ -26,6 +57,13 @@ msm_fence_context_alloc(struct drm_device *dev, volatile uint32_t *fenceptr,
fctx->fenceptr = fenceptr;
spin_lock_init(&fctx->spinlock);

+ hrtimer_init(&fctx->deadline_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+ fctx->deadline_timer.function = deadline_timer;
+
+ kthread_init_work(&fctx->deadline_work, deadline_work);
+
+ fctx->next_deadline = ktime_get();
+
return fctx;
}

@@ -49,6 +87,8 @@ void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence)
{
spin_lock(&fctx->spinlock);
fctx->completed_fence = max(fence, fctx->completed_fence);
+ if (fence_completed(fctx, fctx->next_deadline_fence))
+ hrtimer_cancel(&fctx->deadline_timer);
spin_unlock(&fctx->spinlock);
}

@@ -79,10 +119,46 @@ static bool msm_fence_signaled(struct dma_fence *fence)
return fence_completed(f->fctx, f->base.seqno);
}

+static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+ struct msm_fence *f = to_msm_fence(fence);
+ struct msm_fence_context *fctx = f->fctx;
+ unsigned long flags;
+ ktime_t now;
+
+ spin_lock_irqsave(&fctx->spinlock, flags);
+ now = ktime_get();
+
+ if (ktime_after(now, fctx->next_deadline) ||
+ ktime_before(deadline, fctx->next_deadline)) {
+ fctx->next_deadline = deadline;
+ fctx->next_deadline_fence =
+ max(fctx->next_deadline_fence, (uint32_t)fence->seqno);
+
+ /*
+ * Set timer to trigger boost 3ms before deadline, or
+ * if we are already less than 3ms before the deadline
+ * schedule boost work immediately.
+ */
+ deadline = ktime_sub(deadline, ms_to_ktime(3));
+
+ if (ktime_after(now, deadline)) {
+ kthread_queue_work(fctx2gpu(fctx)->worker,
+ &fctx->deadline_work);
+ } else {
+ hrtimer_start(&fctx->deadline_timer, deadline,
+ HRTIMER_MODE_ABS);
+ }
+ }
+
+ spin_unlock_irqrestore(&fctx->spinlock, flags);
+}
+
static const struct dma_fence_ops msm_fence_ops = {
.get_driver_name = msm_fence_get_driver_name,
.get_timeline_name = msm_fence_get_timeline_name,
.signaled = msm_fence_signaled,
+ .set_deadline = msm_fence_set_deadline,
};

struct dma_fence *
diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
index 4783db528bcc..d34e853c555a 100644
--- a/drivers/gpu/drm/msm/msm_fence.h
+++ b/drivers/gpu/drm/msm/msm_fence.h
@@ -50,6 +50,26 @@ struct msm_fence_context {
volatile uint32_t *fenceptr;

spinlock_t spinlock;
+
+ /*
+ * TODO this doesn't really deal with multiple deadlines, like
+ * if userspace got multiple frames ahead.. OTOH atomic updates
+ * don't queue, so maybe that is ok
+ */
+
+ /** next_deadline: Time of next deadline */
+ ktime_t next_deadline;
+
+ /**
+ * next_deadline_fence:
+ *
+ * Fence value for next pending deadline. The deadline timer is
+ * canceled when this fence is signaled.
+ */
+ uint32_t next_deadline_fence;
+
+ struct hrtimer deadline_timer;
+ struct kthread_work deadline_work;
};

struct msm_fence_context * msm_fence_context_alloc(struct drm_device *dev,
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 0e4b45bff2e6..e031c9b495ed 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -425,6 +425,7 @@ void msm_devfreq_init(struct msm_gpu *gpu);
void msm_devfreq_cleanup(struct msm_gpu *gpu);
void msm_devfreq_resume(struct msm_gpu *gpu);
void msm_devfreq_suspend(struct msm_gpu *gpu);
+void msm_devfreq_boost(struct msm_gpu *gpu, unsigned factor);
void msm_devfreq_active(struct msm_gpu *gpu);
void msm_devfreq_idle(struct msm_gpu *gpu);

diff --git a/drivers/gpu/drm/msm/msm_gpu_devfreq.c b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
index 0a1ee20296a2..8a8d7b9028a3 100644
--- a/drivers/gpu/drm/msm/msm_gpu_devfreq.c
+++ b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
@@ -144,6 +144,26 @@ void msm_devfreq_suspend(struct msm_gpu *gpu)
devfreq_suspend_device(gpu->devfreq.devfreq);
}

+void msm_devfreq_boost(struct msm_gpu *gpu, unsigned factor)
+{
+ struct msm_gpu_devfreq *df = &gpu->devfreq;
+ unsigned long freq;
+
+ /*
+ * Hold devfreq lock to synchronize with get_dev_status()/
+ * target() callbacks
+ */
+ mutex_lock(&df->devfreq->lock);
+
+ freq = get_freq(gpu);
+
+ freq *= factor;
+
+ msm_devfreq_target(&gpu->pdev->dev, &freq, 0);
+
+ mutex_unlock(&df->devfreq->lock);
+}
+
void msm_devfreq_active(struct msm_gpu *gpu)
{
struct msm_gpu_devfreq *df = &gpu->devfreq;
--
2.31.1

2021-09-03 18:47:24

by Rob Clark

Subject: [PATCH v3 9/9] dma-buf/sw_sync: Add fence deadline support

From: Rob Clark <[email protected]>

This consists of simply storing the most recent deadline, and adding an
ioctl to retrieve it. This can be used in conjunction with the
SET_DEADLINE ioctl on a fence fd for testing; i.e. create various
sw_sync fences, merge them into a fence-array, set a deadline on the
fence-array, and confirm that it is propagated properly to each fence.
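
For illustration, a test could read the deadline back roughly like
this (a hypothetical sketch; since sw_sync has no uapi header, the
struct and ioctl number are defined locally, mirroring sw_sync.c, and
timeline_fd/fence_fd are assumed to come from the usual sw_sync
setup):

  #include <stdio.h>
  #include <sys/ioctl.h>
  #include <linux/types.h>

  struct sw_sync_get_deadline {
          __s64 tv_sec;
          __s32 tv_nsec;
          __s32 fence_fd;
  };
  #define SW_SYNC_IOC_MAGIC 'W'
  #define SW_SYNC_GET_DEADLINE _IOWR(SW_SYNC_IOC_MAGIC, 2, \
                  struct sw_sync_get_deadline)

  static void dump_deadline(int timeline_fd, int fence_fd)
  {
          /* tv_sec/tv_nsec must be zero on input; the kernel fills them */
          struct sw_sync_get_deadline data = { .fence_fd = fence_fd };

          if (ioctl(timeline_fd, SW_SYNC_GET_DEADLINE, &data) == 0)
                  printf("deadline: %lld.%09d\n",
                         (long long)data.tv_sec, data.tv_nsec);
  }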

Signed-off-by: Rob Clark <[email protected]>
---
drivers/dma-buf/sw_sync.c | 58 ++++++++++++++++++++++++++++++++++++
drivers/dma-buf/sync_debug.h | 2 ++
2 files changed, 60 insertions(+)

diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 348b3a9170fa..50f2638cccd3 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -52,12 +52,26 @@ struct sw_sync_create_fence_data {
__s32 fence; /* fd of new fence */
};

+/**
+ * struct sw_sync_get_deadline - get the deadline of a sw_sync fence
+ * @tv_sec: seconds elapsed since epoch (out)
+ * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec (out)
+ * @fence_fd: the sw_sync fence fd (in)
+ */
+struct sw_sync_get_deadline {
+ __s64 tv_sec;
+ __s32 tv_nsec;
+ __s32 fence_fd;
+};
+
#define SW_SYNC_IOC_MAGIC 'W'

#define SW_SYNC_IOC_CREATE_FENCE _IOWR(SW_SYNC_IOC_MAGIC, 0,\
struct sw_sync_create_fence_data)

#define SW_SYNC_IOC_INC _IOW(SW_SYNC_IOC_MAGIC, 1, __u32)
+#define SW_SYNC_GET_DEADLINE _IOWR(SW_SYNC_IOC_MAGIC, 2, \
+ struct sw_sync_get_deadline)

static const struct dma_fence_ops timeline_fence_ops;

@@ -171,6 +185,13 @@ static void timeline_fence_timeline_value_str(struct dma_fence *fence,
snprintf(str, size, "%d", parent->value);
}

+static void timeline_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+ struct sync_pt *pt = dma_fence_to_sync_pt(fence);
+
+ pt->deadline = deadline;
+}
+
static const struct dma_fence_ops timeline_fence_ops = {
.get_driver_name = timeline_fence_get_driver_name,
.get_timeline_name = timeline_fence_get_timeline_name,
@@ -179,6 +200,7 @@ static const struct dma_fence_ops timeline_fence_ops = {
.release = timeline_fence_release,
.fence_value_str = timeline_fence_value_str,
.timeline_value_str = timeline_fence_timeline_value_str,
+ .set_deadline = timeline_fence_set_deadline,
};

/**
@@ -387,6 +409,39 @@ static long sw_sync_ioctl_inc(struct sync_timeline *obj, unsigned long arg)
return 0;
}

+static int sw_sync_ioctl_get_deadline(struct sync_timeline *obj, unsigned long arg)
+{
+ struct sw_sync_get_deadline data;
+ struct timespec64 ts;
+ struct dma_fence *fence;
+ struct sync_pt *pt;
+
+ if (copy_from_user(&data, (void __user *)arg, sizeof(data)))
+ return -EFAULT;
+
+ if (data.tv_sec || data.tv_nsec)
+ return -EINVAL;
+
+ fence = sync_file_get_fence(data.fence_fd);
+ if (!fence)
+ return -EINVAL;
+
+ pt = dma_fence_to_sync_pt(fence);
+ if (!pt)
+ return -EINVAL;
+
+ ts = ktime_to_timespec64(pt->deadline);
+ data.tv_sec = ts.tv_sec;
+ data.tv_nsec = ts.tv_nsec;
+
+ dma_fence_put(fence);
+
+ if (copy_to_user((void __user *)arg, &data, sizeof(data)))
+ return -EFAULT;
+
+ return 0;
+}
+
static long sw_sync_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
{
@@ -399,6 +454,9 @@ static long sw_sync_ioctl(struct file *file, unsigned int cmd,
case SW_SYNC_IOC_INC:
return sw_sync_ioctl_inc(obj, arg);

+ case SW_SYNC_GET_DEADLINE:
+ return sw_sync_ioctl_get_deadline(obj, arg);
+
default:
return -ENOTTY;
}
diff --git a/drivers/dma-buf/sync_debug.h b/drivers/dma-buf/sync_debug.h
index 6176e52ba2d7..2e0146d0bdbb 100644
--- a/drivers/dma-buf/sync_debug.h
+++ b/drivers/dma-buf/sync_debug.h
@@ -55,11 +55,13 @@ static inline struct sync_timeline *dma_fence_parent(struct dma_fence *fence)
* @base: base fence object
* @link: link on the sync timeline's list
* @node: node in the sync timeline's tree
+ * @deadline: the most recently set fence deadline
*/
struct sync_pt {
struct dma_fence base;
struct list_head link;
struct rb_node node;
+ ktime_t deadline;
};

extern const struct file_operations sw_sync_debugfs_fops;
--
2.31.1

2021-09-03 19:18:11

by Rob Clark

Subject: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support

From: Rob Clark <[email protected]>
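
Walk the chain and propagate the deadline to each fence in it.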

Signed-off-by: Rob Clark <[email protected]>
---
drivers/dma-buf/dma-fence-chain.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c
index 1b4cb3e5cec9..736a9ad3ea6d 100644
--- a/drivers/dma-buf/dma-fence-chain.c
+++ b/drivers/dma-buf/dma-fence-chain.c
@@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence *fence)
dma_fence_free(fence);
}

+
+static void dma_fence_chain_set_deadline(struct dma_fence *fence,
+ ktime_t deadline)
+{
+ dma_fence_chain_for_each(fence, fence) {
+ struct dma_fence_chain *chain = to_dma_fence_chain(fence);
+ struct dma_fence *f = chain ? chain->fence : fence;
+
+ dma_fence_set_deadline(f, deadline);
+ }
+}
+
const struct dma_fence_ops dma_fence_chain_ops = {
.use_64bit_seqno = true,
.get_driver_name = dma_fence_chain_get_driver_name,
@@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = {
.enable_signaling = dma_fence_chain_enable_signaling,
.signaled = dma_fence_chain_signaled,
.release = dma_fence_chain_release,
+ .set_deadline = dma_fence_chain_set_deadline,
};
EXPORT_SYMBOL(dma_fence_chain_ops);

--
2.31.1

2021-09-03 19:18:18

by Rob Clark

Subject: [PATCH v3 1/9] dma-fence: Add deadline awareness

From: Rob Clark <[email protected]>

Add a way to hint to the fence signaler of an upcoming deadline, such as
vblank, which the fence waiter would prefer not to miss. This is to aid
the fence signaler in making power management decisions, like boosting
frequency as the deadline approaches, and awareness of missed deadlines
so that they can be factored into the frequency scaling.

v2: Drop dma_fence::deadline and related logic to filter duplicate
deadlines, to avoid increasing dma_fence size. The fence-context
implementation will need similar logic to track deadlines of all
the fences on the same timeline. [ckoenig]
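
For illustration, the rough shape of a driver-side implementation (the
names here are hypothetical; the msm and scheduler patches later in
this series are real examples):

  static void my_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
  {
          struct my_fence *f = to_my_fence(fence);

          /* e.g. record the deadline and kick power management so the
           * hw is more likely to signal the fence in time:
           */
          my_gpu_boost_until(f->gpu, deadline);
  }

  static const struct dma_fence_ops my_fence_ops = {
          /* ...other ops... */
          .set_deadline = my_fence_set_deadline,
  };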

Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
---
drivers/dma-buf/dma-fence.c | 20 ++++++++++++++++++++
include/linux/dma-fence.h | 16 ++++++++++++++++
2 files changed, 36 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index ce0f5eff575d..1f444863b94d 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -910,6 +910,26 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
}
EXPORT_SYMBOL(dma_fence_wait_any_timeout);

+
+/**
+ * dma_fence_set_deadline - set desired fence-wait deadline
+ * @fence: the fence that is to be waited on
+ * @deadline: the time by which the waiter hopes for the fence to be
+ * signaled
+ *
+ * Inform the fence signaler of an upcoming deadline, such as vblank, by
+ * which point the waiter would prefer the fence to be signaled. This
+ * is intended to give feedback to the fence signaler to aid in power
+ * management decisions, such as boosting GPU frequency if a periodic
+ * vblank deadline is approaching.
+ */
+void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+ if (fence->ops->set_deadline && !dma_fence_is_signaled(fence))
+ fence->ops->set_deadline(fence, deadline);
+}
+EXPORT_SYMBOL(dma_fence_set_deadline);
+
/**
* dma_fence_init - Initialize a custom fence.
* @fence: the fence to initialize
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 6ffb4b2c6371..9c809f0d5d0a 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -99,6 +99,7 @@ enum dma_fence_flag_bits {
DMA_FENCE_FLAG_SIGNALED_BIT,
DMA_FENCE_FLAG_TIMESTAMP_BIT,
DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+ DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
DMA_FENCE_FLAG_USER_BITS, /* must always be last member */
};

@@ -261,6 +262,19 @@ struct dma_fence_ops {
*/
void (*timeline_value_str)(struct dma_fence *fence,
char *str, int size);
+
+ /**
+ * @set_deadline:
+ *
+ * Callback to allow a fence waiter to inform the fence signaler of an
+ * upcoming deadline, such as vblank, by which point the waiter would
+ * prefer the fence to be signaled. This is intended to give feedback
+ * to the fence signaler to aid in power management decisions, such as
+ * boosting GPU frequency.
+ *
+ * This callback is optional.
+ */
+ void (*set_deadline)(struct dma_fence *fence, ktime_t deadline);
};

void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
@@ -586,6 +600,8 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
return ret < 0 ? ret : 0;
}

+void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline);
+
struct dma_fence *dma_fence_get_stub(void);
struct dma_fence *dma_fence_allocate_private_stub(void);
u64 dma_fence_context_alloc(unsigned num);
--
2.31.1

2021-09-03 19:31:55

by Rob Clark

Subject: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl

From: Rob Clark <[email protected]>

The initial purpose is for igt tests, but this would also be useful for
compositors that wait until close to vblank deadline to make decisions
about which frame to show.

Signed-off-by: Rob Clark <[email protected]>
---
drivers/dma-buf/sync_file.c | 19 +++++++++++++++++++
include/uapi/linux/sync_file.h | 20 ++++++++++++++++++++
2 files changed, 39 insertions(+)

diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
index 394e6e1e9686..f295772d5169 100644
--- a/drivers/dma-buf/sync_file.c
+++ b/drivers/dma-buf/sync_file.c
@@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
return ret;
}

+static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
+ unsigned long arg)
+{
+ struct sync_set_deadline ts;
+
+ if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
+ return -EFAULT;
+
+ if (ts.pad)
+ return -EINVAL;
+
+ dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
+
+ return 0;
+}
+
static long sync_file_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
{
@@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
case SYNC_IOC_FILE_INFO:
return sync_file_ioctl_fence_info(sync_file, arg);

+ case SYNC_IOC_SET_DEADLINE:
+ return sync_file_ioctl_set_deadline(sync_file, arg);
+
default:
return -ENOTTY;
}
diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
index ee2dcfb3d660..f67d4ffe7566 100644
--- a/include/uapi/linux/sync_file.h
+++ b/include/uapi/linux/sync_file.h
@@ -67,6 +67,18 @@ struct sync_file_info {
__u64 sync_fence_info;
};

+/**
+ * struct sync_set_deadline - set a deadline on a fence
+ * @tv_sec: seconds elapsed since epoch
+ * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec
+ * @pad: must be zero
+ */
+struct sync_set_deadline {
+ __s64 tv_sec;
+ __s32 tv_nsec;
+ __u32 pad;
+};
+
#define SYNC_IOC_MAGIC '>'

/**
@@ -95,4 +107,12 @@ struct sync_file_info {
*/
#define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)

+
+/**
+ * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
+ *
+ * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
+ */
+#define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
+
#endif /* _UAPI_LINUX_SYNC_H */
--
2.31.1

2021-09-03 19:31:55

by Rob Clark

Subject: [PATCH v3 6/9] dma-buf/fence-array: Add fence deadline support

From: Rob Clark <[email protected]>
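
Propagate the deadline to all of the fences in the array.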

Signed-off-by: Rob Clark <[email protected]>
---
drivers/dma-buf/dma-fence-array.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
index d3fbd950be94..8d194b09ee3d 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -119,12 +119,23 @@ static void dma_fence_array_release(struct dma_fence *fence)
dma_fence_free(fence);
}

+static void dma_fence_array_set_deadline(struct dma_fence *fence,
+ ktime_t deadline)
+{
+ struct dma_fence_array *array = to_dma_fence_array(fence);
+ unsigned i;
+
+ for (i = 0; i < array->num_fences; ++i)
+ dma_fence_set_deadline(array->fences[i], deadline);
+}
+
const struct dma_fence_ops dma_fence_array_ops = {
.get_driver_name = dma_fence_array_get_driver_name,
.get_timeline_name = dma_fence_array_get_timeline_name,
.enable_signaling = dma_fence_array_enable_signaling,
.signaled = dma_fence_array_signaled,
.release = dma_fence_array_release,
+ .set_deadline = dma_fence_array_set_deadline,
};
EXPORT_SYMBOL(dma_fence_array_ops);

--
2.31.1

2021-09-08 17:50:17

by Daniel Vetter

Subject: Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

On Fri, Sep 03, 2021 at 11:47:55AM -0700, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> As the finished fence is the one that is exposed to userspace, and
> therefore the one that other operations, like atomic update, would
> block on, we need to propagate the deadline from the finished
> fence to the actual hw fence.
>
> v2: Split into drm_sched_fence_set_parent() (ckoenig)
>
> Signed-off-by: Rob Clark <[email protected]>
> ---
> drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
> drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> include/drm/gpu_scheduler.h | 8 ++++++
> 3 files changed, 43 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> index bcea035cf4c6..4fc41a71d1c7 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
> dma_fence_put(&fence->scheduled);
> }
>
> +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
> + ktime_t deadline)
> +{
> + struct drm_sched_fence *fence = to_drm_sched_fence(f);
> + unsigned long flags;
> +
> + spin_lock_irqsave(&fence->lock, flags);
> +
> + /* If we already have an earlier deadline, keep it: */
> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
> + ktime_before(fence->deadline, deadline)) {
> + spin_unlock_irqrestore(&fence->lock, flags);
> + return;
> + }
> +
> + fence->deadline = deadline;
> + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
> +
> + spin_unlock_irqrestore(&fence->lock, flags);
> +
> + if (fence->parent)
> + dma_fence_set_deadline(fence->parent, deadline);
> +}
> +
> static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
> .get_driver_name = drm_sched_fence_get_driver_name,
> .get_timeline_name = drm_sched_fence_get_timeline_name,
> @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
> .get_driver_name = drm_sched_fence_get_driver_name,
> .get_timeline_name = drm_sched_fence_get_timeline_name,
> .release = drm_sched_fence_release_finished,
> + .set_deadline = drm_sched_fence_set_deadline_finished,
> };
>
> struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> }
> EXPORT_SYMBOL(to_drm_sched_fence);
>
> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> + struct dma_fence *fence)
> +{
> + s_fence->parent = dma_fence_get(fence);
> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
> + &s_fence->finished.flags))

Don't you need the spinlock here too to avoid races? test_bit is very
unordered, so guarantees nothing. Spinlock would need to be both around
->parent = and the test_bit.

Entirely aside, but there are discussions going on about preallocating
the hw fence somehow. If we do that, we could make the deadline
forwarding lockless here. Having a spinlock just to set the parent is a
bit annoying ...

The alternative is to do this locklessly with barriers and a _lot_ of
comments. It would be good to benchmark first whether the overhead
matters, though.
-Daniel

> + dma_fence_set_deadline(fence, s_fence->deadline);
> +}
> +
> struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> void *owner)
> {
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 595e47ff7d06..27bf0ac0625f 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -978,7 +978,7 @@ static int drm_sched_main(void *param)
> drm_sched_fence_scheduled(s_fence);
>
> if (!IS_ERR_OR_NULL(fence)) {
> - s_fence->parent = dma_fence_get(fence);
> + drm_sched_fence_set_parent(s_fence, fence);
> r = dma_fence_add_callback(fence, &sched_job->cb,
> drm_sched_job_done_cb);
> if (r == -ENOENT)
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 7f77a455722c..158ddd662469 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -238,6 +238,12 @@ struct drm_sched_fence {
> */
> struct dma_fence finished;
>
> + /**
> + * @deadline: deadline set on &drm_sched_fence.finished which
> + * potentially needs to be propagated to &drm_sched_fence.parent
> + */
> + ktime_t deadline;
> +
> /**
> * @parent: the fence returned by &drm_sched_backend_ops.run_job
> * when scheduling the job on hardware. We signal the
> @@ -505,6 +511,8 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> enum drm_sched_priority priority);
> bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>
> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> + struct dma_fence *fence);
> struct drm_sched_fence *drm_sched_fence_alloc(
> struct drm_sched_entity *s_entity, void *owner);
> void drm_sched_fence_init(struct drm_sched_fence *fence,
> --
> 2.31.1
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2021-09-08 17:51:32

by Daniel Vetter

Subject: Re: [PATCH v3 5/9] drm/msm: Add deadline based boost support

On Fri, Sep 03, 2021 at 11:47:56AM -0700, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> Signed-off-by: Rob Clark <[email protected]>

Why do you need a kthread_work here? Is this just to make sure you're
running at realtime prio? Maybe a comment to that effect would be good.
-Daniel

> ---
> drivers/gpu/drm/msm/msm_fence.c | 76 +++++++++++++++++++++++++++
> drivers/gpu/drm/msm/msm_fence.h | 20 +++++++
> drivers/gpu/drm/msm/msm_gpu.h | 1 +
> drivers/gpu/drm/msm/msm_gpu_devfreq.c | 20 +++++++
> 4 files changed, 117 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
> index f2cece542c3f..67c2a96e1c85 100644
> --- a/drivers/gpu/drm/msm/msm_fence.c
> +++ b/drivers/gpu/drm/msm/msm_fence.c
> @@ -8,6 +8,37 @@
>
> #include "msm_drv.h"
> #include "msm_fence.h"
> +#include "msm_gpu.h"
> +
> +static inline bool fence_completed(struct msm_fence_context *fctx, uint32_t fence);
> +
> +static struct msm_gpu *fctx2gpu(struct msm_fence_context *fctx)
> +{
> + struct msm_drm_private *priv = fctx->dev->dev_private;
> + return priv->gpu;
> +}
> +
> +static enum hrtimer_restart deadline_timer(struct hrtimer *t)
> +{
> + struct msm_fence_context *fctx = container_of(t,
> + struct msm_fence_context, deadline_timer);
> +
> + kthread_queue_work(fctx2gpu(fctx)->worker, &fctx->deadline_work);
> +
> + return HRTIMER_NORESTART;
> +}
> +
> +static void deadline_work(struct kthread_work *work)
> +{
> + struct msm_fence_context *fctx = container_of(work,
> + struct msm_fence_context, deadline_work);
> +
> + /* If deadline fence has already passed, nothing to do: */
> + if (fence_completed(fctx, fctx->next_deadline_fence))
> + return;
> +
> + msm_devfreq_boost(fctx2gpu(fctx), 2);
> +}
>
>
> struct msm_fence_context *
> @@ -26,6 +57,13 @@ msm_fence_context_alloc(struct drm_device *dev, volatile uint32_t *fenceptr,
> fctx->fenceptr = fenceptr;
> spin_lock_init(&fctx->spinlock);
>
> + hrtimer_init(&fctx->deadline_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
> + fctx->deadline_timer.function = deadline_timer;
> +
> + kthread_init_work(&fctx->deadline_work, deadline_work);
> +
> + fctx->next_deadline = ktime_get();
> +
> return fctx;
> }
>
> @@ -49,6 +87,8 @@ void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence)
> {
> spin_lock(&fctx->spinlock);
> fctx->completed_fence = max(fence, fctx->completed_fence);
> + if (fence_completed(fctx, fctx->next_deadline_fence))
> + hrtimer_cancel(&fctx->deadline_timer);
> spin_unlock(&fctx->spinlock);
> }
>
> @@ -79,10 +119,46 @@ static bool msm_fence_signaled(struct dma_fence *fence)
> return fence_completed(f->fctx, f->base.seqno);
> }
>
> +static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> +{
> + struct msm_fence *f = to_msm_fence(fence);
> + struct msm_fence_context *fctx = f->fctx;
> + unsigned long flags;
> + ktime_t now;
> +
> + spin_lock_irqsave(&fctx->spinlock, flags);
> + now = ktime_get();
> +
> + if (ktime_after(now, fctx->next_deadline) ||
> + ktime_before(deadline, fctx->next_deadline)) {
> + fctx->next_deadline = deadline;
> + fctx->next_deadline_fence =
> + max(fctx->next_deadline_fence, (uint32_t)fence->seqno);
> +
> + /*
> + * Set timer to trigger boost 3ms before deadline, or
> + * if we are already less than 3ms before the deadline
> + * schedule boost work immediately.
> + */
> + deadline = ktime_sub(deadline, ms_to_ktime(3));
> +
> + if (ktime_after(now, deadline)) {
> + kthread_queue_work(fctx2gpu(fctx)->worker,
> + &fctx->deadline_work);
> + } else {
> + hrtimer_start(&fctx->deadline_timer, deadline,
> + HRTIMER_MODE_ABS);
> + }
> + }
> +
> + spin_unlock_irqrestore(&fctx->spinlock, flags);
> +}
> +
> static const struct dma_fence_ops msm_fence_ops = {
> .get_driver_name = msm_fence_get_driver_name,
> .get_timeline_name = msm_fence_get_timeline_name,
> .signaled = msm_fence_signaled,
> + .set_deadline = msm_fence_set_deadline,
> };
>
> struct dma_fence *
> diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
> index 4783db528bcc..d34e853c555a 100644
> --- a/drivers/gpu/drm/msm/msm_fence.h
> +++ b/drivers/gpu/drm/msm/msm_fence.h
> @@ -50,6 +50,26 @@ struct msm_fence_context {
> volatile uint32_t *fenceptr;
>
> spinlock_t spinlock;
> +
> + /*
> + * TODO this doesn't really deal with multiple deadlines, like
> + * if userspace got multiple frames ahead.. OTOH atomic updates
> + * don't queue, so maybe that is ok
> + */
> +
> + /** next_deadline: Time of next deadline */
> + ktime_t next_deadline;
> +
> + /**
> + * next_deadline_fence:
> + *
> + * Fence value for next pending deadline. The deadline timer is
> + * canceled when this fence is signaled.
> + */
> + uint32_t next_deadline_fence;
> +
> + struct hrtimer deadline_timer;
> + struct kthread_work deadline_work;
> };
>
> struct msm_fence_context * msm_fence_context_alloc(struct drm_device *dev,
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index 0e4b45bff2e6..e031c9b495ed 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -425,6 +425,7 @@ void msm_devfreq_init(struct msm_gpu *gpu);
> void msm_devfreq_cleanup(struct msm_gpu *gpu);
> void msm_devfreq_resume(struct msm_gpu *gpu);
> void msm_devfreq_suspend(struct msm_gpu *gpu);
> +void msm_devfreq_boost(struct msm_gpu *gpu, unsigned factor);
> void msm_devfreq_active(struct msm_gpu *gpu);
> void msm_devfreq_idle(struct msm_gpu *gpu);
>
> diff --git a/drivers/gpu/drm/msm/msm_gpu_devfreq.c b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
> index 0a1ee20296a2..8a8d7b9028a3 100644
> --- a/drivers/gpu/drm/msm/msm_gpu_devfreq.c
> +++ b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
> @@ -144,6 +144,26 @@ void msm_devfreq_suspend(struct msm_gpu *gpu)
> devfreq_suspend_device(gpu->devfreq.devfreq);
> }
>
> +void msm_devfreq_boost(struct msm_gpu *gpu, unsigned factor)
> +{
> + struct msm_gpu_devfreq *df = &gpu->devfreq;
> + unsigned long freq;
> +
> + /*
> + * Hold devfreq lock to synchronize with get_dev_status()/
> + * target() callbacks
> + */
> + mutex_lock(&df->devfreq->lock);
> +
> + freq = get_freq(gpu);
> +
> + freq *= factor;
> +
> + msm_devfreq_target(&gpu->pdev->dev, &freq, 0);
> +
> + mutex_unlock(&df->devfreq->lock);
> +}
> +
> void msm_devfreq_active(struct msm_gpu *gpu)
> {
> struct msm_gpu_devfreq *df = &gpu->devfreq;
> --
> 2.31.1
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2021-09-08 17:51:38

by Daniel Vetter

Subject: Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl

On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> The initial purpose is for igt tests, but this would also be useful for
> compositors that wait until close to vblank deadline to make decisions
> about which frame to show.
>
> Signed-off-by: Rob Clark <[email protected]>

Needs userspace and I think ideally also some igts to make sure it works
and doesn't go boom.
-Daniel

> ---
> drivers/dma-buf/sync_file.c | 19 +++++++++++++++++++
> include/uapi/linux/sync_file.h | 20 ++++++++++++++++++++
> 2 files changed, 39 insertions(+)
>
> diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> index 394e6e1e9686..f295772d5169 100644
> --- a/drivers/dma-buf/sync_file.c
> +++ b/drivers/dma-buf/sync_file.c
> @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
> return ret;
> }
>
> +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
> + unsigned long arg)
> +{
> + struct sync_set_deadline ts;
> +
> + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
> + return -EFAULT;
> +
> + if (ts.pad)
> + return -EINVAL;
> +
> + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
> +
> + return 0;
> +}
> +
> static long sync_file_ioctl(struct file *file, unsigned int cmd,
> unsigned long arg)
> {
> @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
> case SYNC_IOC_FILE_INFO:
> return sync_file_ioctl_fence_info(sync_file, arg);
>
> + case SYNC_IOC_SET_DEADLINE:
> + return sync_file_ioctl_set_deadline(sync_file, arg);
> +
> default:
> return -ENOTTY;
> }
> diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
> index ee2dcfb3d660..f67d4ffe7566 100644
> --- a/include/uapi/linux/sync_file.h
> +++ b/include/uapi/linux/sync_file.h
> @@ -67,6 +67,18 @@ struct sync_file_info {
> __u64 sync_fence_info;
> };
>
> +/**
> + * struct sync_set_deadline - set a deadline on a fence
> + * @tv_sec: seconds elapsed since epoch
> + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec
> + * @pad: must be zero
> + */
> +struct sync_set_deadline {
> + __s64 tv_sec;
> + __s32 tv_nsec;
> + __u32 pad;
> +};
> +
> #define SYNC_IOC_MAGIC '>'
>
> /**
> @@ -95,4 +107,12 @@ struct sync_file_info {
> */
> #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
>
> +
> +/**
> + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
> + *
> + * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
> + */
> +#define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
> +
> #endif /* _UAPI_LINUX_SYNC_H */
> --
> 2.31.1
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2021-09-08 17:55:43

by Rob Clark

Subject: Re: [PATCH v3 5/9] drm/msm: Add deadline based boost support

On Wed, Sep 8, 2021 at 10:48 AM Daniel Vetter <[email protected]> wrote:
>
> On Fri, Sep 03, 2021 at 11:47:56AM -0700, Rob Clark wrote:
> > From: Rob Clark <[email protected]>
> >
> > Signed-off-by: Rob Clark <[email protected]>
>
> Why do you need a kthread_work here? Is this just to make sure you're
> running at realtime prio? Maybe a comment to that effect would be good.

Mostly because we are already using a kthread_worker for things the
GPU needs to kick off to a different context.. but I think this is
something we'd want at a realtime prio

BR,
-R

> -Daniel
>
> > ---
> > drivers/gpu/drm/msm/msm_fence.c | 76 +++++++++++++++++++++++++++
> > drivers/gpu/drm/msm/msm_fence.h | 20 +++++++
> > drivers/gpu/drm/msm/msm_gpu.h | 1 +
> > drivers/gpu/drm/msm/msm_gpu_devfreq.c | 20 +++++++
> > 4 files changed, 117 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
> > index f2cece542c3f..67c2a96e1c85 100644
> > --- a/drivers/gpu/drm/msm/msm_fence.c
> > +++ b/drivers/gpu/drm/msm/msm_fence.c
> > @@ -8,6 +8,37 @@
> >
> > #include "msm_drv.h"
> > #include "msm_fence.h"
> > +#include "msm_gpu.h"
> > +
> > +static inline bool fence_completed(struct msm_fence_context *fctx, uint32_t fence);
> > +
> > +static struct msm_gpu *fctx2gpu(struct msm_fence_context *fctx)
> > +{
> > + struct msm_drm_private *priv = fctx->dev->dev_private;
> > + return priv->gpu;
> > +}
> > +
> > +static enum hrtimer_restart deadline_timer(struct hrtimer *t)
> > +{
> > + struct msm_fence_context *fctx = container_of(t,
> > + struct msm_fence_context, deadline_timer);
> > +
> > + kthread_queue_work(fctx2gpu(fctx)->worker, &fctx->deadline_work);
> > +
> > + return HRTIMER_NORESTART;
> > +}
> > +
> > +static void deadline_work(struct kthread_work *work)
> > +{
> > + struct msm_fence_context *fctx = container_of(work,
> > + struct msm_fence_context, deadline_work);
> > +
> > + /* If deadline fence has already passed, nothing to do: */
> > + if (fence_completed(fctx, fctx->next_deadline_fence))
> > + return;
> > +
> > + msm_devfreq_boost(fctx2gpu(fctx), 2);
> > +}
> >
> >
> > struct msm_fence_context *
> > @@ -26,6 +57,13 @@ msm_fence_context_alloc(struct drm_device *dev, volatile uint32_t *fenceptr,
> > fctx->fenceptr = fenceptr;
> > spin_lock_init(&fctx->spinlock);
> >
> > + hrtimer_init(&fctx->deadline_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
> > + fctx->deadline_timer.function = deadline_timer;
> > +
> > + kthread_init_work(&fctx->deadline_work, deadline_work);
> > +
> > + fctx->next_deadline = ktime_get();
> > +
> > return fctx;
> > }
> >
> > @@ -49,6 +87,8 @@ void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence)
> > {
> > spin_lock(&fctx->spinlock);
> > fctx->completed_fence = max(fence, fctx->completed_fence);
> > + if (fence_completed(fctx, fctx->next_deadline_fence))
> > + hrtimer_cancel(&fctx->deadline_timer);
> > spin_unlock(&fctx->spinlock);
> > }
> >
> > @@ -79,10 +119,46 @@ static bool msm_fence_signaled(struct dma_fence *fence)
> > return fence_completed(f->fctx, f->base.seqno);
> > }
> >
> > +static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> > +{
> > + struct msm_fence *f = to_msm_fence(fence);
> > + struct msm_fence_context *fctx = f->fctx;
> > + unsigned long flags;
> > + ktime_t now;
> > +
> > + spin_lock_irqsave(&fctx->spinlock, flags);
> > + now = ktime_get();
> > +
> > + if (ktime_after(now, fctx->next_deadline) ||
> > + ktime_before(deadline, fctx->next_deadline)) {
> > + fctx->next_deadline = deadline;
> > + fctx->next_deadline_fence =
> > + max(fctx->next_deadline_fence, (uint32_t)fence->seqno);
> > +
> > + /*
> > + * Set timer to trigger boost 3ms before deadline, or
> > + * if we are already less than 3ms before the deadline
> > + * schedule boost work immediately.
> > + */
> > + deadline = ktime_sub(deadline, ms_to_ktime(3));
> > +
> > + if (ktime_after(now, deadline)) {
> > + kthread_queue_work(fctx2gpu(fctx)->worker,
> > + &fctx->deadline_work);
> > + } else {
> > + hrtimer_start(&fctx->deadline_timer, deadline,
> > + HRTIMER_MODE_ABS);
> > + }
> > + }
> > +
> > + spin_unlock_irqrestore(&fctx->spinlock, flags);
> > +}
> > +
> > static const struct dma_fence_ops msm_fence_ops = {
> > .get_driver_name = msm_fence_get_driver_name,
> > .get_timeline_name = msm_fence_get_timeline_name,
> > .signaled = msm_fence_signaled,
> > + .set_deadline = msm_fence_set_deadline,
> > };
> >
> > struct dma_fence *
> > diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
> > index 4783db528bcc..d34e853c555a 100644
> > --- a/drivers/gpu/drm/msm/msm_fence.h
> > +++ b/drivers/gpu/drm/msm/msm_fence.h
> > @@ -50,6 +50,26 @@ struct msm_fence_context {
> > volatile uint32_t *fenceptr;
> >
> > spinlock_t spinlock;
> > +
> > + /*
> > + * TODO this doesn't really deal with multiple deadlines, like
> > + * if userspace got multiple frames ahead.. OTOH atomic updates
> > + * don't queue, so maybe that is ok
> > + */
> > +
> > + /** next_deadline: Time of next deadline */
> > + ktime_t next_deadline;
> > +
> > + /**
> > + * next_deadline_fence:
> > + *
> > + * Fence value for next pending deadline. The deadline timer is
> > + * canceled when this fence is signaled.
> > + */
> > + uint32_t next_deadline_fence;
> > +
> > + struct hrtimer deadline_timer;
> > + struct kthread_work deadline_work;
> > };
> >
> > struct msm_fence_context * msm_fence_context_alloc(struct drm_device *dev,
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> > index 0e4b45bff2e6..e031c9b495ed 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.h
> > +++ b/drivers/gpu/drm/msm/msm_gpu.h
> > @@ -425,6 +425,7 @@ void msm_devfreq_init(struct msm_gpu *gpu);
> > void msm_devfreq_cleanup(struct msm_gpu *gpu);
> > void msm_devfreq_resume(struct msm_gpu *gpu);
> > void msm_devfreq_suspend(struct msm_gpu *gpu);
> > +void msm_devfreq_boost(struct msm_gpu *gpu, unsigned factor);
> > void msm_devfreq_active(struct msm_gpu *gpu);
> > void msm_devfreq_idle(struct msm_gpu *gpu);
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gpu_devfreq.c b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
> > index 0a1ee20296a2..8a8d7b9028a3 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu_devfreq.c
> > +++ b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
> > @@ -144,6 +144,26 @@ void msm_devfreq_suspend(struct msm_gpu *gpu)
> > devfreq_suspend_device(gpu->devfreq.devfreq);
> > }
> >
> > +void msm_devfreq_boost(struct msm_gpu *gpu, unsigned factor)
> > +{
> > + struct msm_gpu_devfreq *df = &gpu->devfreq;
> > + unsigned long freq;
> > +
> > + /*
> > + * Hold devfreq lock to synchronize with get_dev_status()/
> > + * target() callbacks
> > + */
> > + mutex_lock(&df->devfreq->lock);
> > +
> > + freq = get_freq(gpu);
> > +
> > + freq *= factor;
> > +
> > + msm_devfreq_target(&gpu->pdev->dev, &freq, 0);
> > +
> > + mutex_unlock(&df->devfreq->lock);
> > +}
> > +
> > void msm_devfreq_active(struct msm_gpu *gpu)
> > {
> > struct msm_gpu_devfreq *df = &gpu->devfreq;
> > --
> > 2.31.1
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

2021-09-08 17:57:54

by Daniel Vetter

Subject: Re: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support

On Fri, Sep 03, 2021 at 11:47:58AM -0700, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> Signed-off-by: Rob Clark <[email protected]>
> ---
> drivers/dma-buf/dma-fence-chain.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c
> index 1b4cb3e5cec9..736a9ad3ea6d 100644
> --- a/drivers/dma-buf/dma-fence-chain.c
> +++ b/drivers/dma-buf/dma-fence-chain.c
> @@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence *fence)
> dma_fence_free(fence);
> }
>
> +
> +static void dma_fence_chain_set_deadline(struct dma_fence *fence,
> + ktime_t deadline)
> +{
> + dma_fence_chain_for_each(fence, fence) {
> + struct dma_fence_chain *chain = to_dma_fence_chain(fence);
> + struct dma_fence *f = chain ? chain->fence : fence;

Doesn't this just end up calling set_deadline on a chain, potentially
resulting in recursion? Also, I don't think this should ever happen; why
did you add that?
-Daniel

> +
> + dma_fence_set_deadline(f, deadline);
> + }
> +}
> +
> const struct dma_fence_ops dma_fence_chain_ops = {
> .use_64bit_seqno = true,
> .get_driver_name = dma_fence_chain_get_driver_name,
> @@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = {
> .enable_signaling = dma_fence_chain_enable_signaling,
> .signaled = dma_fence_chain_signaled,
> .release = dma_fence_chain_release,
> + .set_deadline = dma_fence_chain_set_deadline,
> };
> EXPORT_SYMBOL(dma_fence_chain_ops);
>
> --
> 2.31.1
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2021-09-08 17:58:27

by Daniel Vetter

Subject: Re: [PATCH v3 1/9] dma-fence: Add deadline awareness

On Fri, Sep 03, 2021 at 11:47:52AM -0700, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> Add a way to hint to the fence signaler of an upcoming deadline, such as
> vblank, which the fence waiter would prefer not to miss. This is to aid
> the fence signaler in making power management decisions, like boosting
> frequency as the deadline approaches, and awareness of missed deadlines
> so that they can be factored into the frequency scaling.
>
> v2: Drop dma_fence::deadline and related logic to filter duplicate
> deadlines, to avoid increasing dma_fence size. The fence-context
> implementation will need similar logic to track deadlines of all
> the fences on the same timeline. [ckoenig]
>
> Signed-off-by: Rob Clark <[email protected]>
> Reviewed-by: Christian König <[email protected]>
> Signed-off-by: Rob Clark <[email protected]>
> ---
> drivers/dma-buf/dma-fence.c | 20 ++++++++++++++++++++
> include/linux/dma-fence.h | 16 ++++++++++++++++
> 2 files changed, 36 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index ce0f5eff575d..1f444863b94d 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -910,6 +910,26 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
> }
> EXPORT_SYMBOL(dma_fence_wait_any_timeout);
>
> +
> +/**
> + * dma_fence_set_deadline - set desired fence-wait deadline
> + * @fence: the fence that is to be waited on
> + * @deadline: the time by which the waiter hopes for the fence to be
> + * signaled
> + *
> + * Inform the fence signaler of an upcoming deadline, such as vblank, by
> + * which point the waiter would prefer the fence to be signaled. This
> + * is intended to give feedback to the fence signaler to aid in power
> + * management decisions, such as boosting GPU frequency if a periodic
> + * vblank deadline is approaching.
> + */
> +void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> +{
> + if (fence->ops->set_deadline && !dma_fence_is_signaled(fence))
> + fence->ops->set_deadline(fence, deadline);
> +}
> +EXPORT_SYMBOL(dma_fence_set_deadline);
> +
> /**
> * dma_fence_init - Initialize a custom fence.
> * @fence: the fence to initialize
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index 6ffb4b2c6371..9c809f0d5d0a 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -99,6 +99,7 @@ enum dma_fence_flag_bits {
> DMA_FENCE_FLAG_SIGNALED_BIT,
> DMA_FENCE_FLAG_TIMESTAMP_BIT,
> DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
> + DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
> DMA_FENCE_FLAG_USER_BITS, /* must always be last member */
> };
>
> @@ -261,6 +262,19 @@ struct dma_fence_ops {
> */
> void (*timeline_value_str)(struct dma_fence *fence,
> char *str, int size);
> +
> + /**
> + * @set_deadline:
> + *
> + * Callback to allow a fence waiter to inform the fence signaler of an
> + * upcoming deadline, such as vblank, by which point the waiter would
> + * prefer the fence to be signaled. This is intended to give feedback
> + * to the fence signaler to aid in power management decisions, such as
> + * boosting GPU frequency.

Please add here that this callback is called without &dma_fence.lock held,
and that locking is up to callers if they have some state to manage.

I realized that while scratching some heads over your later patches.
-Daniel

> + *
> + * This callback is optional.
> + */
> + void (*set_deadline)(struct dma_fence *fence, ktime_t deadline);
> };
>
> void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
> @@ -586,6 +600,8 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
> return ret < 0 ? ret : 0;
> }
>
> +void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline);
> +
> struct dma_fence *dma_fence_get_stub(void);
> struct dma_fence *dma_fence_allocate_private_stub(void);
> u64 dma_fence_context_alloc(unsigned num);
> --
> 2.31.1
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2021-09-08 18:22:24

by Rob Clark

Subject: Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl

On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter <[email protected]> wrote:
>
> On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote:
> > From: Rob Clark <[email protected]>
> >
> > The initial purpose is for igt tests, but this would also be useful for
> > compositors that wait until close to vblank deadline to make decisions
> > about which frame to show.
> >
> > Signed-off-by: Rob Clark <[email protected]>
>
> Needs userspace and I think ideally also some igts to make sure it works
> and doesn't go boom.

See cover-letter.. there are igt tests, although currently that is the
only user.

I'd be ok to otherwise initially restrict this and the sw_sync UABI
(CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are
both needed by the igt tests

BR,
-R

> -Daniel
>
> > ---
> > drivers/dma-buf/sync_file.c | 19 +++++++++++++++++++
> > include/uapi/linux/sync_file.h | 20 ++++++++++++++++++++
> > 2 files changed, 39 insertions(+)
> >
> > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> > index 394e6e1e9686..f295772d5169 100644
> > --- a/drivers/dma-buf/sync_file.c
> > +++ b/drivers/dma-buf/sync_file.c
> > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
> > return ret;
> > }
> >
> > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
> > + unsigned long arg)
> > +{
> > + struct sync_set_deadline ts;
> > +
> > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
> > + return -EFAULT;
> > +
> > + if (ts.pad)
> > + return -EINVAL;
> > +
> > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
> > +
> > + return 0;
> > +}
> > +
> > static long sync_file_ioctl(struct file *file, unsigned int cmd,
> > unsigned long arg)
> > {
> > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
> > case SYNC_IOC_FILE_INFO:
> > return sync_file_ioctl_fence_info(sync_file, arg);
> >
> > + case SYNC_IOC_SET_DEADLINE:
> > + return sync_file_ioctl_set_deadline(sync_file, arg);
> > +
> > default:
> > return -ENOTTY;
> > }
> > diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
> > index ee2dcfb3d660..f67d4ffe7566 100644
> > --- a/include/uapi/linux/sync_file.h
> > +++ b/include/uapi/linux/sync_file.h
> > @@ -67,6 +67,18 @@ struct sync_file_info {
> > __u64 sync_fence_info;
> > };
> >
> > +/**
> > + * struct sync_set_deadline - set a deadline on a fence
> > + * @tv_sec: seconds elapsed since epoch
> > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec
> > + * @pad: must be zero
> > + */
> > +struct sync_set_deadline {
> > + __s64 tv_sec;
> > + __s32 tv_nsec;
> > + __u32 pad;
> > +};
> > +
> > #define SYNC_IOC_MAGIC '>'
> >
> > /**
> > @@ -95,4 +107,12 @@ struct sync_file_info {
> > */
> > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
> >
> > +
> > +/**
> > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
> > + *
> > + * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
> > + */
> > +#define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
> > +
> > #endif /* _UAPI_LINUX_SYNC_H */
> > --
> > 2.31.1
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

2021-09-08 19:03:53

by Daniel Vetter

Subject: Re: [PATCH v3 6/9] dma-buf/fence-array: Add fence deadline support

On Fri, Sep 03, 2021 at 11:47:57AM -0700, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> Signed-off-by: Rob Clark <[email protected]>
> ---
> drivers/dma-buf/dma-fence-array.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
> index d3fbd950be94..8d194b09ee3d 100644
> --- a/drivers/dma-buf/dma-fence-array.c
> +++ b/drivers/dma-buf/dma-fence-array.c
> @@ -119,12 +119,23 @@ static void dma_fence_array_release(struct dma_fence *fence)
> dma_fence_free(fence);
> }
>
> +static void dma_fence_array_set_deadline(struct dma_fence *fence,
> + ktime_t deadline)
> +{
> + struct dma_fence_array *array = to_dma_fence_array(fence);
> + unsigned i;
> +
> + for (i = 0; i < array->num_fences; ++i)
> + dma_fence_set_deadline(array->fences[i], deadline);

Hm I wonder whether this can go wrong, and whether we need Christian's
massive fence iterator that I've seen flying around. If you nest these
things too much it could all go wrong I think. I looked at other users
which inspect dma_fence_array and none of them have a risk for unbounded
recursion.

Maybe check with Christian.
-Daniel


> +}
> +
> const struct dma_fence_ops dma_fence_array_ops = {
> .get_driver_name = dma_fence_array_get_driver_name,
> .get_timeline_name = dma_fence_array_get_timeline_name,
> .enable_signaling = dma_fence_array_enable_signaling,
> .signaled = dma_fence_array_signaled,
> .release = dma_fence_array_release,
> + .set_deadline = dma_fence_array_set_deadline,
> };
> EXPORT_SYMBOL(dma_fence_array_ops);
>
> --
> 2.31.1
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2021-09-08 19:05:00

by Rob Clark

Subject: Re: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support

On Wed, Sep 8, 2021 at 10:54 AM Daniel Vetter <[email protected]> wrote:
>
> On Fri, Sep 03, 2021 at 11:47:58AM -0700, Rob Clark wrote:
> > From: Rob Clark <[email protected]>
> >
> > Signed-off-by: Rob Clark <[email protected]>
> > ---
> > drivers/dma-buf/dma-fence-chain.c | 13 +++++++++++++
> > 1 file changed, 13 insertions(+)
> >
> > diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c
> > index 1b4cb3e5cec9..736a9ad3ea6d 100644
> > --- a/drivers/dma-buf/dma-fence-chain.c
> > +++ b/drivers/dma-buf/dma-fence-chain.c
> > @@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence *fence)
> > dma_fence_free(fence);
> > }
> >
> > +
> > +static void dma_fence_chain_set_deadline(struct dma_fence *fence,
> > + ktime_t deadline)
> > +{
> > + dma_fence_chain_for_each(fence, fence) {
> > + struct dma_fence_chain *chain = to_dma_fence_chain(fence);
> > + struct dma_fence *f = chain ? chain->fence : fence;
>
> Doesn't this just end up calling set_deadline on a chain, potentially
> resulting in recursion? Also I don't think this should ever happen, why
> did you add that?

Tbh the fence-chain was the part I was a bit fuzzy about, and the main
reason I added igt tests. The iteration is similar to how, for ex,
dma_fence_chain_signaled() works, and according to the igt test it does
what was intended.

BR,
-R

> -Daniel
>
> > +
> > + dma_fence_set_deadline(f, deadline);
> > + }
> > +}
> > +
> > const struct dma_fence_ops dma_fence_chain_ops = {
> > .use_64bit_seqno = true,
> > .get_driver_name = dma_fence_chain_get_driver_name,
> > @@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = {
> > .enable_signaling = dma_fence_chain_enable_signaling,
> > .signaled = dma_fence_chain_signaled,
> > .release = dma_fence_chain_release,
> > + .set_deadline = dma_fence_chain_set_deadline,
> > };
> > EXPORT_SYMBOL(dma_fence_chain_ops);
> >
> > --
> > 2.31.1
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

2021-09-08 19:07:47

by Daniel Vetter

Subject: Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl

On Wed, Sep 08, 2021 at 11:23:42AM -0700, Rob Clark wrote:
> On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter <[email protected]> wrote:
> >
> > On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote:
> > > From: Rob Clark <[email protected]>
> > >
> > > The initial purpose is for igt tests, but this would also be useful for
> > > compositors that wait until close to vblank deadline to make decisions
> > > about which frame to show.
> > >
> > > Signed-off-by: Rob Clark <[email protected]>
> >
> > Needs userspace and I think ideally also some igts to make sure it works
> > and doesn't go boom.
>
> See cover-letter.. there are igt tests, although currently that is the
> only user.

Ah sorry, missed that. It would be good to record that in the commit that
adds the uapi too. git blame doesn't find cover letters at all, unlike on
gitlab where you get the MR with everything.

Ok there is the Link: thing, but since that only points at the last
version all the interesting discussion is still usually lost, so I tend to
not bother looking there.

> I'd be ok to otherwise initially restrict this and the sw_sync UABI
> (CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are
> both needed by the igt tests

Hm really awkward, uapi for igts in cross vendor stuff like this isn't
great. I think hiding it in vgem is semi-ok (we have fences there
already). But it's all a bit silly ...

For the tests, should we instead have a selftest/Kunit thing to exercise
this stuff? igt probably not quite the right thing. Or combine with a page
flip if you want to test msm.
-Daniel

>
> BR,
> -R
>
> > -Daniel
> >
> > > ---
> > > drivers/dma-buf/sync_file.c | 19 +++++++++++++++++++
> > > include/uapi/linux/sync_file.h | 20 ++++++++++++++++++++
> > > 2 files changed, 39 insertions(+)
> > >
> > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> > > index 394e6e1e9686..f295772d5169 100644
> > > --- a/drivers/dma-buf/sync_file.c
> > > +++ b/drivers/dma-buf/sync_file.c
> > > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
> > > return ret;
> > > }
> > >
> > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
> > > + unsigned long arg)
> > > +{
> > > + struct sync_set_deadline ts;
> > > +
> > > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
> > > + return -EFAULT;
> > > +
> > > + if (ts.pad)
> > > + return -EINVAL;
> > > +
> > > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
> > > +
> > > + return 0;
> > > +}
> > > +
> > > static long sync_file_ioctl(struct file *file, unsigned int cmd,
> > > unsigned long arg)
> > > {
> > > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
> > > case SYNC_IOC_FILE_INFO:
> > > return sync_file_ioctl_fence_info(sync_file, arg);
> > >
> > > + case SYNC_IOC_SET_DEADLINE:
> > > + return sync_file_ioctl_set_deadline(sync_file, arg);
> > > +
> > > default:
> > > return -ENOTTY;
> > > }
> > > diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
> > > index ee2dcfb3d660..f67d4ffe7566 100644
> > > --- a/include/uapi/linux/sync_file.h
> > > +++ b/include/uapi/linux/sync_file.h
> > > @@ -67,6 +67,18 @@ struct sync_file_info {
> > > __u64 sync_fence_info;
> > > };
> > >
> > > +/**
> > > + * struct sync_set_deadline - set a deadline on a fence
> > > + * @tv_sec: seconds elapsed since epoch
> > > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec
> > > + * @pad: must be zero
> > > + */
> > > +struct sync_set_deadline {
> > > + __s64 tv_sec;
> > > + __s32 tv_nsec;
> > > + __u32 pad;
> > > +};
> > > +
> > > #define SYNC_IOC_MAGIC '>'
> > >
> > > /**
> > > @@ -95,4 +107,12 @@ struct sync_file_info {
> > > */
> > > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
> > >
> > > +
> > > +/**
> > > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
> > > + *
> > > + * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
> > > + */
> > > +#define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
> > > +
> > > #endif /* _UAPI_LINUX_SYNC_H */
> > > --
> > > 2.31.1
> > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2021-09-08 20:03:03

by Daniel Vetter

Subject: Re: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support

On Wed, Sep 08, 2021 at 11:19:15AM -0700, Rob Clark wrote:
> On Wed, Sep 8, 2021 at 10:54 AM Daniel Vetter <[email protected]> wrote:
> >
> > On Fri, Sep 03, 2021 at 11:47:58AM -0700, Rob Clark wrote:
> > > From: Rob Clark <[email protected]>
> > >
> > > Signed-off-by: Rob Clark <[email protected]>
> > > ---
> > > drivers/dma-buf/dma-fence-chain.c | 13 +++++++++++++
> > > 1 file changed, 13 insertions(+)
> > >
> > > diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c
> > > index 1b4cb3e5cec9..736a9ad3ea6d 100644
> > > --- a/drivers/dma-buf/dma-fence-chain.c
> > > +++ b/drivers/dma-buf/dma-fence-chain.c
> > > @@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence *fence)
> > > dma_fence_free(fence);
> > > }
> > >
> > > +
> > > +static void dma_fence_chain_set_deadline(struct dma_fence *fence,
> > > + ktime_t deadline)
> > > +{
> > > + dma_fence_chain_for_each(fence, fence) {
> > > + struct dma_fence_chain *chain = to_dma_fence_chain(fence);
> > > + struct dma_fence *f = chain ? chain->fence : fence;
> >
> > Doesn't this just end up calling set_deadline on a chain, potentially
> > resulting in recursion? Also I don't think this should ever happen, why
> > did you add that?
>
> Tbh the fence-chain was the part I was a bit fuzzy about, and the main
> reason I added igt tests. The iteration is similar to how, for ex,
> dma_fence_chain_signaled() works, and according to the igt test it does
> what was intended.

Huh indeed. Maybe something we should fix, like why does the
dma_fence_chain_for_each not give you the upcast chain pointer ... I guess
this also needs more Christian and less me.
-Daniel

>
> BR,
> -R
>
> > -Daniel
> >
> > > +
> > > + dma_fence_set_deadline(f, deadline);
> > > + }
> > > +}
> > > +
> > > const struct dma_fence_ops dma_fence_chain_ops = {
> > > .use_64bit_seqno = true,
> > > .get_driver_name = dma_fence_chain_get_driver_name,
> > > @@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = {
> > > .enable_signaling = dma_fence_chain_enable_signaling,
> > > .signaled = dma_fence_chain_signaled,
> > > .release = dma_fence_chain_release,
> > > + .set_deadline = dma_fence_chain_set_deadline,
> > > };
> > > EXPORT_SYMBOL(dma_fence_chain_ops);
> > >
> > > --
> > > 2.31.1
> > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2021-09-08 20:10:35

by Rob Clark

Subject: Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl

On Wed, Sep 8, 2021 at 11:49 AM Daniel Vetter <[email protected]> wrote:
>
> On Wed, Sep 08, 2021 at 11:23:42AM -0700, Rob Clark wrote:
> > On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter <[email protected]> wrote:
> > >
> > > On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote:
> > > > From: Rob Clark <[email protected]>
> > > >
> > > > The initial purpose is for igt tests, but this would also be useful for
> > > > compositors that wait until close to vblank deadline to make decisions
> > > > about which frame to show.
> > > >
> > > > Signed-off-by: Rob Clark <[email protected]>
> > >
> > > Needs userspace and I think ideally also some igts to make sure it works
> > > and doesn't go boom.
> >
> > See cover-letter.. there are igt tests, although currently that is the
> > only user.
>
> Ah sorry, missed that. It would be good to record that in the commit that
> adds the uapi too. git blame doesn't find cover letters at all, unlike on
> gitlab where you get the MR with everything.
>
> Ok there is the Link: thing, but since that only points at the last
> version all the interesting discussion is still usually lost, so I tend to
> not bother looking there.
>
> > I'd be ok to otherwise initially restrict this and the sw_sync UABI
> > (CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are
> > both needed by the igt tests
>
> Hm really awkward, uapi for igts in cross vendor stuff like this isn't
> great. I think hiding it in vgem is semi-ok (we have fences there
> already). But it's all a bit silly ...
>
> For the tests, should we instead have a selftest/Kunit thing to exercise
> this stuff? igt probably not quite the right thing. Or combine with a page
> flip if you want to test msm.

Hmm, IIRC we have used CONFIG_BROKEN or something along those lines
for UABI in other places where we weren't willing to commit to it yet?

I suppose if we had to I could make this a sw_sync ioctl instead. But
OTOH there are kind of a limited # of ways this ioctl could look. And
we already know that at least some wayland compositors are going to
want this.

I guess I can look at non-igt options. But the igt test is already a
pretty convenient way to contrive situations (like loops, which is a
thing I need to add)

BR,
-R


> -Daniel
>
> >
> > BR,
> > -R
> >
> > > -Daniel
> > >
> > > > ---
> > > > drivers/dma-buf/sync_file.c | 19 +++++++++++++++++++
> > > > include/uapi/linux/sync_file.h | 20 ++++++++++++++++++++
> > > > 2 files changed, 39 insertions(+)
> > > >
> > > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> > > > index 394e6e1e9686..f295772d5169 100644
> > > > --- a/drivers/dma-buf/sync_file.c
> > > > +++ b/drivers/dma-buf/sync_file.c
> > > > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
> > > > return ret;
> > > > }
> > > >
> > > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
> > > > + unsigned long arg)
> > > > +{
> > > > + struct sync_set_deadline ts;
> > > > +
> > > > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
> > > > + return -EFAULT;
> > > > +
> > > > + if (ts.pad)
> > > > + return -EINVAL;
> > > > +
> > > > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > static long sync_file_ioctl(struct file *file, unsigned int cmd,
> > > > unsigned long arg)
> > > > {
> > > > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
> > > > case SYNC_IOC_FILE_INFO:
> > > > return sync_file_ioctl_fence_info(sync_file, arg);
> > > >
> > > > + case SYNC_IOC_SET_DEADLINE:
> > > > + return sync_file_ioctl_set_deadline(sync_file, arg);
> > > > +
> > > > default:
> > > > return -ENOTTY;
> > > > }
> > > > diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
> > > > index ee2dcfb3d660..f67d4ffe7566 100644
> > > > --- a/include/uapi/linux/sync_file.h
> > > > +++ b/include/uapi/linux/sync_file.h
> > > > @@ -67,6 +67,18 @@ struct sync_file_info {
> > > > __u64 sync_fence_info;
> > > > };
> > > >
> > > > +/**
> > > > + * struct sync_set_deadline - set a deadline on a fence
> > > > + * @tv_sec: seconds elapsed since epoch
> > > > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec
> > > > + * @pad: must be zero
> > > > + */
> > > > +struct sync_set_deadline {
> > > > + __s64 tv_sec;
> > > > + __s32 tv_nsec;
> > > > + __u32 pad;
> > > > +};
> > > > +
> > > > #define SYNC_IOC_MAGIC '>'
> > > >
> > > > /**
> > > > @@ -95,4 +107,12 @@ struct sync_file_info {
> > > > */
> > > > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
> > > >
> > > > +
> > > > +/**
> > > > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
> > > > + *
> > > > + * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
> > > > + */
> > > > +#define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
> > > > +
> > > > #endif /* _UAPI_LINUX_SYNC_H */
> > > > --
> > > > 2.31.1
> > > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

2021-09-08 21:14:32

by Daniel Vetter

Subject: Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl

On Wed, Sep 8, 2021 at 9:36 PM Rob Clark <[email protected]> wrote:
> On Wed, Sep 8, 2021 at 11:49 AM Daniel Vetter <[email protected]> wrote:
> > On Wed, Sep 08, 2021 at 11:23:42AM -0700, Rob Clark wrote:
> > > On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter <[email protected]> wrote:
> > > >
> > > > On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote:
> > > > > From: Rob Clark <[email protected]>
> > > > >
> > > > > The initial purpose is for igt tests, but this would also be useful for
> > > > > compositors that wait until close to vblank deadline to make decisions
> > > > > about which frame to show.
> > > > >
> > > > > Signed-off-by: Rob Clark <[email protected]>
> > > >
> > > > Needs userspace and I think ideally also some igts to make sure it works
> > > > and doesn't go boom.
> > >
> > > See cover-letter.. there are igt tests, although currently that is the
> > > only user.
> >
> > Ah sorry, missed that. It would be good to record that in the commit that
> > adds the uapi too. git blame doesn't find cover letters at all, unlike on
> > gitlab where you get the MR with everything.
> >
> > Ok there is the Link: thing, but since that only points at the last
> > version all the interesting discussion is still usually lost, so I tend to
> > not bother looking there.
> >
> > > I'd be ok to otherwise initially restrict this and the sw_sync UABI
> > > (CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are
> > > both needed by the igt tests
> >
> > Hm really awkward, uapi for igts in cross vendor stuff like this isn't
> > great. I think hiding it in vgem is semi-ok (we have fences there
> > already). But it's all a bit silly ...
> >
> > For the tests, should we instead have a selftest/Kunit thing to exercise
> > this stuff? igt probably not quite the right thing. Or combine with a page
> > flip if you want to test msm.
>
> Hmm, IIRC we have used CONFIG_BROKEN or something along those lines
> for UABI in other places where we weren't willing to commit to it yet?
>
> I suppose if we had to I could make this a sw_sync ioctl instead. But
> OTOH there are kind of a limited # of ways this ioctl could look. And
> we already know that at least some wayland compositors are going to
> want this.

Hm I was trying to think up a few ways this could work, but didn't
come up with anything reasonable. Forcing the compositor to boost the
entire chain (for gl composited primary plane fallback) is something
the kernel can easily do too. Also only makes sense for priority
boost, not so much for clock boosting, since clock boosting only
really needs the final element to be boosted.

> I guess I can look at non-igt options. But the igt test is already a
> pretty convenient way to contrive situations (like loops, which is a
> thing I need to add)

Yeah it's definitely very useful for testing ... One option could be a
hacky debugfs interface, where you write an fd number and deadline and
the debugfs write function does the deadline setting. Horrible, but
since it's debugfs no one ever cares. That's at least where we're
hiding all the i915 hacks that igts need.
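
Quick sketch of what I mean (completely untested, names invented), keying
off the sync_file fd since sync_file_get_fence() already exists:

static ssize_t deadline_write(struct file *file, const char __user *ubuf,
			      size_t len, loff_t *ppos)
{
	unsigned long long deadline_ns;
	struct dma_fence *fence;
	char buf[32];
	int fd;

	if (len >= sizeof(buf))
		return -EINVAL;
	if (copy_from_user(buf, ubuf, len))
		return -EFAULT;
	buf[len] = '\0';

	/* usage: echo "<fd> <deadline_ns>" > .../set_fence_deadline */
	if (sscanf(buf, "%d %llu", &fd, &deadline_ns) != 2)
		return -EINVAL;

	fence = sync_file_get_fence(fd);
	if (!fence)
		return -EINVAL;

	dma_fence_set_deadline(fence, ns_to_ktime(deadline_ns));
	dma_fence_put(fence);

	return len;
}

static const struct file_operations deadline_fops = {
	.owner = THIS_MODULE,
	.write = deadline_write,
};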
-Daniel

> BR,
> -R
>
>
> > -Daniel
> >
> > >
> > > BR,
> > > -R
> > >
> > > > -Daniel
> > > >
> > > > > ---
> > > > > drivers/dma-buf/sync_file.c | 19 +++++++++++++++++++
> > > > > include/uapi/linux/sync_file.h | 20 ++++++++++++++++++++
> > > > > 2 files changed, 39 insertions(+)
> > > > >
> > > > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> > > > > index 394e6e1e9686..f295772d5169 100644
> > > > > --- a/drivers/dma-buf/sync_file.c
> > > > > +++ b/drivers/dma-buf/sync_file.c
> > > > > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
> > > > > return ret;
> > > > > }
> > > > >
> > > > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
> > > > > + unsigned long arg)
> > > > > +{
> > > > > + struct sync_set_deadline ts;
> > > > > +
> > > > > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
> > > > > + return -EFAULT;
> > > > > +
> > > > > + if (ts.pad)
> > > > > + return -EINVAL;
> > > > > +
> > > > > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > static long sync_file_ioctl(struct file *file, unsigned int cmd,
> > > > > unsigned long arg)
> > > > > {
> > > > > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
> > > > > case SYNC_IOC_FILE_INFO:
> > > > > return sync_file_ioctl_fence_info(sync_file, arg);
> > > > >
> > > > > + case SYNC_IOC_SET_DEADLINE:
> > > > > + return sync_file_ioctl_set_deadline(sync_file, arg);
> > > > > +
> > > > > default:
> > > > > return -ENOTTY;
> > > > > }
> > > > > diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
> > > > > index ee2dcfb3d660..f67d4ffe7566 100644
> > > > > --- a/include/uapi/linux/sync_file.h
> > > > > +++ b/include/uapi/linux/sync_file.h
> > > > > @@ -67,6 +67,18 @@ struct sync_file_info {
> > > > > __u64 sync_fence_info;
> > > > > };
> > > > >
> > > > > +/**
> > > > > + * struct sync_set_deadline - set a deadline on a fence
> > > > > + * @tv_sec: seconds elapsed since epoch
> > > > > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec
> > > > > + * @pad: must be zero
> > > > > + */
> > > > > +struct sync_set_deadline {
> > > > > + __s64 tv_sec;
> > > > > + __s32 tv_nsec;
> > > > > + __u32 pad;
> > > > > +};
> > > > > +
> > > > > #define SYNC_IOC_MAGIC '>'
> > > > >
> > > > > /**
> > > > > @@ -95,4 +107,12 @@ struct sync_file_info {
> > > > > */
> > > > > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
> > > > >
> > > > > +
> > > > > +/**
> > > > > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
> > > > > + *
> > > > > + * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
> > > > > + */
> > > > > +#define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
> > > > > +
> > > > > #endif /* _UAPI_LINUX_SYNC_H */
> > > > > --
> > > > > 2.31.1
> > > > >
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch



--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2021-09-09 06:39:43

by Christian König

Subject: Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

On 08.09.21 19:45, Daniel Vetter wrote:
> On Fri, Sep 03, 2021 at 11:47:55AM -0700, Rob Clark wrote:
>> From: Rob Clark <[email protected]>
>>
>> As the finished fence is the one that is exposed to userspace, and
>> therefore the one that other operations, like atomic update, would
>> block on, we need to propagate the deadline from the finished
>> fence to the actual hw fence.
>>
>> v2: Split into drm_sched_fence_set_parent() (ckoenig)
>>
>> Signed-off-by: Rob Clark <[email protected]>
>> ---
>> drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
>> drivers/gpu/drm/scheduler/sched_main.c | 2 +-
>> include/drm/gpu_scheduler.h | 8 ++++++
>> 3 files changed, 43 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
>> index bcea035cf4c6..4fc41a71d1c7 100644
>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>> @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
>> dma_fence_put(&fence->scheduled);
>> }
>>
>> +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
>> + ktime_t deadline)
>> +{
>> + struct drm_sched_fence *fence = to_drm_sched_fence(f);
>> + unsigned long flags;
>> +
>> + spin_lock_irqsave(&fence->lock, flags);
>> +
>> + /* If we already have an earlier deadline, keep it: */
>> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
>> + ktime_before(fence->deadline, deadline)) {
>> + spin_unlock_irqrestore(&fence->lock, flags);
>> + return;
>> + }
>> +
>> + fence->deadline = deadline;
>> + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
>> +
>> + spin_unlock_irqrestore(&fence->lock, flags);
>> +
>> + if (fence->parent)
>> + dma_fence_set_deadline(fence->parent, deadline);
>> +}
>> +
>> static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
>> .get_driver_name = drm_sched_fence_get_driver_name,
>> .get_timeline_name = drm_sched_fence_get_timeline_name,
>> @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
>> .get_driver_name = drm_sched_fence_get_driver_name,
>> .get_timeline_name = drm_sched_fence_get_timeline_name,
>> .release = drm_sched_fence_release_finished,
>> + .set_deadline = drm_sched_fence_set_deadline_finished,
>> };
>>
>> struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>> @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>> }
>> EXPORT_SYMBOL(to_drm_sched_fence);
>>
>> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
>> + struct dma_fence *fence)
>> +{
>> + s_fence->parent = dma_fence_get(fence);
>> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
>> + &s_fence->finished.flags))
> Don't you need the spinlock here too to avoid races? test_bit is very
> unordered, so guarantees nothing. Spinlock would need to be both around
> ->parent = and the test_bit.
>
> Entirely aside, but there's discussions going on to preallocate the hw
> fence somehow. If we do that we could make the deadline forwarding
> lockless here. Having a spinlock just to set the parent is a bit annoying

Well previously we didn't need the spinlock to set the parent because
the parent wasn't used outside of the main thread.

This becomes an issue now because we can race with setting the deadline.
So yeah we probably do need the spinlock here now.

Christian.

> ...
>
> Alternative is that you do this locklessly with barriers and a _lot_ of
> comments. Would be good to benchmark whether the overhead matters though
> first.
> -Daniel
>
>> + dma_fence_set_deadline(fence, s_fence->deadline);
>> +}
>> +
>> struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>> void *owner)
>> {
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 595e47ff7d06..27bf0ac0625f 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -978,7 +978,7 @@ static int drm_sched_main(void *param)
>> drm_sched_fence_scheduled(s_fence);
>>
>> if (!IS_ERR_OR_NULL(fence)) {
>> - s_fence->parent = dma_fence_get(fence);
>> + drm_sched_fence_set_parent(s_fence, fence);
>> r = dma_fence_add_callback(fence, &sched_job->cb,
>> drm_sched_job_done_cb);
>> if (r == -ENOENT)
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 7f77a455722c..158ddd662469 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -238,6 +238,12 @@ struct drm_sched_fence {
>> */
>> struct dma_fence finished;
>>
>> + /**
>> + * @deadline: deadline set on &drm_sched_fence.finished which
>> + * potentially needs to be propagated to &drm_sched_fence.parent
>> + */
>> + ktime_t deadline;
>> +
>> /**
>> * @parent: the fence returned by &drm_sched_backend_ops.run_job
>> * when scheduling the job on hardware. We signal the
>> @@ -505,6 +511,8 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>> enum drm_sched_priority priority);
>> bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>>
>> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
>> + struct dma_fence *fence);
>> struct drm_sched_fence *drm_sched_fence_alloc(
>> struct drm_sched_entity *s_entity, void *owner);
>> void drm_sched_fence_init(struct drm_sched_fence *fence,
>> --
>> 2.31.1
>>

2021-09-09 06:42:04

by Christian König

Subject: Re: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support

On 08.09.21 20:45, Daniel Vetter wrote:
> On Wed, Sep 08, 2021 at 11:19:15AM -0700, Rob Clark wrote:
>> On Wed, Sep 8, 2021 at 10:54 AM Daniel Vetter <[email protected]> wrote:
>>> On Fri, Sep 03, 2021 at 11:47:58AM -0700, Rob Clark wrote:
>>>> From: Rob Clark <[email protected]>
>>>>
>>>> Signed-off-by: Rob Clark <[email protected]>
>>>> ---
>>>> drivers/dma-buf/dma-fence-chain.c | 13 +++++++++++++
>>>> 1 file changed, 13 insertions(+)
>>>>
>>>> diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c
>>>> index 1b4cb3e5cec9..736a9ad3ea6d 100644
>>>> --- a/drivers/dma-buf/dma-fence-chain.c
>>>> +++ b/drivers/dma-buf/dma-fence-chain.c
>>>> @@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence *fence)
>>>> dma_fence_free(fence);
>>>> }
>>>>
>>>> +
>>>> +static void dma_fence_chain_set_deadline(struct dma_fence *fence,
>>>> + ktime_t deadline)
>>>> +{
>>>> + dma_fence_chain_for_each(fence, fence) {
>>>> + struct dma_fence_chain *chain = to_dma_fence_chain(fence);
>>>> + struct dma_fence *f = chain ? chain->fence : fence;
>>> Doesn't this just end up calling set_deadline on a chain, potentially
>>> resulting in recursion? Also I don't think this should ever happen, why
>>> did you add that?
>> Tbh the fence-chain was the part I was a bit fuzzy about, and the main
>> reason I added igt tests. The iteration is similar to how, for ex,
>> dma_fence_chain_signaled() works, and according to the igt test it does
>> what was intended.
> Huh indeed. Maybe something we should fix, like why does the
> dma_fence_chain_for_each not give you the upcast chain pointer ... I guess
> this also needs more Christian and less me.

Yeah I was also already thinking about having a
dma_fence_chain_for_each_contained() macro which directly returns the
containing fence, just didn't have time to implement/clean that up.
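
Roughly (untested sketch):

static inline struct dma_fence *
dma_fence_chain_contained(struct dma_fence *fence)
{
	struct dma_fence_chain *chain = to_dma_fence_chain(fence);

	/* Return the fence contained in a chain node, or the fence
	 * itself when it isn't a chain node. */
	return chain ? chain->fence : fence;
}

With that the loop in the patch would shrink to:

	dma_fence_chain_for_each(fence, fence)
		dma_fence_set_deadline(dma_fence_chain_contained(fence),
				       deadline);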

And yes the patch is correct as it is and avoids the recursion, so
Reviewed-by: Christian König <[email protected]> for this one.

Regards,
Christian.

> -Daniel
>
>> BR,
>> -R
>>
>>> -Daniel
>>>
>>>> +
>>>> + dma_fence_set_deadline(f, deadline);
>>>> + }
>>>> +}
>>>> +
>>>> const struct dma_fence_ops dma_fence_chain_ops = {
>>>> .use_64bit_seqno = true,
>>>> .get_driver_name = dma_fence_chain_get_driver_name,
>>>> @@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = {
>>>> .enable_signaling = dma_fence_chain_enable_signaling,
>>>> .signaled = dma_fence_chain_signaled,
>>>> .release = dma_fence_chain_release,
>>>> + .set_deadline = dma_fence_chain_set_deadline,
>>>> };
>>>> EXPORT_SYMBOL(dma_fence_chain_ops);
>>>>
>>>> --
>>>> 2.31.1
>>>>
>>> --
>>> Daniel Vetter
>>> Software Engineer, Intel Corporation
>>> http://blog.ffwll.ch

2021-09-09 07:00:38

by Christian König

Subject: Re: [PATCH v3 6/9] dma-buf/fence-array: Add fence deadline support

On 08.09.21 20:00, Daniel Vetter wrote:
> On Fri, Sep 03, 2021 at 11:47:57AM -0700, Rob Clark wrote:
>> From: Rob Clark <[email protected]>
>>
>> Signed-off-by: Rob Clark <[email protected]>
>> ---
>> drivers/dma-buf/dma-fence-array.c | 11 +++++++++++
>> 1 file changed, 11 insertions(+)
>>
>> diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
>> index d3fbd950be94..8d194b09ee3d 100644
>> --- a/drivers/dma-buf/dma-fence-array.c
>> +++ b/drivers/dma-buf/dma-fence-array.c
>> @@ -119,12 +119,23 @@ static void dma_fence_array_release(struct dma_fence *fence)
>> dma_fence_free(fence);
>> }
>>
>> +static void dma_fence_array_set_deadline(struct dma_fence *fence,
>> + ktime_t deadline)
>> +{
>> + struct dma_fence_array *array = to_dma_fence_array(fence);
>> + unsigned i;
>> +
>> + for (i = 0; i < array->num_fences; ++i)
>> + dma_fence_set_deadline(array->fences[i], deadline);
> Hm I wonder whether this can go wrong, and whether we need Christian's
> massive fence iterator that I've seen flying around. If you nest these
> things too much it could all go wrong I think. I looked at other users
> which inspect dma_fence_array and none of them have a risk for unbounded
> recursion.

That should work fine, or at least it doesn't add anything new which
could go boom.

A dma_fence_array can't contain other dma_fence_array or
dma_fence_chain objects, or it could end up in recursion and corrupt
the kernel stack.

That's a well known limitation for other code paths as well.

So Reviewed-by: Christian König <[email protected]> for this one.

Regards,
Christian.

>
> Maybe check with Christian.
> -Daniel
>
>
>> +}
>> +
>> const struct dma_fence_ops dma_fence_array_ops = {
>> .get_driver_name = dma_fence_array_get_driver_name,
>> .get_timeline_name = dma_fence_array_get_timeline_name,
>> .enable_signaling = dma_fence_array_enable_signaling,
>> .signaled = dma_fence_array_signaled,
>> .release = dma_fence_array_release,
>> + .set_deadline = dma_fence_array_set_deadline,
>> };
>> EXPORT_SYMBOL(dma_fence_array_ops);
>>
>> --
>> 2.31.1
>>

2021-09-09 16:17:18

by Simon Ser

Subject: Re: [PATCH v3 0/9] dma-fence: Deadline awareness

Out of curiosity, would it be reasonable to allow user-space (more
precisely, the compositor) to set the deadline via an IOCTL without
actually performing an atomic commit with the FB?

Some compositors might want to wait themselves for FB fence completions
to ensure a client doesn't block the whole desktop (by submitting a
very costly rendering job). In this case it would make sense for the
compositor to indicate that it intends to display the buffer on next
vblank if it's ready by that point, without queueing a page-flip yet.

2021-09-09 16:32:53

by Rob Clark

Subject: Re: [PATCH v3 0/9] dma-fence: Deadline awareness

On Thu, Sep 9, 2021 at 9:16 AM Simon Ser <[email protected]> wrote:
>
> Out of curiosity, would it be reasonable to allow user-space (more
> precisely, the compositor) to set the deadline via an IOCTL without
> actually performing an atomic commit with the FB?
>
> Some compositors might want to wait themselves for FB fence completions
> to ensure a client doesn't block the whole desktop (by submitting a
> very costly rendering job). In this case it would make sense for the
> compositor to indicate that it intends to display the buffer on next
> vblank if it's ready by that point, without queueing a page-flip yet.

Yes, I think it would.. and "dma-buf/sync_file: Add SET_DEADLINE
ioctl" adds such an ioctl.. just for the benefit of igt tests at this
point, but the thought was it would also be used by compositors that
are doing such frame scheduling. Ofc danvet is a bit grumpy that
there isn't a more real (than igt) userspace for the ioctl yet ;-)
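
From the compositor side the usage would be trivial, something along
these lines (a sketch against the proposed uapi, assuming headers
updated for the new ioctl):

#include <sys/ioctl.h>
#include <time.h>
#include <linux/sync_file.h>

/* Hint that we'd like fence_fd to be signaled by the given deadline,
 * for ex. the predicted time of the next vblank. */
static int set_fence_deadline(int fence_fd, const struct timespec *deadline)
{
	struct sync_set_deadline args = {
		.tv_sec  = deadline->tv_sec,
		.tv_nsec = deadline->tv_nsec,
		.pad     = 0,
	};

	return ioctl(fence_fd, SYNC_IOC_SET_DEADLINE, &args);
}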

BR,
-R

2021-09-09 16:45:18

by Simon Ser

Subject: Re: [PATCH v3 0/9] dma-fence: Deadline awareness

On Thursday, September 9th, 2021 at 18:31, Rob Clark <[email protected]> wrote:

> Yes, I think it would.. and "dma-buf/sync_file: Add SET_DEADLINE
> ioctl" adds such an ioctl.. just for the benefit of igt tests at this
> point, but the thought was it would also be used by compositors that
> are doing such frame scheduling. Ofc danvet is a bit grumpy that
> there isn't a more real (than igt) userspace for the ioctl yet ;-)

Ah, very nice, I somehow missed it.

I guess one issue is that explicit sync isn't quite plumbed through
compositors yet, so without Jason's DMA-BUF to sync_file IOCTL it'd be
a bit difficult to use.

Can anybody set the deadline? I wonder if clients should be allowed to.

What happens if the deadline is exceeded? I'd assume nothing in
particular, the deadline being just a hint?

2021-09-09 17:06:13

by Rob Clark

Subject: Re: [PATCH v3 0/9] dma-fence: Deadline awareness

On Thu, Sep 9, 2021 at 9:42 AM Simon Ser <[email protected]> wrote:
>
> On Thursday, September 9th, 2021 at 18:31, Rob Clark <[email protected]> wrote:
>
> > Yes, I think it would.. and "dma-buf/sync_file: Add SET_DEADLINE
> > ioctl" adds such an ioctl.. just for the benefit of igt tests at this
> > point, but the thought was it would be also used by compositors that
> > are doing such frame scheduling. Ofc danvet is a bit grumpy that
> > there isn't a more real (than igt) userspace for the ioctl yet ;-)
>
> Ah, very nice, I somehow missed it.
>
> I guess one issue is that explicit sync isn't quite plumbed through
> compositors yet, so without Jason's DMA-BUF to sync_file IOCTL it'd be
> a bit difficult to use.
>
> Can anybody set the deadline? I wonder if clients should be allowed to.

In its current form, anyone who has the fd can.. I'm not sure how (or
even if) we could limit it beyond that. I suppose hypothetically you
could use this for completely non-compositor related things, like
compute jobs where you want the result by some deadline. (OTOH, it
could be the driver using this internally when the app is stalling on
a result)

> What happens if the deadline is exceeded? I'd assume nothing in
> particular, the deadline being just a hint?

Nothing in particular, it is just a hint. The main intention is to
provide a feedback hint to the drivers in scenarios like vblank
deadlines, where being 1ms late is just as bad as being
$frame_duration-1ms late. From my experiments and profiling it is
useful in a couple scenarios:

1) input latency, ie. go from idle to fullscreen animation, where GPU
has been idle for a while and not busy enough *yet* for devfreq to
kick in and start ramping up the freq before we miss the first vblank
2) double buffering, for ex. if you are 1ms late you end up stalling
15ms before the gpu can start rendering the next frame.. in the
absence of feedback, devfreq would ramp down in this scenario instead
of up
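
On the driver side the reaction can be pretty simple.. roughly this (a
sketch with made-up names, not the actual msm patch): arm a timer a bit
ahead of the deadline and ask devfreq for a boost when it fires,
cancelling the timer again when the fence signals (not shown):

static void gpu_fence_set_deadline(struct dma_fence *f, ktime_t deadline)
{
	struct my_gpu *gpu = to_gpu_fence(f)->gpu;

	/* wake up early enough for the clock ramp to take effect */
	hrtimer_start(&gpu->deadline_timer,
		      ktime_sub(deadline, ms_to_ktime(3)),
		      HRTIMER_MODE_ABS);
}

static enum hrtimer_restart deadline_timer_fn(struct hrtimer *t)
{
	struct my_gpu *gpu = container_of(t, struct my_gpu, deadline_timer);

	my_devfreq_boost(gpu);	/* temporarily raise the freq floor */
	return HRTIMER_NORESTART;
}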

BR,
-R

2021-09-14 13:40:33

by Daniel Vetter

Subject: Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

On Thu, Sep 09, 2021 at 08:22:59AM +0200, Christian König wrote:
> On 08.09.21 19:45, Daniel Vetter wrote:
> > On Fri, Sep 03, 2021 at 11:47:55AM -0700, Rob Clark wrote:
> > > From: Rob Clark <[email protected]>
> > >
> > > As the finished fence is the one that is exposed to userspace, and
> > > therefore the one that other operations, like atomic update, would
> > > block on, we need to propagate the deadline from the finished
> > > fence to the actual hw fence.
> > >
> > > v2: Split into drm_sched_fence_set_parent() (ckoenig)
> > >
> > > Signed-off-by: Rob Clark <[email protected]>
> > > ---
> > > drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
> > > drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> > > include/drm/gpu_scheduler.h | 8 ++++++
> > > 3 files changed, 43 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> > > index bcea035cf4c6..4fc41a71d1c7 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > > @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
> > > dma_fence_put(&fence->scheduled);
> > > }
> > > +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
> > > + ktime_t deadline)
> > > +{
> > > + struct drm_sched_fence *fence = to_drm_sched_fence(f);
> > > + unsigned long flags;
> > > +
> > > + spin_lock_irqsave(&fence->lock, flags);
> > > +
> > > + /* If we already have an earlier deadline, keep it: */
> > > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
> > > + ktime_before(fence->deadline, deadline)) {
> > > + spin_unlock_irqrestore(&fence->lock, flags);
> > > + return;
> > > + }
> > > +
> > > + fence->deadline = deadline;
> > > + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
> > > +
> > > + spin_unlock_irqrestore(&fence->lock, flags);
> > > +
> > > + if (fence->parent)
> > > + dma_fence_set_deadline(fence->parent, deadline);
> > > +}
> > > +
> > > static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
> > > .get_driver_name = drm_sched_fence_get_driver_name,
> > > .get_timeline_name = drm_sched_fence_get_timeline_name,
> > > @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
> > > .get_driver_name = drm_sched_fence_get_driver_name,
> > > .get_timeline_name = drm_sched_fence_get_timeline_name,
> > > .release = drm_sched_fence_release_finished,
> > > + .set_deadline = drm_sched_fence_set_deadline_finished,
> > > };
> > > struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> > > @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> > > }
> > > EXPORT_SYMBOL(to_drm_sched_fence);
> > > +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> > > + struct dma_fence *fence)
> > > +{
> > > + s_fence->parent = dma_fence_get(fence);
> > > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
> > > + &s_fence->finished.flags))
> > Don't you need the spinlock here too to avoid races? test_bit is very
> > unordered, so guarantees nothing. Spinlock would need to be both around
> > ->parent = and the test_bit.
> >
> > Entirely aside, but there's discussions going on to preallocate the hw
> > fence somehow. If we do that we could make the deadline forwarding
> > lockless here. Having a spinlock just to set the parent is a bit annoying
>
> Well previously we didn't need the spinlock to set the parent because the
> parent wasn't used outside of the main thread.
>
> This becomes an issue now because we can race with setting the deadline. So
> yeah we probably do need the spinlock here now.

Yeah tbh this is the part I don't like, because it means the scheduler
thread locking becomes more complex.

We might need to look at this again when we include stuff like boosting
priorities and things like that. Maybe it would be better if we have a
request queue which the scheduler thread could then pick up whenever it
gets around to the next scheduling decision?

I think for now just the spinlock in more places should be fine.
-Daniel


> Christian.
>
> > ...
> >
> > Alternative is that you do this locklessly with barriers and a _lot_ of
> > comments. Would be good to benchmark whether the overhead matters though
> > first.
> > -Daniel
> >
> > > + dma_fence_set_deadline(fence, s_fence->deadline);
> > > +}
> > > +
> > > struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> > > void *owner)
> > > {
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > index 595e47ff7d06..27bf0ac0625f 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -978,7 +978,7 @@ static int drm_sched_main(void *param)
> > > drm_sched_fence_scheduled(s_fence);
> > > if (!IS_ERR_OR_NULL(fence)) {
> > > - s_fence->parent = dma_fence_get(fence);
> > > + drm_sched_fence_set_parent(s_fence, fence);
> > > r = dma_fence_add_callback(fence, &sched_job->cb,
> > > drm_sched_job_done_cb);
> > > if (r == -ENOENT)
> > > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > > index 7f77a455722c..158ddd662469 100644
> > > --- a/include/drm/gpu_scheduler.h
> > > +++ b/include/drm/gpu_scheduler.h
> > > @@ -238,6 +238,12 @@ struct drm_sched_fence {
> > > */
> > > struct dma_fence finished;
> > > + /**
> > > + * @deadline: deadline set on &drm_sched_fence.finished which
> > > + * potentially needs to be propagated to &drm_sched_fence.parent
> > > + */
> > > + ktime_t deadline;
> > > +
> > > /**
> > > * @parent: the fence returned by &drm_sched_backend_ops.run_job
> > > * when scheduling the job on hardware. We signal the
> > > @@ -505,6 +511,8 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> > > enum drm_sched_priority priority);
> > > bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> > > +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> > > + struct dma_fence *fence);
> > > struct drm_sched_fence *drm_sched_fence_alloc(
> > > struct drm_sched_entity *s_entity, void *owner);
> > > void drm_sched_fence_init(struct drm_sched_fence *fence,
> > > --
> > > 2.31.1
> > >
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2021-09-21 15:55:55

by Rob Clark

Subject: Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

On Wed, Sep 8, 2021 at 10:45 AM Daniel Vetter <[email protected]> wrote:
>
> On Fri, Sep 03, 2021 at 11:47:55AM -0700, Rob Clark wrote:
> > From: Rob Clark <[email protected]>
> >
> > As the finished fence is the one that is exposed to userspace, and
> > therefore the one that other operations, like atomic update, would
> > block on, we need to propagate the deadline from the finished
> > fence to the actual hw fence.
> >
> > v2: Split into drm_sched_fence_set_parent() (ckoenig)
> >
> > Signed-off-by: Rob Clark <[email protected]>
> > ---
> > drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
> > drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> > include/drm/gpu_scheduler.h | 8 ++++++
> > 3 files changed, 43 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> > index bcea035cf4c6..4fc41a71d1c7 100644
> > --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
> > dma_fence_put(&fence->scheduled);
> > }
> >
> > +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
> > + ktime_t deadline)
> > +{
> > + struct drm_sched_fence *fence = to_drm_sched_fence(f);
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&fence->lock, flags);
> > +
> > + /* If we already have an earlier deadline, keep it: */
> > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
> > + ktime_before(fence->deadline, deadline)) {
> > + spin_unlock_irqrestore(&fence->lock, flags);
> > + return;
> > + }
> > +
> > + fence->deadline = deadline;
> > + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
> > +
> > + spin_unlock_irqrestore(&fence->lock, flags);
> > +
> > + if (fence->parent)
> > + dma_fence_set_deadline(fence->parent, deadline);
> > +}
> > +
> > static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
> > .get_driver_name = drm_sched_fence_get_driver_name,
> > .get_timeline_name = drm_sched_fence_get_timeline_name,
> > @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
> > .get_driver_name = drm_sched_fence_get_driver_name,
> > .get_timeline_name = drm_sched_fence_get_timeline_name,
> > .release = drm_sched_fence_release_finished,
> > + .set_deadline = drm_sched_fence_set_deadline_finished,
> > };
> >
> > struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> > @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> > }
> > EXPORT_SYMBOL(to_drm_sched_fence);
> >
> > +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> > + struct dma_fence *fence)
> > +{
> > + s_fence->parent = dma_fence_get(fence);
> > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
> > + &s_fence->finished.flags))
>
> Don't you need the spinlock here too to avoid races? test_bit is very
> unordered, so guarantees nothing. Spinlock would need to be both around
> ->parent = and the test_bit.
>
> Entirely aside, but there's discussions going on to preallocate the hw
> fence somehow. If we do that we could make the deadline forwarding
> lockless here. Having a spinlock just to set the parent is a bit annoying
> ...
>
> Alternative is that you do this locklessly with barriers and a _lot_ of
> comments. Would be good to benchmark whether the overhead matters though
> first.

So, my thinking is that very few (well no) guarantees are made to the
fence implementor that their ->set_deadline() is not called multiple
times, from multiple threads, etc. And no guarantee that a later
deadline is set after an earlier deadline has been set. It is all up
to the set_deadline() implementation to deal with these cases.

So that means we just need the appropriate barrier-fu to ensure
another thread calling drm_sched_fence_set_deadline_finished() sees
fence->parent set before the test_bit. It could mean that the backend
implementation sees the same deadline set twice, but that is fine.

BR,
-R

> -Daniel
>
> > + dma_fence_set_deadline(fence, s_fence->deadline);
> > +}
> > +
> > struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> > void *owner)
> > {
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 595e47ff7d06..27bf0ac0625f 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -978,7 +978,7 @@ static int drm_sched_main(void *param)
> > drm_sched_fence_scheduled(s_fence);
> >
> > if (!IS_ERR_OR_NULL(fence)) {
> > - s_fence->parent = dma_fence_get(fence);
> > + drm_sched_fence_set_parent(s_fence, fence);
> > r = dma_fence_add_callback(fence, &sched_job->cb,
> > drm_sched_job_done_cb);
> > if (r == -ENOENT)
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 7f77a455722c..158ddd662469 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -238,6 +238,12 @@ struct drm_sched_fence {
> > */
> > struct dma_fence finished;
> >
> > + /**
> > + * @deadline: deadline set on &drm_sched_fence.finished which
> > + * potentially needs to be propagated to &drm_sched_fence.parent
> > + */
> > + ktime_t deadline;
> > +
> > /**
> > * @parent: the fence returned by &drm_sched_backend_ops.run_job
> > * when scheduling the job on hardware. We signal the
> > @@ -505,6 +511,8 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> > enum drm_sched_priority priority);
> > bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >
> > +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> > + struct dma_fence *fence);
> > struct drm_sched_fence *drm_sched_fence_alloc(
> > struct drm_sched_entity *s_entity, void *owner);
> > void drm_sched_fence_init(struct drm_sched_fence *fence,
> > --
> > 2.31.1
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

2021-09-21 16:32:35

by Rob Clark

Subject: Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

On Tue, Sep 21, 2021 at 8:57 AM Rob Clark <[email protected]> wrote:
>
> On Wed, Sep 8, 2021 at 10:45 AM Daniel Vetter <[email protected]> wrote:
> >
> > On Fri, Sep 03, 2021 at 11:47:55AM -0700, Rob Clark wrote:
> > > From: Rob Clark <[email protected]>
> > >
> > > As the finished fence is the one that is exposed to userspace, and
> > > therefore the one that other operations, like atomic update, would
> > > block on, we need to propagate the deadline from the finished
> > > fence to the actual hw fence.
> > >
> > > v2: Split into drm_sched_fence_set_parent() (ckoenig)
> > >
> > > Signed-off-by: Rob Clark <[email protected]>
> > > ---
> > > drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
> > > drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> > > include/drm/gpu_scheduler.h | 8 ++++++
> > > 3 files changed, 43 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> > > index bcea035cf4c6..4fc41a71d1c7 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > > @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
> > > dma_fence_put(&fence->scheduled);
> > > }
> > >
> > > +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
> > > + ktime_t deadline)
> > > +{
> > > + struct drm_sched_fence *fence = to_drm_sched_fence(f);
> > > + unsigned long flags;
> > > +
> > > + spin_lock_irqsave(&fence->lock, flags);
> > > +
> > > + /* If we already have an earlier deadline, keep it: */
> > > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
> > > + ktime_before(fence->deadline, deadline)) {
> > > + spin_unlock_irqrestore(&fence->lock, flags);
> > > + return;
> > > + }
> > > +
> > > + fence->deadline = deadline;
> > > + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
> > > +
> > > + spin_unlock_irqrestore(&fence->lock, flags);
> > > +
> > > + if (fence->parent)
> > > + dma_fence_set_deadline(fence->parent, deadline);
> > > +}
> > > +
> > > static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
> > > .get_driver_name = drm_sched_fence_get_driver_name,
> > > .get_timeline_name = drm_sched_fence_get_timeline_name,
> > > @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
> > > .get_driver_name = drm_sched_fence_get_driver_name,
> > > .get_timeline_name = drm_sched_fence_get_timeline_name,
> > > .release = drm_sched_fence_release_finished,
> > > + .set_deadline = drm_sched_fence_set_deadline_finished,
> > > };
> > >
> > > struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> > > @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> > > }
> > > EXPORT_SYMBOL(to_drm_sched_fence);
> > >
> > > +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> > > + struct dma_fence *fence)
> > > +{
> > > + s_fence->parent = dma_fence_get(fence);
> > > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
> > > + &s_fence->finished.flags))
> >
> > Don't you need the spinlock here too to avoid races? test_bit is very
> > unordered, so guarantees nothing. Spinlock would need to be both around
> > ->parent = and the test_bit.
> >
> > Entirely aside, but there's discussions going on to preallocate the hw
> > fence somehow. If we do that we could make the deadline forwarding
> > lockless here. Having a spinlock just to set the parent is a bit annoying
> > ...
> >
> > Alternative is that you do this locklessly with barriers and a _lot_ of
> > comments. Would be good to benchmark whether the overhead matters though
> > first.
>
> So, my thinking is that very few (well no) guarantees are made to the
> fence implementor that their ->set_deadline() is not called multiple
> times, from multiple threads, etc. And no guarantee that a later
> deadline is set after an earlier deadline has been set. It is all up
> to the set_deadline() implementation to deal with these cases.
>
> So that means we just need the appropriate barrier-fu to ensure
> another thread calling drm_sched_fence_set_deadline_finished() sees
> fence->parent set before the test_bit. It could mean that the backend
> implementation sees the same deadline set twice, but that is fine.
>

something like:

-----
diff --git a/drivers/gpu/drm/scheduler/sched_fence.c
b/drivers/gpu/drm/scheduler/sched_fence.c
index 4fc41a71d1c7..7f2af6d1777c 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -132,6 +132,7 @@ static void
drm_sched_fence_set_deadline_finished(struct dma_fence *f,
ktime_t deadline)
{
struct drm_sched_fence *fence = to_drm_sched_fence(f);
+ struct dma_fence *parent;
unsigned long flags;

spin_lock_irqsave(&fence->lock, flags);
@@ -148,8 +149,9 @@ static void
drm_sched_fence_set_deadline_finished(struct dma_fence *f,

spin_unlock_irqrestore(&fence->lock, flags);

- if (fence->parent)
- dma_fence_set_deadline(fence->parent, deadline);
+ parent = smp_load_acquire(&fence->parent);
+ if (parent)
+ dma_fence_set_deadline(parent, deadline);
}

static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
@@ -180,7 +182,7 @@ EXPORT_SYMBOL(to_drm_sched_fence);
void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
struct dma_fence *fence)
{
- s_fence->parent = dma_fence_get(fence);
+ smp_store_release(&s_fence->parent, dma_fence_get(fence));
if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
&s_fence->finished.flags))
dma_fence_set_deadline(fence, s_fence->deadline);
-----

BR,
-R

2021-09-21 16:47:38

by Christian König

[permalink] [raw]
Subject: Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

On 21.09.21 at 18:35, Rob Clark wrote:
> On Tue, Sep 21, 2021 at 8:57 AM Rob Clark <[email protected]> wrote:
>> On Wed, Sep 8, 2021 at 10:45 AM Daniel Vetter <[email protected]> wrote:
>>> On Fri, Sep 03, 2021 at 11:47:55AM -0700, Rob Clark wrote:
>>>> From: Rob Clark <[email protected]>
>>>>
>>>> As the finished fence is the one that is exposed to userspace, and
>>>> therefore the one that other operations, like atomic update, would
>>>> block on, we need to propagate the deadline from the finished
>>>> fence to the actual hw fence.
>>>>
>>>> v2: Split into drm_sched_fence_set_parent() (ckoenig)
>>>>
>>>> Signed-off-by: Rob Clark <[email protected]>
>>>> ---
>>>> drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
>>>> drivers/gpu/drm/scheduler/sched_main.c | 2 +-
>>>> include/drm/gpu_scheduler.h | 8 ++++++
>>>> 3 files changed, 43 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
>>>> index bcea035cf4c6..4fc41a71d1c7 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>>>> @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
>>>> dma_fence_put(&fence->scheduled);
>>>> }
>>>>
>>>> +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
>>>> + ktime_t deadline)
>>>> +{
>>>> + struct drm_sched_fence *fence = to_drm_sched_fence(f);
>>>> + unsigned long flags;
>>>> +
>>>> + spin_lock_irqsave(&fence->lock, flags);
>>>> +
>>>> + /* If we already have an earlier deadline, keep it: */
>>>> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
>>>> + ktime_before(fence->deadline, deadline)) {
>>>> + spin_unlock_irqrestore(&fence->lock, flags);
>>>> + return;
>>>> + }
>>>> +
>>>> + fence->deadline = deadline;
>>>> + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
>>>> +
>>>> + spin_unlock_irqrestore(&fence->lock, flags);
>>>> +
>>>> + if (fence->parent)
>>>> + dma_fence_set_deadline(fence->parent, deadline);
>>>> +}
>>>> +
>>>> static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
>>>> .get_driver_name = drm_sched_fence_get_driver_name,
>>>> .get_timeline_name = drm_sched_fence_get_timeline_name,
>>>> @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
>>>> .get_driver_name = drm_sched_fence_get_driver_name,
>>>> .get_timeline_name = drm_sched_fence_get_timeline_name,
>>>> .release = drm_sched_fence_release_finished,
>>>> + .set_deadline = drm_sched_fence_set_deadline_finished,
>>>> };
>>>>
>>>> struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>>> @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>>> }
>>>> EXPORT_SYMBOL(to_drm_sched_fence);
>>>>
>>>> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
>>>> + struct dma_fence *fence)
>>>> +{
>>>> + s_fence->parent = dma_fence_get(fence);
>>>> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
>>>> + &s_fence->finished.flags))
>>> Don't you need the spinlock here too to avoid races? test_bit is very
>>> unordered, so guarantees nothing. Spinlock would need to be both around
>>> ->parent = and the test_bit.
>>>
>>> Entirely aside, but there's discussions going on to preallocate the hw
>>> fence somehow. If we do that we could make the deadline forwarding
>>> lockless here. Having a spinlock just to set the parent is a bit annoying
>>> ...
>>>
>>> Alternative is that you do this locklessly with barriers and a _lot_ of
>>> comments. Would be good to benchmark whether the overhead matters though
>>> first.
>> So, my thinking is that very few (well no) guarantees are made to the
>> fence implementor that their ->set_deadline() is not called multiple
>> times, from multiple threads, etc. And no guarantee that a later
>> deadline is set after an earlier deadline has been set. It is all up
>> to the set_deadline() implementation to deal with these cases.
>>
>> So that means we just need the appropriate barrier-fu to ensure
>> another thread calling drm_sched_fence_set_deadline_finished() sees
>> fence->parent set before the test_bit. It could mean that the backend
>> implementation sees the same deadline set twice, but that is fine.
>>
> something like:

Offhand I think that this should work.

Or rather, I can't see anything wrong with that.

Christian.

>
> -----
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c
> b/drivers/gpu/drm/scheduler/sched_fence.c
> index 4fc41a71d1c7..7f2af6d1777c 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -132,6 +132,7 @@ static void
> drm_sched_fence_set_deadline_finished(struct dma_fence *f,
> ktime_t deadline)
> {
> struct drm_sched_fence *fence = to_drm_sched_fence(f);
> + struct dma_fence *parent;
> unsigned long flags;
>
> spin_lock_irqsave(&fence->lock, flags);
> @@ -148,8 +149,9 @@ static void
> drm_sched_fence_set_deadline_finished(struct dma_fence *f,
>
> spin_unlock_irqrestore(&fence->lock, flags);
>
> - if (fence->parent)
> - dma_fence_set_deadline(fence->parent, deadline);
> + parent = smp_load_acquire(&fence->parent);
> + if (parent)
> + dma_fence_set_deadline(parent, deadline);
> }
>
> static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
> @@ -180,7 +182,7 @@ EXPORT_SYMBOL(to_drm_sched_fence);
> void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> struct dma_fence *fence)
> {
> - s_fence->parent = dma_fence_get(fence);
> + smp_store_release(&s_fence->parent, dma_fence_get(fence));
> if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
> &s_fence->finished.flags))
> dma_fence_set_deadline(fence, s_fence->deadline);
> -----
>
> BR,
> -R

2021-09-21 18:05:40

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl

On Wed, Sep 8, 2021 at 2:10 PM Daniel Vetter <[email protected]> wrote:
>
> On Wed, Sep 8, 2021 at 9:36 PM Rob Clark <[email protected]> wrote:
> > On Wed, Sep 8, 2021 at 11:49 AM Daniel Vetter <[email protected]> wrote:
> > > On Wed, Sep 08, 2021 at 11:23:42AM -0700, Rob Clark wrote:
> > > > On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter <[email protected]> wrote:
> > > > >
> > > > > On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote:
> > > > > > From: Rob Clark <[email protected]>
> > > > > >
> > > > > > The initial purpose is for igt tests, but this would also be useful for
> > > > > > compositors that wait until close to vblank deadline to make decisions
> > > > > > about which frame to show.
> > > > > >
> > > > > > Signed-off-by: Rob Clark <[email protected]>
> > > > >
> > > > > Needs userspace and I think ideally also some igts to make sure it works
> > > > > and doesn't go boom.
> > > >
> > > > See cover-letter... there are igt tests, although currently that is the
> > > > only user.
> > >
> > > Ah sorry missed that. It would be good to record that in the commit too
> > > that adds the uapi. git blame doesn't find cover letters at all, unlike on
> > > gitlab where you get the MR request with everything.
> > >
> > > Ok there is the Link: thing, but since that only points at the last
> > > version all the interesting discussion is still usually lost, so I tend to
> > > not bother looking there.
> > >
> > > > I'd be ok to otherwise initially restrict this and the sw_sync UABI
> > > > (CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are
> > > > both needed by the igt tests
> > >
> > > Hm really awkward, uapi for igts in cross vendor stuff like this isn't
> > > great. I think hiding it in vgem is semi-ok (we have fences there
> > > already). But it's all a bit silly ...
> > >
> > > For the tests, should we instead have a selftest/Kunit thing to exercise
> > > this stuff? igt probably not quite the right thing. Or combine with a page
> > > flip if you want to test msm.
> >
> > Hmm, IIRC we have used CONFIG_BROKEN or something along those lines
> > for UABI in other places where we weren't willing to commit to yet?
> >
> > I suppose if we had to I could make this a sw_sync ioctl instead. But
> > OTOH there are kind of a limited # of ways this ioctl could look. And
> > we already know that at least some wayland compositors are going to
> > want this.
>
> Hm I was trying to think up a few ways this could work, but didn't
> come up with anything reasonable. Forcing the compositor to boost the
> entire chain (for gl composited primary plane fallback) is something
> the kernel can easily do too. Also only makes sense for priority
> boost, not so much for clock boosting, since clock boosting only
> really needs the final element to be boosted.

So, I think the compositor, much like
drm_atomic_helper_wait_for_fences(), really just sees one fence per
surface; it doesn't really know (or care) that under the hood it is a
fence-chain or fence-array. There isn't really much for the
compositor to do but inform "if possible, I'd like this fence to be
signaled by time T".

Say you have multiple updated frames, which have a fence-array
composed of fences from multiple different rings. It is up to the
fence provider to keep track of the latest fence and the earliest
deadline.

The drm/msm implementation doesn't try to be too clever and track
multiple deadlines, i.e. fenceA wanted by time1 and fenceB wanted by
time2. It just keeps track of the nearest deadline and the last
fence. That is probably sufficient; eventually the utilization-based
gpu freq governor will settle into the appropriate steady-state
framerate.
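
Roughly along these lines (just a sketch of that bookkeeping, not the
actual drm/msm code; the names are made up, and the tracker is assumed
to be zero-initialized):

-----
#include <linux/dma-fence.h>
#include <linux/ktime.h>
#include <linux/spinlock.h>

struct deadline_tracker {
	spinlock_t lock;
	bool has_deadline;
	ktime_t nearest_deadline;	/* earliest deadline seen so far */
	struct dma_fence *last_fence;	/* most recent fence with a deadline */
};

static void tracker_set_deadline(struct deadline_tracker *t,
				 struct dma_fence *fence, ktime_t deadline)
{
	unsigned long flags;

	spin_lock_irqsave(&t->lock, flags);

	/* Only an earlier deadline replaces the current one: */
	if (!t->has_deadline || ktime_before(deadline, t->nearest_deadline)) {
		t->nearest_deadline = deadline;
		t->has_deadline = true;
	}

	/* But always remember the latest fence (put is a no-op on NULL): */
	dma_fence_put(t->last_fence);
	t->last_fence = dma_fence_get(fence);

	spin_unlock_irqrestore(&t->lock, flags);
}
-----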

(Although, I did realize that the WAIT_FENCE ioctl should also be
setting a deadline... I forgot to add that)

> > I guess I can look at non-igt options. But the igt test is already a
> > pretty convenient way to contrive situations (like loops, which is a
> > thing I need to add)
>
> Yeah it's definitely very useful for testing ... One option could be a
> hacky debugfs interface, where you write a fd number and deadline and
> the debugfs write handler does the deadline setting. Horrible, but
> since it's debugfs no one ever cares. That's at least where we're
> hiding all the i915 hacks that igts need.

ugg :-)

BR,
-R

> -Daniel
>
> > BR,
> > -R
> >
> >
> > > -Daniel
> > >
> > > >
> > > > BR,
> > > > -R
> > > >
> > > > > -Daniel
> > > > >
> > > > > > ---
> > > > > > drivers/dma-buf/sync_file.c | 19 +++++++++++++++++++
> > > > > > include/uapi/linux/sync_file.h | 20 ++++++++++++++++++++
> > > > > > 2 files changed, 39 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> > > > > > index 394e6e1e9686..f295772d5169 100644
> > > > > > --- a/drivers/dma-buf/sync_file.c
> > > > > > +++ b/drivers/dma-buf/sync_file.c
> > > > > > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
> > > > > > return ret;
> > > > > > }
> > > > > >
> > > > > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
> > > > > > + unsigned long arg)
> > > > > > +{
> > > > > > + struct sync_set_deadline ts;
> > > > > > +
> > > > > > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
> > > > > > + return -EFAULT;
> > > > > > +
> > > > > > + if (ts.pad)
> > > > > > + return -EINVAL;
> > > > > > +
> > > > > > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +
> > > > > > static long sync_file_ioctl(struct file *file, unsigned int cmd,
> > > > > > unsigned long arg)
> > > > > > {
> > > > > > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
> > > > > > case SYNC_IOC_FILE_INFO:
> > > > > > return sync_file_ioctl_fence_info(sync_file, arg);
> > > > > >
> > > > > > + case SYNC_IOC_SET_DEADLINE:
> > > > > > + return sync_file_ioctl_set_deadline(sync_file, arg);
> > > > > > +
> > > > > > default:
> > > > > > return -ENOTTY;
> > > > > > }
> > > > > > diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
> > > > > > index ee2dcfb3d660..f67d4ffe7566 100644
> > > > > > --- a/include/uapi/linux/sync_file.h
> > > > > > +++ b/include/uapi/linux/sync_file.h
> > > > > > @@ -67,6 +67,18 @@ struct sync_file_info {
> > > > > > __u64 sync_fence_info;
> > > > > > };
> > > > > >
> > > > > > +/**
> > > > > > + * struct sync_set_deadline - set a deadline on a fence
> > > > > > + * @tv_sec: seconds elapsed since epoch
> > > > > > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec
> > > > > > + * @pad: must be zero
> > > > > > + */
> > > > > > +struct sync_set_deadline {
> > > > > > + __s64 tv_sec;
> > > > > > + __s32 tv_nsec;
> > > > > > + __u32 pad;
> > > > > > +};
> > > > > > +
> > > > > > #define SYNC_IOC_MAGIC '>'
> > > > > >
> > > > > > /**
> > > > > > @@ -95,4 +107,12 @@ struct sync_file_info {
> > > > > > */
> > > > > > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
> > > > > >
> > > > > > +
> > > > > > +/**
> > > > > > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
> > > > > > + *
> > > > > > + * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
> > > > > > + */
> > > > > > +#define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
> > > > > > +
> > > > > > #endif /* _UAPI_LINUX_SYNC_H */
> > > > > > --
> > > > > > 2.31.1
> > > > > >
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

2021-09-21 21:04:02

by Andrey Grodzovsky

[permalink] [raw]
Subject: Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

On 2021-09-03 2:47 p.m., Rob Clark wrote:

> From: Rob Clark <[email protected]>
>
> As the finished fence is the one that is exposed to userspace, and
> therefore the one that other operations, like atomic update, would
> block on, we need to propagate the deadline from the finished
> fence to the actual hw fence.
>
> v2: Split into drm_sched_fence_set_parent() (ckoenig)
>
> Signed-off-by: Rob Clark <[email protected]>
> ---
> drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
> drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> include/drm/gpu_scheduler.h | 8 ++++++
> 3 files changed, 43 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> index bcea035cf4c6..4fc41a71d1c7 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
> dma_fence_put(&fence->scheduled);
> }
>
> +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
> + ktime_t deadline)
> +{
> + struct drm_sched_fence *fence = to_drm_sched_fence(f);
> + unsigned long flags;
> +
> + spin_lock_irqsave(&fence->lock, flags);
> +
> + /* If we already have an earlier deadline, keep it: */
> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
> + ktime_before(fence->deadline, deadline)) {
> + spin_unlock_irqrestore(&fence->lock, flags);
> + return;
> + }
> +
> + fence->deadline = deadline;
> + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
> +
> + spin_unlock_irqrestore(&fence->lock, flags);
> +
> + if (fence->parent)
> + dma_fence_set_deadline(fence->parent, deadline);
> +}
> +
> static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
> .get_driver_name = drm_sched_fence_get_driver_name,
> .get_timeline_name = drm_sched_fence_get_timeline_name,
> @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
> .get_driver_name = drm_sched_fence_get_driver_name,
> .get_timeline_name = drm_sched_fence_get_timeline_name,
> .release = drm_sched_fence_release_finished,
> + .set_deadline = drm_sched_fence_set_deadline_finished,
> };
>
> struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> }
> EXPORT_SYMBOL(to_drm_sched_fence);
>
> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> + struct dma_fence *fence)
> +{
> + s_fence->parent = dma_fence_get(fence);
> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
> + &s_fence->finished.flags))
> + dma_fence_set_deadline(fence, s_fence->deadline);


I believe above you should pass s_fence->finished to
dma_fence_set_deadline instead of fence, which is the HW fence itself.

Andrey


> +}
> +
> struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> void *owner)
> {
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 595e47ff7d06..27bf0ac0625f 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -978,7 +978,7 @@ static int drm_sched_main(void *param)
> drm_sched_fence_scheduled(s_fence);
>
> if (!IS_ERR_OR_NULL(fence)) {
> - s_fence->parent = dma_fence_get(fence);
> + drm_sched_fence_set_parent(s_fence, fence);
> r = dma_fence_add_callback(fence, &sched_job->cb,
> drm_sched_job_done_cb);
> if (r == -ENOENT)
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 7f77a455722c..158ddd662469 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -238,6 +238,12 @@ struct drm_sched_fence {
> */
> struct dma_fence finished;
>
> + /**
> + * @deadline: deadline set on &drm_sched_fence.finished which
> + * potentially needs to be propagated to &drm_sched_fence.parent
> + */
> + ktime_t deadline;
> +
> /**
> * @parent: the fence returned by &drm_sched_backend_ops.run_job
> * when scheduling the job on hardware. We signal the
> @@ -505,6 +511,8 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> enum drm_sched_priority priority);
> bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>
> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> + struct dma_fence *fence);
> struct drm_sched_fence *drm_sched_fence_alloc(
> struct drm_sched_entity *s_entity, void *owner);
> void drm_sched_fence_init(struct drm_sched_fence *fence,

2021-09-21 21:19:15

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

On Tue, Sep 21, 2021 at 1:09 PM Andrey Grodzovsky
<[email protected]> wrote:
>
> On 2021-09-03 2:47 p.m., Rob Clark wrote:
>
> > From: Rob Clark <[email protected]>
> >
> > As the finished fence is the one that is exposed to userspace, and
> > therefore the one that other operations, like atomic update, would
> > block on, we need to propagate the deadline from the finished
> > fence to the actual hw fence.
> >
> > v2: Split into drm_sched_fence_set_parent() (ckoenig)
> >
> > Signed-off-by: Rob Clark <[email protected]>
> > ---
> > drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
> > drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> > include/drm/gpu_scheduler.h | 8 ++++++
> > 3 files changed, 43 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> > index bcea035cf4c6..4fc41a71d1c7 100644
> > --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
> > dma_fence_put(&fence->scheduled);
> > }
> >
> > +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
> > + ktime_t deadline)
> > +{
> > + struct drm_sched_fence *fence = to_drm_sched_fence(f);
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&fence->lock, flags);
> > +
> > + /* If we already have an earlier deadline, keep it: */
> > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
> > + ktime_before(fence->deadline, deadline)) {
> > + spin_unlock_irqrestore(&fence->lock, flags);
> > + return;
> > + }
> > +
> > + fence->deadline = deadline;
> > + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
> > +
> > + spin_unlock_irqrestore(&fence->lock, flags);
> > +
> > + if (fence->parent)
> > + dma_fence_set_deadline(fence->parent, deadline);
> > +}
> > +
> > static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
> > .get_driver_name = drm_sched_fence_get_driver_name,
> > .get_timeline_name = drm_sched_fence_get_timeline_name,
> > @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
> > .get_driver_name = drm_sched_fence_get_driver_name,
> > .get_timeline_name = drm_sched_fence_get_timeline_name,
> > .release = drm_sched_fence_release_finished,
> > + .set_deadline = drm_sched_fence_set_deadline_finished,
> > };
> >
> > struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> > @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> > }
> > EXPORT_SYMBOL(to_drm_sched_fence);
> >
> > +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> > + struct dma_fence *fence)
> > +{
> > + s_fence->parent = dma_fence_get(fence);
> > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
> > + &s_fence->finished.flags))
> > + dma_fence_set_deadline(fence, s_fence->deadline);
>
>
> I believe above you should pass s_fence->finished to
> dma_fence_set_deadline instead of fence, which is the HW fence itself.

Hmm, unless this has changed recently with some patches I don't have,
s_fence->parent is the one signalled by hw, so it is the one we want
to set the deadline on

BR,
-R

> Andrey
>
>
> > +}
> > +
> > struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> > void *owner)
> > {
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 595e47ff7d06..27bf0ac0625f 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -978,7 +978,7 @@ static int drm_sched_main(void *param)
> > drm_sched_fence_scheduled(s_fence);
> >
> > if (!IS_ERR_OR_NULL(fence)) {
> > - s_fence->parent = dma_fence_get(fence);
> > + drm_sched_fence_set_parent(s_fence, fence);
> > r = dma_fence_add_callback(fence, &sched_job->cb,
> > drm_sched_job_done_cb);
> > if (r == -ENOENT)
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 7f77a455722c..158ddd662469 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -238,6 +238,12 @@ struct drm_sched_fence {
> > */
> > struct dma_fence finished;
> >
> > + /**
> > + * @deadline: deadline set on &drm_sched_fence.finished which
> > + * potentially needs to be propagated to &drm_sched_fence.parent
> > + */
> > + ktime_t deadline;
> > +
> > /**
> > * @parent: the fence returned by &drm_sched_backend_ops.run_job
> > * when scheduling the job on hardware. We signal the
> > @@ -505,6 +511,8 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> > enum drm_sched_priority priority);
> > bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >
> > +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> > + struct dma_fence *fence);
> > struct drm_sched_fence *drm_sched_fence_alloc(
> > struct drm_sched_entity *s_entity, void *owner);
> > void drm_sched_fence_init(struct drm_sched_fence *fence,

2021-09-22 02:19:14

by Andrey Grodzovsky

[permalink] [raw]
Subject: Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support


On 2021-09-21 4:47 p.m., Rob Clark wrote:
> On Tue, Sep 21, 2021 at 1:09 PM Andrey Grodzovsky
> <[email protected]> wrote:
>> On 2021-09-03 2:47 p.m., Rob Clark wrote:
>>
>>> From: Rob Clark <[email protected]>
>>>
>>> As the finished fence is the one that is exposed to userspace, and
>>> therefore the one that other operations, like atomic update, would
>>> block on, we need to propagate the deadline from the finished
>>> fence to the actual hw fence.
>>>
>>> v2: Split into drm_sched_fence_set_parent() (ckoenig)
>>>
>>> Signed-off-by: Rob Clark <[email protected]>
>>> ---
>>> drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
>>> drivers/gpu/drm/scheduler/sched_main.c | 2 +-
>>> include/drm/gpu_scheduler.h | 8 ++++++
>>> 3 files changed, 43 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
>>> index bcea035cf4c6..4fc41a71d1c7 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>>> @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
>>> dma_fence_put(&fence->scheduled);
>>> }
>>>
>>> +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
>>> + ktime_t deadline)
>>> +{
>>> + struct drm_sched_fence *fence = to_drm_sched_fence(f);
>>> + unsigned long flags;
>>> +
>>> + spin_lock_irqsave(&fence->lock, flags);
>>> +
>>> + /* If we already have an earlier deadline, keep it: */
>>> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
>>> + ktime_before(fence->deadline, deadline)) {
>>> + spin_unlock_irqrestore(&fence->lock, flags);
>>> + return;
>>> + }
>>> +
>>> + fence->deadline = deadline;
>>> + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
>>> +
>>> + spin_unlock_irqrestore(&fence->lock, flags);
>>> +
>>> + if (fence->parent)
>>> + dma_fence_set_deadline(fence->parent, deadline);
>>> +}
>>> +
>>> static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
>>> .get_driver_name = drm_sched_fence_get_driver_name,
>>> .get_timeline_name = drm_sched_fence_get_timeline_name,
>>> @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
>>> .get_driver_name = drm_sched_fence_get_driver_name,
>>> .get_timeline_name = drm_sched_fence_get_timeline_name,
>>> .release = drm_sched_fence_release_finished,
>>> + .set_deadline = drm_sched_fence_set_deadline_finished,
>>> };
>>>
>>> struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>> @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>> }
>>> EXPORT_SYMBOL(to_drm_sched_fence);
>>>
>>> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
>>> + struct dma_fence *fence)
>>> +{
>>> + s_fence->parent = dma_fence_get(fence);
>>> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
>>> + &s_fence->finished.flags))
>>> + dma_fence_set_deadline(fence, s_fence->deadline);
>>
>> I believe above you should pass s_fence->finished to
>> dma_fence_set_deadline instead of fence, which is the HW fence itself.
> Hmm, unless this has changed recently with some patches I don't have,
> s_fence->parent is the one signalled by hw, so it is the one we want
> to set the deadline on
>
> BR,
> -R


No, it didn't change. But then when exactly will
drm_sched_fence_set_deadline_finished execute such that
fence->parent != NULL? In other words, I am not clear how the
propagation happens otherwise - if dma_fence_set_deadline is called
with the HW fence, then the assumption is that the driver-specific
dma_fence_ops.set_deadline callback executes. But I was under the
impression that drm_sched_fence_set_deadline_finished is the one that
propagates the deadline to the HW fence's callback, and for it to
execute, dma_fence_set_deadline needs to be called with
s_fence->finished.

Andrey



>
>> Andrey
>>
>>
>>> +}
>>> +
>>> struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>>> void *owner)
>>> {
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 595e47ff7d06..27bf0ac0625f 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -978,7 +978,7 @@ static int drm_sched_main(void *param)
>>> drm_sched_fence_scheduled(s_fence);
>>>
>>> if (!IS_ERR_OR_NULL(fence)) {
>>> - s_fence->parent = dma_fence_get(fence);
>>> + drm_sched_fence_set_parent(s_fence, fence);
>>> r = dma_fence_add_callback(fence, &sched_job->cb,
>>> drm_sched_job_done_cb);
>>> if (r == -ENOENT)
>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>> index 7f77a455722c..158ddd662469 100644
>>> --- a/include/drm/gpu_scheduler.h
>>> +++ b/include/drm/gpu_scheduler.h
>>> @@ -238,6 +238,12 @@ struct drm_sched_fence {
>>> */
>>> struct dma_fence finished;
>>>
>>> + /**
>>> + * @deadline: deadline set on &drm_sched_fence.finished which
>>> + * potentially needs to be propagated to &drm_sched_fence.parent
>>> + */
>>> + ktime_t deadline;
>>> +
>>> /**
>>> * @parent: the fence returned by &drm_sched_backend_ops.run_job
>>> * when scheduling the job on hardware. We signal the
>>> @@ -505,6 +511,8 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>>> enum drm_sched_priority priority);
>>> bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>>>
>>> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
>>> + struct dma_fence *fence);
>>> struct drm_sched_fence *drm_sched_fence_alloc(
>>> struct drm_sched_entity *s_entity, void *owner);
>>> void drm_sched_fence_init(struct drm_sched_fence *fence,

2021-09-22 03:30:32

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

On Tue, Sep 21, 2021 at 7:18 PM Andrey Grodzovsky
<[email protected]> wrote:
>
>
> On 2021-09-21 4:47 p.m., Rob Clark wrote:
> > On Tue, Sep 21, 2021 at 1:09 PM Andrey Grodzovsky
> > <[email protected]> wrote:
> >> On 2021-09-03 2:47 p.m., Rob Clark wrote:
> >>
> >>> From: Rob Clark <[email protected]>
> >>>
> >>> As the finished fence is the one that is exposed to userspace, and
> >>> therefore the one that other operations, like atomic update, would
> >>> block on, we need to propagate the deadline from the finished
> >>> fence to the actual hw fence.
> >>>
> >>> v2: Split into drm_sched_fence_set_parent() (ckoenig)
> >>>
> >>> Signed-off-by: Rob Clark <[email protected]>
> >>> ---
> >>> drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
> >>> drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> >>> include/drm/gpu_scheduler.h | 8 ++++++
> >>> 3 files changed, 43 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> >>> index bcea035cf4c6..4fc41a71d1c7 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> >>> @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
> >>> dma_fence_put(&fence->scheduled);
> >>> }
> >>>
> >>> +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
> >>> + ktime_t deadline)
> >>> +{
> >>> + struct drm_sched_fence *fence = to_drm_sched_fence(f);
> >>> + unsigned long flags;
> >>> +
> >>> + spin_lock_irqsave(&fence->lock, flags);
> >>> +
> >>> + /* If we already have an earlier deadline, keep it: */
> >>> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
> >>> + ktime_before(fence->deadline, deadline)) {
> >>> + spin_unlock_irqrestore(&fence->lock, flags);
> >>> + return;
> >>> + }
> >>> +
> >>> + fence->deadline = deadline;
> >>> + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
> >>> +
> >>> + spin_unlock_irqrestore(&fence->lock, flags);
> >>> +
> >>> + if (fence->parent)
> >>> + dma_fence_set_deadline(fence->parent, deadline);
> >>> +}
> >>> +
> >>> static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
> >>> .get_driver_name = drm_sched_fence_get_driver_name,
> >>> .get_timeline_name = drm_sched_fence_get_timeline_name,
> >>> @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
> >>> .get_driver_name = drm_sched_fence_get_driver_name,
> >>> .get_timeline_name = drm_sched_fence_get_timeline_name,
> >>> .release = drm_sched_fence_release_finished,
> >>> + .set_deadline = drm_sched_fence_set_deadline_finished,
> >>> };
> >>>
> >>> struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> >>> @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> >>> }
> >>> EXPORT_SYMBOL(to_drm_sched_fence);
> >>>
> >>> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> >>> + struct dma_fence *fence)
> >>> +{
> >>> + s_fence->parent = dma_fence_get(fence);
> >>> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
> >>> + &s_fence->finished.flags))
> >>> + dma_fence_set_deadline(fence, s_fence->deadline);
> >>
> >> I believe above you should pass s_fence->finished to
> >> dma_fence_set_deadline instead of fence, which is the HW fence itself.
> > Hmm, unless this has changed recently with some patches I don't have,
> > s_fence->parent is the one signalled by hw, so it is the one we want
> > to set the deadline on
> >
> > BR,
> > -R
>
>
> No, it didn't change. But then when exactly will
> drm_sched_fence_set_deadline_finished execute such that
> fence->parent != NULL? In other words, I am not clear how the
> propagation happens otherwise - if dma_fence_set_deadline is called
> with the HW fence, then the assumption is that the driver-specific
> dma_fence_ops.set_deadline callback executes. But I was under the
> impression that drm_sched_fence_set_deadline_finished is the one that
> propagates the deadline to the HW fence's callback, and for it to
> execute, dma_fence_set_deadline needs to be called with
> s_fence->finished.

Assuming I didn't screw up drm/msm conversion to scheduler,
&s_fence->finished is the one that will be returned to userspace, and
later passed back to the kernel for atomic commit (or to the compositor).
So it is the one that fence->set_deadline() will be called on. But
s_fence->parent is the actual hw fence that needs to know about the
deadline. Depending on whether or not the job has already been written
into the hw ringbuffer, there are two cases:

1) not scheduled yet, s_fence will store the deadline and propagate it
later once s_fence->parent is known
2) already scheduled, in which case s_fence->finished.set_deadline
will propagate it directly to the real fence
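
Condensed, the two paths in the patch look roughly like this (sketch
only; the locking and the earlier-deadline comparison from the actual
patch are omitted for brevity):

-----
/* Case 2: deadline arrives after the hw fence is already known. */
static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
						  ktime_t deadline)
{
	struct drm_sched_fence *fence = to_drm_sched_fence(f);

	fence->deadline = deadline;	/* remembered for case 1 */
	set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);

	if (fence->parent)		/* already scheduled: forward now */
		dma_fence_set_deadline(fence->parent, deadline);
}

/* Case 1: deadline arrived first; forward it once the hw fence exists. */
void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
				struct dma_fence *fence)
{
	s_fence->parent = dma_fence_get(fence);

	if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
		     &s_fence->finished.flags))
		dma_fence_set_deadline(s_fence->parent, s_fence->deadline);
}
-----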

BR,
-R

> Andrey
>
>
>
> >
> >> Andrey
> >>
> >>
> >>> +}
> >>> +
> >>> struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> >>> void *owner)
> >>> {
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>> index 595e47ff7d06..27bf0ac0625f 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>> @@ -978,7 +978,7 @@ static int drm_sched_main(void *param)
> >>> drm_sched_fence_scheduled(s_fence);
> >>>
> >>> if (!IS_ERR_OR_NULL(fence)) {
> >>> - s_fence->parent = dma_fence_get(fence);
> >>> + drm_sched_fence_set_parent(s_fence, fence);
> >>> r = dma_fence_add_callback(fence, &sched_job->cb,
> >>> drm_sched_job_done_cb);
> >>> if (r == -ENOENT)
> >>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> >>> index 7f77a455722c..158ddd662469 100644
> >>> --- a/include/drm/gpu_scheduler.h
> >>> +++ b/include/drm/gpu_scheduler.h
> >>> @@ -238,6 +238,12 @@ struct drm_sched_fence {
> >>> */
> >>> struct dma_fence finished;
> >>>
> >>> + /**
> >>> + * @deadline: deadline set on &drm_sched_fence.finished which
> >>> + * potentially needs to be propagated to &drm_sched_fence.parent
> >>> + */
> >>> + ktime_t deadline;
> >>> +
> >>> /**
> >>> * @parent: the fence returned by &drm_sched_backend_ops.run_job
> >>> * when scheduling the job on hardware. We signal the
> >>> @@ -505,6 +511,8 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> >>> enum drm_sched_priority priority);
> >>> bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >>>
> >>> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> >>> + struct dma_fence *fence);
> >>> struct drm_sched_fence *drm_sched_fence_alloc(
> >>> struct drm_sched_entity *s_entity, void *owner);
> >>> void drm_sched_fence_init(struct drm_sched_fence *fence,

2021-09-22 14:33:53

by Andrey Grodzovsky

[permalink] [raw]
Subject: Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support


On 2021-09-21 11:32 p.m., Rob Clark wrote:
> On Tue, Sep 21, 2021 at 7:18 PM Andrey Grodzovsky
> <[email protected]> wrote:
>>
>> On 2021-09-21 4:47 p.m., Rob Clark wrote:
>>> On Tue, Sep 21, 2021 at 1:09 PM Andrey Grodzovsky
>>> <[email protected]> wrote:
>>>> On 2021-09-03 2:47 p.m., Rob Clark wrote:
>>>>
>>>>> From: Rob Clark <[email protected]>
>>>>>
>>>>> As the finished fence is the one that is exposed to userspace, and
>>>>> therefore the one that other operations, like atomic update, would
>>>>> block on, we need to propagate the deadline from the finished
>>>>> fence to the actual hw fence.
>>>>>
>>>>> v2: Split into drm_sched_fence_set_parent() (ckoenig)
>>>>>
>>>>> Signed-off-by: Rob Clark <[email protected]>
>>>>> ---
>>>>> drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
>>>>> drivers/gpu/drm/scheduler/sched_main.c | 2 +-
>>>>> include/drm/gpu_scheduler.h | 8 ++++++
>>>>> 3 files changed, 43 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
>>>>> index bcea035cf4c6..4fc41a71d1c7 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>>>>> @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
>>>>> dma_fence_put(&fence->scheduled);
>>>>> }
>>>>>
>>>>> +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
>>>>> + ktime_t deadline)
>>>>> +{
>>>>> + struct drm_sched_fence *fence = to_drm_sched_fence(f);
>>>>> + unsigned long flags;
>>>>> +
>>>>> + spin_lock_irqsave(&fence->lock, flags);
>>>>> +
>>>>> + /* If we already have an earlier deadline, keep it: */
>>>>> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
>>>>> + ktime_before(fence->deadline, deadline)) {
>>>>> + spin_unlock_irqrestore(&fence->lock, flags);
>>>>> + return;
>>>>> + }
>>>>> +
>>>>> + fence->deadline = deadline;
>>>>> + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
>>>>> +
>>>>> + spin_unlock_irqrestore(&fence->lock, flags);
>>>>> +
>>>>> + if (fence->parent)
>>>>> + dma_fence_set_deadline(fence->parent, deadline);
>>>>> +}
>>>>> +
>>>>> static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
>>>>> .get_driver_name = drm_sched_fence_get_driver_name,
>>>>> .get_timeline_name = drm_sched_fence_get_timeline_name,
>>>>> @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
>>>>> .get_driver_name = drm_sched_fence_get_driver_name,
>>>>> .get_timeline_name = drm_sched_fence_get_timeline_name,
>>>>> .release = drm_sched_fence_release_finished,
>>>>> + .set_deadline = drm_sched_fence_set_deadline_finished,
>>>>> };
>>>>>
>>>>> struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>>>> @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>>>> }
>>>>> EXPORT_SYMBOL(to_drm_sched_fence);
>>>>>
>>>>> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
>>>>> + struct dma_fence *fence)
>>>>> +{
>>>>> + s_fence->parent = dma_fence_get(fence);
>>>>> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
>>>>> + &s_fence->finished.flags))
>>>>> + dma_fence_set_deadline(fence, s_fence->deadline);
>>>> I believe above you should pass s_fence->finished to
>>>> dma_fence_set_deadline instead of fence, which is the HW fence itself.
>>> Hmm, unless this has changed recently with some patches I don't have,
>>> s_fence->parent is the one signalled by hw, so it is the one we want
>>> to set the deadline on
>>>
>>> BR,
>>> -R
>>
>> No, it didn't change. But then when exactly will
>> drm_sched_fence_set_deadline_finished execute such that
>> fence->parent != NULL? In other words, I am not clear how the
>> propagation happens otherwise - if dma_fence_set_deadline is called
>> with the HW fence, then the assumption is that the driver-specific
>> dma_fence_ops.set_deadline callback executes. But I was under the
>> impression that drm_sched_fence_set_deadline_finished is the one that
>> propagates the deadline to the HW fence's callback, and for it to
>> execute, dma_fence_set_deadline needs to be called with
>> s_fence->finished.
> Assuming I didn't screw up drm/msm conversion to scheduler,
> &s_fence->finished is the one that will be returned to userspace, and
> later passed back to the kernel for atomic commit (or to the compositor).
> So it is the one that fence->set_deadline() will be called on. But
> s_fence->parent is the actual hw fence that needs to know about the
> deadline. Depending on whether or not the job has already been written
> into the hw ringbuffer, there are two cases:
>
> 1) not scheduled yet, s_fence will store the deadline and propagate it
> later once s_fence->parent is known


And by later you mean the call to drm_sched_fence_set_parent
after the HW fence is returned? If yes, I think I get it now.

Andrey


> 2) already scheduled, in which case s_fence->finished.set_deadline
> will propagate it directly to the real fence
>
> BR,
> -R
>
>> Andrey
>>
>>
>>
>>>> Andrey
>>>>
>>>>
>>>>> +}
>>>>> +
>>>>> struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>>>>> void *owner)
>>>>> {
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index 595e47ff7d06..27bf0ac0625f 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -978,7 +978,7 @@ static int drm_sched_main(void *param)
>>>>> drm_sched_fence_scheduled(s_fence);
>>>>>
>>>>> if (!IS_ERR_OR_NULL(fence)) {
>>>>> - s_fence->parent = dma_fence_get(fence);
>>>>> + drm_sched_fence_set_parent(s_fence, fence);
>>>>> r = dma_fence_add_callback(fence, &sched_job->cb,
>>>>> drm_sched_job_done_cb);
>>>>> if (r == -ENOENT)
>>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>>>> index 7f77a455722c..158ddd662469 100644
>>>>> --- a/include/drm/gpu_scheduler.h
>>>>> +++ b/include/drm/gpu_scheduler.h
>>>>> @@ -238,6 +238,12 @@ struct drm_sched_fence {
>>>>> */
>>>>> struct dma_fence finished;
>>>>>
>>>>> + /**
>>>>> + * @deadline: deadline set on &drm_sched_fence.finished which
>>>>> + * potentially needs to be propagated to &drm_sched_fence.parent
>>>>> + */
>>>>> + ktime_t deadline;
>>>>> +
>>>>> /**
>>>>> * @parent: the fence returned by &drm_sched_backend_ops.run_job
>>>>> * when scheduling the job on hardware. We signal the
>>>>> @@ -505,6 +511,8 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>>>>> enum drm_sched_priority priority);
>>>>> bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>>>>>
>>>>> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
>>>>> + struct dma_fence *fence);
>>>>> struct drm_sched_fence *drm_sched_fence_alloc(
>>>>> struct drm_sched_entity *s_entity, void *owner);
>>>>> void drm_sched_fence_init(struct drm_sched_fence *fence,

2021-09-22 15:02:31

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support

On Wed, Sep 22, 2021 at 7:31 AM Andrey Grodzovsky
<[email protected]> wrote:
>
>
> On 2021-09-21 11:32 p.m., Rob Clark wrote:
> > On Tue, Sep 21, 2021 at 7:18 PM Andrey Grodzovsky
> > <[email protected]> wrote:
> >>
> >> On 2021-09-21 4:47 p.m., Rob Clark wrote:
> >>> On Tue, Sep 21, 2021 at 1:09 PM Andrey Grodzovsky
> >>> <[email protected]> wrote:
> >>>> On 2021-09-03 2:47 p.m., Rob Clark wrote:
> >>>>
> >>>>> From: Rob Clark <[email protected]>
> >>>>>
> >>>>> As the finished fence is the one that is exposed to userspace, and
> >>>>> therefore the one that other operations, like atomic update, would
> >>>>> block on, we need to propagate the deadline from the finished
> >>>>> fence to the actual hw fence.
> >>>>>
> >>>>> v2: Split into drm_sched_fence_set_parent() (ckoenig)
> >>>>>
> >>>>> Signed-off-by: Rob Clark <[email protected]>
> >>>>> ---
> >>>>> drivers/gpu/drm/scheduler/sched_fence.c | 34 +++++++++++++++++++++++++
> >>>>> drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> >>>>> include/drm/gpu_scheduler.h | 8 ++++++
> >>>>> 3 files changed, 43 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> >>>>> index bcea035cf4c6..4fc41a71d1c7 100644
> >>>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> >>>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> >>>>> @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f)
> >>>>> dma_fence_put(&fence->scheduled);
> >>>>> }
> >>>>>
> >>>>> +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
> >>>>> + ktime_t deadline)
> >>>>> +{
> >>>>> + struct drm_sched_fence *fence = to_drm_sched_fence(f);
> >>>>> + unsigned long flags;
> >>>>> +
> >>>>> + spin_lock_irqsave(&fence->lock, flags);
> >>>>> +
> >>>>> + /* If we already have an earlier deadline, keep it: */
> >>>>> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
> >>>>> + ktime_before(fence->deadline, deadline)) {
> >>>>> + spin_unlock_irqrestore(&fence->lock, flags);
> >>>>> + return;
> >>>>> + }
> >>>>> +
> >>>>> + fence->deadline = deadline;
> >>>>> + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
> >>>>> +
> >>>>> + spin_unlock_irqrestore(&fence->lock, flags);
> >>>>> +
> >>>>> + if (fence->parent)
> >>>>> + dma_fence_set_deadline(fence->parent, deadline);
> >>>>> +}
> >>>>> +
> >>>>> static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
> >>>>> .get_driver_name = drm_sched_fence_get_driver_name,
> >>>>> .get_timeline_name = drm_sched_fence_get_timeline_name,
> >>>>> @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = {
> >>>>> .get_driver_name = drm_sched_fence_get_driver_name,
> >>>>> .get_timeline_name = drm_sched_fence_get_timeline_name,
> >>>>> .release = drm_sched_fence_release_finished,
> >>>>> + .set_deadline = drm_sched_fence_set_deadline_finished,
> >>>>> };
> >>>>>
> >>>>> struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> >>>>> @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> >>>>> }
> >>>>> EXPORT_SYMBOL(to_drm_sched_fence);
> >>>>>
> >>>>> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> >>>>> + struct dma_fence *fence)
> >>>>> +{
> >>>>> + s_fence->parent = dma_fence_get(fence);
> >>>>> + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
> >>>>> + &s_fence->finished.flags))
> >>>>> + dma_fence_set_deadline(fence, s_fence->deadline);
> >>>> I believe above you should pass s_fence->finished to
> >>>> dma_fence_set_deadline instead of fence, which is the HW fence itself.
> >>> Hmm, unless this has changed recently with some patches I don't have,
> >>> s_fence->parent is the one signalled by hw, so it is the one we want
> >>> to set the deadline on
> >>>
> >>> BR,
> >>> -R
> >>
> >> No, it didn't change. But then when exactly will
> >> drm_sched_fence_set_deadline_finished execute such that
> >> fence->parent != NULL? In other words, I am not clear how the
> >> propagation happens otherwise - if dma_fence_set_deadline is called
> >> with the HW fence, then the assumption is that the driver-specific
> >> dma_fence_ops.set_deadline callback executes. But I was under the
> >> impression that drm_sched_fence_set_deadline_finished is the one that
> >> propagates the deadline to the HW fence's callback, and for it to
> >> execute, dma_fence_set_deadline needs to be called with
> >> s_fence->finished.
> > Assuming I didn't screw up drm/msm conversion to scheduler,
> > &s_fence->finished is the one that will be returned to userspace, and
> > later passed back to the kernel for atomic commit (or to the compositor).
> > So it is the one that fence->set_deadline() will be called on. But
> > s_fence->parent is the actual hw fence that needs to know about the
> > deadline. Depending on whether or not the job has already been written
> > into the hw ringbuffer, there are two cases:
> >
> > 1) not scheduled yet, s_fence will store the deadline and propagate it
> > later once s_fence->parent is known
>
>
> And by later you mean the call to drm_sched_fence_set_parent
> after the HW fence is returned? If yes, I think I get it now.

Yup :-)

BR,
-R

> Andrey
>
>
> > 2) already scheduled, in which case s_fence->finished.set_deadline
> > will propagate it directly to the real fence
> >
> > BR,
> > -R
> >
> >> Andrey
> >>
> >>
> >>
> >>>> Andrey
> >>>>
> >>>>
> >>>>> +}
> >>>>> +
> >>>>> struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> >>>>> void *owner)
> >>>>> {
> >>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> index 595e47ff7d06..27bf0ac0625f 100644
> >>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> @@ -978,7 +978,7 @@ static int drm_sched_main(void *param)
> >>>>> drm_sched_fence_scheduled(s_fence);
> >>>>>
> >>>>> if (!IS_ERR_OR_NULL(fence)) {
> >>>>> - s_fence->parent = dma_fence_get(fence);
> >>>>> + drm_sched_fence_set_parent(s_fence, fence);
> >>>>> r = dma_fence_add_callback(fence, &sched_job->cb,
> >>>>> drm_sched_job_done_cb);
> >>>>> if (r == -ENOENT)
> >>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> >>>>> index 7f77a455722c..158ddd662469 100644
> >>>>> --- a/include/drm/gpu_scheduler.h
> >>>>> +++ b/include/drm/gpu_scheduler.h
> >>>>> @@ -238,6 +238,12 @@ struct drm_sched_fence {
> >>>>> */
> >>>>> struct dma_fence finished;
> >>>>>
> >>>>> + /**
> >>>>> + * @deadline: deadline set on &drm_sched_fence.finished which
> >>>>> + * potentially needs to be propagated to &drm_sched_fence.parent
> >>>>> + */
> >>>>> + ktime_t deadline;
> >>>>> +
> >>>>> /**
> >>>>> * @parent: the fence returned by &drm_sched_backend_ops.run_job
> >>>>> * when scheduling the job on hardware. We signal the
> >>>>> @@ -505,6 +511,8 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> >>>>> enum drm_sched_priority priority);
> >>>>> bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >>>>>
> >>>>> +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
> >>>>> + struct dma_fence *fence);
> >>>>> struct drm_sched_fence *drm_sched_fence_alloc(
> >>>>> struct drm_sched_entity *s_entity, void *owner);
> >>>>> void drm_sched_fence_init(struct drm_sched_fence *fence,

2021-09-27 08:44:56

by Pekka Paalanen

[permalink] [raw]
Subject: Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl

On Fri, 3 Sep 2021 11:47:59 -0700
Rob Clark <[email protected]> wrote:

> From: Rob Clark <[email protected]>
>
> The initial purpose is for igt tests, but this would also be useful for
> compositors that wait until close to vblank deadline to make decisions
> about which frame to show.
>
> Signed-off-by: Rob Clark <[email protected]>
> ---
> drivers/dma-buf/sync_file.c | 19 +++++++++++++++++++
> include/uapi/linux/sync_file.h | 20 ++++++++++++++++++++
> 2 files changed, 39 insertions(+)
>
> diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> index 394e6e1e9686..f295772d5169 100644
> --- a/drivers/dma-buf/sync_file.c
> +++ b/drivers/dma-buf/sync_file.c
> @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
> return ret;
> }
>
> +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
> + unsigned long arg)
> +{
> + struct sync_set_deadline ts;
> +
> + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
> + return -EFAULT;
> +
> + if (ts.pad)
> + return -EINVAL;
> +
> + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
> +
> + return 0;
> +}
> +
> static long sync_file_ioctl(struct file *file, unsigned int cmd,
> unsigned long arg)
> {
> @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
> case SYNC_IOC_FILE_INFO:
> return sync_file_ioctl_fence_info(sync_file, arg);
>
> + case SYNC_IOC_SET_DEADLINE:
> + return sync_file_ioctl_set_deadline(sync_file, arg);
> +
> default:
> return -ENOTTY;
> }
> diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
> index ee2dcfb3d660..f67d4ffe7566 100644
> --- a/include/uapi/linux/sync_file.h
> +++ b/include/uapi/linux/sync_file.h
> @@ -67,6 +67,18 @@ struct sync_file_info {
> __u64 sync_fence_info;
> };
>
> +/**
> + * struct sync_set_deadline - set a deadline on a fence
> + * @tv_sec: seconds elapsed since epoch
> + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec
> + * @pad: must be zero

Hi Rob,

I think you need to specify which clock this timestamp must be in.

Which epoch? Sounds a bit like CLOCK_REALTIME to me, which would not
make sense.

Also I cannot guess how a compositor should be using this, so
explaining the expected usage would be really good, with reasons for
why userspace should bother.


Thanks,
pq

> + */
> +struct sync_set_deadline {
> + __s64 tv_sec;
> + __s32 tv_nsec;
> + __u32 pad;
> +};
> +
> #define SYNC_IOC_MAGIC '>'
>
> /**
> @@ -95,4 +107,12 @@ struct sync_file_info {
> */
> #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
>
> +
> +/**
> + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
> + *
> + * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
> + */
> +#define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
> +
> #endif /* _UAPI_LINUX_SYNC_H */


2021-09-27 08:55:33

by Christian König

[permalink] [raw]
Subject: Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl

On 27.09.21 at 10:42, Pekka Paalanen wrote:
> On Fri, 3 Sep 2021 11:47:59 -0700
> Rob Clark <[email protected]> wrote:
>
>> From: Rob Clark <[email protected]>
>>
>> The initial purpose is for igt tests, but this would also be useful for
>> compositors that wait until close to vblank deadline to make decisions
>> about which frame to show.
>>
>> Signed-off-by: Rob Clark <[email protected]>
>> ---
>> drivers/dma-buf/sync_file.c | 19 +++++++++++++++++++
>> include/uapi/linux/sync_file.h | 20 ++++++++++++++++++++
>> 2 files changed, 39 insertions(+)
>>
>> diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
>> index 394e6e1e9686..f295772d5169 100644
>> --- a/drivers/dma-buf/sync_file.c
>> +++ b/drivers/dma-buf/sync_file.c
>> @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
>> return ret;
>> }
>>
>> +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
>> + unsigned long arg)
>> +{
>> + struct sync_set_deadline ts;
>> +
>> + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
>> + return -EFAULT;
>> +
>> + if (ts.pad)
>> + return -EINVAL;
>> +
>> + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
>> +
>> + return 0;
>> +}
>> +
>> static long sync_file_ioctl(struct file *file, unsigned int cmd,
>> unsigned long arg)
>> {
>> @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
>> case SYNC_IOC_FILE_INFO:
>> return sync_file_ioctl_fence_info(sync_file, arg);
>>
>> + case SYNC_IOC_SET_DEADLINE:
>> + return sync_file_ioctl_set_deadline(sync_file, arg);
>> +
>> default:
>> return -ENOTTY;
>> }
>> diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
>> index ee2dcfb3d660..f67d4ffe7566 100644
>> --- a/include/uapi/linux/sync_file.h
>> +++ b/include/uapi/linux/sync_file.h
>> @@ -67,6 +67,18 @@ struct sync_file_info {
>> __u64 sync_fence_info;
>> };
>>
>> +/**
>> + * struct sync_set_deadline - set a deadline on a fence
>> + * @tv_sec: seconds elapsed since epoch
>> + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec
>> + * @pad: must be zero
> Hi Rob,
>
> I think you need to specify which clock this timestamp must be in.
>
> Which epoch? Sounds a bit like CLOCK_REALTIME to me, which would not
> make sense.

Most likely CLOCK_MONOTONIC, I think, since that's what ktime is based
on as well, IIRC.

My recollection might be wrong, but I think Daniel documented somewhere
that an absolute 64-bit nanosecond value should be used for timeouts in
IOCTLs, which is sufficient for ~500 years of uptime.

So the separation between seconds and nanoseconds might be redundant.

Regards,
Christian.
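
(For scale: 2^64 ns is about 1.8e10 seconds, i.e. roughly 584 years,
and 2^63 ns about 292 years, so the "~500 years" figure holds either
way.)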

>
> Also I cannot guess how a compositor should be using this, so
> explaining the expected usage would be really good, with reasons for
> why userspace should bother.
>
>
> Thanks,
> pq
>
>> + */
>> +struct sync_set_deadline {
>> + __s64 tv_sec;
>> + __s32 tv_nsec;
>> + __u32 pad;
>> +};
>> +
>> #define SYNC_IOC_MAGIC '>'
>>
>> /**
>> @@ -95,4 +107,12 @@ struct sync_file_info {
>> */
>> #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
>>
>> +
>> +/**
>> + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
>> + *
>> + * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
>> + */
>> +#define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
>> +
>> #endif /* _UAPI_LINUX_SYNC_H */

2021-09-27 14:33:47

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl

On Mon, Sep 27, 2021 at 1:42 AM Pekka Paalanen <[email protected]> wrote:
>
> On Fri, 3 Sep 2021 11:47:59 -0700
> Rob Clark <[email protected]> wrote:
>
> > From: Rob Clark <[email protected]>
> >
> > The initial purpose is for igt tests, but this would also be useful for
> > compositors that wait until close to vblank deadline to make decisions
> > about which frame to show.
> >
> > Signed-off-by: Rob Clark <[email protected]>
> > ---
> > drivers/dma-buf/sync_file.c | 19 +++++++++++++++++++
> > include/uapi/linux/sync_file.h | 20 ++++++++++++++++++++
> > 2 files changed, 39 insertions(+)
> >
> > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> > index 394e6e1e9686..f295772d5169 100644
> > --- a/drivers/dma-buf/sync_file.c
> > +++ b/drivers/dma-buf/sync_file.c
> > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
> > return ret;
> > }
> >
> > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
> > + unsigned long arg)
> > +{
> > + struct sync_set_deadline ts;
> > +
> > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
> > + return -EFAULT;
> > +
> > + if (ts.pad)
> > + return -EINVAL;
> > +
> > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, ts.tv_nsec));
> > +
> > + return 0;
> > +}
> > +
> > static long sync_file_ioctl(struct file *file, unsigned int cmd,
> > unsigned long arg)
> > {
> > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned int cmd,
> > case SYNC_IOC_FILE_INFO:
> > return sync_file_ioctl_fence_info(sync_file, arg);
> >
> > + case SYNC_IOC_SET_DEADLINE:
> > + return sync_file_ioctl_set_deadline(sync_file, arg);
> > +
> > default:
> > return -ENOTTY;
> > }
> > diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
> > index ee2dcfb3d660..f67d4ffe7566 100644
> > --- a/include/uapi/linux/sync_file.h
> > +++ b/include/uapi/linux/sync_file.h
> > @@ -67,6 +67,18 @@ struct sync_file_info {
> > __u64 sync_fence_info;
> > };
> >
> > +/**
> > + * struct sync_set_deadline - set a deadline on a fence
> > + * @tv_sec: seconds elapsed since epoch
> > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec
> > + * @pad: must be zero
>
> Hi Rob,
>
> I think you need to specify which clock this timestamp must be in.
>
> > Which epoch? Sounds a bit like CLOCK_REALTIME to me, which would not
> > make sense.

It should be monotonic: the same clock as is used for vblank
timestamps, which I assume would be the most straightforward thing for
compositors to use.

BR,
-R

> Also I cannot guess how a compositor should be using this, so
> explaining the expected usage would be really good, with reasons for
> why userspace should bother.
>
>
> Thanks,
> pq
>
> > + */
> > +struct sync_set_deadline {
> > + __s64 tv_sec;
> > + __s32 tv_nsec;
> > + __u32 pad;
> > +};
> > +
> > #define SYNC_IOC_MAGIC '>'
> >
> > /**
> > @@ -95,4 +107,12 @@ struct sync_file_info {
> > */
> > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
> >
> > +
> > +/**
> > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
> > + *
> > + * Allows userspace to set a deadline on a fence, see dma_fence_set_deadline()
> > + */
> > +#define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
> > +
> > #endif /* _UAPI_LINUX_SYNC_H */
>

2021-09-28 07:59:53

by Pekka Paalanen

[permalink] [raw]
Subject: Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl

On Mon, 27 Sep 2021 07:36:05 -0700
Rob Clark <[email protected]> wrote:

> On Mon, Sep 27, 2021 at 1:42 AM Pekka Paalanen <[email protected]> wrote:
> >
> > On Fri, 3 Sep 2021 11:47:59 -0700
> > Rob Clark <[email protected]> wrote:
> >
> > > From: Rob Clark <[email protected]>
> > >
> > > The initial purpose is for igt tests, but this would also be useful for
> > > compositors that wait until close to vblank deadline to make decisions
> > > about which frame to show.
> > >
> > > Signed-off-by: Rob Clark <[email protected]>
> > > ---
> > > drivers/dma-buf/sync_file.c | 19 +++++++++++++++++++
> > > include/uapi/linux/sync_file.h | 20 ++++++++++++++++++++
> > > 2 files changed, 39 insertions(+)

...

> > > diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
> > > index ee2dcfb3d660..f67d4ffe7566 100644
> > > --- a/include/uapi/linux/sync_file.h
> > > +++ b/include/uapi/linux/sync_file.h
> > > @@ -67,6 +67,18 @@ struct sync_file_info {
> > > __u64 sync_fence_info;
> > > };
> > >
> > > +/**
> > > + * struct sync_set_deadline - set a deadline on a fence
> > > + * @tv_sec: seconds elapsed since epoch
> > > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec
> > > + * @pad: must be zero
> >
> > Hi Rob,
> >
> > I think you need to specify which clock this timestamp must be in.
> >
> > Which epoch? Sounds a bit like CLOCK_REALTIME to me, which would not
> > make sense.
>
> It should be monotonic: the same clock as is used for vblank
> timestamps, which I assume would be the most straightforward thing for
> compositors to use.

Yes, it would. Just need to document that. :-)


Thanks,
pq
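
A revised version of the uapi kernel-doc addressing both review points
might read something like this (a sketch, not the wording of any later
revision of the series):

/**
 * struct sync_set_deadline - SYNC_IOC_SET_DEADLINE - set a deadline on a fence
 * @tv_sec:  seconds of the absolute CLOCK_MONOTONIC deadline
 * @tv_nsec: nanoseconds part of the absolute CLOCK_MONOTONIC deadline
 * @pad:     must be zero
 *
 * The deadline is an absolute CLOCK_MONOTONIC timestamp, i.e. the same
 * timebase as vblank timestamps, so a compositor can pass the expected
 * time of the next vblank directly.
 */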

