2018-04-07 00:07:05

by Keith Packard

[permalink] [raw]
Subject: [PATCH 0/1] drm: Add crtc_queue_syncobj and crtc_get_syncobj ioctls

(This is an RFC on whether this pair of ioctls seems reasonable. The
code compiles, but I haven't tested it as I'm away from home this
weekend.)

I'm rewriting my implementation of the Vulkan EXT_display_control
extension, which provides a way to signal a Vulkan fence at vblank
time. I had implemented it using events, but that isn't great as the
Vulkan API includes the ability to wait for any of a set of fences to
be signaled. As the other Vulkan fences are implemented using
dma_fences in the kernel, and (eventually) using syncobj up in user
space, it seems reasonable to use syncobjs for everything and hook
vblank to those.

In any case, I'm proposing two new syncobj/vblank ioctls (the names
aren't great; suggestions welcome, as usual):

DRM_IOCTL_CRTC_QUEUE_SYNCOBJ

Create a new syncobj that will be signaled at (or after) the
specified vblank sequence value. This uses the same parameters
to specify the target sequence as
DRM_IOCTL_CRTC_QUEUE_SEQUENCE.

DRM_IOCTL_CRTC_GET_SYNCOBJ

Once the above syncobj has been signaled, this ioctl allows
the application to find out when that happened, returning both
the vblank sequence count and time (in ns).

I'd like to hear comments on whether this seems reasonable, or whether
I should go in some other direction.



2018-04-07 00:06:11

by Keith Packard

[permalink] [raw]
Subject: [PATCH] drm: Add crtc_queue_syncobj and crtc_get_syncobj ioctls

crtc_queue_syncobj creates a new syncobj that will get signaled at a
specified vblank sequence count.

crtc_get_syncobj returns the time and vblank sequence count when the
syncobj was signaled.

The pair of these allows use of syncobjs instead of events for
monitoring vblank activity.

Signed-off-by: Keith Packard <[email protected]>
---
drivers/gpu/drm/drm_file.c | 5 +
drivers/gpu/drm/drm_internal.h | 4 +
drivers/gpu/drm/drm_ioctl.c | 2 +
drivers/gpu/drm/drm_syncobj.c | 13 ++-
drivers/gpu/drm/drm_vblank.c | 212 +++++++++++++++++++++++++++++++++++++----
include/drm/drm_file.h | 11 ++-
include/drm/drm_syncobj.h | 13 +++
include/uapi/drm/drm.h | 17 ++++
8 files changed, 253 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index b3c6e997ccdb..c1ada3fe70b0 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -37,6 +37,7 @@

#include <drm/drm_file.h>
#include <drm/drmP.h>
+#include <drm/drm_syncobj.h>

#include "drm_legacy.h"
#include "drm_internal.h"
@@ -711,6 +712,10 @@ void drm_send_event_locked(struct drm_device *dev, struct drm_pending_event *e)
dma_fence_put(e->fence);
}

+ if (e->syncobj) {
+ drm_syncobj_put(e->syncobj);
+ }
+
if (!e->file_priv) {
kfree(e);
return;
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index c9d5a6cd4d41..71b9435b5b37 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -75,6 +75,10 @@ int drm_crtc_get_sequence_ioctl(struct drm_device *dev, void *data,

int drm_crtc_queue_sequence_ioctl(struct drm_device *dev, void *data,
struct drm_file *filp);
+int drm_crtc_queue_syncobj_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *filp);
+int drm_crtc_get_syncobj_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *filp);

/* drm_auth.c */
int drm_getmagic(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 4aafe4802099..309611fb5d0d 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -665,6 +665,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, DRM_UNLOCKED),
+ DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SYNCOBJ, drm_crtc_queue_syncobj_ioctl, DRM_UNLOCKED),
+ DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SYNCOBJ, drm_crtc_get_syncobj_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_LIST_LESSEES, drm_mode_list_lessees_ioctl, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_GET_LEASE, drm_mode_get_lease_ioctl, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index cb4d09c70fd4..e197b007079d 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -208,7 +208,7 @@ static const struct dma_fence_ops drm_syncobj_null_fence_ops = {
.release = NULL,
};

-static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
+int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj, bool signal)
{
struct drm_syncobj_null_fence *fence;
fence = kzalloc(sizeof(*fence), GFP_KERNEL);
@@ -218,7 +218,8 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
spin_lock_init(&fence->lock);
dma_fence_init(&fence->base, &drm_syncobj_null_fence_ops,
&fence->lock, 0, 0);
- dma_fence_signal(&fence->base);
+ if (signal)
+ dma_fence_signal(&fence->base);

drm_syncobj_replace_fence(syncobj, &fence->base);

@@ -226,6 +227,7 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)

return 0;
}
+EXPORT_SYMBOL(drm_syncobj_assign_null_handle);

int drm_syncobj_find_fence(struct drm_file *file_private,
u32 handle,
@@ -283,7 +285,7 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
spin_lock_init(&syncobj->lock);

if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
- ret = drm_syncobj_assign_null_handle(syncobj);
+ ret = drm_syncobj_assign_null_handle(syncobj, true);
if (ret < 0) {
drm_syncobj_put(syncobj);
return ret;
@@ -341,7 +343,7 @@ static int drm_syncobj_create_as_handle(struct drm_file *file_private,
return ret;
}

-static int drm_syncobj_destroy(struct drm_file *file_private,
+int drm_syncobj_destroy(struct drm_file *file_private,
u32 handle)
{
struct drm_syncobj *syncobj;
@@ -356,6 +358,7 @@ static int drm_syncobj_destroy(struct drm_file *file_private,
drm_syncobj_put(syncobj);
return 0;
}
+EXPORT_SYMBOL(drm_syncobj_destroy);

static int drm_syncobj_file_release(struct inode *inode, struct file *file)
{
@@ -973,7 +976,7 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
return ret;

for (i = 0; i < args->count_handles; i++) {
- ret = drm_syncobj_assign_null_handle(syncobjs[i]);
+ ret = drm_syncobj_assign_null_handle(syncobjs[i], true);
if (ret < 0)
break;
}
diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index 32d9bcf5be7f..422a5b1d3b92 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -26,6 +26,7 @@

#include <drm/drm_vblank.h>
#include <drm/drmP.h>
+#include <drm/drm_syncobj.h>
#include <linux/export.h>

#include "drm_trace.h"
@@ -806,25 +807,33 @@ static void send_vblank_event(struct drm_device *dev,
u64 seq, ktime_t now)
{
struct timespec64 tv;
+ struct drm_syncobj *syncobj = e->base.syncobj;

- switch (e->event.base.type) {
- case DRM_EVENT_VBLANK:
- case DRM_EVENT_FLIP_COMPLETE:
- tv = ktime_to_timespec64(now);
- e->event.vbl.sequence = seq;
- /*
- * e->event is a user space structure, with hardcoded unsigned
- * 32-bit seconds/microseconds. This is safe as we always use
- * monotonic timestamps since linux-4.15
- */
- e->event.vbl.tv_sec = tv.tv_sec;
- e->event.vbl.tv_usec = tv.tv_nsec / 1000;
- break;
- case DRM_EVENT_CRTC_SEQUENCE:
+ if (syncobj) {
if (seq)
- e->event.seq.sequence = seq;
- e->event.seq.time_ns = ktime_to_ns(now);
- break;
+ syncobj->sequence = seq;
+ syncobj->time = now;
+ }
+ if (e->base.event) {
+ switch (e->event.base.type) {
+ case DRM_EVENT_VBLANK:
+ case DRM_EVENT_FLIP_COMPLETE:
+ tv = ktime_to_timespec64(now);
+ e->event.vbl.sequence = seq;
+ /*
+ * e->event is a user space structure, with hardcoded unsigned
+ * 32-bit seconds/microseconds. This is safe as we always use
+ * monotonic timestamps since linux-4.15
+ */
+ e->event.vbl.tv_sec = tv.tv_sec;
+ e->event.vbl.tv_usec = tv.tv_nsec / 1000;
+ break;
+ case DRM_EVENT_CRTC_SEQUENCE:
+ if (seq)
+ e->event.seq.sequence = seq;
+ e->event.seq.time_ns = ktime_to_ns(now);
+ break;
+ }
}
trace_drm_vblank_event_delivered(e->base.file_priv, e->pipe, seq);
drm_send_event_locked(dev, &e->base);
@@ -1837,3 +1846,172 @@ int drm_crtc_queue_sequence_ioctl(struct drm_device *dev, void *data,
kfree(e);
return ret;
}
+
+/*
+ * Create a syncobj that is signaled when the specified vblank
+ * occurs.
+ */
+int drm_crtc_queue_syncobj_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file_priv)
+{
+ struct drm_crtc *crtc;
+ struct drm_vblank_crtc *vblank;
+ int pipe;
+ struct drm_crtc_queue_syncobj *queue_syncobj = data;
+ ktime_t now;
+ struct drm_pending_vblank_event *e;
+ struct drm_syncobj *syncobj;
+ u32 syncobj_handle;
+ u32 flags;
+ u64 seq;
+ u64 req_seq;
+ int ret;
+ unsigned long spin_flags;
+
+ if (!drm_core_check_feature(dev, DRIVER_MODESET))
+ return -EINVAL;
+
+ if (!dev->irq_enabled)
+ return -EINVAL;
+
+ crtc = drm_crtc_find(dev, file_priv, queue_syncobj->crtc_id);
+ if (!crtc)
+ return -ENOENT;
+
+ flags = queue_syncobj->flags;
+ /* Check valid flag bits */
+ if (flags & ~(DRM_CRTC_SEQUENCE_RELATIVE|
+ DRM_CRTC_SEQUENCE_NEXT_ON_MISS))
+ return -EINVAL;
+
+ pipe = drm_crtc_index(crtc);
+
+ vblank = &dev->vblank[pipe];
+
+ /*
+ * Allocate all of the necessary objects --
+ * a drm_pending_vblank, drm_syncobj, drm_syncobj handle and dma_fence
+ */
+
+ /* drm_pending_vblank_event */
+ e = kzalloc(sizeof(*e), GFP_KERNEL);
+ if (e == NULL)
+ return -ENOMEM;
+
+ /* syncobj */
+ ret = drm_syncobj_create(&syncobj, 0, NULL);
+ if (ret) {
+ DRM_DEBUG("crtc %d failed to allocate syncobj, %d\n", pipe, ret);
+ goto err_free_event;
+ }
+
+ /*
+ * This allocates the dma_fence object for the syncobj and
+ * leaves it unsignaled.
+ */
+ ret = drm_syncobj_assign_null_handle(syncobj, false);
+ if (ret) {
+ DRM_DEBUG("crtc %d failed to allocate syncobj fence, %d\n", pipe, ret);
+ goto err_free_event;
+ }
+
+ /* and finally the syncobj handle to return to user space */
+ ret = drm_syncobj_get_handle(file_priv, syncobj, &syncobj_handle);
+ if (ret) {
+ DRM_DEBUG("crtc %d failed to allocate syncobj handle, %d\n", pipe, ret);
+ goto err_free_syncobj;
+ }
+
+ ret = drm_crtc_vblank_get(crtc);
+ if (ret) {
+ DRM_DEBUG("crtc %d failed to acquire vblank counter, %d\n", pipe, ret);
+ goto err_free_syncobj_handle;
+ }
+
+ seq = drm_vblank_count_and_time(dev, pipe, &now);
+ req_seq = queue_syncobj->sequence;
+
+ if (flags & DRM_CRTC_SEQUENCE_RELATIVE)
+ req_seq += seq;
+
+ if ((flags & DRM_CRTC_SEQUENCE_NEXT_ON_MISS) && vblank_passed(seq, req_seq))
+ req_seq = seq + 1;
+
+ e->pipe = pipe;
+ spin_lock_irqsave(&dev->event_lock, spin_flags);
+
+ /*
+ * drm_crtc_vblank_off() might have been called after we called
+ * drm_crtc_vblank_get(). drm_crtc_vblank_off() holds event_lock around the
+ * vblank disable, so no need for further locking. The reference from
+ * drm_crtc_vblank_get() protects against vblank disable from another source.
+ */
+ if (!READ_ONCE(vblank->enabled)) {
+ ret = -EINVAL;
+ goto err_unlock;
+ }
+
+ e->base.syncobj = syncobj;
+ e->base.fence = drm_syncobj_fence_get(syncobj);
+
+ list_add(&e->base.pending_link, &file_priv->pending_event_list);
+ e->base.file_priv = file_priv;
+
+ e->sequence = req_seq;
+
+ if (vblank_passed(seq, req_seq)) {
+ drm_crtc_vblank_put(crtc);
+ send_vblank_event(dev, e, seq, now);
+ queue_syncobj->sequence = seq;
+ } else {
+ /* drm_handle_vblank_events will call drm_vblank_put */
+ list_add_tail(&e->base.link, &dev->vblank_event_list);
+ queue_syncobj->sequence = req_seq;
+ }
+
+ spin_unlock_irqrestore(&dev->event_lock, spin_flags);
+
+ /* Pass the syncobj handle back to user space */
+ queue_syncobj->handle = syncobj_handle;
+
+ return 0;
+
+err_unlock:
+ spin_unlock_irqrestore(&dev->event_lock, spin_flags);
+ drm_crtc_vblank_put(crtc);
+err_free_syncobj_handle:
+ drm_syncobj_destroy(file_priv, syncobj_handle);
+err_free_syncobj:
+ drm_syncobj_put(syncobj);
+err_free_event:
+ kfree(e);
+ return ret;
+}
+
+int drm_crtc_get_syncobj_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file_priv)
+{
+ struct drm_crtc_get_syncobj *args = data;
+ struct drm_syncobj *syncobj;
+ struct dma_fence *fence;
+
+ if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+ return -ENODEV;
+
+ syncobj = drm_syncobj_find(file_priv, args->handle);
+ if (!syncobj)
+ return -ENOENT;
+
+ /* If there's no fence, it can't have been signaled */
+ fence = syncobj->fence;
+ if (fence == NULL)
+ return -EBUSY;
+
+ if (!dma_fence_is_signaled(fence))
+ return -EBUSY;
+
+ args->sequence = syncobj->sequence;
+ args->sequence_ns = syncobj->time;
+
+ return 0;
+}
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index 0e0c868451a5..16de395a2be9 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -106,8 +106,8 @@ struct drm_pending_event {
*
* Pointer to the actual event that should be sent to userspace to be
* read using drm_read(). Can be optional, since nowadays events are
- * also used to signal kernel internal threads with @completion or DMA
- * transactions using @fence.
+ * also used to signal kernel internal threads with @completion, DMA
+ * transactions using @fence or sync objects using @syncobj.
*/
struct drm_event *event;

@@ -119,6 +119,13 @@ struct drm_pending_event {
*/
struct dma_fence *fence;

+ /**
+ * @syncobj:
+ *
+ * Optional Sync Object to signal when the event occurs.
+ */
+ struct drm_syncobj *syncobj;
+
/**
* @file_priv:
*
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 43e2f382d2f0..924f8b880d0c 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -65,6 +65,16 @@ struct drm_syncobj {
* a file backing for this syncobj.
*/
struct file *file;
+ /**
+ * @sequence:
+ * crtc sequence number when a vblank syncobj was signaled
+ */
+ uint64_t sequence;
+ /**
+ * @time:
+ * time that a vblank syncobj was signaled
+ */
+ ktime_t time;
};

typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
@@ -132,6 +142,7 @@ void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
struct drm_syncobj_cb *cb);
void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
struct dma_fence *fence);
+int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj, bool signal);
int drm_syncobj_find_fence(struct drm_file *file_private,
u32 handle,
struct dma_fence **fence);
@@ -140,6 +151,8 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
struct dma_fence *fence);
int drm_syncobj_get_handle(struct drm_file *file_private,
struct drm_syncobj *syncobj, u32 *handle);
+int drm_syncobj_destroy(struct drm_file *file_private,
+ u32 handle);
int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);

#endif
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 6fdff5945c8a..7a996f73e972 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -759,6 +759,21 @@ struct drm_crtc_queue_sequence {
__u64 user_data; /* user data passed to event */
};

+struct drm_crtc_queue_syncobj {
+ __u32 crtc_id;
+ __u32 flags;
+ __u64 sequence;
+ __u32 handle; /* return syncobj handle */
+ __u32 pad;
+};
+
+struct drm_crtc_get_syncobj {
+ __u32 handle; /* signaled syncobj */
+ __u32 pad;
+ __u64 sequence; /* return: sequence when syncobj was signaled */
+ __u64 sequence_ns; /* return: time when syncobj was signaled */
+};
+
#if defined(__cplusplus)
}
#endif
@@ -843,6 +858,8 @@ extern "C" {

#define DRM_IOCTL_CRTC_GET_SEQUENCE DRM_IOWR(0x3b, struct drm_crtc_get_sequence)
#define DRM_IOCTL_CRTC_QUEUE_SEQUENCE DRM_IOWR(0x3c, struct drm_crtc_queue_sequence)
+#define DRM_IOCTL_CRTC_QUEUE_SYNCOBJ DRM_IOWR(0x3d, struct drm_crtc_queue_syncobj)
+#define DRM_IOCTL_CRTC_GET_SYNCOBJ DRM_IOWR(0x3e, struct drm_crtc_get_syncobj)

#define DRM_IOCTL_UPDATE_DRAW DRM_IOW(0x3f, struct drm_update_draw)

--
2.16.2


2018-04-07 02:54:58

by Keith Packard

[permalink] [raw]
Subject: Re: [PATCH] drm: Add crtc_queue_syncobj and crtc_get_syncobj ioctls

Jason Ekstrand <[email protected]> writes:

> Is the given sequence number guaranteed to be hit in finite time?

Certainly, it's a finite value...

However, realistically, it's just like all of the other vblank
interfaces where you can specify a crazy sequence and block for a very
long time. So, no different from the current situation.

Of course, the only vulkan API available today only lets you wait for
the next vblank, so we'll be asking for a sequence which is one more
than the current sequence.

> Just to make sure I've got this straight, the client queues a syncobj with
> queue_syncobj to fire at a given sequence number. Once the syncobj has
> fired (which it can find out by waiting on it), it then calls get_syncobj
> to figure out when it was fired?

If it cares, it can ask. It might not care at all, in which case it
wouldn't have to ask. The syncobj API doesn't provide any direct
information about the state of the object in the wait call.

> If so, what happens if a syncobj is re-used? Do you just loose the
> information?

You can't reuse one of these -- the 'queue_syncobj' API creates a new
one each time.

> If we have a finite time guarantee, I'm kind-of thinking a sync_file might
> be better as it's a one-shot and doesn't have the information loss
> problem. I'm not actually sure though.

It's a one-shot, once signaled, you're done with it.

> This whole "wait for a syncobj and then ask when it fired" thing is a bit
> odd. I'll have to think on it.

Yeah, the event mechanism has the nice feature that you get data with
each event. I guess the alternative would be to have a way to get an
event when a sync object was ready; perhaps a call which provided a set
of syncobjs and delivered a DRM event when any of them was ready?

That has a lot of appeal; it turns the poll-like nature of the current
API into an epoll-like system. Hrm.

--
-keith


Attachments:
signature.asc (847.00 B)

2018-04-08 02:34:36

by Keith Packard

[permalink] [raw]
Subject: Re: [PATCH] drm: Add crtc_queue_syncobj and crtc_get_syncobj ioctls

Jason Ekstrand <[email protected]> writes:

> Yeah, I suppose an application could ask for 10k frames in the future or
> something ridiculous like that. For sync_file, people strongly want a
> finite time guarantee for security/deadlock reasons. I don't know how
> happy they would be with a finite time of 10 minutes...

Sure, we've put arbitrary finite limits on vblank counters in other
places, so adding some kind of arbitrary limit like a couple of seconds
would be entirely reasonable here. The Vulkan API doesn't need any of
this complexity at all; the one I'm doing only lets you create a fence
for the next vblank.

> Ok, that's not expected... Part of the point of sync objects is that
> they're re-usable. The client is free to not re-use them or to be very
> careful about the recycling process.

Heh. I thought the opposite -- lightweight objects that you'd use once
and delete. I can certainly change the API to pass in an existing
syncobj rather than creating a new one. That would be easier in some
ways as it reduces the failure paths a bit.

> Is the event the important part or the moderately accurate timestamp?

I need the timestamp and sequence number, getting them in an event would
mean not having to make another syscall.

> We could probably modify drm_syncobj to have a "last signaled"
> timestamp that's queryable through an IOCTL.

That's exactly what I did, but it only works for these new syncobjs. The
fields are global and could be filled in by other bits of the code.

> Is the sequence number returned necessary? Will it ever be different from
> the one requested?

Yes, when the application queues it just slightly too late.

The application doesn't actually need either of these values directly,
but the system needs them to respond to requests to queue presentation
at a specific time, so the vulkan driver wants 'recent' vblank
time/sequence data to estimate when a vblank will occur.

--
-keith


Attachments:
signature.asc (847.00 B)

2018-04-09 09:18:50

by Daniel Vetter

[permalink] [raw]
Subject: Re: [PATCH 0/1] drm: Add crtc_queue_syncobj and crtc_get_syncobj ioctls

On Fri, Apr 06, 2018 at 04:56:48PM -0700, Keith Packard wrote:
> (This is an RFC on whether this pair of ioctls seems reasonable. The
> code compiles, but I haven't tested it as I'm away from home this
> weekend.)
>
> I'm rewriting my implementation of the Vulkan EXT_display_control
> extension, which provides a way to signal a Vulkan fence at vblank
> time. I had implemented it using events, but that isn't great as the
> Vulkan API includes the ability to wait for any of a set of fences to
> be signaled. As the other Vulkan fences are implemented using
> dma_fences in the kernel, and (eventually) using syncobj up in user
> space, it seems reasonable to use syncobjs for everything and hook
> vblank to those.
>
> In any case, I'm proposing two new syncobj/vblank ioctls (the names
> aren't great; suggestions welcome, as usual):
>
> DRM_IOCTL_CRTC_QUEUE_SYNCOBJ
>
> Create a new syncobj that will be signaled at (or after) the
> specified vblank sequence value. This uses the same parameters
> to specify the target sequence as
> DRM_IOCTL_CRTC_QUEUE_SEQUENCE.

My understanding of drm_syncobj is that you:

- Create a syncobj with the drm_syncobj_create ioctl.

- Pass it around to various driver callbacks who update the attached
dma_fence pointer using drm_syncobj_replace_fence().

Yes amdgpu does this a bit differently, but that seems to be the
exception.

From that pov I'd massage the uapi to just extend
drm_crtc_queue_sequence_ioctl with a new syncobj flag. Syncobj we can just
add at the bottom of struct drm_crtc_queue_sequence (drm structs can be
extended like that, it's part of the uapi). Or we reuse user_data, but
that's a bit a hack.

We also don't need a new event type, vblank code simply singals
event->fence, which we'll arrange to be the fence behind the syncobj.

> DRM_IOCTL_CRTC_GET_SYNCOBJ
>
> Once the above syncobj has been signaled, this ioctl allows
> the application to find out when that happened, returning both
> the vblank sequence count and time (in ns).

The android syncpt stuff already had the concept of a fence timestamp, and
we carried it over as part of struct dma_fence.timestamp. It's just not
exposed yet as proper uapi. I think we should aim a bit more into that
direction with something like the below sketch:

- Add a dma_fence_signal_ts, which allows us to set the timestamp from a
hw clock.

- Use that in the vblank code.

- Add new drm_syncobj ioctl to query the timestamp of the attached fence
(if it's signalled).

That would entirely avoid the special-case ioctl just for vblank syncobj
here. Also, this might be useful in your implementation of
VK_GOOGLE_display_timing, since the current query timestamp you're using
won't take into account irq wakeup latency. Using fence->timestamp will
still miss the irq->atomic worker wakupe latency, but should be a lot
better already.

> I'd like to hear comments on whether this seems reasonable, or whether
> I should go in some other direction.

Besides a few bikesheds to align better with other stuff going around I
think this looks good.
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2018-04-24 17:46:45

by Daniel Vetter

[permalink] [raw]
Subject: Re: [PATCH 0/1] drm: Add crtc_queue_syncobj and crtc_get_syncobj ioctls

On Mon, Apr 09, 2018 at 11:14:17AM +0200, Daniel Vetter wrote:
> On Fri, Apr 06, 2018 at 04:56:48PM -0700, Keith Packard wrote:
> > (This is an RFC on whether this pair of ioctls seems reasonable. The
> > code compiles, but I haven't tested it as I'm away from home this
> > weekend.)
> >
> > I'm rewriting my implementation of the Vulkan EXT_display_control
> > extension, which provides a way to signal a Vulkan fence at vblank
> > time. I had implemented it using events, but that isn't great as the
> > Vulkan API includes the ability to wait for any of a set of fences to
> > be signaled. As the other Vulkan fences are implemented using
> > dma_fences in the kernel, and (eventually) using syncobj up in user
> > space, it seems reasonable to use syncobjs for everything and hook
> > vblank to those.
> >
> > In any case, I'm proposing two new syncobj/vblank ioctls (the names
> > aren't great; suggestions welcome, as usual):
> >
> > DRM_IOCTL_CRTC_QUEUE_SYNCOBJ
> >
> > Create a new syncobj that will be signaled at (or after) the
> > specified vblank sequence value. This uses the same parameters
> > to specify the target sequence as
> > DRM_IOCTL_CRTC_QUEUE_SEQUENCE.
>
> My understanding of drm_syncobj is that you:
>
> - Create a syncobj with the drm_syncobj_create ioctl.
>
> - Pass it around to various driver callbacks who update the attached
> dma_fence pointer using drm_syncobj_replace_fence().
>
> Yes amdgpu does this a bit differently, but that seems to be the
> exception.
>
> From that pov I'd massage the uapi to just extend
> drm_crtc_queue_sequence_ioctl with a new syncobj flag. Syncobj we can just
> add at the bottom of struct drm_crtc_queue_sequence (drm structs can be
> extended like that, it's part of the uapi). Or we reuse user_data, but
> that's a bit a hack.
>
> We also don't need a new event type, vblank code simply singals
> event->fence, which we'll arrange to be the fence behind the syncobj.
>
> > DRM_IOCTL_CRTC_GET_SYNCOBJ
> >
> > Once the above syncobj has been signaled, this ioctl allows
> > the application to find out when that happened, returning both
> > the vblank sequence count and time (in ns).
>
> The android syncpt stuff already had the concept of a fence timestamp, and
> we carried it over as part of struct dma_fence.timestamp. It's just not
> exposed yet as proper uapi. I think we should aim a bit more into that
> direction with something like the below sketch:
>
> - Add a dma_fence_signal_ts, which allows us to set the timestamp from a
> hw clock.
>
> - Use that in the vblank code.
>
> - Add new drm_syncobj ioctl to query the timestamp of the attached fence
> (if it's signalled).
>
> That would entirely avoid the special-case ioctl just for vblank syncobj
> here. Also, this might be useful in your implementation of
> VK_GOOGLE_display_timing, since the current query timestamp you're using
> won't take into account irq wakeup latency. Using fence->timestamp will
> still miss the irq->atomic worker wakupe latency, but should be a lot
> better already.

Ok, I'm going to retract my entire suggestion above for the GET ioctl. It
would neatly work for vk_google_display_timing, but the KHR version of
that extensions very much sounds like we want the kernel to report 2 (or
maybe even 3) different timestamps back. An ioctl is much easier to extend
than shoe-horning all that into the dma_fence/drm_syncobj abstraction.

One bikeshed maybe on top: Maybe call the ioctl CRTC_GET_TS_DATA, since
that's what you're getting, not the syncobj.

Sorry for flopping around on this, just learned this all in discussions
past week.
-Daniel

>
> > I'd like to hear comments on whether this seems reasonable, or whether
> > I should go in some other direction.
>
> Besides a few bikesheds to align better with other stuff going around I
> think this looks good.
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch