This patch series addresses potential use-after-free errors when dereferencing pointers to struct drm_master. These were identified after one such bug was caught by Syzbot in drm_getunique():
https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803
The series is broken up into five patches:
1. Move a call to drm_is_current_master() out from a section locked by &dev->mode_config.mutex in drm_mode_getconnector(). This patch does not apply to stable.
2. Move a call to _drm_lease_held() out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find().
3. Implement a locked version of the drm_is_current_master() function for use within drm_auth.c.
4. Serialize drm_file.master by introducing a new lock that's held whenever the value of drm_file.master changes.
5. Identify areas in drm_lease.c where pointers to struct drm_master are dereferenced, and ensure that the master pointers are not freed during use.
Changes in v6 -> v7:
- Patch 2:
Modify code alignment as suggested by the intel-gfx CI.
Update commit message based on the changes to patch 5.
- Patch 4:
Add patch 4 to the series. This patch adds a new lock to serialize drm_file.master, in response to the lockdep splat by the intel-gfx CI.
- Patch 5:
Move kerneldoc comment about protecting drm_file.master with drm_device.master_mutex into patch 4.
Update drm_file_get_master to use the new drm_file.master_lock instead of drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI.
Changes in v5 -> v6:
- Patch 2:
Add patch 2 to the series. This patch moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find.
- Patch 5:
Clarify the kerneldoc for dereferencing drm_file.master, as suggested by Daniel Vetter.
Refactor error paths with goto labels so that each function only has a single drm_master_put(), as suggested by Emil Velikov.
Modify comparison to NULL into "!master", as suggested by the intel-gfx CI.
Changes in v4 -> v5:
- Patch 1:
Add patch 1 to the series. The changes in patch 1 do not apply to stable because they apply to new changes in the drm-misc-next branch. This patch moves the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex.
Additionally, add a missing semicolon to the patch, as caught by the intel-gfx CI.
- Patch 3:
Move changes to drm_connector.c into patch 1.
Changes in v3 -> v4:
- Patch 3:
Move the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex, as suggested by Daniel Vetter. This avoids a circular lock dependency as reported here: https://patchwork.freedesktop.org/patch/440406/
Additionally, inside drm_is_current_master, instead of grabbing &fpriv->master->dev->master_mutex, we grab &fpriv->minor->dev->master_mutex to avoid dereferencing a null ptr if fpriv->master is not set.
- Patch 5:
Modify kerneldoc formatting.
Additionally, add a file_priv->master NULL check inside drm_file_get_master, and handle the NULL result accordingly in drm_lease.c, as suggested by Daniel Vetter.
Changes in v2 -> v3:
- Patch 3:
Move the definition of drm_is_current_master and the _locked version higher up in drm_auth.c to avoid needing a forward declaration of drm_is_current_master_locked, as suggested by Daniel Vetter.
- Patch 5:
Instead of leaking drm_device.master_mutex into drm_lease.c to protect drm_master pointers, add a new drm_file_get_master() function that returns drm_file->master while increasing its reference count, to prevent drm_file->master from being freed, as suggested by Daniel Vetter.
Changes in v1 -> v2:
- Patch 5:
Move the lock and assignment before the DRM_DEBUG_LEASE in drm_mode_get_lease_ioctl, as suggested by Emil Velikov.
Desmond Cheong Zhi Xi (5):
drm: avoid circular locks in drm_mode_getconnector
drm: separate locks in __drm_mode_object_find
drm: add a locked version of drm_is_current_master
drm: serialize drm_file.master with a master lock
drm: protect drm_master pointers in drm_lease.c
drivers/gpu/drm/drm_auth.c | 86 +++++++++++++++++++++++--------
drivers/gpu/drm/drm_connector.c | 5 +-
drivers/gpu/drm/drm_file.c | 1 +
drivers/gpu/drm/drm_lease.c | 81 ++++++++++++++++++++++-------
drivers/gpu/drm/drm_mode_object.c | 10 ++--
include/drm/drm_auth.h | 1 +
include/drm/drm_file.h | 18 +++++--
7 files changed, 153 insertions(+), 49 deletions(-)
--
2.25.1
In preparation for a future patch to take a lock on
drm_device.master_mutex inside drm_is_current_master(), we first move
the call to drm_is_current_master() in drm_mode_getconnector out from the
section locked by &dev->mode_config.mutex. This avoids creating a
circular lock dependency.
Failing to avoid this lock dependency produces the following lockdep
splat:
======================================================
WARNING: possible circular locking dependency detected
5.13.0-rc7-CI-CI_DRM_10254+ #1 Not tainted
------------------------------------------------------
kms_frontbuffer/1087 is trying to acquire lock:
ffff88810dcd01a8 (&dev->master_mutex){+.+.}-{3:3}, at: drm_is_current_master+0x1b/0x40
but task is already holding lock:
ffff88810dcd0488 (&dev->mode_config.mutex){+.+.}-{3:3}, at: drm_mode_getconnector+0x1c6/0x4a0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&dev->mode_config.mutex){+.+.}-{3:3}:
__mutex_lock+0xab/0x970
drm_client_modeset_probe+0x22e/0xca0
__drm_fb_helper_initial_config_and_unlock+0x42/0x540
intel_fbdev_initial_config+0xf/0x20 [i915]
async_run_entry_fn+0x28/0x130
process_one_work+0x26d/0x5c0
worker_thread+0x37/0x380
kthread+0x144/0x170
ret_from_fork+0x1f/0x30
-> #1 (&client->modeset_mutex){+.+.}-{3:3}:
__mutex_lock+0xab/0x970
drm_client_modeset_commit_locked+0x1c/0x180
drm_client_modeset_commit+0x1c/0x40
__drm_fb_helper_restore_fbdev_mode_unlocked+0x88/0xb0
drm_fb_helper_set_par+0x34/0x40
intel_fbdev_set_par+0x11/0x40 [i915]
fbcon_init+0x270/0x4f0
visual_init+0xc6/0x130
do_bind_con_driver+0x1e5/0x2d0
do_take_over_console+0x10e/0x180
do_fbcon_takeover+0x53/0xb0
register_framebuffer+0x22d/0x310
__drm_fb_helper_initial_config_and_unlock+0x36c/0x540
intel_fbdev_initial_config+0xf/0x20 [i915]
async_run_entry_fn+0x28/0x130
process_one_work+0x26d/0x5c0
worker_thread+0x37/0x380
kthread+0x144/0x170
ret_from_fork+0x1f/0x30
-> #0 (&dev->master_mutex){+.+.}-{3:3}:
__lock_acquire+0x151e/0x2590
lock_acquire+0xd1/0x3d0
__mutex_lock+0xab/0x970
drm_is_current_master+0x1b/0x40
drm_mode_getconnector+0x37e/0x4a0
drm_ioctl_kernel+0xa8/0xf0
drm_ioctl+0x1e8/0x390
__x64_sys_ioctl+0x6a/0xa0
do_syscall_64+0x39/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of: &dev->master_mutex --> &client->modeset_mutex --> &dev->mode_config.mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&dev->mode_config.mutex);
lock(&client->modeset_mutex);
lock(&dev->mode_config.mutex);
lock(&dev->master_mutex);
*** DEADLOCK ***
1 lock held by kms_frontbuffer/1087:
#0: ffff88810dcd0488 (&dev->mode_config.mutex){+.+.}-{3:3}, at: drm_mode_getconnector+0x1c6/0x4a0
stack backtrace:
CPU: 7 PID: 1087 Comm: kms_frontbuffer Not tainted 5.13.0-rc7-CI-CI_DRM_10254+ #1
Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
Call Trace:
dump_stack+0x7f/0xad
check_noncircular+0x12e/0x150
__lock_acquire+0x151e/0x2590
lock_acquire+0xd1/0x3d0
__mutex_lock+0xab/0x970
drm_is_current_master+0x1b/0x40
drm_mode_getconnector+0x37e/0x4a0
drm_ioctl_kernel+0xa8/0xf0
drm_ioctl+0x1e8/0x390
__x64_sys_ioctl+0x6a/0xa0
do_syscall_64+0x39/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Reported-by: Daniel Vetter <[email protected]>
Signed-off-by: Desmond Cheong Zhi Xi <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
---
drivers/gpu/drm/drm_connector.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index da39e7ff6965..2ba257b1ae20 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2414,6 +2414,7 @@ int drm_mode_getconnector(struct drm_device *dev, void *data,
struct drm_mode_modeinfo u_mode;
struct drm_mode_modeinfo __user *mode_ptr;
uint32_t __user *encoder_ptr;
+ bool is_current_master;
if (!drm_core_check_feature(dev, DRIVER_MODESET))
return -EOPNOTSUPP;
@@ -2444,9 +2445,11 @@ int drm_mode_getconnector(struct drm_device *dev, void *data,
out_resp->connector_type = connector->connector_type;
out_resp->connector_type_id = connector->connector_type_id;
+ is_current_master = drm_is_current_master(file_priv);
+
mutex_lock(&dev->mode_config.mutex);
if (out_resp->count_modes == 0) {
- if (drm_is_current_master(file_priv))
+ if (is_current_master)
connector->funcs->fill_modes(connector,
dev->mode_config.max_width,
dev->mode_config.max_height);
--
2.25.1
Currently, drm_file.master pointers should be protected by
drm_device.master_mutex when being dereferenced. This is because
drm_file.master is not invariant for the lifetime of drm_file. If
drm_file is not the creator of master, then drm_file.is_master is
false, and a call to drm_setmaster_ioctl will invoke
drm_new_set_master, which then allocates a new master for drm_file and
puts the old master.
Thus, without holding drm_device.master_mutex, the old value of
drm_file.master could be freed while it is being used by another
concurrent process.
However, it is not always possible to lock drm_device.master_mutex to
dereference drm_file.master. Through the fbdev emulation code, this
might occur in a deep nest of other locks. But drm_device.master_mutex
is also the outermost lock in the nesting hierarchy, so this leads to
potential deadlocks.
To address this, we introduce a new mutex at the bottom of the lock
hierarchy that only serializes drm_file.master. With this change, the
value of drm_file.master changes only when both
drm_device.master_mutex and drm_file.master_lock are held. Hence, any
process holding either of those locks can ensure that the value of
drm_file.master will not change concurrently.
Since no lock depends on the new drm_file.master_lock, when
drm_file.master is dereferenced, but drm_device.master_mutex cannot be
held, we can safely protect the master pointer with
drm_file.master_lock.
Reported-by: Daniel Vetter <[email protected]>
Signed-off-by: Desmond Cheong Zhi Xi <[email protected]>
---
Since our lock inversions were a result of dev->master_mutex being
used to serialize many other things, perhaps a finer grained lock will
solve the lockdep issues.
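For illustration only, the locking rule described in the commit message can be modelled in userspace with pthread mutexes. This is a rough sketch, not DRM code: every name in it (model_device, model_file, model_set_master, model_read_master_id) is a hypothetical stand-in. Writers take the outer lock and then the inner lock, while a reader that cannot take the outer lock takes only the inner one.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in DRM. */
struct model_master {
    int id;
};

struct model_device {
    pthread_mutex_t master_mutex;   /* outer lock, like drm_device.master_mutex */
};

struct model_file {
    struct model_device *dev;
    pthread_mutex_t master_lock;    /* innermost lock, like drm_file.master_lock */
    struct model_master *master;
};

static void model_device_init(struct model_device *dev)
{
    pthread_mutex_init(&dev->master_mutex, NULL);
}

static void model_file_init(struct model_file *f, struct model_device *dev)
{
    f->dev = dev;
    f->master = NULL;
    pthread_mutex_init(&f->master_lock, NULL);
}

/* Writer: the pointer only changes with BOTH locks held, outer lock first. */
static void model_set_master(struct model_file *f, struct model_master *m)
{
    pthread_mutex_lock(&f->dev->master_mutex);
    pthread_mutex_lock(&f->master_lock);
    f->master = m;
    pthread_mutex_unlock(&f->master_lock);
    pthread_mutex_unlock(&f->dev->master_mutex);
}

/*
 * Reader that cannot take the outer lock: holding the inner lock alone
 * keeps the pointer value stable while it is read. Returns -1 if the
 * file has no master.
 */
static int model_read_master_id(struct model_file *f)
{
    int id = -1;

    pthread_mutex_lock(&f->master_lock);
    if (f->master)
        id = f->master->id;
    pthread_mutex_unlock(&f->master_lock);
    return id;
}
```

Because the inner lock is never held while acquiring any other lock in this model, it cannot take part in a lock-order cycle, which is the property the new drm_file.master_lock relies on.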
drivers/gpu/drm/drm_auth.c | 10 ++++++++--
drivers/gpu/drm/drm_file.c | 1 +
include/drm/drm_file.h | 12 +++++++++---
3 files changed, 18 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
index ab1863c5a5a0..fe5b6adc6133 100644
--- a/drivers/gpu/drm/drm_auth.c
+++ b/drivers/gpu/drm/drm_auth.c
@@ -169,11 +169,14 @@ static int drm_new_set_master(struct drm_device *dev, struct drm_file *fpriv)
WARN_ON(fpriv->is_master);
old_master = fpriv->master;
+ mutex_lock(&fpriv->master_lock);
fpriv->master = drm_master_create(dev);
if (!fpriv->master) {
fpriv->master = old_master;
+ mutex_unlock(&fpriv->master_lock);
return -ENOMEM;
}
+ mutex_unlock(&fpriv->master_lock);
fpriv->is_master = 1;
fpriv->authenticated = 1;
@@ -332,10 +335,13 @@ int drm_master_open(struct drm_file *file_priv)
* any master object for render clients
*/
mutex_lock(&dev->master_mutex);
- if (!dev->master)
+ if (!dev->master) {
ret = drm_new_set_master(dev, file_priv);
- else
+ } else {
+ mutex_lock(&file_priv->master_lock);
file_priv->master = drm_master_get(dev->master);
+ mutex_unlock(&file_priv->master_lock);
+ }
mutex_unlock(&dev->master_mutex);
return ret;
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index d4f0bac6f8f8..8ccadfa1c752 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -176,6 +176,7 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
init_waitqueue_head(&file->event_wait);
file->event_space = 4096; /* set aside 4k for event buffer */
+ mutex_init(&file->master_lock);
mutex_init(&file->event_read_lock);
if (drm_core_check_feature(dev, DRIVER_GEM))
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index b81b3bfb08c8..88539f93fc8e 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -226,15 +226,21 @@ struct drm_file {
/**
* @master:
*
- * Master this node is currently associated with. Only relevant if
- * drm_is_primary_client() returns true. Note that this only
- * matches &drm_device.master if the master is the currently active one.
+ * Master this node is currently associated with. Protected by struct
+ * &drm_device.master_mutex, and serialized by @master_lock.
+ *
+ * Only relevant if drm_is_primary_client() returns true. Note that
+ * this only matches &drm_device.master if the master is the currently
+ * active one.
*
* See also @authentication and @is_master and the :ref:`section on
* primary nodes and authentication <drm_primary_node>`.
*/
struct drm_master *master;
+ /** @master_lock: Serializes @master. */
+ struct mutex master_lock;
+
/** @pid: Process that opened this file. */
struct pid *pid;
--
2.25.1
drm_file->master pointers should be protected by
drm_device.master_mutex or drm_file.master_lock when being
dereferenced.
However, in drm_lease.c, there are multiple instances where
drm_file->master is accessed and dereferenced while neither lock is
held. This makes drm_lease.c vulnerable to use-after-free bugs.
We address this issue in 2 ways:
1. Add a new drm_file_get_master() function that calls drm_master_get
on drm_file->master while holding on to drm_file.master_lock. Since
drm_master_get increments the reference count of master, this
prevents master from being freed until we unreference it with
drm_master_put.
2. In each case where drm_file->master is directly accessed and
eventually dereferenced in drm_lease.c, we wrap the access in a call
to the new drm_file_get_master function, then unreference the master
pointer once we are done using it.
Reported-by: Daniel Vetter <[email protected]>
Signed-off-by: Desmond Cheong Zhi Xi <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
---
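As an illustration of the get/put pattern described above, here is a minimal userspace sketch. It is a model under stated assumptions rather than the DRM implementation: the model_* names are hypothetical, a C11 atomic counter stands in for the kernel's kref, and model_destroyed exists only so that the moment of freeing can be observed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace model of the pattern; not the DRM implementation. */
struct model_master {
    atomic_int refcount;
    int id;
};

struct model_file {
    pthread_mutex_t master_lock;
    struct model_master *master;
};

static int model_destroyed;     /* counts freed masters, demonstration only */

static struct model_master *model_master_create(int id)
{
    struct model_master *m = malloc(sizeof(*m));

    if (!m)
        return NULL;
    atomic_init(&m->refcount, 1);   /* the creator owns one reference */
    m->id = id;
    return m;
}

/* Drop one reference, free on the last one, and clear the caller's pointer. */
static void model_master_put(struct model_master **m)
{
    if (atomic_fetch_sub(&(*m)->refcount, 1) == 1) {
        free(*m);
        model_destroyed++;
    }
    *m = NULL;
}

/* Take a reference while the lock keeps file->master from changing. */
static struct model_master *model_file_get_master(struct model_file *f)
{
    struct model_master *m = NULL;

    pthread_mutex_lock(&f->master_lock);
    if (f->master) {
        atomic_fetch_add(&f->master->refcount, 1);
        m = f->master;
    }
    pthread_mutex_unlock(&f->master_lock);
    return m;
}

/* Swap in a new master and drop the file's reference to the old one. */
static void model_file_replace_master(struct model_file *f,
                                      struct model_master *new_master)
{
    struct model_master *old;

    pthread_mutex_lock(&f->master_lock);
    old = f->master;
    f->master = new_master;
    pthread_mutex_unlock(&f->master_lock);
    if (old)
        model_master_put(&old);
}
```

In this model, a caller that obtained a pointer from model_file_get_master() can keep dereferencing it even after model_file_replace_master() has run, because the old object is only freed once the caller's own reference is put.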
drivers/gpu/drm/drm_auth.c | 25 ++++++++++++
drivers/gpu/drm/drm_lease.c | 81 ++++++++++++++++++++++++++++---------
include/drm/drm_auth.h | 1 +
include/drm/drm_file.h | 6 +++
4 files changed, 93 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
index fe5b6adc6133..17440ee54f30 100644
--- a/drivers/gpu/drm/drm_auth.c
+++ b/drivers/gpu/drm/drm_auth.c
@@ -390,6 +390,31 @@ struct drm_master *drm_master_get(struct drm_master *master)
}
EXPORT_SYMBOL(drm_master_get);
+/**
+ * drm_file_get_master - reference &drm_file.master of @file_priv
+ * @file_priv: DRM file private
+ *
+ * Increments the reference count of @file_priv's &drm_file.master and returns
+ * the &drm_file.master. If @file_priv has no &drm_file.master, returns NULL.
+ *
+ * Master pointers returned from this function should be unreferenced using
+ * drm_master_put().
+ */
+struct drm_master *drm_file_get_master(struct drm_file *file_priv)
+{
+ struct drm_master *master = NULL;
+
+ mutex_lock(&file_priv->master_lock);
+ if (!file_priv->master)
+ goto unlock;
+ master = drm_master_get(file_priv->master);
+
+unlock:
+ mutex_unlock(&file_priv->master_lock);
+ return master;
+}
+EXPORT_SYMBOL(drm_file_get_master);
+
static void drm_master_destroy(struct kref *kref)
{
struct drm_master *master = container_of(kref, struct drm_master, refcount);
diff --git a/drivers/gpu/drm/drm_lease.c b/drivers/gpu/drm/drm_lease.c
index 00fb433bcef1..92eac73d9001 100644
--- a/drivers/gpu/drm/drm_lease.c
+++ b/drivers/gpu/drm/drm_lease.c
@@ -106,10 +106,19 @@ static bool _drm_has_leased(struct drm_master *master, int id)
*/
bool _drm_lease_held(struct drm_file *file_priv, int id)
{
- if (!file_priv || !file_priv->master)
+ bool ret;
+ struct drm_master *master;
+
+ if (!file_priv)
return true;
- return _drm_lease_held_master(file_priv->master, id);
+ master = drm_file_get_master(file_priv);
+ if (!master)
+ return true;
+ ret = _drm_lease_held_master(master, id);
+ drm_master_put(&master);
+
+ return ret;
}
/**
@@ -128,13 +137,22 @@ bool drm_lease_held(struct drm_file *file_priv, int id)
struct drm_master *master;
bool ret;
- if (!file_priv || !file_priv->master || !file_priv->master->lessor)
+ if (!file_priv)
return true;
- master = file_priv->master;
+ master = drm_file_get_master(file_priv);
+ if (!master)
+ return true;
+ if (!master->lessor) {
+ ret = true;
+ goto out;
+ }
mutex_lock(&master->dev->mode_config.idr_mutex);
ret = _drm_lease_held_master(master, id);
mutex_unlock(&master->dev->mode_config.idr_mutex);
+
+out:
+ drm_master_put(&master);
return ret;
}
@@ -154,10 +172,16 @@ uint32_t drm_lease_filter_crtcs(struct drm_file *file_priv, uint32_t crtcs_in)
int count_in, count_out;
uint32_t crtcs_out = 0;
- if (!file_priv || !file_priv->master || !file_priv->master->lessor)
+ if (!file_priv)
return crtcs_in;
- master = file_priv->master;
+ master = drm_file_get_master(file_priv);
+ if (!master)
+ return crtcs_in;
+ if (!master->lessor) {
+ crtcs_out = crtcs_in;
+ goto out;
+ }
dev = master->dev;
count_in = count_out = 0;
@@ -176,6 +200,9 @@ uint32_t drm_lease_filter_crtcs(struct drm_file *file_priv, uint32_t crtcs_in)
count_in++;
}
mutex_unlock(&master->dev->mode_config.idr_mutex);
+
+out:
+ drm_master_put(&master);
return crtcs_out;
}
@@ -489,7 +516,7 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
size_t object_count;
int ret = 0;
struct idr leases;
- struct drm_master *lessor = lessor_priv->master;
+ struct drm_master *lessor;
struct drm_master *lessee = NULL;
struct file *lessee_file = NULL;
struct file *lessor_file = lessor_priv->filp;
@@ -501,12 +528,6 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
if (!drm_core_check_feature(dev, DRIVER_MODESET))
return -EOPNOTSUPP;
- /* Do not allow sub-leases */
- if (lessor->lessor) {
- DRM_DEBUG_LEASE("recursive leasing not allowed\n");
- return -EINVAL;
- }
-
/* need some objects */
if (cl->object_count == 0) {
DRM_DEBUG_LEASE("no objects in lease\n");
@@ -518,12 +539,22 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
return -EINVAL;
}
+ lessor = drm_file_get_master(lessor_priv);
+ /* Do not allow sub-leases */
+ if (lessor->lessor) {
+ DRM_DEBUG_LEASE("recursive leasing not allowed\n");
+ ret = -EINVAL;
+ goto out_lessor;
+ }
+
object_count = cl->object_count;
object_ids = memdup_user(u64_to_user_ptr(cl->object_ids),
array_size(object_count, sizeof(__u32)));
- if (IS_ERR(object_ids))
- return PTR_ERR(object_ids);
+ if (IS_ERR(object_ids)) {
+ ret = PTR_ERR(object_ids);
+ goto out_lessor;
+ }
idr_init(&leases);
@@ -534,14 +565,15 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
if (ret) {
DRM_DEBUG_LEASE("lease object lookup failed: %i\n", ret);
idr_destroy(&leases);
- return ret;
+ goto out_lessor;
}
/* Allocate a file descriptor for the lease */
fd = get_unused_fd_flags(cl->flags & (O_CLOEXEC | O_NONBLOCK));
if (fd < 0) {
idr_destroy(&leases);
- return fd;
+ ret = fd;
+ goto out_lessor;
}
DRM_DEBUG_LEASE("Creating lease\n");
@@ -577,6 +609,7 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
/* Hook up the fd */
fd_install(fd, lessee_file);
+ drm_master_put(&lessor);
DRM_DEBUG_LEASE("drm_mode_create_lease_ioctl succeeded\n");
return 0;
@@ -586,6 +619,8 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
out_leases:
put_unused_fd(fd);
+out_lessor:
+ drm_master_put(&lessor);
DRM_DEBUG_LEASE("drm_mode_create_lease_ioctl failed: %d\n", ret);
return ret;
}
@@ -608,7 +643,7 @@ int drm_mode_list_lessees_ioctl(struct drm_device *dev,
struct drm_mode_list_lessees *arg = data;
__u32 __user *lessee_ids = (__u32 __user *) (uintptr_t) (arg->lessees_ptr);
__u32 count_lessees = arg->count_lessees;
- struct drm_master *lessor = lessor_priv->master, *lessee;
+ struct drm_master *lessor, *lessee;
int count;
int ret = 0;
@@ -619,6 +654,7 @@ int drm_mode_list_lessees_ioctl(struct drm_device *dev,
if (!drm_core_check_feature(dev, DRIVER_MODESET))
return -EOPNOTSUPP;
+ lessor = drm_file_get_master(lessor_priv);
DRM_DEBUG_LEASE("List lessees for %d\n", lessor->lessee_id);
mutex_lock(&dev->mode_config.idr_mutex);
@@ -642,6 +678,7 @@ int drm_mode_list_lessees_ioctl(struct drm_device *dev,
arg->count_lessees = count;
mutex_unlock(&dev->mode_config.idr_mutex);
+ drm_master_put(&lessor);
return ret;
}
@@ -661,7 +698,7 @@ int drm_mode_get_lease_ioctl(struct drm_device *dev,
struct drm_mode_get_lease *arg = data;
__u32 __user *object_ids = (__u32 __user *) (uintptr_t) (arg->objects_ptr);
__u32 count_objects = arg->count_objects;
- struct drm_master *lessee = lessee_priv->master;
+ struct drm_master *lessee;
struct idr *object_idr;
int count;
void *entry;
@@ -675,6 +712,7 @@ int drm_mode_get_lease_ioctl(struct drm_device *dev,
if (!drm_core_check_feature(dev, DRIVER_MODESET))
return -EOPNOTSUPP;
+ lessee = drm_file_get_master(lessee_priv);
DRM_DEBUG_LEASE("get lease for %d\n", lessee->lessee_id);
mutex_lock(&dev->mode_config.idr_mutex);
@@ -702,6 +740,7 @@ int drm_mode_get_lease_ioctl(struct drm_device *dev,
arg->count_objects = count;
mutex_unlock(&dev->mode_config.idr_mutex);
+ drm_master_put(&lessee);
return ret;
}
@@ -720,7 +759,7 @@ int drm_mode_revoke_lease_ioctl(struct drm_device *dev,
void *data, struct drm_file *lessor_priv)
{
struct drm_mode_revoke_lease *arg = data;
- struct drm_master *lessor = lessor_priv->master;
+ struct drm_master *lessor;
struct drm_master *lessee;
int ret = 0;
@@ -730,6 +769,7 @@ int drm_mode_revoke_lease_ioctl(struct drm_device *dev,
if (!drm_core_check_feature(dev, DRIVER_MODESET))
return -EOPNOTSUPP;
+ lessor = drm_file_get_master(lessor_priv);
mutex_lock(&dev->mode_config.idr_mutex);
lessee = _drm_find_lessee(lessor, arg->lessee_id);
@@ -750,6 +790,7 @@ int drm_mode_revoke_lease_ioctl(struct drm_device *dev,
fail:
mutex_unlock(&dev->mode_config.idr_mutex);
+ drm_master_put(&lessor);
return ret;
}
diff --git a/include/drm/drm_auth.h b/include/drm/drm_auth.h
index 6bf8b2b78991..f99d3417f304 100644
--- a/include/drm/drm_auth.h
+++ b/include/drm/drm_auth.h
@@ -107,6 +107,7 @@ struct drm_master {
};
struct drm_master *drm_master_get(struct drm_master *master);
+struct drm_master *drm_file_get_master(struct drm_file *file_priv);
void drm_master_put(struct drm_master **master);
bool drm_is_current_master(struct drm_file *fpriv);
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index 88539f93fc8e..a58c7d83ae20 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -233,6 +233,12 @@ struct drm_file {
* this only matches &drm_device.master if the master is the currently
* active one.
*
+ * When dereferencing this pointer, either hold struct
+ * &drm_device.master_mutex for the duration of the pointer's use, or
+ * use drm_file_get_master() if struct &drm_device.master_mutex is not
+ * currently held and there is no other need to hold it. This prevents
+ * @master from being freed during use.
+ *
* See also @authentication and @is_master and the :ref:`section on
* primary nodes and authentication <drm_primary_node>`.
*/
--
2.25.1
While checking the master status of the DRM file in
drm_is_current_master(), the device's master mutex should be
held. Without the mutex, the pointer fpriv->master may be freed
concurrently by another process calling drm_setmaster_ioctl(). This
could lead to use-after-free errors when the pointer is subsequently
dereferenced in drm_lease_owner().
The callers of drm_is_current_master() from drm_auth.c hold the
device's master mutex, but external callers do not. Hence, we implement
drm_is_current_master_locked() to be used within drm_auth.c, and
modify drm_is_current_master() to grab the device's master mutex
before checking the master status.
Reported-by: Daniel Vetter <[email protected]>
Signed-off-by: Desmond Cheong Zhi Xi <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
---
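For readers less familiar with the locked/unlocked helper split, the following userspace sketch shows the general shape. It is only an analogy with hypothetical model_* names: a pthread mutex stands in for the device's master mutex, integer IDs stand in for master pointers, and there is no equivalent of lockdep_assert_held_once() here, so the "caller must hold the lock" rule is documented in a comment rather than checked.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; IDs stand in for drm_master pointers. */
struct model_device {
    pthread_mutex_t master_mutex;
    int current_master_id;      /* 0 means "no current master" */
};

struct model_file {
    struct model_device *dev;
    bool is_master;
    int master_id;
};

/* Caller must already hold dev->master_mutex. */
static bool model_is_current_master_locked(struct model_file *f)
{
    return f->is_master && f->master_id == f->dev->current_master_id;
}

/* Public variant for callers that do not hold the mutex. */
static bool model_is_current_master(struct model_file *f)
{
    bool ret;

    pthread_mutex_lock(&f->dev->master_mutex);
    ret = model_is_current_master_locked(f);
    pthread_mutex_unlock(&f->dev->master_mutex);
    return ret;
}

/*
 * Internal caller that already holds the mutex for its own work. It has
 * to use the _locked variant: calling the public one here would try to
 * take the same non-recursive mutex twice.
 */
static bool model_drop_master(struct model_file *f)
{
    bool dropped = false;

    pthread_mutex_lock(&f->dev->master_mutex);
    if (model_is_current_master_locked(f)) {
        f->dev->current_master_id = 0;
        dropped = true;
    }
    pthread_mutex_unlock(&f->dev->master_mutex);
    return dropped;
}
```

The check and the state change in model_drop_master() happen under one lock acquisition, so no other thread can change the current master between them.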
drivers/gpu/drm/drm_auth.c | 51 ++++++++++++++++++++++++--------------
1 file changed, 32 insertions(+), 19 deletions(-)
diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
index f00e5abdbbf4..ab1863c5a5a0 100644
--- a/drivers/gpu/drm/drm_auth.c
+++ b/drivers/gpu/drm/drm_auth.c
@@ -61,6 +61,35 @@
* trusted clients.
*/
+static bool drm_is_current_master_locked(struct drm_file *fpriv)
+{
+ lockdep_assert_held_once(&fpriv->minor->dev->master_mutex);
+
+ return fpriv->is_master && drm_lease_owner(fpriv->master) == fpriv->minor->dev->master;
+}
+
+/**
+ * drm_is_current_master - checks whether @priv is the current master
+ * @fpriv: DRM file private
+ *
+ * Checks whether @fpriv is current master on its device. This decides whether a
+ * client is allowed to run DRM_MASTER IOCTLs.
+ *
+ * Most of the modern IOCTL which require DRM_MASTER are for kernel modesetting
+ * - the current master is assumed to own the non-shareable display hardware.
+ */
+bool drm_is_current_master(struct drm_file *fpriv)
+{
+ bool ret;
+
+ mutex_lock(&fpriv->minor->dev->master_mutex);
+ ret = drm_is_current_master_locked(fpriv);
+ mutex_unlock(&fpriv->minor->dev->master_mutex);
+
+ return ret;
+}
+EXPORT_SYMBOL(drm_is_current_master);
+
int drm_getmagic(struct drm_device *dev, void *data, struct drm_file *file_priv)
{
struct drm_auth *auth = data;
@@ -223,7 +252,7 @@ int drm_setmaster_ioctl(struct drm_device *dev, void *data,
if (ret)
goto out_unlock;
- if (drm_is_current_master(file_priv))
+ if (drm_is_current_master_locked(file_priv))
goto out_unlock;
if (dev->master) {
@@ -272,7 +301,7 @@ int drm_dropmaster_ioctl(struct drm_device *dev, void *data,
if (ret)
goto out_unlock;
- if (!drm_is_current_master(file_priv)) {
+ if (!drm_is_current_master_locked(file_priv)) {
ret = -EINVAL;
goto out_unlock;
}
@@ -321,7 +350,7 @@ void drm_master_release(struct drm_file *file_priv)
if (file_priv->magic)
idr_remove(&file_priv->master->magic_map, file_priv->magic);
- if (!drm_is_current_master(file_priv))
+ if (!drm_is_current_master_locked(file_priv))
goto out;
drm_legacy_lock_master_cleanup(dev, master);
@@ -342,22 +371,6 @@ void drm_master_release(struct drm_file *file_priv)
mutex_unlock(&dev->master_mutex);
}
-/**
- * drm_is_current_master - checks whether @priv is the current master
- * @fpriv: DRM file private
- *
- * Checks whether @fpriv is current master on its device. This decides whether a
- * client is allowed to run DRM_MASTER IOCTLs.
- *
- * Most of the modern IOCTL which require DRM_MASTER are for kernel modesetting
- * - the current master is assumed to own the non-shareable display hardware.
- */
-bool drm_is_current_master(struct drm_file *fpriv)
-{
- return fpriv->is_master && drm_lease_owner(fpriv->master) == fpriv->minor->dev->master;
-}
-EXPORT_SYMBOL(drm_is_current_master);
-
/**
* drm_master_get - reference a master pointer
* @master: &struct drm_master
--
2.25.1
In a future patch, _drm_lease_held will dereference drm_file->master
only after making a call to drm_file_get_master. This will increment
the reference count of drm_file->master while holding onto a new
drm_file.master_lock.
In preparation for this, the call to _drm_lease_held should be moved
out from the section locked by &dev->mode_config.idr_mutex. This
avoids creating new lock hierarchies between
&dev->mode_config.idr_mutex and &drm_file->master_lock.
Reported-by: Daniel Vetter <[email protected]>
Signed-off-by: Desmond Cheong Zhi Xi <[email protected]>
---
drivers/gpu/drm/drm_mode_object.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/drm_mode_object.c b/drivers/gpu/drm/drm_mode_object.c
index b26588b52795..83e35ff3b13a 100644
--- a/drivers/gpu/drm/drm_mode_object.c
+++ b/drivers/gpu/drm/drm_mode_object.c
@@ -146,16 +146,18 @@ struct drm_mode_object *__drm_mode_object_find(struct drm_device *dev,
if (obj && obj->id != id)
obj = NULL;
- if (obj && drm_mode_object_lease_required(obj->type) &&
- !_drm_lease_held(file_priv, obj->id))
- obj = NULL;
-
if (obj && obj->free_cb) {
if (!kref_get_unless_zero(&obj->refcount))
obj = NULL;
}
mutex_unlock(&dev->mode_config.idr_mutex);
+ if (obj && drm_mode_object_lease_required(obj->type) &&
+ !_drm_lease_held(file_priv, obj->id)) {
+ drm_mode_object_put(obj);
+ obj = NULL;
+ }
+
return obj;
}
--
2.25.1
On Fri, Jul 02, 2021 at 12:53:53AM +0800, Desmond Cheong Zhi Xi wrote:
> This patch series addresses potential use-after-free errors when dereferencing pointers to struct drm_master. These were identified after one such bug was caught by Syzbot in drm_getunique():
> https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803
>
> The series is broken up into five patches:
>
> 1. Move a call to drm_is_current_master() out from a section locked by &dev->mode_config.mutex in drm_mode_getconnector(). This patch does not apply to stable.
>
> 2. Move a call to _drm_lease_held() out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find().
>
> 3. Implement a locked version of the drm_is_current_master() function for use within drm_auth.c.
>
> 4. Serialize drm_file.master by introducing a new lock that's held whenever the value of drm_file.master changes.
>
> 5. Identify areas in drm_lease.c where pointers to struct drm_master are dereferenced, and ensure that the master pointers are not freed during use.
>
> Changes in v6 -> v7:
> - Patch 2:
> Modify code alignment as suggested by the intel-gfx CI.
>
> Update commit message based on the changes to patch 5.
>
> - Patch 4:
> Add patch 4 to the series. This patch adds a new lock to serialize drm_file.master, in response to the lockdep splat by the intel-gfx CI.
>
> - Patch 5:
> Move kerneldoc comment about protecting drm_file.master with drm_device.master_mutex into patch 4.
>
> Update drm_file_get_master to use the new drm_file.master_lock instead of drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI.
So there's another one now because master->leases is protected by the
mode_config.idr_mutex, and that's a bit awkward to untangle.
Also I'm really surprised that there was no lockdep splat through the atomic
code anywhere. The reason seems to be that somehow CI rebooted first before
it managed to run any of the kms_atomic tests, and we can only hit this
when we go through the atomic kms ioctl; the legacy kms ioctls don't have
that specific issue.
Anyway I think this approach doesn't look too workable, and we need
something new.
But first things first: Are you still on board working on this? You
started with a simple patch to fix a UAF bug, now we're deep into
reworking tricky locking ... If you feel like you want out I'm totally
fine with that.
Anyway, I think we need to split drm_device->master_mutex up into two
parts:
- One part that protects the actual access/changes, which I think for
simplicity we'll just leave as the current lock. That lock is a very
inner lock, since for the drm_lease.c stuff it has to nest within
mode_config.idr_mutex even.
- Now the issue with checking master status/leases/whatever as an
innermost lock is that you can race, it's a classic time of check vs
time of use race: By the time we actually use the thing we validated
we're allowed to use, we might not have access anymore. There are two
reasons for that:
* DROPMASTER ioctl could remove the master rights, which removes access
rights also for all leases
* REVOKE_LEASE ioctl can do the same but only for a specific lease
This is the thing we're trying to protect against in fbcon code, but
that's very spotty protection because all the ioctls by other users
aren't actually protected against this.
So I think for this we need some kind of big reader lock.
Now for the implementation, there's a few things:
- I think the best option for this big reader lock would be to just use srcu.
We only need to flush out all current readers when we drop master or
revoke a lease, so synchronize_srcu is perfectly good enough for this
purpose.
- The fbdev code would switch over to srcu in
drm_master_internal_acquire() and drm_master_internal_release(). Ofc
within drm_master_internal_acquire we'd still need to check master
status with the normal master_mutex.
- While we revamp all this we should fix the ioctl checks in drm_ioctl.c.
Just noticed that drm_ioctl_permit() could and should be unexported,
last user was removed.
Within drm_ioctl_kernel we'd then replace the check for
drm_is_current_master with the drm_master_internal_acquire/release.
- This alone does nothing, we still need to make sure that dropmaster and
revoke_lease ioctl flush out all other access before they return to
userspace. We can't just call synchronize_srcu because, due to the ioctl
code in drm_ioctl_kernel, we're in that srcu section, so we'd need to add a
DRM_MASTER_FLUSH ioctl flag which we'd check only when DRM_MASTER is
set, and use to call synchronize_srcu. Maybe wrap that in a
drm_master_flush or so, or perhaps a drm_master_internal_release_flush.
- Also maybe we should drop the _internal_ from that name. Feels a bit
wrong when we're also going to use this in the ioctl handler.
Thoughts? Totally silly and overkill?
Cheers, Daniel
> Changes in v5 -> v6:
> - Patch 2:
> Add patch 2 to the series. This patch moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find.
>
> - Patch 5:
> Clarify the kerneldoc for dereferencing drm_file.master, as suggested by Daniel Vetter.
>
> Refactor error paths with goto labels so that each function only has a single drm_master_put(), as suggested by Emil Velikov.
>
> Modify comparison to NULL into "!master", as suggested by the intel-gfx CI.
>
> Changes in v4 -> v5:
> - Patch 1:
> Add patch 1 to the series. The changes in patch 1 do not apply to stable because they apply to new changes in the drm-misc-next branch. This patch moves the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex.
>
> Additionally, added a missing semicolon to the patch, caught by the intel-gfx CI.
>
> - Patch 3:
> Move changes to drm_connector.c into patch 1.
>
> Changes in v3 -> v4:
> - Patch 3:
> Move the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex, as suggested by Daniel Vetter. This avoids a circular lock dependency as reported here https://patchwork.freedesktop.org/patch/440406/
>
> Additionally, inside drm_is_current_master, instead of grabbing &fpriv->master->dev->master_mutex, we grab &fpriv->minor->dev->master_mutex to avoid dereferencing a null ptr if fpriv->master is not set.
>
> - Patch 5:
> Modify kerneldoc formatting.
>
> Additionally, add a file_priv->master NULL check inside drm_file_get_master, and handle the NULL result accordingly in drm_lease.c. As suggested by Daniel Vetter.
>
> Changes in v2 -> v3:
> - Patch 3:
> Move the definition of drm_is_current_master and the _locked version higher up in drm_auth.c to avoid needing a forward declaration of drm_is_current_master_locked. As suggested by Daniel Vetter.
>
> - Patch 5:
> Instead of leaking drm_device.master_mutex into drm_lease.c to protect drm_master pointers, add a new drm_file_get_master() function that returns drm_file->master while increasing its reference count, to prevent drm_file->master from being freed. As suggested by Daniel Vetter.
>
> Changes in v1 -> v2:
> - Patch 5:
> Move the lock and assignment before the DRM_DEBUG_LEASE in drm_mode_get_lease_ioctl, as suggested by Emil Velikov.
>
> Desmond Cheong Zhi Xi (5):
> drm: avoid circular locks in drm_mode_getconnector
> drm: separate locks in __drm_mode_object_find
> drm: add a locked version of drm_is_current_master
> drm: serialize drm_file.master with a master lock
> drm: protect drm_master pointers in drm_lease.c
>
> drivers/gpu/drm/drm_auth.c | 86 +++++++++++++++++++++++--------
> drivers/gpu/drm/drm_connector.c | 5 +-
> drivers/gpu/drm/drm_file.c | 1 +
> drivers/gpu/drm/drm_lease.c | 81 ++++++++++++++++++++++-------
> drivers/gpu/drm/drm_mode_object.c | 10 ++--
> include/drm/drm_auth.h | 1 +
> include/drm/drm_file.h | 18 +++++--
> 7 files changed, 153 insertions(+), 49 deletions(-)
>
> --
> 2.25.1
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On 3/7/21 3:07 am, Daniel Vetter wrote:
> On Fri, Jul 02, 2021 at 12:53:53AM +0800, Desmond Cheong Zhi Xi wrote:
>> Changes in v6 -> v7:
>> - Patch 2:
>> Modify code alignment as suggested by the intel-gfx CI.
>>
>> Update commit message based on the changes to patch 5.
>>
>> - Patch 4:
>> Add patch 4 to the series. This patch adds a new lock to serialize drm_file.master, in response to the lockdep splat by the intel-gfx CI.
>>
>> - Patch 5:
>> Move kerneldoc comment about protecting drm_file.master with drm_device.master_mutex into patch 4.
>>
>> Update drm_file_get_master to use the new drm_file.master_lock instead of drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI.
>
> So there's another one now because master->leases is protected by the
> mode_config.idr_mutex, and that's a bit awkward to untangle.
>
> Also I'm really surprised that there was no lockdep splat through the atomic
> code anywhere. The reason seems to be that somehow CI rebooted before
> it managed to run any of the kms_atomic tests, and we can only hit this
> when we go through the atomic kms ioctl; the legacy kms ioctls don't have
> that specific issue.
>
> Anyway I think this approach doesn't look too workable, and we need
> something new.
>
> But first things first: Are you still on board working on this? You
> started with a simple patch to fix a UAF bug, now we're deep into
> reworking tricky locking ... If you feel like you want out I'm totally
> fine with that.
>
Hi Daniel,
Thanks for asking, but I'm committed to seeing this through :) In fact,
I really appreciate all your guidance and patience as the simple patch
evolved into the current state of things.
> Anyway, I think we need to split drm_device->master_mutex up into two
> parts:
>
> - One part that protects the actual access/changes, which I think for
> simplicity we'll just leave as the current lock. That lock is a very
> inner lock, since for the drm_lease.c stuff it has to nest within
> mode_config.idr_mutex even.
>
> - Now the issue with checking master status/leases/whatever as an
> innermost lock is that you can race, it's a classic time of check vs
> time of use race: By the time we actually use the thing we validated
> we're allowed to use, we might not have access anymore. There are two
> reasons for that:
>
> * DROPMASTER ioctl could remove the master rights, which removes access
> rights also for all leases
>
> * REVOKE_LEASE ioctl can do the same but only for a specific lease
>
> This is the thing we're trying to protect against in fbcon code, but
> that's very spotty protection because all the ioctls by other users
> aren't actually protected against this.
>
> So I think for this we need some kind of big reader lock.
>
> Now for the implementation, there's a few things:
>
> - I think best option for this big reader lock would be to just use srcu.
> We only need to flush out all current readers when we drop master or
> revoke a lease, so synchronize_srcu is perfectly good enough for this
> purpose.
>
> - The fbdev code would switch over to srcu in
> drm_master_internal_acquire() and drm_master_internal_release(). Ofc
> within drm_master_internal_acquire we'd still need to check master
> status with the normal master_mutex.
>
> - While we revamp all this we should fix the ioctl checks in drm_ioctl.c.
> Just noticed that drm_ioctl_permit() could and should be unexported,
> last user was removed.
>
> Within drm_ioctl_kernel we'd then replace the check for
> drm_is_current_master with the drm_master_internal_acquire/release.
>
> - This alone does nothing, we still need to make sure that dropmaster and
> revoke_lease ioctl flush out all other access before they return to
> userspace. We can't just call synchronize_srcu because, due to the ioctl
> code in drm_ioctl_kernel, we're in that srcu section, so we'd need to add a
> DRM_MASTER_FLUSH ioctl flag which we'd check only when DRM_MASTER is
> set, and use to call synchronize_srcu. Maybe wrap that in a
> drm_master_flush or so, or perhaps a drm_master_internal_release_flush.
>
> - Also maybe we should drop the _internal_ from that name. Feels a bit
> wrong when we're also going to use this in the ioctl handler.
>
> Thoughts? Totally silly and overkill?
>
> Cheers, Daniel
>
>
Just some thoughts on the previous approach before we move on to
something new. Regarding the lockdep warning for mode_config.idr_mutex,
I think that's resolvable now by simply removing patch 2, which is no
longer really necessary with the introduction of a new mutex at the
bottom of the lock hierarchy in patch 4.
I was hesitant to create a new mutex (especially since this means that
drm_file.master is now protected by either of two mutexes), but it's
probably the smallest fix in terms of code churn. Is that approach no good?
Otherwise, on a high level, I think using an srcu mechanism makes a lot
of sense to me to address the issue of data items being reclaimed while
some readers still have references to them.
The implementation details seem sound to me too, but I'll need to code
it up a bit before I can comment further.
Best wishes,
Desmond
On Mon, Jul 05, 2021 at 10:15:45AM +0800, Desmond Cheong Zhi Xi wrote:
> On 3/7/21 3:07 am, Daniel Vetter wrote:
> > On Fri, Jul 02, 2021 at 12:53:53AM +0800, Desmond Cheong Zhi Xi wrote:
> > > Changes in v6 -> v7:
> > > - Patch 2:
> > > Modify code alignment as suggested by the intel-gfx CI.
> > >
> > > Update commit message based on the changes to patch 5.
> > >
> > > - Patch 4:
> > > Add patch 4 to the series. This patch adds a new lock to serialize drm_file.master, in response to the lockdep splat by the intel-gfx CI.
> > >
> > > - Patch 5:
> > > Move kerneldoc comment about protecting drm_file.master with drm_device.master_mutex into patch 4.
> > >
> > > Update drm_file_get_master to use the new drm_file.master_lock instead of drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI.
> >
> > So there's another one now because master->leases is protected by the
> > mode_config.idr_mutex, and that's a bit awkward to untangle.
> >
> > Also I'm really surprised that there was no lockdep splat through the atomic
> > code anywhere. The reason seems to be that somehow CI rebooted before
> > it managed to run any of the kms_atomic tests, and we can only hit this
> > when we go through the atomic kms ioctl; the legacy kms ioctls don't have
> > that specific issue.
> >
> > Anyway I think this approach doesn't look too workable, and we need
> > something new.
> >
> > But first things first: Are you still on board working on this? You
> > started with a simple patch to fix a UAF bug, now we're deep into
> > reworking tricky locking ... If you feel like you want out I'm totally
> > fine with that.
> >
>
> Hi Daniel,
>
> Thanks for asking, but I'm committed to seeing this through :) In fact, I
> really appreciate all your guidance and patience as the simple patch evolved
> into the current state of things.
Cool, it's definitely been fun trying to figure out a good solution for
this tricky problem here :-)
> > Anyway, I think we need to split drm_device->master_mutex up into two
> > parts:
> >
> > - One part that protects the actual access/changes, which I think for
> > simplicity we'll just leave as the current lock. That lock is a very
> > inner lock, since for the drm_lease.c stuff it has to nest within
> > mode_config.idr_mutex even.
> >
> > - Now the issue with checking master status/leases/whatever as an
> > innermost lock is that you can race, it's a classic time of check vs
> > time of use race: By the time we actually use the thing we validated
> > we're allowed to use, we might not have access anymore. There are two
> > reasons for that:
> >
> > * DROPMASTER ioctl could remove the master rights, which removes access
> > rights also for all leases
> >
> > * REVOKE_LEASE ioctl can do the same but only for a specific lease
> >
> > This is the thing we're trying to protect against in fbcon code, but
> > that's very spotty protection because all the ioctls by other users
> > aren't actually protected against this.
> >
> > So I think for this we need some kind of big reader lock.
> >
> > Now for the implementation, there's a few things:
> >
> > - I think best option for this big reader lock would be to just use srcu.
> > We only need to flush out all current readers when we drop master or
> > revoke a lease, so synchronize_srcu is perfectly good enough for this
> > purpose.
> >
> > - The fbdev code would switch over to srcu in
> > drm_master_internal_acquire() and drm_master_internal_release(). Ofc
> > within drm_master_internal_acquire we'd still need to check master
> > status with the normal master_mutex.
> >
> > - While we revamp all this we should fix the ioctl checks in drm_ioctl.c.
> > Just noticed that drm_ioctl_permit() could and should be unexported,
> > last user was removed.
> >
> > Within drm_ioctl_kernel we'd then replace the check for
> > drm_is_current_master with the drm_master_internal_acquire/release.
> >
> > - This alone does nothing, we still need to make sure that dropmaster and
> > revoke_lease ioctl flush out all other access before they return to
> > userspace. We can't just call synchronize_srcu because, due to the ioctl
> > code in drm_ioctl_kernel, we're in that srcu section, so we'd need to add a
> > DRM_MASTER_FLUSH ioctl flag which we'd check only when DRM_MASTER is
> > set, and use to call synchronize_srcu. Maybe wrap that in a
> > drm_master_flush or so, or perhaps a drm_master_internal_release_flush.
> >
> > - Also maybe we should drop the _internal_ from that name. Feels a bit
> > wrong when we're also going to use this in the ioctl handler.
> >
> > Thoughts? Totally silly and overkill?
> >
> > Cheers, Daniel
> >
> >
>
> Just some thoughts on the previous approach before we move on to something
> new. Regarding the lockdep warning for mode_config.idr_mutex, I think that's
> resolvable now by simply removing patch 2, which is no longer really
> necessary with the introduction of a new mutex at the bottom of the lock
> hierarchy in patch 4.
Oh I missed that, this is essentially part-way to what I'm describing
above.
> I was hesitant to create a new mutex (especially since this means that
> drm_file.master is now protected by either of two mutexes), but it's
> probably the smallest fix in terms of code churn. Is that approach no good?
That's the other approach I considered. It solves the use-after-free
issue, but while I was musing all the different issues here I realized
that we might as well use the opportunity to plug a few functional races
around drm_device ownership rules.
I do think it works. One thing I'd change is to make it a spinlock - that
way it's very clear that it's a tiny inner lock that's really only meant
to protect the ->master pointer.
> Otherwise, on a high level, I think using an srcu mechanism makes a lot of
> sense to me to address the issue of data items being reclaimed while some
> readers still have references to them.
>
> The implementation details seem sound to me too, but I'll need to code it up
> a bit before I can comment further.
So maybe this is complete overkill, but what about three locks :-)
- innermost spinlock, just to protect against use-after-free until we
successfully got a reference. Essentially this is the lookup lock -
maybe we could call it master_lookup_lock for clarity?
- mutex like we have right now to make sure master state is consistent
when someone races set/dropmaster in userspace. This would be the only
write lock we have.
- new srcu to make sure that after a dropmaster/revoke-lease all previous
users' calls are flushed out with synchronize_srcu(). Essentially this
wouldn't be a lock, but more a barrier. So maybe we should call it
master_barrier_srcu or so? fbdev emulation in drm_client would use this,
and also drm_ioctl code to plug the race I've spotted.
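As a rough illustration of the innermost lookup lock from the list above, here is a minimal userspace sketch, not kernel code. It assumes a pthread mutex in place of the spinlock and a C11 atomic counter in place of a kref; struct master, struct file_priv, and all function names are hypothetical stand-ins and not the actual DRM API.

```c
/* Userspace analogue of the proposed master lookup lock; NOT kernel code. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct master {
	atomic_int refcount;	/* assumption: stands in for struct kref */
};

struct file_priv {
	pthread_mutex_t master_lookup_lock;	/* stands in for the spinlock */
	struct master *master;
};

/* Allocate a master with one reference held by the caller. */
static struct master *master_create(void)
{
	struct master *m = malloc(sizeof(*m));

	if (m)
		atomic_init(&m->refcount, 1);
	return m;
}

/* Drop one reference; free the object when the last one goes away. */
static void master_put(struct master *m)
{
	if (m && atomic_fetch_sub(&m->refcount, 1) == 1)
		free(m);
}

/*
 * Read the pointer and take a reference while holding the lookup lock,
 * so the object cannot be freed between the load and the increment.
 */
static struct master *file_get_master(struct file_priv *f)
{
	struct master *m;

	pthread_mutex_lock(&f->master_lookup_lock);
	m = f->master;
	if (m)
		atomic_fetch_add(&m->refcount, 1);
	pthread_mutex_unlock(&f->master_lookup_lock);
	return m;
}

/* Swap the pointer under the lock; drop the old reference outside it. */
static void file_set_master(struct file_priv *f, struct master *new_master)
{
	struct master *old;

	pthread_mutex_lock(&f->master_lookup_lock);
	old = f->master;
	f->master = new_master;
	pthread_mutex_unlock(&f->master_lookup_lock);
	master_put(old);
}
```

The lock is held only across the pointer load and the reference increment, which is why a tiny inner lock is enough here: once the caller owns a reference, it can use the object without holding any lock.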
So maybe refresh your series with just the pieces you think we need for
the master lookup spinlock, and we try to land that first?
I do agree this should work against the use-after-free.
Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On 5/7/21 10:34 pm, Daniel Vetter wrote:
> On Mon, Jul 05, 2021 at 10:15:45AM +0800, Desmond Cheong Zhi Xi wrote:
>> On 3/7/21 3:07 am, Daniel Vetter wrote:
>>> On Fri, Jul 02, 2021 at 12:53:53AM +0800, Desmond Cheong Zhi Xi wrote:
>>>> Changes in v6 -> v7:
>>>> - Patch 2:
>>>> Modify code alignment as suggested by the intel-gfx CI.
>>>>
>>>> Update commit message based on the changes to patch 5.
>>>>
>>>> - Patch 4:
>>>> Add patch 4 to the series. This patch adds a new lock to serialize drm_file.master, in response to the lockdep splat by the intel-gfx CI.
>>>>
>>>> - Patch 5:
>>>> Move kerneldoc comment about protecting drm_file.master with drm_device.master_mutex into patch 4.
>>>>
>>>> Update drm_file_get_master to use the new drm_file.master_lock instead of drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI.
>>>
>>> So there's another one now because master->leases is protected by the
>>> mode_config.idr_mutex, and that's a bit awkward to untangle.
>>>
>>> Also I'm really surprised that there was no lockdep splat through the atomic
>>> code anywhere. The reason seems to be that somehow CI rebooted before
>>> it managed to run any of the kms_atomic tests, and we can only hit this
>>> when we go through the atomic kms ioctl; the legacy kms ioctls don't have
>>> that specific issue.
>>>
>>> Anyway I think this approach doesn't look too workable, and we need
>>> something new.
>>>
>>> But first things first: Are you still on board working on this? You
>>> started with a simple patch to fix a UAF bug, now we're deep into
>>> reworking tricky locking ... If you feel like you want out I'm totally
>>> fine with that.
>>>
>>
>> Hi Daniel,
>>
>> Thanks for asking, but I'm committed to seeing this through :) In fact, I
>> really appreciate all your guidance and patience as the simple patch evolved
>> into the current state of things.
>
> Cool, it's definitely been fun trying to figure out a good solution for
> this tricky problem here :-)
>
>>> Anyway, I think we need to split drm_device->master_mutex up into two
>>> parts:
>>>
>>> - One part that protects the actual access/changes, which I think for
>>> simplicity we'll just leave as the current lock. That lock is a very
>>> inner lock, since for the drm_lease.c stuff it has to nest within
>>> mode_config.idr_mutex even.
>>>
>>> - Now the issue with checking master status/leases/whatever as an
>>> innermost lock is that you can race, it's a classic time of check vs
>>> time of use race: By the time we actually use the thing we validated
>>> we're allowed to use, we might not have access anymore. There are two
>>> reasons for that:
>>>
>>> * DROPMASTER ioctl could remove the master rights, which removes access
>>> rights also for all leases
>>>
>>> * REVOKE_LEASE ioctl can do the same but only for a specific lease
>>>
>>> This is the thing we're trying to protect against in fbcon code, but
>>> that's very spotty protection because all the ioctls by other users
>>> aren't actually protected against this.
>>>
>>> So I think for this we need some kind of big reader lock.
>>>
>>> Now for the implementation, there's a few things:
>>>
>>> - I think best option for this big reader lock would be to just use srcu.
>>> We only need to flush out all current readers when we drop master or
>>> revoke a lease, so synchronize_srcu is perfectly good enough for this
>>> purpose.
>>>
>>> - The fbdev code would switch over to srcu in
>>> drm_master_internal_acquire() and drm_master_internal_release(). Ofc
>>> within drm_master_internal_acquire we'd still need to check master
>>> status with the normal master_mutex.
>>>
>>> - While we revamp all this we should fix the ioctl checks in drm_ioctl.c.
>>> Just noticed that drm_ioctl_permit() could and should be unexported,
>>> last user was removed.
>>>
>>> Within drm_ioctl_kernel we'd then replace the check for
>>> drm_is_current_master with the drm_master_internal_acquire/release.
>>>
>>> - This alone does nothing, we still need to make sure that dropmaster and
>>> revoke_lease ioctl flush out all other access before they return to
>>> userspace. We can't just call synchronize_srcu because, due to the ioctl
>>> code in drm_ioctl_kernel, we're in that srcu section, so we'd need to add a
>>> DRM_MASTER_FLUSH ioctl flag which we'd check only when DRM_MASTER is
>>> set, and use to call synchronize_srcu. Maybe wrap that in a
>>> drm_master_flush or so, or perhaps a drm_master_internal_release_flush.
>>>
>>> - Also maybe we should drop the _internal_ from that name. Feels a bit
>>> wrong when we're also going to use this in the ioctl handler.
>>>
>>> Thoughts? Totally silly and overkill?
>>>
>>> Cheers, Daniel
>>>
>>>
>>
>> Just some thoughts on the previous approach before we move on to something
>> new. Regarding the lockdep warning for mode_config.idr_mutex, I think that's
>> resolvable now by simply removing patch 2, which is no longer really
>> necessary with the introduction of a new mutex at the bottom of the lock
>> hierarchy in patch 4.
>
> Oh I missed that, this is essentially part-way to what I'm describing
> above.
>
>> I was hesitant to create a new mutex (especially since this means that
>> drm_file.master is now protected by either of two mutexes), but it's
>> probably the smallest fix in terms of code churn. Is that approach no good?
>
> That's the other approach I considered. It solves the use-after-free
> issue, but while I was musing all the different issues here I realized
> that we might as well use the opportunity to plug a few functional races
> around drm_device ownership rules.
>

Ah, right, that sounds like a good thing to do. I suspect that I might
have misunderstood what we're trying to achieve, so to clarify:

Is the issue that DROPMASTER ioctl/REVOKE_LEASE ioctl may be called
concurrently with other ioctls, so we have to ensure that these other
ioctl cmds are not running with outdated permissions (by ensuring
they've been flushed out) once dropmaster/revoke_lease successfully
return to the user?

If that's the case, then something that confuses me is why we'd want to
use the srcu read lock in drm_master_internal_acquire. The function
returns true only if dev->master is not set. Wouldn't this mean that
between a successful call to drm_master_internal_acquire/release,
there's no master that would be affected by DROPMASTER/REVOKE_LEASE?

I'm also confused as to how drm_master_internal_acquire/release can
replace the check for drm_is_current_master.

> I do think it works. One thing I'd change is make it a spinlock - that
> way it's very clear that it's a tiny inner lock that's really only meant
> to protect the ->master pointer.
>
>> Otherwise, on a high level, I think using an srcu mechanism makes a lot of
>> sense to me to address the issue of data items being reclaimed while some
>> readers still have references to them.
>>
>> The implementation details seem sound to me too, but I'll need to code it up
>> a bit before I can comment further.
>
> So maybe this is complete overkill, but what about three locks :-)
>
> - innermost spinlock, just to protect against use-after-free until we
> successfully got a reference. Essentially this is the lookup lock -
> maybe we could call it master_lookup_lock for clarity?
>
> - mutex like we have right now to make sure master state is consistent
> when someone races set/dropmaster in userspace. This would be the only
> write lock we have.
>
> - new srcu to make sure that after a dropmaster/revoke-lease all previous
> users' calls are flushed out with synchronize_srcu(). Essentially this
> wouldn't be a lock, but more a barrier. So maybe should call it
> master_barrier_srcu or so? fbdev emulation in drm_client would use this,
> and also drm_ioctl code to plug the race I've spotted.
>
> So maybe refresh your series with just the pieces you think we need for
> the master lookup spinlock, and we try to land that first?
>
Besides the clarification above, this plan and the change to a
spinlock sound good. I'll update the series accordingly.
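
To check my understanding of the innermost lookup lock, here is a rough
userspace sketch of the pattern, not the actual patch. The struct and
function names (struct master, struct file_priv, file_get_master,
file_set_master) are simplified stand-ins I made up for illustration, and
a C11 atomic_flag spin loop stands in for a kernel spinlock. The idea is
that the lock is held only long enough to read the ->master pointer and
take a reference, so the object cannot be freed between the lookup and
the reference grab.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Stand-in for struct drm_master: only the refcount matters here. */
struct master {
	atomic_int refcount;
};

/* Stand-in for struct drm_file. */
struct file_priv {
	atomic_flag master_lookup_lock;	/* name taken from the suggestion above */
	struct master *master;		/* protected by master_lookup_lock */
};

static void file_priv_init(struct file_priv *f)
{
	atomic_flag_clear(&f->master_lookup_lock);
	f->master = NULL;
}

/* Tiny spinlock built on atomic_flag (stand-in for spin_lock/spin_unlock). */
static void lookup_lock(struct file_priv *f)
{
	while (atomic_flag_test_and_set_explicit(&f->master_lookup_lock,
						 memory_order_acquire))
		; /* spin until the holder clears the flag */
}

static void lookup_unlock(struct file_priv *f)
{
	atomic_flag_clear_explicit(&f->master_lookup_lock,
				   memory_order_release);
}

/* Allocate a master holding one initial reference. */
static struct master *master_create(void)
{
	struct master *m = malloc(sizeof(*m));

	if (m)
		atomic_init(&m->refcount, 1);
	return m;
}

/* Drop one reference; free the object when the last one goes away. */
static void master_put(struct master *m)
{
	int old = atomic_fetch_sub(&m->refcount, 1);

	assert(old > 0);
	if (old == 1)
		free(m);
}

/* Reader side: return the file's master with an extra reference, or NULL. */
static struct master *file_get_master(struct file_priv *f)
{
	struct master *m;

	lookup_lock(f);
	m = f->master;
	if (m)
		atomic_fetch_add(&m->refcount, 1);
	lookup_unlock(f);
	return m;
}

/*
 * Writer side: swap the pointer under the lock. The file takes over the
 * caller's reference to new_master; the old reference is dropped only
 * after the lock has been released.
 */
static void file_set_master(struct file_priv *f, struct master *new_master)
{
	struct master *old;

	lookup_lock(f);
	old = f->master;
	f->master = new_master;
	lookup_unlock(f);
	if (old)
		master_put(old);
}
```

A caller would pair file_get_master() with master_put() and check for
NULL in between. In this sketch the lookup lock does not serialize
set/dropmaster against each other; that would still be the job of the
existing mutex.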
> I do agree this should work against the use-after-free.
>
> Cheers, Daniel
>
>>
>> Best wishes,
>> Desmond
>>
>>>> Changes in v5 -> v6:
>>>> - Patch 2:
>>>> Add patch 2 to the series. This patch moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find.
>>>>
>>>> - Patch 5:
>>>> Clarify the kerneldoc for dereferencing drm_file.master, as suggested by Daniel Vetter.
>>>>
>>>> Refactor error paths with goto labels so that each function only has a single drm_master_put(), as suggested by Emil Velikov.
>>>>
>>>> Modify comparison to NULL into "!master", as suggested by the intel-gfx CI.
>>>>
>>>> Changes in v4 -> v5:
>>>> - Patch 1:
>>>> Add patch 1 to the series. The changes in patch 1 do not apply to stable because they apply to new changes in the drm-misc-next branch. This patch moves the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex.
>>>>
>>>> Additionally, added a missing semicolon to the patch, caught by the intel-gfx CI.
>>>>
>>>> - Patch 3:
>>>> Move changes to drm_connector.c into patch 1.
>>>>
>>>> Changes in v3 -> v4:
>>>> - Patch 3:
>>>> Move the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex. As suggested by Daniel Vetter. This avoids a circular lock dependency as reported here https://patchwork.freedesktop.org/patch/440406/
>>>>
>>>> Additionally, inside drm_is_current_master, instead of grabbing &fpriv->master->dev->master_mutex, we grab &fpriv->minor->dev->master_mutex to avoid dereferencing a null ptr if fpriv->master is not set.
>>>>
>>>> - Patch 5:
>>>> Modify kerneldoc formatting.
>>>>
>>>> Additionally, add a file_priv->master NULL check inside drm_file_get_master, and handle the NULL result accordingly in drm_lease.c. As suggested by Daniel Vetter.
>>>>
>>>> Changes in v2 -> v3:
>>>> - Patch 3:
>>>> Move the definition of drm_is_current_master and the _locked version higher up in drm_auth.c to avoid needing a forward declaration of drm_is_current_master_locked. As suggested by Daniel Vetter.
>>>>
>>>> - Patch 5:
>>>> Instead of leaking drm_device.master_mutex into drm_lease.c to protect drm_master pointers, add a new drm_file_get_master() function that returns drm_file->master while increasing its reference count, to prevent drm_file->master from being freed. As suggested by Daniel Vetter.
>>>>
>>>> Changes in v1 -> v2:
>>>> - Patch 5:
>>>> Move the lock and assignment before the DRM_DEBUG_LEASE in drm_mode_get_lease_ioctl, as suggested by Emil Velikov.
>>>>
>>>> Desmond Cheong Zhi Xi (5):
>>>> drm: avoid circular locks in drm_mode_getconnector
>>>> drm: separate locks in __drm_mode_object_find
>>>> drm: add a locked version of drm_is_current_master
>>>> drm: serialize drm_file.master with a master lock
>>>> drm: protect drm_master pointers in drm_lease.c
>>>>
>>>> drivers/gpu/drm/drm_auth.c | 86 +++++++++++++++++++++++--------
>>>> drivers/gpu/drm/drm_connector.c | 5 +-
>>>> drivers/gpu/drm/drm_file.c | 1 +
>>>> drivers/gpu/drm/drm_lease.c | 81 ++++++++++++++++++++++-------
>>>> drivers/gpu/drm/drm_mode_object.c | 10 ++--
>>>> include/drm/drm_auth.h | 1 +
>>>> include/drm/drm_file.h | 18 +++++--
>>>> 7 files changed, 153 insertions(+), 49 deletions(-)
>>>>
>>>> --
>>>> 2.25.1
>>>>
>>>
>>
>