From: Sultan Alsawaf <[email protected]>
The retire and active callbacks can run simultaneously, allowing
intel_context_pin() and intel_context_unpin() to run at the same time,
trashing the ring and page tables. In 5.4, this was more noticeable
because intel_ring_unpin() would set ring->vaddr to NULL and cause a
clean NULL-pointer-dereference panic, but in newer kernels the
use-after-free goes unnoticed.
The NULL-pointer-dereference looks like this:
BUG: unable to handle page fault for address: 0000000000003448
RIP: 0010:gen8_emit_flush_render+0x163/0x190
Call Trace:
execlists_request_alloc+0x25/0x40
__i915_request_create+0x1f4/0x2c0
i915_request_create+0x71/0xc0
i915_gem_do_execbuffer+0xb98/0x1a80
? preempt_count_add+0x68/0xa0
? _raw_spin_lock+0x13/0x30
? _raw_spin_unlock+0x16/0x30
i915_gem_execbuffer2_ioctl+0x1de/0x3c0
? i915_gem_busy_ioctl+0x7f/0x1d0
? i915_gem_execbuffer_ioctl+0x2d0/0x2d0
drm_ioctl_kernel+0xb2/0x100
drm_ioctl+0x209/0x360
? i915_gem_execbuffer_ioctl+0x2d0/0x2d0
ksys_ioctl+0x87/0xc0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x4e/0x150
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Protect the retire callback with ref->mutex to complement the active
callback and fix the corruption.
Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback")
Cc: <[email protected]>
Signed-off-by: Sultan Alsawaf <[email protected]>
---
drivers/gpu/drm/i915/i915_active.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index c4048628188a..0478bcf061b5 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -148,8 +148,10 @@ __active_retire(struct i915_active *ref)
spin_unlock_irqrestore(&ref->tree_lock, flags);
/* After the final retire, the entire struct may be freed */
+ mutex_lock(&ref->mutex);
if (ref->retire)
ref->retire(ref);
+ mutex_unlock(&ref->mutex);
/* ... except if you wait on it, you must manage your own references! */
wake_up_var(ref);
--
2.26.0
From: Sultan Alsawaf <[email protected]>
The retire and active callbacks can run simultaneously, allowing
intel_context_pin() and intel_context_unpin() to run at the same time,
trashing the ring and page tables. In 5.4, this was more noticeable
because intel_ring_unpin() would set ring->vaddr to NULL and cause a
clean NULL-pointer-dereference panic, but in newer kernels the
use-after-free goes unnoticed.
The NULL-pointer-dereference looks like this:
BUG: unable to handle page fault for address: 0000000000003448
RIP: 0010:gen8_emit_flush_render+0x163/0x190
Call Trace:
execlists_request_alloc+0x25/0x40
__i915_request_create+0x1f4/0x2c0
i915_request_create+0x71/0xc0
i915_gem_do_execbuffer+0xb98/0x1a80
? preempt_count_add+0x68/0xa0
? _raw_spin_lock+0x13/0x30
? _raw_spin_unlock+0x16/0x30
i915_gem_execbuffer2_ioctl+0x1de/0x3c0
? i915_gem_busy_ioctl+0x7f/0x1d0
? i915_gem_execbuffer_ioctl+0x2d0/0x2d0
drm_ioctl_kernel+0xb2/0x100
drm_ioctl+0x209/0x360
? i915_gem_execbuffer_ioctl+0x2d0/0x2d0
ksys_ioctl+0x87/0xc0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x4e/0x150
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Protect __intel_context_retire() with active->mutex (i.e., ref->mutex)
to complement the active callback and fix the corruption.
Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback")
Cc: <[email protected]>
Signed-off-by: Sultan Alsawaf <[email protected]>
---
v2: Reduce the scope of the mutex lock to only __intel_context_retire()
and mark it as a function that may sleep so it doesn't run in
atomic context
drivers/gpu/drm/i915/gt/intel_context.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 57e8a051ddc2..9b9be8058881 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -221,6 +221,7 @@ static void __intel_context_retire(struct i915_active *active)
CE_TRACE(ce, "retire\n");
+ mutex_lock(&active->mutex);
set_bit(CONTEXT_VALID_BIT, &ce->flags);
if (ce->state)
__context_unpin_state(ce->state);
@@ -229,6 +230,7 @@ static void __intel_context_retire(struct i915_active *active)
__ring_retire(ce->ring);
intel_context_put(ce);
+ mutex_unlock(&active->mutex);
}
static int __intel_context_active(struct i915_active *active)
@@ -288,7 +290,8 @@ intel_context_init(struct intel_context *ce,
mutex_init(&ce->pin_mutex);
i915_active_init(&ce->active,
- __intel_context_active, __intel_context_retire);
+ __intel_context_active,
+ i915_active_may_sleep(__intel_context_retire));
}
void intel_context_fini(struct intel_context *ce)
--
2.26.0
From: Sultan Alsawaf <[email protected]>
Active and retire callbacks can run simultaneously, causing panics and
mayhem. The most notable case is with the intel_context_pin/unpin race
that causes ring and page table corruption. In 5.4, this race is more
noticeable because intel_ring_unpin() sets ring->vaddr to NULL and
causes a clean NULL-pointer-dereference panic, but in newer kernels this
race goes unnoticed.
Here is an example of a crash caused by this race on 5.4:
BUG: unable to handle page fault for address: 0000000000003448
RIP: 0010:gen8_emit_flush_render+0x163/0x190
Call Trace:
execlists_request_alloc+0x25/0x40
__i915_request_create+0x1f4/0x2c0
i915_request_create+0x71/0xc0
i915_gem_do_execbuffer+0xb98/0x1a80
? preempt_count_add+0x68/0xa0
? _raw_spin_lock+0x13/0x30
? _raw_spin_unlock+0x16/0x30
i915_gem_execbuffer2_ioctl+0x1de/0x3c0
? i915_gem_busy_ioctl+0x7f/0x1d0
? i915_gem_execbuffer_ioctl+0x2d0/0x2d0
drm_ioctl_kernel+0xb2/0x100
drm_ioctl+0x209/0x360
? i915_gem_execbuffer_ioctl+0x2d0/0x2d0
ksys_ioctl+0x87/0xc0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x4e/0x150
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Protect the active and retire callbacks with their own mutex to prevent
them from running at the same time as one another. This change makes
the i915_active_may_sleep() functionality a redundant for its only user,
so clean that out as well since it's no longer needed.
Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback")
Cc: <[email protected]>
Signed-off-by: Sultan Alsawaf <[email protected]>
---
v3: Protecting only intel_context_pin/unpin is insufficient because all
the callbacks race with each other, not just those ones. Now, all
callbacks are protected from racing with their counterparts with a
new mutex lock for reduced scope. This isn't as pretty, but it is
unavoidable.
.../gpu/drm/i915/display/intel_frontbuffer.c | 4 +--
drivers/gpu/drm/i915/gem/i915_gem_context.c | 1 -
drivers/gpu/drm/i915/gt/intel_context.c | 1 -
drivers/gpu/drm/i915/gt/intel_engine_pool.c | 1 -
drivers/gpu/drm/i915/gt/intel_timeline.c | 1 -
drivers/gpu/drm/i915/i915_active.c | 30 ++++++++++++++-----
drivers/gpu/drm/i915/i915_active_types.h | 6 +---
drivers/gpu/drm/i915/i915_vma.c | 1 -
8 files changed, 24 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_frontbuffer.c b/drivers/gpu/drm/i915/display/intel_frontbuffer.c
index 6cb02c912acc..766325948b5b 100644
--- a/drivers/gpu/drm/i915/display/intel_frontbuffer.c
+++ b/drivers/gpu/drm/i915/display/intel_frontbuffer.c
@@ -206,7 +206,6 @@ static int frontbuffer_active(struct i915_active *ref)
return 0;
}
-__i915_active_call
static void frontbuffer_retire(struct i915_active *ref)
{
struct intel_frontbuffer *front =
@@ -254,8 +253,7 @@ intel_frontbuffer_get(struct drm_i915_gem_object *obj)
kref_init(&front->ref);
atomic_set(&front->bits, 0);
i915_active_init(&front->write,
- frontbuffer_active,
- i915_active_may_sleep(frontbuffer_retire));
+ frontbuffer_active, frontbuffer_retire);
spin_lock(&i915->fb_tracking.lock);
if (rcu_access_pointer(obj->frontbuffer)) {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 68326ad3b2e0..31791b4ae086 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1057,7 +1057,6 @@ struct context_barrier_task {
void *data;
};
-__i915_active_call
static void cb_retire(struct i915_active *base)
{
struct context_barrier_task *cb = container_of(base, typeof(*cb), base);
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index aea992e46c42..964e32cf5856 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -222,7 +222,6 @@ static void __ring_retire(struct intel_ring *ring)
i915_active_release(&ring->vma->active);
}
-__i915_active_call
static void __intel_context_retire(struct i915_active *active)
{
struct intel_context *ce = container_of(active, typeof(*ce), active);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.c b/drivers/gpu/drm/i915/gt/intel_engine_pool.c
index 397186818305..60dde6978511 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.c
@@ -61,7 +61,6 @@ static int pool_active(struct i915_active *ref)
return 0;
}
-__i915_active_call
static void pool_retire(struct i915_active *ref)
{
struct intel_engine_pool_node *node =
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 91debbc97c9a..ecd120e1cf2a 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -131,7 +131,6 @@ static void __idle_cacheline_free(struct intel_timeline_cacheline *cl)
kfree_rcu(cl, rcu);
}
-__i915_active_call
static void __cacheline_retire(struct i915_active *active)
{
struct intel_timeline_cacheline *cl =
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index c4048628188a..165216aee122 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -148,8 +148,15 @@ __active_retire(struct i915_active *ref)
spin_unlock_irqrestore(&ref->tree_lock, flags);
/* After the final retire, the entire struct may be freed */
- if (ref->retire)
- ref->retire(ref);
+ if (ref->retire) {
+ if (ref->active) {
+ mutex_lock(&ref->callback_lock);
+ ref->retire(ref);
+ mutex_unlock(&ref->callback_lock);
+ } else {
+ ref->retire(ref);
+ }
+ }
/* ... except if you wait on it, you must manage your own references! */
wake_up_var(ref);
@@ -281,15 +288,15 @@ void __i915_active_init(struct i915_active *ref,
struct lock_class_key *mkey,
struct lock_class_key *wkey)
{
- unsigned long bits;
-
debug_active_init(ref);
ref->flags = 0;
ref->active = active;
- ref->retire = ptr_unpack_bits(retire, &bits, 2);
- if (bits & I915_ACTIVE_MAY_SLEEP)
+ ref->retire = retire;
+ if (ref->active && ref->retire) {
+ mutex_init(&ref->callback_lock);
ref->flags |= I915_ACTIVE_RETIRE_SLEEPS;
+ }
spin_lock_init(&ref->tree_lock);
ref->tree = RB_ROOT;
@@ -428,8 +435,15 @@ int i915_active_acquire(struct i915_active *ref)
return err;
if (likely(!i915_active_acquire_if_busy(ref))) {
- if (ref->active)
- err = ref->active(ref);
+ if (ref->active) {
+ if (ref->retire) {
+ mutex_lock(&ref->callback_lock);
+ err = ref->active(ref);
+ mutex_unlock(&ref->callback_lock);
+ } else {
+ err = ref->active(ref);
+ }
+ }
if (!err) {
spin_lock_irq(&ref->tree_lock); /* __active_retire() */
debug_active_activate(ref);
diff --git a/drivers/gpu/drm/i915/i915_active_types.h b/drivers/gpu/drm/i915/i915_active_types.h
index 6360c3e4b765..19b06c64c017 100644
--- a/drivers/gpu/drm/i915/i915_active_types.h
+++ b/drivers/gpu/drm/i915/i915_active_types.h
@@ -24,11 +24,6 @@ struct i915_active_fence {
struct active_node;
-#define I915_ACTIVE_MAY_SLEEP BIT(0)
-
-#define __i915_active_call __aligned(4)
-#define i915_active_may_sleep(fn) ptr_pack_bits(&(fn), I915_ACTIVE_MAY_SLEEP, 2)
-
struct i915_active {
atomic_t count;
struct mutex mutex;
@@ -45,6 +40,7 @@ struct i915_active {
int (*active)(struct i915_active *ref);
void (*retire)(struct i915_active *ref);
+ struct mutex callback_lock;
struct work_struct work;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 08699fa069aa..0f73beb586ab 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -93,7 +93,6 @@ static int __i915_vma_active(struct i915_active *ref)
return i915_vma_tryget(active_to_vma(ref)) ? 0 : -ENOENT;
}
-__i915_active_call
static void __i915_vma_retire(struct i915_active *ref)
{
i915_vma_put(active_to_vma(ref));
--
2.26.0
On Fri, Apr 03, 2020 at 03:35:15PM -0700, Sultan Alsawaf wrote:
> + ref->retire(ref);
> + mutex_unlock(&ref->callback_lock);
Ugh, this patch is still wrong because the mutex unlock after ref->retire() is a
use-after-free. Fun times...
Sultan
From: Sultan Alsawaf <[email protected]>
Active and retire callbacks can run simultaneously, causing panics and
mayhem. The most notable case is with the intel_context_pin/unpin race
that causes ring and page table corruption. In 5.4, this race is more
noticeable because intel_ring_unpin() sets ring->vaddr to NULL and
causes a clean NULL-pointer-dereference panic, but in newer kernels this
race goes unnoticed.
Here is an example of a crash caused by this race on 5.4:
BUG: unable to handle page fault for address: 0000000000003448
RIP: 0010:gen8_emit_flush_render+0x163/0x190
Call Trace:
execlists_request_alloc+0x25/0x40
__i915_request_create+0x1f4/0x2c0
i915_request_create+0x71/0xc0
i915_gem_do_execbuffer+0xb98/0x1a80
? preempt_count_add+0x68/0xa0
? _raw_spin_lock+0x13/0x30
? _raw_spin_unlock+0x16/0x30
i915_gem_execbuffer2_ioctl+0x1de/0x3c0
? i915_gem_busy_ioctl+0x7f/0x1d0
? i915_gem_execbuffer_ioctl+0x2d0/0x2d0
drm_ioctl_kernel+0xb2/0x100
drm_ioctl+0x209/0x360
? i915_gem_execbuffer_ioctl+0x2d0/0x2d0
ksys_ioctl+0x87/0xc0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x4e/0x150
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Protect the active and retire callbacks with their own lock to prevent
them from running at the same time as one another.
Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback")
Cc: <[email protected]>
Signed-off-by: Sultan Alsawaf <[email protected]>
---
drivers/gpu/drm/i915/i915_active.c | 52 ++++++++++++++++++++----
drivers/gpu/drm/i915/i915_active.h | 10 ++---
drivers/gpu/drm/i915/i915_active_types.h | 2 +
3 files changed, 50 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index b0a499753526..5802233f71ec 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -147,8 +147,22 @@ __active_retire(struct i915_active *ref)
spin_unlock_irqrestore(&ref->tree_lock, flags);
/* After the final retire, the entire struct may be freed */
- if (ref->retire)
- ref->retire(ref);
+ if (ref->retire) {
+ if (ref->active) {
+ bool freed = false;
+
+ /* Don't race with the active callback, and avoid UaF */
+ down_write(&ref->rwsem);
+ ref->freed = &freed;
+ ref->retire(ref);
+ if (!freed) {
+ ref->freed = NULL;
+ up_write(&ref->rwsem);
+ }
+ } else {
+ ref->retire(ref);
+ }
+ }
/* ... except if you wait on it, you must manage your own references! */
wake_up_var(ref);
@@ -278,7 +292,8 @@ void __i915_active_init(struct i915_active *ref,
int (*active)(struct i915_active *ref),
void (*retire)(struct i915_active *ref),
struct lock_class_key *mkey,
- struct lock_class_key *wkey)
+ struct lock_class_key *wkey,
+ struct lock_class_key *rkey)
{
unsigned long bits;
@@ -287,8 +302,13 @@ void __i915_active_init(struct i915_active *ref,
ref->flags = 0;
ref->active = active;
ref->retire = ptr_unpack_bits(retire, &bits, 2);
- if (bits & I915_ACTIVE_MAY_SLEEP)
+ ref->freed = NULL;
+ if (ref->active && ref->retire) {
+ __init_rwsem(&ref->rwsem, "i915_active.rwsem", rkey);
ref->flags |= I915_ACTIVE_RETIRE_SLEEPS;
+ } else if (bits & I915_ACTIVE_MAY_SLEEP) {
+ ref->flags |= I915_ACTIVE_RETIRE_SLEEPS;
+ }
spin_lock_init(&ref->tree_lock);
ref->tree = RB_ROOT;
@@ -417,8 +437,20 @@ int i915_active_acquire(struct i915_active *ref)
return err;
if (likely(!i915_active_acquire_if_busy(ref))) {
- if (ref->active)
- err = ref->active(ref);
+ if (ref->active) {
+ if (ref->retire) {
+ /*
+ * This can be a recursive call, and the mutex
+ * above already protects from concurrent active
+ * callbacks, so a read lock fits best.
+ */
+ down_read(&ref->rwsem);
+ err = ref->active(ref);
+ up_read(&ref->rwsem);
+ } else {
+ err = ref->active(ref);
+ }
+ }
if (!err) {
spin_lock_irq(&ref->tree_lock); /* __active_retire() */
debug_active_activate(ref);
@@ -502,16 +534,20 @@ int i915_request_await_active(struct i915_request *rq, struct i915_active *ref)
return err;
}
-#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
void i915_active_fini(struct i915_active *ref)
{
+ if (ref->freed) {
+ *ref->freed = true;
+ up_write(&ref->rwsem);
+ }
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
debug_active_fini(ref);
GEM_BUG_ON(atomic_read(&ref->count));
GEM_BUG_ON(work_pending(&ref->work));
GEM_BUG_ON(!RB_EMPTY_ROOT(&ref->tree));
mutex_destroy(&ref->mutex);
-}
#endif
+}
static inline bool is_idle_barrier(struct active_node *node, u64 idx)
{
diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h
index 51e1e854ca55..b684b1fdcc02 100644
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -153,14 +153,16 @@ void __i915_active_init(struct i915_active *ref,
int (*active)(struct i915_active *ref),
void (*retire)(struct i915_active *ref),
struct lock_class_key *mkey,
- struct lock_class_key *wkey);
+ struct lock_class_key *wkey,
+ struct lock_class_key *rkey);
/* Specialise each class of i915_active to avoid impossible lockdep cycles. */
#define i915_active_init(ref, active, retire) do { \
static struct lock_class_key __mkey; \
static struct lock_class_key __wkey; \
+ static struct lock_class_key __rkey; \
\
- __i915_active_init(ref, active, retire, &__mkey, &__wkey); \
+ __i915_active_init(ref, active, retire, &__mkey, &__wkey, &__rkey); \
} while (0)
int i915_active_ref(struct i915_active *ref,
@@ -200,11 +202,7 @@ i915_active_is_idle(const struct i915_active *ref)
return !atomic_read(&ref->count);
}
-#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
void i915_active_fini(struct i915_active *ref);
-#else
-static inline void i915_active_fini(struct i915_active *ref) { }
-#endif
int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/i915_active_types.h b/drivers/gpu/drm/i915/i915_active_types.h
index 6360c3e4b765..aaee2548cb19 100644
--- a/drivers/gpu/drm/i915/i915_active_types.h
+++ b/drivers/gpu/drm/i915/i915_active_types.h
@@ -32,6 +32,8 @@ struct active_node;
struct i915_active {
atomic_t count;
struct mutex mutex;
+ struct rw_semaphore rwsem;
+ bool *freed;
spinlock_t tree_lock;
struct active_node *cache;
--
2.26.0
Chris,
Could you please take a look at this? This really is quite an important fix.
Thanks,
Sultan
Quoting Sultan Alsawaf (2020-04-14 07:13:12)
> Chris,
>
> Could you please take a look at this? This really is quite an important fix.
It's crazy. See a266bf420060 for a patch that should be applied to v5.4
-Chris
On Tue, Apr 14, 2020 at 09:23:56AM +0100, Chris Wilson wrote:
> Quoting Sultan Alsawaf (2020-04-14 07:13:12)
> > Chris,
> >
> > Could you please take a look at this? This really is quite an important fix.
>
> It's crazy. See a266bf420060 for a patch that should be applied to v5.4
> -Chris
What? a266bf420060 was part of 5.4.0-rc7, so it's already in 5.4. And if you
read the commit message, you would see that the problem in question affects
Linus' tree.
You can break i915 in 5.6 by just adding a small delay:
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
index 6ff803f397c4..3a7968effdfd 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -10,6 +10,7 @@
#include "intel_engine.h"
#include "intel_ring.h"
#include "intel_timeline.h"
+#include <linux/delay.h>
unsigned int intel_ring_update_space(struct intel_ring *ring)
{
@@ -92,6 +93,9 @@ void intel_ring_unpin(struct intel_ring *ring)
else
i915_gem_object_unpin_map(vma->obj);
+ mdelay(1);
+ ring->vaddr = NULL;
+
i915_vma_make_purgeable(vma);
i915_vma_unpin(vma);
}
This is how I reproduced the race in question. I can't even reach the greeter on
my laptop with this, because i915 dies before that.
Sultan
Greeting,
FYI, we noticed the following commit (built with gcc-7):
commit: 6dc0b234a64d2fdea96623381b234ec328b5a0a2 ("[PATCH] drm/i915: Fix use-after-free due to intel_context_pin/unpin race")
url: https://github.com/0day-ci/linux/commits/Sultan-Alsawaf/drm-i915-Fix-use-after-free-due-to-intel_context_pin-unpin-race/20200404-054505
base: git://anongit.freedesktop.org/drm-intel for-linux-next
in testcase: suspend-stress
with following parameters:
mode: freeze
iterations: 10
on test machine: 4 threads BroadWell with 8G memory
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>
kern :err : [ 209.039440] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281
kern :err : [ 209.039594] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 34, name: kworker/3:1
kern :warn : [ 209.039709] CPU: 3 PID: 34 Comm: kworker/3:1 Not tainted 5.6.0-rc5-01501-g6dc0b234a64d2f #1
kern :warn : [ 209.039824] Hardware name: /NUC5i3RYB, BIOS RYBDWi35.86A.0363.2017.0316.1028 03/16/2017
kern :warn : [ 209.040023] Workqueue: events engine_retire [i915]
kern :warn : [ 209.040093] Call Trace:
kern :warn : [ 209.040140] dump_stack+0x66/0x8b
kern :warn : [ 209.040192] ___might_sleep+0x102/0x120
kern :warn : [ 209.040251] mutex_lock+0x1c/0x40
kern :warn : [ 209.040380] __active_retire+0x7f/0x110 [i915]
kern :warn : [ 209.040449] dma_fence_signal_locked+0x7e/0x100
kern :warn : [ 209.040595] i915_request_retire+0x315/0x370 [i915]
kern :warn : [ 209.040736] retire_requests+0x4e/0x70 [i915]
kern :warn : [ 209.040865] engine_retire+0x61/0x90 [i915]
kern :warn : [ 209.040930] process_one_work+0x1b0/0x3e0
kern :warn : [ 209.040990] ? move_linked_works+0x6e/0xa0
kern :warn : [ 209.041051] worker_thread+0x1e5/0x3b0
kern :warn : [ 209.041108] ? process_one_work+0x3e0/0x3e0
kern :warn : [ 209.041170] kthread+0x11e/0x140
kern :warn : [ 209.041220] ? kthread_park+0x90/0x90
kern :warn : [ 209.041277] ret_from_fork+0x35/0x40
kern :debug : [ 209.045034] calling coretemp_init+0x0/0x1000 [coretemp] @ 245
kern :debug : [ 209.045252] probe of coretemp.0 returned 1 after 44 usecs
kern :debug : [ 209.068661] initcall coretemp_init+0x0/0x1000 [coretemp] returned 0 after 22978 usecs
kern :debug : [ 209.071902] calling powerclamp_init+0x0/0x1000 [intel_powerclamp] @ 240
kern :debug : [ 209.078262] initcall powerclamp_init+0x0/0x1000 [intel_powerclamp] returned 0 after 6104 usecs
kern :info : [ 209.079857] [drm] Initialized i915 1.6.0 20200313 for 0000:00:02.0 on minor 0
kern :debug : [ 209.081471] calling pkg_temp_thermal_init+0x0/0x1000 [x86_pkg_temp_thermal] @ 240
kern :debug : [ 209.081729] initcall pkg_temp_thermal_init+0x0/0x1000 [x86_pkg_temp_thermal] returned 0 after 138 usecs
kern :info : [ 209.083553] ACPI: Video Device [GFX0] (multi-head: yes rom: no post: no)
kern :info : [ 209.085400] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input9
kern :debug : [ 209.085534] calling acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] @ 240
kern :debug : [ 209.085540] initcall acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] returned -17 after 0 usecs
kern :debug : [ 209.086704] probe of LNXVIDEO:00 returned 1 after 6197 usecs
kern :info : [ 209.087484] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
kern :debug : [ 209.087831] probe of 0000:00:02.0 returned 1 after 161586 usecs
kern :debug : [ 209.088502] initcall i915_init+0x0/0x6b [i915] returned 0 after 2820 usecs
kern :debug : [ 209.106252] calling acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] @ 245
kern :debug : [ 209.106354] initcall acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] returned -17 after 0 usecs
kern :debug : [ 209.108121] calling rapl_init+0x0/0x1000 [intel_rapl_common] @ 240
kern :debug : [ 209.108266] initcall rapl_init+0x0/0x1000 [intel_rapl_common] returned 0 after 47 usecs
kern :debug : [ 209.111832] calling intel_rapl_msr_driver_init+0x0/0x1000 [intel_rapl_msr] @ 242
kern :info : [ 209.112018] intel_rapl_common: Found RAPL domain package
kern :info : [ 209.112097] intel_rapl_common: Found RAPL domain core
kern :info : [ 209.112171] intel_rapl_common: Found RAPL domain uncore
kern :info : [ 209.112246] intel_rapl_common: Found RAPL domain dram
kern :debug : [ 209.120124] probe of intel_rapl_msr.0 returned 1 after 8156 usecs
kern :debug : [ 209.120247] initcall intel_rapl_msr_driver_init+0x0/0x1000 [intel_rapl_msr] returned 0 after 8102 usecs
kern :debug : [ 209.127972] calling hdmi_driver_init+0x0/0x1000 [snd_hda_codec_hdmi] @ 504
kern :debug : [ 209.128244] calling acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] @ 266
kern :debug : [ 209.128343] initcall acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] returned -17 after 1 usecs
kern :debug : [ 209.130475] probe of hdaudioC0D0 returned 1 after 2387 usecs
kern :debug : [ 209.130596] initcall hdmi_driver_init+0x0/0x1000 [snd_hda_codec_hdmi] returned 0 after 2194 usecs
kern :info : [ 209.131772] input: HDA Intel HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:03.0/sound/card0/input10
kern :info : [ 209.131985] input: HDA Intel HDMI HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:03.0/sound/card0/input11
kern :info : [ 209.132183] input: HDA Intel HDMI HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:03.0/sound/card0/input12
kern :info : [ 209.132377] input: HDA Intel HDMI HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:03.0/sound/card0/input13
kern :info : [ 209.132574] input: HDA Intel HDMI HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:03.0/sound/card0/input14
kern :debug : [ 209.147193] calling acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] @ 247
kern :debug : [ 209.147298] initcall acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] returned -17 after 0 usecs
kern :info : [ 209.237164] fbcon: i915drmfb (fb0) is primary device
kern :info : [ 209.277039] Console: switching to colour frame buffer device 240x67
kern :info : [ 209.302577] i915 0000:00:02.0: fb0: i915drmfb frame buffer device
kern :err : [ 215.244354] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2ee2000-0xa2ee2fff], got write-back
kern :err : [ 215.244986] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.245244] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.245796] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.246027] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.246523] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.246768] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.247262] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.247490] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.247979] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.248210] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.248642] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2ee2000-0xa2ee2fff], got write-back
kern :err : [ 215.249166] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.249398] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.249893] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.250123] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.250638] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.250883] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.251361] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.251589] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.252083] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.252312] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.252744] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2ee2000-0xa2ee2fff], got write-back
kern :err : [ 215.253290] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.253521] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.254018] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.254251] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.254762] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.254988] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.255503] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.255748] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.256239] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
kern :err : [ 215.256470] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back
user :notice: [ 215.849565] Kernel tests: Boot OK!
kern :info : [ 217.609110] PM: suspend entry (s2idle)
kern :info : [ 217.609211] Filesystems sync: 0.000 seconds
kern :info : [ 217.631728] Freezing user space processes ... (elapsed 0.000 seconds) done.
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
Thanks,
lkp
Chris,
Could you please look at this in earnest? This is a real bug that crashes my
laptop without any kind of provocation. It is undeniably a bug in i915, and I've
clearly described it in my patch. If you dont like the patch, I'm open to any
suggestions you have for an alternative solution. My goal here is to make i915
better, but it's difficult when communication only goes one way.
Thanks,
Sultan
Quoting Sultan Alsawaf (2020-04-20 08:24:19)
> Chris,
>
> Could you please look at this in earnest? This is a real bug that crashes my
> laptop without any kind of provocation. It is undeniably a bug in i915, and I've
> clearly described it in my patch. If you dont like the patch, I'm open to any
> suggestions you have for an alternative solution. My goal here is to make i915
> better, but it's difficult when communication only goes one way.
Hi Sultan,
The patch Chris pointed out was not part of 5.4 release. The commit
message describes that it fixes the functions to be tolerant to
running simultaneously. In doing that zeroing of ring->vaddr is
removed so the test to do mdelay(1) and "ring->vaddr = NULL;" is
not correct.
I think you might have used the wrong git command for checking the
patch history:
$ git describe a266bf420060
v5.4-rc7-1996-ga266bf420060 # after -rc7 tag
$ git describe --contains a266bf420060
v5.6-rc1~34^2~21^2~326 # included in v5.6-rc1
And git log to double check:
$ git log --format=oneline kernel.org/stable/linux-5.4.y --grep="drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint"
$ git log --format=oneline kernel.org/stable/linux-5.5.y --grep="drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint"
0725d9a31869e6c80630e99da366ede2848295cc drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint
$ git log --format=oneline kernel.org/stable/linux-5.6.y --grep="drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint"
a754012b9f2323a5d640da7eb7b095ac3b8cd012 drm/i915/execlists: Leave resetting ring to intel_ring
0725d9a31869e6c80630e99da366ede2848295cc drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint
a266bf42006004306dd48a9082c35dfbff153307 drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint
So it seems that the patch got pulled into v5.6 and has been backported
to v5.5 but not v5.4.
Could you try applying the patch to 5.4 and seeing if the problem
persists?
Regards, Joonas
On Mon, Apr 20, 2020 at 11:21:42AM +0300, Joonas Lahtinen wrote:
> So it seems that the patch got pulled into v5.6 and has been backported
> to v5.5 but not v5.4.
You're right, that's my mistake.
> In doing that zeroing of ring->vaddr is removed so the test to do mdelay(1)
> and "ring->vaddr = NULL;" is not correct.
I'm not so sure about this. Look at where `ring->vaddr` is assigned:
-------------------------------------8<-----------------------------------------
ret = i915_vma_pin(vma, 0, 0, flags);
if (unlikely(ret))
goto err_unpin;
if (i915_vma_is_map_and_fenceable(vma))
addr = (void __force *)i915_vma_pin_iomap(vma);
else
addr = i915_gem_object_pin_map(vma->obj,
i915_coherent_map_type(vma->vm->i915));
if (IS_ERR(addr)) {
ret = PTR_ERR(addr);
goto err_ring;
}
i915_vma_make_unshrinkable(vma);
/* Discard any unused bytes beyond that submitted to hw. */
intel_ring_reset(ring, ring->emit);
ring->vaddr = addr;
------------------------------------->8-----------------------------------------
And then the converse of that is done *before* my reproducer patch does
`ring->vaddr = NULL;`:
-------------------------------------8<-----------------------------------------
i915_vma_unset_ggtt_write(vma);
if (i915_vma_is_map_and_fenceable(vma))
i915_vma_unpin_iomap(vma);
else
i915_gem_object_unpin_map(vma->obj);
/* mdelay(1);
ring->vaddr = NULL; */
i915_vma_make_purgeable(vma);
i915_vma_unpin(vma);
------------------------------------->8-----------------------------------------
Isn't the value assigned to `ring->vaddr` trashed by those function calls above
where I've got the mdelay? If so, why would it be correct to let the stale value
sit in `ring->vaddr`?
My interpretation of the zeroing of ring->vaddr being removed by Chris was that
it was an unnecessary step before the ring was getting discarded anyway.
Sultan
Quoting Sultan Alsawaf (2020-04-20 19:15:14)
> On Mon, Apr 20, 2020 at 11:21:42AM +0300, Joonas Lahtinen wrote:
> > So it seems that the patch got pulled into v5.6 and has been backported
> > to v5.5 but not v5.4.
>
> You're right, that's my mistake.
Did applying the patch to v5.4 fix the issue at hand?
Regards, Joonas
On Tue, Apr 21, 2020 at 09:51:37AM +0300, Joonas Lahtinen wrote:
> Quoting Sultan Alsawaf (2020-04-20 19:15:14)
> > On Mon, Apr 20, 2020 at 11:21:42AM +0300, Joonas Lahtinen wrote:
> > > So it seems that the patch got pulled into v5.6 and has been backported
> > > to v5.5 but not v5.4.
> >
> > You're right, that's my mistake.
>
> Did applying the patch to v5.4 fix the issue at hand?
Of course the patch doesn't apply as-is because the code's been shuffled around,
but yes. The crashes are gone with that patch, and I don't have the motivation
to check if that patch is actually correct, so hurray, problem solved. I'm not
going to send the backport myself because I'll probably be ignored, so you can
go ahead and do that.
Sultan