From: Janusz Krzysztofik <[email protected]>
During i915_driver_unload(), GEM contexts are verified restrictively
inside i915_gem_fini() if they don't consume shared resources which
should be cleaned up before the driver is released. If those checks
don't result in kernel panic, one more check is performed at the end of
i915_gem_fini() which issues a WARN_ON() if GEM contexts still exist.
Some GEM contexts are allocated unconditionally on device file open,
one per each file descriptor, and are kept open until those file
descriptors are closed. Since open file descriptors prevent the driver
module from being unloaded, that protects the driver from being
released while contexts are still open. However, that's not the case
on driver unbind or device unplug sysfs operations which are executed
regardless of open file descriptors.
To protect kernel resources from being accessed by those open file
decriptors while driver unbind or device unplug operation is in
progress, the driver now calls drm_device_unplug() at the beginning of
that process and relies on the DRM layer to provide such protection.
Taking all above information into account, as soon as shared resources
not associated with specific file descriptors are cleaned up, it should
be safe to postpone completion of driver release until users of those
open file decriptors give up on errors and close them.
When device has been marked unplugged, use WARN_ON() conditionally so
the warning is displayed only if a GEM context not associated with a
file descriptor is still allocated.
Signed-off-by: Janusz Krzysztofik <[email protected]>
---
drivers/gpu/drm/i915/i915_gem.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 54f27cabae2a..c00b6dbaf4f5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4670,7 +4670,17 @@ void i915_gem_fini(struct drm_i915_private *dev_priv)
i915_gem_drain_freed_objects(dev_priv);
- WARN_ON(!list_empty(&dev_priv->contexts.list));
+ if (drm_dev_is_unplugged(&dev_priv->drm)) {
+ struct i915_gem_context *ctx, *cn;
+
+ list_for_each_entry_safe(ctx, cn, &dev_priv->contexts.list,
+ link) {
+ WARN_ON(IS_ERR_OR_NULL(ctx->file_priv));
+ break;
+ }
+ } else {
+ WARN_ON(!list_empty(&dev_priv->contexts.list));
+ }
}
void i915_gem_init_mmio(struct drm_i915_private *i915)
--
2.21.0
Quoting Janusz Krzysztofik (2019-05-17 15:06:17)
> From: Janusz Krzysztofik <[email protected]>
>
> During i915_driver_unload(), GEM contexts are verified restrictively
> inside i915_gem_fini() if they don't consume shared resources which
> should be cleaned up before the driver is released. If those checks
> don't result in kernel panic, one more check is performed at the end of
> i915_gem_fini() which issues a WARN_ON() if GEM contexts still exist.
Just fix the underlying bug of this code being called too early. The
assumptions we made for unload are clearly invalid when applied to
unbind, and we need to split the phases.
-Chris
On Friday, May 17, 2019 4:32:35 PM CEST Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-05-17 15:06:17)
> > From: Janusz Krzysztofik <[email protected]>
> >
> > During i915_driver_unload(), GEM contexts are verified restrictively
> > inside i915_gem_fini() if they don't consume shared resources which
> > should be cleaned up before the driver is released. If those checks
> > don't result in kernel panic, one more check is performed at the end of
> > i915_gem_fini() which issues a WARN_ON() if GEM contexts still exist.
>
> Just fix the underlying bug of this code being called too early. The
> assumptions we made for unload are clearly invalid when applied to
> unbind, and we need to split the phases.
> -Chris
Thanks Chris, I think I get it finally.
Janusz