This fixes quite a number of runtime PM bugs I found that have been
causing some pretty nasty issues such as:
- Deadlocking on boot
- Connector probing potentially not working while the GPU is in runtime
suspend
- i2c char dev not working while the GPU is in runtime suspend
- aux char dev not working while the GPU is in runtime suspend
There are definitely more parts of nouveau that need to be fixed to use
runtime power management correctly, such as the hwmon portions, but this
series only handles the more important fixes that should get into stable
for the time being.
Cc: Karol Herbst <[email protected]>
Cc: [email protected]
Lyude Paul (5):
drm/nouveau: Prevent RPM callback recursion in suspend/resume paths
drm/nouveau: Grab RPM ref while probing outputs
drm/nouveau: Add missing RPM get/put() when probing connectors
drm/nouveau: Grab RPM ref when i2c bus is in use
drm/nouveau: Grab RPM ref when aux bus is in use
drivers/gpu/drm/nouveau/dispnv50/disp.c | 12 +++++++++--
drivers/gpu/drm/nouveau/nouveau_connector.c | 21 +++++++++++++++++--
drivers/gpu/drm/nouveau/nouveau_connector.h | 3 +++
drivers/gpu/drm/nouveau/nouveau_drm.c | 10 ++++++++-
drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c | 12 ++++++++++-
drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c | 12 ++++++++++-
6 files changed, 63 insertions(+), 7 deletions(-)
--
2.17.1
While the GPU is guaranteed to be on when a hotplug has been received,
the same assertion does not hold true if a connector probe has been
started by userspace without having received a sysfs event. So
ensure that any connector probing keeps the GPU alive for the duration
of the probe.
Signed-off-by: Lyude Paul <[email protected]>
Cc: Karol Herbst <[email protected]>
Cc: [email protected]
---
drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_connector.c | 21 +++++++++++++++++++--
drivers/gpu/drm/nouveau/nouveau_connector.h | 3 +++
3 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index ea2a886854fe..0f283ca75189 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -858,7 +858,7 @@ static const struct drm_connector_funcs
nv50_mstc = {
.reset = nouveau_conn_reset,
.detect = nv50_mstc_detect,
- .fill_modes = drm_helper_probe_single_connector_modes,
+ .fill_modes = nouveau_connector_probe_single_connector_modes,
.destroy = nv50_mstc_destroy,
.atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
.atomic_destroy_state = nouveau_conn_atomic_destroy_state,
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 2a45b4c2ceb0..feb142fb7a8a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -770,6 +770,23 @@ nouveau_connector_force(struct drm_connector *connector)
nouveau_connector_set_encoder(connector, nv_encoder);
}
+int
+nouveau_connector_probe_single_connector_modes(struct drm_connector *connector,
+ uint32_t maxX, uint32_t maxY)
+{
+ struct device *dev = connector->dev->dev;
+ int ret;
+
+ ret = pm_runtime_get_sync(dev);
+ if (WARN_ON(ret < 0 && ret != -EACCES))
+ return 0;
+
+ ret = drm_helper_probe_single_connector_modes(connector, maxX, maxY);
+
+ pm_runtime_put_autosuspend(dev);
+ return ret;
+}
+
static int
nouveau_connector_set_property(struct drm_connector *connector,
struct drm_property *property, uint64_t value)
@@ -1088,7 +1105,7 @@ nouveau_connector_funcs = {
.reset = nouveau_conn_reset,
.detect = nouveau_connector_detect,
.force = nouveau_connector_force,
- .fill_modes = drm_helper_probe_single_connector_modes,
+ .fill_modes = nouveau_connector_probe_single_connector_modes,
.set_property = nouveau_connector_set_property,
.destroy = nouveau_connector_destroy,
.atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
@@ -1103,7 +1120,7 @@ nouveau_connector_funcs_lvds = {
.reset = nouveau_conn_reset,
.detect = nouveau_connector_detect_lvds,
.force = nouveau_connector_force,
- .fill_modes = drm_helper_probe_single_connector_modes,
+ .fill_modes = nouveau_connector_probe_single_connector_modes,
.set_property = nouveau_connector_set_property,
.destroy = nouveau_connector_destroy,
.atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.h b/drivers/gpu/drm/nouveau/nouveau_connector.h
index 2d9d35a146a4..e9ffc6eebda2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.h
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.h
@@ -106,6 +106,9 @@ nouveau_crtc_connector_get(struct nouveau_crtc *nv_crtc)
struct drm_connector *
nouveau_connector_create(struct drm_device *, const struct dcb_output *);
+int
+nouveau_connector_probe_single_connector_modes(struct drm_connector *,
+ uint32_t, uint32_t);
extern int nouveau_tv_disable;
extern int nouveau_ignorelid;
--
2.17.1
DP AUX busses can be accessed both by DRM and through any of the
userspace dev nodes in /dev/drm_dp_auxN. We need to make sure the GPU
stays on in all of these codepaths.
Signed-off-by: Lyude Paul <[email protected]>
Cc: Karol Herbst <[email protected]>
Cc: [email protected]
---
drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c
index 4c1f547da463..6276b113065c 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c
@@ -98,18 +98,28 @@ nvkm_i2c_aux_release(struct nvkm_i2c_aux *aux)
AUX_TRACE(aux, "release");
nvkm_i2c_pad_release(pad);
mutex_unlock(&aux->mutex);
+ pm_runtime_put_autosuspend(pad->i2c->subdev.device->dev);
}
int
nvkm_i2c_aux_acquire(struct nvkm_i2c_aux *aux)
{
struct nvkm_i2c_pad *pad = aux->pad;
+ struct device *dev = pad->i2c->subdev.device->dev;
int ret;
+
AUX_TRACE(aux, "acquire");
+
+ ret = pm_runtime_get_sync(dev);
+ if (ret < 0 && ret != -EACCES)
+ return ret;
+
mutex_lock(&aux->mutex);
ret = nvkm_i2c_pad_acquire(pad, NVKM_I2C_PAD_AUX);
- if (ret)
+ if (ret) {
mutex_unlock(&aux->mutex);
+ pm_runtime_put_autosuspend(dev);
+ }
return ret;
}
--
2.17.1
The i2c bus can be accessed both by DRM itself and through any of its
devnodes (/sys/class/i2c). So, we need to make sure that all codepaths
using the i2c bus keep the GPU resumed.
Signed-off-by: Lyude Paul <[email protected]>
Cc: Karol Herbst <[email protected]>
Cc: [email protected]
---
drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c
index 807a2b67bd64..1de48c990b80 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c
@@ -119,18 +119,28 @@ nvkm_i2c_bus_release(struct nvkm_i2c_bus *bus)
BUS_TRACE(bus, "release");
nvkm_i2c_pad_release(pad);
mutex_unlock(&bus->mutex);
+ pm_runtime_put_autosuspend(pad->i2c->subdev.device->dev);
}
int
nvkm_i2c_bus_acquire(struct nvkm_i2c_bus *bus)
{
struct nvkm_i2c_pad *pad = bus->pad;
+ struct device *dev = pad->i2c->subdev.device->dev;
int ret;
+
BUS_TRACE(bus, "acquire");
+
+ ret = pm_runtime_get_sync(dev);
+ if (ret < 0 && ret != -EACCES)
+ return ret;
+
mutex_lock(&bus->mutex);
ret = nvkm_i2c_pad_acquire(pad, NVKM_I2C_PAD_I2C);
- if (ret)
+ if (ret) {
mutex_unlock(&bus->mutex);
+ pm_runtime_put_autosuspend(dev);
+ }
return ret;
}
--
2.17.1
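As a rough illustration of the acquire/release pairing in the patch
above, here is a userspace sketch in plain C. Everything in it (struct
mock_bus, mock_bus_acquire(), mock_bus_release(), the pad_busy flag and
the -EBUSY error code) is hypothetical and only stands in for the kernel
objects; it is not nouveau code. It models the ordering the patch uses:
take the runtime PM reference first, then the bus mutex, and undo both
in reverse order if the pad cannot be acquired.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an nvkm i2c bus and its device's PM state. */
struct mock_bus {
	int pm_refs;    /* models the runtime PM usage count */
	bool locked;    /* models bus->mutex */
	bool pad_busy;  /* when true, the pad acquire step fails */
};

/* Models the patched acquire path: PM ref, then mutex, then pad. */
static int mock_bus_acquire(struct mock_bus *bus)
{
	bus->pm_refs++;              /* pm_runtime_get_sync() */
	assert(!bus->locked);
	bus->locked = true;          /* mutex_lock() */

	if (bus->pad_busy) {
		/* Unwind in reverse order so nothing leaks on failure. */
		bus->locked = false; /* mutex_unlock() */
		bus->pm_refs--;      /* pm_runtime_put_autosuspend() */
		return -EBUSY;       /* illustrative error code */
	}
	return 0;
}

/* Models the patched release path: drop the mutex, then the PM ref. */
static void mock_bus_release(struct mock_bus *bus)
{
	assert(bus->locked && bus->pm_refs > 0);
	bus->locked = false;
	bus->pm_refs--;
}
```

Taking the PM reference before the mutex matches the locking rule
stated in the "Prevent RPM callback recursion" patch: no lock that the
resume path might need is held while pm_runtime_get_sync() runs.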
When DP MST hubs get confused, they can occasionally stop responding for
a good bit of time up until the point where the DRM driver manages to
do the right DPCD accesses to get it to start responding again. In the
worst case, however, this process can take upwards of 10 seconds.
Currently we use the default output_poll_changed handler
drm_fb_helper_output_poll_changed() to handle output polling, which
doesn't happen to grab any power references on the device when polling.
If we're unlucky enough to have a hub (such as Lenovo's infamous laptop
docks for the P5x/P7x series) that's easily startled and confused, this
can lead to a pretty nasty deadlock situation that looks like this:
- Hotplug event from hub happens, we enter
drm_fb_helper_output_poll_changed() and start communicating with the
hub
- While we're in drm_fb_helper_output_poll_changed() and attempting to
communicate with the hub, we end up confusing it and cause it to stop
responding for at least 10 seconds
- After 5 seconds of being in drm_fb_helper_output_poll_changed(), the
pm core attempts to put the GPU into autosuspend, which ends up
calling drm_kms_helper_poll_disable()
- While the runtime PM core is waiting in drm_kms_helper_poll_disable()
for the output poll to finish, we end up finally detecting an MST
display
- We notice the new display and try to enable it, which triggers
an atomic commit which triggers a call to pm_runtime_get_sync()
- The output poll thread and the pm core deadlock: the output poll
thread waits for the pm core to finish the autosuspend request, while
the pm core waits for the output poll thread to finish
Sample:
[ 246.669625] INFO: task kworker/4:0:37 blocked for more than 120 seconds.
[ 246.673398] Not tainted 4.18.0-rc5Lyude-Test+ #2
[ 246.675271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.676527] kworker/4:0 D 0 37 2 0x80000000
[ 246.677580] Workqueue: events output_poll_execute [drm_kms_helper]
[ 246.678704] Call Trace:
[ 246.679753] __schedule+0x322/0xaf0
[ 246.680916] schedule+0x33/0x90
[ 246.681924] schedule_preempt_disabled+0x15/0x20
[ 246.683023] __mutex_lock+0x569/0x9a0
[ 246.684035] ? kobject_uevent_env+0x117/0x7b0
[ 246.685132] ? drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[ 246.686179] mutex_lock_nested+0x1b/0x20
[ 246.687278] ? mutex_lock_nested+0x1b/0x20
[ 246.688307] drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[ 246.689420] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
[ 246.690462] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[ 246.691570] output_poll_execute+0x198/0x1c0 [drm_kms_helper]
[ 246.692611] process_one_work+0x231/0x620
[ 246.693725] worker_thread+0x214/0x3a0
[ 246.694756] kthread+0x12b/0x150
[ 246.695856] ? wq_pool_ids_show+0x140/0x140
[ 246.696888] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.697998] ret_from_fork+0x3a/0x50
[ 246.699034] INFO: task kworker/0:1:60 blocked for more than 120 seconds.
[ 246.700153] Not tainted 4.18.0-rc5Lyude-Test+ #2
[ 246.701182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.702278] kworker/0:1 D 0 60 2 0x80000000
[ 246.703293] Workqueue: pm pm_runtime_work
[ 246.704393] Call Trace:
[ 246.705403] __schedule+0x322/0xaf0
[ 246.706439] ? wait_for_completion+0x104/0x190
[ 246.707393] schedule+0x33/0x90
[ 246.708375] schedule_timeout+0x3a5/0x590
[ 246.709289] ? mark_held_locks+0x58/0x80
[ 246.710208] ? _raw_spin_unlock_irq+0x2c/0x40
[ 246.711222] ? wait_for_completion+0x104/0x190
[ 246.712134] ? trace_hardirqs_on_caller+0xf4/0x190
[ 246.713094] ? wait_for_completion+0x104/0x190
[ 246.713964] wait_for_completion+0x12c/0x190
[ 246.714895] ? wake_up_q+0x80/0x80
[ 246.715727] ? get_work_pool+0x90/0x90
[ 246.716649] flush_work+0x1c9/0x280
[ 246.717483] ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
[ 246.718442] __cancel_work_timer+0x146/0x1d0
[ 246.719247] cancel_delayed_work_sync+0x13/0x20
[ 246.720043] drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper]
[ 246.721123] nouveau_pmops_runtime_suspend+0x3d/0xb0 [nouveau]
[ 246.721897] pci_pm_runtime_suspend+0x6b/0x190
[ 246.722825] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.723737] __rpm_callback+0x7a/0x1d0
[ 246.724721] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.725607] rpm_callback+0x24/0x80
[ 246.726553] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.727376] rpm_suspend+0x142/0x6b0
[ 246.728185] pm_runtime_work+0x97/0xc0
[ 246.728938] process_one_work+0x231/0x620
[ 246.729796] worker_thread+0x44/0x3a0
[ 246.730614] kthread+0x12b/0x150
[ 246.731395] ? wq_pool_ids_show+0x140/0x140
[ 246.732202] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.732878] ret_from_fork+0x3a/0x50
[ 246.733768] INFO: task kworker/4:2:422 blocked for more than 120 seconds.
[ 246.734587] Not tainted 4.18.0-rc5Lyude-Test+ #2
[ 246.735393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.736113] kworker/4:2 D 0 422 2 0x80000080
[ 246.736789] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
[ 246.737665] Call Trace:
[ 246.738490] __schedule+0x322/0xaf0
[ 246.739250] schedule+0x33/0x90
[ 246.739908] rpm_resume+0x19c/0x850
[ 246.740750] ? finish_wait+0x90/0x90
[ 246.741541] __pm_runtime_resume+0x4e/0x90
[ 246.742370] nv50_disp_atomic_commit+0x31/0x210 [nouveau]
[ 246.743124] drm_atomic_commit+0x4a/0x50 [drm]
[ 246.743775] restore_fbdev_mode_atomic+0x1c8/0x240 [drm_kms_helper]
[ 246.744603] restore_fbdev_mode+0x31/0x140 [drm_kms_helper]
[ 246.745373] drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xb0 [drm_kms_helper]
[ 246.746220] drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper]
[ 246.746884] drm_fb_helper_hotplug_event.part.28+0x96/0xb0 [drm_kms_helper]
[ 246.747675] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
[ 246.748544] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[ 246.749439] nv50_mstm_hotplug+0x15/0x20 [nouveau]
[ 246.750111] drm_dp_send_link_address+0x177/0x1c0 [drm_kms_helper]
[ 246.750764] drm_dp_check_and_send_link_address+0xa8/0xd0 [drm_kms_helper]
[ 246.751602] drm_dp_mst_link_probe_work+0x51/0x90 [drm_kms_helper]
[ 246.752314] process_one_work+0x231/0x620
[ 246.752979] worker_thread+0x44/0x3a0
[ 246.753838] kthread+0x12b/0x150
[ 246.754619] ? wq_pool_ids_show+0x140/0x140
[ 246.755386] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.756162] ret_from_fork+0x3a/0x50
[ 246.756847]
Showing all locks held in the system:
[ 246.758261] 3 locks held by kworker/4:0/37:
[ 246.759016] #0: 00000000f8df4d2d ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.759856] #1: 00000000e6065461 ((work_completion)(&(&dev->mode_config.output_poll_work)->work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.760670] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[ 246.761516] 2 locks held by kworker/0:1/60:
[ 246.762274] #0: 00000000fff6be0f ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.762982] #1: 000000005ab44fb4 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.763890] 1 lock held by khungtaskd/64:
[ 246.764664] #0: 000000008cb8b5c3 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
[ 246.765588] 5 locks held by kworker/4:2/422:
[ 246.766440] #0: 00000000232f0959 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.767390] #1: 00000000bb59b134 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.768154] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_restore_fbdev_mode_unlocked+0x4c/0xb0 [drm_kms_helper]
[ 246.768966] #3: 000000004c8f0b6b (crtc_ww_class_acquire){+.+.}, at: restore_fbdev_mode_atomic+0x4b/0x240 [drm_kms_helper]
[ 246.769921] #4: 000000004c34a296 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8a/0x1b0 [drm]
[ 246.770839] 1 lock held by dmesg/1038:
[ 246.771739] 2 locks held by zsh/1172:
[ 246.772650] #0: 00000000836d0438 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
[ 246.773680] #1: 000000001f4f4d48 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
[ 246.775522] =============================================
So, to fix this (and any other possible deadlock issues like this that
could occur in the output_poll_changed path) we make sure that
nouveau's output_poll_changed functions grab a runtime power ref before
sending any hotplug events, and hold it until we're finished.
This fixes deadlock issues when in fbcon with nouveau on my P50, and
should fix it for everyone else's as well!
Signed-off-by: Lyude Paul <[email protected]>
Cc: Karol Herbst <[email protected]>
Cc: [email protected]
---
drivers/gpu/drm/nouveau/dispnv50/disp.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index eb3e41a78806..ea2a886854fe 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -2012,10 +2012,18 @@ nv50_disp_atomic_state_alloc(struct drm_device *dev)
return &atom->state;
}
+static void
+nouveau_output_poll_changed(struct drm_device *dev)
+{
+ pm_runtime_get_sync(dev->dev);
+ drm_fb_helper_hotplug_event(dev->fb_helper);
+ pm_runtime_put_autosuspend(dev->dev);
+}
+
static const struct drm_mode_config_funcs
nv50_disp_func = {
.fb_create = nouveau_user_framebuffer_create,
- .output_poll_changed = drm_fb_helper_output_poll_changed,
+ .output_poll_changed = nouveau_output_poll_changed,
.atomic_check = nv50_disp_atomic_check,
.atomic_commit = nv50_disp_atomic_commit,
.atomic_state_alloc = nv50_disp_atomic_state_alloc,
--
2.17.1
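The three backtraces in the commit message above can be read as a
wait-for cycle. The following userspace sketch in plain C is only a
model of that reading: the task names, struct wait_graph and both
graph_*() constructors are hypothetical, and the "with ref held" graph
encodes what the commit message intends (no autosuspend starts while
the hotplug event is being handled), not a measured result.

```c
#include <assert.h>
#include <stdbool.h>

/* Tasks taken from the hung-task report; the names are illustrative. */
enum task { OUTPUT_POLL, PM_WORKER, MST_PROBE, NR_TASKS };

/* waits_on[t] is the task that t is blocked on, or -1 if t can run. */
struct wait_graph {
	int waits_on[NR_TASKS];
};

/* Follow the wait chain from start; true if it loops back to start. */
static bool task_is_deadlocked(const struct wait_graph *g, enum task start)
{
	int cur, steps;

	assert(start < NR_TASKS);
	cur = g->waits_on[start];
	for (steps = 0; steps < NR_TASKS && cur >= 0; steps++) {
		if (cur == (int)start)
			return true;
		cur = g->waits_on[cur];
	}
	return false;
}

/* Wait graph reconstructed from the three backtraces. */
static struct wait_graph graph_before_fix(void)
{
	struct wait_graph g;

	g.waits_on[OUTPUT_POLL] = MST_PROBE;  /* blocked on helper->lock */
	g.waits_on[MST_PROBE] = PM_WORKER;    /* rpm_resume() waits for suspend */
	g.waits_on[PM_WORKER] = OUTPUT_POLL;  /* cancel_delayed_work_sync() */
	return g;
}

/* Assumed graph once a PM ref is held across the hotplug event. */
static struct wait_graph graph_with_ref_held(void)
{
	struct wait_graph g;

	g.waits_on[OUTPUT_POLL] = MST_PROBE;  /* still serialised by the lock */
	g.waits_on[MST_PROBE] = -1;           /* device already active */
	g.waits_on[PM_WORKER] = -1;           /* no suspend in flight */
	return g;
}
```

In the first graph every task sits on the cycle, which matches all
three workers being reported as blocked for more than 120 seconds.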
In order to fix all of the spots that need to have runtime PM get/puts()
added, we need to ensure that it's possible for us to call
pm_runtime_get/put() in any context, regardless of how deep, since
almost all of the spots that are currently missing refs can potentially
get called in the runtime suspend/resume path. Otherwise, we'll try to
resume the GPU as we're trying to resume the GPU (and vice-versa) and
cause the kernel to deadlock.
With this, it should be safe to call the pm runtime functions in any
context in nouveau with one condition: any point in the driver that
calls pm_runtime_get*() cannot hold any locks owned by nouveau that
would be acquired anywhere inside nouveau_pmops_runtime_resume().
This includes modesetting locks, i2c bus locks, etc.
Signed-off-by: Lyude Paul <[email protected]>
Cc: Karol Herbst <[email protected]>
Cc: [email protected]
---
drivers/gpu/drm/nouveau/nouveau_drm.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index c7ec86d6c3c9..e851ef7b6373 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -835,6 +835,8 @@ nouveau_pmops_runtime_suspend(struct device *dev)
return -EBUSY;
}
+ dev->power.disable_depth++;
+
drm_kms_helper_poll_disable(drm_dev);
nouveau_switcheroo_optimus_dsm();
ret = nouveau_do_suspend(drm_dev, true);
@@ -843,6 +845,8 @@ nouveau_pmops_runtime_suspend(struct device *dev)
pci_ignore_hotplug(pdev);
pci_set_power_state(pdev, PCI_D3cold);
drm_dev->switch_power_state = DRM_SWITCH_POWER_DYNAMIC_OFF;
+
+ dev->power.disable_depth--;
return ret;
}
@@ -859,11 +863,13 @@ nouveau_pmops_runtime_resume(struct device *dev)
return -EBUSY;
}
+ dev->power.disable_depth++;
+
pci_set_power_state(pdev, PCI_D0);
pci_restore_state(pdev);
ret = pci_enable_device(pdev);
if (ret)
- return ret;
+ goto out;
pci_set_master(pdev);
ret = nouveau_do_resume(drm_dev, true);
@@ -875,6 +881,8 @@ nouveau_pmops_runtime_resume(struct device *dev)
/* Monitors may have been connected / disconnected during suspend */
schedule_work(&nouveau_drm(drm_dev)->hpd_work);
+out:
+ dev->power.disable_depth--;
return ret;
}
--
2.17.1
[cc += linux-pm]
Hi Lyude,
First of all, thanks a lot for looking into this.
On Mon, Jul 16, 2018 at 07:59:25PM -0400, Lyude Paul wrote:
> In order to fix all of the spots that need to have runtime PM get/puts()
> added, we need to ensure that it's possible for us to call
> pm_runtime_get/put() in any context, regardless of how deep, since
> almost all of the spots that are currently missing refs can potentially
> get called in the runtime suspend/resume path. Otherwise, we'll try to
> resume the GPU as we're trying to resume the GPU (and vice-versa) and
> cause the kernel to deadlock.
>
> With this, it should be safe to call the pm runtime functions in any
> context in nouveau with one condition: any point in the driver that
> calls pm_runtime_get*() cannot hold any locks owned by nouveau that
> would be acquired anywhere inside nouveau_pmops_runtime_resume().
> This includes modesetting locks, i2c bus locks, etc.
[snip]
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -835,6 +835,8 @@ nouveau_pmops_runtime_suspend(struct device *dev)
> return -EBUSY;
> }
>
> + dev->power.disable_depth++;
> +
I'm not sure if that variable is actually private to the PM core.
Grepping through the tree I only find a single occurrence where it's
accessed outside the PM core and that's in amdgpu. So this looks
a little fishy TBH. It may make sense to cc such patches to linux-pm
to get Rafael & other folks involved with the PM core to comment.
Also, the disable_depth variable only exists if the kernel was
compiled with CONFIG_PM enabled, but I can't find a "depends on PM"
or something like that in nouveau's Kconfig. Actually, if PM is
not selected, all the nouveau_pmops_*() functions should be #ifdef'ed
away, but oddly there's no #ifdef CONFIG_PM anywhere in nouveau_drm.c.
Anyway, if I understand the commit message correctly, you're hitting a
pm_runtime_get_sync() in a code path that itself is called during a
pm_runtime_get_sync(). Could you include stack traces in the commit
message? My gut feeling is that this patch masks a deeper issue,
e.g. if the runtime_resume code path does in fact directly poll outputs,
that would seem wrong. Runtime resume should merely make the card
accessible, i.e. reinstate power if necessary, put into PCI_D0,
restore registers, etc. Output polling should be scheduled
asynchronously.
Thanks,
Lukas
On Mon, Jul 16, 2018 at 07:59:26PM -0400, Lyude Paul wrote:
> --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> @@ -2012,10 +2012,18 @@ nv50_disp_atomic_state_alloc(struct drm_device *dev)
> return &atom->state;
> }
>
> +static void
> +nouveau_output_poll_changed(struct drm_device *dev)
> +{
> + pm_runtime_get_sync(dev->dev);
> + drm_fb_helper_hotplug_event(dev->fb_helper);
> + pm_runtime_put_autosuspend(dev->dev);
> +}
> +
> static const struct drm_mode_config_funcs
> nv50_disp_func = {
> .fb_create = nouveau_user_framebuffer_create,
> - .output_poll_changed = drm_fb_helper_output_poll_changed,
> + .output_poll_changed = nouveau_output_poll_changed,
It might make sense to provide a generic DRM helper for this.
Same for patch 3 in this series.
Thanks,
Lukas
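To make the "generic helper" idea a bit more concrete, here is one
possible shape for it, written as a userspace sketch in plain C. No
such helper is named in this thread, so mock_call_with_pm_ref(), struct
mock_dev, the probe_fn type and mock_probe() are all hypothetical names
used only for illustration. The point is that the get/put pair lives in
one place and the wrapped callback can assume the device is powered.

```c
#include <assert.h>

/* Hypothetical stand-in for a device with a runtime PM usage count. */
struct mock_dev {
	int pm_refs;
};

/* Callback type for any operation that needs the device powered. */
typedef int (*probe_fn)(struct mock_dev *dev, void *arg);

/* Hold a PM reference for exactly the duration of fn(). */
static int mock_call_with_pm_ref(struct mock_dev *dev, probe_fn fn, void *arg)
{
	int ret;

	dev->pm_refs++;            /* models pm_runtime_get_sync() */
	ret = fn(dev, arg);
	assert(dev->pm_refs > 0);
	dev->pm_refs--;            /* models pm_runtime_put_autosuspend() */
	return ret;
}

/* Example callback: pretends to probe and returns a mode count. */
static int mock_probe(struct mock_dev *dev, void *arg)
{
	int *modes = arg;

	/* The device must stay powered for the whole probe. */
	assert(dev->pm_refs > 0);
	return *modes;
}
```

A driver-specific wrapper such as the ones in this series would then
reduce to a single call that passes its own probe function.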
On Tue, Jul 17, 2018 at 9:16 AM, Lukas Wunner <[email protected]> wrote:
> [cc += linux-pm]
>
> Hi Lyude,
>
> First of all, thanks a lot for looking into this.
>
> On Mon, Jul 16, 2018 at 07:59:25PM -0400, Lyude Paul wrote:
>> In order to fix all of the spots that need to have runtime PM get/puts()
>> added, we need to ensure that it's possible for us to call
>> pm_runtime_get/put() in any context, regardless of how deep, since
>> almost all of the spots that are currently missing refs can potentially
>> get called in the runtime suspend/resume path. Otherwise, we'll try to
>> resume the GPU as we're trying to resume the GPU (and vice-versa) and
>> cause the kernel to deadlock.
>>
>> With this, it should be safe to call the pm runtime functions in any
>> context in nouveau with one condition: any point in the driver that
>> calls pm_runtime_get*() cannot hold any locks owned by nouveau that
>> would be acquired anywhere inside nouveau_pmops_runtime_resume().
>> This includes modesetting locks, i2c bus locks, etc.
> [snip]
>> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
>> @@ -835,6 +835,8 @@ nouveau_pmops_runtime_suspend(struct device *dev)
>> return -EBUSY;
>> }
>>
>> + dev->power.disable_depth++;
This is effectively equivalent to __pm_runtime_disable(dev, false)
except for the locking (which is necessary).
>> +
>
> I'm not sure if that variable is actually private to the PM core.
> Grepping through the tree I only find a single occurrence where it's
> accessed outside the PM core and that's in amdgpu. So this looks
> a little fishy TBH. It may make sense to cc such patches to linux-pm
> to get Rafael & other folks involved with the PM core to comment.
You are right, power.disable_depth is internal to the PM core.
Accessing it (and updating it in particular) directly from drivers is
not a good idea.
> Also, the disable_depth variable only exists if the kernel was
> compiled with CONFIG_PM enabled, but I can't find a "depends on PM"
> or something like that in nouveau's Kconfig. Actually, if PM is
> not selected, all the nouveau_pmops_*() functions should be #ifdef'ed
> away, but oddly there's no #ifdef CONFIG_PM anywhere in nouveau_drm.c.
>
> Anyway, if I understand the commit message correctly, you're hitting a
> pm_runtime_get_sync() in a code path that itself is called during a
> pm_runtime_get_sync(). Could you include stack traces in the commit
> message? My gut feeling is that this patch masks a deeper issue,
> e.g. if the runtime_resume code path does in fact directly poll outputs,
> that would seem wrong. Runtime resume should merely make the card
> accessible, i.e. reinstate power if necessary, put into PCI_D0,
> restore registers, etc. Output polling should be scheduled
> asynchronously.
Right.
Thanks,
Rafael
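For readers following the disable_depth discussion, the sketch below
models in userspace C why the other patches in the series check for
"ret != -EACCES". It is a deliberately simplified model, not the PM
core: struct mock_pm and all mock_*() functions are hypothetical, the
real rpm_resume() has more cases, and the real core updates these
fields under a lock. What it does reflect is that a get on a device
with runtime PM disabled reports -EACCES, and that
pm_runtime_get_sync() bumps the usage count even when it returns an
error, so the sketch drops the reference on the error path as well.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the runtime PM fields of struct device. */
struct mock_pm {
	int usage_count;    /* references currently held */
	int disable_depth;  /* > 0 means runtime PM is disabled */
	int suspended;      /* non-zero while runtime suspended */
	int resume_error;   /* injected resume failure, or 0 */
};

/* Models pm_runtime_get_sync(): the ref is taken even on failure. */
static int mock_pm_get_sync(struct mock_pm *pm)
{
	pm->usage_count++;
	if (pm->disable_depth > 0)
		return -EACCES;
	if (!pm->suspended)
		return 1;
	if (pm->resume_error)
		return pm->resume_error;
	pm->suspended = 0;
	return 0;
}

static void mock_pm_put(struct mock_pm *pm)
{
	assert(pm->usage_count > 0);
	pm->usage_count--;
}

/* Caller pattern: tolerate -EACCES, fail on any other error. */
static int mock_bus_access(struct mock_pm *pm)
{
	int ret = mock_pm_get_sync(pm);

	if (ret < 0 && ret != -EACCES) {
		mock_pm_put(pm);   /* balance the ref taken above */
		return ret;
	}
	/* ... hardware access would happen here ... */
	mock_pm_put(pm);
	return 0;
}

/* Models a resume callback that bumps disable_depth around its body. */
static int mock_runtime_resume(struct mock_pm *pm)
{
	int ret;

	pm->disable_depth++;
	ret = mock_bus_access(pm); /* nested get sees -EACCES, no recursion */
	pm->disable_depth--;
	pm->suspended = 0;
	return ret;
}
```

The model shows the effect the patch is after; it says nothing about
whether a driver should touch disable_depth directly, which is the
concern raised in the replies above.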
Reviewed-by: Karol Herbst <[email protected]>
On Tue, Jul 17, 2018 at 1:59 AM, Lyude Paul <[email protected]> wrote:
> While the GPU is guaranteed to be on when a hotplug has been received,
> the same assertion does not hold true if a connector probe has been
> started by userspace without having received a sysfs event. So
> ensure that any connector probing keeps the GPU alive for the duration
> of the probe.
>
> Signed-off-by: Lyude Paul <[email protected]>
> Cc: Karol Herbst <[email protected]>
> Cc: [email protected]
> ---
> drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +-
> drivers/gpu/drm/nouveau/nouveau_connector.c | 21 +++++++++++++++++++--
> drivers/gpu/drm/nouveau/nouveau_connector.h | 3 +++
> 3 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> index ea2a886854fe..0f283ca75189 100644
> --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> @@ -858,7 +858,7 @@ static const struct drm_connector_funcs
> nv50_mstc = {
> .reset = nouveau_conn_reset,
> .detect = nv50_mstc_detect,
> - .fill_modes = drm_helper_probe_single_connector_modes,
> + .fill_modes = nouveau_connector_probe_single_connector_modes,
> .destroy = nv50_mstc_destroy,
> .atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
> .atomic_destroy_state = nouveau_conn_atomic_destroy_state,
> diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
> index 2a45b4c2ceb0..feb142fb7a8a 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_connector.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
> @@ -770,6 +770,23 @@ nouveau_connector_force(struct drm_connector *connector)
> nouveau_connector_set_encoder(connector, nv_encoder);
> }
>
> +int
> +nouveau_connector_probe_single_connector_modes(struct drm_connector *connector,
> + uint32_t maxX, uint32_t maxY)
> +{
> + struct device *dev = connector->dev->dev;
> + int ret;
> +
> + ret = pm_runtime_get_sync(dev);
> + if (WARN_ON(ret < 0 && ret != -EACCES))
> + return 0;
> +
> + ret = drm_helper_probe_single_connector_modes(connector, maxX, maxY);
> +
> + pm_runtime_put_autosuspend(dev);
> + return ret;
> +}
> +
> static int
> nouveau_connector_set_property(struct drm_connector *connector,
> struct drm_property *property, uint64_t value)
> @@ -1088,7 +1105,7 @@ nouveau_connector_funcs = {
> .reset = nouveau_conn_reset,
> .detect = nouveau_connector_detect,
> .force = nouveau_connector_force,
> - .fill_modes = drm_helper_probe_single_connector_modes,
> + .fill_modes = nouveau_connector_probe_single_connector_modes,
> .set_property = nouveau_connector_set_property,
> .destroy = nouveau_connector_destroy,
> .atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
> @@ -1103,7 +1120,7 @@ nouveau_connector_funcs_lvds = {
> .reset = nouveau_conn_reset,
> .detect = nouveau_connector_detect_lvds,
> .force = nouveau_connector_force,
> - .fill_modes = drm_helper_probe_single_connector_modes,
> + .fill_modes = nouveau_connector_probe_single_connector_modes,
> .set_property = nouveau_connector_set_property,
> .destroy = nouveau_connector_destroy,
> .atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
> diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.h b/drivers/gpu/drm/nouveau/nouveau_connector.h
> index 2d9d35a146a4..e9ffc6eebda2 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_connector.h
> +++ b/drivers/gpu/drm/nouveau/nouveau_connector.h
> @@ -106,6 +106,9 @@ nouveau_crtc_connector_get(struct nouveau_crtc *nv_crtc)
>
> struct drm_connector *
> nouveau_connector_create(struct drm_device *, const struct dcb_output *);
> +int
> +nouveau_connector_probe_single_connector_modes(struct drm_connector *,
> + uint32_t, uint32_t);
>
> extern int nouveau_tv_disable;
> extern int nouveau_ignorelid;
> --
> 2.17.1
>
> _______________________________________________
> Nouveau mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/nouveau
Reviewed-by: Karol Herbst <[email protected]>
On Tue, Jul 17, 2018 at 9:21 AM, Lukas Wunner <[email protected]> wrote:
> On Mon, Jul 16, 2018 at 07:59:26PM -0400, Lyude Paul wrote:
>> --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
>> +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
>> @@ -2012,10 +2012,18 @@ nv50_disp_atomic_state_alloc(struct drm_device *dev)
>> return &atom->state;
>> }
>>
>> +static void
>> +nouveau_output_poll_changed(struct drm_device *dev)
>> +{
>> + pm_runtime_get_sync(dev->dev);
>> + drm_fb_helper_hotplug_event(dev->fb_helper);
>> + pm_runtime_put_autosuspend(dev->dev);
>> +}
>> +
>> static const struct drm_mode_config_funcs
>> nv50_disp_func = {
>> .fb_create = nouveau_user_framebuffer_create,
>> - .output_poll_changed = drm_fb_helper_output_poll_changed,
>> + .output_poll_changed = nouveau_output_poll_changed,
>
> It might make sense to provide a generic DRM helper for this.
> Same for patch 3 in this series.
>
> Thanks,
>
> Lukas
Mhh, we shouldn't call Linux APIs from within nvkm. Rather guard
the Linux glue code for the i2c stuff instead, but this is all done
from inside nvkm. I think we should move it out into
drm/nouveau/nouveau_i2c.c and do the handling there.
On Tue, Jul 17, 2018 at 1:59 AM, Lyude Paul <[email protected]> wrote:
> The i2c bus can be accessed both by DRM itself and through any of its
> devnodes (/sys/class/i2c). So, we need to make sure that all codepaths
> using the i2c bus keep the GPU resumed.
>
> Signed-off-by: Lyude Paul <[email protected]>
> Cc: Karol Herbst <[email protected]>
> Cc: [email protected]
> ---
> drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c
> index 807a2b67bd64..1de48c990b80 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c
> @@ -119,18 +119,28 @@ nvkm_i2c_bus_release(struct nvkm_i2c_bus *bus)
> BUS_TRACE(bus, "release");
> nvkm_i2c_pad_release(pad);
> mutex_unlock(&bus->mutex);
> + pm_runtime_put_autosuspend(pad->i2c->subdev.device->dev);
> }
>
> int
> nvkm_i2c_bus_acquire(struct nvkm_i2c_bus *bus)
> {
> struct nvkm_i2c_pad *pad = bus->pad;
> + struct device *dev = pad->i2c->subdev.device->dev;
> int ret;
> +
> BUS_TRACE(bus, "acquire");
> +
> + ret = pm_runtime_get_sync(dev);
> + if (ret < 0 && ret != -EACCES)
> + return ret;
> +
> mutex_lock(&bus->mutex);
> ret = nvkm_i2c_pad_acquire(pad, NVKM_I2C_PAD_I2C);
> - if (ret)
> + if (ret) {
> mutex_unlock(&bus->mutex);
> + pm_runtime_put_autosuspend(dev);
> + }
> return ret;
> }
>
> --
> 2.17.1
>
On Tue, 17 Jul 2018 at 20:18, Karol Herbst <[email protected]> wrote:
>
> mhh, we shouldn't call into Linux APIs from within nvkm. Rather guard
> the Linux glue code to the i2c stuff instead, but this is all done
> from inside of nvkm. I think we should move it out into
> drm/nouveau/nouveau_i2c.c and do the handling there.
Huh? No, this is completely fine.
On Tue, Jul 17, 2018 at 1:54 PM, Ben Skeggs <[email protected]> wrote:
> On Tue, 17 Jul 2018 at 20:18, Karol Herbst <[email protected]> wrote:
>>
>> mhh, we shouldn't call into Linux APIs from within nvkm. Rather guard
>> the Linux glue code to the i2c stuff instead, but this is all done
>> from inside of nvkm. I think we should move it out into
>> drm/nouveau/nouveau_i2c.c and do the handling there.
> Huh? No, this is completely fine.
>
okay, then the two patches adding that guard code are reviewed-by me
On Tue, 2018-07-17 at 09:16 +0200, Lukas Wunner wrote:
> [cc += linux-pm]
>
> Hi Lyude,
>
> First of all, thanks a lot for looking into this.
>
> On Mon, Jul 16, 2018 at 07:59:25PM -0400, Lyude Paul wrote:
> > In order to fix all of the spots that need to have runtime PM get/puts()
> > added, we need to ensure that it's possible for us to call
> > pm_runtime_get/put() in any context, regardless of how deep, since
> > almost all of the spots that are currently missing refs can potentially
> > get called in the runtime suspend/resume path. Otherwise, we'll try to
> > resume the GPU as we're trying to resume the GPU (and vice-versa) and
> > cause the kernel to deadlock.
> >
> > With this, it should be safe to call the pm runtime functions in any
> > context in nouveau with one condition: any point in the driver that
> > calls pm_runtime_get*() cannot hold any locks owned by nouveau that
> > would be acquired anywhere inside nouveau_pmops_runtime_resume().
> > This includes modesetting locks, i2c bus locks, etc.
>
> [snip]
> > --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> > @@ -835,6 +835,8 @@ nouveau_pmops_runtime_suspend(struct device *dev)
> > return -EBUSY;
> > }
> >
> > + dev->power.disable_depth++;
> > +
>
> I'm not sure if that variable is actually private to the PM core.
> Grepping through the tree I only find a single occurrence where it's
> accessed outside the PM core and that's in amdgpu. So this looks
> a little fishy TBH. It may make sense to cc such patches to linux-pm
> to get Rafael & other folks involved with the PM core to comment.
>
> Also, the disable_depth variable only exists if the kernel was
> compiled with CONFIG_PM enabled, but I can't find a "depends on PM"
> or something like that in nouveau's Kconfig. Actually, if PM is
> not selected, all the nouveau_pmops_*() functions should be #ifdef'ed
> away, but oddly there's no #ifdef CONFIG_PM anywhere in nouveau_drm.c.
>
> Anyway, if I understand the commit message correctly, you're hitting a
> pm_runtime_get_sync() in a code path that itself is called during a
> pm_runtime_get_sync(). Could you include stack traces in the commit
> message? My gut feeling is that this patch masks a deeper issue,
> e.g. if the runtime_resume code path does in fact directly poll outputs,
> that would seem wrong. Runtime resume should merely make the card
> accessible, i.e. reinstate power if necessary, put into PCI_D0,
> restore registers, etc. Output polling should be scheduled
> asynchronously.
Since it is apparently internal to the RPM core (I should go fix the references
to that which I added in amdgpu as well then, whoops...) I will have to figure
out another way to do this.
So: the reason that patch was added was mainly for the patches later in the
series that add guards around the i2c bus and aux bus, since both of those
require that the device be awake for it to work. Currently, the spot where it
would recurse is:
[ 72.126859] nouveau 0000:01:00.0: DRM: suspending console...
[ 72.127161] nouveau 0000:01:00.0: DRM: suspending display...
[ 246.718589] INFO: task kworker/0:1:60 blocked for more than 120 seconds.
[ 246.719254] Tainted: G O 4.18.0-rc5Lyude-Test+ #3
[ 246.719411] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.719527] kworker/0:1 D 0 60 2 0x80000000
[ 246.719636] Workqueue: pm pm_runtime_work
[ 246.719772] Call Trace:
[ 246.719874] __schedule+0x322/0xaf0
[ 246.722800] schedule+0x33/0x90
[ 246.724269] rpm_resume+0x19c/0x850
[ 246.725128] ? finish_wait+0x90/0x90
[ 246.725990] __pm_runtime_resume+0x4e/0x90
[ 246.726876] nvkm_i2c_aux_acquire+0x39/0xc0 [nouveau]
[ 246.727713] nouveau_connector_aux_xfer+0x5c/0xd0 [nouveau]
[ 246.728546] drm_dp_dpcd_access+0x77/0x110 [drm_kms_helper]
[ 246.729349] drm_dp_dpcd_write+0x2b/0xb0 [drm_kms_helper]
[ 246.730085] drm_dp_mst_topology_mgr_suspend+0x4e/0x90 [drm_kms_helper]
[ 246.730828] nv50_display_fini+0xa5/0xc0 [nouveau]
[ 246.731606] nouveau_display_fini+0xc8/0x100 [nouveau]
[ 246.732375] nouveau_display_suspend+0x62/0x110 [nouveau]
[ 246.733106] nouveau_do_suspend+0x5e/0x2d0 [nouveau]
[ 246.733839] nouveau_pmops_runtime_suspend+0x4f/0xb0 [nouveau]
[ 246.734585] pci_pm_runtime_suspend+0x6b/0x190
[ 246.735297] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.736044] __rpm_callback+0x7a/0x1d0
[ 246.736742] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.737467] rpm_callback+0x24/0x80
[ 246.738165] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.738864] rpm_suspend+0x142/0x6b0
[ 246.739593] pm_runtime_work+0x97/0xc0
[ 246.740312] process_one_work+0x231/0x620
[ 246.741028] worker_thread+0x44/0x3a0
[ 246.741731] kthread+0x12b/0x150
[ 246.742439] ? wq_pool_ids_show+0x140/0x140
[ 246.743149] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.743846] ret_from_fork+0x3a/0x50
[ 246.744601]
Showing all locks held in the system:
[ 246.746010] 4 locks held by kworker/0:1/60:
[ 246.746757] #0: 000000003bb334a6 ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.747541] #1: 000000002c55902b ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.748338] #2: 000000002a39c817 (&mgr->lock){+.+.}, at: drm_dp_mst_topology_mgr_suspend+0x33/0x90 [drm_kms_helper]
[ 246.749120] #3: 00000000b7d2f3c0 (&aux->hw_mutex){+.+.}, at: drm_dp_dpcd_access+0x64/0x110 [drm_kms_helper]
[ 246.749928] 1 lock held by khungtaskd/65:
[ 246.750715] #0: 00000000407da5ec (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
[ 246.751535] 1 lock held by dmesg/1122:
[ 246.752328] 2 locks held by zsh/1149:
[ 246.753100] #0: 000000000a27c37b (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
[ 246.753901] #1: 000000006cb043f7 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
[ 246.755503] =============================================
[ 246.757068] NMI backtrace for cpu 1
[ 246.757858] CPU: 1 PID: 65 Comm: khungtaskd Tainted: G O 4.18.0-rc5Lyude-Test+ #3
[ 246.758653] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET78W (1.51 ) 05/18/2018
[ 246.759427] Call Trace:
[ 246.760203] dump_stack+0x8e/0xd3
[ 246.760977] nmi_cpu_backtrace.cold.3+0x14/0x5a
[ 246.761729] ? lapic_can_unplug_cpu.cold.27+0x42/0x42
[ 246.762462] nmi_trigger_cpumask_backtrace+0xa1/0xae
[ 246.763183] arch_trigger_cpumask_backtrace+0x19/0x20
[ 246.763908] watchdog+0x316/0x580
[ 246.764644] kthread+0x12b/0x150
[ 246.765350] ? reset_hung_task_detector+0x20/0x20
[ 246.766052] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.766777] ret_from_fork+0x3a/0x50
[ 246.767488] Sending NMI from CPU 1 to CPUs 0,2-7:
[ 246.768624] NMI backtrace for cpu 5 skipped: idling at intel_idle+0x7f/0x120
[ 246.768648] NMI backtrace for cpu 4 skipped: idling at intel_idle+0x7f/0x120
[ 246.768671] NMI backtrace for cpu 0 skipped: idling at intel_idle+0x7f/0x120
[ 246.768676] NMI backtrace for cpu 7 skipped: idling at intel_idle+0x7f/0x120
[ 246.768678] NMI backtrace for cpu 3 skipped: idling at intel_idle+0x7f/0x120
[ 246.768681] NMI backtrace for cpu 6 skipped: idling at intel_idle+0x7f/0x120
[ 246.768684] NMI backtrace for cpu 2 skipped: idling at intel_idle+0x7f/0x120
[ 246.769623] Kernel panic - not syncing: hung_task: blocked tasks
Suspending the MST topology at that point should be the right thing to do though
(and afaict, I don't -think- we reprobe connectors on resume by default), so I
definitely think we need some sort of RPM barrier here that doesn't take
effect in the suspend/resume path.
>
> Thanks,
>
> Lukas
On Tue, Jul 17, 2018 at 12:53:11PM -0400, Lyude Paul wrote:
> On Tue, 2018-07-17 at 09:16 +0200, Lukas Wunner wrote:
> > On Mon, Jul 16, 2018 at 07:59:25PM -0400, Lyude Paul wrote:
> > > In order to fix all of the spots that need to have runtime PM get/puts()
> > > added, we need to ensure that it's possible for us to call
> > > pm_runtime_get/put() in any context, regardless of how deep, since
> > > almost all of the spots that are currently missing refs can potentially
> > > get called in the runtime suspend/resume path. Otherwise, we'll try to
> > > resume the GPU as we're trying to resume the GPU (and vice-versa) and
> > > cause the kernel to deadlock.
> > >
> > > With this, it should be safe to call the pm runtime functions in any
> > > context in nouveau with one condition: any point in the driver that
> > > calls pm_runtime_get*() cannot hold any locks owned by nouveau that
> > > would be acquired anywhere inside nouveau_pmops_runtime_resume().
> > > This includes modesetting locks, i2c bus locks, etc.
> >
> > [snip]
> > > --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> > > +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> > > @@ -835,6 +835,8 @@ nouveau_pmops_runtime_suspend(struct device *dev)
> > > return -EBUSY;
> > > }
> > >
> > > + dev->power.disable_depth++;
> > > +
> >
> > Anyway, if I understand the commit message correctly, you're hitting a
> > pm_runtime_get_sync() in a code path that itself is called during a
> > pm_runtime_get_sync(). Could you include stack traces in the commit
> > message? My gut feeling is that this patch masks a deeper issue,
> > e.g. if the runtime_resume code path does in fact directly poll outputs,
> > that would seem wrong. Runtime resume should merely make the card
> > accessible, i.e. reinstate power if necessary, put into PCI_D0,
> > restore registers, etc. Output polling should be scheduled
> > asynchronously.
>
> So: the reason that patch was added was mainly for the patches later in the
> series that add guards around the i2c bus and aux bus, since both of those
> require that the device be awake for it to work. Currently, the spot where it
> would recurse is:
Okay, the PCI device is suspending and the nvkm_i2c_aux_acquire()
wants it in resumed state, so is waiting forever for the device to
runtime suspend in order to resume it again immediately afterwards.
The deadlock in the stack trace you've posted could be resolved using
the technique I used in d61a5c106351 by adding the following to
include/linux/pm_runtime.h:
	static inline bool pm_runtime_status_suspending(struct device *dev)
	{
		return dev->power.runtime_status == RPM_SUSPENDING;
	}

	static inline bool is_pm_work(struct device *dev)
	{
		struct work_struct *work = current_work();

		/* true only when running from this device's own PM work item */
		return work && work == &dev->power.work;
	}

Then adding this to nvkm_i2c_aux_acquire():

	struct device *dev = pad->i2c->subdev.device->dev;

	if (!(is_pm_work(dev) && pm_runtime_status_suspending(dev))) {
		ret = pm_runtime_get_sync(dev);
		if (ret < 0 && ret != -EACCES)
			return ret;
	}

But here's the catch: This only works for an *async* runtime suspend.
It doesn't work for pm_runtime_put_sync(), pm_runtime_suspend() etc,
because then the runtime suspend is executed in the context of the caller,
not in the context of dev->power.work.
So it's not a full solution, but hopefully something that gets you
going. I'm not really familiar with the code paths leading to
nvkm_i2c_aux_acquire() to come up with a full solution off the top
of my head I'm afraid.
Note, it's not sufficient to just check pm_runtime_status_suspending(dev)
because if the runtime_suspend is carried out concurrently by something
else, this will return true but it's not guaranteed that the device is
actually kept awake until the i2c communication has been fully performed.
HTH,
Lukas
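
To make the catch above concrete, here is a minimal userspace sketch of the
context check. The struct layouts and the model_current_work variable are
stand-ins invented for illustration, not the kernel's definitions, and the
sketch assumes the intent of is_pm_work() is to test whether the current work
item is the device's own dev->power.work. It only shows when the guard would
skip the runtime PM get; it does not model real locking or power transitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types (assumptions, not real definitions). */
struct work_struct { int unused; };

enum rpm_status { RPM_ACTIVE, RPM_RESUMING, RPM_SUSPENDED, RPM_SUSPENDING };

struct device {
	struct {
		enum rpm_status runtime_status;
		struct work_struct work;	/* models dev->power.work */
	} power;
};

/* Hypothetical: the work item the "current task" is executing, or NULL. */
static struct work_struct *model_current_work;

static struct work_struct *current_work(void)
{
	return model_current_work;
}

static bool pm_runtime_status_suspending(struct device *dev)
{
	return dev->power.runtime_status == RPM_SUSPENDING;
}

static bool is_pm_work(struct device *dev)
{
	struct work_struct *work = current_work();

	return work && work == &dev->power.work;
}

/* True when the caller still has to take a runtime PM reference itself. */
bool needs_rpm_ref(struct device *dev)
{
	return !(is_pm_work(dev) && pm_runtime_status_suspending(dev));
}
```

In this model, needs_rpm_ref() is false only while running from the device's
own PM work item in RPM_SUSPENDING state (the async case). For a synchronous
suspend there is no current work item, so it stays true and the get would
still be attempted from inside the suspend path.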
On Tue, 2018-07-17 at 20:20 +0200, Lukas Wunner wrote:
> On Tue, Jul 17, 2018 at 12:53:11PM -0400, Lyude Paul wrote:
> > On Tue, 2018-07-17 at 09:16 +0200, Lukas Wunner wrote:
> > > On Mon, Jul 16, 2018 at 07:59:25PM -0400, Lyude Paul wrote:
> > > > In order to fix all of the spots that need to have runtime PM get/puts()
> > > > added, we need to ensure that it's possible for us to call
> > > > pm_runtime_get/put() in any context, regardless of how deep, since
> > > > almost all of the spots that are currently missing refs can potentially
> > > > get called in the runtime suspend/resume path. Otherwise, we'll try to
> > > > resume the GPU as we're trying to resume the GPU (and vice-versa) and
> > > > cause the kernel to deadlock.
> > > >
> > > > With this, it should be safe to call the pm runtime functions in any
> > > > context in nouveau with one condition: any point in the driver that
> > > > calls pm_runtime_get*() cannot hold any locks owned by nouveau that
> > > > would be acquired anywhere inside nouveau_pmops_runtime_resume().
> > > > This includes modesetting locks, i2c bus locks, etc.
> > >
> > > [snip]
> > > > --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> > > > +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> > > > @@ -835,6 +835,8 @@ nouveau_pmops_runtime_suspend(struct device *dev)
> > > > return -EBUSY;
> > > > }
> > > >
> > > > + dev->power.disable_depth++;
> > > > +
> > >
> > > Anyway, if I understand the commit message correctly, you're hitting a
> > > pm_runtime_get_sync() in a code path that itself is called during a
> > > pm_runtime_get_sync(). Could you include stack traces in the commit
> > > message? My gut feeling is that this patch masks a deeper issue,
> > > e.g. if the runtime_resume code path does in fact directly poll outputs,
> > > that would seem wrong. Runtime resume should merely make the card
> > > accessible, i.e. reinstate power if necessary, put into PCI_D0,
> > > restore registers, etc. Output polling should be scheduled
> > > asynchronously.
> >
> > So: the reason that patch was added was mainly for the patches later in the
> > series that add guards around the i2c bus and aux bus, since both of those
> > require that the device be awake for it to work. Currently, the spot where
> > it
> > would recurse is:
>
> Okay, the PCI device is suspending and the nvkm_i2c_aux_acquire()
> wants it in resumed state, so is waiting forever for the device to
> runtime suspend in order to resume it again immediately afterwards.
>
> The deadlock in the stack trace you've posted could be resolved using
> the technique I used in d61a5c106351 by adding the following to
> include/linux/pm_runtime.h:
>
> static inline bool pm_runtime_status_suspending(struct device *dev)
> {
> return dev->power.runtime_status == RPM_SUSPENDING;
> }
>
> static inline bool is_pm_work(struct device *dev)
> {
> struct work_struct *work = current_work();
>
> return work && work == &dev->power.work;
> }
>
> Then adding this to nvkm_i2c_aux_acquire():
>
> struct device *dev = pad->i2c->subdev.device->dev;
>
> if (!(is_pm_work(dev) && pm_runtime_status_suspending(dev))) {
> ret = pm_runtime_get_sync(dev);
> if (ret < 0 && ret != -EACCES)
> return ret;
> }
>
> But here's the catch: This only works for an *async* runtime suspend.
> It doesn't work for pm_runtime_put_sync(), pm_runtime_suspend() etc,
> because then the runtime suspend is executed in the context of the caller,
> not in the context of dev->power.work.
>
> So it's not a full solution, but hopefully something that gets you
> going. I'm not really familiar with the code paths leading to
> nvkm_i2c_aux_acquire() to come up with a full solution off the top
> of my head I'm afraid.
OK-I was considering doing something similar to that commit beforehand but I
wasn't sure if I was going to just be hacking around an actual issue. That
doesn't seem to be the case. This is very helpful and hopefully I should be able
to figure something out from this, thanks!
>
> Note, it's not sufficient to just check pm_runtime_status_suspending(dev)
> because if the runtime_suspend is carried out concurrently by something
> else, this will return true but it's not guaranteed that the device is
> actually kept awake until the i2c communication has been fully performed.
>
> HTH,
>
> Lukas
On Tue, Jul 17, 2018 at 02:24:31PM -0400, Lyude Paul wrote:
> On Tue, 2018-07-17 at 20:20 +0200, Lukas Wunner wrote:
> > Okay, the PCI device is suspending and the nvkm_i2c_aux_acquire()
> > wants it in resumed state, so is waiting forever for the device to
> > runtime suspend in order to resume it again immediately afterwards.
> >
> > The deadlock in the stack trace you've posted could be resolved using
> > the technique I used in d61a5c106351 by adding the following to
> > include/linux/pm_runtime.h:
> >
> > static inline bool pm_runtime_status_suspending(struct device *dev)
> > {
> > return dev->power.runtime_status == RPM_SUSPENDING;
> > }
> >
> > static inline bool is_pm_work(struct device *dev)
> > {
> > struct work_struct *work = current_work();
> >
> > return work && work == &dev->power.work;
> > }
> >
> > Then adding this to nvkm_i2c_aux_acquire():
> >
> > struct device *dev = pad->i2c->subdev.device->dev;
> >
> > if (!(is_pm_work(dev) && pm_runtime_status_suspending(dev))) {
> > ret = pm_runtime_get_sync(dev);
> > if (ret < 0 && ret != -EACCES)
> > return ret;
> > }
> >
> > But here's the catch: This only works for an *async* runtime suspend.
> > It doesn't work for pm_runtime_put_sync(), pm_runtime_suspend() etc,
> > because then the runtime suspend is executed in the context of the caller,
> > not in the context of dev->power.work.
> >
> > So it's not a full solution, but hopefully something that gets you
> > going. I'm not really familiar with the code paths leading to
> > nvkm_i2c_aux_acquire() to come up with a full solution off the top
> > of my head I'm afraid.
>
> OK-I was considering doing something similar to that commit beforehand but I
> wasn't sure if I was going to just be hacking around an actual issue. That
> doesn't seem to be the case. This is very helpful and hopefully I should be able
> to figure something out from this, thanks!
In some cases, the function acquiring the runtime PM ref is only called
from a couple of places and then it would be feasible and appropriate
to add a bool parameter to the function telling it to acquire the ref
or not. So the function is told using a parameter which context it's
running in: In the runtime_suspend code path or some other code path.
The technique to use current_work() is an alternative approach to figure
out the context if passing in an additional parameter is not feasible
for some reason. That was the case with d61a5c106351. That approach
only works for work items though.
Lukas
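
The bool-parameter variant described above could look roughly like this. It
is a userspace sketch under stated assumptions: model_dev, model_rpm_get(),
model_rpm_put(), model_aux_acquire() and model_aux_release() are hypothetical
names, and the usage counter merely stands in for the runtime PM reference
count. It is not nouveau's actual acquire path.

```c
#include <stdbool.h>

/* Toy device: the counter stands in for the runtime PM usage count. */
struct model_dev {
	int usage_count;
};

static int model_rpm_get(struct model_dev *dev)
{
	dev->usage_count++;
	return 0;
}

static void model_rpm_put(struct model_dev *dev)
{
	dev->usage_count--;
}

/*
 * Hypothetical acquire helper. Callers on the runtime-suspend path pass
 * take_rpm_ref = false: the device is still powered there, and a get would
 * wait on the very suspend that is in progress.
 */
int model_aux_acquire(struct model_dev *dev, bool take_rpm_ref)
{
	if (take_rpm_ref)
		return model_rpm_get(dev);
	return 0;
}

/* Release has to mirror the decision made at acquire time. */
void model_aux_release(struct model_dev *dev, bool took_rpm_ref)
{
	if (took_rpm_ref)
		model_rpm_put(dev);
}
```

The cost of this design is that every caller has to know which context it
runs in and pass the same value to both acquire and release, which is only
practical when the helper has a small number of call sites.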
On Tue, 2018-07-17 at 20:32 +0200, Lukas Wunner wrote:
> On Tue, Jul 17, 2018 at 02:24:31PM -0400, Lyude Paul wrote:
> > On Tue, 2018-07-17 at 20:20 +0200, Lukas Wunner wrote:
> > > Okay, the PCI device is suspending and the nvkm_i2c_aux_acquire()
> > > wants it in resumed state, so is waiting forever for the device to
> > > runtime suspend in order to resume it again immediately afterwards.
> > >
> > > The deadlock in the stack trace you've posted could be resolved using
> > > the technique I used in d61a5c106351 by adding the following to
> > > include/linux/pm_runtime.h:
> > >
> > > static inline bool pm_runtime_status_suspending(struct device *dev)
> > > {
> > > return dev->power.runtime_status == RPM_SUSPENDING;
> > > }
> > >
> > > static inline bool is_pm_work(struct device *dev)
> > > {
> > > struct work_struct *work = current_work();
> > >
> > > return work && work == &dev->power.work;
> > > }
> > >
> > > Then adding this to nvkm_i2c_aux_acquire():
> > >
> > > struct device *dev = pad->i2c->subdev.device->dev;
> > >
> > > if (!(is_pm_work(dev) && pm_runtime_status_suspending(dev))) {
> > > ret = pm_runtime_get_sync(dev);
> > > if (ret < 0 && ret != -EACCES)
> > > return ret;
> > > }
> > >
> > > But here's the catch: This only works for an *async* runtime suspend.
> > > It doesn't work for pm_runtime_put_sync(), pm_runtime_suspend() etc,
> > > because then the runtime suspend is executed in the context of the caller,
> > > not in the context of dev->power.work.
> > >
> > > So it's not a full solution, but hopefully something that gets you
> > > going. I'm not really familiar with the code paths leading to
> > > nvkm_i2c_aux_acquire() to come up with a full solution off the top
> > > of my head I'm afraid.
> >
> > OK-I was considering doing something similar to that commit beforehand but I
> > wasn't sure if I was going to just be hacking around an actual issue. That
> > doesn't seem to be the case. This is very helpful and hopefully I should be
> > able
> > to figure something out from this, thanks!
>
> In some cases, the function acquiring the runtime PM ref is only called
> from a couple of places and then it would be feasible and appropriate
> to add a bool parameter to the function telling it to acquire the ref
> or not. So the function is told using a parameter which context it's
> running in: In the runtime_suspend code path or some other code path.
>
> The technique to use current_work() is an alternative approach to figure
> out the context if passing in an additional parameter is not feasible
> for some reason. That was the case with d61a5c106351. That approach
> only works for work items though.
Something I'm curious about. This isn't the first time I've hit a situation like
this (see: the improper disable_depth fix I added into amdgpu I now need to go
and fix), which makes me wonder: is there actually any reason Linux's runtime PM
core doesn't just turn get/puts() in the context of s/r callbacks into no-ops by
default?
>
> Lukas
On Tue, Jul 17, 2018 at 8:20 PM, Lukas Wunner <[email protected]> wrote:
> On Tue, Jul 17, 2018 at 12:53:11PM -0400, Lyude Paul wrote:
>> On Tue, 2018-07-17 at 09:16 +0200, Lukas Wunner wrote:
>> > On Mon, Jul 16, 2018 at 07:59:25PM -0400, Lyude Paul wrote:
>> > > In order to fix all of the spots that need to have runtime PM get/puts()
>> > > added, we need to ensure that it's possible for us to call
>> > > pm_runtime_get/put() in any context, regardless of how deep, since
>> > > almost all of the spots that are currently missing refs can potentially
>> > > get called in the runtime suspend/resume path. Otherwise, we'll try to
>> > > resume the GPU as we're trying to resume the GPU (and vice-versa) and
>> > > cause the kernel to deadlock.
>> > >
>> > > With this, it should be safe to call the pm runtime functions in any
>> > > context in nouveau with one condition: any point in the driver that
>> > > calls pm_runtime_get*() cannot hold any locks owned by nouveau that
>> > > would be acquired anywhere inside nouveau_pmops_runtime_resume().
>> > > This includes modesetting locks, i2c bus locks, etc.
>> >
>> > [snip]
>> > > --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
>> > > +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
>> > > @@ -835,6 +835,8 @@ nouveau_pmops_runtime_suspend(struct device *dev)
>> > > return -EBUSY;
>> > > }
>> > >
>> > > + dev->power.disable_depth++;
>> > > +
>> >
>> > Anyway, if I understand the commit message correctly, you're hitting a
>> > pm_runtime_get_sync() in a code path that itself is called during a
>> > pm_runtime_get_sync(). Could you include stack traces in the commit
>> > message? My gut feeling is that this patch masks a deeper issue,
>> > e.g. if the runtime_resume code path does in fact directly poll outputs,
>> > that would seem wrong. Runtime resume should merely make the card
>> > accessible, i.e. reinstate power if necessary, put into PCI_D0,
>> > restore registers, etc. Output polling should be scheduled
>> > asynchronously.
>>
>> So: the reason that patch was added was mainly for the patches later in the
>> series that add guards around the i2c bus and aux bus, since both of those
>> require that the device be awake for it to work. Currently, the spot where it
>> would recurse is:
>
> Okay, the PCI device is suspending and the nvkm_i2c_aux_acquire()
> wants it in resumed state, so is waiting forever for the device to
> runtime suspend in order to resume it again immediately afterwards.
>
> The deadlock in the stack trace you've posted could be resolved using
> the technique I used in d61a5c106351 by adding the following to
> include/linux/pm_runtime.h:
>
> static inline bool pm_runtime_status_suspending(struct device *dev)
> {
> return dev->power.runtime_status == RPM_SUSPENDING;
> }
>
> static inline bool is_pm_work(struct device *dev)
> {
> struct work_struct *work = current_work();
>
> return work && work == &dev->power.work;
> }
>
> Then adding this to nvkm_i2c_aux_acquire():
>
> struct device *dev = pad->i2c->subdev.device->dev;
>
> if (!(is_pm_work(dev) && pm_runtime_status_suspending(dev))) {
> ret = pm_runtime_get_sync(dev);
> if (ret < 0 && ret != -EACCES)
> return ret;
> }
>
> But here's the catch: This only works for an *async* runtime suspend.
> It doesn't work for pm_runtime_put_sync(), pm_runtime_suspend() etc,
> because then the runtime suspend is executed in the context of the caller,
> not in the context of dev->power.work.
>
> So it's not a full solution, but hopefully something that gets you
> going. I'm not really familiar with the code paths leading to
> nvkm_i2c_aux_acquire() to come up with a full solution off the top
> of my head I'm afraid.
>
> Note, it's not sufficient to just check pm_runtime_status_suspending(dev)
> because if the runtime_suspend is carried out concurrently by something
> else, this will return true but it's not guaranteed that the device is
> actually kept awake until the i2c communication has been fully performed.
For the record, I don't quite like this approach as it seems to be
working around a broken dependency graph.
If you need to resume device A from within the runtime resume callback
of device B, then clearly B depends on A and there should be a link
between them.
That said, I do realize that it may be the path of least resistance,
but then I wonder if we can do better than this.
Thanks,
Rafael
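
The dependency argument above can be illustrated with a toy userspace model.
In the kernel, a consumer/supplier relationship between two distinct devices
can be declared with device_link_add() and the DL_FLAG_PM_RUNTIME flag; the
sketch below does not use that API. model_dev and model_core_resume() are
hypothetical names, and the model only shows the ordering idea: once the
dependency is visible to the core, the core resumes the supplier first, so
the consumer's resume callback never has to do it.

```c
#include <stddef.h>

/* Toy device with an optional supplier it depends on. */
struct model_dev {
	const char *name;
	int active;			/* 1 once runtime-resumed */
	struct model_dev *supplier;	/* NULL if there is no dependency */
};

/* Toy "core": resume the supplier chain first, then the device itself. */
void model_core_resume(struct model_dev *dev)
{
	if (dev->supplier)
		model_core_resume(dev->supplier);
	dev->active = 1;
}
```

Here resuming B (whose supplier is A) leaves both devices active without B's
own resume path ever asking for A to be resumed.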
On Tue, Jul 17, 2018 at 8:34 PM, Lyude Paul <[email protected]> wrote:
> On Tue, 2018-07-17 at 20:32 +0200, Lukas Wunner wrote:
>> On Tue, Jul 17, 2018 at 02:24:31PM -0400, Lyude Paul wrote:
>> > On Tue, 2018-07-17 at 20:20 +0200, Lukas Wunner wrote:
>> > > Okay, the PCI device is suspending and the nvkm_i2c_aux_acquire()
>> > > wants it in resumed state, so is waiting forever for the device to
>> > > runtime suspend in order to resume it again immediately afterwards.
>> > >
>> > > The deadlock in the stack trace you've posted could be resolved using
>> > > the technique I used in d61a5c106351 by adding the following to
>> > > include/linux/pm_runtime.h:
>> > >
>> > > static inline bool pm_runtime_status_suspending(struct device *dev)
>> > > {
>> > > return dev->power.runtime_status == RPM_SUSPENDING;
>> > > }
>> > >
>> > > static inline bool is_pm_work(struct device *dev)
>> > > {
>> > > struct work_struct *work = current_work();
>> > >
>> > > return work && work == &dev->power.work;
>> > > }
>> > >
>> > > Then adding this to nvkm_i2c_aux_acquire():
>> > >
>> > > struct device *dev = pad->i2c->subdev.device->dev;
>> > >
>> > > if (!(is_pm_work(dev) && pm_runtime_status_suspending(dev))) {
>> > > ret = pm_runtime_get_sync(dev);
>> > > if (ret < 0 && ret != -EACCES)
>> > > return ret;
>> > > }
>> > >
>> > > But here's the catch: This only works for an *async* runtime suspend.
>> > > It doesn't work for pm_runtime_put_sync(), pm_runtime_suspend() etc,
>> > > because then the runtime suspend is executed in the context of the caller,
>> > > not in the context of dev->power.work.
>> > >
>> > > So it's not a full solution, but hopefully something that gets you
>> > > going. I'm not really familiar with the code paths leading to
>> > > nvkm_i2c_aux_acquire() to come up with a full solution off the top
>> > > of my head I'm afraid.
>> >
>> > OK-I was considering doing something similar to that commit beforehand but I
>> > wasn't sure if I was going to just be hacking around an actual issue. That
>> > doesn't seem to be the case. This is very helpful and hopefully I should be
>> > able to figure something out from this, thanks!
>>
>> In some cases, the function acquiring the runtime PM ref is only called
>> from a couple of places and then it would be feasible and appropriate
>> to add a bool parameter to the function telling it to acquire the ref
>> or not. So the function is told using a parameter which context it's
>> running in: In the runtime_suspend code path or some other code path.
>>
>> The technique to use current_work() is an alternative approach to figure
>> out the context if passing in an additional parameter is not feasible
>> for some reason. That was the case with d61a5c106351. That approach
>> only works for work items though.
>
> Something I'm curious about. This isn't the first time I've hit a situation like
> this (see: the improper disable_depth fix I added into amdgpu I now need to go
> and fix), which makes me wonder: is there actually any reason Linux's runtime PM
> core doesn't just turn get/puts() in the context of s/r callbacks into no-ops by
> default?
Because it's hard to detect reliably enough and because hiding issues
is a bad idea in general.
As I've just said in the message to Lukas, the fact that you need to
resume another device from within your resume callback indicates that
you're hiding your dependency graph from the core.
Thanks,
Rafael
On Tue, Jul 17, 2018 at 02:34:47PM -0400, Lyude Paul wrote:
> On Tue, 2018-07-17 at 20:32 +0200, Lukas Wunner wrote:
> > On Tue, Jul 17, 2018 at 02:24:31PM -0400, Lyude Paul wrote:
> > > On Tue, 2018-07-17 at 20:20 +0200, Lukas Wunner wrote:
> > > > Okay, the PCI device is suspending and the nvkm_i2c_aux_acquire()
> > > > wants it in resumed state, so is waiting forever for the device to
> > > > runtime suspend in order to resume it again immediately afterwards.
> > > >
> > > > The deadlock in the stack trace you've posted could be resolved using
> > > > the technique I used in d61a5c106351 by adding the following to
> > > > include/linux/pm_runtime.h:
> > > >
> > > > static inline bool pm_runtime_status_suspending(struct device *dev)
> > > > {
> > > > 	return dev->power.runtime_status == RPM_SUSPENDING;
> > > > }
> > > >
> > > > static inline bool is_pm_work(struct device *dev)
> > > > {
> > > > 	struct work_struct *work = current_work();
> > > >
> > > > 	return work && work == &dev->power.work;
> > > > }
> > > >
> > > > Then adding this to nvkm_i2c_aux_acquire():
> > > >
> > > > 	struct device *dev = pad->i2c->subdev.device->dev;
> > > >
> > > > 	if (!(is_pm_work(dev) && pm_runtime_status_suspending(dev))) {
> > > > 		ret = pm_runtime_get_sync(dev);
> > > > 		if (ret < 0 && ret != -EACCES)
> > > > 			return ret;
> > > > 	}
> > > >
> > > > But here's the catch: This only works for an *async* runtime suspend.
> > > > It doesn't work for pm_runtime_put_sync(), pm_runtime_suspend() etc,
> > > > because then the runtime suspend is executed in the context of the caller,
> > > > not in the context of dev->power.work.
[snip]
>
> Something I'm curious about. This isn't the first time I've hit a
> situation like this (see: the improper disable_depth fix I added into
> amdgpu I now need to go and fix), which makes me wonder: is there
> actually any reason Linux's runtime PM core doesn't just turn get/puts()
> in the context of s/r callbacks into no-ops by default?
So the PM core could save a pointer to the "current" task_struct
in struct device before invoking the ->runtime_suspend or
->runtime_resume callback, and all subsequent rpm_resume() and
rpm_suspend() calls could then become no-ops if "current" is
equivalent to the saved pointer. (This is also how you could
solve the deadlock you're dealing with for sync suspend.)
For a recursive resume during a resume or a recursive suspend
during a suspend, this might actually be fine.
For a recursive suspend during a resume or a recursive resume
during a suspend, things become murkier: How should the PM core
know if the particular part of the device is still accessible
when hitting a recursive resume during a suspend? Let's say a
clock is needed for i2c. Then the recursive resume during a
suspend may only become a no-op before that clock has been
turned off. That's something only the device driver itself
has knowledge about, because it implements the order in which
subdevices of the GPU are turned off.
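To make the same-task idea concrete, here is a rough userspace sketch of it. This is only a model under stated assumptions: struct model_dev, model_current, model_rpm_get() and the other names are hypothetical and are not kernel API, there is no locking, and it only covers the simple case where a recursive call from the task that is running the callback is turned into a no-op. It does not attempt the murkier case above, where only the driver knows whether the hardware is still reachable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: every name below is hypothetical, not kernel API. */
struct model_task { int id; };

struct model_dev {
	int usage_count;                  /* runtime PM references held */
	const struct model_task *cb_task; /* task running a callback, or NULL */
};

/* Stands in for the kernel's "current"; set by the caller in this model. */
static const struct model_task *model_current;

/* Take a reference, unless called from within this device's own callback. */
static bool model_rpm_get(struct model_dev *dev)
{
	if (dev->cb_task && dev->cb_task == model_current)
		return false; /* recursive call: treated as a no-op */
	dev->usage_count++;
	return true;
}

/* Drop a reference taken by a model_rpm_get() call that returned true. */
static void model_rpm_put(struct model_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* Run a suspend/resume callback with the invoking task recorded. */
static void model_run_callback(struct model_dev *dev,
			       void (*cb)(struct model_dev *))
{
	dev->cb_task = model_current;
	cb(dev);
	dev->cb_task = NULL;
}

/* Records whether the get inside the example callback took a reference. */
static bool model_last_get_took_ref;

/* Example callback: an "i2c transfer" that wants the device awake. */
static void model_suspend_cb(struct model_dev *dev)
{
	model_last_get_took_ref = model_rpm_get(dev);
	if (model_last_get_took_ref)
		model_rpm_put(dev);
}
```

In this model a get issued from inside the callback by the same task leaves usage_count untouched, while a get from any other task still takes a reference as usual.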
Lukas
On Wed, Jul 18, 2018 at 09:38:41AM +0200, Rafael J. Wysocki wrote:
> On Tue, Jul 17, 2018 at 8:20 PM, Lukas Wunner <[email protected]> wrote:
> > Okay, the PCI device is suspending and the nvkm_i2c_aux_acquire()
> > wants it in resumed state, so is waiting forever for the device to
> > runtime suspend in order to resume it again immediately afterwards.
> >
> > The deadlock in the stack trace you've posted could be resolved using
> > the technique I used in d61a5c106351 by adding the following to
> > include/linux/pm_runtime.h:
> >
> > static inline bool pm_runtime_status_suspending(struct device *dev)
> > {
> > 	return dev->power.runtime_status == RPM_SUSPENDING;
> > }
> >
> > static inline bool is_pm_work(struct device *dev)
> > {
> > 	struct work_struct *work = current_work();
> >
> > 	return work && work == &dev->power.work;
> > }
> >
> > Then adding this to nvkm_i2c_aux_acquire():
> >
> > 	struct device *dev = pad->i2c->subdev.device->dev;
> >
> > 	if (!(is_pm_work(dev) && pm_runtime_status_suspending(dev))) {
> > 		ret = pm_runtime_get_sync(dev);
> > 		if (ret < 0 && ret != -EACCES)
> > 			return ret;
> > 	}
[snip]
>
> For the record, I don't quite like this approach as it seems to be
> working around a broken dependency graph.
>
> If you need to resume device A from within the runtime resume callback
> of device B, then clearly B depends on A and there should be a link
> between them.
>
> That said, I do realize that it may be the path of least resistance,
> but then I wonder if we can do better than this.
The GPU contains an i2c subdevice for each connector with DDC lines.
I believe those are modelled as children of the GPU's PCI device as
they're accessed via mmio of the PCI device.
The problem here is that when the GPU's PCI device runtime suspends,
its i2c child device needs to be runtime active to suspend the MST
topology. Catch-22.
I don't know whether or not it's necessary to suspend the MST topology.
I'm not an expert on DisplayPort MultiStream transport.
BTW Lyude, in patch 4 and 5 of this series, you're runtime resuming
pad->i2c->subdev.device->dev. Is this the PCI device or is it the i2c
device? I'm always confused by nouveau's structs. In nvkm_i2c_bus_ctor()
I can see that the device you're runtime resuming is the parent of the
i2c_adapter:
	struct nvkm_device *device = pad->i2c->subdev.device;
	[...]
	bus->i2c.dev.parent = device->dev;
If the i2c_adapter is a child of the PCI device, it's sufficient
to runtime resume the i2c_adapter, i.e. bus->i2c.dev, and this will
implicitly runtime resume its parent.
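As a simplified illustration of that parent/child behaviour, here is a small userspace model. It is a sketch under assumptions, not the PM core's real logic: struct tree_dev, tree_resume(), tree_get_sync() and tree_can_suspend() are hypothetical names, there is no locking, and the accounting is reduced to a usage count plus a count of active children.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; names are hypothetical, not kernel API. */
struct tree_dev {
	struct tree_dev *parent;
	int usage_count; /* references held directly on this device */
	int child_count; /* number of currently active children */
	bool active;
};

/* Bring a device up, resuming its parent first if it has one. */
static void tree_resume(struct tree_dev *dev)
{
	assert(dev != NULL);
	if (dev->active)
		return;
	if (dev->parent) {
		tree_resume(dev->parent);
		dev->parent->child_count++;
	}
	dev->active = true;
}

/* Take a reference on the device and make sure it is active. */
static void tree_get_sync(struct tree_dev *dev)
{
	dev->usage_count++;
	tree_resume(dev);
}

/* A device may only suspend with no references and no active children. */
static bool tree_can_suspend(const struct tree_dev *dev)
{
	return dev->usage_count == 0 && dev->child_count == 0;
}
```

With a "gpu" device as parent of an "i2c" device, taking a reference on the child alone leaves both active and keeps the parent from suspending, even though nothing holds a reference on the parent directly.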
Thanks,
Lukas
On Wed, Jul 18, 2018 at 10:25 AM, Lukas Wunner <[email protected]> wrote:
> On Wed, Jul 18, 2018 at 09:38:41AM +0200, Rafael J. Wysocki wrote:
>> On Tue, Jul 17, 2018 at 8:20 PM, Lukas Wunner <[email protected]> wrote:
>> > Okay, the PCI device is suspending and the nvkm_i2c_aux_acquire()
>> > wants it in resumed state, so is waiting forever for the device to
>> > runtime suspend in order to resume it again immediately afterwards.
>> >
>> > The deadlock in the stack trace you've posted could be resolved using
>> > the technique I used in d61a5c106351 by adding the following to
>> > include/linux/pm_runtime.h:
>> >
>> > static inline bool pm_runtime_status_suspending(struct device *dev)
>> > {
>> > 	return dev->power.runtime_status == RPM_SUSPENDING;
>> > }
>> >
>> > static inline bool is_pm_work(struct device *dev)
>> > {
>> > 	struct work_struct *work = current_work();
>> >
>> > 	return work && work == &dev->power.work;
>> > }
>> >
>> > Then adding this to nvkm_i2c_aux_acquire():
>> >
>> > 	struct device *dev = pad->i2c->subdev.device->dev;
>> >
>> > 	if (!(is_pm_work(dev) && pm_runtime_status_suspending(dev))) {
>> > 		ret = pm_runtime_get_sync(dev);
>> > 		if (ret < 0 && ret != -EACCES)
>> > 			return ret;
>> > 	}
> [snip]
>>
>> For the record, I don't quite like this approach as it seems to be
>> working around a broken dependency graph.
>>
>> If you need to resume device A from within the runtime resume callback
>> of device B, then clearly B depends on A and there should be a link
>> between them.
>>
>> That said, I do realize that it may be the path of least resistance,
>> but then I wonder if we can do better than this.
>
> The GPU contains an i2c subdevice for each connector with DDC lines.
> I believe those are modelled as children of the GPU's PCI device as
> they're accessed via mmio of the PCI device.
>
> The problem here is that when the GPU's PCI device runtime suspends,
> its i2c child device needs to be runtime active to suspend the MST
> topology. Catch-22.
I see.
This sounds like a case for the ignore_children flag, maybe in a
slightly modified form, that will allow the parent to be suspended
regardless of the state of the children.
I wonder what happens to the I2C subdevices when the PCI device goes
into D3. They are not accessible through MMIO any more then, so how
can they be suspended then? Or do they need to be suspended at all?
> I don't know whether or not it's necessary to suspend the MST topology.
> I'm not an expert on DisplayPort MultiStream transport.
Me neither. :-)
On Wed, Jul 18, 2018 at 10:25:05AM +0200, Lukas Wunner wrote:
> The GPU contains an i2c subdevice for each connector with DDC lines.
> I believe those are modelled as children of the GPU's PCI device as
> they're accessed via mmio of the PCI device.
>
> The problem here is that when the GPU's PCI device runtime suspends,
> its i2c child device needs to be runtime active to suspend the MST
> topology. Catch-22.
>
> I don't know whether or not it's necessary to suspend the MST topology.
> I'm not an expert on DisplayPort MultiStream transport.
>
> BTW Lyude, in patch 4 and 5 of this series, you're runtime resuming
> pad->i2c->subdev.device->dev. Is this the PCI device or is it the i2c
> device? I'm always confused by nouveau's structs. In nvkm_i2c_bus_ctor()
> I can see that the device you're runtime resuming is the parent of the
> i2c_adapter:
>
> struct nvkm_device *device = pad->i2c->subdev.device;
> [...]
> bus->i2c.dev.parent = device->dev;
>
> If the i2c_adapter is a child of the PCI device, it's sufficient
> to runtime resume the i2c_adapter, i.e. bus->i2c.dev, and this will
> implicitly runtime resume its parent.
Actually, having written all this I just remembered that we have this
in the documentation:
8. "No-Callback" Devices
Some "devices" are only logical sub-devices of their parent and cannot be
power-managed on their own. [...]
Subsystems can tell the PM core about these devices by calling
pm_runtime_no_callbacks().
So it might actually be sufficient to just call pm_runtime_no_callbacks()
for the i2c devices...
Lukas
On Wed, 2018-07-18 at 10:36 +0200, Lukas Wunner wrote:
> On Wed, Jul 18, 2018 at 10:25:05AM +0200, Lukas Wunner wrote:
> > The GPU contains an i2c subdevice for each connector with DDC lines.
> > I believe those are modelled as children of the GPU's PCI device as
> > they're accessed via mmio of the PCI device.
> >
> > The problem here is that when the GPU's PCI device runtime suspends,
> > its i2c child device needs to be runtime active to suspend the MST
> > topology. Catch-22.
> >
> > I don't know whether or not it's necessary to suspend the MST topology.
> > I'm not an expert on DisplayPort MultiStream transport.
> >
> > BTW Lyude, in patch 4 and 5 of this series, you're runtime resuming
> > pad->i2c->subdev.device->dev. Is this the PCI device or is it the i2c
> > device? I'm always confused by nouveau's structs. In nvkm_i2c_bus_ctor()
> > I can see that the device you're runtime resuming is the parent of the
> > i2c_adapter:
> >
> > struct nvkm_device *device = pad->i2c->subdev.device;
> > [...]
> > bus->i2c.dev.parent = device->dev;
> >
> > If the i2c_adapter is a child of the PCI device, it's sufficient
> > to runtime resume the i2c_adapter, i.e. bus->i2c.dev, and this will
> > implicitly runtime resume its parent.
>
> Actually, having written all this I just remembered that we have this
> in the documentation:
>
> 8. "No-Callback" Devices
>
> Some "devices" are only logical sub-devices of their parent and cannot be
> power-managed on their own. [...]
>
> Subsystems can tell the PM core about these devices by calling
> pm_runtime_no_callbacks().
>
> So it might actually be sufficient to just call pm_runtime_no_callbacks()
I would have hoped so, but unfortunately it seems that
pm_runtime_no_callbacks() is already called by default for i2c adapters in
i2c_register_adapter(). Unfortunately this really can't fix the problem
though, because it will still try to runtime resume the parent device of the
i2c adapter, which still leads to deadlocking in the runtime suspend/resume
path.
Additionally, I did play around with ignore_children, but unfortunately this
isn't good enough either as it just means that our i2c devices won't wake the
GPU up on access.
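To illustrate why ignore_children falls short here, the following is a small userspace model of the effect. It is a sketch under assumptions only: struct ic_dev, ic_resume() and ic_i2c_transfer() are hypothetical names rather than kernel API, and the model reduces everything to the single rule that resuming a child skips the parent when the parent has ignore_children set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; names are hypothetical, not kernel API. */
struct ic_dev {
	struct ic_dev *parent;
	bool ignore_children; /* parent may suspend regardless of children */
	bool active;
};

/* Resume a device; the parent is skipped if it ignores its children. */
static void ic_resume(struct ic_dev *dev)
{
	struct ic_dev *parent = dev->parent;

	assert(dev != NULL);
	if (dev->active)
		return;
	if (parent && !parent->ignore_children)
		ic_resume(parent);
	dev->active = true;
}

/* Returns true only if the transfer could actually reach the hardware. */
static bool ic_i2c_transfer(struct ic_dev *adapter)
{
	ic_resume(adapter);
	return adapter->parent && adapter->parent->active;
}
```

In this model, a transfer on the adapter wakes the GPU when ignore_children is clear, but with the flag set the adapter becomes "active" while the GPU stays suspended, so the transfer has nothing to talk to.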
I'm pretty stumped here on trying to figure out any clean way to handle this
in the PM core if recursive resume calls are off the table. The only possible
solution I could see to this is if we could disable execution of runtime
callbacks in the context of a certain task (while all other tasks have to
honor the runtime PM callbacks), do what we need to do in suspend, then
re-enable them.
> for the i2c devices...
>
> Lukas
--
Cheers,
Lyude Paul
On Wed, Jul 18, 2018 at 10:11 PM, Lyude Paul <[email protected]> wrote:
> On Wed, 2018-07-18 at 10:36 +0200, Lukas Wunner wrote:
>> On Wed, Jul 18, 2018 at 10:25:05AM +0200, Lukas Wunner wrote:
>> > The GPU contains an i2c subdevice for each connector with DDC lines.
>> > I believe those are modelled as children of the GPU's PCI device as
>> > they're accessed via mmio of the PCI device.
>> >
>> > The problem here is that when the GPU's PCI device runtime suspends,
>> > its i2c child device needs to be runtime active to suspend the MST
>> > topology. Catch-22.
>> >
>> > I don't know whether or not it's necessary to suspend the MST topology.
>> > I'm not an expert on DisplayPort MultiStream transport.
>> >
>> > BTW Lyude, in patch 4 and 5 of this series, you're runtime resuming
>> > pad->i2c->subdev.device->dev. Is this the PCI device or is it the i2c
>> > device? I'm always confused by nouveau's structs. In nvkm_i2c_bus_ctor()
>> > I can see that the device you're runtime resuming is the parent of the
>> > i2c_adapter:
>> >
>> > struct nvkm_device *device = pad->i2c->subdev.device;
>> > [...]
>> > bus->i2c.dev.parent = device->dev;
>> >
>> > If the i2c_adapter is a child of the PCI device, it's sufficient
>> > to runtime resume the i2c_adapter, i.e. bus->i2c.dev, and this will
>> > implicitly runtime resume its parent.
>>
>> Actually, having written all this I just remembered that we have this
>> in the documentation:
>>
>> 8. "No-Callback" Devices
>>
>> Some "devices" are only logical sub-devices of their parent and cannot be
>> power-managed on their own. [...]
>>
>> Subsystems can tell the PM core about these devices by calling
>> pm_runtime_no_callbacks().
>>
>> So it might actually be sufficient to just call pm_runtime_no_callbacks()
>
> I would have hoped so, but unfortunately it seems that
> pm_runtime_no_callbacks() is already called by default for i2c adapters in
> i2c_register_adapter(). Unfortunately this really can't fix the problem
> though, because it will still try to runtime resume the parent device of the
> i2c adapter, which still leads to deadlocking in the runtime suspend/resume
> path.
Well, there has to be a way to suspend the whole thing without
recursion or anything similar.
If the adapter has no callbacks, then how is it possible for those
callbacks to invoke any runtime PM helpers for any other devices?
> Additionally, I did play around with ignore_children, but unfortunately this
> isn't good enough either as it just means that our i2c devices won't wake the
> GPU up on access.
So on the one hand you want them to stay active over a suspend of the
parent and on the other hand you want the parent to resume before
them. Are these requirements really consistent with each other?
> I'm pretty stumped here on trying to figure out any clean way to handle this
> in the PM core if recursive resume calls are off the table. The only possible
> solution I could see to this is if we could disable execution of runtime
> callbacks in the context of a certain task (while all other tasks have to
> honor the runtime PM callbacks), do what we need to do in suspend, then
> re-enable them.
>> for the i2c devices...
This sounds completely broken to me, sorry.