Explicitly loading i915 becomes a problem when upstreaming the new intel driver
for Tiger Lake and higher graphics (xe). By loading i915, it doesn't wait for
driver load of xe, and will fail completely before it loads.
-EPROBE_DEFER has to be returned before any device is created in probe(),
otherwise the removal of the device will cause EPROBE_DEFER to try again
in an infinite loop.
The conversion is done in gradual steps. First I add an argument to
snd_hdac_i915_init to allow for -EPROBE_DEFER so I can convert each driver
separately. Then I convert each driver to move snd_hdac_i915_init out of the
workqueue. Finally I drop the ability to choose modprobe behavior after the
last user is converted.
Compared to previous version, I picked up Pierre's changes to add a
probe_no_wq, added remove_no_wq myself and then fixed the SOF changes to
use it.
Cc: Jaroslav Kysela <[email protected]>
Cc: Takashi Iwai <[email protected]>
Cc: Cezary Rojewski <[email protected]>
Cc: Pierre-Louis Bossart <[email protected]>
Cc: Liam Girdwood <[email protected]>
Cc: Peter Ujfalusi <[email protected]>
Cc: Bard Liao <[email protected]>
Cc: Ranjani Sridharan <[email protected]>
Cc: Kai Vehmanen <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Daniel Baluta <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Maarten Lankhorst (9):
ALSA: hda/intel: Fix error handling in azx_probe()
ALSA: hda/i915: Allow override of gpu binding.
ALSA: hda/i915: Add an allow_modprobe argument to snd_hdac_i915_init
ALSA: hda/i915: Allow xe as match for i915_component_master_match
ASoC: Intel: avs: Move snd_hdac_i915_init to before probe_work.
ASoC: Intel: Skylake: Move snd_hdac_i915_init to before probe_work.
ALSA: hda/intel: Move snd_hdac_i915_init to before probe_work.
ASoC: SOF: Intel: Move binding to display driver outside of deferred
probe
ALSA: hda/i915: Remove extra argument from snd_hdac_i915_init
Pierre-Louis Bossart (2):
ASoC: SOF: core: add 'no_wq' probe and remove callbacks
ASoC: SOF: Intel: hda: start splitting the probe
sound/hda/hdac_i915.c | 24 ++++++-----
sound/pci/hda/hda_intel.c | 60 ++++++++++++++--------------
sound/soc/intel/avs/core.c | 13 ++++--
sound/soc/intel/skylake/skl.c | 31 +++++---------
sound/soc/sof/core.c | 11 ++++-
sound/soc/sof/intel/hda-common-ops.c | 2 +
sound/soc/sof/intel/hda.c | 31 ++++++++------
sound/soc/sof/intel/hda.h | 2 +
sound/soc/sof/ops.h | 16 ++++++++
sound/soc/sof/sof-priv.h | 2 +
10 files changed, 112 insertions(+), 80 deletions(-)
--
2.39.2
From: Pierre-Louis Bossart <[email protected]>
The existing DSP probe may be handled in a workqueue to allow for
extra time, typically for the i915 request_module and HDAudio codec
handling.
With the upcoming changes for i915/Xe driver relying on the
-EPROBE_DEFER mechanism, we need to have a first pass of the probe
which cannot be pushed to a workqueue. Introduce 2 new optional
callbacks.
Note that instead of probe_no_wq/probe we could have use a more
self-explanatory naming such as probe/probe_wq_allowed, but that would
have been a very intrusive change.
Signed-off-by: Pierre-Louis Bossart <[email protected]>
Signed-off-by: Maarten Lankhorst <[email protected]>
---
sound/soc/sof/core.c | 11 +++++++++--
sound/soc/sof/ops.h | 16 ++++++++++++++++
sound/soc/sof/sof-priv.h | 2 ++
3 files changed, 27 insertions(+), 2 deletions(-)
diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c
index 30db685cc5f4b..54c384a5d6140 100644
--- a/sound/soc/sof/core.c
+++ b/sound/soc/sof/core.c
@@ -327,8 +327,6 @@ static int sof_probe_continue(struct snd_sof_dev *sdev)
dsp_err:
snd_sof_remove(sdev);
probe_err:
- sof_ops_free(sdev);
-
/* all resources freed, update state to match */
sof_set_fw_state(sdev, SOF_FW_BOOT_NOT_STARTED);
sdev->first_boot = true;
@@ -436,6 +434,14 @@ int snd_sof_device_probe(struct device *dev, struct snd_sof_pdata *plat_data)
sof_set_fw_state(sdev, SOF_FW_BOOT_NOT_STARTED);
+ /*
+ * first pass of probe which isn't allowed to run in a work-queue,
+ * typically to rely on -EPROBE_DEFER dependencies
+ */
+ ret = snd_sof_probe_no_wq(sdev);
+ if (ret < 0)
+ return ret;
+
if (IS_ENABLED(CONFIG_SND_SOC_SOF_PROBE_WORK_QUEUE)) {
INIT_WORK(&sdev->probe_work, sof_probe_work);
schedule_work(&sdev->probe_work);
@@ -487,6 +493,7 @@ int snd_sof_device_remove(struct device *dev)
snd_sof_free_debug(sdev);
snd_sof_remove(sdev);
}
+ snd_sof_remove_no_wq(sdev);
sof_ops_free(sdev);
diff --git a/sound/soc/sof/ops.h b/sound/soc/sof/ops.h
index 9ab7b9be765bc..2a03b152e9313 100644
--- a/sound/soc/sof/ops.h
+++ b/sound/soc/sof/ops.h
@@ -38,6 +38,14 @@ static inline void sof_ops_free(struct snd_sof_dev *sdev)
/* Mandatory operations are verified during probing */
/* init */
+static inline int snd_sof_probe_no_wq(struct snd_sof_dev *sdev)
+{
+ if (sof_ops(sdev)->probe_no_wq)
+ return sof_ops(sdev)->probe_no_wq(sdev);
+
+ return 0;
+}
+
static inline int snd_sof_probe(struct snd_sof_dev *sdev)
{
return sof_ops(sdev)->probe(sdev);
@@ -51,6 +59,14 @@ static inline int snd_sof_remove(struct snd_sof_dev *sdev)
return 0;
}
+static inline int snd_sof_remove_no_wq(struct snd_sof_dev *sdev)
+{
+ if (sof_ops(sdev)->remove_no_wq)
+ return sof_ops(sdev)->remove_no_wq(sdev);
+
+ return 0;
+}
+
static inline int snd_sof_shutdown(struct snd_sof_dev *sdev)
{
if (sof_ops(sdev)->shutdown)
diff --git a/sound/soc/sof/sof-priv.h b/sound/soc/sof/sof-priv.h
index d4f6702e93dcb..addaef282ee92 100644
--- a/sound/soc/sof/sof-priv.h
+++ b/sound/soc/sof/sof-priv.h
@@ -165,6 +165,8 @@ struct sof_firmware {
struct snd_sof_dsp_ops {
/* probe/remove/shutdown */
+ int (*probe_no_wq)(struct snd_sof_dev *sof_dev); /* optional */
+ int (*remove_no_wq)(struct snd_sof_dev *sof_dev); /* optional */
int (*probe)(struct snd_sof_dev *sof_dev); /* mandatory */
int (*remove)(struct snd_sof_dev *sof_dev); /* optional */
int (*shutdown)(struct snd_sof_dev *sof_dev); /* optional */
--
2.39.2
Now that we can use -EPROBE_DEFER, it's no longer required to spin off
the snd_hdac_i915_init into a workqueue. It's likely the whole workqueue
can be destroyed, but I don't have the means to test this.
Removing the workqueue would simplify init even further, but is left
as exercise for the reviewer.
Signed-off-by: Maarten Lankhorst <[email protected]>
Acked-by: Mark Brown <[email protected]>
Reviewed-by: Kai Vehmanen <[email protected]>
---
sound/soc/intel/skylake/skl.c | 31 +++++++++----------------------
1 file changed, 9 insertions(+), 22 deletions(-)
diff --git a/sound/soc/intel/skylake/skl.c b/sound/soc/intel/skylake/skl.c
index 4f7acb4f6680b..24bdbe2a53bec 100644
--- a/sound/soc/intel/skylake/skl.c
+++ b/sound/soc/intel/skylake/skl.c
@@ -783,23 +783,6 @@ static void skl_codec_create(struct hdac_bus *bus)
}
}
-static int skl_i915_init(struct hdac_bus *bus)
-{
- int err;
-
- /*
- * The HDMI codec is in GPU so we need to ensure that it is powered
- * up and ready for probe
- */
- err = snd_hdac_i915_init(bus, true);
- if (err < 0)
- return err;
-
- snd_hdac_display_power(bus, HDA_CODEC_IDX_CONTROLLER, true);
-
- return 0;
-}
-
static void skl_probe_work(struct work_struct *work)
{
struct skl_dev *skl = container_of(work, struct skl_dev, probe_work);
@@ -807,11 +790,8 @@ static void skl_probe_work(struct work_struct *work)
struct hdac_ext_link *hlink;
int err;
- if (IS_ENABLED(CONFIG_SND_SOC_HDAC_HDMI)) {
- err = skl_i915_init(bus);
- if (err < 0)
- return;
- }
+ if (IS_ENABLED(CONFIG_SND_SOC_HDAC_HDMI))
+ snd_hdac_display_power(bus, HDA_CODEC_IDX_CONTROLLER, true);
skl_init_pci(skl);
skl_dum_set(bus);
@@ -1075,10 +1055,17 @@ static int skl_probe(struct pci_dev *pci,
goto out_dsp_free;
}
+ if (IS_ENABLED(CONFIG_SND_SOC_HDAC_HDMI)) {
+ err = snd_hdac_i915_init(bus, false);
+ if (err < 0)
+ goto out_dmic_unregister;
+ }
schedule_work(&skl->probe_work);
return 0;
+out_dmic_unregister:
+ skl_dmic_device_unregister(skl);
out_dsp_free:
skl_free_dsp(skl);
out_clk_free:
--
2.39.2
From: Pierre-Louis Bossart <[email protected]>
This patch moves the initial parts of the probe to the probe_no_wq()
callback, which provides a much faster decision on whether the SOF
driver shall deal with a specific platform or yield to other Intel
drivers.
This is a limited functionality change, the bigger change is to move
the i915/Xe initialization to the probe_no_wq().
Signed-off-by: Pierre-Louis Bossart <[email protected]>
Signed-off-by: Maarten Lankhorst <[email protected]>
---
sound/soc/sof/intel/hda-common-ops.c | 1 +
sound/soc/sof/intel/hda.c | 16 +++++++++++++---
sound/soc/sof/intel/hda.h | 1 +
3 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/sound/soc/sof/intel/hda-common-ops.c b/sound/soc/sof/intel/hda-common-ops.c
index 8e1cd0babd32c..803b5e9087782 100644
--- a/sound/soc/sof/intel/hda-common-ops.c
+++ b/sound/soc/sof/intel/hda-common-ops.c
@@ -16,6 +16,7 @@
struct snd_sof_dsp_ops sof_hda_common_ops = {
/* probe/remove/shutdown */
+ .probe_no_wq = hda_dsp_probe_no_wq,
.probe = hda_dsp_probe,
.remove = hda_dsp_remove,
diff --git a/sound/soc/sof/intel/hda.c b/sound/soc/sof/intel/hda.c
index 15e6779efaa3b..e918b5dadfa02 100644
--- a/sound/soc/sof/intel/hda.c
+++ b/sound/soc/sof/intel/hda.c
@@ -1118,11 +1118,10 @@ static irqreturn_t hda_dsp_interrupt_thread(int irq, void *context)
return IRQ_HANDLED;
}
-int hda_dsp_probe(struct snd_sof_dev *sdev)
+int hda_dsp_probe_no_wq(struct snd_sof_dev *sdev)
{
struct pci_dev *pci = to_pci_dev(sdev->dev);
struct sof_intel_hda_dev *hdev;
- struct hdac_bus *bus;
const struct sof_intel_dsp_desc *chip;
int ret = 0;
@@ -1162,6 +1161,17 @@ int hda_dsp_probe(struct snd_sof_dev *sdev)
sdev->pdata->hw_pdata = hdev;
hdev->desc = chip;
+err:
+ return ret;
+}
+
+int hda_dsp_probe(struct snd_sof_dev *sdev)
+{
+ struct pci_dev *pci = to_pci_dev(sdev->dev);
+ struct sof_intel_hda_dev *hdev = sdev->pdata->hw_pdata;
+ struct hdac_bus *bus;
+ int ret = 0;
+
hdev->dmic_dev = platform_device_register_data(sdev->dev, "dmic-codec",
PLATFORM_DEVID_NONE,
NULL, 0);
@@ -1299,7 +1309,7 @@ int hda_dsp_probe(struct snd_sof_dev *sdev)
platform_device_unregister(hdev->dmic_dev);
iounmap(bus->remap_addr);
hda_codec_i915_exit(sdev);
-err:
+
return ret;
}
diff --git a/sound/soc/sof/intel/hda.h b/sound/soc/sof/intel/hda.h
index 5c517ec57d4a2..89b8c239e9a5e 100644
--- a/sound/soc/sof/intel/hda.h
+++ b/sound/soc/sof/intel/hda.h
@@ -573,6 +573,7 @@ struct sof_intel_hda_stream {
/*
* DSP Core services.
*/
+int hda_dsp_probe_no_wq(struct snd_sof_dev *sdev);
int hda_dsp_probe(struct snd_sof_dev *sdev);
int hda_dsp_remove(struct snd_sof_dev *sdev);
int hda_dsp_core_power_up(struct snd_sof_dev *sdev, unsigned int core_mask);
--
2.39.2
Now that we can use -EPROBE_DEFER, it's no longer required to spin off
the snd_hdac_i915_init into a workqueue. It's likely the whole workqueue
can be destroyed, but I don't have the means to test this.
Removing the workqueue would simplify init even further, but is left
as exercise for the reviewer.
Signed-off-by: Maarten Lankhorst <[email protected]>
Acked-by: Mark Brown <[email protected]>
Reviewed-by: Kai Vehmanen <[email protected]>
---
Changes since v1:
- Rename error label.
---
sound/soc/intel/avs/core.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/sound/soc/intel/avs/core.c b/sound/soc/intel/avs/core.c
index bbb40339c75f4..8a20639582487 100644
--- a/sound/soc/intel/avs/core.c
+++ b/sound/soc/intel/avs/core.c
@@ -191,10 +191,6 @@ static void avs_hda_probe_work(struct work_struct *work)
pm_runtime_set_active(bus->dev); /* clear runtime_error flag */
- ret = snd_hdac_i915_init(bus, true);
- if (ret < 0)
- dev_info(bus->dev, "i915 init unsuccessful: %d\n", ret);
-
snd_hdac_display_power(bus, HDA_CODEC_IDX_CONTROLLER, true);
avs_hdac_bus_init_chip(bus, true);
avs_hdac_bus_probe_codecs(bus);
@@ -465,10 +461,19 @@ static int avs_pci_probe(struct pci_dev *pci, const struct pci_device_id *id)
pci_set_drvdata(pci, bus);
device_disable_async_suspend(dev);
+ ret = snd_hdac_i915_init(bus, false);
+ if (ret == -EPROBE_DEFER)
+ goto err_i915_init;
+ else if (ret < 0)
+ dev_info(bus->dev, "i915 init unsuccessful: %d\n", ret);
+
schedule_work(&adev->probe_work);
return 0;
+err_i915_init:
+ pci_clear_master(pci);
+ pci_set_drvdata(pci, NULL);
err_acquire_irq:
snd_hdac_bus_free_stream_pages(bus);
snd_hdac_ext_stream_free_all(bus);
--
2.39.2
Add missing pci_set_drv to NULL call on error.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Pierre-Louis Bossart <[email protected]>
Reviewed-by: Peter Ujfalusi <[email protected]>
Reviewed-by: Kai Vehmanen <[email protected]>
---
sound/pci/hda/hda_intel.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
index 765d95e798617..bf6210506a2da 100644
--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -2176,6 +2176,7 @@ static int azx_probe(struct pci_dev *pci,
return 0;
out_free:
+ pci_set_drvdata(pci, NULL);
snd_card_free(card);
return err;
}
--
2.39.2
On 8/30/2023 5:36 PM, Maarten Lankhorst wrote:
> Now that we can use -EPROBE_DEFER, it's no longer required to spin off
> the snd_hdac_i915_init into a workqueue. It's likely the whole workqueue
> can be destroyed, but I don't have the means to test this.
>
> Removing the workqueue would simplify init even further, but is left
> as exercise for the reviewer.
>
> Signed-off-by: Maarten Lankhorst <[email protected]>
> Acked-by: Mark Brown <[email protected]>
> Reviewed-by: Kai Vehmanen <[email protected]>
Reviewed-by: Amadeusz Sławiński <[email protected]>
Hey,
Den 2023-08-30 kl. 19:13, skrev Pierre-Louis Bossart:
>> +static inline int snd_sof_remove_no_wq(struct snd_sof_dev *sdev)
>> +{
>> + if (sof_ops(sdev)->remove_no_wq)
>> + return sof_ops(sdev)->remove_no_wq(sdev);
>> +
>> + return 0;
>> +}
>> /* probe/remove/shutdown */
>> + int (*probe_no_wq)(struct snd_sof_dev *sof_dev); /* optional */
>> + int (*remove_no_wq)(struct snd_sof_dev *sof_dev); /* optional */
> My initial PR didn't have this remove_no_wq() callback.
>
> For symmetry it could be useful if it is meant to undo what the
> probe_no_wq() did, but conceptually the first thing we do in the remove
> is make sure that workqueue is either not scheduled or we wait until it
> completes.
>
> int snd_sof_device_remove(struct device *dev)
> {
> struct snd_sof_dev *sdev = dev_get_drvdata(dev);
> struct snd_sof_pdata *pdata = sdev->pdata;
> int ret;
>
> if (IS_ENABLED(CONFIG_SND_SOC_SOF_PROBE_WORK_QUEUE))
> cancel_work_sync(&sdev->probe_work);
That is exactly what I was using it for later on.
I had to have a counter to hda_init() in patch 10/11.
Cheers,
Maarten
On 12/09/2023 03:25, Pierre-Louis Bossart wrote:
>
>
>> What we have atm:
>> snd_sof_probe - might be called from wq
>> snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
>> step)
>
> I don't think it's correct, snd_sof_remove cannot be called from a wq.
> The device core knows nothing about workqueues.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/sound/soc/sof/core.c#n328
it is called on the error path of sof_probe_continue(), which can be run
in a workque.
>> We want a callbacks for hardware/device probing, right, split the
>> snd_sof_probe (and remove) to be able to support a sane level of
>> deferred probing support.
>>
>> With that in mind:
>> snd_sof_device_probe - Not called from wq (to handle deferred probing)
>> snd_sof_probe - might be called from wq
>>
>> snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
>> step)
>> snd_sof_device_remove - Not called from wq (to up the
>> snd_sof_device_probe step)
>>
>> Naming option: s/device/hardware
>
> I like the 'device' hint since it's directly related to the device (or
> subsystem) callbacks.
>
>> However, I think the snd_sof_device_remove itself is redundant and we
>> might not need it at all as in case we have wq and there is a failure in
>> there we do want to release resources as much as possible. The module
>> will be kept loaded (no deferred handling in wq) and that might block
>> PM, other devices to behave correctly. Iow, if the wq has failure we
>> should do a cleanup to the best effort to reach a level like the driver
>> is not even loaded.
>
> If we have a failure in a workqueue used for probe, then we have to
> clean-up everything since nothing in the device core will do so for us.
Yes, this makes the snd_sof_device_remove() redundant or at least the
definition of it is no longer a mirror of snd_sof_device_probe():
snd_sof_device_remove - might be called from wq (cleans up the
snd_sof_device_probe step)
Any failure in sof_probe_continue() should execute the
snd_sof_device_remove(), snd_sof_remove() is only involved after the
snd_sof_probe() have returned with success.
I think this makes actually makes sense and it is well defined.
On module remove we need to take into account the case when we have
failed in wq similarly as we do currently (the resources have been freed
up already).
--
Péter
> What we have atm:
> snd_sof_probe - might be called from wq
> snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
> step)
I don't think it's correct, snd_sof_remove cannot be called from a wq.
The device core knows nothing about workqueues.
> We want a callbacks for hardware/device probing, right, split the
> snd_sof_probe (and remove) to be able to support a sane level of
> deferred probing support.
>
> With that in mind:
> snd_sof_device_probe - Not called from wq (to handle deferred probing)
> snd_sof_probe - might be called from wq
>
> snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
> step)
> snd_sof_device_remove - Not called from wq (to up the
> snd_sof_device_probe step)
>
> Naming option: s/device/hardware
I like the 'device' hint since it's directly related to the device (or
subsystem) callbacks.
> However, I think the snd_sof_device_remove itself is redundant and we
> might not need it at all as in case we have wq and there is a failure in
> there we do want to release resources as much as possible. The module
> will be kept loaded (no deferred handling in wq) and that might block
> PM, other devices to behave correctly. Iow, if the wq has failure we
> should do a cleanup to the best effort to reach a level like the driver
> is not even loaded.
If we have a failure in a workqueue used for probe, then we have to
clean-up everything since nothing in the device core will do so for us.
On 9/12/23 02:10, Péter Ujfalusi wrote:
>
>
> On 12/09/2023 03:25, Pierre-Louis Bossart wrote:
>>
>>
>>> What we have atm:
>>> snd_sof_probe - might be called from wq
>>> snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
>>> step)
>>
>> I don't think it's correct, snd_sof_remove cannot be called from a wq.
>> The device core knows nothing about workqueues.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/sound/soc/sof/core.c#n328
>
> it is called on the error path of sof_probe_continue(), which can be run
> in a workque.
>
>>> We want a callbacks for hardware/device probing, right, split the
>>> snd_sof_probe (and remove) to be able to support a sane level of
>>> deferred probing support.
>>>
>>> With that in mind:
>>> snd_sof_device_probe - Not called from wq (to handle deferred probing)
>>> snd_sof_probe - might be called from wq
>>>
>>> snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
>>> step)
>>> snd_sof_device_remove - Not called from wq (to up the
>>> snd_sof_device_probe step)
>>>
>>> Naming option: s/device/hardware
>>
>> I like the 'device' hint since it's directly related to the device (or
>> subsystem) callbacks.
>>
>>> However, I think the snd_sof_device_remove itself is redundant and we
>>> might not need it at all as in case we have wq and there is a failure in
>>> there we do want to release resources as much as possible. The module
>>> will be kept loaded (no deferred handling in wq) and that might block
>>> PM, other devices to behave correctly. Iow, if the wq has failure we
>>> should do a cleanup to the best effort to reach a level like the driver
>>> is not even loaded.
>>
>> If we have a failure in a workqueue used for probe, then we have to
>> clean-up everything since nothing in the device core will do so for us.
>
> Yes, this makes the snd_sof_device_remove() redundant or at least the
> definition of it is no longer a mirror of snd_sof_device_probe():
>
> snd_sof_device_remove - might be called from wq (cleans up the
> snd_sof_device_probe step)
>
> Any failure in sof_probe_continue() should execute the
> snd_sof_device_remove(), snd_sof_remove() is only involved after the
> snd_sof_probe() have returned with success.
>
> I think this makes actually makes sense and it is well defined.
> On module remove we need to take into account the case when we have
> failed in wq similarly as we do currently (the resources have been freed
> up already).
Agree, I stand corrected, thanks Peter.