2023-11-27 16:20:34

by Kalle Valo

[permalink] [raw]
Subject: [PATCH RFC v2 0/8] wifi: ath11k: hibernation support

From: Kalle Valo <[email protected]>

Currently in ath11k we keep the firmware running on the WLAN device when the
network interface (wlan0) is down. The problem is that this will break
hibernation, obviously the firmware can't be running after the whole system is
powered off. To power down the ath11k firmware for suspend/hibernation some
changes both in MHI subsystem and ath11k is needed.

This patchset fixes a longstanding bug report about broken hibernation support:

https://bugzilla.kernel.org/show_bug.cgi?id=214649

This patchset is marked as RFC as it requires changes in MHI subsystem. Also
this has been tested only on WCN6855, need to test also on more AP based
chipsets like IPQ8074 and QCN9074.

The patches are also available at:

https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/log/?h=ath11k-hibernation-support

Earlier versions of this patchset have been tested by multiple users with
positive results.

v2:

* rebase to ath-202311221826 (6.7.0-rc2-wt-ath+)

* 'bus: mhi: host: add mhi_power_down_no_destroy()': fix null state string for
DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE

* 'bus: mhi: host: add new interfaces to handle MHI channels
directly': fix typos in comments

* 'bus: mhi: host: add new interfaces to handle MHI channels directly': honour
initial autoqueue configuration

* 'bus: mhi: host: add new interfaces to handle MHI channels
directly': don't prepare/unprepare MHI devices that don't match
with a MHI client driver

* 'wifi: ath11k: remove MHI LOOPBACK channels': remove LOOPBACK channels for QCN9074 as well

v1: https://lore.kernel.org/mhi/[email protected]/

Baochen Qiang (7):
bus: mhi: host: add mhi_power_down_no_destroy()
bus: mhi: host: add new interfaces to handle MHI channels directly
wifi: ath11k: handle irq enable/disable in several code path
wifi: ath11k: remove MHI LOOPBACK channels
wifi: ath11k: do not dump SRNG statistics during resume
wifi: ath11k: fix warning on DMA ring capabilities event
wifi: ath11k: support hibernation

Kalle Valo (1):
wifi: ath11k: thermal: don't try to register multiple times

drivers/bus/mhi/host/init.c | 1 +
drivers/bus/mhi/host/internal.h | 1 +
drivers/bus/mhi/host/main.c | 107 ++++++++++++++++++++++
drivers/bus/mhi/host/pm.c | 26 ++++--
drivers/net/wireless/ath/ath11k/ahb.c | 8 +-
drivers/net/wireless/ath/ath11k/core.c | 44 +++++----
drivers/net/wireless/ath/ath11k/core.h | 2 +
drivers/net/wireless/ath/ath11k/hif.h | 12 +--
drivers/net/wireless/ath/ath11k/mhi.c | 77 ++++------------
drivers/net/wireless/ath/ath11k/mhi.h | 4 +-
drivers/net/wireless/ath/ath11k/pci.c | 55 +++++++++--
drivers/net/wireless/ath/ath11k/qmi.c | 7 +-
drivers/net/wireless/ath/ath11k/thermal.c | 3 +
drivers/net/wireless/ath/ath11k/wmi.c | 1 +
include/linux/mhi.h | 47 +++++++++-
15 files changed, 285 insertions(+), 110 deletions(-)


base-commit: 16a212b4f33c4edd9ce9a9e0953b5389216e8ed9
--
2.39.2



2023-11-27 16:20:38

by Kalle Valo

[permalink] [raw]
Subject: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

From: Baochen Qiang <[email protected]>

If ath11k tries to call mhi_power_up() during resume it fails:

ath11k_pci 0000:06:00.0: timeout while waiting for restart complete

This happens because when calling mhi_power_up() the MHI subsystem eventually
calls device_add() from mhi_create_devices() but the device creation is
deferred:

mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral

The reason for deferring device creation is explained in dpm_prepare():

/*
* It is unsafe if probing of devices will happen during suspend or
* hibernation and system behavior will be unpredictable in this case.
* So, let's prohibit device's probing here and defer their probes
* instead. The normal behavior will be restored in dpm_complete().
*/

Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not called and
qcom_mhi_qrtr_dl_callback() fails silently as qdev is zero:

static void qcom_mhi_qrtr_dl_callback(struct mhi_device *mhi_dev,
struct mhi_result *mhi_res)
{
struct qrtr_mhi_dev *qdev = dev_get_drvdata(&mhi_dev->dev);
int rc;

if (!qdev || mhi_res->transaction_status)
return;

So what this means that QRTR is not delivering messages and the QMI connection
is not working between ath11k and the firmware, resulting a failure in firmware
initialisation.

To fix this add new function mhi_power_down_no_destroy() which does not destroy
the devices during power down. This way mhi_power_up() can be called during
resume and we can get ath11k hibernation working with the following patches.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Signed-off-by: Baochen Qiang <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
---
drivers/bus/mhi/host/init.c | 1 +
drivers/bus/mhi/host/internal.h | 1 +
drivers/bus/mhi/host/pm.c | 26 +++++++++++++++++++-------
include/linux/mhi.h | 29 +++++++++++++++++++++++++++--
4 files changed, 48 insertions(+), 9 deletions(-)

diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
index 65ceac1837f9..e626b03ffafa 100644
--- a/drivers/bus/mhi/host/init.c
+++ b/drivers/bus/mhi/host/init.c
@@ -43,6 +43,7 @@ const char * const dev_state_tran_str[DEV_ST_TRANSITION_MAX] = {
[DEV_ST_TRANSITION_FP] = "FLASH PROGRAMMER",
[DEV_ST_TRANSITION_SYS_ERR] = "SYS ERROR",
[DEV_ST_TRANSITION_DISABLE] = "DISABLE",
+ [DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE] = "DISABLE (DESTROY DEVICE)",
};

const char * const mhi_ch_state_type_str[MHI_CH_STATE_TYPE_MAX] = {
diff --git a/drivers/bus/mhi/host/internal.h b/drivers/bus/mhi/host/internal.h
index 30ac415a3000..3f45c9c447bd 100644
--- a/drivers/bus/mhi/host/internal.h
+++ b/drivers/bus/mhi/host/internal.h
@@ -69,6 +69,7 @@ enum dev_st_transition {
DEV_ST_TRANSITION_FP,
DEV_ST_TRANSITION_SYS_ERR,
DEV_ST_TRANSITION_DISABLE,
+ DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE,
DEV_ST_TRANSITION_MAX,
};

diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c
index a2f2feef1476..8833b0248393 100644
--- a/drivers/bus/mhi/host/pm.c
+++ b/drivers/bus/mhi/host/pm.c
@@ -458,7 +458,8 @@ static int mhi_pm_mission_mode_transition(struct mhi_controller *mhi_cntrl)
}

/* Handle shutdown transitions */
-static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
+static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl,
+ bool destroy_device)
{
enum mhi_pm_state cur_state;
struct mhi_event *mhi_event;
@@ -520,8 +521,10 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
dev_dbg(dev, "Waiting for all pending threads to complete\n");
wake_up_all(&mhi_cntrl->state_event);

- dev_dbg(dev, "Reset all active channels and remove MHI devices\n");
- device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL, mhi_destroy_device);
+ if (destroy_device) {
+ dev_dbg(dev, "Reset all active channels and remove MHI devices\n");
+ device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL, mhi_destroy_device);
+ }

mutex_lock(&mhi_cntrl->pm_mutex);

@@ -806,7 +809,10 @@ void mhi_pm_st_worker(struct work_struct *work)
mhi_pm_sys_error_transition(mhi_cntrl);
break;
case DEV_ST_TRANSITION_DISABLE:
- mhi_pm_disable_transition(mhi_cntrl);
+ mhi_pm_disable_transition(mhi_cntrl, false);
+ break;
+ case DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE:
+ mhi_pm_disable_transition(mhi_cntrl, true);
break;
default:
break;
@@ -1160,7 +1166,8 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
}
EXPORT_SYMBOL_GPL(mhi_async_power_up);

-void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
+void __mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful,
+ bool destroy_device)
{
enum mhi_pm_state cur_state, transition_state;
struct device *dev = &mhi_cntrl->mhi_dev->dev;
@@ -1196,14 +1203,19 @@ void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
write_unlock_irq(&mhi_cntrl->pm_lock);
mutex_unlock(&mhi_cntrl->pm_mutex);

- mhi_queue_state_transition(mhi_cntrl, DEV_ST_TRANSITION_DISABLE);
+ if (destroy_device)
+ mhi_queue_state_transition(mhi_cntrl,
+ DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE);
+ else
+ mhi_queue_state_transition(mhi_cntrl,
+ DEV_ST_TRANSITION_DISABLE);

/* Wait for shutdown to complete */
flush_work(&mhi_cntrl->st_worker);

disable_irq(mhi_cntrl->irq[0]);
}
-EXPORT_SYMBOL_GPL(mhi_power_down);
+EXPORT_SYMBOL_GPL(__mhi_power_down);

int mhi_sync_power_up(struct mhi_controller *mhi_cntrl)
{
diff --git a/include/linux/mhi.h b/include/linux/mhi.h
index d0f9b522f328..ae092bc8b97e 100644
--- a/include/linux/mhi.h
+++ b/include/linux/mhi.h
@@ -648,12 +648,37 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl);
*/
int mhi_sync_power_up(struct mhi_controller *mhi_cntrl);

+void __mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful,
+ bool destroy_device);
+
/**
- * mhi_power_down - Start MHI power down sequence
+ * mhi_power_down - Start MHI power down sequence. See also
+ * mhi_power_down_no_destroy() which is a variant of this for suspend.
+ *
* @mhi_cntrl: MHI controller
* @graceful: Link is still accessible, so do a graceful shutdown process
*/
-void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful);
+static inline void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
+{
+ __mhi_power_down(mhi_cntrl, graceful, true);
+}
+
+/**
+ * mhi_power_down_no_destroy - Start MHI power down sequence but don't
+ * destroy struct devices. This is a variant for mhi_power_down() and is a
+ * workaround to make it possible to use mhi_power_up() in a resume
+ * handler. When using this variant the caller must also call
+ * mhi_prepare_all_for_transfer_autoqueue() and
+ * mhi_unprepare_all_from_transfer().
+ *
+ * @mhi_cntrl: MHI controller
+ * @graceful: Link is still accessible, so do a graceful shutdown process
+ */
+static inline void mhi_power_down_no_destroy(struct mhi_controller *mhi_cntrl,
+ bool graceful)
+{
+ __mhi_power_down(mhi_cntrl, graceful, false);
+}

/**
* mhi_unprepare_after_power_down - Free any allocated memory after power down
--
2.39.2


2023-11-27 16:20:44

by Kalle Valo

[permalink] [raw]
Subject: [PATCH RFC v2 2/8] bus: mhi: host: add new interfaces to handle MHI channels directly

From: Baochen Qiang <[email protected]>

When using mhi_power_down_no_destroy() MHI hosts need to unprepare MHI channels
by themselves. Similarly, MHI stack will also not create new MHI device since
old devices were not destroyed, so MHI hosts need to prepare channels as well.
Hence add these two interfaces to make that possible.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Signed-off-by: Baochen Qiang <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
---
drivers/bus/mhi/host/main.c | 107 ++++++++++++++++++++++++++++++++++++
include/linux/mhi.h | 20 ++++++-
2 files changed, 126 insertions(+), 1 deletion(-)

diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
index d80975f4bba8..3f677fc628ad 100644
--- a/drivers/bus/mhi/host/main.c
+++ b/drivers/bus/mhi/host/main.c
@@ -1669,6 +1669,58 @@ int mhi_prepare_for_transfer_autoqueue(struct mhi_device *mhi_dev)
}
EXPORT_SYMBOL_GPL(mhi_prepare_for_transfer_autoqueue);

+static int ____mhi_prepare_for_transfer(struct device *dev, void *data)
+{
+ struct mhi_device *mhi_dev;
+ struct mhi_chan *ul_chan, *dl_chan;
+ enum mhi_ee_type ee = MHI_EE_MAX;
+
+ if (dev->bus != &mhi_bus_type)
+ return 0;
+
+ mhi_dev = to_mhi_device(dev);
+
+ /* Only prepare virtual devices that are attached to bus */
+ if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
+ return 0;
+
+ /* There are cases where there is no MHI client driver matches
+ * this device, we are not allowed to do prepare for it.
+ */
+ if (!mhi_dev->id)
+ return 0;
+
+ ul_chan = mhi_dev->ul_chan;
+ dl_chan = mhi_dev->dl_chan;
+
+ /*
+ * If execution environment is specified, remove only those devices that
+ * started in them based on ee_mask for the channels as we move on to a
+ * different execution environment
+ */
+ if (data)
+ ee = *(enum mhi_ee_type *)data;
+
+ if (ul_chan && ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
+ return 0;
+
+
+ if (dl_chan && ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
+ return 0;
+
+ if (dl_chan->pre_alloc)
+ return mhi_prepare_for_transfer_autoqueue(mhi_dev);
+ else
+ return mhi_prepare_for_transfer(mhi_dev);
+}
+
+int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl)
+{
+ return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
+ ____mhi_prepare_for_transfer);
+}
+EXPORT_SYMBOL_GPL(mhi_prepare_all_for_transfer);
+
void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
{
struct mhi_controller *mhi_cntrl = mhi_dev->mhi_cntrl;
@@ -1684,3 +1736,58 @@ void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
}
}
EXPORT_SYMBOL_GPL(mhi_unprepare_from_transfer);
+
+static int ____mhi_unprepare_from_transfer(struct device *dev, void *data)
+{
+ struct mhi_device *mhi_dev;
+ struct mhi_chan *ul_chan, *dl_chan;
+ enum mhi_ee_type ee = MHI_EE_MAX;
+
+ if (dev->bus != &mhi_bus_type)
+ return 0;
+
+ mhi_dev = to_mhi_device(dev);
+
+ /* Only unprepare virtual devices that are attached to bus */
+ if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
+ return 0;
+
+ /* There are cases where there is no MHI client driver matches
+ * this device, so it is not probed or prepared, no need to
+ * do unprepare for it.
+ */
+ if (!mhi_dev->id)
+ return 0;
+
+ ul_chan = mhi_dev->ul_chan;
+ dl_chan = mhi_dev->dl_chan;
+
+ /*
+ * If execution environment is specified, remove only those devices that
+ * started in them based on ee_mask for the channels as we move on to a
+ * different execution environment
+ */
+ if (data)
+ ee = *(enum mhi_ee_type *)data;
+
+ if (ul_chan) {
+ if (ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
+ return 0;
+ }
+
+ if (dl_chan) {
+ if (ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
+ return 0;
+ }
+
+ mhi_unprepare_from_transfer(mhi_dev);
+
+ return 0;
+}
+
+int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl)
+{
+ return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
+ ____mhi_unprepare_from_transfer);
+}
+EXPORT_SYMBOL_GPL(mhi_unprepare_all_from_transfer);
diff --git a/include/linux/mhi.h b/include/linux/mhi.h
index ae092bc8b97e..dcf62a57056a 100644
--- a/include/linux/mhi.h
+++ b/include/linux/mhi.h
@@ -668,7 +668,7 @@ static inline void mhi_power_down(struct mhi_controller *mhi_cntrl, bool gracefu
* destroy struct devices. This is a variant for mhi_power_down() and is a
* workaround to make it possible to use mhi_power_up() in a resume
* handler. When using this variant the caller must also call
- * mhi_prepare_all_for_transfer_autoqueue() and
+ * mhi_prepare_all_for_transfer() and
* mhi_unprepare_all_from_transfer().
*
* @mhi_cntrl: MHI controller
@@ -842,4 +842,22 @@ int mhi_queue_skb(struct mhi_device *mhi_dev, enum dma_data_direction dir,
*/
bool mhi_queue_is_full(struct mhi_device *mhi_dev, enum dma_data_direction dir);

+/**
+ * mhi_prepare_all_for_transfer - if you are using
+ * mhi_power_down_no_destroy() variant this needs to be called after
+ * calling mhi_power_up().
+ *
+ * @mhi_cntrl: MHI controller
+ */
+int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl);
+
+/**
+ * mhi_unprepare_all_from_transfer - if you are using
+ * mhi_power_down_no_destroy() variant this function needs to be called
+ * before calling mhi_power_down_no_destroy().
+ *
+ * @mhi_cntrl: MHI controller
+ */
+int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl);
+
#endif /* _MHI_H_ */
--
2.39.2


2023-11-27 16:20:48

by Kalle Valo

[permalink] [raw]
Subject: [PATCH RFC v2 3/8] wifi: ath11k: handle irq enable/disable in several code path

From: Baochen Qiang <[email protected]>

For non WoW suspend/resume, ath11k host powers down whole hardware
when suspend and power up it when resume, the code path it goes
through is very like the ath11k reset logic.

In order to reuse that logic, do some IRQ management work to make
it work.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Signed-off-by: Baochen Qiang <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
---
drivers/net/wireless/ath/ath11k/core.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/ath/ath11k/core.c b/drivers/net/wireless/ath/ath11k/core.c
index 0c6ecbb9a066..fbd6b6a0e12c 100644
--- a/drivers/net/wireless/ath/ath11k/core.c
+++ b/drivers/net/wireless/ath/ath11k/core.c
@@ -852,9 +852,6 @@ int ath11k_core_resume(struct ath11k_base *ab)
return ret;
}

- ath11k_hif_ce_irq_enable(ab);
- ath11k_hif_irq_enable(ab);
-
ret = ath11k_dp_rx_pktlog_start(ab);
if (ret) {
ath11k_warn(ab, "failed to start rx pktlog during resume: %d\n",
@@ -1775,10 +1772,9 @@ static int ath11k_core_reconfigure_on_crash(struct ath11k_base *ab)

mutex_lock(&ab->core_lock);
ath11k_thermal_unregister(ab);
- ath11k_hif_irq_disable(ab);
ath11k_dp_pdev_free(ab);
ath11k_spectral_deinit(ab);
- ath11k_hif_stop(ab);
+ ath11k_ce_cleanup_pipes(ab);
ath11k_wmi_detach(ab);
ath11k_dp_pdev_reo_cleanup(ab);
mutex_unlock(&ab->core_lock);
@@ -2033,8 +2029,8 @@ static void ath11k_core_reset(struct work_struct *work)
time_left = wait_for_completion_timeout(&ab->recovery_start,
ATH11K_RECOVER_START_TIMEOUT_HZ);

- ath11k_hif_power_down(ab);
- ath11k_hif_power_up(ab);
+ ath11k_hif_irq_disable(ab);
+ ath11k_hif_ce_irq_disable(ab);

ath11k_dbg(ab, ATH11K_DBG_BOOT, "reset started\n");
}
--
2.39.2


2023-11-27 16:20:56

by Kalle Valo

[permalink] [raw]
Subject: [PATCH RFC v2 5/8] wifi: ath11k: do not dump SRNG statistics during resume

From: Baochen Qiang <[email protected]>

Both the firmware reset feature and the power management
suspend/resume feature share common power-down and power-up
functionality. One aspect of the power-up functionality is
the handling of the ATH11K_QMI_EVENT_FW_INIT_DONE event.
When this event is received, a call is made to
ath11k_hal_dump_srng_stats(), with the purpose to collect
information that may be useful in debugging the cause of a
firmware reset.

Unfortunately, since this functionality is shared between
both the firmware reset path and the power management
resume path, the kernel log is flooded with messages during
resume. Since these messages are not useful during resume,
and in fact can be confusing and can increase the time it
takes to resume, update the logic to only call
ath11k_hal_dump_srng_stats() during firmware reset.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Signed-off-by: Baochen Qiang <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
---
drivers/net/wireless/ath/ath11k/qmi.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath11k/qmi.c b/drivers/net/wireless/ath/ath11k/qmi.c
index c270dc46d506..97a74563d4a6 100644
--- a/drivers/net/wireless/ath/ath11k/qmi.c
+++ b/drivers/net/wireless/ath/ath11k/qmi.c
@@ -3249,7 +3249,8 @@ static void ath11k_qmi_driver_event_work(struct work_struct *work)
case ATH11K_QMI_EVENT_FW_INIT_DONE:
clear_bit(ATH11K_FLAG_QMI_FAIL, &ab->dev_flags);
if (test_bit(ATH11K_FLAG_REGISTERED, &ab->dev_flags)) {
- ath11k_hal_dump_srng_stats(ab);
+ if (ab->is_reset)
+ ath11k_hal_dump_srng_stats(ab);
queue_work(ab->workqueue, &ab->restart_work);
break;
}
--
2.39.2


2023-11-27 16:21:05

by Kalle Valo

[permalink] [raw]
Subject: [PATCH RFC v2 7/8] wifi: ath11k: thermal: don't try to register multiple times

From: Kalle Valo <[email protected]>

Every time the firmware boots we call ath11k_core_qmi_firmware_ready() which
ends up calling ath11k_thermal_register(). So we try to register thermal
devices multiple times. And when we power off the firmware during
suspend/hibernation (implemented in the next patch) we get a warning in resume:

hwmon hwmon4: PM: parent phy0 should not be sleeping

Workaround this similarly like ath11k_mac_register() does by testing
ATH11K_FLAG_REGISTERED.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Signed-off-by: Kalle Valo <[email protected]>
---
drivers/net/wireless/ath/ath11k/thermal.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/net/wireless/ath/ath11k/thermal.c b/drivers/net/wireless/ath/ath11k/thermal.c
index c9b012f97ba5..80abf472fb87 100644
--- a/drivers/net/wireless/ath/ath11k/thermal.c
+++ b/drivers/net/wireless/ath/ath11k/thermal.c
@@ -162,6 +162,9 @@ int ath11k_thermal_register(struct ath11k_base *ab)
struct ath11k_pdev *pdev;
int i, ret;

+ if (test_bit(ATH11K_FLAG_REGISTERED, &ab->dev_flags))
+ return 0;
+
for (i = 0; i < ab->num_radios; i++) {
pdev = &ab->pdevs[i];
ar = pdev->ar;
--
2.39.2


2023-11-27 16:21:07

by Kalle Valo

[permalink] [raw]
Subject: [PATCH RFC v2 8/8] wifi: ath11k: support hibernation

From: Baochen Qiang <[email protected]>

Now that all infrastructure is in place and ath11k is fixed to handle all the
corner cases, power down the ath11k firmware during suspend and power it back
up during resume. This fixes the problem when using hibernation with ath11k PCI
devices.

Change to use ath11k_hif_power_down() instead of ath11k_hif_suspend()
in suspend callback and to use ath11k_hif_power_up() instead of
ath11k_hif_resume() in resume callback.

In ath11k_hif_power_down(), we reset MHI channels to keep from unexpected
activities, and last we go PCI power down path to completely reset whole
hardware. Most importantly in power down path, we tell mhi_power_down() to not
to destroy MHI devices, making us get rid of the probe-defer issue when resume.

In ath11k_hif_power_up(), we go normal PCI power up path to download firmware
etc. Since MHI channels are not activated automatically, we do it manually as
the last part.

Also change related code due to interface changes.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Tested-by: Takashi Iwai <[email protected]>
Signed-off-by: Baochen Qiang <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
---
drivers/net/wireless/ath/ath11k/ahb.c | 8 ++--
drivers/net/wireless/ath/ath11k/core.c | 34 ++++++++--------
drivers/net/wireless/ath/ath11k/core.h | 2 +
drivers/net/wireless/ath/ath11k/hif.h | 12 +++---
drivers/net/wireless/ath/ath11k/mhi.c | 21 +++++++++-
drivers/net/wireless/ath/ath11k/mhi.h | 4 +-
drivers/net/wireless/ath/ath11k/pci.c | 55 +++++++++++++++++++++++---
drivers/net/wireless/ath/ath11k/qmi.c | 4 +-
8 files changed, 103 insertions(+), 37 deletions(-)

diff --git a/drivers/net/wireless/ath/ath11k/ahb.c b/drivers/net/wireless/ath/ath11k/ahb.c
index f8f5e653cd03..0ff5e85392aa 100644
--- a/drivers/net/wireless/ath/ath11k/ahb.c
+++ b/drivers/net/wireless/ath/ath11k/ahb.c
@@ -401,7 +401,7 @@ static void ath11k_ahb_stop(struct ath11k_base *ab)
ath11k_ce_cleanup_pipes(ab);
}

-static int ath11k_ahb_power_up(struct ath11k_base *ab)
+static int ath11k_ahb_power_up(struct ath11k_base *ab, bool is_resume)
{
struct ath11k_ahb *ab_ahb = ath11k_ahb_priv(ab);
int ret;
@@ -413,11 +413,11 @@ static int ath11k_ahb_power_up(struct ath11k_base *ab)
return ret;
}

-static void ath11k_ahb_power_down(struct ath11k_base *ab)
+static int ath11k_ahb_power_down(struct ath11k_base *ab, bool is_suspend)
{
struct ath11k_ahb *ab_ahb = ath11k_ahb_priv(ab);

- rproc_shutdown(ab_ahb->tgt_rproc);
+ return rproc_shutdown(ab_ahb->tgt_rproc);
}

static void ath11k_ahb_init_qmi_ce_config(struct ath11k_base *ab)
@@ -1256,7 +1256,7 @@ static int ath11k_ahb_remove(struct platform_device *pdev)
struct ath11k_base *ab = platform_get_drvdata(pdev);

if (test_bit(ATH11K_FLAG_QMI_FAIL, &ab->dev_flags)) {
- ath11k_ahb_power_down(ab);
+ ath11k_ahb_power_down(ab, false);
ath11k_debugfs_soc_destroy(ab);
ath11k_qmi_deinit_service(ab);
goto qmi_fail;
diff --git a/drivers/net/wireless/ath/ath11k/core.c b/drivers/net/wireless/ath/ath11k/core.c
index fbd6b6a0e12c..0e0b02692282 100644
--- a/drivers/net/wireless/ath/ath11k/core.c
+++ b/drivers/net/wireless/ath/ath11k/core.c
@@ -800,12 +800,6 @@ int ath11k_core_suspend(struct ath11k_base *ab)
return ret;
}

- ret = ath11k_wow_enable(ab);
- if (ret) {
- ath11k_warn(ab, "failed to enable wow during suspend: %d\n", ret);
- return ret;
- }
-
ret = ath11k_dp_rx_pktlog_stop(ab, false);
if (ret) {
ath11k_warn(ab, "failed to stop dp rx pktlog during suspend: %d\n",
@@ -819,7 +813,7 @@ int ath11k_core_suspend(struct ath11k_base *ab)
ath11k_hif_irq_disable(ab);
ath11k_hif_ce_irq_disable(ab);

- ret = ath11k_hif_suspend(ab);
+ ret = ath11k_hif_power_down(ab, true);
if (ret) {
ath11k_warn(ab, "failed to suspend hif: %d\n", ret);
return ret;
@@ -834,6 +828,7 @@ int ath11k_core_resume(struct ath11k_base *ab)
int ret;
struct ath11k_pdev *pdev;
struct ath11k *ar;
+ long time_left;

if (!ab->hw_params.supports_suspend)
return -EOPNOTSUPP;
@@ -846,11 +841,18 @@ int ath11k_core_resume(struct ath11k_base *ab)
if (!ar || ar->state != ATH11K_STATE_OFF)
return 0;

- ret = ath11k_hif_resume(ab);
+ reinit_completion(&ab->restart_completed);
+ ret = ath11k_hif_power_up(ab, true);
if (ret) {
ath11k_warn(ab, "failed to resume hif during resume: %d\n", ret);
return ret;
}
+ time_left = wait_for_completion_timeout(&ab->restart_completed,
+ ATH11K_RESET_TIMEOUT_HZ);
+ if (time_left == 0) {
+ ath11k_warn(ab, "timeout while waiting for restart complete");
+ return -ETIMEDOUT;
+ }

ret = ath11k_dp_rx_pktlog_start(ab);
if (ret) {
@@ -859,12 +861,6 @@ int ath11k_core_resume(struct ath11k_base *ab)
return ret;
}

- ret = ath11k_wow_wakeup(ab);
- if (ret) {
- ath11k_warn(ab, "failed to wakeup wow during resume: %d\n", ret);
- return ret;
- }
-
return 0;
}
EXPORT_SYMBOL(ath11k_core_resume);
@@ -1488,7 +1484,7 @@ static int ath11k_core_soc_create(struct ath11k_base *ab)
goto err_qmi_deinit;
}

- ret = ath11k_hif_power_up(ab);
+ ret = ath11k_hif_power_up(ab, false);
if (ret) {
ath11k_err(ab, "failed to power up :%d\n", ret);
goto err_debugfs_reg;
@@ -1963,6 +1959,8 @@ static void ath11k_core_restart(struct work_struct *work)

if (!ab->is_reset)
ath11k_core_post_reconfigure_recovery(ab);
+
+ complete(&ab->restart_completed);
}

static void ath11k_core_reset(struct work_struct *work)
@@ -2032,6 +2030,9 @@ static void ath11k_core_reset(struct work_struct *work)
ath11k_hif_irq_disable(ab);
ath11k_hif_ce_irq_disable(ab);

+ ath11k_hif_power_down(ab, false);
+ ath11k_hif_power_up(ab, false);
+
ath11k_dbg(ab, ATH11K_DBG_BOOT, "reset started\n");
}

@@ -2102,7 +2103,7 @@ void ath11k_core_deinit(struct ath11k_base *ab)

mutex_unlock(&ab->core_lock);

- ath11k_hif_power_down(ab);
+ ath11k_hif_power_down(ab, false);
ath11k_mac_destroy(ab);
ath11k_core_soc_destroy(ab);
ath11k_fw_destroy(ab);
@@ -2155,6 +2156,7 @@ struct ath11k_base *ath11k_core_alloc(struct device *dev, size_t priv_size,
timer_setup(&ab->rx_replenish_retry, ath11k_ce_rx_replenish_retry, 0);
init_completion(&ab->htc_suspend);
init_completion(&ab->wow.wakeup_completed);
+ init_completion(&ab->restart_completed);

ab->dev = dev;
ab->hif.bus = bus;
diff --git a/drivers/net/wireless/ath/ath11k/core.h b/drivers/net/wireless/ath/ath11k/core.h
index 7e3b6779f4e9..b3816b8be934 100644
--- a/drivers/net/wireless/ath/ath11k/core.h
+++ b/drivers/net/wireless/ath/ath11k/core.h
@@ -995,6 +995,8 @@ struct ath11k_base {
DECLARE_BITMAP(fw_features, ATH11K_FW_FEATURE_COUNT);
} fw;

+ struct completion restart_completed;
+
#ifdef CONFIG_NL80211_TESTMODE
struct {
u32 data_pos;
diff --git a/drivers/net/wireless/ath/ath11k/hif.h b/drivers/net/wireless/ath/ath11k/hif.h
index d68ed4214dec..7f08591ed5c8 100644
--- a/drivers/net/wireless/ath/ath11k/hif.h
+++ b/drivers/net/wireless/ath/ath11k/hif.h
@@ -16,8 +16,8 @@ struct ath11k_hif_ops {
void (*irq_disable)(struct ath11k_base *ab);
int (*start)(struct ath11k_base *ab);
void (*stop)(struct ath11k_base *ab);
- int (*power_up)(struct ath11k_base *ab);
- void (*power_down)(struct ath11k_base *ab);
+ int (*power_up)(struct ath11k_base *ab, bool is_resume);
+ int (*power_down)(struct ath11k_base *ab, bool is_suspend);
int (*suspend)(struct ath11k_base *ab);
int (*resume)(struct ath11k_base *ab);
int (*map_service_to_pipe)(struct ath11k_base *ab, u16 service_id,
@@ -64,14 +64,14 @@ static inline void ath11k_hif_irq_disable(struct ath11k_base *ab)
ab->hif.ops->irq_disable(ab);
}

-static inline int ath11k_hif_power_up(struct ath11k_base *ab)
+static inline int ath11k_hif_power_up(struct ath11k_base *ab, bool is_resume)
{
- return ab->hif.ops->power_up(ab);
+ return ab->hif.ops->power_up(ab, is_resume);
}

-static inline void ath11k_hif_power_down(struct ath11k_base *ab)
+static inline int ath11k_hif_power_down(struct ath11k_base *ab, bool is_resume)
{
- ab->hif.ops->power_down(ab);
+ return ab->hif.ops->power_down(ab, is_resume);
}

static inline int ath11k_hif_suspend(struct ath11k_base *ab)
diff --git a/drivers/net/wireless/ath/ath11k/mhi.c b/drivers/net/wireless/ath/ath11k/mhi.c
index 579af57f7377..58b92f7ce2b8 100644
--- a/drivers/net/wireless/ath/ath11k/mhi.c
+++ b/drivers/net/wireless/ath/ath11k/mhi.c
@@ -441,9 +441,16 @@ int ath11k_mhi_start(struct ath11k_pci *ab_pci)
return 0;
}

-void ath11k_mhi_stop(struct ath11k_pci *ab_pci)
+void ath11k_mhi_stop(struct ath11k_pci *ab_pci, bool is_suspend)
{
- mhi_power_down(ab_pci->mhi_ctrl, true);
+ /* During suspend we need to use mhi_power_down_no_destroy()
+ * workaround, otherwise mhi_power_up() will fail during resume.
+ */
+ if (is_suspend)
+ mhi_power_down_no_destroy(ab_pci->mhi_ctrl, true);
+ else
+ mhi_power_down(ab_pci->mhi_ctrl, true);
+
mhi_unprepare_after_power_down(ab_pci->mhi_ctrl);
}

@@ -478,3 +485,13 @@ int ath11k_mhi_resume(struct ath11k_pci *ab_pci)

return 0;
}
+
+int ath11k_mhi_prepare_for_transfer(struct ath11k_pci *ab_pci)
+{
+ return mhi_prepare_all_for_transfer(ab_pci->mhi_ctrl);
+}
+
+int ath11k_mhi_unprepare_from_transfer(struct ath11k_pci *ab_pci)
+{
+ return mhi_unprepare_all_from_transfer(ab_pci->mhi_ctrl);
+}
diff --git a/drivers/net/wireless/ath/ath11k/mhi.h b/drivers/net/wireless/ath/ath11k/mhi.h
index 8d9f852da695..80902fda5842 100644
--- a/drivers/net/wireless/ath/ath11k/mhi.h
+++ b/drivers/net/wireless/ath/ath11k/mhi.h
@@ -17,7 +17,7 @@
#define MHICTRL_RESET_MASK 0x2

int ath11k_mhi_start(struct ath11k_pci *ar_pci);
-void ath11k_mhi_stop(struct ath11k_pci *ar_pci);
+void ath11k_mhi_stop(struct ath11k_pci *ar_pci, bool is_suspend);
int ath11k_mhi_register(struct ath11k_pci *ar_pci);
void ath11k_mhi_unregister(struct ath11k_pci *ar_pci);
void ath11k_mhi_set_mhictrl_reset(struct ath11k_base *ab);
@@ -26,4 +26,6 @@ void ath11k_mhi_clear_vector(struct ath11k_base *ab);
int ath11k_mhi_suspend(struct ath11k_pci *ar_pci);
int ath11k_mhi_resume(struct ath11k_pci *ar_pci);

+int ath11k_mhi_prepare_for_transfer(struct ath11k_pci *ar_pci);
+int ath11k_mhi_unprepare_from_transfer(struct ath11k_pci *ar_pci);
#endif
diff --git a/drivers/net/wireless/ath/ath11k/pci.c b/drivers/net/wireless/ath/ath11k/pci.c
index 09e65c5e55c4..3d6195bc6f55 100644
--- a/drivers/net/wireless/ath/ath11k/pci.c
+++ b/drivers/net/wireless/ath/ath11k/pci.c
@@ -625,7 +625,7 @@ static int ath11k_pci_power_up(struct ath11k_base *ab)
return 0;
}

-static void ath11k_pci_power_down(struct ath11k_base *ab)
+static void ath11k_pci_power_down(struct ath11k_base *ab, bool is_suspend)
{
struct ath11k_pci *ab_pci = ath11k_pci_priv(ab);

@@ -636,11 +636,54 @@ static void ath11k_pci_power_down(struct ath11k_base *ab)

ath11k_pci_msi_disable(ab_pci);

- ath11k_mhi_stop(ab_pci);
+ ath11k_mhi_stop(ab_pci, is_suspend);
clear_bit(ATH11K_FLAG_DEVICE_INIT_DONE, &ab->dev_flags);
ath11k_pci_sw_reset(ab_pci->ab, false);
}

+static int ath11k_pci_hif_power_down(struct ath11k_base *ab, bool is_suspend)
+{
+ struct ath11k_pci *ab_pci = ath11k_pci_priv(ab);
+ int ret;
+
+ if (is_suspend) {
+ ret = ath11k_mhi_unprepare_from_transfer(ab_pci);
+ if (ret) {
+ ath11k_err(ab_pci->ab, "failed to unprepare from transfer %d\n",
+ ret);
+ return ret;
+ }
+ }
+
+ ath11k_pci_power_down(ab, is_suspend);
+ return 0;
+}
+
+static int ath11k_pci_hif_power_up(struct ath11k_base *ab, bool is_resume)
+{
+ struct ath11k_pci *ab_pci = ath11k_pci_priv(ab);
+ int ret;
+
+ ret = ath11k_pci_power_up(ab);
+ if (ret) {
+ ath11k_err(ab_pci->ab, "failed to power up %d\n", ret);
+ return ret;
+ }
+
+ if (is_resume) {
+ /* sleep for 500ms to let mhi_pm_mission_mode_transition()
+ * finishes, or we may be wake up immediately after mission
+ * mode event received and call
+ * ath11k_mhi_prepare_for_transfer(), while bottom half of
+ * mhi_pm_mission_mode_transition() does not finish.
+ */
+ msleep(500);
+ ret = ath11k_mhi_prepare_for_transfer(ab_pci);
+ }
+
+ return ret;
+}
+
static int ath11k_pci_hif_suspend(struct ath11k_base *ab)
{
struct ath11k_pci *ar_pci = ath11k_pci_priv(ab);
@@ -688,8 +731,8 @@ static const struct ath11k_hif_ops ath11k_pci_hif_ops = {
.read32 = ath11k_pcic_read32,
.write32 = ath11k_pcic_write32,
.read = ath11k_pcic_read,
- .power_down = ath11k_pci_power_down,
- .power_up = ath11k_pci_power_up,
+ .power_down = ath11k_pci_hif_power_down,
+ .power_up = ath11k_pci_hif_power_up,
.suspend = ath11k_pci_hif_suspend,
.resume = ath11k_pci_hif_resume,
.irq_enable = ath11k_pcic_ext_irq_enable,
@@ -938,7 +981,7 @@ static void ath11k_pci_remove(struct pci_dev *pdev)
ath11k_pci_set_irq_affinity_hint(ab_pci, NULL);

if (test_bit(ATH11K_FLAG_QMI_FAIL, &ab->dev_flags)) {
- ath11k_pci_power_down(ab);
+ ath11k_pci_power_down(ab, false);
ath11k_debugfs_soc_destroy(ab);
ath11k_qmi_deinit_service(ab);
goto qmi_fail;
@@ -966,7 +1009,7 @@ static void ath11k_pci_shutdown(struct pci_dev *pdev)
struct ath11k_pci *ab_pci = ath11k_pci_priv(ab);

ath11k_pci_set_irq_affinity_hint(ab_pci, NULL);
- ath11k_pci_power_down(ab);
+ ath11k_pci_power_down(ab, false);
}

static __maybe_unused int ath11k_pci_pm_suspend(struct device *dev)
diff --git a/drivers/net/wireless/ath/ath11k/qmi.c b/drivers/net/wireless/ath/ath11k/qmi.c
index 97a74563d4a6..7d856d8b7f89 100644
--- a/drivers/net/wireless/ath/ath11k/qmi.c
+++ b/drivers/net/wireless/ath/ath11k/qmi.c
@@ -2877,8 +2877,8 @@ int ath11k_qmi_fwreset_from_cold_boot(struct ath11k_base *ab)
}

/* reset the firmware */
- ath11k_hif_power_down(ab);
- ath11k_hif_power_up(ab);
+ ath11k_hif_power_down(ab, false);
+ ath11k_hif_power_up(ab, false);
ath11k_dbg(ab, ATH11K_DBG_QMI, "exit wait for cold boot done\n");
return 0;
}
--
2.39.2


2023-11-27 16:21:41

by Kalle Valo

[permalink] [raw]
Subject: [PATCH RFC v2 4/8] wifi: ath11k: remove MHI LOOPBACK channels

From: Baochen Qiang <[email protected]>

There is no driver to match these two channels, so
remove them. This fixes warnings from MHI subsystem during suspend:

mhi mhi0_LOOPBACK: 1: Failed to reset channel, still resetting
mhi mhi0_LOOPBACK: 0: Failed to reset channel, still resetting

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Signed-off-by: Baochen Qiang <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
---
drivers/net/wireless/ath/ath11k/mhi.c | 56 ---------------------------
1 file changed, 56 deletions(-)

diff --git a/drivers/net/wireless/ath/ath11k/mhi.c b/drivers/net/wireless/ath/ath11k/mhi.c
index afeabd6ecc67..579af57f7377 100644
--- a/drivers/net/wireless/ath/ath11k/mhi.c
+++ b/drivers/net/wireless/ath/ath11k/mhi.c
@@ -21,34 +21,6 @@
#define RDDM_DUMP_SIZE 0x420000

static struct mhi_channel_config ath11k_mhi_channels_qca6390[] = {
- {
- .num = 0,
- .name = "LOOPBACK",
- .num_elements = 32,
- .event_ring = 0,
- .dir = DMA_TO_DEVICE,
- .ee_mask = 0x4,
- .pollcfg = 0,
- .doorbell = MHI_DB_BRST_DISABLE,
- .lpm_notify = false,
- .offload_channel = false,
- .doorbell_mode_switch = false,
- .auto_queue = false,
- },
- {
- .num = 1,
- .name = "LOOPBACK",
- .num_elements = 32,
- .event_ring = 0,
- .dir = DMA_FROM_DEVICE,
- .ee_mask = 0x4,
- .pollcfg = 0,
- .doorbell = MHI_DB_BRST_DISABLE,
- .lpm_notify = false,
- .offload_channel = false,
- .doorbell_mode_switch = false,
- .auto_queue = false,
- },
{
.num = 20,
.name = "IPCR",
@@ -114,34 +86,6 @@ static struct mhi_controller_config ath11k_mhi_config_qca6390 = {
};

static struct mhi_channel_config ath11k_mhi_channels_qcn9074[] = {
- {
- .num = 0,
- .name = "LOOPBACK",
- .num_elements = 32,
- .event_ring = 1,
- .dir = DMA_TO_DEVICE,
- .ee_mask = 0x14,
- .pollcfg = 0,
- .doorbell = MHI_DB_BRST_DISABLE,
- .lpm_notify = false,
- .offload_channel = false,
- .doorbell_mode_switch = false,
- .auto_queue = false,
- },
- {
- .num = 1,
- .name = "LOOPBACK",
- .num_elements = 32,
- .event_ring = 1,
- .dir = DMA_FROM_DEVICE,
- .ee_mask = 0x14,
- .pollcfg = 0,
- .doorbell = MHI_DB_BRST_DISABLE,
- .lpm_notify = false,
- .offload_channel = false,
- .doorbell_mode_switch = false,
- .auto_queue = false,
- },
{
.num = 20,
.name = "IPCR",
--
2.39.2


2023-11-27 16:22:00

by Kalle Valo

[permalink] [raw]
Subject: [PATCH RFC v2 6/8] wifi: ath11k: fix warning on DMA ring capabilities event

From: Baochen Qiang <[email protected]>

We are seeing below warning in both reset and suspend/resume scenarios:

[69663.691847] ath11k_pci 0000:02:00.0: Already processed, so ignoring dma ring caps

This is because ab->num_db_cap is not cleared in
ath11k_wmi_free_dbring_caps(), so clear it to avoid such
warnings.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Signed-off-by: Baochen Qiang <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
---
drivers/net/wireless/ath/ath11k/wmi.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/net/wireless/ath/ath11k/wmi.c b/drivers/net/wireless/ath/ath11k/wmi.c
index 2845b4313d3a..b73d4286f7d3 100644
--- a/drivers/net/wireless/ath/ath11k/wmi.c
+++ b/drivers/net/wireless/ath/ath11k/wmi.c
@@ -4786,6 +4786,7 @@ static void ath11k_wmi_free_dbring_caps(struct ath11k_base *ab)
{
kfree(ab->db_caps);
ab->db_caps = NULL;
+ ab->num_db_cap = 0;
}

static int ath11k_wmi_tlv_dma_ring_caps(struct ath11k_base *ab,
--
2.39.2


2023-11-27 18:49:28

by Jeff Johnson

[permalink] [raw]
Subject: Re: [PATCH RFC v2 0/8] wifi: ath11k: hibernation support

On 11/27/2023 8:20 AM, Kalle Valo wrote:
> From: Kalle Valo <[email protected]>
>
> Currently in ath11k we keep the firmware running on the WLAN device when the
> network interface (wlan0) is down. The problem is that this will break
> hibernation, obviously the firmware can't be running after the whole system is
> powered off. To power down the ath11k firmware for suspend/hibernation some
> changes both in MHI subsystem and ath11k is needed.
>
> This patchset fixes a longstanding bug report about broken hibernation support:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=214649
>
> This patchset is marked as RFC as it requires changes in MHI subsystem. Also
> this has been tested only on WCN6855, need to test also on more AP based
> chipsets like IPQ8074 and QCN9074.
>
> The patches are also available at:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/log/?h=ath11k-hibernation-support
>
> Earlier versions of this patchset have been tested by multiple users with
> positive results.
>
> v2:
>
> * rebase to ath-202311221826 (6.7.0-rc2-wt-ath+)
>
> * 'bus: mhi: host: add mhi_power_down_no_destroy()': fix null state string for
> DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE
>
> * 'bus: mhi: host: add new interfaces to handle MHI channels
> directly': fix typos in comments
>
> * 'bus: mhi: host: add new interfaces to handle MHI channels directly': honour
> initial autoqueue configuration
>
> * 'bus: mhi: host: add new interfaces to handle MHI channels
> directly': don't prepare/unprepare MHI devices that don't match
> with a MHI client driver
>
> * 'wifi: ath11k: remove MHI LOOPBACK channels': remove LOOPBACK channels for QCN9074 as well
>
> v1: https://lore.kernel.org/mhi/[email protected]/
>
> Baochen Qiang (7):
> bus: mhi: host: add mhi_power_down_no_destroy()
> bus: mhi: host: add new interfaces to handle MHI channels directly
> wifi: ath11k: handle irq enable/disable in several code path
> wifi: ath11k: remove MHI LOOPBACK channels
> wifi: ath11k: do not dump SRNG statistics during resume
> wifi: ath11k: fix warning on DMA ring capabilities event
> wifi: ath11k: support hibernation
>
> Kalle Valo (1):
> wifi: ath11k: thermal: don't try to register multiple times
>
> drivers/bus/mhi/host/init.c | 1 +
> drivers/bus/mhi/host/internal.h | 1 +
> drivers/bus/mhi/host/main.c | 107 ++++++++++++++++++++++
> drivers/bus/mhi/host/pm.c | 26 ++++--
> drivers/net/wireless/ath/ath11k/ahb.c | 8 +-
> drivers/net/wireless/ath/ath11k/core.c | 44 +++++----
> drivers/net/wireless/ath/ath11k/core.h | 2 +
> drivers/net/wireless/ath/ath11k/hif.h | 12 +--
> drivers/net/wireless/ath/ath11k/mhi.c | 77 ++++------------
> drivers/net/wireless/ath/ath11k/mhi.h | 4 +-
> drivers/net/wireless/ath/ath11k/pci.c | 55 +++++++++--
> drivers/net/wireless/ath/ath11k/qmi.c | 7 +-
> drivers/net/wireless/ath/ath11k/thermal.c | 3 +
> drivers/net/wireless/ath/ath11k/wmi.c | 1 +
> include/linux/mhi.h | 47 +++++++++-
> 15 files changed, 285 insertions(+), 110 deletions(-)
>
>
> base-commit: 16a212b4f33c4edd9ce9a9e0953b5389216e8ed9

Series LGTM. Will add appropriate tags when this is no longer RFC


2023-11-28 01:13:50

by Baochen Qiang

[permalink] [raw]
Subject: Re: [PATCH RFC v2 4/8] wifi: ath11k: remove MHI LOOPBACK channels



On 11/28/2023 12:20 AM, Kalle Valo wrote:
> From: Baochen Qiang <[email protected]>
>
> There is no driver to match these two channels, so
> remove them. This fixes warnings from MHI subsystem during suspend:
>
> mhi mhi0_LOOPBACK: 1: Failed to reset channel, still resetting
> mhi mhi0_LOOPBACK: 0: Failed to reset channel, still resetting
With v2, these warnings are gone even without this patch. so it should
be removed from commit log. It's enough to only mention that those
channels are not used.

>
> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
>
> Signed-off-by: Baochen Qiang <[email protected]>
> Signed-off-by: Kalle Valo <[email protected]>
> ---
> drivers/net/wireless/ath/ath11k/mhi.c | 56 ---------------------------
> 1 file changed, 56 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath11k/mhi.c b/drivers/net/wireless/ath/ath11k/mhi.c
> index afeabd6ecc67..579af57f7377 100644
> --- a/drivers/net/wireless/ath/ath11k/mhi.c
> +++ b/drivers/net/wireless/ath/ath11k/mhi.c
> @@ -21,34 +21,6 @@
> #define RDDM_DUMP_SIZE 0x420000
>
> static struct mhi_channel_config ath11k_mhi_channels_qca6390[] = {
> - {
> - .num = 0,
> - .name = "LOOPBACK",
> - .num_elements = 32,
> - .event_ring = 0,
> - .dir = DMA_TO_DEVICE,
> - .ee_mask = 0x4,
> - .pollcfg = 0,
> - .doorbell = MHI_DB_BRST_DISABLE,
> - .lpm_notify = false,
> - .offload_channel = false,
> - .doorbell_mode_switch = false,
> - .auto_queue = false,
> - },
> - {
> - .num = 1,
> - .name = "LOOPBACK",
> - .num_elements = 32,
> - .event_ring = 0,
> - .dir = DMA_FROM_DEVICE,
> - .ee_mask = 0x4,
> - .pollcfg = 0,
> - .doorbell = MHI_DB_BRST_DISABLE,
> - .lpm_notify = false,
> - .offload_channel = false,
> - .doorbell_mode_switch = false,
> - .auto_queue = false,
> - },
> {
> .num = 20,
> .name = "IPCR",
> @@ -114,34 +86,6 @@ static struct mhi_controller_config ath11k_mhi_config_qca6390 = {
> };
>
> static struct mhi_channel_config ath11k_mhi_channels_qcn9074[] = {
> - {
> - .num = 0,
> - .name = "LOOPBACK",
> - .num_elements = 32,
> - .event_ring = 1,
> - .dir = DMA_TO_DEVICE,
> - .ee_mask = 0x14,
> - .pollcfg = 0,
> - .doorbell = MHI_DB_BRST_DISABLE,
> - .lpm_notify = false,
> - .offload_channel = false,
> - .doorbell_mode_switch = false,
> - .auto_queue = false,
> - },
> - {
> - .num = 1,
> - .name = "LOOPBACK",
> - .num_elements = 32,
> - .event_ring = 1,
> - .dir = DMA_FROM_DEVICE,
> - .ee_mask = 0x14,
> - .pollcfg = 0,
> - .doorbell = MHI_DB_BRST_DISABLE,
> - .lpm_notify = false,
> - .offload_channel = false,
> - .doorbell_mode_switch = false,
> - .auto_queue = false,
> - },
> {
> .num = 20,
> .name = "IPCR",

2023-11-30 05:43:00

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

On Mon, Nov 27, 2023 at 06:20:15PM +0200, Kalle Valo wrote:
> From: Baochen Qiang <[email protected]>
>
> If ath11k tries to call mhi_power_up() during resume it fails:
>
> ath11k_pci 0000:06:00.0: timeout while waiting for restart complete
>
> This happens because when calling mhi_power_up() the MHI subsystem eventually
> calls device_add() from mhi_create_devices() but the device creation is
> deferred:
>
> mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral
>
> The reason for deferring device creation is explained in dpm_prepare():
>
> /*
> * It is unsafe if probing of devices will happen during suspend or
> * hibernation and system behavior will be unpredictable in this case.
> * So, let's prohibit device's probing here and defer their probes
> * instead. The normal behavior will be restored in dpm_complete().
> */
>
> Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not called and
> qcom_mhi_qrtr_dl_callback() fails silently as qdev is zero:
>
> static void qcom_mhi_qrtr_dl_callback(struct mhi_device *mhi_dev,
> struct mhi_result *mhi_res)
> {
> struct qrtr_mhi_dev *qdev = dev_get_drvdata(&mhi_dev->dev);
> int rc;
>
> if (!qdev || mhi_res->transaction_status)
> return;
>
> So what this means that QRTR is not delivering messages and the QMI connection
> is not working between ath11k and the firmware, resulting a failure in firmware
> initialisation.
>
> To fix this add new function mhi_power_down_no_destroy() which does not destroy
> the devices during power down. This way mhi_power_up() can be called during
> resume and we can get ath11k hibernation working with the following patches.
>
> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
>
> Signed-off-by: Baochen Qiang <[email protected]>
> Signed-off-by: Kalle Valo <[email protected]>

Any reason for reposting this series without discussing the suggestion from
Mayank? As I said in the internal thread, this patch breaks the Linux device
driver model by not destroying the "struct device" when the actual device gets
removed. We should try to explore alternate options instead of persisting with
this solution.

- Mani

> ---
> drivers/bus/mhi/host/init.c | 1 +
> drivers/bus/mhi/host/internal.h | 1 +
> drivers/bus/mhi/host/pm.c | 26 +++++++++++++++++++-------
> include/linux/mhi.h | 29 +++++++++++++++++++++++++++--
> 4 files changed, 48 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
> index 65ceac1837f9..e626b03ffafa 100644
> --- a/drivers/bus/mhi/host/init.c
> +++ b/drivers/bus/mhi/host/init.c
> @@ -43,6 +43,7 @@ const char * const dev_state_tran_str[DEV_ST_TRANSITION_MAX] = {
> [DEV_ST_TRANSITION_FP] = "FLASH PROGRAMMER",
> [DEV_ST_TRANSITION_SYS_ERR] = "SYS ERROR",
> [DEV_ST_TRANSITION_DISABLE] = "DISABLE",
> + [DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE] = "DISABLE (DESTROY DEVICE)",
> };
>
> const char * const mhi_ch_state_type_str[MHI_CH_STATE_TYPE_MAX] = {
> diff --git a/drivers/bus/mhi/host/internal.h b/drivers/bus/mhi/host/internal.h
> index 30ac415a3000..3f45c9c447bd 100644
> --- a/drivers/bus/mhi/host/internal.h
> +++ b/drivers/bus/mhi/host/internal.h
> @@ -69,6 +69,7 @@ enum dev_st_transition {
> DEV_ST_TRANSITION_FP,
> DEV_ST_TRANSITION_SYS_ERR,
> DEV_ST_TRANSITION_DISABLE,
> + DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE,
> DEV_ST_TRANSITION_MAX,
> };
>
> diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c
> index a2f2feef1476..8833b0248393 100644
> --- a/drivers/bus/mhi/host/pm.c
> +++ b/drivers/bus/mhi/host/pm.c
> @@ -458,7 +458,8 @@ static int mhi_pm_mission_mode_transition(struct mhi_controller *mhi_cntrl)
> }
>
> /* Handle shutdown transitions */
> -static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
> +static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl,
> + bool destroy_device)
> {
> enum mhi_pm_state cur_state;
> struct mhi_event *mhi_event;
> @@ -520,8 +521,10 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
> dev_dbg(dev, "Waiting for all pending threads to complete\n");
> wake_up_all(&mhi_cntrl->state_event);
>
> - dev_dbg(dev, "Reset all active channels and remove MHI devices\n");
> - device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL, mhi_destroy_device);
> + if (destroy_device) {
> + dev_dbg(dev, "Reset all active channels and remove MHI devices\n");
> + device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL, mhi_destroy_device);
> + }
>
> mutex_lock(&mhi_cntrl->pm_mutex);
>
> @@ -806,7 +809,10 @@ void mhi_pm_st_worker(struct work_struct *work)
> mhi_pm_sys_error_transition(mhi_cntrl);
> break;
> case DEV_ST_TRANSITION_DISABLE:
> - mhi_pm_disable_transition(mhi_cntrl);
> + mhi_pm_disable_transition(mhi_cntrl, false);
> + break;
> + case DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE:
> + mhi_pm_disable_transition(mhi_cntrl, true);
> break;
> default:
> break;
> @@ -1160,7 +1166,8 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
> }
> EXPORT_SYMBOL_GPL(mhi_async_power_up);
>
> -void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
> +void __mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful,
> + bool destroy_device)
> {
> enum mhi_pm_state cur_state, transition_state;
> struct device *dev = &mhi_cntrl->mhi_dev->dev;
> @@ -1196,14 +1203,19 @@ void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
> write_unlock_irq(&mhi_cntrl->pm_lock);
> mutex_unlock(&mhi_cntrl->pm_mutex);
>
> - mhi_queue_state_transition(mhi_cntrl, DEV_ST_TRANSITION_DISABLE);
> + if (destroy_device)
> + mhi_queue_state_transition(mhi_cntrl,
> + DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE);
> + else
> + mhi_queue_state_transition(mhi_cntrl,
> + DEV_ST_TRANSITION_DISABLE);
>
> /* Wait for shutdown to complete */
> flush_work(&mhi_cntrl->st_worker);
>
> disable_irq(mhi_cntrl->irq[0]);
> }
> -EXPORT_SYMBOL_GPL(mhi_power_down);
> +EXPORT_SYMBOL_GPL(__mhi_power_down);
>
> int mhi_sync_power_up(struct mhi_controller *mhi_cntrl)
> {
> diff --git a/include/linux/mhi.h b/include/linux/mhi.h
> index d0f9b522f328..ae092bc8b97e 100644
> --- a/include/linux/mhi.h
> +++ b/include/linux/mhi.h
> @@ -648,12 +648,37 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl);
> */
> int mhi_sync_power_up(struct mhi_controller *mhi_cntrl);
>
> +void __mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful,
> + bool destroy_device);
> +
> /**
> - * mhi_power_down - Start MHI power down sequence
> + * mhi_power_down - Start MHI power down sequence. See also
> + * mhi_power_down_no_destroy() which is a variant of this for suspend.
> + *
> * @mhi_cntrl: MHI controller
> * @graceful: Link is still accessible, so do a graceful shutdown process
> */
> -void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful);
> +static inline void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
> +{
> + __mhi_power_down(mhi_cntrl, graceful, true);
> +}
> +
> +/**
> + * mhi_power_down_no_destroy - Start MHI power down sequence but don't
> + * destroy struct devices. This is a variant for mhi_power_down() and is a
> + * workaround to make it possible to use mhi_power_up() in a resume
> + * handler. When using this variant the caller must also call
> + * mhi_prepare_all_for_transfer_autoqueue() and
> + * mhi_unprepare_all_from_transfer().
> + *
> + * @mhi_cntrl: MHI controller
> + * @graceful: Link is still accessible, so do a graceful shutdown process
> + */
> +static inline void mhi_power_down_no_destroy(struct mhi_controller *mhi_cntrl,
> + bool graceful)
> +{
> + __mhi_power_down(mhi_cntrl, graceful, false);
> +}
>
> /**
> * mhi_unprepare_after_power_down - Free any allocated memory after power down
> --
> 2.39.2
>
>

--
மணிவண்ணன் சதாசிவம்

2023-12-01 01:09:10

by Baochen Qiang

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()



On 11/30/2023 1:42 PM, Manivannan Sadhasivam wrote:
> On Mon, Nov 27, 2023 at 06:20:15PM +0200, Kalle Valo wrote:
>> From: Baochen Qiang <[email protected]>
>>
>> If ath11k tries to call mhi_power_up() during resume it fails:
>>
>> ath11k_pci 0000:06:00.0: timeout while waiting for restart complete
>>
>> This happens because when calling mhi_power_up() the MHI subsystem eventually
>> calls device_add() from mhi_create_devices() but the device creation is
>> deferred:
>>
>> mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral
>>
>> The reason for deferring device creation is explained in dpm_prepare():
>>
>> /*
>> * It is unsafe if probing of devices will happen during suspend or
>> * hibernation and system behavior will be unpredictable in this case.
>> * So, let's prohibit device's probing here and defer their probes
>> * instead. The normal behavior will be restored in dpm_complete().
>> */
>>
>> Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not called and
>> qcom_mhi_qrtr_dl_callback() fails silently as qdev is zero:
>>
>> static void qcom_mhi_qrtr_dl_callback(struct mhi_device *mhi_dev,
>> struct mhi_result *mhi_res)
>> {
>> struct qrtr_mhi_dev *qdev = dev_get_drvdata(&mhi_dev->dev);
>> int rc;
>>
>> if (!qdev || mhi_res->transaction_status)
>> return;
>>
>> So what this means that QRTR is not delivering messages and the QMI connection
>> is not working between ath11k and the firmware, resulting a failure in firmware
>> initialisation.
>>
>> To fix this add new function mhi_power_down_no_destroy() which does not destroy
>> the devices during power down. This way mhi_power_up() can be called during
>> resume and we can get ath11k hibernation working with the following patches.
>>
>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
>>
>> Signed-off-by: Baochen Qiang <[email protected]>
>> Signed-off-by: Kalle Valo <[email protected]>
>
> Any reason for reposting this series without discussing the suggestion from
> Mayank? As I said in the internal thread, this patch breaks the Linux device
> driver model by not destroying the "struct device" when the actual device gets
> removed. We should try to explore alternate options instead of persisting with
> this solution.
>
> - Mani
I shared my opinion on Mayank's proposal, but not noticed it is in
internal thead. I will repost here:

Mayank's proposal assumes that at the time PM_POST_HIBERNATION is
posted, all deferred probe is unblocked and completed, however which is
not true.

Yes, kernel's sequence is first unblock probe defer and trigger probe
again, then post PM_POST_HIBERNATION message. But the problem here is
that the probe-trigger is delegated to an work item and returns
immediately, kernel does not wait for all probe to complete before
posting PM_POST_HIBERNATION. So it is not guaranteed that an MHI device
is probed at the time we get that message.
>
>> ---
>> drivers/bus/mhi/host/init.c | 1 +
>> drivers/bus/mhi/host/internal.h | 1 +
>> drivers/bus/mhi/host/pm.c | 26 +++++++++++++++++++-------
>> include/linux/mhi.h | 29 +++++++++++++++++++++++++++--
>> 4 files changed, 48 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
>> index 65ceac1837f9..e626b03ffafa 100644
>> --- a/drivers/bus/mhi/host/init.c
>> +++ b/drivers/bus/mhi/host/init.c
>> @@ -43,6 +43,7 @@ const char * const dev_state_tran_str[DEV_ST_TRANSITION_MAX] = {
>> [DEV_ST_TRANSITION_FP] = "FLASH PROGRAMMER",
>> [DEV_ST_TRANSITION_SYS_ERR] = "SYS ERROR",
>> [DEV_ST_TRANSITION_DISABLE] = "DISABLE",
>> + [DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE] = "DISABLE (DESTROY DEVICE)",
>> };
>>
>> const char * const mhi_ch_state_type_str[MHI_CH_STATE_TYPE_MAX] = {
>> diff --git a/drivers/bus/mhi/host/internal.h b/drivers/bus/mhi/host/internal.h
>> index 30ac415a3000..3f45c9c447bd 100644
>> --- a/drivers/bus/mhi/host/internal.h
>> +++ b/drivers/bus/mhi/host/internal.h
>> @@ -69,6 +69,7 @@ enum dev_st_transition {
>> DEV_ST_TRANSITION_FP,
>> DEV_ST_TRANSITION_SYS_ERR,
>> DEV_ST_TRANSITION_DISABLE,
>> + DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE,
>> DEV_ST_TRANSITION_MAX,
>> };
>>
>> diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c
>> index a2f2feef1476..8833b0248393 100644
>> --- a/drivers/bus/mhi/host/pm.c
>> +++ b/drivers/bus/mhi/host/pm.c
>> @@ -458,7 +458,8 @@ static int mhi_pm_mission_mode_transition(struct mhi_controller *mhi_cntrl)
>> }
>>
>> /* Handle shutdown transitions */
>> -static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
>> +static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl,
>> + bool destroy_device)
>> {
>> enum mhi_pm_state cur_state;
>> struct mhi_event *mhi_event;
>> @@ -520,8 +521,10 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
>> dev_dbg(dev, "Waiting for all pending threads to complete\n");
>> wake_up_all(&mhi_cntrl->state_event);
>>
>> - dev_dbg(dev, "Reset all active channels and remove MHI devices\n");
>> - device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL, mhi_destroy_device);
>> + if (destroy_device) {
>> + dev_dbg(dev, "Reset all active channels and remove MHI devices\n");
>> + device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL, mhi_destroy_device);
>> + }
>>
>> mutex_lock(&mhi_cntrl->pm_mutex);
>>
>> @@ -806,7 +809,10 @@ void mhi_pm_st_worker(struct work_struct *work)
>> mhi_pm_sys_error_transition(mhi_cntrl);
>> break;
>> case DEV_ST_TRANSITION_DISABLE:
>> - mhi_pm_disable_transition(mhi_cntrl);
>> + mhi_pm_disable_transition(mhi_cntrl, false);
>> + break;
>> + case DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE:
>> + mhi_pm_disable_transition(mhi_cntrl, true);
>> break;
>> default:
>> break;
>> @@ -1160,7 +1166,8 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
>> }
>> EXPORT_SYMBOL_GPL(mhi_async_power_up);
>>
>> -void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
>> +void __mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful,
>> + bool destroy_device)
>> {
>> enum mhi_pm_state cur_state, transition_state;
>> struct device *dev = &mhi_cntrl->mhi_dev->dev;
>> @@ -1196,14 +1203,19 @@ void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
>> write_unlock_irq(&mhi_cntrl->pm_lock);
>> mutex_unlock(&mhi_cntrl->pm_mutex);
>>
>> - mhi_queue_state_transition(mhi_cntrl, DEV_ST_TRANSITION_DISABLE);
>> + if (destroy_device)
>> + mhi_queue_state_transition(mhi_cntrl,
>> + DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE);
>> + else
>> + mhi_queue_state_transition(mhi_cntrl,
>> + DEV_ST_TRANSITION_DISABLE);
>>
>> /* Wait for shutdown to complete */
>> flush_work(&mhi_cntrl->st_worker);
>>
>> disable_irq(mhi_cntrl->irq[0]);
>> }
>> -EXPORT_SYMBOL_GPL(mhi_power_down);
>> +EXPORT_SYMBOL_GPL(__mhi_power_down);
>>
>> int mhi_sync_power_up(struct mhi_controller *mhi_cntrl)
>> {
>> diff --git a/include/linux/mhi.h b/include/linux/mhi.h
>> index d0f9b522f328..ae092bc8b97e 100644
>> --- a/include/linux/mhi.h
>> +++ b/include/linux/mhi.h
>> @@ -648,12 +648,37 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl);
>> */
>> int mhi_sync_power_up(struct mhi_controller *mhi_cntrl);
>>
>> +void __mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful,
>> + bool destroy_device);
>> +
>> /**
>> - * mhi_power_down - Start MHI power down sequence
>> + * mhi_power_down - Start MHI power down sequence. See also
>> + * mhi_power_down_no_destroy() which is a variant of this for suspend.
>> + *
>> * @mhi_cntrl: MHI controller
>> * @graceful: Link is still accessible, so do a graceful shutdown process
>> */
>> -void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful);
>> +static inline void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
>> +{
>> + __mhi_power_down(mhi_cntrl, graceful, true);
>> +}
>> +
>> +/**
>> + * mhi_power_down_no_destroy - Start MHI power down sequence but don't
>> + * destroy struct devices. This is a variant for mhi_power_down() and is a
>> + * workaround to make it possible to use mhi_power_up() in a resume
>> + * handler. When using this variant the caller must also call
>> + * mhi_prepare_all_for_transfer_autoqueue() and
>> + * mhi_unprepare_all_from_transfer().
>> + *
>> + * @mhi_cntrl: MHI controller
>> + * @graceful: Link is still accessible, so do a graceful shutdown process
>> + */
>> +static inline void mhi_power_down_no_destroy(struct mhi_controller *mhi_cntrl,
>> + bool graceful)
>> +{
>> + __mhi_power_down(mhi_cntrl, graceful, false);
>> +}
>>
>> /**
>> * mhi_unprepare_after_power_down - Free any allocated memory after power down
>> --
>> 2.39.2
>>
>>
>

2023-12-05 12:29:42

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

Manivannan Sadhasivam <[email protected]> writes:

> On Mon, Nov 27, 2023 at 06:20:15PM +0200, Kalle Valo wrote:
>
>> From: Baochen Qiang <[email protected]>
>>
>> If ath11k tries to call mhi_power_up() during resume it fails:
>>
>> ath11k_pci 0000:06:00.0: timeout while waiting for restart complete
>>
>> This happens because when calling mhi_power_up() the MHI subsystem eventually
>> calls device_add() from mhi_create_devices() but the device creation is
>> deferred:
>>
>> mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral
>>
>> The reason for deferring device creation is explained in dpm_prepare():
>>
>> /*
>> * It is unsafe if probing of devices will happen during suspend or
>> * hibernation and system behavior will be unpredictable in this case.
>> * So, let's prohibit device's probing here and defer their probes
>> * instead. The normal behavior will be restored in dpm_complete().
>> */
>>
>> Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not called and
>> qcom_mhi_qrtr_dl_callback() fails silently as qdev is zero:
>>
>> static void qcom_mhi_qrtr_dl_callback(struct mhi_device *mhi_dev,
>> struct mhi_result *mhi_res)
>> {
>> struct qrtr_mhi_dev *qdev = dev_get_drvdata(&mhi_dev->dev);
>> int rc;
>>
>> if (!qdev || mhi_res->transaction_status)
>> return;
>>
>> So what this means that QRTR is not delivering messages and the QMI connection
>> is not working between ath11k and the firmware, resulting a failure in firmware
>> initialisation.
>>
>> To fix this add new function mhi_power_down_no_destroy() which does not destroy
>> the devices during power down. This way mhi_power_up() can be called during
>> resume and we can get ath11k hibernation working with the following patches.
>>
>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
>>
>> Signed-off-by: Baochen Qiang <[email protected]>
>> Signed-off-by: Kalle Valo <[email protected]>
>
> Any reason for reposting this series without discussing the suggestion from
> Mayank?

Baochen quickly sent me fixes for the v1 review comments, as I have been
out of office for some time I didn't want to sit on Baochen's fixes for
too long. Better to get them out of the door as soon as possible. I will
definitely look at Mayank's proposal but that will take longer.

> As I said in the internal thread, this patch breaks the Linux device
> driver model by not destroying the "struct device" when the actual
> device gets removed.

This patchset has been tested by several people, I'm even using this
patchset on main laptop every day, and we haven't noticed any issues.

Can you elaborate more about this driver model? We are not removing any
ath11k devices, we just want to power down the ath11k (and in the future
ath12k) devices for suspend and power up during resume.

> We should try to explore alternate options instead of persisting with
> this solution.

What other options we have here? At least Baochen is not optimistic that
using PM_POST_HIBERNATION as a workaround would work. The issue we have
here is that mhi_power_up() doesn't work in the resume handler and
that's what we should try to fix, not make workarounds.

--
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2023-12-18 16:30:16

by Jeff Johnson

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

On 12/5/2023 4:29 AM, Kalle Valo wrote:
> Manivannan Sadhasivam <[email protected]> writes:
>
>> On Mon, Nov 27, 2023 at 06:20:15PM +0200, Kalle Valo wrote:
>>
>>> From: Baochen Qiang <[email protected]>
>>>
>>> If ath11k tries to call mhi_power_up() during resume it fails:
>>>
>>> ath11k_pci 0000:06:00.0: timeout while waiting for restart complete
>>>
>>> This happens because when calling mhi_power_up() the MHI subsystem eventually
>>> calls device_add() from mhi_create_devices() but the device creation is
>>> deferred:
>>>
>>> mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral
>>>
>>> The reason for deferring device creation is explained in dpm_prepare():
>>>
>>> /*
>>> * It is unsafe if probing of devices will happen during suspend or
>>> * hibernation and system behavior will be unpredictable in this case.
>>> * So, let's prohibit device's probing here and defer their probes
>>> * instead. The normal behavior will be restored in dpm_complete().
>>> */
>>>
>>> Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not called and
>>> qcom_mhi_qrtr_dl_callback() fails silently as qdev is zero:
>>>
>>> static void qcom_mhi_qrtr_dl_callback(struct mhi_device *mhi_dev,
>>> struct mhi_result *mhi_res)
>>> {
>>> struct qrtr_mhi_dev *qdev = dev_get_drvdata(&mhi_dev->dev);
>>> int rc;
>>>
>>> if (!qdev || mhi_res->transaction_status)
>>> return;
>>>
>>> So what this means that QRTR is not delivering messages and the QMI connection
>>> is not working between ath11k and the firmware, resulting a failure in firmware
>>> initialisation.
>>>
>>> To fix this add new function mhi_power_down_no_destroy() which does not destroy
>>> the devices during power down. This way mhi_power_up() can be called during
>>> resume and we can get ath11k hibernation working with the following patches.
>>>
>>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
>>>
>>> Signed-off-by: Baochen Qiang <[email protected]>
>>> Signed-off-by: Kalle Valo <[email protected]>
>>
>> Any reason for reposting this series without discussing the suggestion from
>> Mayank?
>
> Baochen quickly sent me fixes for the v1 review comments, as I have been
> out of office for some time I didn't want to sit on Baochen's fixes for
> too long. Better to get them out of the door as soon as possible. I will
> definitely look at Mayank's proposal but that will take longer.
>
>> As I said in the internal thread, this patch breaks the Linux device
>> driver model by not destroying the "struct device" when the actual
>> device gets removed.
>
> This patchset has been tested by several people, I'm even using this
> patchset on main laptop every day, and we haven't noticed any issues.
>
> Can you elaborate more about this driver model? We are not removing any
> ath11k devices, we just want to power down the ath11k (and in the future
> ath12k) devices for suspend and power up during resume.
>
>> We should try to explore alternate options instead of persisting with
>> this solution.
>
> What other options we have here? At least Baochen is not optimistic that
> using PM_POST_HIBERNATION as a workaround would work. The issue we have
> here is that mhi_power_up() doesn't work in the resume handler and
> that's what we should try to fix, not make workarounds.
>

Adding Mayank directly plus others to this discussion since we need a
solution to have proper hibernation support for devices containing
ath11k (and in the near future ath12k).

/jeff
/jeff


2023-12-20 16:48:23

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

On Tue, Dec 05, 2023 at 02:29:33PM +0200, Kalle Valo wrote:
> Manivannan Sadhasivam <[email protected]> writes:
>
> > On Mon, Nov 27, 2023 at 06:20:15PM +0200, Kalle Valo wrote:
> >
> >> From: Baochen Qiang <[email protected]>
> >>
> >> If ath11k tries to call mhi_power_up() during resume it fails:
> >>
> >> ath11k_pci 0000:06:00.0: timeout while waiting for restart complete
> >>
> >> This happens because when calling mhi_power_up() the MHI subsystem eventually
> >> calls device_add() from mhi_create_devices() but the device creation is
> >> deferred:
> >>
> >> mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral
> >>
> >> The reason for deferring device creation is explained in dpm_prepare():
> >>
> >> /*
> >> * It is unsafe if probing of devices will happen during suspend or
> >> * hibernation and system behavior will be unpredictable in this case.
> >> * So, let's prohibit device's probing here and defer their probes
> >> * instead. The normal behavior will be restored in dpm_complete().
> >> */
> >>
> >> Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not called and
> >> qcom_mhi_qrtr_dl_callback() fails silently as qdev is zero:
> >>
> >> static void qcom_mhi_qrtr_dl_callback(struct mhi_device *mhi_dev,
> >> struct mhi_result *mhi_res)
> >> {
> >> struct qrtr_mhi_dev *qdev = dev_get_drvdata(&mhi_dev->dev);
> >> int rc;
> >>
> >> if (!qdev || mhi_res->transaction_status)
> >> return;
> >>
> >> So what this means that QRTR is not delivering messages and the QMI connection
> >> is not working between ath11k and the firmware, resulting a failure in firmware
> >> initialisation.
> >>
> >> To fix this add new function mhi_power_down_no_destroy() which does not destroy
> >> the devices during power down. This way mhi_power_up() can be called during
> >> resume and we can get ath11k hibernation working with the following patches.
> >>
> >> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
> >>
> >> Signed-off-by: Baochen Qiang <[email protected]>
> >> Signed-off-by: Kalle Valo <[email protected]>
> >
> > Any reason for reposting this series without discussing the suggestion from
> > Mayank?
>
> Baochen quickly sent me fixes for the v1 review comments, as I have been
> out of office for some time I didn't want to sit on Baochen's fixes for
> too long. Better to get them out of the door as soon as possible. I will
> definitely look at Mayank's proposal but that will take longer.
>
> > As I said in the internal thread, this patch breaks the Linux device
> > driver model by not destroying the "struct device" when the actual
> > device gets removed.
>
> This patchset has been tested by several people, I'm even using this
> patchset on main laptop every day, and we haven't noticed any issues.
>
> Can you elaborate more about this driver model? We are not removing any
> ath11k devices, we just want to power down the ath11k (and in the future
> ath12k) devices for suspend and power up during resume.
>

Devices (struct dev) for each channels are created once the device (WLAN) enters
runtime mode such as (MISSION, SBL etc...). During hibernation, ath11k stack
calls mhi_power_down() which essentially resets the device to POR and also the
stack powers down the device properly.

In that case, MHI channels do not exist as the device (WLAN) itself is powered
down. As per kernel driver model, each struct device is tied to its reference
count. And the reference count should be decremented whenever the actual device
is not in use. Once the actual device is removed from the system, then the
respective struct device has to be destroyed altogether.

So in this case, even though the channels are not active (present) in the
device, the device itself gets powered off, you want MHI stack to keep the
struct device active, which is against the model I referenced above.

To fix this issue properly, we need to investigate on how other subsystems are
handling this situation (device getting powered down during hibernation), like
USB.

- Mani

> > We should try to explore alternate options instead of persisting with
> > this solution.
>
> What other options we have here? At least Baochen is not optimistic that
> using PM_POST_HIBERNATION as a workaround would work. The issue we have
> here is that mhi_power_up() doesn't work in the resume handler and
> that's what we should try to fix, not make workarounds.
>
> --
> https://patchwork.kernel.org/project/linux-wireless/list/
>
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

--
மணிவண்ணன் சதாசிவம்

2023-12-20 16:51:29

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

On Wed, Dec 20, 2023 at 10:02:25PM +0530, Manivannan Sadhasivam wrote:
> On Tue, Dec 05, 2023 at 02:29:33PM +0200, Kalle Valo wrote:
> > Manivannan Sadhasivam <[email protected]> writes:
> >
> > > On Mon, Nov 27, 2023 at 06:20:15PM +0200, Kalle Valo wrote:
> > >
> > >> From: Baochen Qiang <[email protected]>
> > >>
> > >> If ath11k tries to call mhi_power_up() during resume it fails:
> > >>
> > >> ath11k_pci 0000:06:00.0: timeout while waiting for restart complete
> > >>
> > >> This happens because when calling mhi_power_up() the MHI subsystem eventually
> > >> calls device_add() from mhi_create_devices() but the device creation is
> > >> deferred:
> > >>
> > >> mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral
> > >>
> > >> The reason for deferring device creation is explained in dpm_prepare():
> > >>
> > >> /*
> > >> * It is unsafe if probing of devices will happen during suspend or
> > >> * hibernation and system behavior will be unpredictable in this case.
> > >> * So, let's prohibit device's probing here and defer their probes
> > >> * instead. The normal behavior will be restored in dpm_complete().
> > >> */
> > >>
> > >> Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not called and
> > >> qcom_mhi_qrtr_dl_callback() fails silently as qdev is zero:
> > >>
> > >> static void qcom_mhi_qrtr_dl_callback(struct mhi_device *mhi_dev,
> > >> struct mhi_result *mhi_res)
> > >> {
> > >> struct qrtr_mhi_dev *qdev = dev_get_drvdata(&mhi_dev->dev);
> > >> int rc;
> > >>
> > >> if (!qdev || mhi_res->transaction_status)
> > >> return;
> > >>
> > >> So what this means that QRTR is not delivering messages and the QMI connection
> > >> is not working between ath11k and the firmware, resulting a failure in firmware
> > >> initialisation.
> > >>
> > >> To fix this add new function mhi_power_down_no_destroy() which does not destroy
> > >> the devices during power down. This way mhi_power_up() can be called during
> > >> resume and we can get ath11k hibernation working with the following patches.
> > >>
> > >> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
> > >>
> > >> Signed-off-by: Baochen Qiang <[email protected]>
> > >> Signed-off-by: Kalle Valo <[email protected]>
> > >
> > > Any reason for reposting this series without discussing the suggestion from
> > > Mayank?
> >
> > Baochen quickly sent me fixes for the v1 review comments, as I have been
> > out of office for some time I didn't want to sit on Baochen's fixes for
> > too long. Better to get them out of the door as soon as possible. I will
> > definitely look at Mayank's proposal but that will take longer.
> >
> > > As I said in the internal thread, this patch breaks the Linux device
> > > driver model by not destroying the "struct device" when the actual
> > > device gets removed.
> >
> > This patchset has been tested by several people, I'm even using this
> > patchset on main laptop every day, and we haven't noticed any issues.
> >
> > Can you elaborate more about this driver model? We are not removing any
> > ath11k devices, we just want to power down the ath11k (and in the future
> > ath12k) devices for suspend and power up during resume.
> >
>
> Devices (struct dev) for each channels are created once the device (WLAN) enters
> runtime mode such as (MISSION, SBL etc...). During hibernation, ath11k stack
> calls mhi_power_down() which essentially resets the device to POR and also the
> stack powers down the device properly.
>
> In that case, MHI channels do not exist as the device (WLAN) itself is powered
> down. As per kernel driver model, each struct device is tied to its reference
> count. And the reference count should be decremented whenever the actual device
> is not in use. Once the actual device is removed from the system, then the
> respective struct device has to be destroyed altogether.
>
> So in this case, even though the channels are not active (present) in the
> device, the device itself gets powered off, you want MHI stack to keep the
> struct device active, which is against the model I referenced above.
>
> To fix this issue properly, we need to investigate on how other subsystems are
> handling this situation (device getting powered down during hibernation), like
> USB.
>

To me it all sounds like the probe deferral is not handled properly in mac80211
stack. As you mentioned in the commit message that the dpm_prepare() blocks
probing of devices. It gets unblocked and trigerred in dpm_complete():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/power/main.c#n1131

So if mac80211/ath11k cannot probe the devices at the dpm_complete() stage, then
it is definitely an issue that needs to be fixed properly.

- Mani

> - Mani
>
> > > We should try to explore alternate options instead of persisting with
> > > this solution.
> >
> > What other options we have here? At least Baochen is not optimistic that
> > using PM_POST_HIBERNATION as a workaround would work. The issue we have
> > here is that mhi_power_up() doesn't work in the resume handler and
> > that's what we should try to fix, not make workarounds.
> >
> > --
> > https://patchwork.kernel.org/project/linux-wireless/list/
> >
> > https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
>
> --
> மணிவண்ணன் சதாசிவம்

--
மணிவண்ணன் சதாசிவம்

2023-12-21 11:05:27

by Baochen Qiang

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()



On 12/21/2023 12:51 AM, Manivannan Sadhasivam wrote:
> On Wed, Dec 20, 2023 at 10:02:25PM +0530, Manivannan Sadhasivam wrote:
>> On Tue, Dec 05, 2023 at 02:29:33PM +0200, Kalle Valo wrote:
>>> Manivannan Sadhasivam <[email protected]> writes:
>>>
>>>> On Mon, Nov 27, 2023 at 06:20:15PM +0200, Kalle Valo wrote:
>>>>
>>>>> From: Baochen Qiang <[email protected]>
>>>>>
>>>>> If ath11k tries to call mhi_power_up() during resume it fails:
>>>>>
>>>>> ath11k_pci 0000:06:00.0: timeout while waiting for restart complete
>>>>>
>>>>> This happens because when calling mhi_power_up() the MHI subsystem eventually
>>>>> calls device_add() from mhi_create_devices() but the device creation is
>>>>> deferred:
>>>>>
>>>>> mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral
>>>>>
>>>>> The reason for deferring device creation is explained in dpm_prepare():
>>>>>
>>>>> /*
>>>>> * It is unsafe if probing of devices will happen during suspend or
>>>>> * hibernation and system behavior will be unpredictable in this case.
>>>>> * So, let's prohibit device's probing here and defer their probes
>>>>> * instead. The normal behavior will be restored in dpm_complete().
>>>>> */
>>>>>
>>>>> Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not called and
>>>>> qcom_mhi_qrtr_dl_callback() fails silently as qdev is zero:
>>>>>
>>>>> static void qcom_mhi_qrtr_dl_callback(struct mhi_device *mhi_dev,
>>>>> struct mhi_result *mhi_res)
>>>>> {
>>>>> struct qrtr_mhi_dev *qdev = dev_get_drvdata(&mhi_dev->dev);
>>>>> int rc;
>>>>>
>>>>> if (!qdev || mhi_res->transaction_status)
>>>>> return;
>>>>>
>>>>> So what this means that QRTR is not delivering messages and the QMI connection
>>>>> is not working between ath11k and the firmware, resulting a failure in firmware
>>>>> initialisation.
>>>>>
>>>>> To fix this add new function mhi_power_down_no_destroy() which does not destroy
>>>>> the devices during power down. This way mhi_power_up() can be called during
>>>>> resume and we can get ath11k hibernation working with the following patches.
>>>>>
>>>>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
>>>>>
>>>>> Signed-off-by: Baochen Qiang <[email protected]>
>>>>> Signed-off-by: Kalle Valo <[email protected]>
>>>>
>>>> Any reason for reposting this series without discussing the suggestion from
>>>> Mayank?
>>>
>>> Baochen quickly sent me fixes for the v1 review comments, as I have been
>>> out of office for some time I didn't want to sit on Baochen's fixes for
>>> too long. Better to get them out of the door as soon as possible. I will
>>> definitely look at Mayank's proposal but that will take longer.
>>>
>>>> As I said in the internal thread, this patch breaks the Linux device
>>>> driver model by not destroying the "struct device" when the actual
>>>> device gets removed.
>>>
>>> This patchset has been tested by several people, I'm even using this
>>> patchset on main laptop every day, and we haven't noticed any issues.
>>>
>>> Can you elaborate more about this driver model? We are not removing any
>>> ath11k devices, we just want to power down the ath11k (and in the future
>>> ath12k) devices for suspend and power up during resume.
>>>
>>
>> Devices (struct dev) for each channels are created once the device (WLAN) enters
>> runtime mode such as (MISSION, SBL etc...). During hibernation, ath11k stack
>> calls mhi_power_down() which essentially resets the device to POR and also the
>> stack powers down the device properly.
>>
>> In that case, MHI channels do not exist as the device (WLAN) itself is powered
>> down. As per kernel driver model, each struct device is tied to its reference
>> count. And the reference count should be decremented whenever the actual device
>> is not in use. Once the actual device is removed from the system, then the
>> respective struct device has to be destroyed altogether.
>>
>> So in this case, even though the channels are not active (present) in the
>> device, the device itself gets powered off, you want MHI stack to keep the
>> struct device active, which is against the model I referenced above.
>>
>> To fix this issue properly, we need to investigate on how other subsystems are
>> handling this situation (device getting powered down during hibernation), like
>> USB.
>>
>
> To me it all sounds like the probe deferral is not handled properly in mac80211
> stack. As you mentioned in the commit message that the dpm_prepare() blocks
> probing of devices. It gets unblocked and trigerred in dpm_complete():
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/power/main.c#n1131
>
> So if mac80211/ath11k cannot probe the devices at the dpm_complete() stage, then
> it is definitely an issue that needs to be fixed properly.
To clarify, ath11k CAN probe the devices at dpm_complete() stage. The
problem is kernel does not wait for all probes to finish, and in that
way we will face the issue that user space applications are likely to
fail because they get thawed BEFORE WLAN is ready.

>
> - Mani
>
>> - Mani
>>
>>>> We should try to explore alternate options instead of persisting with
>>>> this solution.
>>>
>>> What other options we have here? At least Baochen is not optimistic that
>>> using PM_POST_HIBERNATION as a workaround would work. The issue we have
>>> here is that mhi_power_up() doesn't work in the resume handler and
>>> that's what we should try to fix, not make workarounds.
>>>
>>> --
>>> https://patchwork.kernel.org/project/linux-wireless/list/
>>>
>>> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
>>
>> --
>> மணிவண்ணன் சதாசிவம்
>

2024-01-04 06:09:22

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

On Thu, Dec 21, 2023 at 07:05:09PM +0800, Baochen Qiang wrote:
>
>
> On 12/21/2023 12:51 AM, Manivannan Sadhasivam wrote:
> > On Wed, Dec 20, 2023 at 10:02:25PM +0530, Manivannan Sadhasivam wrote:
> > > On Tue, Dec 05, 2023 at 02:29:33PM +0200, Kalle Valo wrote:
> > > > Manivannan Sadhasivam <[email protected]> writes:
> > > >
> > > > > On Mon, Nov 27, 2023 at 06:20:15PM +0200, Kalle Valo wrote:
> > > > >
> > > > > > From: Baochen Qiang <[email protected]>
> > > > > >
> > > > > > If ath11k tries to call mhi_power_up() during resume it fails:
> > > > > >
> > > > > > ath11k_pci 0000:06:00.0: timeout while waiting for restart complete
> > > > > >
> > > > > > This happens because when calling mhi_power_up() the MHI subsystem eventually
> > > > > > calls device_add() from mhi_create_devices() but the device creation is
> > > > > > deferred:
> > > > > >
> > > > > > mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral
> > > > > >
> > > > > > The reason for deferring device creation is explained in dpm_prepare():
> > > > > >
> > > > > > /*
> > > > > > * It is unsafe if probing of devices will happen during suspend or
> > > > > > * hibernation and system behavior will be unpredictable in this case.
> > > > > > * So, let's prohibit device's probing here and defer their probes
> > > > > > * instead. The normal behavior will be restored in dpm_complete().
> > > > > > */
> > > > > >
> > > > > > Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not called and
> > > > > > qcom_mhi_qrtr_dl_callback() fails silently as qdev is zero:
> > > > > >
> > > > > > static void qcom_mhi_qrtr_dl_callback(struct mhi_device *mhi_dev,
> > > > > > struct mhi_result *mhi_res)
> > > > > > {
> > > > > > struct qrtr_mhi_dev *qdev = dev_get_drvdata(&mhi_dev->dev);
> > > > > > int rc;
> > > > > >
> > > > > > if (!qdev || mhi_res->transaction_status)
> > > > > > return;
> > > > > >
> > > > > > So what this means that QRTR is not delivering messages and the QMI connection
> > > > > > is not working between ath11k and the firmware, resulting a failure in firmware
> > > > > > initialisation.
> > > > > >
> > > > > > To fix this add new function mhi_power_down_no_destroy() which does not destroy
> > > > > > the devices during power down. This way mhi_power_up() can be called during
> > > > > > resume and we can get ath11k hibernation working with the following patches.
> > > > > >
> > > > > > Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
> > > > > >
> > > > > > Signed-off-by: Baochen Qiang <[email protected]>
> > > > > > Signed-off-by: Kalle Valo <[email protected]>
> > > > >
> > > > > Any reason for reposting this series without discussing the suggestion from
> > > > > Mayank?
> > > >
> > > > Baochen quickly sent me fixes for the v1 review comments, as I have been
> > > > out of office for some time I didn't want to sit on Baochen's fixes for
> > > > too long. Better to get them out of the door as soon as possible. I will
> > > > definitely look at Mayank's proposal but that will take longer.
> > > >
> > > > > As I said in the internal thread, this patch breaks the Linux device
> > > > > driver model by not destroying the "struct device" when the actual
> > > > > device gets removed.
> > > >
> > > > This patchset has been tested by several people, I'm even using this
> > > > patchset on main laptop every day, and we haven't noticed any issues.
> > > >
> > > > Can you elaborate more about this driver model? We are not removing any
> > > > ath11k devices, we just want to power down the ath11k (and in the future
> > > > ath12k) devices for suspend and power up during resume.
> > > >
> > >
> > > Devices (struct dev) for each channels are created once the device (WLAN) enters
> > > runtime mode such as (MISSION, SBL etc...). During hibernation, ath11k stack
> > > calls mhi_power_down() which essentially resets the device to POR and also the
> > > stack powers down the device properly.
> > >
> > > In that case, MHI channels do not exist as the device (WLAN) itself is powered
> > > down. As per kernel driver model, each struct device is tied to its reference
> > > count. And the reference count should be decremented whenever the actual device
> > > is not in use. Once the actual device is removed from the system, then the
> > > respective struct device has to be destroyed altogether.
> > >
> > > So in this case, even though the channels are not active (present) in the
> > > device, the device itself gets powered off, you want MHI stack to keep the
> > > struct device active, which is against the model I referenced above.
> > >
> > > To fix this issue properly, we need to investigate on how other subsystems are
> > > handling this situation (device getting powered down during hibernation), like
> > > USB.
> > >
> >
> > To me it all sounds like the probe deferral is not handled properly in mac80211
> > stack. As you mentioned in the commit message that the dpm_prepare() blocks
> > probing of devices. It gets unblocked and trigerred in dpm_complete():
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/power/main.c#n1131
> >
> > So if mac80211/ath11k cannot probe the devices at the dpm_complete() stage, then
> > it is definitely an issue that needs to be fixed properly.
> To clarify, ath11k CAN probe the devices at dpm_complete() stage. The
> problem is kernel does not wait for all probes to finish, and in that way we
> will face the issue that user space applications are likely to fail because
> they get thawed BEFORE WLAN is ready.
>

Hmm. Please give me some time to reproduce this issue locally. I will get back
to this thread with my analysis.

- Mani

> >
> > - Mani
> >
> > > - Mani
> > >
> > > > > We should try to explore alternate options instead of persisting with
> > > > > this solution.
> > > >
> > > > What other options we have here? At least Baochen is not optimistic that
> > > > using PM_POST_HIBERNATION as a workaround would work. The issue we have
> > > > here is that mhi_power_up() doesn't work in the resume handler and
> > > > that's what we should try to fix, not make workarounds.
> > > >
> > > > --
> > > > https://patchwork.kernel.org/project/linux-wireless/list/
> > > >
> > > > https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
> > >
> > > --
> > > மணிவண்ணன் சதாசிவம்
> >
>

--
மணிவண்ணன் சதாசிவம்

2024-01-22 06:24:29

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

On Thu, Jan 04, 2024 at 11:39:12AM +0530, Manivannan Sadhasivam wrote:

+ Can, Qiang

[...]

> > > To me it all sounds like the probe deferral is not handled properly in mac80211
> > > stack. As you mentioned in the commit message that the dpm_prepare() blocks
> > > probing of devices. It gets unblocked and trigerred in dpm_complete():
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/power/main.c#n1131
> > >
> > > So if mac80211/ath11k cannot probe the devices at the dpm_complete() stage, then
> > > it is definitely an issue that needs to be fixed properly.
> > To clarify, ath11k CAN probe the devices at dpm_complete() stage. The
> > problem is kernel does not wait for all probes to finish, and in that way we
> > will face the issue that user space applications are likely to fail because
> > they get thawed BEFORE WLAN is ready.
> >
>
> Hmm. Please give me some time to reproduce this issue locally. I will get back
> to this thread with my analysis.
>

We reproduced the issue with the help of PCIe team (thanks Can). What we found
out was, during the resume from hibernation the faliure happens in
ath11k_core_resume(). Precisely here:
https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/tree/drivers/net/wireless/ath/ath11k/core.c?h=ath11k-hibernation-support#n850

This code waits for the QMI messages to arrive and eventually timesout. But the
impression I got from the start was that the mhi_power_up() always fails during
resume. In our investigation, we confirmed that the failure is not happening at
the MHI level.

I'm not pointing fingers here, but trying to understand why can't you fix
ath11k_core_resume() to not timeout? IMO this timeout should be handled as a
deferral case.

- Mani

--
மணிவண்ணன் சதாசிவம்

2024-01-22 08:10:20

by Baochen Qiang

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()



On 1/22/2024 2:24 PM, Manivannan Sadhasivam wrote:
> On Thu, Jan 04, 2024 at 11:39:12AM +0530, Manivannan Sadhasivam wrote:
>
> + Can, Qiang
>
> [...]
>
>>>> To me it all sounds like the probe deferral is not handled properly in mac80211
>>>> stack. As you mentioned in the commit message that the dpm_prepare() blocks
>>>> probing of devices. It gets unblocked and trigerred in dpm_complete():
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/power/main.c#n1131
>>>>
>>>> So if mac80211/ath11k cannot probe the devices at the dpm_complete() stage, then
>>>> it is definitely an issue that needs to be fixed properly.
>>> To clarify, ath11k CAN probe the devices at dpm_complete() stage. The
>>> problem is kernel does not wait for all probes to finish, and in that way we
>>> will face the issue that user space applications are likely to fail because
>>> they get thawed BEFORE WLAN is ready.
>>>
>>
>> Hmm. Please give me some time to reproduce this issue locally. I will get back
>> to this thread with my analysis.
>>
>
> We reproduced the issue with the help of PCIe team (thanks Can). What we found
> out was, during the resume from hibernation the faliure happens in
> ath11k_core_resume(). Precisely here:
> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/tree/drivers/net/wireless/ath/ath11k/core.c?h=ath11k-hibernation-support#n850
>
> This code waits for the QMI messages to arrive and eventually timesout. But the
> impression I got from the start was that the mhi_power_up() always fails during
> resume. In our investigation, we confirmed that the failure is not happening at
> the MHI level.No, mhi_power_up() never fails as it only downloads PBL, SBL and waits
for mission mode, no MHI device created hence not affected by the
deferred probe. However in addition to PBL/SBL, ath11k also needs to
download m3.bin, borad.bin and regdb.bin. Those files are part of WLAN
firmware and are downloaded via QMI messages. After mhi_power_up()
succeeds ath11k_core_resume() waits for QMI downloading those files. As
you know QMI relies on MHI channels, these channels are managed by
qcom_mhi_qrtr_driver. Since device probing is deferred,
qcom_mhi_qrtr_driver has no chance to run at this stage. As a result
ath11k_core_resume() times out.

>
> I'm not pointing fingers here, but trying to understand why can't you fix
> ath11k_core_resume() to not timeout? IMO this timeout should be handled as a
> deferral case.
Let's see what happens if we do it in a deferral way:
1. In ath11k_core_resume() we returns success directly without waiting
for QMI downloading other firmware files.
2. Kernel unblocks device probe and schedules a work item to trigger all
deferred probing. As a result MHI devices are probed by
qcom_mhi_qrtr_driver and finally QMI is online.
3. kernel continues to resume and wake up userspace applications.
4. ath11k gets the message, either by kernel PM notification or
something else, that QMI is ready and then downloads other firmware files.

What happens if userspace applications or network stack immediately
initiate some WLAN request after resume back? Can ath11k handle such
request? The answer is, most likely, no. Because there is no guarantee
that QMI finishes downloading before those request.

>
> - Mani
>

2024-01-22 13:09:59

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

On Mon, Jan 22, 2024 at 04:09:53PM +0800, Baochen Qiang wrote:
>
>
> On 1/22/2024 2:24 PM, Manivannan Sadhasivam wrote:
> > On Thu, Jan 04, 2024 at 11:39:12AM +0530, Manivannan Sadhasivam wrote:
> >
> > + Can, Qiang
> >
> > [...]
> >
> > > > > To me it all sounds like the probe deferral is not handled properly in mac80211
> > > > > stack. As you mentioned in the commit message that the dpm_prepare() blocks
> > > > > probing of devices. It gets unblocked and trigerred in dpm_complete():
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/power/main.c#n1131
> > > > >
> > > > > So if mac80211/ath11k cannot probe the devices at the dpm_complete() stage, then
> > > > > it is definitely an issue that needs to be fixed properly.
> > > > To clarify, ath11k CAN probe the devices at dpm_complete() stage. The
> > > > problem is kernel does not wait for all probes to finish, and in that way we
> > > > will face the issue that user space applications are likely to fail because
> > > > they get thawed BEFORE WLAN is ready.
> > > >
> > >
> > > Hmm. Please give me some time to reproduce this issue locally. I will get back
> > > to this thread with my analysis.
> > >
> >
> > We reproduced the issue with the help of PCIe team (thanks Can). What we found
> > out was, during the resume from hibernation the faliure happens in
> > ath11k_core_resume(). Precisely here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/tree/drivers/net/wireless/ath/ath11k/core.c?h=ath11k-hibernation-support#n850
> >
> > This code waits for the QMI messages to arrive and eventually timesout. But the
> > impression I got from the start was that the mhi_power_up() always fails during
> > resume. In our investigation, we confirmed that the failure is not happening at
> > the MHI level.No, mhi_power_up() never fails as it only downloads PBL,
> > SBL and waits
> for mission mode, no MHI device created hence not affected by the deferred
> probe. However in addition to PBL/SBL, ath11k also needs to download m3.bin,
> borad.bin and regdb.bin. Those files are part of WLAN firmware and are
> downloaded via QMI messages. After mhi_power_up() succeeds
> ath11k_core_resume() waits for QMI downloading those files. As you know QMI
> relies on MHI channels, these channels are managed by qcom_mhi_qrtr_driver.
> Since device probing is deferred, qcom_mhi_qrtr_driver has no chance to run
> at this stage. As a result ath11k_core_resume() times out.
>

Thanks for the info, this clarifies the issue in detail.

> >
> > I'm not pointing fingers here, but trying to understand why can't you fix
> > ath11k_core_resume() to not timeout? IMO this timeout should be handled as a
> > deferral case.
> Let's see what happens if we do it in a deferral way:
> 1. In ath11k_core_resume() we returns success directly without waiting for
> QMI downloading other firmware files.
> 2. Kernel unblocks device probe and schedules a work item to trigger all
> deferred probing. As a result MHI devices are probed by qcom_mhi_qrtr_driver
> and finally QMI is online.
> 3. kernel continues to resume and wake up userspace applications.
> 4. ath11k gets the message, either by kernel PM notification or something
> else, that QMI is ready and then downloads other firmware files.
>
> What happens if userspace applications or network stack immediately initiate
> some WLAN request after resume back? Can ath11k handle such request? The
> answer is, most likely, no. Because there is no guarantee that QMI finishes
> downloading before those request.
>

What will happen to userspace if ath11k returns an error like -EBUSY or
something? Will the netdev completely go away?

- Mani

> >
> > - Mani
> >

--
மணிவண்ணன் சதாசிவம்

2024-01-23 02:19:26

by Baochen Qiang

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()



On 1/22/2024 9:09 PM, Manivannan Sadhasivam wrote:
> On Mon, Jan 22, 2024 at 04:09:53PM +0800, Baochen Qiang wrote:
>>
>>
>> On 1/22/2024 2:24 PM, Manivannan Sadhasivam wrote:
>>> On Thu, Jan 04, 2024 at 11:39:12AM +0530, Manivannan Sadhasivam wrote:
>>>
>>> + Can, Qiang
>>>
>>> [...]
>>>
>>>>>> To me it all sounds like the probe deferral is not handled properly in mac80211
>>>>>> stack. As you mentioned in the commit message that the dpm_prepare() blocks
>>>>>> probing of devices. It gets unblocked and trigerred in dpm_complete():
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/power/main.c#n1131
>>>>>>
>>>>>> So if mac80211/ath11k cannot probe the devices at the dpm_complete() stage, then
>>>>>> it is definitely an issue that needs to be fixed properly.
>>>>> To clarify, ath11k CAN probe the devices at dpm_complete() stage. The
>>>>> problem is kernel does not wait for all probes to finish, and in that way we
>>>>> will face the issue that user space applications are likely to fail because
>>>>> they get thawed BEFORE WLAN is ready.
>>>>>
>>>>
>>>> Hmm. Please give me some time to reproduce this issue locally. I will get back
>>>> to this thread with my analysis.
>>>>
>>>
>>> We reproduced the issue with the help of PCIe team (thanks Can). What we found
>>> out was, during the resume from hibernation the faliure happens in
>>> ath11k_core_resume(). Precisely here:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/tree/drivers/net/wireless/ath/ath11k/core.c?h=ath11k-hibernation-support#n850
>>>
>>> This code waits for the QMI messages to arrive and eventually timesout. But the
>>> impression I got from the start was that the mhi_power_up() always fails during
>>> resume. In our investigation, we confirmed that the failure is not happening at
>>> the MHI level.No, mhi_power_up() never fails as it only downloads PBL,
>>> SBL and waits
>> for mission mode, no MHI device created hence not affected by the deferred
>> probe. However in addition to PBL/SBL, ath11k also needs to download m3.bin,
>> borad.bin and regdb.bin. Those files are part of WLAN firmware and are
>> downloaded via QMI messages. After mhi_power_up() succeeds
>> ath11k_core_resume() waits for QMI downloading those files. As you know QMI
>> relies on MHI channels, these channels are managed by qcom_mhi_qrtr_driver.
>> Since device probing is deferred, qcom_mhi_qrtr_driver has no chance to run
>> at this stage. As a result ath11k_core_resume() times out.
>>
>
> Thanks for the info, this clarifies the issue in detail.
>
>>>
>>> I'm not pointing fingers here, but trying to understand why can't you fix
>>> ath11k_core_resume() to not timeout? IMO this timeout should be handled as a
>>> deferral case.
>> Let's see what happens if we do it in a deferral way:
>> 1. In ath11k_core_resume() we returns success directly without waiting for
>> QMI downloading other firmware files.
>> 2. Kernel unblocks device probe and schedules a work item to trigger all
>> deferred probing. As a result MHI devices are probed by qcom_mhi_qrtr_driver
>> and finally QMI is online.
>> 3. kernel continues to resume and wake up userspace applications.
>> 4. ath11k gets the message, either by kernel PM notification or something
>> else, that QMI is ready and then downloads other firmware files.
>>
>> What happens if userspace applications or network stack immediately initiate
>> some WLAN request after resume back? Can ath11k handle such request? The
>> answer is, most likely, no. Because there is no guarantee that QMI finishes
>> downloading before those request.
>>
>
> What will happen to userspace if ath11k returns an error like -EBUSY or
> something? Will the netdev completely go away?
It depends, and varies from application to application, we can't make
the assumption.

Besides, it doesn't make sense to return -EBUSY or something like that,
if ath11k returns success during resume. A WLAN driver is supposed to
finish everything, at least get back to the state before suspend, in the
resume callback. If it couldn't, report the error.

>
> - Mani
>
>>>
>>> - Mani
>>>
>

2024-01-23 15:45:18

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

On Tue, Jan 23, 2024 at 09:44:11AM +0800, Baochen Qiang wrote:
>
>
> On 1/22/2024 9:09 PM, Manivannan Sadhasivam wrote:
> > On Mon, Jan 22, 2024 at 04:09:53PM +0800, Baochen Qiang wrote:
> > >
> > >
> > > On 1/22/2024 2:24 PM, Manivannan Sadhasivam wrote:
> > > > On Thu, Jan 04, 2024 at 11:39:12AM +0530, Manivannan Sadhasivam wrote:
> > > >
> > > > + Can, Qiang
> > > >
> > > > [...]
> > > >
> > > > > > > To me it all sounds like the probe deferral is not handled properly in mac80211
> > > > > > > stack. As you mentioned in the commit message that the dpm_prepare() blocks
> > > > > > > probing of devices. It gets unblocked and trigerred in dpm_complete():
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/power/main.c#n1131
> > > > > > >
> > > > > > > So if mac80211/ath11k cannot probe the devices at the dpm_complete() stage, then
> > > > > > > it is definitely an issue that needs to be fixed properly.
> > > > > > To clarify, ath11k CAN probe the devices at dpm_complete() stage. The
> > > > > > problem is kernel does not wait for all probes to finish, and in that way we
> > > > > > will face the issue that user space applications are likely to fail because
> > > > > > they get thawed BEFORE WLAN is ready.
> > > > > >
> > > > >
> > > > > Hmm. Please give me some time to reproduce this issue locally. I will get back
> > > > > to this thread with my analysis.
> > > > >
> > > >
> > > > We reproduced the issue with the help of PCIe team (thanks Can). What we found
> > > > out was, during the resume from hibernation the faliure happens in
> > > > ath11k_core_resume(). Precisely here:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/tree/drivers/net/wireless/ath/ath11k/core.c?h=ath11k-hibernation-support#n850
> > > >
> > > > This code waits for the QMI messages to arrive and eventually timesout. But the
> > > > impression I got from the start was that the mhi_power_up() always fails during
> > > > resume. In our investigation, we confirmed that the failure is not happening at
> > > > the MHI level.No, mhi_power_up() never fails as it only downloads PBL,
> > > > SBL and waits
> > > for mission mode, no MHI device created hence not affected by the deferred
> > > probe. However in addition to PBL/SBL, ath11k also needs to download m3.bin,
> > > borad.bin and regdb.bin. Those files are part of WLAN firmware and are
> > > downloaded via QMI messages. After mhi_power_up() succeeds
> > > ath11k_core_resume() waits for QMI downloading those files. As you know QMI
> > > relies on MHI channels, these channels are managed by qcom_mhi_qrtr_driver.
> > > Since device probing is deferred, qcom_mhi_qrtr_driver has no chance to run
> > > at this stage. As a result ath11k_core_resume() times out.
> > >
> >
> > Thanks for the info, this clarifies the issue in detail.
> >
> > > >
> > > > I'm not pointing fingers here, but trying to understand why can't you fix
> > > > ath11k_core_resume() to not timeout? IMO this timeout should be handled as a
> > > > deferral case.
> > > Let's see what happens if we do it in a deferral way:
> > > 1. In ath11k_core_resume() we returns success directly without waiting for
> > > QMI downloading other firmware files.
> > > 2. Kernel unblocks device probe and schedules a work item to trigger all
> > > deferred probing. As a result MHI devices are probed by qcom_mhi_qrtr_driver
> > > and finally QMI is online.
> > > 3. kernel continues to resume and wake up userspace applications.
> > > 4. ath11k gets the message, either by kernel PM notification or something
> > > else, that QMI is ready and then downloads other firmware files.
> > >
> > > What happens if userspace applications or network stack immediately initiate
> > > some WLAN request after resume back? Can ath11k handle such request? The
> > > answer is, most likely, no. Because there is no guarantee that QMI finishes
> > > downloading before those request.
> > >
> >
> > What will happen to userspace if ath11k returns an error like -EBUSY or
> > something? Will the netdev completely go away?
> It depends, and varies from application to application, we can't make the
> assumption.
>
> Besides, it doesn't make sense to return -EBUSY or something like that, if
> ath11k returns success during resume. A WLAN driver is supposed to finish
> everything, at least get back to the state before suspend, in the resume
> callback. If it couldn't, report the error.
>

Ok. So I am getting the feeling that we need to talk to the PM people to get a
proper solution. Clearly fixing the MHI code is not the right thing to do. We
might need a separate callback that gets registered by the drivers like ath11k
to wait for the dependency drivers to get probed.

Can you initiate such a discussion? You can write to [email protected],
"Rafael J. Wysocki" <[email protected]> and Pavel Machek <[email protected]>.

- Mani

--
மணிவண்ணன் சதாசிவம்

2024-01-23 17:33:34

by Jeff Johnson

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

On 1/23/2024 7:36 AM, Manivannan Sadhasivam wrote:
> Ok. So I am getting the feeling that we need to talk to the PM people to get a
> proper solution. Clearly fixing the MHI code is not the right thing to do. We
> might need a separate callback that gets registered by the drivers like ath11k
> to wait for the dependency drivers to get probed.
>
> Can you initiate such a discussion? You can write to [email protected],
> "Rafael J. Wysocki" <[email protected]> and Pavel Machek <[email protected]>.

Please also include [email protected] since this topic may have
ramifications across other Qualcomm technologies that use MHI.

/jeff


2024-01-30 18:04:33

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

On Mon, Nov 27, 2023 at 06:20:15PM +0200, Kalle Valo wrote:
> From: Baochen Qiang <[email protected]>
>
> If ath11k tries to call mhi_power_up() during resume it fails:
>

This is confusing! Maybe this is what confused me initially. mhi_sync_power_up()
never fails, but ath11k timesout waiting for QMI. You also confirmed the same
[1].

> ath11k_pci 0000:06:00.0: timeout while waiting for restart complete
>
> This happens because when calling mhi_power_up() the MHI subsystem eventually
> calls device_add() from mhi_create_devices() but the device creation is
> deferred:
>
> mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral
>
> The reason for deferring device creation is explained in dpm_prepare():
>
> /*
> * It is unsafe if probing of devices will happen during suspend or
> * hibernation and system behavior will be unpredictable in this case.
> * So, let's prohibit device's probing here and defer their probes
> * instead. The normal behavior will be restored in dpm_complete().
> */
>
> Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not called and
> qcom_mhi_qrtr_dl_callback() fails silently as qdev is zero:
>
> static void qcom_mhi_qrtr_dl_callback(struct mhi_device *mhi_dev,
> struct mhi_result *mhi_res)
> {
> struct qrtr_mhi_dev *qdev = dev_get_drvdata(&mhi_dev->dev);
> int rc;
>
> if (!qdev || mhi_res->transaction_status)
> return;
>
> So what this means that QRTR is not delivering messages and the QMI connection
> is not working between ath11k and the firmware, resulting a failure in firmware
> initialisation.
>
> To fix this add new function mhi_power_down_no_destroy() which does not destroy
> the devices during power down. This way mhi_power_up() can be called during
> resume and we can get ath11k hibernation working with the following patches.
>
> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
>
> Signed-off-by: Baochen Qiang <[email protected]>
> Signed-off-by: Kalle Valo <[email protected]>
> ---
> drivers/bus/mhi/host/init.c | 1 +
> drivers/bus/mhi/host/internal.h | 1 +
> drivers/bus/mhi/host/pm.c | 26 +++++++++++++++++++-------
> include/linux/mhi.h | 29 +++++++++++++++++++++++++++--
> 4 files changed, 48 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
> index 65ceac1837f9..e626b03ffafa 100644
> --- a/drivers/bus/mhi/host/init.c
> +++ b/drivers/bus/mhi/host/init.c
> @@ -43,6 +43,7 @@ const char * const dev_state_tran_str[DEV_ST_TRANSITION_MAX] = {
> [DEV_ST_TRANSITION_FP] = "FLASH PROGRAMMER",
> [DEV_ST_TRANSITION_SYS_ERR] = "SYS ERROR",
> [DEV_ST_TRANSITION_DISABLE] = "DISABLE",
> + [DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE] = "DISABLE (DESTROY DEVICE)",
> };
>
> const char * const mhi_ch_state_type_str[MHI_CH_STATE_TYPE_MAX] = {
> diff --git a/drivers/bus/mhi/host/internal.h b/drivers/bus/mhi/host/internal.h
> index 30ac415a3000..3f45c9c447bd 100644
> --- a/drivers/bus/mhi/host/internal.h
> +++ b/drivers/bus/mhi/host/internal.h
> @@ -69,6 +69,7 @@ enum dev_st_transition {
> DEV_ST_TRANSITION_FP,
> DEV_ST_TRANSITION_SYS_ERR,
> DEV_ST_TRANSITION_DISABLE,
> + DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE,
> DEV_ST_TRANSITION_MAX,
> };
>
> diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c
> index a2f2feef1476..8833b0248393 100644
> --- a/drivers/bus/mhi/host/pm.c
> +++ b/drivers/bus/mhi/host/pm.c
> @@ -458,7 +458,8 @@ static int mhi_pm_mission_mode_transition(struct mhi_controller *mhi_cntrl)
> }
>
> /* Handle shutdown transitions */
> -static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
> +static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl,
> + bool destroy_device)
> {
> enum mhi_pm_state cur_state;
> struct mhi_event *mhi_event;
> @@ -520,8 +521,10 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
> dev_dbg(dev, "Waiting for all pending threads to complete\n");
> wake_up_all(&mhi_cntrl->state_event);
>
> - dev_dbg(dev, "Reset all active channels and remove MHI devices\n");
> - device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL, mhi_destroy_device);
> + if (destroy_device) {
> + dev_dbg(dev, "Reset all active channels and remove MHI devices\n");
> + device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL, mhi_destroy_device);
> + }
>
> mutex_lock(&mhi_cntrl->pm_mutex);
>
> @@ -806,7 +809,10 @@ void mhi_pm_st_worker(struct work_struct *work)
> mhi_pm_sys_error_transition(mhi_cntrl);
> break;
> case DEV_ST_TRANSITION_DISABLE:
> - mhi_pm_disable_transition(mhi_cntrl);
> + mhi_pm_disable_transition(mhi_cntrl, false);
> + break;
> + case DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE:
> + mhi_pm_disable_transition(mhi_cntrl, true);
> break;
> default:
> break;
> @@ -1160,7 +1166,8 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
> }
> EXPORT_SYMBOL_GPL(mhi_async_power_up);
>
> -void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
> +void __mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful,
> + bool destroy_device)
> {
> enum mhi_pm_state cur_state, transition_state;
> struct device *dev = &mhi_cntrl->mhi_dev->dev;
> @@ -1196,14 +1203,19 @@ void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
> write_unlock_irq(&mhi_cntrl->pm_lock);
> mutex_unlock(&mhi_cntrl->pm_mutex);
>
> - mhi_queue_state_transition(mhi_cntrl, DEV_ST_TRANSITION_DISABLE);
> + if (destroy_device)
> + mhi_queue_state_transition(mhi_cntrl,
> + DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE);
> + else
> + mhi_queue_state_transition(mhi_cntrl,
> + DEV_ST_TRANSITION_DISABLE);
>
> /* Wait for shutdown to complete */
> flush_work(&mhi_cntrl->st_worker);
>
> disable_irq(mhi_cntrl->irq[0]);
> }
> -EXPORT_SYMBOL_GPL(mhi_power_down);
> +EXPORT_SYMBOL_GPL(__mhi_power_down);

This is a helper, so should not be exported. You should export the API instead.

>
> int mhi_sync_power_up(struct mhi_controller *mhi_cntrl)
> {
> diff --git a/include/linux/mhi.h b/include/linux/mhi.h
> index d0f9b522f328..ae092bc8b97e 100644
> --- a/include/linux/mhi.h
> +++ b/include/linux/mhi.h
> @@ -648,12 +648,37 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl);
> */
> int mhi_sync_power_up(struct mhi_controller *mhi_cntrl);
>
> +void __mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful,
> + bool destroy_device);

This is a helper, so make it static.

> +
> /**
> - * mhi_power_down - Start MHI power down sequence
> + * mhi_power_down - Start MHI power down sequence. See also
> + * mhi_power_down_no_destroy() which is a variant of this for suspend.

suspend/hibernation

> + *
> * @mhi_cntrl: MHI controller
> * @graceful: Link is still accessible, so do a graceful shutdown process
> */
> -void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful);
> +static inline void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)

No, API cannot be static inline. Make it global.

> +{
> + __mhi_power_down(mhi_cntrl, graceful, true);
> +}
> +
> +/**
> + * mhi_power_down_no_destroy - Start MHI power down sequence but don't
> + * destroy struct devices. This is a variant for mhi_power_down() and is a

"struct devices for channels"

> + * workaround to make it possible to use mhi_power_up() in a resume

You should mention that the devices are not destroyed and this would be useful
in suspend/hibernation.

> + * handler. When using this variant the caller must also call
> + * mhi_prepare_all_for_transfer_autoqueue() and

mhi_prepare_all_for_transfer*()

> + * mhi_unprepare_all_from_transfer().
> + *
> + * @mhi_cntrl: MHI controller
> + * @graceful: Link is still accessible, so do a graceful shutdown process
> + */
> +static inline void mhi_power_down_no_destroy(struct mhi_controller *mhi_cntrl,
> + bool graceful)

Same as above, make it global.

- Mani

--
மணிவண்ணன் சதாசிவம்

2024-01-30 18:19:49

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH RFC v2 2/8] bus: mhi: host: add new interfaces to handle MHI channels directly

On Mon, Nov 27, 2023 at 06:20:16PM +0200, Kalle Valo wrote:
> From: Baochen Qiang <[email protected]>
>
> When using mhi_power_down_no_destroy() MHI hosts need to unprepare MHI channels
> by themselves. Similarly, MHI stack will also not create new MHI device since
> old devices were not destroyed, so MHI hosts need to prepare channels as well.
> Hence add these two interfaces to make that possible.
>
> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
>
> Signed-off-by: Baochen Qiang <[email protected]>
> Signed-off-by: Kalle Valo <[email protected]>
> ---
> drivers/bus/mhi/host/main.c | 107 ++++++++++++++++++++++++++++++++++++
> include/linux/mhi.h | 20 ++++++-
> 2 files changed, 126 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> index d80975f4bba8..3f677fc628ad 100644
> --- a/drivers/bus/mhi/host/main.c
> +++ b/drivers/bus/mhi/host/main.c
> @@ -1669,6 +1669,58 @@ int mhi_prepare_for_transfer_autoqueue(struct mhi_device *mhi_dev)
> }
> EXPORT_SYMBOL_GPL(mhi_prepare_for_transfer_autoqueue);
>
> +static int ____mhi_prepare_for_transfer(struct device *dev, void *data)

"__mhi_prepare_all_for_transfer"

> +{
> + struct mhi_device *mhi_dev;
> + struct mhi_chan *ul_chan, *dl_chan;
> + enum mhi_ee_type ee = MHI_EE_MAX;

Reverse Xmas order, please.

> +
> + if (dev->bus != &mhi_bus_type)
> + return 0;
> +
> + mhi_dev = to_mhi_device(dev);
> +
> + /* Only prepare virtual devices that are attached to bus */

"Only prepare virtual devices for the channels". Here and below.

> + if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
> + return 0;
> +
> + /* There are cases where there is no MHI client driver matches
> + * this device, we are not allowed to do prepare for it.
> + */

Use the preferred style for comment:

/*
* ...
*/

> + if (!mhi_dev->id)
> + return 0;
> +
> + ul_chan = mhi_dev->ul_chan;
> + dl_chan = mhi_dev->dl_chan;
> +
> + /*
> + * If execution environment is specified, remove only those devices that
> + * started in them based on ee_mask for the channels as we move on to a
> + * different execution environment
> + */
> + if (data)
> + ee = *(enum mhi_ee_type *)data;
> +
> + if (ul_chan && ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
> + return 0;
> +
> +

Remove extra newline.

> + if (dl_chan && ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
> + return 0;
> +
> + if (dl_chan->pre_alloc)
> + return mhi_prepare_for_transfer_autoqueue(mhi_dev);
> + else
> + return mhi_prepare_for_transfer(mhi_dev);
> +}
> +
> +int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl)
> +{
> + return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
> + ____mhi_prepare_for_transfer);
> +}
> +EXPORT_SYMBOL_GPL(mhi_prepare_all_for_transfer);
> +
> void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
> {
> struct mhi_controller *mhi_cntrl = mhi_dev->mhi_cntrl;
> @@ -1684,3 +1736,58 @@ void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
> }
> }
> EXPORT_SYMBOL_GPL(mhi_unprepare_from_transfer);
> +
> +static int ____mhi_unprepare_from_transfer(struct device *dev, void *data)

__mhi_unprepare_all_from_transfer

> +{
> + struct mhi_device *mhi_dev;
> + struct mhi_chan *ul_chan, *dl_chan;
> + enum mhi_ee_type ee = MHI_EE_MAX;
> +
> + if (dev->bus != &mhi_bus_type)
> + return 0;
> +
> + mhi_dev = to_mhi_device(dev);
> +
> + /* Only unprepare virtual devices that are attached to bus */
> + if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
> + return 0;
> +
> + /* There are cases where there is no MHI client driver matches
> + * this device, so it is not probed or prepared, no need to
> + * do unprepare for it.
> + */
> + if (!mhi_dev->id)
> + return 0;
> +
> + ul_chan = mhi_dev->ul_chan;
> + dl_chan = mhi_dev->dl_chan;
> +
> + /*
> + * If execution environment is specified, remove only those devices that
> + * started in them based on ee_mask for the channels as we move on to a
> + * different execution environment
> + */
> + if (data)
> + ee = *(enum mhi_ee_type *)data;
> +
> + if (ul_chan) {
> + if (ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
> + return 0;
> + }
> +
> + if (dl_chan) {
> + if (ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
> + return 0;
> + }
> +
> + mhi_unprepare_from_transfer(mhi_dev);
> +
> + return 0;
> +}
> +
> +int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl)
> +{
> + return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
> + ____mhi_unprepare_from_transfer);
> +}
> +EXPORT_SYMBOL_GPL(mhi_unprepare_all_from_transfer);
> diff --git a/include/linux/mhi.h b/include/linux/mhi.h
> index ae092bc8b97e..dcf62a57056a 100644
> --- a/include/linux/mhi.h
> +++ b/include/linux/mhi.h
> @@ -668,7 +668,7 @@ static inline void mhi_power_down(struct mhi_controller *mhi_cntrl, bool gracefu
> * destroy struct devices. This is a variant for mhi_power_down() and is a
> * workaround to make it possible to use mhi_power_up() in a resume
> * handler. When using this variant the caller must also call
> - * mhi_prepare_all_for_transfer_autoqueue() and
> + * mhi_prepare_all_for_transfer() and

This change belongs to previous patch.

> * mhi_unprepare_all_from_transfer().
> *
> * @mhi_cntrl: MHI controller
> @@ -842,4 +842,22 @@ int mhi_queue_skb(struct mhi_device *mhi_dev, enum dma_data_direction dir,
> */
> bool mhi_queue_is_full(struct mhi_device *mhi_dev, enum dma_data_direction dir);
>
> +/**
> + * mhi_prepare_all_for_transfer - if you are using
> + * mhi_power_down_no_destroy() variant this needs to be called after
> + * calling mhi_power_up().

Add info about what this API does also.

> + *
> + * @mhi_cntrl: MHI controller
> + */
> +int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl);
> +
> +/**
> + * mhi_unprepare_all_from_transfer - if you are using
> + * mhi_power_down_no_destroy() variant this function needs to be called
> + * before calling mhi_power_down_no_destroy().

Same as above.

- Mani

> + *
> + * @mhi_cntrl: MHI controller
> + */
> +int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl);
> +
> #endif /* _MHI_H_ */
> --
> 2.39.2
>
>

--
மணிவண்ணன் சதாசிவம்

2024-01-31 07:45:24

by Baochen Qiang

[permalink] [raw]
Subject: Re: [PATCH RFC v2 2/8] bus: mhi: host: add new interfaces to handle MHI channels directly



On 1/31/2024 2:19 AM, Manivannan Sadhasivam wrote:
> On Mon, Nov 27, 2023 at 06:20:16PM +0200, Kalle Valo wrote:
>> From: Baochen Qiang <[email protected]>
>>
>> When using mhi_power_down_no_destroy() MHI hosts need to unprepare MHI channels
>> by themselves. Similarly, MHI stack will also not create new MHI device since
>> old devices were not destroyed, so MHI hosts need to prepare channels as well.
>> Hence add these two interfaces to make that possible.
>>
>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
>>
>> Signed-off-by: Baochen Qiang <[email protected]>
>> Signed-off-by: Kalle Valo <[email protected]>
>> ---
>> drivers/bus/mhi/host/main.c | 107 ++++++++++++++++++++++++++++++++++++
>> include/linux/mhi.h | 20 ++++++-
>> 2 files changed, 126 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>> index d80975f4bba8..3f677fc628ad 100644
>> --- a/drivers/bus/mhi/host/main.c
>> +++ b/drivers/bus/mhi/host/main.c
>> @@ -1669,6 +1669,58 @@ int mhi_prepare_for_transfer_autoqueue(struct mhi_device *mhi_dev)
>> }
>> EXPORT_SYMBOL_GPL(mhi_prepare_for_transfer_autoqueue);
>>
>> +static int ____mhi_prepare_for_transfer(struct device *dev, void *data)
>
> "__mhi_prepare_all_for_transfer"

This is to prepare one single child device, I don't think a name like
__mhi_prepare_all_for_transfer (with 'all' inside) make sense, right?
How about changing to "mhi_prepare_dev_for_transfer" or
"mhi_prepare_single_for_transfer"?

>
>> +{
>> + struct mhi_device *mhi_dev;
>> + struct mhi_chan *ul_chan, *dl_chan;
>> + enum mhi_ee_type ee = MHI_EE_MAX;
>
> Reverse Xmas order, please.
>
>> +
>> + if (dev->bus != &mhi_bus_type)
>> + return 0;
>> +
>> + mhi_dev = to_mhi_device(dev);
>> +
>> + /* Only prepare virtual devices that are attached to bus */
>
> "Only prepare virtual devices for the channels". Here and below.
>
>> + if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
>> + return 0;
>> +
>> + /* There are cases where there is no MHI client driver matches
>> + * this device, we are not allowed to do prepare for it.
>> + */
>
> Use the preferred style for comment:
>
> /*
> * ...
> */
>
>> + if (!mhi_dev->id)
>> + return 0;
>> +
>> + ul_chan = mhi_dev->ul_chan;
>> + dl_chan = mhi_dev->dl_chan;
>> +
>> + /*
>> + * If execution environment is specified, remove only those devices that
>> + * started in them based on ee_mask for the channels as we move on to a
>> + * different execution environment
>> + */
>> + if (data)
>> + ee = *(enum mhi_ee_type *)data;
>> +
>> + if (ul_chan && ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
>> + return 0;
>> +
>> +
>
> Remove extra newline.
>
>> + if (dl_chan && ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
>> + return 0;
>> +
>> + if (dl_chan->pre_alloc)
>> + return mhi_prepare_for_transfer_autoqueue(mhi_dev);
>> + else
>> + return mhi_prepare_for_transfer(mhi_dev);
>> +}
>> +
>> +int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl)
>> +{
>> + return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
>> + ____mhi_prepare_for_transfer);
>> +}
>> +EXPORT_SYMBOL_GPL(mhi_prepare_all_for_transfer);
>> +
>> void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
>> {
>> struct mhi_controller *mhi_cntrl = mhi_dev->mhi_cntrl;
>> @@ -1684,3 +1736,58 @@ void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
>> }
>> }
>> EXPORT_SYMBOL_GPL(mhi_unprepare_from_transfer);
>> +
>> +static int ____mhi_unprepare_from_transfer(struct device *dev, void *data)
>
> __mhi_unprepare_all_from_transfer

same as above.

>
>> +{
>> + struct mhi_device *mhi_dev;
>> + struct mhi_chan *ul_chan, *dl_chan;
>> + enum mhi_ee_type ee = MHI_EE_MAX;
>> +
>> + if (dev->bus != &mhi_bus_type)
>> + return 0;
>> +
>> + mhi_dev = to_mhi_device(dev);
>> +
>> + /* Only unprepare virtual devices that are attached to bus */
>> + if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
>> + return 0;
>> +
>> + /* There are cases where there is no MHI client driver matches
>> + * this device, so it is not probed or prepared, no need to
>> + * do unprepare for it.
>> + */
>> + if (!mhi_dev->id)
>> + return 0;
>> +
>> + ul_chan = mhi_dev->ul_chan;
>> + dl_chan = mhi_dev->dl_chan;
>> +
>> + /*
>> + * If execution environment is specified, remove only those devices that
>> + * started in them based on ee_mask for the channels as we move on to a
>> + * different execution environment
>> + */
>> + if (data)
>> + ee = *(enum mhi_ee_type *)data;
>> +
>> + if (ul_chan) {
>> + if (ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
>> + return 0;
>> + }
>> +
>> + if (dl_chan) {
>> + if (ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
>> + return 0;
>> + }
>> +
>> + mhi_unprepare_from_transfer(mhi_dev);
>> +
>> + return 0;
>> +}
>> +
>> +int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl)
>> +{
>> + return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
>> + ____mhi_unprepare_from_transfer);
>> +}
>> +EXPORT_SYMBOL_GPL(mhi_unprepare_all_from_transfer);
>> diff --git a/include/linux/mhi.h b/include/linux/mhi.h
>> index ae092bc8b97e..dcf62a57056a 100644
>> --- a/include/linux/mhi.h
>> +++ b/include/linux/mhi.h
>> @@ -668,7 +668,7 @@ static inline void mhi_power_down(struct mhi_controller *mhi_cntrl, bool gracefu
>> * destroy struct devices. This is a variant for mhi_power_down() and is a
>> * workaround to make it possible to use mhi_power_up() in a resume
>> * handler. When using this variant the caller must also call
>> - * mhi_prepare_all_for_transfer_autoqueue() and
>> + * mhi_prepare_all_for_transfer() and
>
> This change belongs to previous patch.
>
>> * mhi_unprepare_all_from_transfer().
>> *
>> * @mhi_cntrl: MHI controller
>> @@ -842,4 +842,22 @@ int mhi_queue_skb(struct mhi_device *mhi_dev, enum dma_data_direction dir,
>> */
>> bool mhi_queue_is_full(struct mhi_device *mhi_dev, enum dma_data_direction dir);
>>
>> +/**
>> + * mhi_prepare_all_for_transfer - if you are using
>> + * mhi_power_down_no_destroy() variant this needs to be called after
>> + * calling mhi_power_up().
>
> Add info about what this API does also.
>
>> + *
>> + * @mhi_cntrl: MHI controller
>> + */
>> +int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl);
>> +
>> +/**
>> + * mhi_unprepare_all_from_transfer - if you are using
>> + * mhi_power_down_no_destroy() variant this function needs to be called
>> + * before calling mhi_power_down_no_destroy().
>
> Same as above.
>
> - Mani
>
>> + *
>> + * @mhi_cntrl: MHI controller
>> + */
>> +int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl);
>> +
>> #endif /* _MHI_H_ */
>> --
>> 2.39.2
>>
>>
>

2024-01-31 11:00:18

by Baochen Qiang

[permalink] [raw]
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()



On 1/31/2024 2:04 AM, Manivannan Sadhasivam wrote:
> On Mon, Nov 27, 2023 at 06:20:15PM +0200, Kalle Valo wrote:
>> From: Baochen Qiang <[email protected]>
>>
>> If ath11k tries to call mhi_power_up() during resume it fails:
>>
>
> This is confusing! Maybe this is what confused me initially. mhi_sync_power_up()
> never fails, but ath11k timesout waiting for QMI. You also confirmed the same
> [1].

mhi_sync_power_up() creates MHI devices which fails to get probed. So
from the view of ath11k it fails, while from the sense of code
execuation it succeeds. Will rephrase to avoid confusion.

>
>> ath11k_pci 0000:06:00.0: timeout while waiting for restart complete
>>
>> This happens because when calling mhi_power_up() the MHI subsystem eventually
>> calls device_add() from mhi_create_devices() but the device creation is
>> deferred:
>>
>> mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral
>>
>> The reason for deferring device creation is explained in dpm_prepare():
>>
>> /*
>> * It is unsafe if probing of devices will happen during suspend or
>> * hibernation and system behavior will be unpredictable in this case.
>> * So, let's prohibit device's probing here and defer their probes
>> * instead. The normal behavior will be restored in dpm_complete().
>> */
>>
>> Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not called and
>> qcom_mhi_qrtr_dl_callback() fails silently as qdev is zero:
>>
>> static void qcom_mhi_qrtr_dl_callback(struct mhi_device *mhi_dev,
>> struct mhi_result *mhi_res)
>> {
>> struct qrtr_mhi_dev *qdev = dev_get_drvdata(&mhi_dev->dev);
>> int rc;
>>
>> if (!qdev || mhi_res->transaction_status)
>> return;
>>
>> So what this means that QRTR is not delivering messages and the QMI connection
>> is not working between ath11k and the firmware, resulting a failure in firmware
>> initialisation.
>>
>> To fix this add new function mhi_power_down_no_destroy() which does not destroy
>> the devices during power down. This way mhi_power_up() can be called during
>> resume and we can get ath11k hibernation working with the following patches.
>>
>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
>>
>> Signed-off-by: Baochen Qiang <[email protected]>
>> Signed-off-by: Kalle Valo <[email protected]>
>> ---
>> drivers/bus/mhi/host/init.c | 1 +
>> drivers/bus/mhi/host/internal.h | 1 +
>> drivers/bus/mhi/host/pm.c | 26 +++++++++++++++++++-------
>> include/linux/mhi.h | 29 +++++++++++++++++++++++++++--
>> 4 files changed, 48 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
>> index 65ceac1837f9..e626b03ffafa 100644
>> --- a/drivers/bus/mhi/host/init.c
>> +++ b/drivers/bus/mhi/host/init.c
>> @@ -43,6 +43,7 @@ const char * const dev_state_tran_str[DEV_ST_TRANSITION_MAX] = {
>> [DEV_ST_TRANSITION_FP] = "FLASH PROGRAMMER",
>> [DEV_ST_TRANSITION_SYS_ERR] = "SYS ERROR",
>> [DEV_ST_TRANSITION_DISABLE] = "DISABLE",
>> + [DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE] = "DISABLE (DESTROY DEVICE)",
>> };
>>
>> const char * const mhi_ch_state_type_str[MHI_CH_STATE_TYPE_MAX] = {
>> diff --git a/drivers/bus/mhi/host/internal.h b/drivers/bus/mhi/host/internal.h
>> index 30ac415a3000..3f45c9c447bd 100644
>> --- a/drivers/bus/mhi/host/internal.h
>> +++ b/drivers/bus/mhi/host/internal.h
>> @@ -69,6 +69,7 @@ enum dev_st_transition {
>> DEV_ST_TRANSITION_FP,
>> DEV_ST_TRANSITION_SYS_ERR,
>> DEV_ST_TRANSITION_DISABLE,
>> + DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE,
>> DEV_ST_TRANSITION_MAX,
>> };
>>
>> diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c
>> index a2f2feef1476..8833b0248393 100644
>> --- a/drivers/bus/mhi/host/pm.c
>> +++ b/drivers/bus/mhi/host/pm.c
>> @@ -458,7 +458,8 @@ static int mhi_pm_mission_mode_transition(struct mhi_controller *mhi_cntrl)
>> }
>>
>> /* Handle shutdown transitions */
>> -static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
>> +static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl,
>> + bool destroy_device)
>> {
>> enum mhi_pm_state cur_state;
>> struct mhi_event *mhi_event;
>> @@ -520,8 +521,10 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
>> dev_dbg(dev, "Waiting for all pending threads to complete\n");
>> wake_up_all(&mhi_cntrl->state_event);
>>
>> - dev_dbg(dev, "Reset all active channels and remove MHI devices\n");
>> - device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL, mhi_destroy_device);
>> + if (destroy_device) {
>> + dev_dbg(dev, "Reset all active channels and remove MHI devices\n");
>> + device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL, mhi_destroy_device);
>> + }
>>
>> mutex_lock(&mhi_cntrl->pm_mutex);
>>
>> @@ -806,7 +809,10 @@ void mhi_pm_st_worker(struct work_struct *work)
>> mhi_pm_sys_error_transition(mhi_cntrl);
>> break;
>> case DEV_ST_TRANSITION_DISABLE:
>> - mhi_pm_disable_transition(mhi_cntrl);
>> + mhi_pm_disable_transition(mhi_cntrl, false);
>> + break;
>> + case DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE:
>> + mhi_pm_disable_transition(mhi_cntrl, true);
>> break;
>> default:
>> break;
>> @@ -1160,7 +1166,8 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
>> }
>> EXPORT_SYMBOL_GPL(mhi_async_power_up);
>>
>> -void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
>> +void __mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful,
>> + bool destroy_device)
>> {
>> enum mhi_pm_state cur_state, transition_state;
>> struct device *dev = &mhi_cntrl->mhi_dev->dev;
>> @@ -1196,14 +1203,19 @@ void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
>> write_unlock_irq(&mhi_cntrl->pm_lock);
>> mutex_unlock(&mhi_cntrl->pm_mutex);
>>
>> - mhi_queue_state_transition(mhi_cntrl, DEV_ST_TRANSITION_DISABLE);
>> + if (destroy_device)
>> + mhi_queue_state_transition(mhi_cntrl,
>> + DEV_ST_TRANSITION_DISABLE_DESTROY_DEVICE);
>> + else
>> + mhi_queue_state_transition(mhi_cntrl,
>> + DEV_ST_TRANSITION_DISABLE);
>>
>> /* Wait for shutdown to complete */
>> flush_work(&mhi_cntrl->st_worker);
>>
>> disable_irq(mhi_cntrl->irq[0]);
>> }
>> -EXPORT_SYMBOL_GPL(mhi_power_down);
>> +EXPORT_SYMBOL_GPL(__mhi_power_down);
>
> This is a helper, so should not be exported. You should export the API instead.
>
>>
>> int mhi_sync_power_up(struct mhi_controller *mhi_cntrl)
>> {
>> diff --git a/include/linux/mhi.h b/include/linux/mhi.h
>> index d0f9b522f328..ae092bc8b97e 100644
>> --- a/include/linux/mhi.h
>> +++ b/include/linux/mhi.h
>> @@ -648,12 +648,37 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl);
>> */
>> int mhi_sync_power_up(struct mhi_controller *mhi_cntrl);
>>
>> +void __mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful,
>> + bool destroy_device);
>
> This is a helper, so make it static.
>
>> +
>> /**
>> - * mhi_power_down - Start MHI power down sequence
>> + * mhi_power_down - Start MHI power down sequence. See also
>> + * mhi_power_down_no_destroy() which is a variant of this for suspend.
>
> suspend/hibernation
>
>> + *
>> * @mhi_cntrl: MHI controller
>> * @graceful: Link is still accessible, so do a graceful shutdown process
>> */
>> -void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful);
>> +static inline void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
>
> No, API cannot be static inline. Make it global.
>
>> +{
>> + __mhi_power_down(mhi_cntrl, graceful, true);
>> +}
>> +
>> +/**
>> + * mhi_power_down_no_destroy - Start MHI power down sequence but don't
>> + * destroy struct devices. This is a variant for mhi_power_down() and is a
>
> "struct devices for channels"
>
>> + * workaround to make it possible to use mhi_power_up() in a resume
>
> You should mention that the devices are not destroyed and this would be useful
> in suspend/hibernation.
>
>> + * handler. When using this variant the caller must also call
>> + * mhi_prepare_all_for_transfer_autoqueue() and
>
> mhi_prepare_all_for_transfer*()
>
>> + * mhi_unprepare_all_from_transfer().
>> + *
>> + * @mhi_cntrl: MHI controller
>> + * @graceful: Link is still accessible, so do a graceful shutdown process
>> + */
>> +static inline void mhi_power_down_no_destroy(struct mhi_controller *mhi_cntrl,
>> + bool graceful)
>
> Same as above, make it global.
>
> - Mani
>

2024-02-01 10:00:56

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH RFC v2 2/8] bus: mhi: host: add new interfaces to handle MHI channels directly

On Wed, Jan 31, 2024 at 03:39:26PM +0800, Baochen Qiang wrote:
>
>
> On 1/31/2024 2:19 AM, Manivannan Sadhasivam wrote:
> > On Mon, Nov 27, 2023 at 06:20:16PM +0200, Kalle Valo wrote:
> > > From: Baochen Qiang <[email protected]>
> > >
> > > When using mhi_power_down_no_destroy() MHI hosts need to unprepare MHI channels
> > > by themselves. Similarly, MHI stack will also not create new MHI device since
> > > old devices were not destroyed, so MHI hosts need to prepare channels as well.
> > > Hence add these two interfaces to make that possible.
> > >
> > > Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
> > >
> > > Signed-off-by: Baochen Qiang <[email protected]>
> > > Signed-off-by: Kalle Valo <[email protected]>
> > > ---
> > > drivers/bus/mhi/host/main.c | 107 ++++++++++++++++++++++++++++++++++++
> > > include/linux/mhi.h | 20 ++++++-
> > > 2 files changed, 126 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> > > index d80975f4bba8..3f677fc628ad 100644
> > > --- a/drivers/bus/mhi/host/main.c
> > > +++ b/drivers/bus/mhi/host/main.c
> > > @@ -1669,6 +1669,58 @@ int mhi_prepare_for_transfer_autoqueue(struct mhi_device *mhi_dev)
> > > }
> > > EXPORT_SYMBOL_GPL(mhi_prepare_for_transfer_autoqueue);
> > > +static int ____mhi_prepare_for_transfer(struct device *dev, void *data)
> >
> > "__mhi_prepare_all_for_transfer"
>
> This is to prepare one single child device, I don't think a name like
> __mhi_prepare_all_for_transfer (with 'all' inside) make sense, right?
> How about changing to "mhi_prepare_dev_for_transfer" or
> "mhi_prepare_single_for_transfer"?
>

I think most of the checks in this function can be moved inside
mhi_prepare_for_transfer() API. With that you can just reuse the API without
adding a new helper.

For autoqueue channels, you can add another API
mhi_prepare_all_for_transfer_autoqueue() just like
mhi_prepare_for_transfer_autoqueue() to maintain uniformity.

- Mani

> >
> > > +{
> > > + struct mhi_device *mhi_dev;
> > > + struct mhi_chan *ul_chan, *dl_chan;
> > > + enum mhi_ee_type ee = MHI_EE_MAX;
> >
> > Reverse Xmas order, please.
> >
> > > +
> > > + if (dev->bus != &mhi_bus_type)
> > > + return 0;
> > > +
> > > + mhi_dev = to_mhi_device(dev);
> > > +
> > > + /* Only prepare virtual devices that are attached to bus */
> >
> > "Only prepare virtual devices for the channels". Here and below.
> >
> > > + if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
> > > + return 0;
> > > +
> > > + /* There are cases where there is no MHI client driver matches
> > > + * this device, we are not allowed to do prepare for it.
> > > + */
> >
> > Use the preferred style for comment:
> >
> > /*
> > * ...
> > */
> >
> > > + if (!mhi_dev->id)
> > > + return 0;
> > > +
> > > + ul_chan = mhi_dev->ul_chan;
> > > + dl_chan = mhi_dev->dl_chan;
> > > +
> > > + /*
> > > + * If execution environment is specified, remove only those devices that
> > > + * started in them based on ee_mask for the channels as we move on to a
> > > + * different execution environment
> > > + */
> > > + if (data)
> > > + ee = *(enum mhi_ee_type *)data;
> > > +
> > > + if (ul_chan && ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
> > > + return 0;
> > > +
> > > +
> >
> > Remove extra newline.
> >
> > > + if (dl_chan && ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
> > > + return 0;
> > > +
> > > + if (dl_chan->pre_alloc)
> > > + return mhi_prepare_for_transfer_autoqueue(mhi_dev);
> > > + else
> > > + return mhi_prepare_for_transfer(mhi_dev);
> > > +}
> > > +
> > > +int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl)
> > > +{
> > > + return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
> > > + ____mhi_prepare_for_transfer);
> > > +}
> > > +EXPORT_SYMBOL_GPL(mhi_prepare_all_for_transfer);
> > > +
> > > void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
> > > {
> > > struct mhi_controller *mhi_cntrl = mhi_dev->mhi_cntrl;
> > > @@ -1684,3 +1736,58 @@ void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
> > > }
> > > }
> > > EXPORT_SYMBOL_GPL(mhi_unprepare_from_transfer);
> > > +
> > > +static int ____mhi_unprepare_from_transfer(struct device *dev, void *data)
> >
> > __mhi_unprepare_all_from_transfer
>
> same as above.
>
> >
> > > +{
> > > + struct mhi_device *mhi_dev;
> > > + struct mhi_chan *ul_chan, *dl_chan;
> > > + enum mhi_ee_type ee = MHI_EE_MAX;
> > > +
> > > + if (dev->bus != &mhi_bus_type)
> > > + return 0;
> > > +
> > > + mhi_dev = to_mhi_device(dev);
> > > +
> > > + /* Only unprepare virtual devices that are attached to bus */
> > > + if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
> > > + return 0;
> > > +
> > > + /* There are cases where there is no MHI client driver matches
> > > + * this device, so it is not probed or prepared, no need to
> > > + * do unprepare for it.
> > > + */
> > > + if (!mhi_dev->id)
> > > + return 0;
> > > +
> > > + ul_chan = mhi_dev->ul_chan;
> > > + dl_chan = mhi_dev->dl_chan;
> > > +
> > > + /*
> > > + * If execution environment is specified, remove only those devices that
> > > + * started in them based on ee_mask for the channels as we move on to a
> > > + * different execution environment
> > > + */
> > > + if (data)
> > > + ee = *(enum mhi_ee_type *)data;
> > > +
> > > + if (ul_chan) {
> > > + if (ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
> > > + return 0;
> > > + }
> > > +
> > > + if (dl_chan) {
> > > + if (ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
> > > + return 0;
> > > + }
> > > +
> > > + mhi_unprepare_from_transfer(mhi_dev);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl)
> > > +{
> > > + return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
> > > + ____mhi_unprepare_from_transfer);
> > > +}
> > > +EXPORT_SYMBOL_GPL(mhi_unprepare_all_from_transfer);
> > > diff --git a/include/linux/mhi.h b/include/linux/mhi.h
> > > index ae092bc8b97e..dcf62a57056a 100644
> > > --- a/include/linux/mhi.h
> > > +++ b/include/linux/mhi.h
> > > @@ -668,7 +668,7 @@ static inline void mhi_power_down(struct mhi_controller *mhi_cntrl, bool gracefu
> > > * destroy struct devices. This is a variant for mhi_power_down() and is a
> > > * workaround to make it possible to use mhi_power_up() in a resume
> > > * handler. When using this variant the caller must also call
> > > - * mhi_prepare_all_for_transfer_autoqueue() and
> > > + * mhi_prepare_all_for_transfer() and
> >
> > This change belongs to previous patch.
> >
> > > * mhi_unprepare_all_from_transfer().
> > > *
> > > * @mhi_cntrl: MHI controller
> > > @@ -842,4 +842,22 @@ int mhi_queue_skb(struct mhi_device *mhi_dev, enum dma_data_direction dir,
> > > */
> > > bool mhi_queue_is_full(struct mhi_device *mhi_dev, enum dma_data_direction dir);
> > > +/**
> > > + * mhi_prepare_all_for_transfer - if you are using
> > > + * mhi_power_down_no_destroy() variant this needs to be called after
> > > + * calling mhi_power_up().
> >
> > Add info about what this API does also.
> >
> > > + *
> > > + * @mhi_cntrl: MHI controller
> > > + */
> > > +int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl);
> > > +
> > > +/**
> > > + * mhi_unprepare_all_from_transfer - if you are using
> > > + * mhi_power_down_no_destroy() variant this function needs to be called
> > > + * before calling mhi_power_down_no_destroy().
> >
> > Same as above.
> >
> > - Mani
> >
> > > + *
> > > + * @mhi_cntrl: MHI controller
> > > + */
> > > +int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl);
> > > +
> > > #endif /* _MHI_H_ */
> > > --
> > > 2.39.2
> > >
> > >
> >

--
மணிவண்ணன் சதாசிவம்

2024-02-02 06:47:33

by Baochen Qiang

[permalink] [raw]
Subject: Re: [PATCH RFC v2 2/8] bus: mhi: host: add new interfaces to handle MHI channels directly



On 2/1/2024 6:00 PM, Manivannan Sadhasivam wrote:
> On Wed, Jan 31, 2024 at 03:39:26PM +0800, Baochen Qiang wrote:
>>
>>
>> On 1/31/2024 2:19 AM, Manivannan Sadhasivam wrote:
>>> On Mon, Nov 27, 2023 at 06:20:16PM +0200, Kalle Valo wrote:
>>>> From: Baochen Qiang <[email protected]>
>>>>
>>>> When using mhi_power_down_no_destroy() MHI hosts need to unprepare MHI channels
>>>> by themselves. Similarly, MHI stack will also not create new MHI device since
>>>> old devices were not destroyed, so MHI hosts need to prepare channels as well.
>>>> Hence add these two interfaces to make that possible.
>>>>
>>>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
>>>>
>>>> Signed-off-by: Baochen Qiang <[email protected]>
>>>> Signed-off-by: Kalle Valo <[email protected]>
>>>> ---
>>>> drivers/bus/mhi/host/main.c | 107 ++++++++++++++++++++++++++++++++++++
>>>> include/linux/mhi.h | 20 ++++++-
>>>> 2 files changed, 126 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>>>> index d80975f4bba8..3f677fc628ad 100644
>>>> --- a/drivers/bus/mhi/host/main.c
>>>> +++ b/drivers/bus/mhi/host/main.c
>>>> @@ -1669,6 +1669,58 @@ int mhi_prepare_for_transfer_autoqueue(struct mhi_device *mhi_dev)
>>>> }
>>>> EXPORT_SYMBOL_GPL(mhi_prepare_for_transfer_autoqueue);
>>>> +static int ____mhi_prepare_for_transfer(struct device *dev, void *data)
>>>
>>> "__mhi_prepare_all_for_transfer"
>>
>> This is to prepare one single child device, I don't think a name like
>> __mhi_prepare_all_for_transfer (with 'all' inside) make sense, right?
>> How about changing to "mhi_prepare_dev_for_transfer" or
>> "mhi_prepare_single_for_transfer"?
>>
>
> I think most of the checks in this function can be moved inside
> mhi_prepare_for_transfer() API. With that you can just reuse the API without
> adding a new helper.
>
> For autoqueue channels, you can add another API
> mhi_prepare_all_for_transfer_autoqueue() just like
> mhi_prepare_for_transfer_autoqueue() to maintain uniformity.
>
> - Mani
If we do that, we need to call two APIs together, does it make sense?
From the view of an MHI user, what we want is an API to prepare all
channels, no matter whether a channel is configured as autoqueue or
non-autoqueue, we don't care about it.

And besides, I don't think there is a scenario where we need to use them
separately. So if we always need to use them together, why not merge
them in a single API?

>
>>>
>>>> +{
>>>> + struct mhi_device *mhi_dev;
>>>> + struct mhi_chan *ul_chan, *dl_chan;
>>>> + enum mhi_ee_type ee = MHI_EE_MAX;
>>>
>>> Reverse Xmas order, please.
>>>
>>>> +
>>>> + if (dev->bus != &mhi_bus_type)
>>>> + return 0;
>>>> +
>>>> + mhi_dev = to_mhi_device(dev);
>>>> +
>>>> + /* Only prepare virtual devices that are attached to bus */
>>>
>>> "Only prepare virtual devices for the channels". Here and below.
>>>
>>>> + if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
>>>> + return 0;
>>>> +
>>>> + /* There are cases where there is no MHI client driver matches
>>>> + * this device, we are not allowed to do prepare for it.
>>>> + */
>>>
>>> Use the preferred style for comment:
>>>
>>> /*
>>> * ...
>>> */
>>>
>>>> + if (!mhi_dev->id)
>>>> + return 0;
>>>> +
>>>> + ul_chan = mhi_dev->ul_chan;
>>>> + dl_chan = mhi_dev->dl_chan;
>>>> +
>>>> + /*
>>>> + * If execution environment is specified, remove only those devices that
>>>> + * started in them based on ee_mask for the channels as we move on to a
>>>> + * different execution environment
>>>> + */
>>>> + if (data)
>>>> + ee = *(enum mhi_ee_type *)data;
>>>> +
>>>> + if (ul_chan && ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
>>>> + return 0;
>>>> +
>>>> +
>>>
>>> Remove extra newline.
>>>
>>>> + if (dl_chan && ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
>>>> + return 0;
>>>> +
>>>> + if (dl_chan->pre_alloc)
>>>> + return mhi_prepare_for_transfer_autoqueue(mhi_dev);
>>>> + else
>>>> + return mhi_prepare_for_transfer(mhi_dev);
>>>> +}
>>>> +
>>>> +int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl)
>>>> +{
>>>> + return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
>>>> + ____mhi_prepare_for_transfer);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(mhi_prepare_all_for_transfer);
>>>> +
>>>> void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
>>>> {
>>>> struct mhi_controller *mhi_cntrl = mhi_dev->mhi_cntrl;
>>>> @@ -1684,3 +1736,58 @@ void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
>>>> }
>>>> }
>>>> EXPORT_SYMBOL_GPL(mhi_unprepare_from_transfer);
>>>> +
>>>> +static int ____mhi_unprepare_from_transfer(struct device *dev, void *data)
>>>
>>> __mhi_unprepare_all_from_transfer
>>
>> same as above.
>>
>>>
>>>> +{
>>>> + struct mhi_device *mhi_dev;
>>>> + struct mhi_chan *ul_chan, *dl_chan;
>>>> + enum mhi_ee_type ee = MHI_EE_MAX;
>>>> +
>>>> + if (dev->bus != &mhi_bus_type)
>>>> + return 0;
>>>> +
>>>> + mhi_dev = to_mhi_device(dev);
>>>> +
>>>> + /* Only unprepare virtual devices that are attached to bus */
>>>> + if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
>>>> + return 0;
>>>> +
>>>> + /* There are cases where there is no MHI client driver matches
>>>> + * this device, so it is not probed or prepared, no need to
>>>> + * do unprepare for it.
>>>> + */
>>>> + if (!mhi_dev->id)
>>>> + return 0;
>>>> +
>>>> + ul_chan = mhi_dev->ul_chan;
>>>> + dl_chan = mhi_dev->dl_chan;
>>>> +
>>>> + /*
>>>> + * If execution environment is specified, remove only those devices that
>>>> + * started in them based on ee_mask for the channels as we move on to a
>>>> + * different execution environment
>>>> + */
>>>> + if (data)
>>>> + ee = *(enum mhi_ee_type *)data;
>>>> +
>>>> + if (ul_chan) {
>>>> + if (ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
>>>> + return 0;
>>>> + }
>>>> +
>>>> + if (dl_chan) {
>>>> + if (ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
>>>> + return 0;
>>>> + }
>>>> +
>>>> + mhi_unprepare_from_transfer(mhi_dev);
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl)
>>>> +{
>>>> + return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
>>>> + ____mhi_unprepare_from_transfer);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(mhi_unprepare_all_from_transfer);
>>>> diff --git a/include/linux/mhi.h b/include/linux/mhi.h
>>>> index ae092bc8b97e..dcf62a57056a 100644
>>>> --- a/include/linux/mhi.h
>>>> +++ b/include/linux/mhi.h
>>>> @@ -668,7 +668,7 @@ static inline void mhi_power_down(struct mhi_controller *mhi_cntrl, bool gracefu
>>>> * destroy struct devices. This is a variant for mhi_power_down() and is a
>>>> * workaround to make it possible to use mhi_power_up() in a resume
>>>> * handler. When using this variant the caller must also call
>>>> - * mhi_prepare_all_for_transfer_autoqueue() and
>>>> + * mhi_prepare_all_for_transfer() and
>>>
>>> This change belongs to previous patch.
>>>
>>>> * mhi_unprepare_all_from_transfer().
>>>> *
>>>> * @mhi_cntrl: MHI controller
>>>> @@ -842,4 +842,22 @@ int mhi_queue_skb(struct mhi_device *mhi_dev, enum dma_data_direction dir,
>>>> */
>>>> bool mhi_queue_is_full(struct mhi_device *mhi_dev, enum dma_data_direction dir);
>>>> +/**
>>>> + * mhi_prepare_all_for_transfer - if you are using
>>>> + * mhi_power_down_no_destroy() variant this needs to be called after
>>>> + * calling mhi_power_up().
>>>
>>> Add info about what this API does also.
>>>
>>>> + *
>>>> + * @mhi_cntrl: MHI controller
>>>> + */
>>>> +int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl);
>>>> +
>>>> +/**
>>>> + * mhi_unprepare_all_from_transfer - if you are using
>>>> + * mhi_power_down_no_destroy() variant this function needs to be called
>>>> + * before calling mhi_power_down_no_destroy().
>>>
>>> Same as above.
>>>
>>> - Mani
>>>
>>>> + *
>>>> + * @mhi_cntrl: MHI controller
>>>> + */
>>>> +int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl);
>>>> +
>>>> #endif /* _MHI_H_ */
>>>> --
>>>> 2.39.2
>>>>
>>>>
>>>
>

2024-02-02 07:22:40

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH RFC v2 2/8] bus: mhi: host: add new interfaces to handle MHI channels directly

On Fri, Feb 02, 2024 at 02:42:58PM +0800, Baochen Qiang wrote:
>
>
> On 2/1/2024 6:00 PM, Manivannan Sadhasivam wrote:
> > On Wed, Jan 31, 2024 at 03:39:26PM +0800, Baochen Qiang wrote:
> > >
> > >
> > > On 1/31/2024 2:19 AM, Manivannan Sadhasivam wrote:
> > > > On Mon, Nov 27, 2023 at 06:20:16PM +0200, Kalle Valo wrote:
> > > > > From: Baochen Qiang <[email protected]>
> > > > >
> > > > > When using mhi_power_down_no_destroy() MHI hosts need to unprepare MHI channels
> > > > > by themselves. Similarly, MHI stack will also not create new MHI device since
> > > > > old devices were not destroyed, so MHI hosts need to prepare channels as well.
> > > > > Hence add these two interfaces to make that possible.
> > > > >
> > > > > Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
> > > > >
> > > > > Signed-off-by: Baochen Qiang <[email protected]>
> > > > > Signed-off-by: Kalle Valo <[email protected]>
> > > > > ---
> > > > > drivers/bus/mhi/host/main.c | 107 ++++++++++++++++++++++++++++++++++++
> > > > > include/linux/mhi.h | 20 ++++++-
> > > > > 2 files changed, 126 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> > > > > index d80975f4bba8..3f677fc628ad 100644
> > > > > --- a/drivers/bus/mhi/host/main.c
> > > > > +++ b/drivers/bus/mhi/host/main.c
> > > > > @@ -1669,6 +1669,58 @@ int mhi_prepare_for_transfer_autoqueue(struct mhi_device *mhi_dev)
> > > > > }
> > > > > EXPORT_SYMBOL_GPL(mhi_prepare_for_transfer_autoqueue);
> > > > > +static int ____mhi_prepare_for_transfer(struct device *dev, void *data)
> > > >
> > > > "__mhi_prepare_all_for_transfer"
> > >
> > > This is to prepare one single child device, I don't think a name like
> > > __mhi_prepare_all_for_transfer (with 'all' inside) make sense, right?
> > > How about changing to "mhi_prepare_dev_for_transfer" or
> > > "mhi_prepare_single_for_transfer"?
> > >
> >
> > I think most of the checks in this function can be moved inside
> > mhi_prepare_for_transfer() API. With that you can just reuse the API without
> > adding a new helper.
> >
> > For autoqueue channels, you can add another API
> > mhi_prepare_all_for_transfer_autoqueue() just like
> > mhi_prepare_for_transfer_autoqueue() to maintain uniformity.
> >
> > - Mani
> If we do that, we need to call two APIs together, does it make sense? From
> the view of an MHI user, what we want is an API to prepare all channels, no
> matter whether a channel is configured as autoqueue or non-autoqueue, we
> don't care about it.
>

You are calling this API from a wrong place first up.
mhi_{prepare/unprepare}_transfer* APIs are meant to be used by the client
drivers like QRTR. Controller drivers should not call them.

What you need here is the hibernation support for QRTR itself and call these
APIs from there.

> And besides, I don't think there is a scenario where we need to use them
> separately. So if we always need to use them together, why not merge them in
> a single API?
>

A single controller driver may expose multiple channels and those will bind to
multiple client drivers. So only the client drivers should manage the channels,
not the controller drivers themselves.

- Mani

> >
> > > >
> > > > > +{
> > > > > + struct mhi_device *mhi_dev;
> > > > > + struct mhi_chan *ul_chan, *dl_chan;
> > > > > + enum mhi_ee_type ee = MHI_EE_MAX;
> > > >
> > > > Reverse Xmas order, please.
> > > >
> > > > > +
> > > > > + if (dev->bus != &mhi_bus_type)
> > > > > + return 0;
> > > > > +
> > > > > + mhi_dev = to_mhi_device(dev);
> > > > > +
> > > > > + /* Only prepare virtual devices that are attached to bus */
> > > >
> > > > "Only prepare virtual devices for the channels". Here and below.
> > > >
> > > > > + if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
> > > > > + return 0;
> > > > > +
> > > > > + /* There are cases where there is no MHI client driver matches
> > > > > + * this device, we are not allowed to do prepare for it.
> > > > > + */
> > > >
> > > > Use the preferred style for comment:
> > > >
> > > > /*
> > > > * ...
> > > > */
> > > >
> > > > > + if (!mhi_dev->id)
> > > > > + return 0;
> > > > > +
> > > > > + ul_chan = mhi_dev->ul_chan;
> > > > > + dl_chan = mhi_dev->dl_chan;
> > > > > +
> > > > > + /*
> > > > > + * If execution environment is specified, remove only those devices that
> > > > > + * started in them based on ee_mask for the channels as we move on to a
> > > > > + * different execution environment
> > > > > + */
> > > > > + if (data)
> > > > > + ee = *(enum mhi_ee_type *)data;
> > > > > +
> > > > > + if (ul_chan && ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
> > > > > + return 0;
> > > > > +
> > > > > +
> > > >
> > > > Remove extra newline.
> > > >
> > > > > + if (dl_chan && ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
> > > > > + return 0;
> > > > > +
> > > > > + if (dl_chan->pre_alloc)
> > > > > + return mhi_prepare_for_transfer_autoqueue(mhi_dev);
> > > > > + else
> > > > > + return mhi_prepare_for_transfer(mhi_dev);
> > > > > +}
> > > > > +
> > > > > +int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl)
> > > > > +{
> > > > > + return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
> > > > > + ____mhi_prepare_for_transfer);
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(mhi_prepare_all_for_transfer);
> > > > > +
> > > > > void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
> > > > > {
> > > > > struct mhi_controller *mhi_cntrl = mhi_dev->mhi_cntrl;
> > > > > @@ -1684,3 +1736,58 @@ void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
> > > > > }
> > > > > }
> > > > > EXPORT_SYMBOL_GPL(mhi_unprepare_from_transfer);
> > > > > +
> > > > > +static int ____mhi_unprepare_from_transfer(struct device *dev, void *data)
> > > >
> > > > __mhi_unprepare_all_from_transfer
> > >
> > > same as above.
> > >
> > > >
> > > > > +{
> > > > > + struct mhi_device *mhi_dev;
> > > > > + struct mhi_chan *ul_chan, *dl_chan;
> > > > > + enum mhi_ee_type ee = MHI_EE_MAX;
> > > > > +
> > > > > + if (dev->bus != &mhi_bus_type)
> > > > > + return 0;
> > > > > +
> > > > > + mhi_dev = to_mhi_device(dev);
> > > > > +
> > > > > + /* Only unprepare virtual devices that are attached to bus */
> > > > > + if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
> > > > > + return 0;
> > > > > +
> > > > > + /* There are cases where there is no MHI client driver matches
> > > > > + * this device, so it is not probed or prepared, no need to
> > > > > + * do unprepare for it.
> > > > > + */
> > > > > + if (!mhi_dev->id)
> > > > > + return 0;
> > > > > +
> > > > > + ul_chan = mhi_dev->ul_chan;
> > > > > + dl_chan = mhi_dev->dl_chan;
> > > > > +
> > > > > + /*
> > > > > + * If execution environment is specified, remove only those devices that
> > > > > + * started in them based on ee_mask for the channels as we move on to a
> > > > > + * different execution environment
> > > > > + */
> > > > > + if (data)
> > > > > + ee = *(enum mhi_ee_type *)data;
> > > > > +
> > > > > + if (ul_chan) {
> > > > > + if (ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
> > > > > + return 0;
> > > > > + }
> > > > > +
> > > > > + if (dl_chan) {
> > > > > + if (ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
> > > > > + return 0;
> > > > > + }
> > > > > +
> > > > > + mhi_unprepare_from_transfer(mhi_dev);
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl)
> > > > > +{
> > > > > + return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
> > > > > + ____mhi_unprepare_from_transfer);
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(mhi_unprepare_all_from_transfer);
> > > > > diff --git a/include/linux/mhi.h b/include/linux/mhi.h
> > > > > index ae092bc8b97e..dcf62a57056a 100644
> > > > > --- a/include/linux/mhi.h
> > > > > +++ b/include/linux/mhi.h
> > > > > @@ -668,7 +668,7 @@ static inline void mhi_power_down(struct mhi_controller *mhi_cntrl, bool gracefu
> > > > > * destroy struct devices. This is a variant for mhi_power_down() and is a
> > > > > * workaround to make it possible to use mhi_power_up() in a resume
> > > > > * handler. When using this variant the caller must also call
> > > > > - * mhi_prepare_all_for_transfer_autoqueue() and
> > > > > + * mhi_prepare_all_for_transfer() and
> > > >
> > > > This change belongs to previous patch.
> > > >
> > > > > * mhi_unprepare_all_from_transfer().
> > > > > *
> > > > > * @mhi_cntrl: MHI controller
> > > > > @@ -842,4 +842,22 @@ int mhi_queue_skb(struct mhi_device *mhi_dev, enum dma_data_direction dir,
> > > > > */
> > > > > bool mhi_queue_is_full(struct mhi_device *mhi_dev, enum dma_data_direction dir);
> > > > > +/**
> > > > > + * mhi_prepare_all_for_transfer - if you are using
> > > > > + * mhi_power_down_no_destroy() variant this needs to be called after
> > > > > + * calling mhi_power_up().
> > > >
> > > > Add info about what this API does also.
> > > >
> > > > > + *
> > > > > + * @mhi_cntrl: MHI controller
> > > > > + */
> > > > > +int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl);
> > > > > +
> > > > > +/**
> > > > > + * mhi_unprepare_all_from_transfer - if you are using
> > > > > + * mhi_power_down_no_destroy() variant this function needs to be called
> > > > > + * before calling mhi_power_down_no_destroy().
> > > >
> > > > Same as above.
> > > >
> > > > - Mani
> > > >
> > > > > + *
> > > > > + * @mhi_cntrl: MHI controller
> > > > > + */
> > > > > +int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl);
> > > > > +
> > > > > #endif /* _MHI_H_ */
> > > > > --
> > > > > 2.39.2
> > > > >
> > > > >
> > > >
> >

--
மணிவண்ணன் சதாசிவம்

2024-02-02 10:49:36

by Baochen Qiang

[permalink] [raw]
Subject: Re: [PATCH RFC v2 2/8] bus: mhi: host: add new interfaces to handle MHI channels directly



On 2/2/2024 3:10 PM, Manivannan Sadhasivam wrote:
> On Fri, Feb 02, 2024 at 02:42:58PM +0800, Baochen Qiang wrote:
>>
>>
>> On 2/1/2024 6:00 PM, Manivannan Sadhasivam wrote:
>>> On Wed, Jan 31, 2024 at 03:39:26PM +0800, Baochen Qiang wrote:
>>>>
>>>>
>>>> On 1/31/2024 2:19 AM, Manivannan Sadhasivam wrote:
>>>>> On Mon, Nov 27, 2023 at 06:20:16PM +0200, Kalle Valo wrote:
>>>>>> From: Baochen Qiang <[email protected]>
>>>>>>
>>>>>> When using mhi_power_down_no_destroy() MHI hosts need to unprepare MHI channels
>>>>>> by themselves. Similarly, MHI stack will also not create new MHI device since
>>>>>> old devices were not destroyed, so MHI hosts need to prepare channels as well.
>>>>>> Hence add these two interfaces to make that possible.
>>>>>>
>>>>>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
>>>>>>
>>>>>> Signed-off-by: Baochen Qiang <[email protected]>
>>>>>> Signed-off-by: Kalle Valo <[email protected]>
>>>>>> ---
>>>>>> drivers/bus/mhi/host/main.c | 107 ++++++++++++++++++++++++++++++++++++
>>>>>> include/linux/mhi.h | 20 ++++++-
>>>>>> 2 files changed, 126 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>>>>>> index d80975f4bba8..3f677fc628ad 100644
>>>>>> --- a/drivers/bus/mhi/host/main.c
>>>>>> +++ b/drivers/bus/mhi/host/main.c
>>>>>> @@ -1669,6 +1669,58 @@ int mhi_prepare_for_transfer_autoqueue(struct mhi_device *mhi_dev)
>>>>>> }
>>>>>> EXPORT_SYMBOL_GPL(mhi_prepare_for_transfer_autoqueue);
>>>>>> +static int ____mhi_prepare_for_transfer(struct device *dev, void *data)
>>>>>
>>>>> "__mhi_prepare_all_for_transfer"
>>>>
>>>> This is to prepare one single child device, I don't think a name like
>>>> __mhi_prepare_all_for_transfer (with 'all' inside) make sense, right?
>>>> How about changing to "mhi_prepare_dev_for_transfer" or
>>>> "mhi_prepare_single_for_transfer"?
>>>>
>>>
>>> I think most of the checks in this function can be moved inside
>>> mhi_prepare_for_transfer() API. With that you can just reuse the API without
>>> adding a new helper.
>>>
>>> For autoqueue channels, you can add another API
>>> mhi_prepare_all_for_transfer_autoqueue() just like
>>> mhi_prepare_for_transfer_autoqueue() to maintain uniformity.
>>>
>>> - Mani
>> If we do that, we need to call two APIs together, does it make sense? From
>> the view of an MHI user, what we want is an API to prepare all channels, no
>> matter whether a channel is configured as autoqueue or non-autoqueue, we
>> don't care about it.
>>
>
> You are calling this API from a wrong place first up.
> mhi_{prepare/unprepare}_transfer* APIs are meant to be used by the client
> drivers like QRTR. Controller drivers should not call them.
>
> What you need here is the hibernation support for QRTR itself and call these
> APIs from there.

OK, I got your point. QRTR is the right place to manage MHI channels,
not ath11k it self.
And we even don't need these two APIs if change to do it in QRTR.

>
>> And besides, I don't think there is a scenario where we need to use them
>> separately. So if we always need to use them together, why not merge them in
>> a single API?
>>
>
> A single controller driver may expose multiple channels and those will bind to
> multiple client drivers. So only the client drivers should manage the channels,
> not the controller drivers themselves.
Exactly.

Great thanks for the proposal, Mani. Will change accordingly in next
version.

>
> - Mani
>
>>>
>>>>>
>>>>>> +{
>>>>>> + struct mhi_device *mhi_dev;
>>>>>> + struct mhi_chan *ul_chan, *dl_chan;
>>>>>> + enum mhi_ee_type ee = MHI_EE_MAX;
>>>>>
>>>>> Reverse Xmas order, please.
>>>>>
>>>>>> +
>>>>>> + if (dev->bus != &mhi_bus_type)
>>>>>> + return 0;
>>>>>> +
>>>>>> + mhi_dev = to_mhi_device(dev);
>>>>>> +
>>>>>> + /* Only prepare virtual devices that are attached to bus */
>>>>>
>>>>> "Only prepare virtual devices for the channels". Here and below.
>>>>>
>>>>>> + if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
>>>>>> + return 0;
>>>>>> +
>>>>>> + /* There are cases where there is no MHI client driver matches
>>>>>> + * this device, we are not allowed to do prepare for it.
>>>>>> + */
>>>>>
>>>>> Use the preferred style for comment:
>>>>>
>>>>> /*
>>>>> * ...
>>>>> */
>>>>>
>>>>>> + if (!mhi_dev->id)
>>>>>> + return 0;
>>>>>> +
>>>>>> + ul_chan = mhi_dev->ul_chan;
>>>>>> + dl_chan = mhi_dev->dl_chan;
>>>>>> +
>>>>>> + /*
>>>>>> + * If execution environment is specified, remove only those devices that
>>>>>> + * started in them based on ee_mask for the channels as we move on to a
>>>>>> + * different execution environment
>>>>>> + */
>>>>>> + if (data)
>>>>>> + ee = *(enum mhi_ee_type *)data;
>>>>>> +
>>>>>> + if (ul_chan && ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
>>>>>> + return 0;
>>>>>> +
>>>>>> +
>>>>>
>>>>> Remove extra newline.
>>>>>
>>>>>> + if (dl_chan && ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
>>>>>> + return 0;
>>>>>> +
>>>>>> + if (dl_chan->pre_alloc)
>>>>>> + return mhi_prepare_for_transfer_autoqueue(mhi_dev);
>>>>>> + else
>>>>>> + return mhi_prepare_for_transfer(mhi_dev);
>>>>>> +}
>>>>>> +
>>>>>> +int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl)
>>>>>> +{
>>>>>> + return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
>>>>>> + ____mhi_prepare_for_transfer);
>>>>>> +}
>>>>>> +EXPORT_SYMBOL_GPL(mhi_prepare_all_for_transfer);
>>>>>> +
>>>>>> void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
>>>>>> {
>>>>>> struct mhi_controller *mhi_cntrl = mhi_dev->mhi_cntrl;
>>>>>> @@ -1684,3 +1736,58 @@ void mhi_unprepare_from_transfer(struct mhi_device *mhi_dev)
>>>>>> }
>>>>>> }
>>>>>> EXPORT_SYMBOL_GPL(mhi_unprepare_from_transfer);
>>>>>> +
>>>>>> +static int ____mhi_unprepare_from_transfer(struct device *dev, void *data)
>>>>>
>>>>> __mhi_unprepare_all_from_transfer
>>>>
>>>> same as above.
>>>>
>>>>>
>>>>>> +{
>>>>>> + struct mhi_device *mhi_dev;
>>>>>> + struct mhi_chan *ul_chan, *dl_chan;
>>>>>> + enum mhi_ee_type ee = MHI_EE_MAX;
>>>>>> +
>>>>>> + if (dev->bus != &mhi_bus_type)
>>>>>> + return 0;
>>>>>> +
>>>>>> + mhi_dev = to_mhi_device(dev);
>>>>>> +
>>>>>> + /* Only unprepare virtual devices that are attached to bus */
>>>>>> + if (mhi_dev->dev_type == MHI_DEVICE_CONTROLLER)
>>>>>> + return 0;
>>>>>> +
>>>>>> + /* There are cases where there is no MHI client driver matches
>>>>>> + * this device, so it is not probed or prepared, no need to
>>>>>> + * do unprepare for it.
>>>>>> + */
>>>>>> + if (!mhi_dev->id)
>>>>>> + return 0;
>>>>>> +
>>>>>> + ul_chan = mhi_dev->ul_chan;
>>>>>> + dl_chan = mhi_dev->dl_chan;
>>>>>> +
>>>>>> + /*
>>>>>> + * If execution environment is specified, remove only those devices that
>>>>>> + * started in them based on ee_mask for the channels as we move on to a
>>>>>> + * different execution environment
>>>>>> + */
>>>>>> + if (data)
>>>>>> + ee = *(enum mhi_ee_type *)data;
>>>>>> +
>>>>>> + if (ul_chan) {
>>>>>> + if (ee != MHI_EE_MAX && !(ul_chan->ee_mask & BIT(ee)))
>>>>>> + return 0;
>>>>>> + }
>>>>>> +
>>>>>> + if (dl_chan) {
>>>>>> + if (ee != MHI_EE_MAX && !(dl_chan->ee_mask & BIT(ee)))
>>>>>> + return 0;
>>>>>> + }
>>>>>> +
>>>>>> + mhi_unprepare_from_transfer(mhi_dev);
>>>>>> +
>>>>>> + return 0;
>>>>>> +}
>>>>>> +
>>>>>> +int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl)
>>>>>> +{
>>>>>> + return device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL,
>>>>>> + ____mhi_unprepare_from_transfer);
>>>>>> +}
>>>>>> +EXPORT_SYMBOL_GPL(mhi_unprepare_all_from_transfer);
>>>>>> diff --git a/include/linux/mhi.h b/include/linux/mhi.h
>>>>>> index ae092bc8b97e..dcf62a57056a 100644
>>>>>> --- a/include/linux/mhi.h
>>>>>> +++ b/include/linux/mhi.h
>>>>>> @@ -668,7 +668,7 @@ static inline void mhi_power_down(struct mhi_controller *mhi_cntrl, bool gracefu
>>>>>> * destroy struct devices. This is a variant for mhi_power_down() and is a
>>>>>> * workaround to make it possible to use mhi_power_up() in a resume
>>>>>> * handler. When using this variant the caller must also call
>>>>>> - * mhi_prepare_all_for_transfer_autoqueue() and
>>>>>> + * mhi_prepare_all_for_transfer() and
>>>>>
>>>>> This change belongs to previous patch.
>>>>>
>>>>>> * mhi_unprepare_all_from_transfer().
>>>>>> *
>>>>>> * @mhi_cntrl: MHI controller
>>>>>> @@ -842,4 +842,22 @@ int mhi_queue_skb(struct mhi_device *mhi_dev, enum dma_data_direction dir,
>>>>>> */
>>>>>> bool mhi_queue_is_full(struct mhi_device *mhi_dev, enum dma_data_direction dir);
>>>>>> +/**
>>>>>> + * mhi_prepare_all_for_transfer - if you are using
>>>>>> + * mhi_power_down_no_destroy() variant this needs to be called after
>>>>>> + * calling mhi_power_up().
>>>>>
>>>>> Add info about what this API does also.
>>>>>
>>>>>> + *
>>>>>> + * @mhi_cntrl: MHI controller
>>>>>> + */
>>>>>> +int mhi_prepare_all_for_transfer(struct mhi_controller *mhi_cntrl);
>>>>>> +
>>>>>> +/**
>>>>>> + * mhi_unprepare_all_from_transfer - if you are using
>>>>>> + * mhi_power_down_no_destroy() variant this function needs to be called
>>>>>> + * before calling mhi_power_down_no_destroy().
>>>>>
>>>>> Same as above.
>>>>>
>>>>> - Mani
>>>>>
>>>>>> + *
>>>>>> + * @mhi_cntrl: MHI controller
>>>>>> + */
>>>>>> +int mhi_unprepare_all_from_transfer(struct mhi_controller *mhi_cntrl);
>>>>>> +
>>>>>> #endif /* _MHI_H_ */
>>>>>> --
>>>>>> 2.39.2
>>>>>>
>>>>>>
>>>>>
>>>
>

2024-02-02 12:16:48

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH RFC v2 2/8] bus: mhi: host: add new interfaces to handle MHI channels directly

On Fri, Feb 02, 2024 at 06:49:19PM +0800, Baochen Qiang wrote:
>
>
> On 2/2/2024 3:10 PM, Manivannan Sadhasivam wrote:
> > On Fri, Feb 02, 2024 at 02:42:58PM +0800, Baochen Qiang wrote:
> > >
> > >
> > > On 2/1/2024 6:00 PM, Manivannan Sadhasivam wrote:
> > > > On Wed, Jan 31, 2024 at 03:39:26PM +0800, Baochen Qiang wrote:
> > > > >
> > > > >
> > > > > On 1/31/2024 2:19 AM, Manivannan Sadhasivam wrote:
> > > > > > On Mon, Nov 27, 2023 at 06:20:16PM +0200, Kalle Valo wrote:
> > > > > > > From: Baochen Qiang <[email protected]>
> > > > > > >
> > > > > > > When using mhi_power_down_no_destroy() MHI hosts need to unprepare MHI channels
> > > > > > > by themselves. Similarly, MHI stack will also not create new MHI device since
> > > > > > > old devices were not destroyed, so MHI hosts need to prepare channels as well.
> > > > > > > Hence add these two interfaces to make that possible.
> > > > > > >
> > > > > > > Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
> > > > > > >
> > > > > > > Signed-off-by: Baochen Qiang <[email protected]>
> > > > > > > Signed-off-by: Kalle Valo <[email protected]>
> > > > > > > ---
> > > > > > > drivers/bus/mhi/host/main.c | 107 ++++++++++++++++++++++++++++++++++++
> > > > > > > include/linux/mhi.h | 20 ++++++-
> > > > > > > 2 files changed, 126 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> > > > > > > index d80975f4bba8..3f677fc628ad 100644
> > > > > > > --- a/drivers/bus/mhi/host/main.c
> > > > > > > +++ b/drivers/bus/mhi/host/main.c
> > > > > > > @@ -1669,6 +1669,58 @@ int mhi_prepare_for_transfer_autoqueue(struct mhi_device *mhi_dev)
> > > > > > > }
> > > > > > > EXPORT_SYMBOL_GPL(mhi_prepare_for_transfer_autoqueue);
> > > > > > > +static int ____mhi_prepare_for_transfer(struct device *dev, void *data)
> > > > > >
> > > > > > "__mhi_prepare_all_for_transfer"
> > > > >
> > > > > This is to prepare one single child device, I don't think a name like
> > > > > __mhi_prepare_all_for_transfer (with 'all' inside) make sense, right?
> > > > > How about changing to "mhi_prepare_dev_for_transfer" or
> > > > > "mhi_prepare_single_for_transfer"?
> > > > >
> > > >
> > > > I think most of the checks in this function can be moved inside
> > > > mhi_prepare_for_transfer() API. With that you can just reuse the API without
> > > > adding a new helper.
> > > >
> > > > For autoqueue channels, you can add another API
> > > > mhi_prepare_all_for_transfer_autoqueue() just like
> > > > mhi_prepare_for_transfer_autoqueue() to maintain uniformity.
> > > >
> > > > - Mani
> > > If we do that, we need to call two APIs together, does it make sense? From
> > > the view of an MHI user, what we want is an API to prepare all channels, no
> > > matter whether a channel is configured as autoqueue or non-autoqueue, we
> > > don't care about it.
> > >
> >
> > You are calling this API from a wrong place first up.
> > mhi_{prepare/unprepare}_transfer* APIs are meant to be used by the client
> > drivers like QRTR. Controller drivers should not call them.
> >
> > What you need here is the hibernation support for QRTR itself and call these
> > APIs from there.
>
> OK, I got your point. QRTR is the right place to manage MHI channels, not
> ath11k it self.
> And we even don't need these two APIs if change to do it in QRTR.
>
> >
> > > And besides, I don't think there is a scenario where we need to use them
> > > separately. So if we always need to use them together, why not merge them in
> > > a single API?
> > >
> >
> > A single controller driver may expose multiple channels and those will bind to
> > multiple client drivers. So only the client drivers should manage the channels,
> > not the controller drivers themselves.
> Exactly.
>
> Great thanks for the proposal, Mani. Will change accordingly in next
> version.
>

And you can drop the RFC tag in the version.

- Mani

--
மணிவண்ணன் சதாசிவம்