This series implements shutdown & reinitialization of the PCIe RC on
system suspend. Tested on 8280-crd.
Signed-off-by: Konrad Dybcio <[email protected]>
---
Konrad Dybcio (4):
PCI: qcom: Reshuffle reset logic in 2_7_0 .init
PCI: qcom: Cache last icc bandwidth
PCI: qcom: Read back PARF_LTSSM register
PCI: qcom: Implement RC shutdown/power up
drivers/pci/controller/dwc/Kconfig | 1 +
drivers/pci/controller/dwc/pcie-qcom.c | 159 ++++++++++++++++++++++++---------
2 files changed, 120 insertions(+), 40 deletions(-)
---
base-commit: 39676dfe52331dba909c617f213fdb21015c8d10
change-id: 20231227-topic-8280_pcie-811c0f92fc1c
Best regards,
--
Konrad Dybcio <[email protected]>
At least on SC8280XP, if the PCIe reset is asserted, the corresponding
AUX_CLK will be stuck at 'off'.
Assert the reset (which may end up being a NOP if it was previously
asserted) and de-assert it back *before* turning on the clocks to avoid
such cases.
In addition to that, in case the clock bulk enable fails, assert the
RC reset back, as the hardware is in an unknown state at best.
Fixes: ed8cc3b1fc84 ("PCI: qcom: Add support for SDM845 PCIe controller")
Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/pci/controller/dwc/pcie-qcom.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 11c80555d975..1c5ab8c4ff39 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -900,27 +900,27 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
return ret;
}
- ret = clk_bulk_prepare_enable(res->num_clks, res->clks);
- if (ret < 0)
- goto err_disable_regulators;
-
+ /* Assert the reset to hold the RC in a known state */
ret = reset_control_assert(res->rst);
if (ret) {
dev_err(dev, "reset assert failed (%d)\n", ret);
- goto err_disable_clocks;
+ goto err_disable_regulators;
}
-
usleep_range(1000, 1500);
+ /* GCC_PCIE_n_AUX_CLK won't come up if the reset is asserted */
ret = reset_control_deassert(res->rst);
if (ret) {
dev_err(dev, "reset deassert failed (%d)\n", ret);
- goto err_disable_clocks;
+ goto err_disable_regulators;
}
-
/* Wait for reset to complete, required on SM8450 */
usleep_range(1000, 1500);
+ ret = clk_bulk_prepare_enable(res->num_clks, res->clks);
+ if (ret < 0)
+ goto err_assert_reset;
+
/* configure PCIe to RC mode */
writel(DEVICE_TYPE_RC, pcie->parf + PARF_DEVICE_TYPE);
@@ -951,8 +951,9 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
writel(val, pcie->parf + PARF_AXI_MSTR_WR_ADDR_HALT_V2);
return 0;
-err_disable_clocks:
- clk_bulk_disable_unprepare(res->num_clks, res->clks);
+
+err_assert_reset:
+ reset_control_assert(res->rst);
err_disable_regulators:
regulator_bulk_disable(ARRAY_SIZE(res->supplies), res->supplies);
--
2.43.0
In preparation for shutting down the RC, cache the last interconnect
bandwidth vote to allow for icc tag setting.
Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/pci/controller/dwc/pcie-qcom.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 1c5ab8c4ff39..a02dc197c495 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -240,6 +240,7 @@ struct qcom_pcie {
struct phy *phy;
struct gpio_desc *reset;
struct icc_path *icc_mem;
+ u32 last_bw;
const struct qcom_pcie_cfg *cfg;
struct dentry *debugfs;
bool suspended;
@@ -1387,6 +1388,8 @@ static int qcom_pcie_icc_init(struct qcom_pcie *pcie)
return ret;
}
+ pcie->last_bw = QCOM_PCIE_LINK_SPEED_TO_BW(1);
+
return 0;
}
@@ -1415,6 +1418,8 @@ static void qcom_pcie_icc_update(struct qcom_pcie *pcie)
dev_err(pci->dev, "failed to set interconnect bandwidth: %d\n",
ret);
}
+
+ pcie->last_bw = width * QCOM_PCIE_LINK_SPEED_TO_BW(speed);
}
static int qcom_pcie_link_transition_count(struct seq_file *s, void *data)
@@ -1578,6 +1583,8 @@ static int qcom_pcie_suspend_noirq(struct device *dev)
return ret;
}
+ pcie->last_bw = kBps_to_icc(1);
+
/*
* Turn OFF the resources only for controllers without active PCIe
* devices. For controllers with active devices, the resources are kept
--
2.43.0
To ensure write completion, read the PARF_LTSSM register after setting
the LTSSM enable bit before polling for "link up".
Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/pci/controller/dwc/pcie-qcom.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index a02dc197c495..3d77269e70da 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -540,6 +540,7 @@ static void qcom_pcie_2_3_2_ltssm_enable(struct qcom_pcie *pcie)
val = readl(pcie->parf + PARF_LTSSM);
val |= LTSSM_EN;
writel(val, pcie->parf + PARF_LTSSM);
+ readl(pcie->parf + PARF_LTSSM);
}
static int qcom_pcie_get_resources_2_3_2(struct qcom_pcie *pcie)
--
2.43.0
Currently, we've only been minimizing the power draw while keeping the
RC up at all times. This is suboptimal, as it draws a whole lot of power
and prevents the SoC from power collapsing.
Implement full shutdown and re-initialization to allow for powering off
the controller.
This is mainly intended for v1_9_0 (sc8280xp), but the hardware is rather
similar across the board. More platform-specific details may be added in
the future as necessary.
Co-developed-by: Bjorn Andersson <[email protected]>
Signed-off-by: Konrad Dybcio <[email protected]>
---
drivers/pci/controller/dwc/Kconfig | 1 +
drivers/pci/controller/dwc/pcie-qcom.c | 132 +++++++++++++++++++++++++--------
2 files changed, 102 insertions(+), 31 deletions(-)
diff --git a/drivers/pci/controller/dwc/Kconfig b/drivers/pci/controller/dwc/Kconfig
index 5ac021dbd46a..591c4109ed62 100644
--- a/drivers/pci/controller/dwc/Kconfig
+++ b/drivers/pci/controller/dwc/Kconfig
@@ -268,6 +268,7 @@ config PCIE_DW_PLAT_EP
config PCIE_QCOM
bool "Qualcomm PCIe controller (host mode)"
depends on OF && (ARCH_QCOM || COMPILE_TEST)
+ depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n
depends on PCI_MSI
select PCIE_DW_HOST
select CRC8
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 3d77269e70da..a9e24fcd1f66 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -30,13 +30,18 @@
#include <linux/reset.h>
#include <linux/slab.h>
#include <linux/types.h>
+#include <soc/qcom/cmd-db.h>
#include "../../pci.h"
#include "pcie-designware.h"
+#include <dt-bindings/interconnect/qcom,icc.h>
+#include <dt-bindings/interconnect/qcom,rpm-icc.h>
+
/* PARF registers */
#define PARF_SYS_CTRL 0x00
#define PARF_PM_CTRL 0x20
+#define PARF_PM_STTS 0x24
#define PARF_PCS_DEEMPH 0x34
#define PARF_PCS_SWING 0x38
#define PARF_PHY_CTRL 0x40
@@ -80,7 +85,10 @@
#define L1_CLK_RMV_DIS BIT(1)
/* PARF_PM_CTRL register fields */
-#define REQ_NOT_ENTR_L1 BIT(5)
+#define REQ_NOT_ENTR_L1 BIT(5) /* "Prevent L0->L1" */
+
+/* PARF_PM_STTS register fields */
+#define PM_ENTER_L23 BIT(5)
/* PARF_PCS_DEEMPH register fields */
#define PCS_DEEMPH_TX_DEEMPH_GEN1(x) FIELD_PREP(GENMASK(21, 16), x)
@@ -122,6 +130,7 @@
/* ELBI_SYS_CTRL register fields */
#define ELBI_SYS_CTRL_LT_ENABLE BIT(0)
+#define ELBI_SYS_CTRL_PME_TURNOFF_MSG BIT(4)
/* AXI_MSTR_RESP_COMP_CTRL0 register fields */
#define CFG_REMOTE_RD_REQ_BRIDGE_SIZE_2K 0x4
@@ -244,6 +253,7 @@ struct qcom_pcie {
const struct qcom_pcie_cfg *cfg;
struct dentry *debugfs;
bool suspended;
+ bool soc_is_rpmh;
};
#define to_qcom_pcie(x) dev_get_drvdata((x)->dev)
@@ -273,6 +283,24 @@ static int qcom_pcie_start_link(struct dw_pcie *pci)
return 0;
}
+static int qcom_pcie_stop_link(struct dw_pcie *pci)
+{
+ struct qcom_pcie *pcie = to_qcom_pcie(pci);
+ u32 ret_l23, val;
+
+ writel(ELBI_SYS_CTRL_PME_TURNOFF_MSG, pcie->elbi + ELBI_SYS_CTRL);
+ readl(pcie->elbi + ELBI_SYS_CTRL);
+
+ ret_l23 = readl_poll_timeout(pcie->parf + PARF_PM_STTS, val,
+ val & PM_ENTER_L23, 10000, 100000);
+ if (ret_l23) {
+ dev_err(pci->dev, "Failed to enter L2/L3\n");
+ return -ETIMEDOUT;
+ }
+
+ return 0;
+}
+
static void qcom_pcie_clear_hpc(struct dw_pcie *pci)
{
u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
@@ -991,9 +1019,19 @@ static void qcom_pcie_host_post_init_2_7_0(struct qcom_pcie *pcie)
static void qcom_pcie_deinit_2_7_0(struct qcom_pcie *pcie)
{
struct qcom_pcie_resources_2_7_0 *res = &pcie->res.v2_7_0;
+ u32 val;
+
+ /* Disable PCIe clocks and resets */
+ val = readl(pcie->parf + PARF_PHY_CTRL);
+ val |= PHY_TEST_PWR_DOWN;
+ writel(val, pcie->parf + PARF_PHY_CTRL);
+ readl(pcie->parf + PARF_PHY_CTRL);
clk_bulk_disable_unprepare(res->num_clks, res->clks);
+ reset_control_assert(res->rst);
+ usleep_range(2000, 2500);
+
regulator_bulk_disable(ARRAY_SIZE(res->supplies), res->supplies);
}
@@ -1553,6 +1591,9 @@ static int qcom_pcie_probe(struct platform_device *pdev)
goto err_phy_exit;
}
+ /* If the soc features RPMh, cmd_db must have been prepared by now */
+ pcie->soc_is_rpmh = !cmd_db_ready();
+
qcom_pcie_icc_update(pcie);
if (pcie->mhi)
@@ -1569,60 +1610,89 @@ static int qcom_pcie_probe(struct platform_device *pdev)
return ret;
}
-static int qcom_pcie_suspend_noirq(struct device *dev)
+static int qcom_pcie_resume_noirq(struct device *dev)
{
struct qcom_pcie *pcie = dev_get_drvdata(dev);
int ret;
/*
- * Set minimum bandwidth required to keep data path functional during
- * suspend.
+ * Undo the tag change from qcom_pcie_suspend_noirq first in case
+ * RPM(h) spontaneously decides to power collapse the platform based
+ * on other inputs.
*/
- ret = icc_set_bw(pcie->icc_mem, 0, kBps_to_icc(1));
+ icc_set_tag(pcie->icc_mem, pcie->soc_is_rpmh ? QCOM_ICC_TAG_ALWAYS : RPM_ALWAYS_TAG);
+ /* Flush the tag change */
+ ret = icc_set_bw(pcie->icc_mem, 0, pcie->last_bw);
if (ret) {
- dev_err(dev, "Failed to set interconnect bandwidth: %d\n", ret);
+ dev_err(pcie->pci->dev, "failed to set interconnect bandwidth: %d\n", ret);
return ret;
}
- pcie->last_bw = kBps_to_icc(1);
+ /* Only check this now to make sure the icc vote is in before going furhter. */
+ if (!pcie->suspended)
+ return 0;
- /*
- * Turn OFF the resources only for controllers without active PCIe
- * devices. For controllers with active devices, the resources are kept
- * ON and the link is expected to be in L0/L1 (sub)states.
- *
- * Turning OFF the resources for controllers with active PCIe devices
- * will trigger access violation during the end of the suspend cycle,
- * as kernel tries to access the PCIe devices config space for masking
- * MSIs.
- *
- * Also, it is not desirable to put the link into L2/L3 state as that
- * implies VDD supply will be removed and the devices may go into
- * powerdown state. This will affect the lifetime of the storage devices
- * like NVMe.
- */
- if (!dw_pcie_link_up(pcie->pci)) {
- qcom_pcie_host_deinit(&pcie->pci->pp);
- pcie->suspended = true;
- }
+ ret = qcom_pcie_host_init(&pcie->pci->pp);
+ if (ret)
+ goto revert_icc_tag;
+
+ dw_pcie_setup_rc(&pcie->pci->pp);
+
+ ret = qcom_pcie_start_link(pcie->pci);
+ if (ret)
+ goto deinit_host;
+
+ /* Ignore the retval, the devices may come up later. */
+ dw_pcie_wait_for_link(pcie->pci);
+
+ qcom_pcie_icc_update(pcie);
+
+ pcie->suspended = false;
return 0;
+
+deinit_host:
+ qcom_pcie_host_deinit(&pcie->pci->pp);
+revert_icc_tag:
+ icc_set_tag(pcie->icc_mem, pcie->soc_is_rpmh ? QCOM_ICC_TAG_WAKE : RPM_ACTIVE_TAG);
+ /* Ignore the retval, failing here would be tragic anyway.. */
+ icc_set_bw(pcie->icc_mem, 0, pcie->last_bw);
+
+ return ret;
}
-static int qcom_pcie_resume_noirq(struct device *dev)
+static int qcom_pcie_suspend_noirq(struct device *dev)
{
struct qcom_pcie *pcie = dev_get_drvdata(dev);
int ret;
- if (pcie->suspended) {
- ret = qcom_pcie_host_init(&pcie->pci->pp);
+ if (pcie->suspended)
+ return 0;
+
+ if (dw_pcie_link_up(pcie->pci)) {
+ ret = qcom_pcie_stop_link(pcie->pci);
if (ret)
return ret;
+ }
- pcie->suspended = false;
+ qcom_pcie_host_deinit(&pcie->pci->pp);
+
+ /*
+ * The PCIe RC may be covertly accessed by the secure firmware on sleep exit.
+ * Use the WAKE bucket to let RPMh pull the plug on PCIe in sleep,
+ * but guarantee it comes back for resume.
+ */
+ icc_set_tag(pcie->icc_mem, pcie->soc_is_rpmh ? QCOM_ICC_TAG_WAKE : RPM_ACTIVE_TAG);
+ /* Flush the tag change */
+ ret = icc_set_bw(pcie->icc_mem, 0, pcie->last_bw);
+ if (ret) {
+ dev_err(pcie->pci->dev, "failed to set interconnect bandwidth: %d\n", ret);
+
+ /* Revert everything and hope the next icc_set_bw goes through.. */
+ return qcom_pcie_resume_noirq(dev);
}
- qcom_pcie_icc_update(pcie);
+ pcie->suspended = true;
return 0;
}
--
2.43.0
Hi Konrad,
kernel test robot noticed the following build errors:
[auto build test ERROR on 39676dfe52331dba909c617f213fdb21015c8d10]
url: https://github.com/intel-lab-lkp/linux/commits/Konrad-Dybcio/PCI-qcom-Reshuffle-reset-logic-in-2_7_0-init/20231228-062002
base: 39676dfe52331dba909c617f213fdb21015c8d10
patch link: https://lore.kernel.org/r/20231227-topic-8280_pcie-v1-4-095491baf9e4%40linaro.org
patch subject: [PATCH 4/4] PCI: qcom: Implement RC shutdown/power up
config: powerpc64-randconfig-002-20231228 (https://download.01.org/0day-ci/archive/20231228/[email protected]/config)
compiler: powerpc64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231228/[email protected]/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
All errors (new ones prefixed by >>):
powerpc64-linux-ld: warning: orphan section `.stubs' from `drivers/dma/fsl-edma-common.o' being placed in section `.stubs'
powerpc64-linux-ld: warning: discarding dynamic section .glink
powerpc64-linux-ld: warning: discarding dynamic section .plt
powerpc64-linux-ld: linkage table error against `cmd_db_ready'
powerpc64-linux-ld: stubs don't match calculated size
powerpc64-linux-ld: can not build stubs: bad value
powerpc64-linux-ld: drivers/pci/controller/dwc/pcie-qcom.o: in function `qcom_pcie_probe':
>> pcie-qcom.c:(.text+0x1894): undefined reference to `cmd_db_ready'
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Hi Konrad,
kernel test robot noticed the following build errors:
[auto build test ERROR on 39676dfe52331dba909c617f213fdb21015c8d10]
url: https://github.com/intel-lab-lkp/linux/commits/Konrad-Dybcio/PCI-qcom-Reshuffle-reset-logic-in-2_7_0-init/20231228-062002
base: 39676dfe52331dba909c617f213fdb21015c8d10
patch link: https://lore.kernel.org/r/20231227-topic-8280_pcie-v1-4-095491baf9e4%40linaro.org
patch subject: [PATCH 4/4] PCI: qcom: Implement RC shutdown/power up
config: arm64-allmodconfig (https://download.01.org/0day-ci/archive/20231229/[email protected]/config)
compiler: clang version 18.0.0git (https://github.com/llvm/llvm-project 8a4266a626914765c0c69839e8a51be383013c1a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231229/[email protected]/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
All errors (new ones prefixed by >>):
>> ld.lld: error: undefined symbol: cmd_db_ready
>>> referenced by pcie-qcom.c
>>> drivers/pci/controller/dwc/pcie-qcom.o:(qcom_pcie_probe) in archive vmlinux.a
--
>> ld.lld: error: undefined symbol: crc8_populate_msb
>>> referenced by pcie-qcom.c
>>> drivers/pci/controller/dwc/pcie-qcom.o:(qcom_pcie_config_sid_1_9_0) in archive vmlinux.a
--
>> ld.lld: error: undefined symbol: crc8
>>> referenced by pcie-qcom.c
>>> drivers/pci/controller/dwc/pcie-qcom.o:(qcom_pcie_config_sid_1_9_0) in archive vmlinux.a
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
On Wed, Dec 27, 2023 at 11:17:19PM +0100, Konrad Dybcio wrote:
> At least on SC8280XP, if the PCIe reset is asserted, the corresponding
> AUX_CLK will be stuck at 'off'.
No, this path is exercised on every boot without the aux clock ever
being stuck at off. So something is clearly missing in this description.
> Assert the reset (which may end up being a NOP if it was previously
> asserted) and de-assert it back *before* turning on the clocks to avoid
> such cases.
>
> In addition to that, in case the clock bulk enable fails, assert the
> RC reset back, as the hardware is in an unknown state at best.
This is arguably a separate change, and not necessarily one that is
correct either, so should at least go in a separate patch if it should
be done at all.
> Fixes: ed8cc3b1fc84 ("PCI: qcom: Add support for SDM845 PCIe controller")
I think you're being way to liberal with your use of Fixes tags. To
claim that this is a bug, you need to make a more convincing case for
why you think so.
Also note Qualcomm's vendor driver is similarly asserting reset after
enabling the clocks.
That driver does not seem to reset the controller on resume, though, in
case that is relevant for your current experiments.
Johan
On 29.12.2023 15:04, Johan Hovold wrote:
> On Wed, Dec 27, 2023 at 11:17:19PM +0100, Konrad Dybcio wrote:
>> At least on SC8280XP, if the PCIe reset is asserted, the corresponding
>> AUX_CLK will be stuck at 'off'.
>
> No, this path is exercised on every boot without the aux clock ever
> being stuck at off. So something is clearly missing in this description.
That's likely because the hardware has been initialized and not cleanly
shut down by your bootloader. When you reset it, or your bootloader
wasn't so kind, you need to start initialization from scratch.
>> Assert the reset (which may end up being a NOP if it was previously
>> asserted) and de-assert it back *before* turning on the clocks to avoid
>> such cases.
>>
>> In addition to that, in case the clock bulk enable fails, assert the
>> RC reset back, as the hardware is in an unknown state at best.
>
> This is arguably a separate change, and not necessarily one that is
> correct either
If the clock enable fails, the PCIe hw is not in reset state, ergo it
may be doing "something", and that "something" would eat non-zero power.
It's just cleaning up after yourself.
> so should at least go in a separate patch if it should
> be done at all.
I'll grumpily comply..
>> Fixes: ed8cc3b1fc84 ("PCI: qcom: Add support for SDM845 PCIe controller")
>
> I think you're being way to liberal with your use of Fixes tags. To
> claim that this is a bug, you need to make a more convincing case for
> why you think so.
The first paragraph describes the issue that this patch fixes.
> Also note Qualcomm's vendor driver is similarly asserting reset after
> enabling the clocks.
It's also not asserting the reset on suspend, see below.
> That driver does not seem to reset the controller on resume, though, in
> case that is relevant for your current experiments.
I know, the vendor driver doesn't fully shut down the controller. This
is however the only sequence that we (partially) have upstream, and the
only one that is going to work on SC8280XP (due to hw design).
On other platforms, a "soft shutdown" (i.e. dropping the link, cutting
clocks but not fully resetting the RC state) should be possible, but
that's not what this patchset concerns.
Konrad
[ Again, please remember to add a newline before you inline comments to
make you replies readable. ]
On Fri, Dec 29, 2023 at 04:01:27PM +0100, Konrad Dybcio wrote:
> On 29.12.2023 15:04, Johan Hovold wrote:
> > On Wed, Dec 27, 2023 at 11:17:19PM +0100, Konrad Dybcio wrote:
> >> At least on SC8280XP, if the PCIe reset is asserted, the corresponding
> >> AUX_CLK will be stuck at 'off'.
> >
> > No, this path is exercised on every boot without the aux clock ever
> > being stuck at off. So something is clearly missing in this description.
> That's likely because the hardware has been initialized and not cleanly
> shut down by your bootloader. When you reset it, or your bootloader
> wasn't so kind, you need to start initialization from scratch.
What does that even mean? I'm telling you that this reset is asserted on
each boot, on all sc8280xp platforms I have access to, and never have I
seen the aux clk stuck at off.
So clearly your claim above is too broad and the commit message is
incorrect or incomplete.
> >> Assert the reset (which may end up being a NOP if it was previously
> >> asserted) and de-assert it back *before* turning on the clocks to avoid
> >> such cases.
> >>
> >> In addition to that, in case the clock bulk enable fails, assert the
> >> RC reset back, as the hardware is in an unknown state at best.
> >
> > This is arguably a separate change, and not necessarily one that is
> > correct either
> If the clock enable fails, the PCIe hw is not in reset state, ergo it
> may be doing "something", and that "something" would eat non-zero power.
> It's just cleaning up after yourself.
How can it do something without power and clocks? And leaving reset
asserted for non-powered devices is generally not a good idea.
> > so should at least go in a separate patch if it should
> > be done at all.
> I'll grumpily comply..
I suggest you leave it deasserted unless you have documentation
suggesting that the opposite is safe and recommended for this piece of
hardware.
> >> Fixes: ed8cc3b1fc84 ("PCI: qcom: Add support for SDM845 PCIe controller")
> >
> > I think you're being way to liberal with your use of Fixes tags. To
> > claim that this is a bug, you need to make a more convincing case for
> > why you think so.
> The first paragraph describes the issue that this patch fixes.
Yes, but this is all very hand-wavy so far. With a complete commit
message I may agree, but you still haven't convinced me that this is a
bug and not just a workaround from some not fully-understood issue on
one particular platform.
> > Also note Qualcomm's vendor driver is similarly asserting reset after
> > enabling the clocks.
> It's also not asserting the reset on suspend, see below.
Right, as I mentioned.
> > That driver does not seem to reset the controller on resume, though, in
> > case that is relevant for your current experiments.
> I know, the vendor driver doesn't fully shut down the controller. This
> is however the only sequence that we (partially) have upstream, and the
> only one that is going to work on SC8280XP (due to hw design).
>
> On other platforms, a "soft shutdown" (i.e. dropping the link, cutting
> clocks but not fully resetting the RC state) should be possible, but
> that's not what this patchset concerns.
The commit message does not even mention suspend, it just makes a
clearly false general claim about a clock being stuck unless you reorder
things.
Johan
On Fri, Dec 29, 2023 at 03:04:23PM +0100, Johan Hovold wrote:
> On Wed, Dec 27, 2023 at 11:17:19PM +0100, Konrad Dybcio wrote:
> ...
> This is arguably a separate change, and not necessarily one that is
> correct either, so should at least go in a separate patch if it should
> be done at all.
A nice side effect of splitting might be that it would be a chance to
put a little more specific information in the subject lines.
"Reshuffle reset logic" by itself doesn't connect it to a specific
issue or reason for the change.
Bjorn
On 29.12.2023 16:29, Johan Hovold wrote:
> [ Again, please remember to add a newline before you inline comments to
> make you replies readable. ]
>
> On Fri, Dec 29, 2023 at 04:01:27PM +0100, Konrad Dybcio wrote:
>> On 29.12.2023 15:04, Johan Hovold wrote:
>>> On Wed, Dec 27, 2023 at 11:17:19PM +0100, Konrad Dybcio wrote:
>>>> At least on SC8280XP, if the PCIe reset is asserted, the corresponding
>>>> AUX_CLK will be stuck at 'off'.
>>>
>>> No, this path is exercised on every boot without the aux clock ever
>>> being stuck at off. So something is clearly missing in this description.
>
>> That's likely because the hardware has been initialized and not cleanly
>> shut down by your bootloader. When you reset it, or your bootloader
>> wasn't so kind, you need to start initialization from scratch.
>
> What does that even mean? I'm telling you that this reset is asserted on
> each boot, on all sc8280xp platforms I have access to, and never have I
> seen the aux clk stuck at off.
>
> So clearly your claim above is too broad and the commit message is
> incorrect or incomplete.
diff --git a/drivers/clk/qcom/gcc-sc8280xp.c b/drivers/clk/qcom/gcc-sc8280xp.c
index 0b7801971dc1..6650bd6af5e3 100644
--- a/drivers/clk/qcom/gcc-sc8280xp.c
+++ b/drivers/clk/qcom/gcc-sc8280xp.c
@@ -7566,6 +7566,18 @@ static int gcc_sc8280xp_probe(struct platform_device *pdev)
if (ret)
goto err_put_rpm;
+ int val;
+ regmap_read(regmap, 0xa0000, &val);
+ pr_err("GCC_PCIE_3A_BCR = 0x%x\n", val);
+ regmap_read(regmap, 0xa00f0, &val);
+ pr_err("GCC_PCIE_3A_LINK_DOWN_BCR = 0x%x\n", val);
+ regmap_read(regmap, 0xa00fc, &val);
+ pr_err("GCC_PCIE_3A_NOCSR_COM_PHY_BCR = 0x%x\n", val);
+ regmap_read(regmap, 0xa00e0, &val);
+ pr_err("GCC_PCIE_3A_PHY_BCR = 0x%x\n", val);
+ regmap_read(regmap, 0xa00e4, &val);
+ pr_err("GCC_PCIE_3A_PHY_NOCSR_COM_PHY_BCR = 0x%x\n", val);
+
pm_runtime_put(&pdev->dev);
return 0;
[root@sc8280xp-crd ~]# dmesg | grep BCR
[ 2.500245] GCC_PCIE_3A_BCR = 0x0
[ 2.500250] GCC_PCIE_3A_LINK_DOWN_BCR = 0x0
[ 2.500253] GCC_PCIE_3A_NOCSR_COM_PHY_BCR = 0x0
[ 2.500255] GCC_PCIE_3A_PHY_BCR = 0x0
[ 2.500257] GCC_PCIE_3A_PHY_NOCSR_COM_PHY_BCR = 0x0
0 meaning "not asserted".
Adding the read in the GCC driver .probe ensures we get the
unmodified data, as all GCC consumers must wait for it to probe.
PCIE3A is used for WLAN on the CRD, btw.
>
>>>> Assert the reset (which may end up being a NOP if it was previously
>>>> asserted) and de-assert it back *before* turning on the clocks to avoid
>>>> such cases.
>>>>
>>>> In addition to that, in case the clock bulk enable fails, assert the
>>>> RC reset back, as the hardware is in an unknown state at best.
>>>
>>> This is arguably a separate change, and not necessarily one that is
>>> correct either
>
>> If the clock enable fails, the PCIe hw is not in reset state, ergo it
>> may be doing "something", and that "something" would eat non-zero power.
>> It's just cleaning up after yourself.
>
> How can it do something without power and clocks?
Fair point.
As far as power goes, the RC hangs off CX, which is on whenever the
system is not in power collapse. As for clocks, at least parts of it
use the crystal oscillator, not sure if directly.
> And leaving reset
> asserted for non-powered devices is generally not a good idea.
Depends on the hw.
>
>>> so should at least go in a separate patch if it should
>>> be done at all.
>
>> I'll grumpily comply..
>
> I suggest you leave it deasserted unless you have documentation
> suggesting that the opposite is safe and recommended for this piece of
> hardware.
>
>>>> Fixes: ed8cc3b1fc84 ("PCI: qcom: Add support for SDM845 PCIe controller")
>>>
>>> I think you're being way to liberal with your use of Fixes tags. To
>>> claim that this is a bug, you need to make a more convincing case for
>>> why you think so.
>
>> The first paragraph describes the issue that this patch fixes.
>
> Yes, but this is all very hand-wavy so far. With a complete commit
> message I may agree, but you still haven't convinced me that this is a
> bug and not just a workaround from some not fully-understood issue on
> one particular platform.
Right, reading it again, it doesn't really tell the whole story.
>
>>> Also note Qualcomm's vendor driver is similarly asserting reset after
>>> enabling the clocks.
>
>> It's also not asserting the reset on suspend, see below.
>
> Right, as I mentioned.
>
>>> That driver does not seem to reset the controller on resume, though, in
>>> case that is relevant for your current experiments.
>
>> I know, the vendor driver doesn't fully shut down the controller. This
>> is however the only sequence that we (partially) have upstream, and the
>> only one that is going to work on SC8280XP (due to hw design).
>>
>> On other platforms, a "soft shutdown" (i.e. dropping the link, cutting
>> clocks but not fully resetting the RC state) should be possible, but
>> that's not what this patchset concerns.
>
> The commit message does not even mention suspend, it just makes a
> clearly false general claim about a clock being stuck unless you reorder
> things.
No, I insist that this general statement, while indeed lacking a full
description of the problem, is provably true. The AUX clock will not
turn on if the PCIe reset is asserted, at least on SC8280XP.
Konrad
On 29.12.2023 16:46, Bjorn Helgaas wrote:
> On Fri, Dec 29, 2023 at 03:04:23PM +0100, Johan Hovold wrote:
>> On Wed, Dec 27, 2023 at 11:17:19PM +0100, Konrad Dybcio wrote:
>> ...
>
>> This is arguably a separate change, and not necessarily one that is
>> correct either, so should at least go in a separate patch if it should
>> be done at all.
>
> A nice side effect of splitting might be that it would be a chance to
> put a little more specific information in the subject lines.
> "Reshuffle reset logic" by itself doesn't connect it to a specific
> issue or reason for the change.
Yes, sorry, that's on me.
I've been deep inside this topic recently and many things on the QC
side are quite obvious to me, but I often keep forgetting that I
need to externalize that knowledge in commit messages properly..
Konrad
On Sat, Dec 30, 2023 at 02:16:18AM +0100, Konrad Dybcio wrote:
> On 29.12.2023 16:29, Johan Hovold wrote:
> > [ Again, please remember to add a newline before you inline comments to
> > make you replies readable. ]
> >
> > On Fri, Dec 29, 2023 at 04:01:27PM +0100, Konrad Dybcio wrote:
> >> On 29.12.2023 15:04, Johan Hovold wrote:
> >>> On Wed, Dec 27, 2023 at 11:17:19PM +0100, Konrad Dybcio wrote:
> >>>> At least on SC8280XP, if the PCIe reset is asserted, the corresponding
> >>>> AUX_CLK will be stuck at 'off'.
> >>>
> >>> No, this path is exercised on every boot without the aux clock ever
> >>> being stuck at off. So something is clearly missing in this description.
> >
> >> That's likely because the hardware has been initialized and not cleanly
> >> shut down by your bootloader. When you reset it, or your bootloader
> >> wasn't so kind, you need to start initialization from scratch.
> >
> > What does that even mean? I'm telling you that this reset is asserted on
> > each boot, on all sc8280xp platforms I have access to, and never have I
> > seen the aux clk stuck at off.
> >
> > So clearly your claim above is too broad and the commit message is
> > incorrect or incomplete.
>
> diff --git a/drivers/clk/qcom/gcc-sc8280xp.c b/drivers/clk/qcom/gcc-sc8280xp.c
> index 0b7801971dc1..6650bd6af5e3 100644
> --- a/drivers/clk/qcom/gcc-sc8280xp.c
> +++ b/drivers/clk/qcom/gcc-sc8280xp.c
> @@ -7566,6 +7566,18 @@ static int gcc_sc8280xp_probe(struct platform_device *pdev)
> if (ret)
> goto err_put_rpm;
>
> + int val;
> + regmap_read(regmap, 0xa0000, &val);
> + pr_err("GCC_PCIE_3A_BCR = 0x%x\n", val);
> + regmap_read(regmap, 0xa00f0, &val);
> + pr_err("GCC_PCIE_3A_LINK_DOWN_BCR = 0x%x\n", val);
> + regmap_read(regmap, 0xa00fc, &val);
> + pr_err("GCC_PCIE_3A_NOCSR_COM_PHY_BCR = 0x%x\n", val);
> + regmap_read(regmap, 0xa00e0, &val);
> + pr_err("GCC_PCIE_3A_PHY_BCR = 0x%x\n", val);
> + regmap_read(regmap, 0xa00e4, &val);
> + pr_err("GCC_PCIE_3A_PHY_NOCSR_COM_PHY_BCR = 0x%x\n", val);
> +
> pm_runtime_put(&pdev->dev);
>
> return 0;
>
>
> [root@sc8280xp-crd ~]# dmesg | grep BCR
> [ 2.500245] GCC_PCIE_3A_BCR = 0x0
> [ 2.500250] GCC_PCIE_3A_LINK_DOWN_BCR = 0x0
> [ 2.500253] GCC_PCIE_3A_NOCSR_COM_PHY_BCR = 0x0
> [ 2.500255] GCC_PCIE_3A_PHY_BCR = 0x0
> [ 2.500257] GCC_PCIE_3A_PHY_NOCSR_COM_PHY_BCR = 0x0
>
>
> 0 meaning "not asserted".
>
> Adding the read in the GCC driver .probe ensures we get the
> unmodified data, as all GCC consumers must wait for it to probe.
>
> PCIE3A is used for WLAN on the CRD, btw.
>
I get what you are trying to do, but I should say that your justification so far
didn't do the justice.
Your point is that, if the PCIe BCR is asserted during boot (which can happen if
the bootloader didn't initialize PCIe for things like storage), then trying to
enable clocks will result in the "clk stuck" error from GCC. But in the case of
sc8280xp, bootloader would've already initialized PCIe during boot as it is uses
NVMe for things like firmware. But when you do a full power down during suspend
(CX power collapse), then while reinitializing the controller during resume, you
are hitting the "clk stuck" error because at that time, the BCR reset would be
asserted (POR value).
But I really suspect if that is the case... Because, the same init function is
being used by other SoCs (sm8150, sm8250, etc... even sdx55) and I'm pretty sure
that in those SoCs, the bootloader wouldn't have initialized PCIe during boot as
there is no use case.
So I cross checked it on SM8450, but I saw the BCR status being "0" during boot
(same as on sc8280xp). Then I checked the HPG and came to know that when the
PCIe GDSC is uncollapsed, some of the BCRs would be deasserted in the back.
Though it mentioned only PHY_BCR and not the PCIE_n_BCR. But I think the
behavior might be same for both. You can verify it by printing the state of all
BCRs during resume from suspend. This will give us some clue...
> >
> >>>> Assert the reset (which may end up being a NOP if it was previously
> >>>> asserted) and de-assert it back *before* turning on the clocks to avoid
> >>>> such cases.
> >>>>
> >>>> In addition to that, in case the clock bulk enable fails, assert the
> >>>> RC reset back, as the hardware is in an unknown state at best.
> >>>
> >>> This is arguably a separate change, and not necessarily one that is
> >>> correct either
> >
> >> If the clock enable fails, the PCIe hw is not in reset state, ergo it
> >> may be doing "something", and that "something" would eat non-zero power.
> >> It's just cleaning up after yourself.
> >
> > How can it do something without power and clocks?
>
> Fair point.
>
> As far as power goes, the RC hangs off CX, which is on whenever the
> system is not in power collapse. As for clocks, at least parts of it
> use the crystal oscillator, not sure if directly.
>
> > And leaving reset
> > asserted for non-powered devices is generally not a good idea.
>
> Depends on the hw.
>
I do not have any strong argument here as there are too many things happening
that determines whether the controller is properly powered or not. So I'd say
that if you do not have any power numbers, it is best to leave it as it is.
- Mani
> >
> >>> so should at least go in a separate patch if it should
> >>> be done at all.
> >
> >> I'll grumpily comply..
> >
> > I suggest you leave it deasserted unless you have documentation
> > suggesting that the opposite is safe and recommended for this piece of
> > hardware.
> >
> >>>> Fixes: ed8cc3b1fc84 ("PCI: qcom: Add support for SDM845 PCIe controller")
> >>>
> >>> I think you're being way to liberal with your use of Fixes tags. To
> >>> claim that this is a bug, you need to make a more convincing case for
> >>> why you think so.
> >
> >> The first paragraph describes the issue that this patch fixes.
> >
> > Yes, but this is all very hand-wavy so far. With a complete commit
> > message I may agree, but you still haven't convinced me that this is a
> > bug and not just a workaround from some not fully-understood issue on
> > one particular platform.
>
> Right, reading it again, it doesn't really tell the whole story.
>
> >
> >>> Also note Qualcomm's vendor driver is similarly asserting reset after
> >>> enabling the clocks.
> >
> >> It's also not asserting the reset on suspend, see below.
> >
> > Right, as I mentioned.
> >
> >>> That driver does not seem to reset the controller on resume, though, in
> >>> case that is relevant for your current experiments.
> >
> >> I know, the vendor driver doesn't fully shut down the controller. This
> >> is however the only sequence that we (partially) have upstream, and the
> >> only one that is going to work on SC8280XP (due to hw design).
> >>
> >> On other platforms, a "soft shutdown" (i.e. dropping the link, cutting
> >> clocks but not fully resetting the RC state) should be possible, but
> >> that's not what this patchset concerns.
> >
> > The commit message does not even mention suspend, it just makes a
> > clearly false general claim about a clock being stuck unless you reorder
> > things.
>
> No, I insist that this general statement, while indeed lacking a full
> description of the problem, is provably true. The AUX clock will not
> turn on if the PCIe reset is asserted, at least on SC8280XP.
>
> Konrad
--
மணிவண்ணன் சதாசிவம்
On Sat, Dec 30, 2023 at 02:16:18AM +0100, Konrad Dybcio wrote:
> On 29.12.2023 16:29, Johan Hovold wrote:
> > [ Again, please remember to add a newline before you inline comments to
> > make you replies readable. ]
> >
> > On Fri, Dec 29, 2023 at 04:01:27PM +0100, Konrad Dybcio wrote:
> >> On 29.12.2023 15:04, Johan Hovold wrote:
> >>> On Wed, Dec 27, 2023 at 11:17:19PM +0100, Konrad Dybcio wrote:
> >>>> At least on SC8280XP, if the PCIe reset is asserted, the corresponding
> >>>> AUX_CLK will be stuck at 'off'.
> >>>
> >>> No, this path is exercised on every boot without the aux clock ever
> >>> being stuck at off. So something is clearly missing in this description.
> >
> >> That's likely because the hardware has been initialized and not cleanly
> >> shut down by your bootloader. When you reset it, or your bootloader
> >> wasn't so kind, you need to start initialization from scratch.
> >
> > What does that even mean? I'm telling you that this reset is asserted on
> > each boot, on all sc8280xp platforms I have access to, and never have I
> > seen the aux clk stuck at off.
> >
> > So clearly your claim above is too broad and the commit message is
> > incorrect or incomplete.
>
> diff --git a/drivers/clk/qcom/gcc-sc8280xp.c b/drivers/clk/qcom/gcc-sc8280xp.c
> index 0b7801971dc1..6650bd6af5e3 100644
> --- a/drivers/clk/qcom/gcc-sc8280xp.c
> +++ b/drivers/clk/qcom/gcc-sc8280xp.c
> @@ -7566,6 +7566,18 @@ static int gcc_sc8280xp_probe(struct platform_device *pdev)
> if (ret)
> goto err_put_rpm;
>
> + int val;
> + regmap_read(regmap, 0xa0000, &val);
> + pr_err("GCC_PCIE_3A_BCR = 0x%x\n", val);
> + regmap_read(regmap, 0xa00f0, &val);
> + pr_err("GCC_PCIE_3A_LINK_DOWN_BCR = 0x%x\n", val);
> + regmap_read(regmap, 0xa00fc, &val);
> + pr_err("GCC_PCIE_3A_NOCSR_COM_PHY_BCR = 0x%x\n", val);
> + regmap_read(regmap, 0xa00e0, &val);
> + pr_err("GCC_PCIE_3A_PHY_BCR = 0x%x\n", val);
> + regmap_read(regmap, 0xa00e4, &val);
> + pr_err("GCC_PCIE_3A_PHY_NOCSR_COM_PHY_BCR = 0x%x\n", val);
> +
> pm_runtime_put(&pdev->dev);
>
> return 0;
>
>
> [root@sc8280xp-crd ~]# dmesg | grep BCR
> [ 2.500245] GCC_PCIE_3A_BCR = 0x0
> [ 2.500250] GCC_PCIE_3A_LINK_DOWN_BCR = 0x0
> [ 2.500253] GCC_PCIE_3A_NOCSR_COM_PHY_BCR = 0x0
> [ 2.500255] GCC_PCIE_3A_PHY_BCR = 0x0
> [ 2.500257] GCC_PCIE_3A_PHY_NOCSR_COM_PHY_BCR = 0x0
>
>
> 0 meaning "not asserted".
We're clearly talking past each other. When I'm saying reset is asserted
on each boot, I'm referring to reset being asserted in
qcom_pcie_init_2_7_0(), whereas you appear to be referring to whether
the reset has been left asserted by the bootloader when the driver
probes.
I understand what you meant to say now, but I think you should rephrase:
At least on SC8280XP, if the PCIe reset is asserted, the
corresponding AUX_CLK will be stuck at 'off'.
because as it stands, it sounds like the problem happens when the driver
asserts reset.
> PCIE3A is used for WLAN on the CRD, btw.
You meant to say WWAN (modem).
> >>>> Assert the reset (which may end up being a NOP if it was previously
> >>>> asserted) and de-assert it back *before* turning on the clocks to avoid
> >>>> such cases.
> >>>>
> >>>> In addition to that, in case the clock bulk enable fails, assert the
> >>>> RC reset back, as the hardware is in an unknown state at best.
> >>>
> >>> This is arguably a separate change, and not necessarily one that is
> >>> correct either
> >
> >> If the clock enable fails, the PCIe hw is not in reset state, ergo it
> >> may be doing "something", and that "something" would eat non-zero power.
> >> It's just cleaning up after yourself.
> >
> > How can it do something without power and clocks?
>
> Fair point.
>
> As far as power goes, the RC hangs off CX, which is on whenever the
> system is not in power collapse. As for clocks, at least parts of it
> use the crystal oscillator, not sure if directly.
>
> > And leaving reset
> > asserted for non-powered devices is generally not a good idea.
>
> Depends on the hw.
That's why I said "generally".
> > The commit message does not even mention suspend, it just makes a
> > clearly false general claim about a clock being stuck unless you reorder
> > things.
>
> No, I insist that this general statement, while indeed lacking a full
> description of the problem, is provably true. The AUX clock will not
> turn on if the PCIe reset is asserted, at least on SC8280XP.
I agree, I was misinterpreting the commit message.
Johan
On 2.01.2024 11:17, Johan Hovold wrote:
> On Sat, Dec 30, 2023 at 02:16:18AM +0100, Konrad Dybcio wrote:
>> On 29.12.2023 16:29, Johan Hovold wrote:
>>> [ Again, please remember to add a newline before you inline comments to
>>> make you replies readable. ]
>>>
>>> On Fri, Dec 29, 2023 at 04:01:27PM +0100, Konrad Dybcio wrote:
>>>> On 29.12.2023 15:04, Johan Hovold wrote:
>>>>> On Wed, Dec 27, 2023 at 11:17:19PM +0100, Konrad Dybcio wrote:
>>>>>> At least on SC8280XP, if the PCIe reset is asserted, the corresponding
>>>>>> AUX_CLK will be stuck at 'off'.
>>>>>
>>>>> No, this path is exercised on every boot without the aux clock ever
>>>>> being stuck at off. So something is clearly missing in this description.
>>>
>>>> That's likely because the hardware has been initialized and not cleanly
>>>> shut down by your bootloader. When you reset it, or your bootloader
>>>> wasn't so kind, you need to start initialization from scratch.
>>>
>>> What does that even mean? I'm telling you that this reset is asserted on
>>> each boot, on all sc8280xp platforms I have access to, and never have I
>>> seen the aux clk stuck at off.
>>>
>>> So clearly your claim above is too broad and the commit message is
>>> incorrect or incomplete.
>>
>> diff --git a/drivers/clk/qcom/gcc-sc8280xp.c b/drivers/clk/qcom/gcc-sc8280xp.c
>> index 0b7801971dc1..6650bd6af5e3 100644
>> --- a/drivers/clk/qcom/gcc-sc8280xp.c
>> +++ b/drivers/clk/qcom/gcc-sc8280xp.c
>> @@ -7566,6 +7566,18 @@ static int gcc_sc8280xp_probe(struct platform_device *pdev)
>> if (ret)
>> goto err_put_rpm;
>>
>> + int val;
>> + regmap_read(regmap, 0xa0000, &val);
>> + pr_err("GCC_PCIE_3A_BCR = 0x%x\n", val);
>> + regmap_read(regmap, 0xa00f0, &val);
>> + pr_err("GCC_PCIE_3A_LINK_DOWN_BCR = 0x%x\n", val);
>> + regmap_read(regmap, 0xa00fc, &val);
>> + pr_err("GCC_PCIE_3A_NOCSR_COM_PHY_BCR = 0x%x\n", val);
>> + regmap_read(regmap, 0xa00e0, &val);
>> + pr_err("GCC_PCIE_3A_PHY_BCR = 0x%x\n", val);
>> + regmap_read(regmap, 0xa00e4, &val);
>> + pr_err("GCC_PCIE_3A_PHY_NOCSR_COM_PHY_BCR = 0x%x\n", val);
>> +
>> pm_runtime_put(&pdev->dev);
>>
>> return 0;
>>
>>
>> [root@sc8280xp-crd ~]# dmesg | grep BCR
>> [ 2.500245] GCC_PCIE_3A_BCR = 0x0
>> [ 2.500250] GCC_PCIE_3A_LINK_DOWN_BCR = 0x0
>> [ 2.500253] GCC_PCIE_3A_NOCSR_COM_PHY_BCR = 0x0
>> [ 2.500255] GCC_PCIE_3A_PHY_BCR = 0x0
>> [ 2.500257] GCC_PCIE_3A_PHY_NOCSR_COM_PHY_BCR = 0x0
>>
>>
>> 0 meaning "not asserted".
>
> We're clearly talking past each other. When I'm saying reset is asserted
> on each boot, I'm referring to reset being asserted in
> qcom_pcie_init_2_7_0(), whereas you appear to be referring to whether
> the reset has been left asserted by the bootloader when the driver
> probes.
OK, "boot" meant "booting the device" to me, not the PCIe controller.
>
> I understand what you meant to say now, but I think you should rephrase:
>
> At least on SC8280XP, if the PCIe reset is asserted, the
> corresponding AUX_CLK will be stuck at 'off'.
>
> because as it stands, it sounds like the problem happens when the driver
> asserts reset.
Does this sound good?
"At least on SC8280XP, trying to enable the AUX_CLK associated with
a PCIe host fails if the corresponding PCIe reset is asserted."
>
>> PCIE3A is used for WLAN on the CRD, btw.
>
> You meant to say WWAN (modem).
Right :)
>
>>>>>> Assert the reset (which may end up being a NOP if it was previously
>>>>>> asserted) and de-assert it back *before* turning on the clocks to avoid
>>>>>> such cases.
>>>>>>
>>>>>> In addition to that, in case the clock bulk enable fails, assert the
>>>>>> RC reset back, as the hardware is in an unknown state at best.
>>>>>
>>>>> This is arguably a separate change, and not necessarily one that is
>>>>> correct either
>>>
>>>> If the clock enable fails, the PCIe hw is not in reset state, ergo it
>>>> may be doing "something", and that "something" would eat non-zero power.
>>>> It's just cleaning up after yourself.
>>>
>>> How can it do something without power and clocks?
>>
>> Fair point.
>>
>> As far as power goes, the RC hangs off CX, which is on whenever the
>> system is not in power collapse. As for clocks, at least parts of it
>> use the crystal oscillator, not sure if directly.
>>
>>> And leaving reset
>>> asserted for non-powered devices is generally not a good idea.
>>
>> Depends on the hw.
>
> That's why I said "generally".
I'll try to get a proper answer for this, or otherwise see if there's
any change in power draw / functionality.
Konrad
On Wed, Dec 27, 2023 at 11:17:21PM +0100, Konrad Dybcio wrote:
> To ensure write completion, read the PARF_LTSSM register after setting
> the LTSSM enable bit before polling for "link up".
>
> Signed-off-by: Konrad Dybcio <[email protected]>
I'd consider this as a bug since if the LTSSM write gets cached in Write Buffer,
then the polling time becomes wrong in dw_pcie_wait_for_link(), leading to
false link_up failure.
Although both of the write/read (LTSSM in qcom_pcie_2_3_2_ltssm_enable(), and
PCI_EXP_LNKSTA in qcom_pcie_link_up()) belong to PCIe domain, they belong to
different regions (PARF, DBI). So I'm not sure we can safely ignore the write
completion issue. So,
Reviewed-by: Manivannan Sadhasivam <[email protected]>
- Mani
> ---
> drivers/pci/controller/dwc/pcie-qcom.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> index a02dc197c495..3d77269e70da 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> @@ -540,6 +540,7 @@ static void qcom_pcie_2_3_2_ltssm_enable(struct qcom_pcie *pcie)
> val = readl(pcie->parf + PARF_LTSSM);
> val |= LTSSM_EN;
> writel(val, pcie->parf + PARF_LTSSM);
> + readl(pcie->parf + PARF_LTSSM);
> }
>
> static int qcom_pcie_get_resources_2_3_2(struct qcom_pcie *pcie)
>
> --
> 2.43.0
>
--
மணிவண்ணன் சதாசிவம்
On Wed, Dec 27, 2023 at 11:17:22PM +0100, Konrad Dybcio wrote:
> Currently, we've only been minimizing the power draw while keeping the
> RC up at all times. This is suboptimal, as it draws a whole lot of power
> and prevents the SoC from power collapsing.
>
> Implement full shutdown and re-initialization to allow for powering off
> the controller.
>
> This is mainly intended for v1_9_0 (sc8280xp), but the hardware is rather
> similar across the board. More platform-specific details may be added in
> the future as necessary.
>
> Co-developed-by: Bjorn Andersson <[email protected]>
There should be s-o-b for Bjorn also.
> Signed-off-by: Konrad Dybcio <[email protected]>
Before going into the patch, I should ask you how you are dealing with properly
powering down the PCIe device drivers before pulling the plug here.
- Mani
> ---
> drivers/pci/controller/dwc/Kconfig | 1 +
> drivers/pci/controller/dwc/pcie-qcom.c | 132 +++++++++++++++++++++++++--------
> 2 files changed, 102 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/pci/controller/dwc/Kconfig b/drivers/pci/controller/dwc/Kconfig
> index 5ac021dbd46a..591c4109ed62 100644
> --- a/drivers/pci/controller/dwc/Kconfig
> +++ b/drivers/pci/controller/dwc/Kconfig
> @@ -268,6 +268,7 @@ config PCIE_DW_PLAT_EP
> config PCIE_QCOM
> bool "Qualcomm PCIe controller (host mode)"
> depends on OF && (ARCH_QCOM || COMPILE_TEST)
> + depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n
> depends on PCI_MSI
> select PCIE_DW_HOST
> select CRC8
> diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> index 3d77269e70da..a9e24fcd1f66 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> @@ -30,13 +30,18 @@
> #include <linux/reset.h>
> #include <linux/slab.h>
> #include <linux/types.h>
> +#include <soc/qcom/cmd-db.h>
>
> #include "../../pci.h"
> #include "pcie-designware.h"
>
> +#include <dt-bindings/interconnect/qcom,icc.h>
> +#include <dt-bindings/interconnect/qcom,rpm-icc.h>
> +
> /* PARF registers */
> #define PARF_SYS_CTRL 0x00
> #define PARF_PM_CTRL 0x20
> +#define PARF_PM_STTS 0x24
> #define PARF_PCS_DEEMPH 0x34
> #define PARF_PCS_SWING 0x38
> #define PARF_PHY_CTRL 0x40
> @@ -80,7 +85,10 @@
> #define L1_CLK_RMV_DIS BIT(1)
>
> /* PARF_PM_CTRL register fields */
> -#define REQ_NOT_ENTR_L1 BIT(5)
> +#define REQ_NOT_ENTR_L1 BIT(5) /* "Prevent L0->L1" */
> +
> +/* PARF_PM_STTS register fields */
> +#define PM_ENTER_L23 BIT(5)
>
> /* PARF_PCS_DEEMPH register fields */
> #define PCS_DEEMPH_TX_DEEMPH_GEN1(x) FIELD_PREP(GENMASK(21, 16), x)
> @@ -122,6 +130,7 @@
>
> /* ELBI_SYS_CTRL register fields */
> #define ELBI_SYS_CTRL_LT_ENABLE BIT(0)
> +#define ELBI_SYS_CTRL_PME_TURNOFF_MSG BIT(4)
>
> /* AXI_MSTR_RESP_COMP_CTRL0 register fields */
> #define CFG_REMOTE_RD_REQ_BRIDGE_SIZE_2K 0x4
> @@ -244,6 +253,7 @@ struct qcom_pcie {
> const struct qcom_pcie_cfg *cfg;
> struct dentry *debugfs;
> bool suspended;
> + bool soc_is_rpmh;
> };
>
> #define to_qcom_pcie(x) dev_get_drvdata((x)->dev)
> @@ -273,6 +283,24 @@ static int qcom_pcie_start_link(struct dw_pcie *pci)
> return 0;
> }
>
> +static int qcom_pcie_stop_link(struct dw_pcie *pci)
> +{
> + struct qcom_pcie *pcie = to_qcom_pcie(pci);
> + u32 ret_l23, val;
> +
> + writel(ELBI_SYS_CTRL_PME_TURNOFF_MSG, pcie->elbi + ELBI_SYS_CTRL);
> + readl(pcie->elbi + ELBI_SYS_CTRL);
> +
> + ret_l23 = readl_poll_timeout(pcie->parf + PARF_PM_STTS, val,
> + val & PM_ENTER_L23, 10000, 100000);
> + if (ret_l23) {
> + dev_err(pci->dev, "Failed to enter L2/L3\n");
> + return -ETIMEDOUT;
> + }
> +
> + return 0;
> +}
> +
> static void qcom_pcie_clear_hpc(struct dw_pcie *pci)
> {
> u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> @@ -991,9 +1019,19 @@ static void qcom_pcie_host_post_init_2_7_0(struct qcom_pcie *pcie)
> static void qcom_pcie_deinit_2_7_0(struct qcom_pcie *pcie)
> {
> struct qcom_pcie_resources_2_7_0 *res = &pcie->res.v2_7_0;
> + u32 val;
> +
> + /* Disable PCIe clocks and resets */
> + val = readl(pcie->parf + PARF_PHY_CTRL);
> + val |= PHY_TEST_PWR_DOWN;
> + writel(val, pcie->parf + PARF_PHY_CTRL);
> + readl(pcie->parf + PARF_PHY_CTRL);
>
> clk_bulk_disable_unprepare(res->num_clks, res->clks);
>
> + reset_control_assert(res->rst);
> + usleep_range(2000, 2500);
> +
> regulator_bulk_disable(ARRAY_SIZE(res->supplies), res->supplies);
> }
>
> @@ -1553,6 +1591,9 @@ static int qcom_pcie_probe(struct platform_device *pdev)
> goto err_phy_exit;
> }
>
> + /* If the soc features RPMh, cmd_db must have been prepared by now */
> + pcie->soc_is_rpmh = !cmd_db_ready();
> +
> qcom_pcie_icc_update(pcie);
>
> if (pcie->mhi)
> @@ -1569,60 +1610,89 @@ static int qcom_pcie_probe(struct platform_device *pdev)
> return ret;
> }
>
> -static int qcom_pcie_suspend_noirq(struct device *dev)
> +static int qcom_pcie_resume_noirq(struct device *dev)
> {
> struct qcom_pcie *pcie = dev_get_drvdata(dev);
> int ret;
>
> /*
> - * Set minimum bandwidth required to keep data path functional during
> - * suspend.
> + * Undo the tag change from qcom_pcie_suspend_noirq first in case
> + * RPM(h) spontaneously decides to power collapse the platform based
> + * on other inputs.
> */
> - ret = icc_set_bw(pcie->icc_mem, 0, kBps_to_icc(1));
> + icc_set_tag(pcie->icc_mem, pcie->soc_is_rpmh ? QCOM_ICC_TAG_ALWAYS : RPM_ALWAYS_TAG);
> + /* Flush the tag change */
> + ret = icc_set_bw(pcie->icc_mem, 0, pcie->last_bw);
> if (ret) {
> - dev_err(dev, "Failed to set interconnect bandwidth: %d\n", ret);
> + dev_err(pcie->pci->dev, "failed to set interconnect bandwidth: %d\n", ret);
> return ret;
> }
>
> - pcie->last_bw = kBps_to_icc(1);
> + /* Only check this now to make sure the icc vote is in before going furhter. */
> + if (!pcie->suspended)
> + return 0;
>
> - /*
> - * Turn OFF the resources only for controllers without active PCIe
> - * devices. For controllers with active devices, the resources are kept
> - * ON and the link is expected to be in L0/L1 (sub)states.
> - *
> - * Turning OFF the resources for controllers with active PCIe devices
> - * will trigger access violation during the end of the suspend cycle,
> - * as kernel tries to access the PCIe devices config space for masking
> - * MSIs.
> - *
> - * Also, it is not desirable to put the link into L2/L3 state as that
> - * implies VDD supply will be removed and the devices may go into
> - * powerdown state. This will affect the lifetime of the storage devices
> - * like NVMe.
> - */
> - if (!dw_pcie_link_up(pcie->pci)) {
> - qcom_pcie_host_deinit(&pcie->pci->pp);
> - pcie->suspended = true;
> - }
> + ret = qcom_pcie_host_init(&pcie->pci->pp);
> + if (ret)
> + goto revert_icc_tag;
> +
> + dw_pcie_setup_rc(&pcie->pci->pp);
> +
> + ret = qcom_pcie_start_link(pcie->pci);
> + if (ret)
> + goto deinit_host;
> +
> + /* Ignore the retval, the devices may come up later. */
> + dw_pcie_wait_for_link(pcie->pci);
> +
> + qcom_pcie_icc_update(pcie);
> +
> + pcie->suspended = false;
>
> return 0;
> +
> +deinit_host:
> + qcom_pcie_host_deinit(&pcie->pci->pp);
> +revert_icc_tag:
> + icc_set_tag(pcie->icc_mem, pcie->soc_is_rpmh ? QCOM_ICC_TAG_WAKE : RPM_ACTIVE_TAG);
> + /* Ignore the retval, failing here would be tragic anyway.. */
> + icc_set_bw(pcie->icc_mem, 0, pcie->last_bw);
> +
> + return ret;
> }
>
> -static int qcom_pcie_resume_noirq(struct device *dev)
> +static int qcom_pcie_suspend_noirq(struct device *dev)
> {
> struct qcom_pcie *pcie = dev_get_drvdata(dev);
> int ret;
>
> - if (pcie->suspended) {
> - ret = qcom_pcie_host_init(&pcie->pci->pp);
> + if (pcie->suspended)
> + return 0;
> +
> + if (dw_pcie_link_up(pcie->pci)) {
> + ret = qcom_pcie_stop_link(pcie->pci);
> if (ret)
> return ret;
> + }
>
> - pcie->suspended = false;
> + qcom_pcie_host_deinit(&pcie->pci->pp);
> +
> + /*
> + * The PCIe RC may be covertly accessed by the secure firmware on sleep exit.
> + * Use the WAKE bucket to let RPMh pull the plug on PCIe in sleep,
> + * but guarantee it comes back for resume.
> + */
> + icc_set_tag(pcie->icc_mem, pcie->soc_is_rpmh ? QCOM_ICC_TAG_WAKE : RPM_ACTIVE_TAG);
> + /* Flush the tag change */
> + ret = icc_set_bw(pcie->icc_mem, 0, pcie->last_bw);
> + if (ret) {
> + dev_err(pcie->pci->dev, "failed to set interconnect bandwidth: %d\n", ret);
> +
> + /* Revert everything and hope the next icc_set_bw goes through.. */
> + return qcom_pcie_resume_noirq(dev);
> }
>
> - qcom_pcie_icc_update(pcie);
> + pcie->suspended = true;
>
> return 0;
> }
>
> --
> 2.43.0
>
--
மணிவண்ணன் சதாசிவம்
On Tue, Jan 02, 2024 at 06:03:36PM +0100, Konrad Dybcio wrote:
> On 2.01.2024 11:17, Johan Hovold wrote:
> > On Sat, Dec 30, 2023 at 02:16:18AM +0100, Konrad Dybcio wrote:
> >> On 29.12.2023 16:29, Johan Hovold wrote:
> >>> On Fri, Dec 29, 2023 at 04:01:27PM +0100, Konrad Dybcio wrote:
> >>>> On 29.12.2023 15:04, Johan Hovold wrote:
> >>>>> On Wed, Dec 27, 2023 at 11:17:19PM +0100, Konrad Dybcio wrote:
> >>>>>> At least on SC8280XP, if the PCIe reset is asserted, the corresponding
> >>>>>> AUX_CLK will be stuck at 'off'.
> >>>>>
> >>>>> No, this path is exercised on every boot without the aux clock ever
> >>>>> being stuck at off. So something is clearly missing in this description.
> >>>
> >>>> That's likely because the hardware has been initialized and not cleanly
> >>>> shut down by your bootloader. When you reset it, or your bootloader
> >>>> wasn't so kind, you need to start initialization from scratch.
> >>>
> >>> What does that even mean? I'm telling you that this reset is asserted on
> >>> each boot, on all sc8280xp platforms I have access to, and never have I
> >>> seen the aux clk stuck at off.
> >>>
> >>> So clearly your claim above is too broad and the commit message is
> >>> incorrect or incomplete.
> > We're clearly talking past each other. When I'm saying reset is asserted
> > on each boot, I'm referring to reset being asserted in
> > qcom_pcie_init_2_7_0(), whereas you appear to be referring to whether
> > the reset has been left asserted by the bootloader when the driver
> > probes.
>
> OK, "boot" meant "booting the device" to me, not the PCIe controller.
Still not getting across to you apparently.
Again, the code in question is exercised on every boot and not once have
I seen a stuck clock due to reset being asserted *in*
qcom_pcie_init_2_7_0().
Now that's not what you were trying to describe as you were thinking of
reset having been left asserted *before* the driver probes (or before
qcom_pcie_init_2_7_0() is called).
See? Do you understand now what I was trying to say and why my
misinterpretation of your terse commit message lead me to claim that it
was clearly false?
> > I understand what you meant to say now, but I think you should rephrase:
> >
> > At least on SC8280XP, if the PCIe reset is asserted, the
> > corresponding AUX_CLK will be stuck at 'off'.
> >
> > because as it stands, it sounds like the problem happens when the driver
> > asserts reset.
>
> Does this sound good?
>
> "At least on SC8280XP, trying to enable the AUX_CLK associated with
> a PCIe host fails if the corresponding PCIe reset is asserted."
Yes, but you need to also say something about how this would happen, for
example, your hypothetical bootloader leaving it asserted and your actual
motivation for this change which is that it appears to be needed after
suspend.
A commit message should be clear and self-contained and not force
reviewers to have to try to interpret what it means and guess what the
motivation for the change really is.
Johan
On 3.01.2024 11:40, Johan Hovold wrote:
> On Tue, Jan 02, 2024 at 06:03:36PM +0100, Konrad Dybcio wrote:
>> On 2.01.2024 11:17, Johan Hovold wrote:
>>> On Sat, Dec 30, 2023 at 02:16:18AM +0100, Konrad Dybcio wrote:
>>>> On 29.12.2023 16:29, Johan Hovold wrote:
>>>>> On Fri, Dec 29, 2023 at 04:01:27PM +0100, Konrad Dybcio wrote:
>>>>>> On 29.12.2023 15:04, Johan Hovold wrote:
>>>>>>> On Wed, Dec 27, 2023 at 11:17:19PM +0100, Konrad Dybcio wrote:
>>>>>>>> At least on SC8280XP, if the PCIe reset is asserted, the corresponding
>>>>>>>> AUX_CLK will be stuck at 'off'.
>>>>>>>
>>>>>>> No, this path is exercised on every boot without the aux clock ever
>>>>>>> being stuck at off. So something is clearly missing in this description.
>>>>>
>>>>>> That's likely because the hardware has been initialized and not cleanly
>>>>>> shut down by your bootloader. When you reset it, or your bootloader
>>>>>> wasn't so kind, you need to start initialization from scratch.
>>>>>
>>>>> What does that even mean? I'm telling you that this reset is asserted on
>>>>> each boot, on all sc8280xp platforms I have access to, and never have I
>>>>> seen the aux clk stuck at off.
>>>>>
>>>>> So clearly your claim above is too broad and the commit message is
>>>>> incorrect or incomplete.
>
>>> We're clearly talking past each other. When I'm saying reset is asserted
>>> on each boot, I'm referring to reset being asserted in
>>> qcom_pcie_init_2_7_0(), whereas you appear to be referring to whether
>>> the reset has been left asserted by the bootloader when the driver
>>> probes.
>>
>> OK, "boot" meant "booting the device" to me, not the PCIe controller.
>
> Still not getting across to you apparently.
>
> Again, the code in question is exercised on every boot and not once have
> I seen a stuck clock due to reset being asserted *in*
> qcom_pcie_init_2_7_0().
>
> Now that's not what you were trying to describe as you were thinking of
> reset having been left asserted *before* the driver probes (or before
> qcom_pcie_init_2_7_0() is called).
>
> See? Do you understand now what I was trying to say and why my
> misinterpretation of your terse commit message lead me to claim that it
> was clearly false?
No, my response was an acknowledgement of having understood you. Maybe
it's a direct translation of some Polish idiom that's not obvious to
others, but I definitely tried to say that "we were talking about
different things, I had been previously thinking of something else,
but now we're on the same page".
>
>>> I understand what you meant to say now, but I think you should rephrase:
>>>
>>> At least on SC8280XP, if the PCIe reset is asserted, the
>>> corresponding AUX_CLK will be stuck at 'off'.
>>>
>>> because as it stands, it sounds like the problem happens when the driver
>>> asserts reset.
>>
>> Does this sound good?
>>
>> "At least on SC8280XP, trying to enable the AUX_CLK associated with
>> a PCIe host fails if the corresponding PCIe reset is asserted."
>
> Yes, but you need to also say something about how this would happen, for
> example, your hypothetical bootloader leaving it asserted and your actual
> motivation for this change which is that it appears to be needed after
> suspend.
>
> A commit message should be clear and self-contained and not force
> reviewers to have to try to interpret what it means and guess what the
> motivation for the change really is.
Got it
Konrad