2024-04-01 15:51:13

by Manivannan Sadhasivam

[permalink] [raw]
Subject: [PATCH v2 00/10] PCI: endpoint: Make host reboot handling more robust

Hello,

This is the follow up series of [1], to improve the handling of host reboot in
the endpoint subsystem. This involves refining the PERST# and Link Down event
handling in both the controller and function drivers.

Testing
=======

This series is tested on Qcom SM8450 based development board with both MHI_EPF
and EPF_TEST function drivers.

Dependency
==========

This series depends on [1] and [2].

- Mani

[1] https://lore.kernel.org/linux-pci/[email protected]/
[2] https://lore.kernel.org/linux-pci/[email protected]/

Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
Changes in v2:
- Dropped the {start/stop}_link rework patches
- Incorporated comments from Niklas
- Collected review tags
- Rebased on top of v6.9-rc1 and https://lore.kernel.org/linux-pci/[email protected]/
- Link to v1: https://lore.kernel.org/r/[email protected]

---
Manivannan Sadhasivam (10):
PCI: qcom-ep: Disable resources unconditionally during PERST# assert
PCI: endpoint: Decouple EPC and PCIe bus specific events
PCI: endpoint: Rename core_init() callback in 'struct pci_epc_event_ops' to init()
PCI: epf-test: Refactor pci_epf_test_unbind() function
PCI: epf-{mhi/test}: Move DMA initialization to EPC init callback
PCI: endpoint: Introduce EPC 'deinit' event and notify the EPF drivers
PCI: dwc: ep: Add a generic dw_pcie_ep_linkdown() API to handle Link Down event
PCI: qcom-ep: Use the generic dw_pcie_ep_linkdown() API to handle Link Down event
PCI: epf-test: Handle Link Down event
PCI: qcom: Implement shutdown() callback to properly reset the endpoint devices

drivers/pci/controller/dwc/pcie-designware-ep.c | 99 ++++++++++++++---------
drivers/pci/controller/dwc/pcie-designware.h | 5 ++
drivers/pci/controller/dwc/pcie-qcom-ep.c | 9 +--
drivers/pci/controller/dwc/pcie-qcom.c | 8 ++
drivers/pci/controller/dwc/pcie-tegra194.c | 1 +
drivers/pci/endpoint/functions/pci-epf-mhi.c | 47 ++++++++---
drivers/pci/endpoint/functions/pci-epf-test.c | 103 +++++++++++++++++-------
drivers/pci/endpoint/pci-epc-core.c | 53 ++++++++----
include/linux/pci-epc.h | 1 +
include/linux/pci-epf.h | 27 +++++--
10 files changed, 248 insertions(+), 105 deletions(-)
---
base-commit: e6377605ca734126533a0f8e4de2b4bac881f076
change-id: 20240314-pci-epf-rework-a6e65b103a79

Best regards,
--
Manivannan Sadhasivam <[email protected]>



2024-04-01 15:51:35

by Manivannan Sadhasivam

[permalink] [raw]
Subject: [PATCH v2 01/10] PCI: qcom-ep: Disable resources unconditionally during PERST# assert

All EP specific resources are enabled during PERST# deassert. As a counter
operation, all resources should be disabled during PERST# assert. There is
no point in skipping that if the link was not enabled.

This will also result in enablement of the resources twice if PERST# got
deasserted again. So remove the check from qcom_pcie_perst_assert() and
disable all the resources unconditionally.

Fixes: f55fee56a631 ("PCI: qcom-ep: Add Qualcomm PCIe Endpoint controller driver")
Reviewed-by: Niklas Cassel <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/pci/controller/dwc/pcie-qcom-ep.c | 6 ------
1 file changed, 6 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/controller/dwc/pcie-qcom-ep.c
index 2fb8c15e7a91..50b1635e3cbb 100644
--- a/drivers/pci/controller/dwc/pcie-qcom-ep.c
+++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c
@@ -500,12 +500,6 @@ static int qcom_pcie_perst_deassert(struct dw_pcie *pci)
static void qcom_pcie_perst_assert(struct dw_pcie *pci)
{
struct qcom_pcie_ep *pcie_ep = to_pcie_ep(pci);
- struct device *dev = pci->dev;
-
- if (pcie_ep->link_status == QCOM_PCIE_EP_LINK_DISABLED) {
- dev_dbg(dev, "Link is already disabled\n");
- return;
- }

dw_pcie_ep_cleanup(&pci->ep);
qcom_pcie_disable_resources(pcie_ep);

--
2.25.1


2024-04-01 15:51:56

by Manivannan Sadhasivam

[permalink] [raw]
Subject: [PATCH v2 02/10] PCI: endpoint: Decouple EPC and PCIe bus specific events

Currently, 'struct pci_epc_event_ops' has a bunch of events that are sent
from the EPC driver to EPF driver. But those events are a mix of EPC
specific events like core_init and PCIe bus specific events like LINK_UP,
LINK_DOWN, BME etc...

Let's decouple them to respective structs (pci_epc_event_ops,
pci_epc_bus_event_ops) to make the separation clear.

Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/pci/endpoint/functions/pci-epf-mhi.c | 8 ++++++--
drivers/pci/endpoint/functions/pci-epf-test.c | 8 ++++++--
drivers/pci/endpoint/pci-epc-core.c | 20 ++++++++++----------
include/linux/pci-epf.h | 23 ++++++++++++++++-------
4 files changed, 38 insertions(+), 21 deletions(-)

diff --git a/drivers/pci/endpoint/functions/pci-epf-mhi.c b/drivers/pci/endpoint/functions/pci-epf-mhi.c
index 2c54d80107cf..280863c0eeb9 100644
--- a/drivers/pci/endpoint/functions/pci-epf-mhi.c
+++ b/drivers/pci/endpoint/functions/pci-epf-mhi.c
@@ -896,8 +896,11 @@ static void pci_epf_mhi_unbind(struct pci_epf *epf)
pci_epc_clear_bar(epc, epf->func_no, epf->vfunc_no, epf_bar);
}

-static const struct pci_epc_event_ops pci_epf_mhi_event_ops = {
+static const struct pci_epc_event_ops pci_epf_mhi_epc_event_ops = {
.core_init = pci_epf_mhi_core_init,
+};
+
+static const struct pci_epc_bus_event_ops pci_epf_mhi_bus_event_ops = {
.link_up = pci_epf_mhi_link_up,
.link_down = pci_epf_mhi_link_down,
.bme = pci_epf_mhi_bme,
@@ -919,7 +922,8 @@ static int pci_epf_mhi_probe(struct pci_epf *epf,
epf_mhi->info = info;
epf_mhi->epf = epf;

- epf->event_ops = &pci_epf_mhi_event_ops;
+ epf->epc_event_ops = &pci_epf_mhi_epc_event_ops;
+ epf->bus_event_ops = &pci_epf_mhi_bus_event_ops;

mutex_init(&epf_mhi->lock);

diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index 977fb79c1567..973db0b1bde2 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -798,8 +798,11 @@ static int pci_epf_test_link_up(struct pci_epf *epf)
return 0;
}

-static const struct pci_epc_event_ops pci_epf_test_event_ops = {
+static const struct pci_epc_event_ops pci_epf_test_epc_event_ops = {
.core_init = pci_epf_test_core_init,
+};
+
+static const struct pci_epc_bus_event_ops pci_epf_test_bus_event_ops = {
.link_up = pci_epf_test_link_up,
};

@@ -916,7 +919,8 @@ static int pci_epf_test_probe(struct pci_epf *epf,

INIT_DELAYED_WORK(&epf_test->cmd_handler, pci_epf_test_cmd_handler);

- epf->event_ops = &pci_epf_test_event_ops;
+ epf->epc_event_ops = &pci_epf_test_epc_event_ops;
+ epf->bus_event_ops = &pci_epf_test_bus_event_ops;

epf_set_drvdata(epf, epf_test);
return 0;
diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
index 47d27ec7439d..f202ae07ffa9 100644
--- a/drivers/pci/endpoint/pci-epc-core.c
+++ b/drivers/pci/endpoint/pci-epc-core.c
@@ -692,8 +692,8 @@ void pci_epc_linkup(struct pci_epc *epc)
mutex_lock(&epc->list_lock);
list_for_each_entry(epf, &epc->pci_epf, list) {
mutex_lock(&epf->lock);
- if (epf->event_ops && epf->event_ops->link_up)
- epf->event_ops->link_up(epf);
+ if (epf->bus_event_ops && epf->bus_event_ops->link_up)
+ epf->bus_event_ops->link_up(epf);
mutex_unlock(&epf->lock);
}
mutex_unlock(&epc->list_lock);
@@ -718,8 +718,8 @@ void pci_epc_linkdown(struct pci_epc *epc)
mutex_lock(&epc->list_lock);
list_for_each_entry(epf, &epc->pci_epf, list) {
mutex_lock(&epf->lock);
- if (epf->event_ops && epf->event_ops->link_down)
- epf->event_ops->link_down(epf);
+ if (epf->bus_event_ops && epf->bus_event_ops->link_down)
+ epf->bus_event_ops->link_down(epf);
mutex_unlock(&epf->lock);
}
mutex_unlock(&epc->list_lock);
@@ -744,8 +744,8 @@ void pci_epc_init_notify(struct pci_epc *epc)
mutex_lock(&epc->list_lock);
list_for_each_entry(epf, &epc->pci_epf, list) {
mutex_lock(&epf->lock);
- if (epf->event_ops && epf->event_ops->core_init)
- epf->event_ops->core_init(epf);
+ if (epf->epc_event_ops && epf->epc_event_ops->core_init)
+ epf->epc_event_ops->core_init(epf);
mutex_unlock(&epf->lock);
}
epc->init_complete = true;
@@ -767,8 +767,8 @@ void pci_epc_notify_pending_init(struct pci_epc *epc, struct pci_epf *epf)
{
if (epc->init_complete) {
mutex_lock(&epf->lock);
- if (epf->event_ops && epf->event_ops->core_init)
- epf->event_ops->core_init(epf);
+ if (epf->epc_event_ops && epf->epc_event_ops->core_init)
+ epf->epc_event_ops->core_init(epf);
mutex_unlock(&epf->lock);
}
}
@@ -792,8 +792,8 @@ void pci_epc_bme_notify(struct pci_epc *epc)
mutex_lock(&epc->list_lock);
list_for_each_entry(epf, &epc->pci_epf, list) {
mutex_lock(&epf->lock);
- if (epf->event_ops && epf->event_ops->bme)
- epf->event_ops->bme(epf);
+ if (epf->bus_event_ops && epf->bus_event_ops->bme)
+ epf->bus_event_ops->bme(epf);
mutex_unlock(&epf->lock);
}
mutex_unlock(&epc->list_lock);
diff --git a/include/linux/pci-epf.h b/include/linux/pci-epf.h
index adee6a1b35db..77399fecaeb5 100644
--- a/include/linux/pci-epf.h
+++ b/include/linux/pci-epf.h
@@ -69,14 +69,21 @@ struct pci_epf_ops {
};

/**
- * struct pci_epc_event_ops - Callbacks for capturing the EPC events
- * @core_init: Callback for the EPC initialization complete event
- * @link_up: Callback for the EPC link up event
- * @link_down: Callback for the EPC link down event
- * @bme: Callback for the EPC BME (Bus Master Enable) event
+ * struct pci_epc_event_ops - Callbacks for capturing the EPC specific events
+ * @core_init: Callback for the EPC initialization event
*/
struct pci_epc_event_ops {
int (*core_init)(struct pci_epf *epf);
+};
+
+/**
+ * struct pci_epc_bus_event_ops - Callbacks for capturing the PCIe bus specific
+ * events
+ * @link_up: Callback for the PCIe bus link up event
+ * @link_down: Callback for the PCIe bus link down event
+ * @bme: Callback for the PCIe bus BME (Bus Master Enable) event
+ */
+struct pci_epc_bus_event_ops {
int (*link_up)(struct pci_epf *epf);
int (*link_down)(struct pci_epf *epf);
int (*bme)(struct pci_epf *epf);
@@ -150,7 +157,8 @@ struct pci_epf_bar {
* @is_vf: true - virtual function, false - physical function
* @vfunction_num_map: bitmap to manage virtual function number
* @pci_vepf: list of virtual endpoint functions associated with this function
- * @event_ops: Callbacks for capturing the EPC events
+ * @epc_event_ops: Callbacks for capturing the EPC events
+ * @bus_event_ops: Callbacks for capturing the PCIe bus events
*/
struct pci_epf {
struct device dev;
@@ -180,7 +188,8 @@ struct pci_epf {
unsigned int is_vf;
unsigned long vfunction_num_map;
struct list_head pci_vepf;
- const struct pci_epc_event_ops *event_ops;
+ const struct pci_epc_event_ops *epc_event_ops;
+ const struct pci_epc_bus_event_ops *bus_event_ops;
};

/**

--
2.25.1


2024-04-01 15:52:09

by Manivannan Sadhasivam

[permalink] [raw]
Subject: [PATCH v2 03/10] PCI: endpoint: Rename core_init() callback in 'struct pci_epc_event_ops' to init()

core_init() callback is used to notify the EPC initialization event to the
EPF drivers. The 'core' prefix was used indicate that the controller IP
core has completed initialization. But it serves no purpose as the EPF
driver will only care about the EPC initialization as a whole and there is
no real benefit to distinguish the IP core part.

So let's rename the core_init() callback in 'struct pci_epc_event_ops' to
just init() to make it more clear.

Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/pci/endpoint/functions/pci-epf-mhi.c | 4 ++--
drivers/pci/endpoint/functions/pci-epf-test.c | 4 ++--
drivers/pci/endpoint/pci-epc-core.c | 16 ++++++++--------
include/linux/pci-epf.h | 4 ++--
4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/pci/endpoint/functions/pci-epf-mhi.c b/drivers/pci/endpoint/functions/pci-epf-mhi.c
index 280863c0eeb9..b3c26ffd29a5 100644
--- a/drivers/pci/endpoint/functions/pci-epf-mhi.c
+++ b/drivers/pci/endpoint/functions/pci-epf-mhi.c
@@ -716,7 +716,7 @@ static void pci_epf_mhi_dma_deinit(struct pci_epf_mhi *epf_mhi)
epf_mhi->dma_chan_rx = NULL;
}

-static int pci_epf_mhi_core_init(struct pci_epf *epf)
+static int pci_epf_mhi_epc_init(struct pci_epf *epf)
{
struct pci_epf_mhi *epf_mhi = epf_get_drvdata(epf);
const struct pci_epf_mhi_ep_info *info = epf_mhi->info;
@@ -897,7 +897,7 @@ static void pci_epf_mhi_unbind(struct pci_epf *epf)
}

static const struct pci_epc_event_ops pci_epf_mhi_epc_event_ops = {
- .core_init = pci_epf_mhi_core_init,
+ .init = pci_epf_mhi_epc_init,
};

static const struct pci_epc_bus_event_ops pci_epf_mhi_bus_event_ops = {
diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index 973db0b1bde2..abcb6ca61c4e 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -731,7 +731,7 @@ static int pci_epf_test_set_bar(struct pci_epf *epf)
return 0;
}

-static int pci_epf_test_core_init(struct pci_epf *epf)
+static int pci_epf_test_epc_init(struct pci_epf *epf)
{
struct pci_epf_test *epf_test = epf_get_drvdata(epf);
struct pci_epf_header *header = epf->header;
@@ -799,7 +799,7 @@ static int pci_epf_test_link_up(struct pci_epf *epf)
}

static const struct pci_epc_event_ops pci_epf_test_epc_event_ops = {
- .core_init = pci_epf_test_core_init,
+ .init = pci_epf_test_epc_init,
};

static const struct pci_epc_bus_event_ops pci_epf_test_bus_event_ops = {
diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
index f202ae07ffa9..fe3cb62dd866 100644
--- a/drivers/pci/endpoint/pci-epc-core.c
+++ b/drivers/pci/endpoint/pci-epc-core.c
@@ -727,9 +727,9 @@ void pci_epc_linkdown(struct pci_epc *epc)
EXPORT_SYMBOL_GPL(pci_epc_linkdown);

/**
- * pci_epc_init_notify() - Notify the EPF device that EPC device's core
- * initialization is completed.
- * @epc: the EPC device whose core initialization is completed
+ * pci_epc_init_notify() - Notify the EPF device that EPC device initialization
+ * is completed.
+ * @epc: the EPC device whose initialization is completed
*
* Invoke to Notify the EPF device that the EPC device's initialization
* is completed.
@@ -744,8 +744,8 @@ void pci_epc_init_notify(struct pci_epc *epc)
mutex_lock(&epc->list_lock);
list_for_each_entry(epf, &epc->pci_epf, list) {
mutex_lock(&epf->lock);
- if (epf->epc_event_ops && epf->epc_event_ops->core_init)
- epf->epc_event_ops->core_init(epf);
+ if (epf->epc_event_ops && epf->epc_event_ops->init)
+ epf->epc_event_ops->init(epf);
mutex_unlock(&epf->lock);
}
epc->init_complete = true;
@@ -756,7 +756,7 @@ EXPORT_SYMBOL_GPL(pci_epc_init_notify);
/**
* pci_epc_notify_pending_init() - Notify the pending EPC device initialization
* complete to the EPF device
- * @epc: the EPC device whose core initialization is pending to be notified
+ * @epc: the EPC device whose initialization is pending to be notified
* @epf: the EPF device to be notified
*
* Invoke to notify the pending EPC device initialization complete to the EPF
@@ -767,8 +767,8 @@ void pci_epc_notify_pending_init(struct pci_epc *epc, struct pci_epf *epf)
{
if (epc->init_complete) {
mutex_lock(&epf->lock);
- if (epf->epc_event_ops && epf->epc_event_ops->core_init)
- epf->epc_event_ops->core_init(epf);
+ if (epf->epc_event_ops && epf->epc_event_ops->init)
+ epf->epc_event_ops->init(epf);
mutex_unlock(&epf->lock);
}
}
diff --git a/include/linux/pci-epf.h b/include/linux/pci-epf.h
index 77399fecaeb5..f4c2aaa6674c 100644
--- a/include/linux/pci-epf.h
+++ b/include/linux/pci-epf.h
@@ -70,10 +70,10 @@ struct pci_epf_ops {

/**
* struct pci_epc_event_ops - Callbacks for capturing the EPC specific events
- * @core_init: Callback for the EPC initialization event
+ * @init: Callback for the EPC initialization event
*/
struct pci_epc_event_ops {
- int (*core_init)(struct pci_epf *epf);
+ int (*init)(struct pci_epf *epf);
};

/**

--
2.25.1


2024-04-01 15:52:32

by Manivannan Sadhasivam

[permalink] [raw]
Subject: [PATCH v2 04/10] PCI: epf-test: Refactor pci_epf_test_unbind() function

Move the pci_epc_clear_bar() and pci_epf_free_space() code to respective
helper functions. This allows reusing the helpers in future commits.

This also requires moving the pci_epf_test_unbind() definition below
pci_epf_test_bind() to avoid forward declaration of the above helpers.

No functional change.

Reviewed-by: Niklas Cassel <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/pci/endpoint/functions/pci-epf-test.c | 58 ++++++++++++++++++---------
1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index abcb6ca61c4e..1a4a35e7bf94 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -686,25 +686,6 @@ static void pci_epf_test_cmd_handler(struct work_struct *work)
msecs_to_jiffies(1));
}

-static void pci_epf_test_unbind(struct pci_epf *epf)
-{
- struct pci_epf_test *epf_test = epf_get_drvdata(epf);
- struct pci_epc *epc = epf->epc;
- int bar;
-
- cancel_delayed_work(&epf_test->cmd_handler);
- pci_epf_test_clean_dma_chan(epf_test);
- for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
- if (!epf_test->reg[bar])
- continue;
-
- pci_epc_clear_bar(epc, epf->func_no, epf->vfunc_no,
- &epf->bar[bar]);
- pci_epf_free_space(epf, epf_test->reg[bar], bar,
- PRIMARY_INTERFACE);
- }
-}
-
static int pci_epf_test_set_bar(struct pci_epf *epf)
{
int bar, ret;
@@ -731,6 +712,21 @@ static int pci_epf_test_set_bar(struct pci_epf *epf)
return 0;
}

+static void pci_epf_test_clear_bar(struct pci_epf *epf)
+{
+ struct pci_epf_test *epf_test = epf_get_drvdata(epf);
+ struct pci_epc *epc = epf->epc;
+ int bar;
+
+ for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
+ if (!epf_test->reg[bar])
+ continue;
+
+ pci_epc_clear_bar(epc, epf->func_no, epf->vfunc_no,
+ &epf->bar[bar]);
+ }
+}
+
static int pci_epf_test_epc_init(struct pci_epf *epf)
{
struct pci_epf_test *epf_test = epf_get_drvdata(epf);
@@ -860,6 +856,20 @@ static int pci_epf_test_alloc_space(struct pci_epf *epf)
return 0;
}

+static void pci_epf_test_free_space(struct pci_epf *epf)
+{
+ struct pci_epf_test *epf_test = epf_get_drvdata(epf);
+ int bar;
+
+ for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
+ if (!epf_test->reg[bar])
+ continue;
+
+ pci_epf_free_space(epf, epf_test->reg[bar], bar,
+ PRIMARY_INTERFACE);
+ }
+}
+
static int pci_epf_test_bind(struct pci_epf *epf)
{
int ret;
@@ -897,6 +907,16 @@ static int pci_epf_test_bind(struct pci_epf *epf)
return 0;
}

+static void pci_epf_test_unbind(struct pci_epf *epf)
+{
+ struct pci_epf_test *epf_test = epf_get_drvdata(epf);
+
+ cancel_delayed_work(&epf_test->cmd_handler);
+ pci_epf_test_clean_dma_chan(epf_test);
+ pci_epf_test_clear_bar(epf);
+ pci_epf_test_free_space(epf);
+}
+
static const struct pci_epf_device_id pci_epf_test_ids[] = {
{
.name = "pci_epf_test",

--
2.25.1


2024-04-01 15:52:49

by Manivannan Sadhasivam

[permalink] [raw]
Subject: [PATCH v2 05/10] PCI: epf-{mhi/test}: Move DMA initialization to EPC init callback

To maintain uniformity across EPF drivers, let's move the DMA
initialization to EPC init callback. This will also allow us to deinit DMA
during PERST# assert in the further commits.

For EPC drivers without PERST#, DMA deinit will only happen during driver
unbind.

Reviewed-by: Niklas Cassel <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/pci/endpoint/functions/pci-epf-mhi.c | 16 ++++++++--------
drivers/pci/endpoint/functions/pci-epf-test.c | 12 ++++++------
2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/pci/endpoint/functions/pci-epf-mhi.c b/drivers/pci/endpoint/functions/pci-epf-mhi.c
index b3c26ffd29a5..4d5c638744a1 100644
--- a/drivers/pci/endpoint/functions/pci-epf-mhi.c
+++ b/drivers/pci/endpoint/functions/pci-epf-mhi.c
@@ -753,6 +753,14 @@ static int pci_epf_mhi_epc_init(struct pci_epf *epf)
if (!epf_mhi->epc_features)
return -ENODATA;

+ if (info->flags & MHI_EPF_USE_DMA) {
+ ret = pci_epf_mhi_dma_init(epf_mhi);
+ if (ret) {
+ dev_err(dev, "Failed to initialize DMA: %d\n", ret);
+ return ret;
+ }
+ }
+
return 0;
}

@@ -765,14 +773,6 @@ static int pci_epf_mhi_link_up(struct pci_epf *epf)
struct device *dev = &epf->dev;
int ret;

- if (info->flags & MHI_EPF_USE_DMA) {
- ret = pci_epf_mhi_dma_init(epf_mhi);
- if (ret) {
- dev_err(dev, "Failed to initialize DMA: %d\n", ret);
- return ret;
- }
- }
-
mhi_cntrl->mmio = epf_mhi->mmio;
mhi_cntrl->irq = epf_mhi->irq;
mhi_cntrl->mru = info->mru;
diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index 1a4a35e7bf94..8756ffc5977b 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -739,6 +739,12 @@ static int pci_epf_test_epc_init(struct pci_epf *epf)
bool msi_capable = true;
int ret;

+ epf_test->dma_supported = true;
+
+ ret = pci_epf_test_init_dma_chan(epf_test);
+ if (ret)
+ epf_test->dma_supported = false;
+
epc_features = pci_epc_get_features(epc, epf->func_no, epf->vfunc_no);
if (epc_features) {
msix_capable = epc_features->msix_capable;
@@ -898,12 +904,6 @@ static int pci_epf_test_bind(struct pci_epf *epf)
if (ret)
return ret;

- epf_test->dma_supported = true;
-
- ret = pci_epf_test_init_dma_chan(epf_test);
- if (ret)
- epf_test->dma_supported = false;
-
return 0;
}


--
2.25.1


2024-04-01 15:53:07

by Manivannan Sadhasivam

[permalink] [raw]
Subject: [PATCH v2 06/10] PCI: endpoint: Introduce EPC 'deinit' event and notify the EPF drivers

As like the EPC 'init' event, that is used to signal the EPF drivers about
the EPC initialization, let's introduce 'deinit' event that is used to
signal EPC deinitialization.

The EPC deinitialization applies only when any sort of fundamental reset
is supported by the endpoint controller as per the PCIe spec.

Reference: PCIe Base spec v5.0, sections 4.2.4.9.1 and 6.6.1.

Currently, some EPC drivers like pcie-qcom-ep and pcie-tegra194 support
PERST# as the fundamental reset. So the 'deinit' event will be notified to
the EPF drivers when PERST# assert happens in the above mentioned EPC
drivers.

The EPF drivers, on receiving the event through the deinit() callback
should reset the EPF state machine and also cleanup any configuration that
got affected by the fundamental reset like BAR, DMA etc...

This change also warrants skipping the cleanups in unbind() if already done
in deinit().

Reviewed-by: Niklas Cassel <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/pci/controller/dwc/pcie-designware-ep.c | 1 -
drivers/pci/controller/dwc/pcie-qcom-ep.c | 1 +
drivers/pci/controller/dwc/pcie-tegra194.c | 1 +
drivers/pci/endpoint/functions/pci-epf-mhi.c | 19 +++++++++++++++++++
drivers/pci/endpoint/functions/pci-epf-test.c | 17 +++++++++++++++--
drivers/pci/endpoint/pci-epc-core.c | 25 +++++++++++++++++++++++++
include/linux/pci-epc.h | 1 +
include/linux/pci-epf.h | 2 ++
8 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index 47391d7d3a73..2063cf2049e5 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -632,7 +632,6 @@ void dw_pcie_ep_cleanup(struct dw_pcie_ep *ep)
struct dw_pcie *pci = to_dw_pcie_from_ep(ep);

dw_pcie_edma_remove(pci);
- ep->epc->init_complete = false;
}
EXPORT_SYMBOL_GPL(dw_pcie_ep_cleanup);

diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/controller/dwc/pcie-qcom-ep.c
index 50b1635e3cbb..e4b742355d57 100644
--- a/drivers/pci/controller/dwc/pcie-qcom-ep.c
+++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c
@@ -501,6 +501,7 @@ static void qcom_pcie_perst_assert(struct dw_pcie *pci)
{
struct qcom_pcie_ep *pcie_ep = to_pcie_ep(pci);

+ pci_epc_deinit_notify(pci->ep.epc);
dw_pcie_ep_cleanup(&pci->ep);
qcom_pcie_disable_resources(pcie_ep);
pcie_ep->link_status = QCOM_PCIE_EP_LINK_DISABLED;
diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index ddc23602eca7..d2223821e122 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -1715,6 +1715,7 @@ static void pex_ep_event_pex_rst_assert(struct tegra_pcie_dw *pcie)
if (ret)
dev_err(pcie->dev, "Failed to go Detect state: %d\n", ret);

+ pci_epc_deinit_notify(pcie->pci.ep.epc);
dw_pcie_ep_cleanup(&pcie->pci.ep);

reset_control_assert(pcie->core_rst);
diff --git a/drivers/pci/endpoint/functions/pci-epf-mhi.c b/drivers/pci/endpoint/functions/pci-epf-mhi.c
index 4d5c638744a1..005916722ede 100644
--- a/drivers/pci/endpoint/functions/pci-epf-mhi.c
+++ b/drivers/pci/endpoint/functions/pci-epf-mhi.c
@@ -764,6 +764,24 @@ static int pci_epf_mhi_epc_init(struct pci_epf *epf)
return 0;
}

+static void pci_epf_mhi_epc_deinit(struct pci_epf *epf)
+{
+ struct pci_epf_mhi *epf_mhi = epf_get_drvdata(epf);
+ const struct pci_epf_mhi_ep_info *info = epf_mhi->info;
+ struct pci_epf_bar *epf_bar = &epf->bar[info->bar_num];
+ struct mhi_ep_cntrl *mhi_cntrl = &epf_mhi->mhi_cntrl;
+ struct pci_epc *epc = epf->epc;
+
+ if (mhi_cntrl->mhi_dev) {
+ mhi_ep_power_down(mhi_cntrl);
+ if (info->flags & MHI_EPF_USE_DMA)
+ pci_epf_mhi_dma_deinit(epf_mhi);
+ mhi_ep_unregister_controller(mhi_cntrl);
+ }
+
+ pci_epc_clear_bar(epc, epf->func_no, epf->vfunc_no, epf_bar);
+}
+
static int pci_epf_mhi_link_up(struct pci_epf *epf)
{
struct pci_epf_mhi *epf_mhi = epf_get_drvdata(epf);
@@ -898,6 +916,7 @@ static void pci_epf_mhi_unbind(struct pci_epf *epf)

static const struct pci_epc_event_ops pci_epf_mhi_epc_event_ops = {
.init = pci_epf_mhi_epc_init,
+ .deinit = pci_epf_mhi_epc_deinit,
};

static const struct pci_epc_bus_event_ops pci_epf_mhi_bus_event_ops = {
diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index 8756ffc5977b..5933788b0e68 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -790,6 +790,15 @@ static int pci_epf_test_epc_init(struct pci_epf *epf)
return 0;
}

+static void pci_epf_test_epc_deinit(struct pci_epf *epf)
+{
+ struct pci_epf_test *epf_test = epf_get_drvdata(epf);
+
+ cancel_delayed_work(&epf_test->cmd_handler);
+ pci_epf_test_clean_dma_chan(epf_test);
+ pci_epf_test_clear_bar(epf);
+}
+
static int pci_epf_test_link_up(struct pci_epf *epf)
{
struct pci_epf_test *epf_test = epf_get_drvdata(epf);
@@ -802,6 +811,7 @@ static int pci_epf_test_link_up(struct pci_epf *epf)

static const struct pci_epc_event_ops pci_epf_test_epc_event_ops = {
.init = pci_epf_test_epc_init,
+ .deinit = pci_epf_test_epc_deinit,
};

static const struct pci_epc_bus_event_ops pci_epf_test_bus_event_ops = {
@@ -910,10 +920,13 @@ static int pci_epf_test_bind(struct pci_epf *epf)
static void pci_epf_test_unbind(struct pci_epf *epf)
{
struct pci_epf_test *epf_test = epf_get_drvdata(epf);
+ struct pci_epc *epc = epf->epc;

cancel_delayed_work(&epf_test->cmd_handler);
- pci_epf_test_clean_dma_chan(epf_test);
- pci_epf_test_clear_bar(epf);
+ if (epc->init_complete) {
+ pci_epf_test_clean_dma_chan(epf_test);
+ pci_epf_test_clear_bar(epf);
+ }
pci_epf_test_free_space(epf);
}

diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
index fe3cb62dd866..05670c10371a 100644
--- a/drivers/pci/endpoint/pci-epc-core.c
+++ b/drivers/pci/endpoint/pci-epc-core.c
@@ -774,6 +774,31 @@ void pci_epc_notify_pending_init(struct pci_epc *epc, struct pci_epf *epf)
}
EXPORT_SYMBOL_GPL(pci_epc_notify_pending_init);

+/**
+ * pci_epc_deinit_notify() - Notify the EPF device about EPC deinitialization
+ * @epc: the EPC device whose deinitialization is completed
+ *
+ * Invoke to notify the EPF device that the EPC deinitialization is completed.
+ */
+void pci_epc_deinit_notify(struct pci_epc *epc)
+{
+ struct pci_epf *epf;
+
+ if (IS_ERR_OR_NULL(epc))
+ return;
+
+ mutex_lock(&epc->list_lock);
+ list_for_each_entry(epf, &epc->pci_epf, list) {
+ mutex_lock(&epf->lock);
+ if (epf->epc_event_ops && epf->epc_event_ops->deinit)
+ epf->epc_event_ops->deinit(epf);
+ mutex_unlock(&epf->lock);
+ }
+ epc->init_complete = false;
+ mutex_unlock(&epc->list_lock);
+}
+EXPORT_SYMBOL_GPL(pci_epc_deinit_notify);
+
/**
* pci_epc_bme_notify() - Notify the EPF device that the EPC device has received
* the BME event from the Root complex
diff --git a/include/linux/pci-epc.h b/include/linux/pci-epc.h
index acc5f96161fe..69dd7246c9db 100644
--- a/include/linux/pci-epc.h
+++ b/include/linux/pci-epc.h
@@ -226,6 +226,7 @@ void pci_epc_linkup(struct pci_epc *epc);
void pci_epc_linkdown(struct pci_epc *epc);
void pci_epc_init_notify(struct pci_epc *epc);
void pci_epc_notify_pending_init(struct pci_epc *epc, struct pci_epf *epf);
+void pci_epc_deinit_notify(struct pci_epc *epc);
void pci_epc_bme_notify(struct pci_epc *epc);
void pci_epc_remove_epf(struct pci_epc *epc, struct pci_epf *epf,
enum pci_epc_interface_type type);
diff --git a/include/linux/pci-epf.h b/include/linux/pci-epf.h
index f4c2aaa6674c..74a0713661af 100644
--- a/include/linux/pci-epf.h
+++ b/include/linux/pci-epf.h
@@ -71,9 +71,11 @@ struct pci_epf_ops {
/**
* struct pci_epc_event_ops - Callbacks for capturing the EPC specific events
* @init: Callback for the EPC initialization event
+ * @deinit: Callback for the EPC deinitialization event
*/
struct pci_epc_event_ops {
int (*init)(struct pci_epf *epf);
+ void (*deinit)(struct pci_epf *epf);
};

/**

--
2.25.1


2024-04-01 15:53:25

by Manivannan Sadhasivam

[permalink] [raw]
Subject: [PATCH v2 07/10] PCI: dwc: ep: Add a generic dw_pcie_ep_linkdown() API to handle Link Down event

As per the PCIe base spec r5.0, section 5.2, Link Down event can happen
under any of the following circumstances:

1. Fundamental/Hot reset
2. Link disable transmission by upstream component
3. Moving from L2/L3 to L0

In those cases, Link Down causes some non-sticky DWC registers to loose the
state (like REBAR, etc...). So the drivers need to reinitialize them to
function properly once the link comes back again.

This is not a problem for drivers supporting PERST# IRQ, since they can
reinitialize the registers in the PERST# IRQ callback. But for the drivers
not supporting PERST#, there is no way they can reinitialize the registers
other than relying on Link Down IRQ received when the link goes down. So
let's add a DWC generic API dw_pcie_ep_linkdown() that reinitializes the
non-sticky registers and also notifies the EPF drivers about link going
down.

This API can also be used by the drivers supporting PERST# to handle the
scenario (2) mentioned above.

NOTE: For the sake of code organization, move the dw_pcie_ep_linkup()
definition just above dw_pcie_ep_linkdown().

Reviewed-by: Niklas Cassel <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/pci/controller/dwc/pcie-designware-ep.c | 98 ++++++++++++++++---------
drivers/pci/controller/dwc/pcie-designware.h | 5 ++
2 files changed, 68 insertions(+), 35 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index 2063cf2049e5..56b34267850e 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -15,18 +15,6 @@
#include <linux/pci-epc.h>
#include <linux/pci-epf.h>

-/**
- * dw_pcie_ep_linkup - Notify EPF drivers about Link Up event
- * @ep: DWC EP device
- */
-void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
-{
- struct pci_epc *epc = ep->epc;
-
- pci_epc_linkup(epc);
-}
-EXPORT_SYMBOL_GPL(dw_pcie_ep_linkup);
-
/**
* dw_pcie_ep_init_notify - Notify EPF drivers about EPC initialization complete
* @ep: DWC EP device
@@ -673,6 +661,29 @@ static unsigned int dw_pcie_ep_find_ext_capability(struct dw_pcie *pci, int cap)
return 0;
}

+static void dw_pcie_ep_init_non_sticky_registers(struct dw_pcie *pci)
+{
+ unsigned int offset;
+ unsigned int nbars;
+ u32 reg, i;
+
+ offset = dw_pcie_ep_find_ext_capability(pci, PCI_EXT_CAP_ID_REBAR);
+
+ dw_pcie_dbi_ro_wr_en(pci);
+
+ if (offset) {
+ reg = dw_pcie_readl_dbi(pci, offset + PCI_REBAR_CTRL);
+ nbars = (reg & PCI_REBAR_CTRL_NBAR_MASK) >>
+ PCI_REBAR_CTRL_NBAR_SHIFT;
+
+ for (i = 0; i < nbars; i++, offset += PCI_REBAR_CTRL)
+ dw_pcie_writel_dbi(pci, offset + PCI_REBAR_CAP, 0x0);
+ }
+
+ dw_pcie_setup(pci);
+ dw_pcie_dbi_ro_wr_dis(pci);
+}
+
/**
* dw_pcie_ep_init_registers - Initialize DWC EP specific registers
* @ep: DWC EP device
@@ -687,13 +698,11 @@ int dw_pcie_ep_init_registers(struct dw_pcie_ep *ep)
struct dw_pcie_ep_func *ep_func;
struct device *dev = pci->dev;
struct pci_epc *epc = ep->epc;
- unsigned int offset, ptm_cap_base;
- unsigned int nbars;
+ u32 ptm_cap_base, reg;
u8 hdr_type;
u8 func_no;
- int i, ret;
void *addr;
- u32 reg;
+ int ret;

hdr_type = dw_pcie_readb_dbi(pci, PCI_HEADER_TYPE) &
PCI_HEADER_TYPE_MASK;
@@ -756,25 +765,8 @@ int dw_pcie_ep_init_registers(struct dw_pcie_ep *ep)
if (ep->ops->init)
ep->ops->init(ep);

- offset = dw_pcie_ep_find_ext_capability(pci, PCI_EXT_CAP_ID_REBAR);
ptm_cap_base = dw_pcie_ep_find_ext_capability(pci, PCI_EXT_CAP_ID_PTM);

- dw_pcie_dbi_ro_wr_en(pci);
-
- if (offset) {
- reg = dw_pcie_readl_dbi(pci, offset + PCI_REBAR_CTRL);
- nbars = (reg & PCI_REBAR_CTRL_NBAR_MASK) >>
- PCI_REBAR_CTRL_NBAR_SHIFT;
-
- /*
- * PCIe r6.0, sec 7.8.6.2 require us to support at least one
- * size in the range from 1 MB to 512 GB. Advertise support
- * for 1 MB BAR size only.
- */
- for (i = 0; i < nbars; i++, offset += PCI_REBAR_CTRL)
- dw_pcie_writel_dbi(pci, offset + PCI_REBAR_CAP, BIT(4));
- }
-
/*
* PTM responder capability can be disabled only after disabling
* PTM root capability.
@@ -791,8 +783,7 @@ int dw_pcie_ep_init_registers(struct dw_pcie_ep *ep)
dw_pcie_dbi_ro_wr_dis(pci);
}

- dw_pcie_setup(pci);
- dw_pcie_dbi_ro_wr_dis(pci);
+ dw_pcie_ep_init_non_sticky_registers(pci);

return 0;

@@ -803,6 +794,43 @@ int dw_pcie_ep_init_registers(struct dw_pcie_ep *ep)
}
EXPORT_SYMBOL_GPL(dw_pcie_ep_init_registers);

+/**
+ * dw_pcie_ep_linkup - Notify EPF drivers about Link Up event
+ * @ep: DWC EP device
+ */
+void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
+{
+ struct pci_epc *epc = ep->epc;
+
+ pci_epc_linkup(epc);
+}
+EXPORT_SYMBOL_GPL(dw_pcie_ep_linkup);
+
+/**
+ * dw_pcie_ep_linkdown - Notify EPF drivers about Link Down event
+ * @ep: DWC EP device
+ *
+ * Non-sticky registers are also initialized before sending the notification to
+ * the EPF drivers. This is needed since the registers need to be initialized
+ * before the link comes back again.
+ */
+void dw_pcie_ep_linkdown(struct dw_pcie_ep *ep)
+{
+ struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+ struct pci_epc *epc = ep->epc;
+
+ /*
+ * Initialize the non-sticky DWC registers as they would've reset post
+ * Link Down. This is specifically needed for drivers not supporting
+ * PERST# as they have no way to reinitialize the registers before the
+ * link comes back again.
+ */
+ dw_pcie_ep_init_non_sticky_registers(pci);
+
+ pci_epc_linkdown(epc);
+}
+EXPORT_SYMBOL_GPL(dw_pcie_ep_linkdown);
+
/**
* dw_pcie_ep_init - Initialize the endpoint device
* @ep: DWC EP device
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index f8e5431a207b..152969545b0a 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -668,6 +668,7 @@ static inline void __iomem *dw_pcie_own_conf_map_bus(struct pci_bus *bus,

#ifdef CONFIG_PCIE_DW_EP
void dw_pcie_ep_linkup(struct dw_pcie_ep *ep);
+void dw_pcie_ep_linkdown(struct dw_pcie_ep *ep);
int dw_pcie_ep_init(struct dw_pcie_ep *ep);
int dw_pcie_ep_init_registers(struct dw_pcie_ep *ep);
void dw_pcie_ep_init_notify(struct dw_pcie_ep *ep);
@@ -688,6 +689,10 @@ static inline void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
{
}

+static inline void dw_pcie_ep_linkdown(struct dw_pcie_ep *ep)
+{
+}
+
static inline int dw_pcie_ep_init(struct dw_pcie_ep *ep)
{
return 0;

--
2.25.1


2024-04-01 15:53:42

by Manivannan Sadhasivam

[permalink] [raw]
Subject: [PATCH v2 08/10] PCI: qcom-ep: Use the generic dw_pcie_ep_linkdown() API to handle Link Down event

Now that the API is available, let's make use of it. It also handles the
reinitialization of DWC non-sticky registers in addition to sending the
notification to EPF drivers.

Reviewed-by: Niklas Cassel <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/pci/controller/dwc/pcie-qcom-ep.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/controller/dwc/pcie-qcom-ep.c
index e4b742355d57..811f250e967a 100644
--- a/drivers/pci/controller/dwc/pcie-qcom-ep.c
+++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c
@@ -635,7 +635,7 @@ static irqreturn_t qcom_pcie_ep_global_irq_thread(int irq, void *data)
if (FIELD_GET(PARF_INT_ALL_LINK_DOWN, status)) {
dev_dbg(dev, "Received Linkdown event\n");
pcie_ep->link_status = QCOM_PCIE_EP_LINK_DOWN;
- pci_epc_linkdown(pci->ep.epc);
+ dw_pcie_ep_linkdown(&pci->ep);
} else if (FIELD_GET(PARF_INT_ALL_BME, status)) {
dev_dbg(dev, "Received BME event. Link is enabled!\n");
pcie_ep->link_status = QCOM_PCIE_EP_LINK_ENABLED;

--
2.25.1


2024-04-01 15:53:59

by Manivannan Sadhasivam

[permalink] [raw]
Subject: [PATCH v2 09/10] PCI: epf-test: Handle Link Down event

As per the PCIe base spec r5.0, section 5.2, Link Down event can happen
under any of the following circumstances:

1. Fundamental/Hot reset
2. Link disable transmission by upstream component
3. Moving from L2/L3 to L0

When the event happens, the EPC driver capable of detecting it may pass the
notification to the EPF driver through link_down() callback in 'struct
pci_epc_bus_event_ops'.

While the PCIe spec has not defined the actual behavior of the endpoint
when the Link Down event happens, we may assume that at least the ongoing
transactions need to be stopped as the link won't be active. So let's
cancel the command handler work in the callback implementation
pci_epf_test_link_down(). The work will be started again in
pci_epf_test_link_up() once the link comes back again.

Reviewed-by: Niklas Cassel <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/pci/endpoint/functions/pci-epf-test.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index 5933788b0e68..2264e72115e5 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -809,6 +809,15 @@ static int pci_epf_test_link_up(struct pci_epf *epf)
return 0;
}

+static int pci_epf_test_link_down(struct pci_epf *epf)
+{
+ struct pci_epf_test *epf_test = epf_get_drvdata(epf);
+
+ cancel_delayed_work(&epf_test->cmd_handler);
+
+ return 0;
+}
+
static const struct pci_epc_event_ops pci_epf_test_epc_event_ops = {
.init = pci_epf_test_epc_init,
.deinit = pci_epf_test_epc_deinit,
@@ -816,6 +825,7 @@ static const struct pci_epc_event_ops pci_epf_test_epc_event_ops = {

static const struct pci_epc_bus_event_ops pci_epf_test_bus_event_ops = {
.link_up = pci_epf_test_link_up,
+ .link_down = pci_epf_test_link_down,
};

static int pci_epf_test_alloc_space(struct pci_epf *epf)

--
2.25.1


2024-04-01 15:54:18

by Manivannan Sadhasivam

[permalink] [raw]
Subject: [PATCH v2 10/10] PCI: qcom: Implement shutdown() callback to properly reset the endpoint devices

PCIe host controller drivers are supposed to properly reset the endpoint
devices during host shutdown/reboot. Currently, Qcom driver doesn't do
anything during host shutdown/reboot, resulting in both PERST# and refclk
getting disabled at the same time. This prevents the endpoint device
firmware to properly reset the state machine. Because, if the refclk is
cutoff immediately along with PERST#, access to device specific registers
within the endpoint will result in a firmware crash.

To address this issue, let's call qcom_pcie_host_deinit() inside the
shutdown callback, that asserts PERST# and then cuts off the refclk with a
delay of 1ms, thus allowing the endpoint device firmware to properly
cleanup the state machine.

Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/pci/controller/dwc/pcie-qcom.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 14772edcf0d3..b2803978c0ad 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -1655,6 +1655,13 @@ static int qcom_pcie_resume_noirq(struct device *dev)
return 0;
}

+static void qcom_pcie_shutdown(struct platform_device *pdev)
+{
+ struct qcom_pcie *pcie = platform_get_drvdata(pdev);
+
+ qcom_pcie_host_deinit(&pcie->pci->pp);
+}
+
static const struct of_device_id qcom_pcie_match[] = {
{ .compatible = "qcom,pcie-apq8064", .data = &cfg_2_1_0 },
{ .compatible = "qcom,pcie-apq8084", .data = &cfg_1_0_0 },
@@ -1708,5 +1715,6 @@ static struct platform_driver qcom_pcie_driver = {
.pm = &qcom_pcie_pm_ops,
.probe_type = PROBE_PREFER_ASYNCHRONOUS,
},
+ .shutdown = qcom_pcie_shutdown,
};
builtin_platform_driver(qcom_pcie_driver);

--
2.25.1


2024-04-02 00:25:05

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 02/10] PCI: endpoint: Decouple EPC and PCIe bus specific events

On 4/2/24 00:50, Manivannan Sadhasivam wrote:
> Currently, 'struct pci_epc_event_ops' has a bunch of events that are sent
> from the EPC driver to EPF driver. But those events are a mix of EPC
> specific events like core_init and PCIe bus specific events like LINK_UP,
> LINK_DOWN, BME etc...
>
> Let's decouple them to respective structs (pci_epc_event_ops,
> pci_epc_bus_event_ops) to make the separation clear.

I fail to see the benefits here. The event operation names are quite clear and,
in my opinion, it is clear if an event op applies to the controller or to the
bus/link. If anything, "core_init" could a little more clear, so renaming that
"ep_controller_init" or something like that (clearly spelling out what is being
initialized) seems enough to me. Similarly, the "bme" op name is very criptic.
Renaming that to "bus_master_enable" would go a long way clarifying the code.
For link events, "link_up", "link_down" are clear. So I think there is no need
to split the event op struct like this. Renaming the ops is better.

Note that I am not opposed to this patch, but I think it is just code churn
that does not really bring any fundamental improvement. Regardless, renaming
"core_init" and "bme" ops is I think desired.

>
> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---
> drivers/pci/endpoint/functions/pci-epf-mhi.c | 8 ++++++--
> drivers/pci/endpoint/functions/pci-epf-test.c | 8 ++++++--
> drivers/pci/endpoint/pci-epc-core.c | 20 ++++++++++----------
> include/linux/pci-epf.h | 23 ++++++++++++++++-------
> 4 files changed, 38 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/pci/endpoint/functions/pci-epf-mhi.c b/drivers/pci/endpoint/functions/pci-epf-mhi.c
> index 2c54d80107cf..280863c0eeb9 100644
> --- a/drivers/pci/endpoint/functions/pci-epf-mhi.c
> +++ b/drivers/pci/endpoint/functions/pci-epf-mhi.c
> @@ -896,8 +896,11 @@ static void pci_epf_mhi_unbind(struct pci_epf *epf)
> pci_epc_clear_bar(epc, epf->func_no, epf->vfunc_no, epf_bar);
> }
>
> -static const struct pci_epc_event_ops pci_epf_mhi_event_ops = {
> +static const struct pci_epc_event_ops pci_epf_mhi_epc_event_ops = {
> .core_init = pci_epf_mhi_core_init,
> +};
> +
> +static const struct pci_epc_bus_event_ops pci_epf_mhi_bus_event_ops = {
> .link_up = pci_epf_mhi_link_up,
> .link_down = pci_epf_mhi_link_down,
> .bme = pci_epf_mhi_bme,
> @@ -919,7 +922,8 @@ static int pci_epf_mhi_probe(struct pci_epf *epf,
> epf_mhi->info = info;
> epf_mhi->epf = epf;
>
> - epf->event_ops = &pci_epf_mhi_event_ops;
> + epf->epc_event_ops = &pci_epf_mhi_epc_event_ops;
> + epf->bus_event_ops = &pci_epf_mhi_bus_event_ops;
>
> mutex_init(&epf_mhi->lock);
>
> diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
> index 977fb79c1567..973db0b1bde2 100644
> --- a/drivers/pci/endpoint/functions/pci-epf-test.c
> +++ b/drivers/pci/endpoint/functions/pci-epf-test.c
> @@ -798,8 +798,11 @@ static int pci_epf_test_link_up(struct pci_epf *epf)
> return 0;
> }
>
> -static const struct pci_epc_event_ops pci_epf_test_event_ops = {
> +static const struct pci_epc_event_ops pci_epf_test_epc_event_ops = {
> .core_init = pci_epf_test_core_init,
> +};
> +
> +static const struct pci_epc_bus_event_ops pci_epf_test_bus_event_ops = {
> .link_up = pci_epf_test_link_up,
> };
>
> @@ -916,7 +919,8 @@ static int pci_epf_test_probe(struct pci_epf *epf,
>
> INIT_DELAYED_WORK(&epf_test->cmd_handler, pci_epf_test_cmd_handler);
>
> - epf->event_ops = &pci_epf_test_event_ops;
> + epf->epc_event_ops = &pci_epf_test_epc_event_ops;
> + epf->bus_event_ops = &pci_epf_test_bus_event_ops;
>
> epf_set_drvdata(epf, epf_test);
> return 0;
> diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
> index 47d27ec7439d..f202ae07ffa9 100644
> --- a/drivers/pci/endpoint/pci-epc-core.c
> +++ b/drivers/pci/endpoint/pci-epc-core.c
> @@ -692,8 +692,8 @@ void pci_epc_linkup(struct pci_epc *epc)
> mutex_lock(&epc->list_lock);
> list_for_each_entry(epf, &epc->pci_epf, list) {
> mutex_lock(&epf->lock);
> - if (epf->event_ops && epf->event_ops->link_up)
> - epf->event_ops->link_up(epf);
> + if (epf->bus_event_ops && epf->bus_event_ops->link_up)
> + epf->bus_event_ops->link_up(epf);
> mutex_unlock(&epf->lock);
> }
> mutex_unlock(&epc->list_lock);
> @@ -718,8 +718,8 @@ void pci_epc_linkdown(struct pci_epc *epc)
> mutex_lock(&epc->list_lock);
> list_for_each_entry(epf, &epc->pci_epf, list) {
> mutex_lock(&epf->lock);
> - if (epf->event_ops && epf->event_ops->link_down)
> - epf->event_ops->link_down(epf);
> + if (epf->bus_event_ops && epf->bus_event_ops->link_down)
> + epf->bus_event_ops->link_down(epf);
> mutex_unlock(&epf->lock);
> }
> mutex_unlock(&epc->list_lock);
> @@ -744,8 +744,8 @@ void pci_epc_init_notify(struct pci_epc *epc)
> mutex_lock(&epc->list_lock);
> list_for_each_entry(epf, &epc->pci_epf, list) {
> mutex_lock(&epf->lock);
> - if (epf->event_ops && epf->event_ops->core_init)
> - epf->event_ops->core_init(epf);
> + if (epf->epc_event_ops && epf->epc_event_ops->core_init)
> + epf->epc_event_ops->core_init(epf);
> mutex_unlock(&epf->lock);
> }
> epc->init_complete = true;
> @@ -767,8 +767,8 @@ void pci_epc_notify_pending_init(struct pci_epc *epc, struct pci_epf *epf)
> {
> if (epc->init_complete) {
> mutex_lock(&epf->lock);
> - if (epf->event_ops && epf->event_ops->core_init)
> - epf->event_ops->core_init(epf);
> + if (epf->epc_event_ops && epf->epc_event_ops->core_init)
> + epf->epc_event_ops->core_init(epf);
> mutex_unlock(&epf->lock);
> }
> }
> @@ -792,8 +792,8 @@ void pci_epc_bme_notify(struct pci_epc *epc)
> mutex_lock(&epc->list_lock);
> list_for_each_entry(epf, &epc->pci_epf, list) {
> mutex_lock(&epf->lock);
> - if (epf->event_ops && epf->event_ops->bme)
> - epf->event_ops->bme(epf);
> + if (epf->bus_event_ops && epf->bus_event_ops->bme)
> + epf->bus_event_ops->bme(epf);
> mutex_unlock(&epf->lock);
> }
> mutex_unlock(&epc->list_lock);
> diff --git a/include/linux/pci-epf.h b/include/linux/pci-epf.h
> index adee6a1b35db..77399fecaeb5 100644
> --- a/include/linux/pci-epf.h
> +++ b/include/linux/pci-epf.h
> @@ -69,14 +69,21 @@ struct pci_epf_ops {
> };
>
> /**
> - * struct pci_epc_event_ops - Callbacks for capturing the EPC events
> - * @core_init: Callback for the EPC initialization complete event
> - * @link_up: Callback for the EPC link up event
> - * @link_down: Callback for the EPC link down event
> - * @bme: Callback for the EPC BME (Bus Master Enable) event
> + * struct pci_epc_event_ops - Callbacks for capturing the EPC specific events
> + * @core_init: Callback for the EPC initialization event
> */
> struct pci_epc_event_ops {
> int (*core_init)(struct pci_epf *epf);
> +};
> +
> +/**
> + * struct pci_epc_bus_event_ops - Callbacks for capturing the PCIe bus specific
> + * events
> + * @link_up: Callback for the PCIe bus link up event
> + * @link_down: Callback for the PCIe bus link down event
> + * @bme: Callback for the PCIe bus BME (Bus Master Enable) event
> + */
> +struct pci_epc_bus_event_ops {
> int (*link_up)(struct pci_epf *epf);
> int (*link_down)(struct pci_epf *epf);
> int (*bme)(struct pci_epf *epf);
> @@ -150,7 +157,8 @@ struct pci_epf_bar {
> * @is_vf: true - virtual function, false - physical function
> * @vfunction_num_map: bitmap to manage virtual function number
> * @pci_vepf: list of virtual endpoint functions associated with this function
> - * @event_ops: Callbacks for capturing the EPC events
> + * @epc_event_ops: Callbacks for capturing the EPC events
> + * @bus_event_ops: Callbacks for capturing the PCIe bus events
> */
> struct pci_epf {
> struct device dev;
> @@ -180,7 +188,8 @@ struct pci_epf {
> unsigned int is_vf;
> unsigned long vfunction_num_map;
> struct list_head pci_vepf;
> - const struct pci_epc_event_ops *event_ops;
> + const struct pci_epc_event_ops *epc_event_ops;
> + const struct pci_epc_bus_event_ops *bus_event_ops;
> };
>
> /**
>

--
Damien Le Moal
Western Digital Research


2024-04-02 09:41:18

by Niklas Cassel

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] PCI: endpoint: Make host reboot handling more robust

On Mon, Apr 01, 2024 at 09:20:26PM +0530, Manivannan Sadhasivam wrote:
> Hello,
>
> This is the follow up series of [1], to improve the handling of host reboot in
> the endpoint subsystem. This involves refining the PERST# and Link Down event
> handling in both the controller and function drivers.
>
> Testing
> =======
>
> This series is tested on Qcom SM8450 based development board with both MHI_EPF
> and EPF_TEST function drivers.
>
> Dependency
> ==========
>
> This series depends on [1] and [2].
>
> - Mani

Hello Mani,

> [1] https://lore.kernel.org/linux-pci/[email protected]/
> [2] https://lore.kernel.org/linux-pci/[email protected]/

AFAICT both these series [1] (DBI rework v12, not v10) and [2] are fully
reviewed and seem to be ready to go.

Considering that we have patches depending on [1] and [2],
namely the series in $subject, but also:
https://lore.kernel.org/linux-pci/[email protected]/T/#t

I think it would be a good idea if you could apply [1] and [2] to the
pci/endpoint branch.

(It is not easy for people to know that they will need to rebase their work on
these (fully reviewed) series, when they have not been applied.)


Kind regards,
Niklas


>
> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---
> Changes in v2:
> - Dropped the {start/stop}_link rework patches
> - Incorporated comments from Niklas
> - Collected review tags
> - Rebased on top of v6.9-rc1 and https://lore.kernel.org/linux-pci/[email protected]/
> - Link to v1: https://lore.kernel.org/r/[email protected]
>
> ---
> Manivannan Sadhasivam (10):
> PCI: qcom-ep: Disable resources unconditionally during PERST# assert
> PCI: endpoint: Decouple EPC and PCIe bus specific events
> PCI: endpoint: Rename core_init() callback in 'struct pci_epc_event_ops' to init()
> PCI: epf-test: Refactor pci_epf_test_unbind() function
> PCI: epf-{mhi/test}: Move DMA initialization to EPC init callback
> PCI: endpoint: Introduce EPC 'deinit' event and notify the EPF drivers
> PCI: dwc: ep: Add a generic dw_pcie_ep_linkdown() API to handle Link Down event
> PCI: qcom-ep: Use the generic dw_pcie_ep_linkdown() API to handle Link Down event
> PCI: epf-test: Handle Link Down event
> PCI: qcom: Implement shutdown() callback to properly reset the endpoint devices
>
> drivers/pci/controller/dwc/pcie-designware-ep.c | 99 ++++++++++++++---------
> drivers/pci/controller/dwc/pcie-designware.h | 5 ++
> drivers/pci/controller/dwc/pcie-qcom-ep.c | 9 +--
> drivers/pci/controller/dwc/pcie-qcom.c | 8 ++
> drivers/pci/controller/dwc/pcie-tegra194.c | 1 +
> drivers/pci/endpoint/functions/pci-epf-mhi.c | 47 ++++++++---
> drivers/pci/endpoint/functions/pci-epf-test.c | 103 +++++++++++++++++-------
> drivers/pci/endpoint/pci-epc-core.c | 53 ++++++++----
> include/linux/pci-epc.h | 1 +
> include/linux/pci-epf.h | 27 +++++--
> 10 files changed, 248 insertions(+), 105 deletions(-)
> ---
> base-commit: e6377605ca734126533a0f8e4de2b4bac881f076
> change-id: 20240314-pci-epf-rework-a6e65b103a79
>
> Best regards,
> --
> Manivannan Sadhasivam <[email protected]>
>

2024-04-02 10:52:58

by Niklas Cassel

[permalink] [raw]
Subject: Re: [PATCH v2 07/10] PCI: dwc: ep: Add a generic dw_pcie_ep_linkdown() API to handle Link Down event

On Mon, Apr 01, 2024 at 09:20:33PM +0530, Manivannan Sadhasivam wrote:
> As per the PCIe base spec r5.0, section 5.2, Link Down event can happen
> under any of the following circumstances:
>
> 1. Fundamental/Hot reset
> 2. Link disable transmission by upstream component
> 3. Moving from L2/L3 to L0
>
> In those cases, Link Down causes some non-sticky DWC registers to loose the
> state (like REBAR, etc...). So the drivers need to reinitialize them to
> function properly once the link comes back again.
>
> This is not a problem for drivers supporting PERST# IRQ, since they can
> reinitialize the registers in the PERST# IRQ callback. But for the drivers
> not supporting PERST#, there is no way they can reinitialize the registers
> other than relying on Link Down IRQ received when the link goes down. So
> let's add a DWC generic API dw_pcie_ep_linkdown() that reinitializes the
> non-sticky registers and also notifies the EPF drivers about link going
> down.
>
> This API can also be used by the drivers supporting PERST# to handle the
> scenario (2) mentioned above.
>
> NOTE: For the sake of code organization, move the dw_pcie_ep_linkup()
> definition just above dw_pcie_ep_linkdown().
>
> Reviewed-by: Niklas Cassel <[email protected]>
> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---
> drivers/pci/controller/dwc/pcie-designware-ep.c | 98 ++++++++++++++++---------
> drivers/pci/controller/dwc/pcie-designware.h | 5 ++
> 2 files changed, 68 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index 2063cf2049e5..56b34267850e 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -15,18 +15,6 @@
> #include <linux/pci-epc.h>
> #include <linux/pci-epf.h>
>
> -/**
> - * dw_pcie_ep_linkup - Notify EPF drivers about Link Up event
> - * @ep: DWC EP device
> - */
> -void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
> -{
> - struct pci_epc *epc = ep->epc;
> -
> - pci_epc_linkup(epc);
> -}
> -EXPORT_SYMBOL_GPL(dw_pcie_ep_linkup);
> -
> /**
> * dw_pcie_ep_init_notify - Notify EPF drivers about EPC initialization complete
> * @ep: DWC EP device
> @@ -673,6 +661,29 @@ static unsigned int dw_pcie_ep_find_ext_capability(struct dw_pcie *pci, int cap)
> return 0;
> }
>
> +static void dw_pcie_ep_init_non_sticky_registers(struct dw_pcie *pci)
> +{
> + unsigned int offset;
> + unsigned int nbars;
> + u32 reg, i;
> +
> + offset = dw_pcie_ep_find_ext_capability(pci, PCI_EXT_CAP_ID_REBAR);
> +
> + dw_pcie_dbi_ro_wr_en(pci);
> +
> + if (offset) {
> + reg = dw_pcie_readl_dbi(pci, offset + PCI_REBAR_CTRL);
> + nbars = (reg & PCI_REBAR_CTRL_NBAR_MASK) >>
> + PCI_REBAR_CTRL_NBAR_SHIFT;
> +
> + for (i = 0; i < nbars; i++, offset += PCI_REBAR_CTRL)
> + dw_pcie_writel_dbi(pci, offset + PCI_REBAR_CAP, 0x0);
> + }
> +
> + dw_pcie_setup(pci);
> + dw_pcie_dbi_ro_wr_dis(pci);
> +}
> +
> /**
> * dw_pcie_ep_init_registers - Initialize DWC EP specific registers
> * @ep: DWC EP device
> @@ -687,13 +698,11 @@ int dw_pcie_ep_init_registers(struct dw_pcie_ep *ep)
> struct dw_pcie_ep_func *ep_func;
> struct device *dev = pci->dev;
> struct pci_epc *epc = ep->epc;
> - unsigned int offset, ptm_cap_base;
> - unsigned int nbars;
> + u32 ptm_cap_base, reg;
> u8 hdr_type;
> u8 func_no;
> - int i, ret;
> void *addr;
> - u32 reg;
> + int ret;
>
> hdr_type = dw_pcie_readb_dbi(pci, PCI_HEADER_TYPE) &
> PCI_HEADER_TYPE_MASK;
> @@ -756,25 +765,8 @@ int dw_pcie_ep_init_registers(struct dw_pcie_ep *ep)
> if (ep->ops->init)
> ep->ops->init(ep);
>
> - offset = dw_pcie_ep_find_ext_capability(pci, PCI_EXT_CAP_ID_REBAR);
> ptm_cap_base = dw_pcie_ep_find_ext_capability(pci, PCI_EXT_CAP_ID_PTM);
>
> - dw_pcie_dbi_ro_wr_en(pci);
> -
> - if (offset) {
> - reg = dw_pcie_readl_dbi(pci, offset + PCI_REBAR_CTRL);
> - nbars = (reg & PCI_REBAR_CTRL_NBAR_MASK) >>
> - PCI_REBAR_CTRL_NBAR_SHIFT;
> -
> - /*
> - * PCIe r6.0, sec 7.8.6.2 require us to support at least one
> - * size in the range from 1 MB to 512 GB. Advertise support
> - * for 1 MB BAR size only.
> - */

Here you remove the above comment, but you don't seem to (re-)add the comment
when moving this code to dw_pcie_ep_init_non_sticky_registers().
(I assume that this is a simple rebase mistake.)


Kind regards,
Niklas

> - for (i = 0; i < nbars; i++, offset += PCI_REBAR_CTRL)
> - dw_pcie_writel_dbi(pci, offset + PCI_REBAR_CAP, BIT(4));
> - }
> -
> /*
> * PTM responder capability can be disabled only after disabling
> * PTM root capability.

2024-04-02 11:09:50

by Niklas Cassel

[permalink] [raw]
Subject: Re: [PATCH v2 03/10] PCI: endpoint: Rename core_init() callback in 'struct pci_epc_event_ops' to init()

On Mon, Apr 01, 2024 at 09:20:29PM +0530, Manivannan Sadhasivam wrote:
> core_init() callback is used to notify the EPC initialization event to the
> EPF drivers. The 'core' prefix was used indicate that the controller IP
> core has completed initialization. But it serves no purpose as the EPF
> driver will only care about the EPC initialization as a whole and there is
> no real benefit to distinguish the IP core part.
>
> So let's rename the core_init() callback in 'struct pci_epc_event_ops' to
> just init() to make it more clear.
>
> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---
> drivers/pci/endpoint/functions/pci-epf-mhi.c | 4 ++--
> drivers/pci/endpoint/functions/pci-epf-test.c | 4 ++--
> drivers/pci/endpoint/pci-epc-core.c | 16 ++++++++--------
> include/linux/pci-epf.h | 4 ++--
> 4 files changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/pci/endpoint/functions/pci-epf-mhi.c b/drivers/pci/endpoint/functions/pci-epf-mhi.c
> index 280863c0eeb9..b3c26ffd29a5 100644
> --- a/drivers/pci/endpoint/functions/pci-epf-mhi.c
> +++ b/drivers/pci/endpoint/functions/pci-epf-mhi.c
> @@ -716,7 +716,7 @@ static void pci_epf_mhi_dma_deinit(struct pci_epf_mhi *epf_mhi)
> epf_mhi->dma_chan_rx = NULL;
> }
>
> -static int pci_epf_mhi_core_init(struct pci_epf *epf)
> +static int pci_epf_mhi_epc_init(struct pci_epf *epf)
> {
> struct pci_epf_mhi *epf_mhi = epf_get_drvdata(epf);
> const struct pci_epf_mhi_ep_info *info = epf_mhi->info;
> @@ -897,7 +897,7 @@ static void pci_epf_mhi_unbind(struct pci_epf *epf)
> }
>
> static const struct pci_epc_event_ops pci_epf_mhi_epc_event_ops = {
> - .core_init = pci_epf_mhi_core_init,
> + .init = pci_epf_mhi_epc_init,
> };
>
> static const struct pci_epc_bus_event_ops pci_epf_mhi_bus_event_ops = {
> diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
> index 973db0b1bde2..abcb6ca61c4e 100644
> --- a/drivers/pci/endpoint/functions/pci-epf-test.c
> +++ b/drivers/pci/endpoint/functions/pci-epf-test.c
> @@ -731,7 +731,7 @@ static int pci_epf_test_set_bar(struct pci_epf *epf)
> return 0;
> }
>
> -static int pci_epf_test_core_init(struct pci_epf *epf)
> +static int pci_epf_test_epc_init(struct pci_epf *epf)

On V1 you agreed that it is better to remove 'epc' from the naming.
(For both pci-epf-test and pci-epf-mhi).
You seem to have forgotten to address this for V2.


Kind regards,
Niklas

2024-04-02 11:10:43

by Niklas Cassel

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] PCI: endpoint: Make host reboot handling more robust

On Mon, Apr 01, 2024 at 09:20:26PM +0530, Manivannan Sadhasivam wrote:
> Hello,
>
> This is the follow up series of [1], to improve the handling of host reboot in
> the endpoint subsystem. This involves refining the PERST# and Link Down event
> handling in both the controller and function drivers.
>
> Testing
> =======
>
> This series is tested on Qcom SM8450 based development board with both MHI_EPF
> and EPF_TEST function drivers.
>
> Dependency
> ==========
>
> This series depends on [1] and [2].
>
> - Mani
>
> [1] https://lore.kernel.org/linux-pci/[email protected]/
> [2] https://lore.kernel.org/linux-pci/[email protected]/
>
> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---

Tested-by: Niklas Cassel <[email protected]>

2024-04-02 11:20:03

by Niklas Cassel

[permalink] [raw]
Subject: Re: [PATCH v2 10/10] PCI: qcom: Implement shutdown() callback to properly reset the endpoint devices

On Mon, Apr 01, 2024 at 09:20:36PM +0530, Manivannan Sadhasivam wrote:
> PCIe host controller drivers are supposed to properly reset the endpoint
> devices during host shutdown/reboot. Currently, Qcom driver doesn't do
> anything during host shutdown/reboot, resulting in both PERST# and refclk
> getting disabled at the same time. This prevents the endpoint device
> firmware to properly reset the state machine. Because, if the refclk is
> cutoff immediately along with PERST#, access to device specific registers
> within the endpoint will result in a firmware crash.
>
> To address this issue, let's call qcom_pcie_host_deinit() inside the
> shutdown callback, that asserts PERST# and then cuts off the refclk with a
> delay of 1ms, thus allowing the endpoint device firmware to properly
> cleanup the state machine.

Hm... a QCOM EP device could be attached to any of the PCIe RC drivers that
we have in the kernel, so it seems a bit weird to fix this problem by
patching the QCOM RC driver only.

Which DBI call is it that causes this problem during perst assert on EP side?

I assume that it is pci-epf-test:deinit() callback that calls
pci_epc_clear_bar(), which calls dw_pcie_ep_clear_bar(), which will both:
-clear local data structures, e.g.
ep->epf_bar[bar] = NULL;
ep->bar_to_atu[bar] = 0;

but also call:
__dw_pcie_ep_reset_bar()
dw_pcie_disable_atu()


Do we perhaps need to redesign the .deinit EPF callback?

Considering that we know that .deinit() will only be called on platforms
where there will be a fundamental core reset, I guess we could do something
like introduce a __dw_pcie_ep_clear_bar() which will only clear the local
data structures. (It might not need to do any DBI writes, since the
fundamental core reset should have reset all values.)

Or perhaps instead of letting pci_epf_test_epc_deinit() call
pci_epf_test_clear_bar()/__pci_epf_test_clear_bar() directly, perhaps let
pci_epf_test_epc_deinit() call add a .deinit()/.cleanup() defined in the
EPC driver.

This EPC .deinit()/.cleanup() callback would then only clear the
local data structures (no DBI writes...).

Something like that?


>
> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---
> drivers/pci/controller/dwc/pcie-qcom.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> index 14772edcf0d3..b2803978c0ad 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> @@ -1655,6 +1655,13 @@ static int qcom_pcie_resume_noirq(struct device *dev)
> return 0;
> }
>
> +static void qcom_pcie_shutdown(struct platform_device *pdev)
> +{
> + struct qcom_pcie *pcie = platform_get_drvdata(pdev);
> +
> + qcom_pcie_host_deinit(&pcie->pci->pp);
> +}
> +
> static const struct of_device_id qcom_pcie_match[] = {
> { .compatible = "qcom,pcie-apq8064", .data = &cfg_2_1_0 },
> { .compatible = "qcom,pcie-apq8084", .data = &cfg_1_0_0 },
> @@ -1708,5 +1715,6 @@ static struct platform_driver qcom_pcie_driver = {
> .pm = &qcom_pcie_pm_ops,
> .probe_type = PROBE_PREFER_ASYNCHRONOUS,
> },
> + .shutdown = qcom_pcie_shutdown,
> };
> builtin_platform_driver(qcom_pcie_driver);
>
> --
> 2.25.1
>

2024-04-03 13:43:44

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 10/10] PCI: qcom: Implement shutdown() callback to properly reset the endpoint devices

On Tue, Apr 02, 2024 at 01:18:54PM +0200, Niklas Cassel wrote:
> On Mon, Apr 01, 2024 at 09:20:36PM +0530, Manivannan Sadhasivam wrote:
> > PCIe host controller drivers are supposed to properly reset the endpoint
> > devices during host shutdown/reboot. Currently, Qcom driver doesn't do
> > anything during host shutdown/reboot, resulting in both PERST# and refclk
> > getting disabled at the same time. This prevents the endpoint device
> > firmware to properly reset the state machine. Because, if the refclk is
> > cutoff immediately along with PERST#, access to device specific registers
> > within the endpoint will result in a firmware crash.
> >
> > To address this issue, let's call qcom_pcie_host_deinit() inside the
> > shutdown callback, that asserts PERST# and then cuts off the refclk with a
> > delay of 1ms, thus allowing the endpoint device firmware to properly
> > cleanup the state machine.
>
> Hm... a QCOM EP device could be attached to any of the PCIe RC drivers that
> we have in the kernel, so it seems a bit weird to fix this problem by
> patching the QCOM RC driver only.
>
> Which DBI call is it that causes this problem during perst assert on EP side?
>
> I assume that it is pci-epf-test:deinit() callback that calls
> pci_epc_clear_bar(), which calls dw_pcie_ep_clear_bar(), which will both:
> -clear local data structures, e.g.
> ep->epf_bar[bar] = NULL;
> ep->bar_to_atu[bar] = 0;
>
> but also call:
> __dw_pcie_ep_reset_bar()
> dw_pcie_disable_atu()
>
>
> Do we perhaps need to redesign the .deinit EPF callback?
>
> Considering that we know that .deinit() will only be called on platforms
> where there will be a fundamental core reset, I guess we could do something
> like introduce a __dw_pcie_ep_clear_bar() which will only clear the local
> data structures. (It might not need to do any DBI writes, since the
> fundamental core reset should have reset all values.)
>
> Or perhaps instead of letting pci_epf_test_epc_deinit() call
> pci_epf_test_clear_bar()/__pci_epf_test_clear_bar() directly, perhaps let
> pci_epf_test_epc_deinit() call add a .deinit()/.cleanup() defined in the
> EPC driver.
>
> This EPC .deinit()/.cleanup() callback would then only clear the
> local data structures (no DBI writes...).
>
> Something like that?
>

It is not just about the EPF test driver. A function driver may need to do many
things to properly reset the state machine. Like in the case of MHI driver, it
needs to reset channel state, mask interrupts etc... and all requires writing to
some registers. So certainly there should be some time before cutting off the
refclk.

On x86 host machines, I can see that there is enough time before the host cuts
off the refclk.

But it is unfortunate that the PCIe spec didn't define it though.

- Mani

>
> >
> > Signed-off-by: Manivannan Sadhasivam <[email protected]>
> > ---
> > drivers/pci/controller/dwc/pcie-qcom.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> > index 14772edcf0d3..b2803978c0ad 100644
> > --- a/drivers/pci/controller/dwc/pcie-qcom.c
> > +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> > @@ -1655,6 +1655,13 @@ static int qcom_pcie_resume_noirq(struct device *dev)
> > return 0;
> > }
> >
> > +static void qcom_pcie_shutdown(struct platform_device *pdev)
> > +{
> > + struct qcom_pcie *pcie = platform_get_drvdata(pdev);
> > +
> > + qcom_pcie_host_deinit(&pcie->pci->pp);
> > +}
> > +
> > static const struct of_device_id qcom_pcie_match[] = {
> > { .compatible = "qcom,pcie-apq8064", .data = &cfg_2_1_0 },
> > { .compatible = "qcom,pcie-apq8084", .data = &cfg_1_0_0 },
> > @@ -1708,5 +1715,6 @@ static struct platform_driver qcom_pcie_driver = {
> > .pm = &qcom_pcie_pm_ops,
> > .probe_type = PROBE_PREFER_ASYNCHRONOUS,
> > },
> > + .shutdown = qcom_pcie_shutdown,
> > };
> > builtin_platform_driver(qcom_pcie_driver);
> >
> > --
> > 2.25.1
> >

--
மணிவண்ணன் சதாசிவம்

2024-04-03 13:46:57

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 03/10] PCI: endpoint: Rename core_init() callback in 'struct pci_epc_event_ops' to init()

On Tue, Apr 02, 2024 at 12:52:10PM +0200, Niklas Cassel wrote:
> On Mon, Apr 01, 2024 at 09:20:29PM +0530, Manivannan Sadhasivam wrote:
> > core_init() callback is used to notify the EPC initialization event to the
> > EPF drivers. The 'core' prefix was used indicate that the controller IP
> > core has completed initialization. But it serves no purpose as the EPF
> > driver will only care about the EPC initialization as a whole and there is
> > no real benefit to distinguish the IP core part.
> >
> > So let's rename the core_init() callback in 'struct pci_epc_event_ops' to
> > just init() to make it more clear.
> >
> > Signed-off-by: Manivannan Sadhasivam <[email protected]>
> > ---
> > drivers/pci/endpoint/functions/pci-epf-mhi.c | 4 ++--
> > drivers/pci/endpoint/functions/pci-epf-test.c | 4 ++--
> > drivers/pci/endpoint/pci-epc-core.c | 16 ++++++++--------
> > include/linux/pci-epf.h | 4 ++--
> > 4 files changed, 14 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-mhi.c b/drivers/pci/endpoint/functions/pci-epf-mhi.c
> > index 280863c0eeb9..b3c26ffd29a5 100644
> > --- a/drivers/pci/endpoint/functions/pci-epf-mhi.c
> > +++ b/drivers/pci/endpoint/functions/pci-epf-mhi.c
> > @@ -716,7 +716,7 @@ static void pci_epf_mhi_dma_deinit(struct pci_epf_mhi *epf_mhi)
> > epf_mhi->dma_chan_rx = NULL;
> > }
> >
> > -static int pci_epf_mhi_core_init(struct pci_epf *epf)
> > +static int pci_epf_mhi_epc_init(struct pci_epf *epf)
> > {
> > struct pci_epf_mhi *epf_mhi = epf_get_drvdata(epf);
> > const struct pci_epf_mhi_ep_info *info = epf_mhi->info;
> > @@ -897,7 +897,7 @@ static void pci_epf_mhi_unbind(struct pci_epf *epf)
> > }
> >
> > static const struct pci_epc_event_ops pci_epf_mhi_epc_event_ops = {
> > - .core_init = pci_epf_mhi_core_init,
> > + .init = pci_epf_mhi_epc_init,
> > };
> >
> > static const struct pci_epc_bus_event_ops pci_epf_mhi_bus_event_ops = {
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
> > index 973db0b1bde2..abcb6ca61c4e 100644
> > --- a/drivers/pci/endpoint/functions/pci-epf-test.c
> > +++ b/drivers/pci/endpoint/functions/pci-epf-test.c
> > @@ -731,7 +731,7 @@ static int pci_epf_test_set_bar(struct pci_epf *epf)
> > return 0;
> > }
> >
> > -static int pci_epf_test_core_init(struct pci_epf *epf)
> > +static int pci_epf_test_epc_init(struct pci_epf *epf)
>
> On V1 you agreed that it is better to remove 'epc' from the naming.
> (For both pci-epf-test and pci-epf-mhi).
> You seem to have forgotten to address this for V2.
>

Oh yeah, sorry about that. I tried to address comments for both series and
apparently this one got missed.

- Mani

--
மணிவண்ணன் சதாசிவம்

2024-04-03 13:48:50

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] PCI: endpoint: Make host reboot handling more robust

On Tue, Apr 02, 2024 at 11:41:00AM +0200, Niklas Cassel wrote:
> On Mon, Apr 01, 2024 at 09:20:26PM +0530, Manivannan Sadhasivam wrote:
> > Hello,
> >
> > This is the follow up series of [1], to improve the handling of host reboot in
> > the endpoint subsystem. This involves refining the PERST# and Link Down event
> > handling in both the controller and function drivers.
> >
> > Testing
> > =======
> >
> > This series is tested on Qcom SM8450 based development board with both MHI_EPF
> > and EPF_TEST function drivers.
> >
> > Dependency
> > ==========
> >
> > This series depends on [1] and [2].
> >
> > - Mani
>
> Hello Mani,
>
> > [1] https://lore.kernel.org/linux-pci/[email protected]/
> > [2] https://lore.kernel.org/linux-pci/[email protected]/
>
> AFAICT both these series [1] (DBI rework v12, not v10) and [2] are fully
> reviewed and seem to be ready to go.
>
> Considering that we have patches depending on [1] and [2],
> namely the series in $subject, but also:
> https://lore.kernel.org/linux-pci/[email protected]/T/#t
>
> I think it would be a good idea if you could apply [1] and [2] to the
> pci/endpoint branch.
>

Unfortunately, I cannot merge the patches outside 'pci/endpoint' even though
these are related to the PCI Endpoint subsystem. But I have delegated these 2
series to Krzysztof, so hopefully he'll apply it soon.

- Mani

> (It is not easy for people to know that they will need to rebase their work on
> these (fully reviewed) series, when they have not been applied.)
>
>
> Kind regards,
> Niklas
>
>
> >
> > Signed-off-by: Manivannan Sadhasivam <[email protected]>
> > ---
> > Changes in v2:
> > - Dropped the {start/stop}_link rework patches
> > - Incorporated comments from Niklas
> > - Collected review tags
> > - Rebased on top of v6.9-rc1 and https://lore.kernel.org/linux-pci/[email protected]/
> > - Link to v1: https://lore.kernel.org/r/[email protected]
> >
> > ---
> > Manivannan Sadhasivam (10):
> > PCI: qcom-ep: Disable resources unconditionally during PERST# assert
> > PCI: endpoint: Decouple EPC and PCIe bus specific events
> > PCI: endpoint: Rename core_init() callback in 'struct pci_epc_event_ops' to init()
> > PCI: epf-test: Refactor pci_epf_test_unbind() function
> > PCI: epf-{mhi/test}: Move DMA initialization to EPC init callback
> > PCI: endpoint: Introduce EPC 'deinit' event and notify the EPF drivers
> > PCI: dwc: ep: Add a generic dw_pcie_ep_linkdown() API to handle Link Down event
> > PCI: qcom-ep: Use the generic dw_pcie_ep_linkdown() API to handle Link Down event
> > PCI: epf-test: Handle Link Down event
> > PCI: qcom: Implement shutdown() callback to properly reset the endpoint devices
> >
> > drivers/pci/controller/dwc/pcie-designware-ep.c | 99 ++++++++++++++---------
> > drivers/pci/controller/dwc/pcie-designware.h | 5 ++
> > drivers/pci/controller/dwc/pcie-qcom-ep.c | 9 +--
> > drivers/pci/controller/dwc/pcie-qcom.c | 8 ++
> > drivers/pci/controller/dwc/pcie-tegra194.c | 1 +
> > drivers/pci/endpoint/functions/pci-epf-mhi.c | 47 ++++++++---
> > drivers/pci/endpoint/functions/pci-epf-test.c | 103 +++++++++++++++++-------
> > drivers/pci/endpoint/pci-epc-core.c | 53 ++++++++----
> > include/linux/pci-epc.h | 1 +
> > include/linux/pci-epf.h | 27 +++++--
> > 10 files changed, 248 insertions(+), 105 deletions(-)
> > ---
> > base-commit: e6377605ca734126533a0f8e4de2b4bac881f076
> > change-id: 20240314-pci-epf-rework-a6e65b103a79
> >
> > Best regards,
> > --
> > Manivannan Sadhasivam <[email protected]>
> >

--
மணிவண்ணன் சதாசிவம்

2024-04-03 14:27:42

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 02/10] PCI: endpoint: Decouple EPC and PCIe bus specific events

On Tue, Apr 02, 2024 at 09:14:20AM +0900, Damien Le Moal wrote:
> On 4/2/24 00:50, Manivannan Sadhasivam wrote:
> > Currently, 'struct pci_epc_event_ops' has a bunch of events that are sent
> > from the EPC driver to EPF driver. But those events are a mix of EPC
> > specific events like core_init and PCIe bus specific events like LINK_UP,
> > LINK_DOWN, BME etc...
> >
> > Let's decouple them to respective structs (pci_epc_event_ops,
> > pci_epc_bus_event_ops) to make the separation clear.
>
> I fail to see the benefits here. The event operation names are quite clear and,
> in my opinion, it is clear if an event op applies to the controller or to the
> bus/link. If anything, "core_init" could a little more clear, so renaming that
> "ep_controller_init" or something like that (clearly spelling out what is being
> initialized) seems enough to me. Similarly, the "bme" op name is very criptic.
> Renaming that to "bus_master_enable" would go a long way clarifying the code.
> For link events, "link_up", "link_down" are clear. So I think there is no need
> to split the event op struct like this. Renaming the ops is better.
>
> Note that I am not opposed to this patch, but I think it is just code churn
> that does not really bring any fundamental improvement. Regardless, renaming
> "core_init" and "bme" ops is I think desired.
>

Niklas shared the same view during v1, but I hate to see the events being mixed
in a single ops. Especially that it will confuse the developers who are not
familiar with the EP subsystem.

But since the argument is coming twice, I've decided to drop this for now and
just rename the 'core_init' callback to 'epc_init' and name the deinit callback
as 'epc_deinit'.

Let's see how long the callback list goes...

- Mani

> >
> > Signed-off-by: Manivannan Sadhasivam <[email protected]>
> > ---
> > drivers/pci/endpoint/functions/pci-epf-mhi.c | 8 ++++++--
> > drivers/pci/endpoint/functions/pci-epf-test.c | 8 ++++++--
> > drivers/pci/endpoint/pci-epc-core.c | 20 ++++++++++----------
> > include/linux/pci-epf.h | 23 ++++++++++++++++-------
> > 4 files changed, 38 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-mhi.c b/drivers/pci/endpoint/functions/pci-epf-mhi.c
> > index 2c54d80107cf..280863c0eeb9 100644
> > --- a/drivers/pci/endpoint/functions/pci-epf-mhi.c
> > +++ b/drivers/pci/endpoint/functions/pci-epf-mhi.c
> > @@ -896,8 +896,11 @@ static void pci_epf_mhi_unbind(struct pci_epf *epf)
> > pci_epc_clear_bar(epc, epf->func_no, epf->vfunc_no, epf_bar);
> > }
> >
> > -static const struct pci_epc_event_ops pci_epf_mhi_event_ops = {
> > +static const struct pci_epc_event_ops pci_epf_mhi_epc_event_ops = {
> > .core_init = pci_epf_mhi_core_init,
> > +};
> > +
> > +static const struct pci_epc_bus_event_ops pci_epf_mhi_bus_event_ops = {
> > .link_up = pci_epf_mhi_link_up,
> > .link_down = pci_epf_mhi_link_down,
> > .bme = pci_epf_mhi_bme,
> > @@ -919,7 +922,8 @@ static int pci_epf_mhi_probe(struct pci_epf *epf,
> > epf_mhi->info = info;
> > epf_mhi->epf = epf;
> >
> > - epf->event_ops = &pci_epf_mhi_event_ops;
> > + epf->epc_event_ops = &pci_epf_mhi_epc_event_ops;
> > + epf->bus_event_ops = &pci_epf_mhi_bus_event_ops;
> >
> > mutex_init(&epf_mhi->lock);
> >
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
> > index 977fb79c1567..973db0b1bde2 100644
> > --- a/drivers/pci/endpoint/functions/pci-epf-test.c
> > +++ b/drivers/pci/endpoint/functions/pci-epf-test.c
> > @@ -798,8 +798,11 @@ static int pci_epf_test_link_up(struct pci_epf *epf)
> > return 0;
> > }
> >
> > -static const struct pci_epc_event_ops pci_epf_test_event_ops = {
> > +static const struct pci_epc_event_ops pci_epf_test_epc_event_ops = {
> > .core_init = pci_epf_test_core_init,
> > +};
> > +
> > +static const struct pci_epc_bus_event_ops pci_epf_test_bus_event_ops = {
> > .link_up = pci_epf_test_link_up,
> > };
> >
> > @@ -916,7 +919,8 @@ static int pci_epf_test_probe(struct pci_epf *epf,
> >
> > INIT_DELAYED_WORK(&epf_test->cmd_handler, pci_epf_test_cmd_handler);
> >
> > - epf->event_ops = &pci_epf_test_event_ops;
> > + epf->epc_event_ops = &pci_epf_test_epc_event_ops;
> > + epf->bus_event_ops = &pci_epf_test_bus_event_ops;
> >
> > epf_set_drvdata(epf, epf_test);
> > return 0;
> > diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
> > index 47d27ec7439d..f202ae07ffa9 100644
> > --- a/drivers/pci/endpoint/pci-epc-core.c
> > +++ b/drivers/pci/endpoint/pci-epc-core.c
> > @@ -692,8 +692,8 @@ void pci_epc_linkup(struct pci_epc *epc)
> > mutex_lock(&epc->list_lock);
> > list_for_each_entry(epf, &epc->pci_epf, list) {
> > mutex_lock(&epf->lock);
> > - if (epf->event_ops && epf->event_ops->link_up)
> > - epf->event_ops->link_up(epf);
> > + if (epf->bus_event_ops && epf->bus_event_ops->link_up)
> > + epf->bus_event_ops->link_up(epf);
> > mutex_unlock(&epf->lock);
> > }
> > mutex_unlock(&epc->list_lock);
> > @@ -718,8 +718,8 @@ void pci_epc_linkdown(struct pci_epc *epc)
> > mutex_lock(&epc->list_lock);
> > list_for_each_entry(epf, &epc->pci_epf, list) {
> > mutex_lock(&epf->lock);
> > - if (epf->event_ops && epf->event_ops->link_down)
> > - epf->event_ops->link_down(epf);
> > + if (epf->bus_event_ops && epf->bus_event_ops->link_down)
> > + epf->bus_event_ops->link_down(epf);
> > mutex_unlock(&epf->lock);
> > }
> > mutex_unlock(&epc->list_lock);
> > @@ -744,8 +744,8 @@ void pci_epc_init_notify(struct pci_epc *epc)
> > mutex_lock(&epc->list_lock);
> > list_for_each_entry(epf, &epc->pci_epf, list) {
> > mutex_lock(&epf->lock);
> > - if (epf->event_ops && epf->event_ops->core_init)
> > - epf->event_ops->core_init(epf);
> > + if (epf->epc_event_ops && epf->epc_event_ops->core_init)
> > + epf->epc_event_ops->core_init(epf);
> > mutex_unlock(&epf->lock);
> > }
> > epc->init_complete = true;
> > @@ -767,8 +767,8 @@ void pci_epc_notify_pending_init(struct pci_epc *epc, struct pci_epf *epf)
> > {
> > if (epc->init_complete) {
> > mutex_lock(&epf->lock);
> > - if (epf->event_ops && epf->event_ops->core_init)
> > - epf->event_ops->core_init(epf);
> > + if (epf->epc_event_ops && epf->epc_event_ops->core_init)
> > + epf->epc_event_ops->core_init(epf);
> > mutex_unlock(&epf->lock);
> > }
> > }
> > @@ -792,8 +792,8 @@ void pci_epc_bme_notify(struct pci_epc *epc)
> > mutex_lock(&epc->list_lock);
> > list_for_each_entry(epf, &epc->pci_epf, list) {
> > mutex_lock(&epf->lock);
> > - if (epf->event_ops && epf->event_ops->bme)
> > - epf->event_ops->bme(epf);
> > + if (epf->bus_event_ops && epf->bus_event_ops->bme)
> > + epf->bus_event_ops->bme(epf);
> > mutex_unlock(&epf->lock);
> > }
> > mutex_unlock(&epc->list_lock);
> > diff --git a/include/linux/pci-epf.h b/include/linux/pci-epf.h
> > index adee6a1b35db..77399fecaeb5 100644
> > --- a/include/linux/pci-epf.h
> > +++ b/include/linux/pci-epf.h
> > @@ -69,14 +69,21 @@ struct pci_epf_ops {
> > };
> >
> > /**
> > - * struct pci_epc_event_ops - Callbacks for capturing the EPC events
> > - * @core_init: Callback for the EPC initialization complete event
> > - * @link_up: Callback for the EPC link up event
> > - * @link_down: Callback for the EPC link down event
> > - * @bme: Callback for the EPC BME (Bus Master Enable) event
> > + * struct pci_epc_event_ops - Callbacks for capturing the EPC specific events
> > + * @core_init: Callback for the EPC initialization event
> > */
> > struct pci_epc_event_ops {
> > int (*core_init)(struct pci_epf *epf);
> > +};
> > +
> > +/**
> > + * struct pci_epc_bus_event_ops - Callbacks for capturing the PCIe bus specific
> > + * events
> > + * @link_up: Callback for the PCIe bus link up event
> > + * @link_down: Callback for the PCIe bus link down event
> > + * @bme: Callback for the PCIe bus BME (Bus Master Enable) event
> > + */
> > +struct pci_epc_bus_event_ops {
> > int (*link_up)(struct pci_epf *epf);
> > int (*link_down)(struct pci_epf *epf);
> > int (*bme)(struct pci_epf *epf);
> > @@ -150,7 +157,8 @@ struct pci_epf_bar {
> > * @is_vf: true - virtual function, false - physical function
> > * @vfunction_num_map: bitmap to manage virtual function number
> > * @pci_vepf: list of virtual endpoint functions associated with this function
> > - * @event_ops: Callbacks for capturing the EPC events
> > + * @epc_event_ops: Callbacks for capturing the EPC events
> > + * @bus_event_ops: Callbacks for capturing the PCIe bus events
> > */
> > struct pci_epf {
> > struct device dev;
> > @@ -180,7 +188,8 @@ struct pci_epf {
> > unsigned int is_vf;
> > unsigned long vfunction_num_map;
> > struct list_head pci_vepf;
> > - const struct pci_epc_event_ops *event_ops;
> > + const struct pci_epc_event_ops *epc_event_ops;
> > + const struct pci_epc_bus_event_ops *bus_event_ops;
> > };
> >
> > /**
> >
>
> --
> Damien Le Moal
> Western Digital Research
>

--
மணிவண்ணன் சதாசிவம்

2024-04-03 15:54:45

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 03/10] PCI: endpoint: Rename core_init() callback in 'struct pci_epc_event_ops' to init()

On Wed, Apr 03, 2024 at 07:16:05PM +0530, Manivannan Sadhasivam wrote:
> On Tue, Apr 02, 2024 at 12:52:10PM +0200, Niklas Cassel wrote:
> > On Mon, Apr 01, 2024 at 09:20:29PM +0530, Manivannan Sadhasivam wrote:
> > > core_init() callback is used to notify the EPC initialization event to the
> > > EPF drivers. The 'core' prefix was used indicate that the controller IP
> > > core has completed initialization. But it serves no purpose as the EPF
> > > driver will only care about the EPC initialization as a whole and there is
> > > no real benefit to distinguish the IP core part.
> > >
> > > So let's rename the core_init() callback in 'struct pci_epc_event_ops' to
> > > just init() to make it more clear.
> > >
> > > Signed-off-by: Manivannan Sadhasivam <[email protected]>
> > > ---
> > > drivers/pci/endpoint/functions/pci-epf-mhi.c | 4 ++--
> > > drivers/pci/endpoint/functions/pci-epf-test.c | 4 ++--
> > > drivers/pci/endpoint/pci-epc-core.c | 16 ++++++++--------
> > > include/linux/pci-epf.h | 4 ++--
> > > 4 files changed, 14 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/drivers/pci/endpoint/functions/pci-epf-mhi.c b/drivers/pci/endpoint/functions/pci-epf-mhi.c
> > > index 280863c0eeb9..b3c26ffd29a5 100644
> > > --- a/drivers/pci/endpoint/functions/pci-epf-mhi.c
> > > +++ b/drivers/pci/endpoint/functions/pci-epf-mhi.c
> > > @@ -716,7 +716,7 @@ static void pci_epf_mhi_dma_deinit(struct pci_epf_mhi *epf_mhi)
> > > epf_mhi->dma_chan_rx = NULL;
> > > }
> > >
> > > -static int pci_epf_mhi_core_init(struct pci_epf *epf)
> > > +static int pci_epf_mhi_epc_init(struct pci_epf *epf)
> > > {
> > > struct pci_epf_mhi *epf_mhi = epf_get_drvdata(epf);
> > > const struct pci_epf_mhi_ep_info *info = epf_mhi->info;
> > > @@ -897,7 +897,7 @@ static void pci_epf_mhi_unbind(struct pci_epf *epf)
> > > }
> > >
> > > static const struct pci_epc_event_ops pci_epf_mhi_epc_event_ops = {
> > > - .core_init = pci_epf_mhi_core_init,
> > > + .init = pci_epf_mhi_epc_init,
> > > };
> > >
> > > static const struct pci_epc_bus_event_ops pci_epf_mhi_bus_event_ops = {
> > > diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
> > > index 973db0b1bde2..abcb6ca61c4e 100644
> > > --- a/drivers/pci/endpoint/functions/pci-epf-test.c
> > > +++ b/drivers/pci/endpoint/functions/pci-epf-test.c
> > > @@ -731,7 +731,7 @@ static int pci_epf_test_set_bar(struct pci_epf *epf)
> > > return 0;
> > > }
> > >
> > > -static int pci_epf_test_core_init(struct pci_epf *epf)
> > > +static int pci_epf_test_epc_init(struct pci_epf *epf)
> >
> > On V1 you agreed that it is better to remove 'epc' from the naming.
> > (For both pci-epf-test and pci-epf-mhi).
> > You seem to have forgotten to address this for V2.
> >
>
> Oh yeah, sorry about that. I tried to address comments for both series and
> apparently this one got missed.
>

Ok, now I remember that I kept the prefix intentionally. The module init
functions are already named as pci_epf_{test/mhi}_init(), so cannot use the same
name for the callback also. And using some other name didn't fit, so I kept
'epc' as the prefix since the callback acts on the EPC initialization event
anyway.

- Mani

--
மணிவண்ணன் சதாசிவம்

2024-04-03 20:05:27

by Niklas Cassel

[permalink] [raw]
Subject: Re: [PATCH v2 10/10] PCI: qcom: Implement shutdown() callback to properly reset the endpoint devices

On Wed, Apr 03, 2024 at 07:02:17PM +0530, Manivannan Sadhasivam wrote:
> On Tue, Apr 02, 2024 at 01:18:54PM +0200, Niklas Cassel wrote:
> > On Mon, Apr 01, 2024 at 09:20:36PM +0530, Manivannan Sadhasivam wrote:
> > > PCIe host controller drivers are supposed to properly reset the endpoint
> > > devices during host shutdown/reboot. Currently, Qcom driver doesn't do
> > > anything during host shutdown/reboot, resulting in both PERST# and refclk
> > > getting disabled at the same time. This prevents the endpoint device
> > > firmware to properly reset the state machine. Because, if the refclk is
> > > cutoff immediately along with PERST#, access to device specific registers
> > > within the endpoint will result in a firmware crash.
> > >
> > > To address this issue, let's call qcom_pcie_host_deinit() inside the
> > > shutdown callback, that asserts PERST# and then cuts off the refclk with a
> > > delay of 1ms, thus allowing the endpoint device firmware to properly
> > > cleanup the state machine.
> >
> > Hm... a QCOM EP device could be attached to any of the PCIe RC drivers that
> > we have in the kernel, so it seems a bit weird to fix this problem by
> > patching the QCOM RC driver only.
> >
> > Which DBI call is it that causes this problem during perst assert on EP side?
> >
> > I assume that it is pci-epf-test:deinit() callback that calls
> > pci_epc_clear_bar(), which calls dw_pcie_ep_clear_bar(), which will both:
> > -clear local data structures, e.g.
> > ep->epf_bar[bar] = NULL;
> > ep->bar_to_atu[bar] = 0;
> >
> > but also call:
> > __dw_pcie_ep_reset_bar()
> > dw_pcie_disable_atu()
> >
> >
> > Do we perhaps need to redesign the .deinit EPF callback?
> >
> > Considering that we know that .deinit() will only be called on platforms
> > where there will be a fundamental core reset, I guess we could do something
> > like introduce a __dw_pcie_ep_clear_bar() which will only clear the local
> > data structures. (It might not need to do any DBI writes, since the
> > fundamental core reset should have reset all values.)
> >
> > Or perhaps instead of letting pci_epf_test_epc_deinit() call
> > pci_epf_test_clear_bar()/__pci_epf_test_clear_bar() directly, perhaps let
> > pci_epf_test_epc_deinit() call add a .deinit()/.cleanup() defined in the
> > EPC driver.
> >
> > This EPC .deinit()/.cleanup() callback would then only clear the
> > local data structures (no DBI writes...).
> >
> > Something like that?
> >
>
> It is not just about the EPF test driver. A function driver may need to do many
> things to properly reset the state machine. Like in the case of MHI driver, it
> needs to reset channel state, mask interrupts etc... and all requires writing to
> some registers. So certainly there should be some time before cutting off the
> refclk.

I was more thinking that perhaps we should think of .deinit() as in how
dw_pcie_ep_init() used to be. It was not allowed to have any DBI writes.
(The DBI writes were all in dw_pcie_ep_init_complete()).
So perhaps we could define that a EPF .deinit() callback is not allowed
to have any DBI writes.

If we take qcom-ep as an example, as soon as you get a PERST assertion
the qcom-ep driver calls notify_deinit(), then asserts the reset control,
disables clocks and regulators.

Since the PCIe core is held in reset, the hardware is in a well defined
state, no? Sure, the data structures e.g. bar_to_iatu[], etc., might be
out of sync, but these could be memset():ed no? Since this is a fundamental
reset, all registers should be reset to their default state (once reset
is deasserted).

For a real PCIe card, if you assert + msleep(100) + deassert PERST, surely
the endpoint is supposed to be in a good/well defined state, regardless if
he REFCLK was cutoff at the exact time as PERST was asserted or not?

I would assume that we would want a PCI EPF driver to behave the same way,
if possible.


Kind regards,
Niklas

2024-04-04 02:41:18

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 02/10] PCI: endpoint: Decouple EPC and PCIe bus specific events

On 4/3/24 23:26, Manivannan Sadhasivam wrote:
> On Tue, Apr 02, 2024 at 09:14:20AM +0900, Damien Le Moal wrote:
>> On 4/2/24 00:50, Manivannan Sadhasivam wrote:
>>> Currently, 'struct pci_epc_event_ops' has a bunch of events that are sent
>>> from the EPC driver to EPF driver. But those events are a mix of EPC
>>> specific events like core_init and PCIe bus specific events like LINK_UP,
>>> LINK_DOWN, BME etc...
>>>
>>> Let's decouple them to respective structs (pci_epc_event_ops,
>>> pci_epc_bus_event_ops) to make the separation clear.
>>
>> I fail to see the benefits here. The event operation names are quite clear and,
>> in my opinion, it is clear if an event op applies to the controller or to the
>> bus/link. If anything, "core_init" could a little more clear, so renaming that
>> "ep_controller_init" or something like that (clearly spelling out what is being
>> initialized) seems enough to me. Similarly, the "bme" op name is very criptic.
>> Renaming that to "bus_master_enable" would go a long way clarifying the code.
>> For link events, "link_up", "link_down" are clear. So I think there is no need
>> to split the event op struct like this. Renaming the ops is better.
>>
>> Note that I am not opposed to this patch, but I think it is just code churn
>> that does not really bring any fundamental improvement. Regardless, renaming
>> "core_init" and "bme" ops is I think desired.
>>
>
> Niklas shared the same view during v1, but I hate to see the events being mixed
> in a single ops. Especially that it will confuse the developers who are not
> familiar with the EP subsystem.
>
> But since the argument is coming twice, I've decided to drop this for now and
> just rename the 'core_init' callback to 'epc_init' and name the deinit callback
> as 'epc_deinit'.

Sounds good. Please also rename the completely unclear "bme" operation. Spell it
out to be clear.

--
Damien Le Moal
Western Digital Research


2024-04-08 09:50:02

by Niklas Cassel

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] PCI: endpoint: Make host reboot handling more robust

On Wed, Apr 03, 2024 at 07:18:25PM +0530, Manivannan Sadhasivam wrote:
> On Tue, Apr 02, 2024 at 11:41:00AM +0200, Niklas Cassel wrote:
> > On Mon, Apr 01, 2024 at 09:20:26PM +0530, Manivannan Sadhasivam wrote:
> > > Hello,
> > >
> > > This is the follow up series of [1], to improve the handling of host reboot in
> > > the endpoint subsystem. This involves refining the PERST# and Link Down event
> > > handling in both the controller and function drivers.
> > >
> > > Testing
> > > =======
> > >
> > > This series is tested on Qcom SM8450 based development board with both MHI_EPF
> > > and EPF_TEST function drivers.
> > >
> > > Dependency
> > > ==========
> > >
> > > This series depends on [1] and [2].
> > >
> > > - Mani
> >
> > Hello Mani,
> >
> > > [1] https://lore.kernel.org/linux-pci/[email protected]/
> > > [2] https://lore.kernel.org/linux-pci/[email protected]/
> >
> > AFAICT both these series [1] (DBI rework v12, not v10) and [2] are fully
> > reviewed and seem to be ready to go.
> >
> > Considering that we have patches depending on [1] and [2],
> > namely the series in $subject, but also:
> > https://lore.kernel.org/linux-pci/[email protected]/T/#t
> >
> > I think it would be a good idea if you could apply [1] and [2] to the
> > pci/endpoint branch.
> >
>
> Unfortunately, I cannot merge the patches outside 'pci/endpoint' even though
> these are related to the PCI Endpoint subsystem. But I have delegated these 2
> series to Krzysztof, so hopefully he'll apply it soon.

Hello Mani, Krzysztof,

These three series:
https://patchwork.kernel.org/project/linux-pci/list/?series=836730&state=*
https://patchwork.kernel.org/project/linux-pci/list/?series=838789&state=*

Still haven't been merged.

Considering that the PCI endpoint memory mapping series:
https://patchwork.kernel.org/project/linux-pci/list/?series=839970
conflicts with both of these series, I think that it would be nice if the
two fully reviewed series above could get picked up.

Right now, I think we still have time to get the PCI endpoint memory mapping
series fully reviewed to live in linux-next for 2 weeks before the next merge
window opens, but time is running out (because of delays for what appears to
be no apparent reason).


Kind regards,
Niklas


P.S.
It would also be nice to see:
https://patchwork.kernel.org/project/linux-pci/list/?series=838545&state=*
getting picked up.

2024-04-09 13:36:28

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v2 10/10] PCI: qcom: Implement shutdown() callback to properly reset the endpoint devices

On Wed, Apr 03, 2024 at 10:03:26PM +0200, Niklas Cassel wrote:
> On Wed, Apr 03, 2024 at 07:02:17PM +0530, Manivannan Sadhasivam wrote:
> > On Tue, Apr 02, 2024 at 01:18:54PM +0200, Niklas Cassel wrote:
> > > On Mon, Apr 01, 2024 at 09:20:36PM +0530, Manivannan Sadhasivam wrote:
> > > > PCIe host controller drivers are supposed to properly reset the endpoint
> > > > devices during host shutdown/reboot. Currently, Qcom driver doesn't do
> > > > anything during host shutdown/reboot, resulting in both PERST# and refclk
> > > > getting disabled at the same time. This prevents the endpoint device
> > > > firmware to properly reset the state machine. Because, if the refclk is
> > > > cutoff immediately along with PERST#, access to device specific registers
> > > > within the endpoint will result in a firmware crash.
> > > >
> > > > To address this issue, let's call qcom_pcie_host_deinit() inside the
> > > > shutdown callback, that asserts PERST# and then cuts off the refclk with a
> > > > delay of 1ms, thus allowing the endpoint device firmware to properly
> > > > cleanup the state machine.
> ...

> For a real PCIe card, if you assert + msleep(100) + deassert PERST, surely
> the endpoint is supposed to be in a good/well defined state, regardless if
> he REFCLK was cutoff at the exact time as PERST was asserted or not?

I think this is the key point. This PERST#/REFCLK timing requirement
seems like a defect in the endpoint. I know there's firmware involved
and whatnot, but IMO it's the endpoint's problem to handle these issues
internally.

Bjorn

2024-04-10 10:54:33

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 10/10] PCI: qcom: Implement shutdown() callback to properly reset the endpoint devices

On Wed, Apr 03, 2024 at 10:03:26PM +0200, Niklas Cassel wrote:
> On Wed, Apr 03, 2024 at 07:02:17PM +0530, Manivannan Sadhasivam wrote:
> > On Tue, Apr 02, 2024 at 01:18:54PM +0200, Niklas Cassel wrote:
> > > On Mon, Apr 01, 2024 at 09:20:36PM +0530, Manivannan Sadhasivam wrote:
> > > > PCIe host controller drivers are supposed to properly reset the endpoint
> > > > devices during host shutdown/reboot. Currently, Qcom driver doesn't do
> > > > anything during host shutdown/reboot, resulting in both PERST# and refclk
> > > > getting disabled at the same time. This prevents the endpoint device
> > > > firmware to properly reset the state machine. Because, if the refclk is
> > > > cutoff immediately along with PERST#, access to device specific registers
> > > > within the endpoint will result in a firmware crash.
> > > >
> > > > To address this issue, let's call qcom_pcie_host_deinit() inside the
> > > > shutdown callback, that asserts PERST# and then cuts off the refclk with a
> > > > delay of 1ms, thus allowing the endpoint device firmware to properly
> > > > cleanup the state machine.
> > >
> > > Hm... a QCOM EP device could be attached to any of the PCIe RC drivers that
> > > we have in the kernel, so it seems a bit weird to fix this problem by
> > > patching the QCOM RC driver only.
> > >
> > > Which DBI call is it that causes this problem during perst assert on EP side?
> > >
> > > I assume that it is pci-epf-test:deinit() callback that calls
> > > pci_epc_clear_bar(), which calls dw_pcie_ep_clear_bar(), which will both:
> > > -clear local data structures, e.g.
> > > ep->epf_bar[bar] = NULL;
> > > ep->bar_to_atu[bar] = 0;
> > >
> > > but also call:
> > > __dw_pcie_ep_reset_bar()
> > > dw_pcie_disable_atu()
> > >
> > >
> > > Do we perhaps need to redesign the .deinit EPF callback?
> > >
> > > Considering that we know that .deinit() will only be called on platforms
> > > where there will be a fundamental core reset, I guess we could do something
> > > like introduce a __dw_pcie_ep_clear_bar() which will only clear the local
> > > data structures. (It might not need to do any DBI writes, since the
> > > fundamental core reset should have reset all values.)
> > >
> > > Or perhaps instead of letting pci_epf_test_epc_deinit() call
> > > pci_epf_test_clear_bar()/__pci_epf_test_clear_bar() directly, perhaps let
> > > pci_epf_test_epc_deinit() call add a .deinit()/.cleanup() defined in the
> > > EPC driver.
> > >
> > > This EPC .deinit()/.cleanup() callback would then only clear the
> > > local data structures (no DBI writes...).
> > >
> > > Something like that?
> > >
> >
> > It is not just about the EPF test driver. A function driver may need to do many
> > things to properly reset the state machine. Like in the case of MHI driver, it
> > needs to reset channel state, mask interrupts etc... and all requires writing to
> > some registers. So certainly there should be some time before cutting off the
> > refclk.
>
> I was more thinking that perhaps we should think of .deinit() as in how
> dw_pcie_ep_init() used to be. It was not allowed to have any DBI writes.
> (The DBI writes were all in dw_pcie_ep_init_complete()).
> So perhaps we could define that a EPF .deinit() callback is not allowed
> to have any DBI writes.
>
> If we take qcom-ep as an example, as soon as you get a PERST assertion
> the qcom-ep driver calls notify_deinit(), then asserts the reset control,
> disables clocks and regulators.
>
> Since the PCIe core is held in reset, the hardware is in a well defined
> state, no? Sure, the data structures e.g. bar_to_iatu[], etc., might be
> out of sync, but these could be memset():ed no? Since this is a fundamental
> reset, all registers should be reset to their default state (once reset
> is deasserted).
>

Well, we could prevent the register access during PERST# assert time in the
endpoint, but my worry is that we will end up with 2 version of the cleanup
APIs. Lets take an example of dw_pcie_edma_remove() API which gets called
during deinit and it touches some eDMA registers.

So should we introduce another API which just clears the sw data structure and
not touching the registers? And this may be needed for other generic APIs as
well.

Ideally, if there is a Link Down event before PERST# assert, then this could've
been solved, but that is also not a spec defined behavior.

- Mani

> For a real PCIe card, if you assert + msleep(100) + deassert PERST, surely
> the endpoint is supposed to be in a good/well defined state, regardless if
> he REFCLK was cutoff at the exact time as PERST was asserted or not?
>
> I would assume that we would want a PCI EPF driver to behave the same way,
> if possible.
>
>
> Kind regards,
> Niklas

--
மணிவண்ணன் சதாசிவம்

2024-04-10 14:56:13

by Niklas Cassel

[permalink] [raw]
Subject: Re: [PATCH v2 10/10] PCI: qcom: Implement shutdown() callback to properly reset the endpoint devices

On Wed, Apr 10, 2024 at 04:24:10PM +0530, Manivannan Sadhasivam wrote:
> On Wed, Apr 03, 2024 at 10:03:26PM +0200, Niklas Cassel wrote:
> > On Wed, Apr 03, 2024 at 07:02:17PM +0530, Manivannan Sadhasivam wrote:
> > > On Tue, Apr 02, 2024 at 01:18:54PM +0200, Niklas Cassel wrote:
> > > > On Mon, Apr 01, 2024 at 09:20:36PM +0530, Manivannan Sadhasivam wrote:
> > > > > PCIe host controller drivers are supposed to properly reset the endpoint
> > > > > devices during host shutdown/reboot. Currently, Qcom driver doesn't do
> > > > > anything during host shutdown/reboot, resulting in both PERST# and refclk
> > > > > getting disabled at the same time. This prevents the endpoint device
> > > > > firmware to properly reset the state machine. Because, if the refclk is
> > > > > cutoff immediately along with PERST#, access to device specific registers
> > > > > within the endpoint will result in a firmware crash.
> > > > >
> > > > > To address this issue, let's call qcom_pcie_host_deinit() inside the
> > > > > shutdown callback, that asserts PERST# and then cuts off the refclk with a
> > > > > delay of 1ms, thus allowing the endpoint device firmware to properly
> > > > > cleanup the state machine.
> > > >
> > > > Hm... a QCOM EP device could be attached to any of the PCIe RC drivers that
> > > > we have in the kernel, so it seems a bit weird to fix this problem by
> > > > patching the QCOM RC driver only.
> > > >
> > > > Which DBI call is it that causes this problem during perst assert on EP side?
> > > >
> > > > I assume that it is pci-epf-test:deinit() callback that calls
> > > > pci_epc_clear_bar(), which calls dw_pcie_ep_clear_bar(), which will both:
> > > > -clear local data structures, e.g.
> > > > ep->epf_bar[bar] = NULL;
> > > > ep->bar_to_atu[bar] = 0;
> > > >
> > > > but also call:
> > > > __dw_pcie_ep_reset_bar()
> > > > dw_pcie_disable_atu()
> > > >
> > > >
> > > > Do we perhaps need to redesign the .deinit EPF callback?
> > > >
> > > > Considering that we know that .deinit() will only be called on platforms
> > > > where there will be a fundamental core reset, I guess we could do something
> > > > like introduce a __dw_pcie_ep_clear_bar() which will only clear the local
> > > > data structures. (It might not need to do any DBI writes, since the
> > > > fundamental core reset should have reset all values.)
> > > >
> > > > Or perhaps instead of letting pci_epf_test_epc_deinit() call
> > > > pci_epf_test_clear_bar()/__pci_epf_test_clear_bar() directly, perhaps let
> > > > pci_epf_test_epc_deinit() call add a .deinit()/.cleanup() defined in the
> > > > EPC driver.
> > > >
> > > > This EPC .deinit()/.cleanup() callback would then only clear the
> > > > local data structures (no DBI writes...).
> > > >
> > > > Something like that?
> > > >
> > >
> > > It is not just about the EPF test driver. A function driver may need to do many
> > > things to properly reset the state machine. Like in the case of MHI driver, it
> > > needs to reset channel state, mask interrupts etc... and all requires writing to
> > > some registers. So certainly there should be some time before cutting off the
> > > refclk.
> >
> > I was more thinking that perhaps we should think of .deinit() as in how
> > dw_pcie_ep_init() used to be. It was not allowed to have any DBI writes.
> > (The DBI writes were all in dw_pcie_ep_init_complete()).
> > So perhaps we could define that a EPF .deinit() callback is not allowed
> > to have any DBI writes.
> >
> > If we take qcom-ep as an example, as soon as you get a PERST assertion
> > the qcom-ep driver calls notify_deinit(), then asserts the reset control,
> > disables clocks and regulators.
> >
> > Since the PCIe core is held in reset, the hardware is in a well defined
> > state, no? Sure, the data structures e.g. bar_to_iatu[], etc., might be
> > out of sync, but these could be memset():ed no? Since this is a fundamental
> > reset, all registers should be reset to their default state (once reset
> > is deasserted).
> >
>
> Well, we could prevent the register access during PERST# assert time in the
> endpoint, but my worry is that we will end up with 2 version of the cleanup
> APIs. Lets take an example of dw_pcie_edma_remove() API which gets called
> during deinit and it touches some eDMA registers.
>
> So should we introduce another API which just clears the sw data structure and
> not touching the registers? And this may be needed for other generic APIs as
> well.

I agree that it will be hard to come up with an elegant solution to this
problem.

These endpoint controllers that cannot do register accesses to their own
controllers' DBI/register space without the RC providing a refclock are
really becoming a pain... and the design and complexity of the PCI endpoint
APIs is what suffers as a result.

PERST could be asserted at any time.
So for your system to not crash/hang by accessing registers in the controller,
an EPF driver must be designed with great care to never do any register access
when it is not safe...

Perhaps the the EPF core should set the state (i.e. init_complete = false,
even before calling the deinit callback in EPF driver, and perhaps even add
safe-guards against init_complete in some APIs, so that e.g. a set_bar() call
cannot trigger a crash because PERST# is asserted.. but even then, it could
still be asserted in the middle of set_bar()'s execution.)


Looking at the databook, it looks like core_clk is derived from pipe_clk
output of the PHY. The PHY will either use external clock or internal clock.

4.6.2 DBI Protocol Transactions
it looks like core_clk must be active to read/write the DBI.

I really wish those controllers could e.g. change the clock temporarily
using a mux, so that it could still perform DBI read/writes when there is
not external refclk... Something like pm_sel_aux_clk selecting to use the
aux clk instead of core_clk when in low power states.
But I don't know the hardware well enough to know if that is possible for
the DBI, so that might just be wishful thinking...


Kind regards,
Niklas

2024-04-10 21:10:18

by Niklas Cassel

[permalink] [raw]
Subject: Re: [PATCH v2 10/10] PCI: qcom: Implement shutdown() callback to properly reset the endpoint devices

On Wed, Apr 10, 2024 at 04:52:18PM +0200, Niklas Cassel wrote:
> On Wed, Apr 10, 2024 at 04:24:10PM +0530, Manivannan Sadhasivam wrote:
> >
> > Well, we could prevent the register access during PERST# assert time in the
> > endpoint, but my worry is that we will end up with 2 version of the cleanup
> > APIs. Lets take an example of dw_pcie_edma_remove() API which gets called
> > during deinit and it touches some eDMA registers.
> >
> > So should we introduce another API which just clears the sw data structure and
> > not touching the registers? And this may be needed for other generic APIs as
> > well.
>
> I agree that it will be hard to come up with an elegant solution to this
> problem.
>
> These endpoint controllers that cannot do register accesses to their own
> controllers' DBI/register space without the RC providing a refclock are
> really becoming a pain... and the design and complexity of the PCI endpoint
> APIs is what suffers as a result.
>
> PERST could be asserted at any time.
> So for your system to not crash/hang by accessing registers in the controller,
> an EPF driver must be designed with great care to never do any register access
> when it is not safe...
>
> Perhaps the the EPF core should set the state (i.e. init_complete = false,
> even before calling the deinit callback in EPF driver, and perhaps even add
> safe-guards against init_complete in some APIs, so that e.g. a set_bar() call
> cannot trigger a crash because PERST# is asserted.. but even then, it could
> still be asserted in the middle of set_bar()'s execution.)
>
>
> Looking at the databook, it looks like core_clk is derived from pipe_clk
> output of the PHY. The PHY will either use external clock or internal clock.
>
> 4.6.2 DBI Protocol Transactions
> it looks like core_clk must be active to read/write the DBI.
>
> I really wish those controllers could e.g. change the clock temporarily
> using a mux, so that it could still perform DBI read/writes when there is
> not external refclk... Something like pm_sel_aux_clk selecting to use the
> aux clk instead of core_clk when in low power states.
> But I don't know the hardware well enough to know if that is possible for
> the DBI, so that might just be wishful thinking...


Looking at the rock5b SBC (rockchip rk3588), the PHY refclk can either
be taken from
-a PLL internally from the SoC.
or
-an external clock on the SBC.

There does not seem to be an option to get the refclk as an input from
the PCIe slot. (The refclk can only be output to the PCIe slot.)

So when running two rock5b SBC, you cannot use a common clock for the RC
and the EP side, you have to use a separate reference clock scheme,
either SRNS or SRIS.

Since I assume that you use two qcom platforms of the same model
(I remember that you wrote that you usually test with
qcom,sdx55-pcie-ep somewhere.)
Surely this board must be able to supply a reference clock?
(How else does it send this clock to the EP side?)

So... why can't you run in SRNS or SRIS mode, where the EP provides
it's own clock?


Kind regards,
Niklas

2024-04-12 06:33:15

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 10/10] PCI: qcom: Implement shutdown() callback to properly reset the endpoint devices

On Wed, Apr 10, 2024 at 11:10:01PM +0200, Niklas Cassel wrote:
> On Wed, Apr 10, 2024 at 04:52:18PM +0200, Niklas Cassel wrote:
> > On Wed, Apr 10, 2024 at 04:24:10PM +0530, Manivannan Sadhasivam wrote:
> > >
> > > Well, we could prevent the register access during PERST# assert time in the
> > > endpoint, but my worry is that we will end up with 2 version of the cleanup
> > > APIs. Lets take an example of dw_pcie_edma_remove() API which gets called
> > > during deinit and it touches some eDMA registers.
> > >
> > > So should we introduce another API which just clears the sw data structure and
> > > not touching the registers? And this may be needed for other generic APIs as
> > > well.
> >
> > I agree that it will be hard to come up with an elegant solution to this
> > problem.
> >
> > These endpoint controllers that cannot do register accesses to their own
> > controllers' DBI/register space without the RC providing a refclock are
> > really becoming a pain... and the design and complexity of the PCI endpoint
> > APIs is what suffers as a result.
> >
> > PERST could be asserted at any time.
> > So for your system to not crash/hang by accessing registers in the controller,
> > an EPF driver must be designed with great care to never do any register access
> > when it is not safe...
> >
> > Perhaps the the EPF core should set the state (i.e. init_complete = false,
> > even before calling the deinit callback in EPF driver, and perhaps even add
> > safe-guards against init_complete in some APIs, so that e.g. a set_bar() call
> > cannot trigger a crash because PERST# is asserted.. but even then, it could
> > still be asserted in the middle of set_bar()'s execution.)
> >
> >
> > Looking at the databook, it looks like core_clk is derived from pipe_clk
> > output of the PHY. The PHY will either use external clock or internal clock.
> >
> > 4.6.2 DBI Protocol Transactions
> > it looks like core_clk must be active to read/write the DBI.
> >
> > I really wish those controllers could e.g. change the clock temporarily
> > using a mux, so that it could still perform DBI read/writes when there is
> > not external refclk... Something like pm_sel_aux_clk selecting to use the
> > aux clk instead of core_clk when in low power states.
> > But I don't know the hardware well enough to know if that is possible for
> > the DBI, so that might just be wishful thinking...
>
>
> Looking at the rock5b SBC (rockchip rk3588), the PHY refclk can either
> be taken from
> -a PLL internally from the SoC.
> or
> -an external clock on the SBC.
>
> There does not seem to be an option to get the refclk as an input from
> the PCIe slot. (The refclk can only be output to the PCIe slot.)
>
> So when running two rock5b SBC, you cannot use a common clock for the RC
> and the EP side, you have to use a separate reference clock scheme,
> either SRNS or SRIS.
>
> Since I assume that you use two qcom platforms of the same model
> (I remember that you wrote that you usually test with
> qcom,sdx55-pcie-ep somewhere.)
> Surely this board must be able to supply a reference clock?
> (How else does it send this clock to the EP side?)
>

It is not the same model. It is the same SoC but with different base boards.
FWIW, I'm testing with both SDX55 and SM8450 SoC based boards.

So the Qcom EP has no way to generate refclk internally (I double checked this)
and it has to rely on refclk from either host or a separate clock source.

But in most of the board designs they connect to host refclk by default.

> So... why can't you run in SRNS or SRIS mode, where the EP provides
> it's own clock?
>

As per the discussion I had with Qcom PCIe team, they that said SRIS is only
available at the host side to supply independent clock to the EP. But that
requires changes to the PHY sequence and could also result in preformance drop.
Also this mode is available only on new SoCs but not on older ones like SDX55.

So I don't think this is a viable option.

- Mani

--
மணிவண்ணன் சதாசிவம்