2023-11-30 13:51:53

by Shradha Todi

[permalink] [raw]
Subject: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

DesignWare controller provides a vendor specific extended capability
called RASDES as an IP feature. This extended capability provides
hardware information like:
- Debug registers to know the state of the link or controller.
- Error injection mechanisms to inject various PCIe errors including
sequence number, CRC
- Statistical counters to know how many times a particular event
occurred

However, in Linux we do not have any generic or custom support to be
able to use this feature in an efficient manner. This is the reason we
are proposing this framework. Debug and bring up time of high-speed IPs
are highly dependent on costlier hardware analyzers and this solution
will in some ways help to reduce the HW analyzer usage.

The debugfs entries can be used to get information about underlying
hardware and can be shared with user space. Separate debugfs entries has
been created to cater to all the DES hooks provided by the controller.
The debugfs entries interacts with the RASDES registers in the required
sequence and provides the meaningful data to the user. This eases the
effort to understand and use the register information for debugging.

v1 version was posted long back and for some reasons I couldn't work on
it. I apologize for the long break. I'm restarting this activity and
have taken care of all previous review comments shared.
v1: https://lore.kernel.org/all/[email protected]/T/

Shradha Todi (3):
PCI: dwc: Add support for vendor specific capability search
PCI: debugfs: Add support for RASDES framework in DWC
PCI: dwc: Create debugfs files in DWC driver

drivers/pci/controller/dwc/Kconfig | 8 +
drivers/pci/controller/dwc/Makefile | 1 +
.../controller/dwc/pcie-designware-debugfs.c | 476 ++++++++++++++++++
.../controller/dwc/pcie-designware-debugfs.h | 0
drivers/pci/controller/dwc/pcie-designware.c | 20 +
drivers/pci/controller/dwc/pcie-designware.h | 18 +
6 files changed, 523 insertions(+)
create mode 100644 drivers/pci/controller/dwc/pcie-designware-debugfs.c
create mode 100644 drivers/pci/controller/dwc/pcie-designware-debugfs.h

--
2.17.1


2023-11-30 13:52:07

by Shradha Todi

[permalink] [raw]
Subject: [PATCH v2 1/3] PCI: dwc: Add support for vendor specific capability search

Add vendor specific extended configuration space capability search API
using struct dw_pcie pointer for DW controllers.

Signed-off-by: Shradha Todi <[email protected]>
---
drivers/pci/controller/dwc/pcie-designware.c | 16 ++++++++++++++++
drivers/pci/controller/dwc/pcie-designware.h | 1 +
2 files changed, 17 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 1c1c7348972b..064b4951afd8 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -275,6 +275,22 @@ static u16 dw_pcie_find_next_ext_capability(struct dw_pcie *pci, u16 start,
return 0;
}

+u16 dw_pcie_find_vsec_capability(struct dw_pcie *pci, u8 vsec_cap)
+{
+ u16 vsec = 0;
+ u32 header;
+
+ while ((vsec = dw_pcie_find_next_ext_capability(pci, vsec,
+ PCI_EXT_CAP_ID_VNDR))) {
+ header = dw_pcie_readl_dbi(pci, vsec + PCI_VNDR_HEADER);
+ if (PCI_VNDR_HEADER_ID(header) == vsec_cap)
+ return vsec;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(dw_pcie_find_vsec_capability);
+
u16 dw_pcie_find_ext_capability(struct dw_pcie *pci, u8 cap)
{
return dw_pcie_find_next_ext_capability(pci, 0, cap);
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index ef0b2efa9f93..b7ea1db14f6a 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -419,6 +419,7 @@ void dw_pcie_version_detect(struct dw_pcie *pci);

u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
u16 dw_pcie_find_ext_capability(struct dw_pcie *pci, u8 cap);
+u16 dw_pcie_find_vsec_capability(struct dw_pcie *pci, u8 vsec_cap);

int dw_pcie_read(void __iomem *addr, int size, u32 *val);
int dw_pcie_write(void __iomem *addr, int size, u32 val);
--
2.17.1

2023-11-30 13:52:39

by Shradha Todi

[permalink] [raw]
Subject: [PATCH v2 2/3] PCI: debugfs: Add support for RASDES framework in DWC

Add support to use the RASDES feature of DesignWare PCIe controller
using debugfs entries.

RASDES is a vendor specific extended PCIe capability which reads the
current hardware internal state of PCIe device. Following primary
features are provided to userspace via debugfs:
- Debug registers
- Error injection
- Statistical counters

Signed-off-by: Shradha Todi <[email protected]>
---
drivers/pci/controller/dwc/Kconfig | 8 +
drivers/pci/controller/dwc/Makefile | 1 +
.../controller/dwc/pcie-designware-debugfs.c | 476 ++++++++++++++++++
.../controller/dwc/pcie-designware-debugfs.h | 0
drivers/pci/controller/dwc/pcie-designware.h | 17 +
5 files changed, 502 insertions(+)
create mode 100644 drivers/pci/controller/dwc/pcie-designware-debugfs.c
create mode 100644 drivers/pci/controller/dwc/pcie-designware-debugfs.h

diff --git a/drivers/pci/controller/dwc/Kconfig b/drivers/pci/controller/dwc/Kconfig
index ab96da43e0c2..fc84ba03b20e 100644
--- a/drivers/pci/controller/dwc/Kconfig
+++ b/drivers/pci/controller/dwc/Kconfig
@@ -6,6 +6,14 @@ menu "DesignWare-based PCIe controllers"
config PCIE_DW
bool

+config PCIE_DW_DEBUGFS
+ bool "DWC PCIe debugfs entries"
+ help
+ Enables debugfs entries for the DWC PCIe Controller.
+ These entries make use of the RAS features in the DW
+ controller to help in debug, error injection and statistical
+ counters
+
config PCIE_DW_HOST
bool
select PCIE_DW
diff --git a/drivers/pci/controller/dwc/Makefile b/drivers/pci/controller/dwc/Makefile
index bf5c311875a1..cbd1618b0b20 100644
--- a/drivers/pci/controller/dwc/Makefile
+++ b/drivers/pci/controller/dwc/Makefile
@@ -1,5 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_PCIE_DW) += pcie-designware.o
+obj-$(CONFIG_PCIE_DW_DEBUGFS) += pcie-designware-debugfs.o
obj-$(CONFIG_PCIE_DW_HOST) += pcie-designware-host.o
obj-$(CONFIG_PCIE_DW_EP) += pcie-designware-ep.o
obj-$(CONFIG_PCIE_DW_PLAT) += pcie-designware-plat.o
diff --git a/drivers/pci/controller/dwc/pcie-designware-debugfs.c b/drivers/pci/controller/dwc/pcie-designware-debugfs.c
new file mode 100644
index 000000000000..46481650ed6b
--- /dev/null
+++ b/drivers/pci/controller/dwc/pcie-designware-debugfs.c
@@ -0,0 +1,476 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Synopsys DesignWare PCIe controller debugfs driver
+ *
+ * Copyright (C) 2023 Samsung Electronics Co., Ltd.
+ * http://www.samsung.com
+ *
+ * Author: Shradha Todi <[email protected]>
+ */
+
+#include <linux/debugfs.h>
+
+#include "pcie-designware.h"
+
+#define RAS_DES_EVENT_COUNTER_CTRL_REG 0x8
+#define RAS_DES_EVENT_COUNTER_DATA_REG 0xc
+#define SD_STATUS_L1LANE_REG 0xb0
+#define ERR_INJ_ENABLE_REG 0x30
+#define ERR_INJ0_OFF 0x34
+
+#define LANE_DETECT_SHIFT 17
+#define LANE_DETECT_MASK 0x1
+#define PIPE_RXVALID_SHIFT 18
+#define PIPE_RXVALID_MASK 0x1
+
+#define LANE_SELECT_SHIFT 8
+#define LANE_SELECT_MASK 0xf
+#define EVENT_COUNTER_STATUS_SHIFT 7
+#define EVENT_COUNTER_STATUS_MASK 0x1
+#define EVENT_COUNTER_ENABLE (0x7 << 2)
+#define PER_EVENT_OFF (0x1 << 2)
+#define PER_EVENT_ON (0x3 << 2)
+
+#define EINJ_COUNT_MASK 0xff
+#define EINJ_TYPE_MASK 0xf
+#define EINJ_TYPE_SHIFT 8
+#define EINJ_INFO_MASK 0xfffff
+#define EINJ_INFO_SHIFT 12
+
+#define DWC_DEBUGFS_MAX 128
+
+struct rasdes_info {
+ /* to store rasdes capability offset */
+ u32 ras_cap;
+ struct mutex dbg_mutex;
+ struct dentry *rasdes;
+};
+
+struct rasdes_priv {
+ struct dw_pcie *pci;
+ int idx;
+};
+
+struct event_counter {
+ const char *name;
+ /* values can be between 0-15 */
+ u32 group_no;
+ /* values can be between 0-32 */
+ u32 event_no;
+};
+
+static const struct event_counter event_counters[] = {
+ {"ebuf_overflow", 0x0, 0x0},
+ {"ebuf_underrun", 0x0, 0x1},
+ {"decode_err", 0x0, 0x2},
+ {"running_disparity_err", 0x0, 0x3},
+ {"skp_os_parity_err", 0x0, 0x4},
+ {"sync_header_err", 0x0, 0x5},
+ {"detect_ei_infer", 0x1, 0x5},
+ {"receiver_err", 0x1, 0x6},
+ {"rx_recovery_req", 0x1, 0x7},
+ {"framing_err", 0x1, 0x9},
+ {"deskew_err", 0x1, 0xa},
+ {"bad_tlp", 0x2, 0x0},
+ {"lcrc_err", 0x2, 0x1},
+ {"bad_dllp", 0x2, 0x2},
+};
+
+struct err_inj {
+ const char *name;
+ /* values can be from group 0 - 6 */
+ u32 err_inj_group;
+ /* within each group there can be types */
+ u32 err_inj_type;
+ /* More details about the error */
+ u32 err_inj_12_31;
+};
+
+static const struct err_inj err_inj_list[] = {
+ {"tx_lcrc", 0x0, 0x0, 0x0},
+ {"tx_ecrc", 0x0, 0x3, 0x0},
+ {"rx_lcrc", 0x0, 0x8, 0x0},
+ {"rx_ecrc", 0x0, 0xb, 0x0},
+};
+
+static ssize_t dbg_lane_detect_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct dw_pcie *pci = file->private_data;
+ struct rasdes_info *rinfo = pci->dump_info;
+ u32 val;
+ ssize_t off = 0;
+ char debugfs_buf[DWC_DEBUGFS_MAX];
+
+ val = dw_pcie_readl_dbi(pci, rinfo->ras_cap + SD_STATUS_L1LANE_REG);
+ val = (val >> LANE_DETECT_SHIFT) & LANE_DETECT_MASK;
+ if (val)
+ off += scnprintf(debugfs_buf, DWC_DEBUGFS_MAX - off,
+ "Detected\n");
+ else
+ off += scnprintf(debugfs_buf, DWC_DEBUGFS_MAX - off,
+ "Undetected\n");
+
+ return simple_read_from_buffer(buf, count, ppos, debugfs_buf, off);
+}
+
+static ssize_t dbg_lane_detect_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct dw_pcie *pci = file->private_data;
+ struct rasdes_info *rinfo = pci->dump_info;
+ u32 val;
+ u32 lane;
+
+ val = kstrtou32_from_user(buf, count, 0, &lane);
+ if (val)
+ return val;
+
+ if (lane > 15)
+ return -EINVAL;
+
+ val = dw_pcie_readl_dbi(pci, rinfo->ras_cap + SD_STATUS_L1LANE_REG);
+ val &= ~LANE_SELECT_MASK;
+ val |= lane;
+ dw_pcie_writel_dbi(pci, rinfo->ras_cap + SD_STATUS_L1LANE_REG, val);
+
+ return count;
+}
+
+static ssize_t dbg_rx_valid_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct dw_pcie *pci = file->private_data;
+ struct rasdes_info *rinfo = pci->dump_info;
+ u32 val;
+ ssize_t off = 0;
+ char debugfs_buf[DWC_DEBUGFS_MAX];
+
+ val = dw_pcie_readl_dbi(pci, rinfo->ras_cap + SD_STATUS_L1LANE_REG);
+ val = (val >> PIPE_RXVALID_SHIFT) & PIPE_RXVALID_MASK;
+ if (val)
+ off += scnprintf(debugfs_buf, DWC_DEBUGFS_MAX - off,
+ "Valid\n");
+ else
+ off += scnprintf(debugfs_buf, DWC_DEBUGFS_MAX - off,
+ "Invalid\n");
+
+ return simple_read_from_buffer(buf, count, ppos, debugfs_buf, off);
+}
+
+static ssize_t dbg_rx_valid_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ return dbg_lane_detect_write(file, buf, count, ppos);
+}
+
+static void set_event_number(struct rasdes_priv *pdata, struct dw_pcie *pci,
+ struct rasdes_info *rinfo)
+{
+ u32 val;
+
+ val = dw_pcie_readl_dbi(pci, rinfo->ras_cap +
+ RAS_DES_EVENT_COUNTER_CTRL_REG);
+ val &= ~EVENT_COUNTER_ENABLE;
+ val &= ~(0xFFF << 16);
+ val |= (event_counters[pdata->idx].group_no << 24);
+ val |= (event_counters[pdata->idx].event_no << 16);
+ dw_pcie_writel_dbi(pci, rinfo->ras_cap +
+ RAS_DES_EVENT_COUNTER_CTRL_REG, val);
+}
+
+static ssize_t cnt_en_read(struct file *file, char __user *buf, size_t count,
+ loff_t *ppos)
+{
+ struct rasdes_priv *pdata = file->private_data;
+ struct dw_pcie *pci = pdata->pci;
+ struct rasdes_info *rinfo = pci->dump_info;
+ u32 val;
+ ssize_t off = 0;
+ char debugfs_buf[DWC_DEBUGFS_MAX];
+
+ mutex_lock(&rinfo->dbg_mutex);
+ set_event_number(pdata, pci, rinfo);
+ val = dw_pcie_readl_dbi(pci, rinfo->ras_cap +
+ RAS_DES_EVENT_COUNTER_CTRL_REG);
+ mutex_unlock(&rinfo->dbg_mutex);
+ val = (val >> EVENT_COUNTER_STATUS_SHIFT) & EVENT_COUNTER_STATUS_MASK;
+ if (val)
+ off += scnprintf(debugfs_buf, DWC_DEBUGFS_MAX - off,
+ "Enabled\n");
+ else
+ off += scnprintf(debugfs_buf, DWC_DEBUGFS_MAX - off,
+ "Disabled\n");
+
+ return simple_read_from_buffer(buf, count, ppos, debugfs_buf, off);
+}
+
+static ssize_t cnt_en_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct rasdes_priv *pdata = file->private_data;
+ struct dw_pcie *pci = pdata->pci;
+ struct rasdes_info *rinfo = pci->dump_info;
+ u32 val;
+ u32 enable;
+
+ val = kstrtou32_from_user(buf, count, 0, &enable);
+ if (val)
+ return val;
+
+ mutex_lock(&rinfo->dbg_mutex);
+ set_event_number(pdata, pci, rinfo);
+ val = dw_pcie_readl_dbi(pci, rinfo->ras_cap +
+ RAS_DES_EVENT_COUNTER_CTRL_REG);
+ if (enable)
+ val |= PER_EVENT_ON;
+ else
+ val |= PER_EVENT_OFF;
+
+ dw_pcie_writel_dbi(pci, rinfo->ras_cap +
+ RAS_DES_EVENT_COUNTER_CTRL_REG, val);
+ mutex_unlock(&rinfo->dbg_mutex);
+
+ return count;
+}
+
+static ssize_t cnt_lane_read(struct file *file, char __user *buf, size_t count,
+ loff_t *ppos)
+{
+ struct rasdes_priv *pdata = file->private_data;
+ struct dw_pcie *pci = pdata->pci;
+ struct rasdes_info *rinfo = pci->dump_info;
+ u32 val;
+ ssize_t off = 0;
+ char debugfs_buf[DWC_DEBUGFS_MAX];
+
+ mutex_lock(&rinfo->dbg_mutex);
+ set_event_number(pdata, pci, rinfo);
+ val = dw_pcie_readl_dbi(pci, rinfo->ras_cap +
+ RAS_DES_EVENT_COUNTER_CTRL_REG);
+ mutex_unlock(&rinfo->dbg_mutex);
+ val = (val >> LANE_SELECT_SHIFT) & LANE_SELECT_MASK;
+ off += scnprintf(debugfs_buf, DWC_DEBUGFS_MAX - off,
+ "Lane: %d\n", val);
+
+ return simple_read_from_buffer(buf, count, ppos, debugfs_buf, off);
+}
+
+static ssize_t cnt_lane_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct rasdes_priv *pdata = file->private_data;
+ struct dw_pcie *pci = pdata->pci;
+ struct rasdes_info *rinfo = pci->dump_info;
+ u32 val;
+ u32 lane;
+
+ val = kstrtou32_from_user(buf, count, 0, &lane);
+ if (val)
+ return val;
+
+ if (lane > 15)
+ return -EINVAL;
+
+ mutex_lock(&rinfo->dbg_mutex);
+ set_event_number(pdata, pci, rinfo);
+ val = dw_pcie_readl_dbi(pci, rinfo->ras_cap +
+ RAS_DES_EVENT_COUNTER_CTRL_REG);
+ val &= ~(LANE_SELECT_MASK << LANE_SELECT_SHIFT);
+ val |= (lane << LANE_SELECT_SHIFT);
+ dw_pcie_writel_dbi(pci, rinfo->ras_cap +
+ RAS_DES_EVENT_COUNTER_CTRL_REG, val);
+ mutex_unlock(&rinfo->dbg_mutex);
+
+ return count;
+}
+
+static ssize_t cnt_val_read(struct file *file, char __user *buf, size_t count,
+ loff_t *ppos)
+{
+ struct rasdes_priv *pdata = file->private_data;
+ struct dw_pcie *pci = pdata->pci;
+ struct rasdes_info *rinfo = pci->dump_info;
+ u32 val;
+ ssize_t off = 0;
+ char debugfs_buf[DWC_DEBUGFS_MAX];
+
+ mutex_lock(&rinfo->dbg_mutex);
+ set_event_number(pdata, pci, rinfo);
+ val = dw_pcie_readl_dbi(pci, rinfo->ras_cap +
+ RAS_DES_EVENT_COUNTER_DATA_REG);
+ mutex_unlock(&rinfo->dbg_mutex);
+ off += scnprintf(debugfs_buf, DWC_DEBUGFS_MAX - off,
+ "Value: %d\n", val);
+
+ return simple_read_from_buffer(buf, count, ppos, debugfs_buf, off);
+}
+
+static ssize_t err_inj_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct rasdes_priv *pdata = file->private_data;
+ struct dw_pcie *pci = pdata->pci;
+ struct rasdes_info *rinfo = pci->dump_info;
+ u32 val;
+ ssize_t off = 0;
+ char debugfs_buf[DWC_DEBUGFS_MAX];
+
+ val = dw_pcie_readl_dbi(pci, rinfo->ras_cap + ERR_INJ0_OFF +
+ (0x4 * err_inj_list[pdata->idx].err_inj_group));
+ val &= EINJ_COUNT_MASK;
+ off += scnprintf(debugfs_buf, DWC_DEBUGFS_MAX - off,
+ "Count: %d\n", val);
+
+ return simple_read_from_buffer(buf, count, ppos, debugfs_buf, off);
+}
+
+static ssize_t err_inj_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct rasdes_priv *pdata = file->private_data;
+ struct dw_pcie *pci = pdata->pci;
+ struct rasdes_info *rinfo = pci->dump_info;
+ u32 val;
+ u32 counter;
+
+ val = kstrtou32_from_user(buf, count, 0, &counter);
+ if (val)
+ return val;
+
+ val = dw_pcie_readl_dbi(pci, rinfo->ras_cap + ERR_INJ0_OFF +
+ (0x4 * err_inj_list[pdata->idx].err_inj_group));
+ val &= ~(EINJ_TYPE_MASK << EINJ_TYPE_SHIFT);
+ val |= err_inj_list[pdata->idx].err_inj_type << EINJ_TYPE_SHIFT;
+ val &= ~(EINJ_INFO_MASK << EINJ_INFO_SHIFT);
+ val |= err_inj_list[pdata->idx].err_inj_12_31 << EINJ_INFO_SHIFT;
+ val &= ~EINJ_COUNT_MASK;
+ val |= counter;
+ dw_pcie_writel_dbi(pci, rinfo->ras_cap + ERR_INJ0_OFF +
+ (0x4 * err_inj_list[pdata->idx].err_inj_group), val);
+ dw_pcie_writel_dbi(pci, rinfo->ras_cap + ERR_INJ_ENABLE_REG,
+ (0x1 << err_inj_list[pdata->idx].err_inj_group));
+
+ return count;
+}
+
+#define dwc_debugfs_create(name) \
+debugfs_create_file(#name, 0644, rasdes_debug, pci, \
+ &dbg_ ## name ## _fops)
+
+#define DWC_DEBUGFS_FOPS(name) \
+static const struct file_operations dbg_ ## name ## _fops = { \
+ .read = dbg_ ## name ## _read, \
+ .write = dbg_ ## name ## _write \
+}
+
+DWC_DEBUGFS_FOPS(lane_detect);
+DWC_DEBUGFS_FOPS(rx_valid);
+
+static const struct file_operations cnt_en_ops = {
+ .open = simple_open,
+ .read = cnt_en_read,
+ .write = cnt_en_write,
+};
+
+static const struct file_operations cnt_lane_ops = {
+ .open = simple_open,
+ .read = cnt_lane_read,
+ .write = cnt_lane_write,
+};
+
+static const struct file_operations cnt_val_ops = {
+ .open = simple_open,
+ .read = cnt_val_read,
+};
+
+static const struct file_operations err_inj_ops = {
+ .open = simple_open,
+ .read = err_inj_read,
+ .write = err_inj_write,
+};
+
+void dwc_pcie_rasdes_debugfs_deinit(struct dw_pcie *pci)
+{
+ struct rasdes_info *rinfo = pci->dump_info;
+
+ debugfs_remove_recursive(rinfo->rasdes);
+ mutex_destroy(&rinfo->dbg_mutex);
+}
+
+int dwc_pcie_rasdes_debugfs_init(struct dw_pcie *pci)
+{
+ struct device *dev = pci->dev;
+ int ras_cap;
+ struct rasdes_info *dump_info;
+ char dirname[DWC_DEBUGFS_MAX];
+ struct dentry *dir, *rasdes_debug, *rasdes_err_inj;
+ struct dentry *rasdes_event_counter, *rasdes_events;
+ int i;
+ struct rasdes_priv *priv_tmp;
+
+ ras_cap = dw_pcie_find_vsec_capability(pci, DW_PCIE_RAS_DES_CAP);
+ if (!ras_cap) {
+ dev_err(dev, "No RASDES capability available\n");
+ return -ENODEV;
+ }
+
+ dump_info = devm_kzalloc(dev, sizeof(*dump_info), GFP_KERNEL);
+ if (!dump_info)
+ return -ENOMEM;
+
+ /* Create main directory for each platform driver */
+ sprintf(dirname, "pcie_dwc_%s", dev_name(dev));
+ dir = debugfs_create_dir(dirname, NULL);
+
+ /* Create subdirectories for Debug, Error injection, Statistics */
+ rasdes_debug = debugfs_create_dir("rasdes_debug", dir);
+ rasdes_err_inj = debugfs_create_dir("rasdes_err_inj", dir);
+ rasdes_event_counter = debugfs_create_dir("rasdes_event_counter", dir);
+
+ mutex_init(&dump_info->dbg_mutex);
+ dump_info->ras_cap = ras_cap;
+ dump_info->rasdes = dir;
+ pci->dump_info = dump_info;
+
+ /* Create debugfs files for Debug subdirectory */
+ dwc_debugfs_create(lane_detect);
+ dwc_debugfs_create(rx_valid);
+
+ /* Create debugfs files for Error injection subdirectory */
+ for (i = 0; i < ARRAY_SIZE(err_inj_list); i++) {
+ priv_tmp = devm_kzalloc(dev, sizeof(*priv_tmp), GFP_KERNEL);
+ if (!priv_tmp)
+ goto err;
+
+ priv_tmp->idx = i;
+ priv_tmp->pci = pci;
+ debugfs_create_file(err_inj_list[i].name, 0644,
+ rasdes_err_inj, priv_tmp, &err_inj_ops);
+ }
+
+ /* Create debugfs files for Statistical counter subdirectory */
+ for (i = 0; i < ARRAY_SIZE(event_counters); i++) {
+ priv_tmp = devm_kzalloc(dev, sizeof(*priv_tmp), GFP_KERNEL);
+ if (!priv_tmp)
+ goto err;
+
+ priv_tmp->idx = i;
+ priv_tmp->pci = pci;
+ rasdes_events = debugfs_create_dir(event_counters[i].name,
+ rasdes_event_counter);
+ if (event_counters[i].group_no == 0) {
+ debugfs_create_file("lane_select", 0644, rasdes_events,
+ priv_tmp, &cnt_lane_ops);
+ }
+ debugfs_create_file("counter_value", 0644, rasdes_events, priv_tmp,
+ &cnt_val_ops);
+ debugfs_create_file("counter_enable", 0444, rasdes_events, priv_tmp,
+ &cnt_en_ops);
+ }
+
+ return 0;
+err:
+ dwc_pcie_rasdes_debugfs_deinit(pci);
+ return -ENOMEM;
+}
diff --git a/drivers/pci/controller/dwc/pcie-designware-debugfs.h b/drivers/pci/controller/dwc/pcie-designware-debugfs.h
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index b7ea1db14f6a..d3453db34c1f 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -222,6 +222,8 @@

#define PCIE_RAS_DES_EVENT_COUNTER_DATA 0xc

+#define DW_PCIE_RAS_DES_CAP 0x2
+
/*
* The default address offset between dbi_base and atu_base. Root controller
* drivers are not required to initialize atu_base if the offset matches this
@@ -406,6 +408,7 @@ struct dw_pcie {
struct reset_control_bulk_data core_rsts[DW_PCIE_NUM_CORE_RSTS];
struct gpio_desc *pe_rst;
bool suspended;
+ void *dump_info;
};

#define to_dw_pcie_from_pp(port) container_of((port), struct dw_pcie, pp)
@@ -643,4 +646,18 @@ dw_pcie_ep_get_func_from_ep(struct dw_pcie_ep *ep, u8 func_no)
return NULL;
}
#endif
+
+#ifdef CONFIG_PCIE_DW_DEBUGFS
+int dwc_pcie_rasdes_debugfs_init(struct dw_pcie *pci);
+void dwc_pcie_rasdes_debugfs_deinit(struct dw_pcie *pci);
+#else
+static inline int dwc_pcie_rasdes_debugfs_init(struct dw_pcie *pci)
+{
+ return 0;
+}
+static inline void dwc_pcie_rasdes_debugfs_deinit(struct dw_pcie *pci)
+{
+}
+#endif
+
#endif /* _PCIE_DESIGNWARE_H */
--
2.17.1

2023-11-30 13:52:51

by Shradha Todi

[permalink] [raw]
Subject: [PATCH v2 3/3] PCI: dwc: Create debugfs files in DWC driver

Add call to initialize debugfs from DWC driver and create the RASDES
debugfs hierarchy for each platform driver. Since it can be used for
both DW HOST controller as well as DW EP controller, add it in the
common setup function.

Signed-off-by: Shradha Todi <[email protected]>
---
drivers/pci/controller/dwc/pcie-designware.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 064b4951afd8..16c9018c2ada 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -1074,4 +1074,8 @@ void dw_pcie_setup(struct dw_pcie *pci)
break;
}
dw_pcie_writel_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL, val);
+
+ val = dwc_pcie_rasdes_debugfs_init(pci);
+ if (val)
+ dev_err(pci->dev, "Couldn't create debugfs files\n");
}
--
2.17.1

2023-11-30 16:55:38

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

On Thu, Nov 30, 2023 at 05:20:41PM +0530, Shradha Todi wrote:
> DesignWare controller provides a vendor specific extended capability
> called RASDES as an IP feature. This extended capability provides
> hardware information like:
> - Debug registers to know the state of the link or controller.
> - Error injection mechanisms to inject various PCIe errors including
> sequence number, CRC
> - Statistical counters to know how many times a particular event
> occurred
>
> However, in Linux we do not have any generic or custom support to be
> able to use this feature in an efficient manner. This is the reason we
> are proposing this framework. Debug and bring up time of high-speed IPs
> are highly dependent on costlier hardware analyzers and this solution
> will in some ways help to reduce the HW analyzer usage.
>
> The debugfs entries can be used to get information about underlying
> hardware and can be shared with user space. Separate debugfs entries has
> been created to cater to all the DES hooks provided by the controller.
> The debugfs entries interacts with the RASDES registers in the required
> sequence and provides the meaningful data to the user. This eases the
> effort to understand and use the register information for debugging.
>
> v1 version was posted long back and for some reasons I couldn't work on
> it. I apologize for the long break. I'm restarting this activity and
> have taken care of all previous review comments shared.
> v1: https://lore.kernel.org/all/[email protected]/T/
>

There is already a series floating to add similar functionality via perf
subsystem: https://lore.kernel.org/linux-pci/[email protected]/

- Mani

> Shradha Todi (3):
> PCI: dwc: Add support for vendor specific capability search
> PCI: debugfs: Add support for RASDES framework in DWC
> PCI: dwc: Create debugfs files in DWC driver
>
> drivers/pci/controller/dwc/Kconfig | 8 +
> drivers/pci/controller/dwc/Makefile | 1 +
> .../controller/dwc/pcie-designware-debugfs.c | 476 ++++++++++++++++++
> .../controller/dwc/pcie-designware-debugfs.h | 0
> drivers/pci/controller/dwc/pcie-designware.c | 20 +
> drivers/pci/controller/dwc/pcie-designware.h | 18 +
> 6 files changed, 523 insertions(+)
> create mode 100644 drivers/pci/controller/dwc/pcie-designware-debugfs.c
> create mode 100644 drivers/pci/controller/dwc/pcie-designware-debugfs.h
>
> --
> 2.17.1
>

--
மணிவண்ணன் சதாசிவம்

2023-12-04 09:07:59

by Shradha Todi

[permalink] [raw]
Subject: RE: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller



> -----Original Message-----
> From: Manivannan Sadhasivam [mailto:[email protected]]
> Sent: 30 November 2023 22:25
> To: Shradha Todi <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> controller
>
> On Thu, Nov 30, 2023 at 05:20:41PM +0530, Shradha Todi wrote:
> > DesignWare controller provides a vendor specific extended capability
> > called RASDES as an IP feature. This extended capability provides
> > hardware information like:
> > - Debug registers to know the state of the link or controller.
> > - Error injection mechanisms to inject various PCIe errors including
> > sequence number, CRC
> > - Statistical counters to know how many times a particular event
> > occurred
> >
> > However, in Linux we do not have any generic or custom support to be
> > able to use this feature in an efficient manner. This is the reason we
> > are proposing this framework. Debug and bring up time of high-speed
> > IPs are highly dependent on costlier hardware analyzers and this
> > solution will in some ways help to reduce the HW analyzer usage.
> >
> > The debugfs entries can be used to get information about underlying
> > hardware and can be shared with user space. Separate debugfs entries
> > has been created to cater to all the DES hooks provided by the controller.
> > The debugfs entries interacts with the RASDES registers in the
> > required sequence and provides the meaningful data to the user. This
> > eases the effort to understand and use the register information for
> debugging.
> >
> > v1 version was posted long back and for some reasons I couldn't work
> > on it. I apologize for the long break. I'm restarting this activity
> > and have taken care of all previous review comments shared.
> > v1:
> > https://lore.kernel.org/all/[email protected]
> > om/T/
> >
>
> There is already a series floating to add similar functionality via perf
> subsystem: https://lore.kernel.org/linux-pci/20231121013400.18367-1-
> [email protected]/
>
> - Mani
>

Hi Mani,

The series proposed in perf includes only time based-analysis and event counters which will monitor performance (Group 6 and 7). The patch or framework that we have proposed includes debug information, error injection facility and error counters (Group 0 - 5) which are not included as part of the functionality implemented via perf. In my opinion, these functionalities don't count as performance monitoring or counters but rather as debug counters. How about we take this up as a debugfs framework as proposed in my patch?
Or if others feel it can be taken via perf driver then I am happy to extend the perf driver if authors do not have objection. Let me know what you think of this? Meanwhile I will review the perf patches and share my feedback.

> > Shradha Todi (3):
> > PCI: dwc: Add support for vendor specific capability search
> > PCI: debugfs: Add support for RASDES framework in DWC
> > PCI: dwc: Create debugfs files in DWC driver
> >
> > drivers/pci/controller/dwc/Kconfig | 8 +
> > drivers/pci/controller/dwc/Makefile | 1 +
> > .../controller/dwc/pcie-designware-debugfs.c | 476
> ++++++++++++++++++
> > .../controller/dwc/pcie-designware-debugfs.h | 0
> > drivers/pci/controller/dwc/pcie-designware.c | 20 +
> > drivers/pci/controller/dwc/pcie-designware.h | 18 +
> > 6 files changed, 523 insertions(+)
> > create mode 100644
> > drivers/pci/controller/dwc/pcie-designware-debugfs.c
> > create mode 100644
> > drivers/pci/controller/dwc/pcie-designware-debugfs.h
> >
> > --
> > 2.17.1
> >
>
> --
> மணிவண்ணன் சதாசிவம்


2024-01-04 02:55:37

by Shradha Todi

[permalink] [raw]
Subject: RE: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller



> -----Original Message-----
> From: Shradha Todi <[email protected]>
> Sent: 04 December 2023 14:10
> To: 'Manivannan Sadhasivam' <[email protected]>
> Cc: '[email protected]' <[email protected]>; '[email protected]'
> <[email protected]>; '[email protected]' <[email protected]>;
> '[email protected]' <[email protected]>; '[email protected]'
> <[email protected]>; '[email protected]'
> <[email protected]>; '[email protected]'
> <[email protected]>; '[email protected]'
> <[email protected]>; '[email protected]'
> <[email protected]>; '[email protected]'
> <[email protected]>; '[email protected]' <linux-
> [email protected]>; '[email protected]' <linux-
> [email protected]>
> Subject: RE: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> controller
>
>
>
> > -----Original Message-----
> > From: Manivannan Sadhasivam [mailto:[email protected]]
> > Sent: 30 November 2023 22:25
> > To: Shradha Todi <[email protected]>
> > Cc: [email protected]; [email protected]; [email protected];
> > [email protected]; [email protected];
> > [email protected]; [email protected];
> > [email protected]; [email protected];
> > [email protected]; [email protected]; linux-
> > [email protected]
> > Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> > controller
> >
> > On Thu, Nov 30, 2023 at 05:20:41PM +0530, Shradha Todi wrote:
> > > DesignWare controller provides a vendor specific extended capability
> > > called RASDES as an IP feature. This extended capability provides
> > > hardware information like:
> > > - Debug registers to know the state of the link or controller.
> > > - Error injection mechanisms to inject various PCIe errors including
> > > sequence number, CRC
> > > - Statistical counters to know how many times a particular event
> > > occurred
> > >
> > > However, in Linux we do not have any generic or custom support to be
> > > able to use this feature in an efficient manner. This is the reason
> > > we are proposing this framework. Debug and bring up time of
> > > high-speed IPs are highly dependent on costlier hardware analyzers
> > > and this solution will in some ways help to reduce the HW analyzer usage.
> > >
> > > The debugfs entries can be used to get information about underlying
> > > hardware and can be shared with user space. Separate debugfs entries
> > > has been created to cater to all the DES hooks provided by the controller.
> > > The debugfs entries interacts with the RASDES registers in the
> > > required sequence and provides the meaningful data to the user. This
> > > eases the effort to understand and use the register information for
> > debugging.
> > >
> > > v1 version was posted long back and for some reasons I couldn't work
> > > on it. I apologize for the long break. I'm restarting this activity
> > > and have taken care of all previous review comments shared.
> > > v1:
> > > https://lore.kernel.org/all/20210518174618.42089-1-shradha.t@samsung
> > > .c
> > > om/T/
> > >
> >
> > There is already a series floating to add similar functionality via
> > perf
> > subsystem: https://lore.kernel.org/linux-pci/20231121013400.18367-1-
> > [email protected]/
> >
> > - Mani
> >
>
> Hi Mani,
>
> The series proposed in perf includes only time based-analysis and event counters
> which will monitor performance (Group 6 and 7). The patch or framework that we
> have proposed includes debug information, error injection facility and error
> counters (Group 0 - 5) which are not included as part of the functionality
> implemented via perf. In my opinion, these functionalities don't count as
> performance monitoring or counters but rather as debug counters. How about
> we take this up as a debugfs framework as proposed in my patch?
> Or if others feel it can be taken via perf driver then I am happy to extend the perf
> driver if authors do not have objection. Let me know what you think of this?
> Meanwhile I will review the perf patches and share my feedback.
>

Hello Mani,
Any update on the above comment? IMO, even though the perf patches and this
patchset are both part of the DWC vendor specific capability - RASDES, they
cover different features. The perf file includes performance based parameters
like time-based analysis and event counters for count of packets whereas this
patchset includes debugging fields, error injection and event counters for count
of errors. I think having a separate debugfs file fits more but would you suggest
we extend the perf file itself?

Shradha

> > > Shradha Todi (3):
> > > PCI: dwc: Add support for vendor specific capability search
> > > PCI: debugfs: Add support for RASDES framework in DWC
> > > PCI: dwc: Create debugfs files in DWC driver
> > >
> > > drivers/pci/controller/dwc/Kconfig | 8 +
> > > drivers/pci/controller/dwc/Makefile | 1 +
> > > .../controller/dwc/pcie-designware-debugfs.c | 476
> > ++++++++++++++++++
> > > .../controller/dwc/pcie-designware-debugfs.h | 0
> > > drivers/pci/controller/dwc/pcie-designware.c | 20 +
> > > drivers/pci/controller/dwc/pcie-designware.h | 18 +
> > > 6 files changed, 523 insertions(+)
> > > create mode 100644
> > > drivers/pci/controller/dwc/pcie-designware-debugfs.c
> > > create mode 100644
> > > drivers/pci/controller/dwc/pcie-designware-debugfs.h
> > >
> > > --
> > > 2.17.1
> > >
> >
> > --
> > மணிவண்ணன் சதாசிவம்



2024-01-04 05:56:23

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

On Wed, Jan 03, 2024 at 11:13:20AM +0530, Shradha Todi wrote:
>
>
> > -----Original Message-----
> > From: Shradha Todi <[email protected]>
> > Sent: 04 December 2023 14:10
> > To: 'Manivannan Sadhasivam' <[email protected]>
> > Cc: '[email protected]' <[email protected]>; '[email protected]'
> > <[email protected]>; '[email protected]' <[email protected]>;
> > '[email protected]' <[email protected]>; '[email protected]'
> > <[email protected]>; '[email protected]'
> > <[email protected]>; '[email protected]'
> > <[email protected]>; '[email protected]'
> > <[email protected]>; '[email protected]'
> > <[email protected]>; '[email protected]'
> > <[email protected]>; '[email protected]' <linux-
> > [email protected]>; '[email protected]' <linux-
> > [email protected]>
> > Subject: RE: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> > controller
> >
> >
> >
> > > -----Original Message-----
> > > From: Manivannan Sadhasivam [mailto:[email protected]]
> > > Sent: 30 November 2023 22:25
> > > To: Shradha Todi <[email protected]>
> > > Cc: [email protected]; [email protected]; [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; [email protected]; linux-
> > > [email protected]
> > > Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> > > controller
> > >
> > > On Thu, Nov 30, 2023 at 05:20:41PM +0530, Shradha Todi wrote:
> > > > DesignWare controller provides a vendor specific extended capability
> > > > called RASDES as an IP feature. This extended capability provides
> > > > hardware information like:
> > > > - Debug registers to know the state of the link or controller.
> > > > - Error injection mechanisms to inject various PCIe errors including
> > > > sequence number, CRC
> > > > - Statistical counters to know how many times a particular event
> > > > occurred
> > > >
> > > > However, in Linux we do not have any generic or custom support to be
> > > > able to use this feature in an efficient manner. This is the reason
> > > > we are proposing this framework. Debug and bring up time of
> > > > high-speed IPs are highly dependent on costlier hardware analyzers
> > > > and this solution will in some ways help to reduce the HW analyzer usage.
> > > >
> > > > The debugfs entries can be used to get information about underlying
> > > > hardware and can be shared with user space. Separate debugfs entries
> > > > has been created to cater to all the DES hooks provided by the controller.
> > > > The debugfs entries interacts with the RASDES registers in the
> > > > required sequence and provides the meaningful data to the user. This
> > > > eases the effort to understand and use the register information for
> > > debugging.
> > > >
> > > > v1 version was posted long back and for some reasons I couldn't work
> > > > on it. I apologize for the long break. I'm restarting this activity
> > > > and have taken care of all previous review comments shared.
> > > > v1:
> > > > https://lore.kernel.org/all/20210518174618.42089-1-shradha.t@samsung
> > > > .c
> > > > om/T/
> > > >
> > >
> > > There is already a series floating to add similar functionality via
> > > perf
> > > subsystem: https://lore.kernel.org/linux-pci/20231121013400.18367-1-
> > > [email protected]/
> > >
> > > - Mani
> > >
> >
> > Hi Mani,
> >
> > The series proposed in perf includes only time based-analysis and event counters
> > which will monitor performance (Group 6 and 7). The patch or framework that we
> > have proposed includes debug information, error injection facility and error
> > counters (Group 0 - 5) which are not included as part of the functionality
> > implemented via perf. In my opinion, these functionalities don't count as
> > performance monitoring or counters but rather as debug counters. How about
> > we take this up as a debugfs framework as proposed in my patch?
> > Or if others feel it can be taken via perf driver then I am happy to extend the perf
> > driver if authors do not have objection. Let me know what you think of this?
> > Meanwhile I will review the perf patches and share my feedback.
> >
>
> Hello Mani,
> Any update on the above comment? IMO, even though the perf patches and this
> patchset are both part of the DWC vendor specific capability - RASDES, they
> cover different features. The perf file includes performance based parameters
> like time-based analysis and event counters for count of packets whereas this
> patchset includes debugging fields, error injection and event counters for count
> of errors. I think having a separate debugfs file fits more but would you suggest
> we extend the perf file itself?
>

For the error injection and counters, we already have the EDAC framework. So
adding them in the DWC driver doesn't make sense to me.

But first check with the perf driver author if they have any plans on adding the
proposed functionality. If they do not have any plan or not working on it, then
look into EDAC.

- Mani

> Shradha
>
> > > > Shradha Todi (3):
> > > > PCI: dwc: Add support for vendor specific capability search
> > > > PCI: debugfs: Add support for RASDES framework in DWC
> > > > PCI: dwc: Create debugfs files in DWC driver
> > > >
> > > > drivers/pci/controller/dwc/Kconfig | 8 +
> > > > drivers/pci/controller/dwc/Makefile | 1 +
> > > > .../controller/dwc/pcie-designware-debugfs.c | 476
> > > ++++++++++++++++++
> > > > .../controller/dwc/pcie-designware-debugfs.h | 0
> > > > drivers/pci/controller/dwc/pcie-designware.c | 20 +
> > > > drivers/pci/controller/dwc/pcie-designware.h | 18 +
> > > > 6 files changed, 523 insertions(+)
> > > > create mode 100644
> > > > drivers/pci/controller/dwc/pcie-designware-debugfs.c
> > > > create mode 100644
> > > > drivers/pci/controller/dwc/pcie-designware-debugfs.h
> > > >
> > > > --
> > > > 2.17.1
> > > >
> > >
> > > --
> > > மணிவண்ணன் சதாசிவம்
>
>

--
மணிவண்ணன் சதாசிவம்

2024-02-15 09:58:32

by Shradha Todi

[permalink] [raw]
Subject: RE: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller



> -----Original Message-----
> From: 'Manivannan Sadhasivam' <[email protected]>
> Sent: 04 January 2024 11:21
> To: Shradha Todi <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> controller
>
> On Wed, Jan 03, 2024 at 11:13:20AM +0530, Shradha Todi wrote:
> >
> >
> > > -----Original Message-----
> > > From: Shradha Todi <[email protected]>
> > > Sent: 04 December 2023 14:10
> > > To: 'Manivannan Sadhasivam' <[email protected]>
> > > Cc: '[email protected]' <[email protected]>; '[email protected]'
> > > <[email protected]>; '[email protected]' <[email protected]>;
> > > '[email protected]' <[email protected]>; '[email protected]'
> > > <[email protected]>; '[email protected]'
> > > <[email protected]>; '[email protected]'
> > > <[email protected]>; '[email protected]'
> > > <[email protected]>; '[email protected]'
> > > <[email protected]>; '[email protected]'
> > > <[email protected]>; '[email protected]' <linux-
> > > [email protected]>; '[email protected]' <linux-
> > > [email protected]>
> > > Subject: RE: [PATCH v2 0/3] Add support for RAS DES feature in PCIe
> > > DW controller
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Manivannan Sadhasivam
> > > > [mailto:[email protected]]
> > > > Sent: 30 November 2023 22:25
> > > > To: Shradha Todi <[email protected]>
> > > > Cc: [email protected]; [email protected]; [email protected];
> > > > [email protected]; [email protected];
> > > > [email protected]; [email protected];
> > > > [email protected]; [email protected];
> > > > [email protected]; [email protected]; linux-
> > > > [email protected]
> > > > Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in
> > > > PCIe DW controller
> > > >
> > > > On Thu, Nov 30, 2023 at 05:20:41PM +0530, Shradha Todi wrote:
> > > > > DesignWare controller provides a vendor specific extended
> > > > > capability called RASDES as an IP feature. This extended
> > > > > capability provides hardware information like:
> > > > > - Debug registers to know the state of the link or controller.
> > > > > - Error injection mechanisms to inject various PCIe errors including
> > > > > sequence number, CRC
> > > > > - Statistical counters to know how many times a particular event
> > > > > occurred
> > > > >
> > > > > However, in Linux we do not have any generic or custom support
> > > > > to be able to use this feature in an efficient manner. This is
> > > > > the reason we are proposing this framework. Debug and bring up
> > > > > time of high-speed IPs are highly dependent on costlier hardware
> > > > > analyzers and this solution will in some ways help to reduce the HW
> analyzer usage.
> > > > >
> > > > > The debugfs entries can be used to get information about
> > > > > underlying hardware and can be shared with user space. Separate
> > > > > debugfs entries has been created to cater to all the DES hooks provided
> by the controller.
> > > > > The debugfs entries interacts with the RASDES registers in the
> > > > > required sequence and provides the meaningful data to the user.
> > > > > This eases the effort to understand and use the register
> > > > > information for
> > > > debugging.
> > > > >
> > > > > v1 version was posted long back and for some reasons I couldn't
> > > > > work on it. I apologize for the long break. I'm restarting this
> > > > > activity and have taken care of all previous review comments shared.
> > > > > v1:
> > > > > https://lore.kernel.org/all/20210518174618.42089-1-shradha.t@sam
> > > > > sung
> > > > > .c
> > > > > om/T/
> > > > >
> > > >
> > > > There is already a series floating to add similar functionality
> > > > via perf
> > > > subsystem:
> > > > https://lore.kernel.org/linux-pci/20231121013400.18367-1-
> > > > [email protected]/
> > > >
> > > > - Mani
> > > >
> > >
> > > Hi Mani,
> > >
> > > The series proposed in perf includes only time based-analysis and
> > > event counters which will monitor performance (Group 6 and 7). The
> > > patch or framework that we have proposed includes debug information,
> > > error injection facility and error counters (Group 0 - 5) which are
> > > not included as part of the functionality implemented via perf. In
> > > my opinion, these functionalities don't count as performance
> > > monitoring or counters but rather as debug counters. How about we take this
> up as a debugfs framework as proposed in my patch?
> > > Or if others feel it can be taken via perf driver then I am happy to
> > > extend the perf driver if authors do not have objection. Let me know what
> you think of this?
> > > Meanwhile I will review the perf patches and share my feedback.
> > >
> >
> > Hello Mani,
> > Any update on the above comment? IMO, even though the perf patches and
> > this patchset are both part of the DWC vendor specific capability -
> > RASDES, they cover different features. The perf file includes
> > performance based parameters like time-based analysis and event
> > counters for count of packets whereas this patchset includes debugging
> > fields, error injection and event counters for count of errors. I
> > think having a separate debugfs file fits more but would you suggest we extend
> the perf file itself?
> >
>
> For the error injection and counters, we already have the EDAC framework. So
> adding them in the DWC driver doesn't make sense to me.
>

Sorry for late response, was going through the EDAC framework to understand better how we can fit RAS DES support in it. Below are some technical challenges found so far:
1: This debugfs framework proposed [1] can run on both side of the link i.e. RC and EP as it will be a part of the link controller platform driver Here for the EP side the assumption is that it has Linux running, which is primarily a use case for chip-to-chip communication. After your suggestion to migrate to EDAC framework we studied and here are the findings:
- If we move to EDAC framework, we need to have RAS DES as a pci_driver which will be binded based on vendor_id and device_id. Our observation is that on EP side system we are unable to bind two function driver (pci_driver), as pci_endpoint_test function driver or some other chip-to-chip function driver will already be bound. On the other hand, on RC side we observed that if we have portdrv enabled in Linux running on RC system, it gets bound to RC controller and then it does not allow EDAC pci_driver to bind. So basically we see a problem here, that we can't have two pci_driver binding to same PCI device
2: Another point is even though we use EDAC driver framework, we may not be able to use any of EDAC framework APIs as they are mostly suitable for memory controller devices sitting on PCI BUS. We will end up using debugfs entries just via a pci_driver placed inside EDAC framework.

Please let me know if my understanding is wrong.

> But first check with the perf driver author if they have any plans on adding the
> proposed functionality. If they do not have any plan or not working on it, then
> look into EDAC.
>
> - Mani
>

Since we already worked and posted patches [1], [2], we will continue to work on this and based on consent from community we will adopt to most suitable framework.
We see many subsystems like ethernet, usb, gpu, cxl having debugfs files that give information about the current status of the running system and as of now based on our findings, we still feel there is no harm in having debugfs entry based support in DesignWare controller driver itself.
Others any opinion please?

[1]: https://lkml.org/lkml/2021/5/18/1371
[2]: https://lkml.org/lkml/2023/11/30/574

> > Shradha
> >
> > > > > Shradha Todi (3):
> > > > > PCI: dwc: Add support for vendor specific capability search
> > > > > PCI: debugfs: Add support for RASDES framework in DWC
> > > > > PCI: dwc: Create debugfs files in DWC driver
> > > > >
> > > > > drivers/pci/controller/dwc/Kconfig | 8 +
> > > > > drivers/pci/controller/dwc/Makefile | 1 +
> > > > > .../controller/dwc/pcie-designware-debugfs.c | 476
> > > > ++++++++++++++++++
> > > > > .../controller/dwc/pcie-designware-debugfs.h | 0
> > > > > drivers/pci/controller/dwc/pcie-designware.c | 20 +
> > > > > drivers/pci/controller/dwc/pcie-designware.h | 18 +
> > > > > 6 files changed, 523 insertions(+) create mode 100644
> > > > > drivers/pci/controller/dwc/pcie-designware-debugfs.c
> > > > > create mode 100644
> > > > > drivers/pci/controller/dwc/pcie-designware-debugfs.h
> > > > >
> > > > > --
> > > > > 2.17.1
> > > > >
> > > >
> > > > --
> > > > மணிவண்ணன் சதாசிவம்
> >
> >
>
> --
> மணிவண்ணன் சதாசிவம்



2024-02-16 13:49:46

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

On Thu, Feb 15, 2024 at 02:55:06PM +0530, Shradha Todi wrote:
>
>

[...]

> > For the error injection and counters, we already have the EDAC framework. So
> > adding them in the DWC driver doesn't make sense to me.
> >
>
> Sorry for late response, was going through the EDAC framework to understand better how we can fit RAS DES support in it. Below are some technical challenges found so far:
> 1: This debugfs framework proposed [1] can run on both side of the link i.e. RC and EP as it will be a part of the link controller platform driver. Here for the EP side the assumption is that it has Linux running, which is primarily a use case for chip-to-chip communication. After your suggestion to migrate to EDAC framework we studied and here are the findings:
> - If we move to EDAC framework, we need to have RAS DES as a pci_driver which will be binded based on vendor_id and device_id. Our observation is that on EP side system we are unable to bind two function driver (pci_driver), as pci_endpoint_test function driver or some other chip-to-chip function driver will already be bound. On the other hand, on RC side we observed that if we have portdrv enabled in Linux running on RC system, it gets bound to RC controller and then it does not allow EDAC pci_driver to bind. So basically we see a problem here, that we can't have two pci_driver binding to same PCI device
> 2: Another point is even though we use EDAC driver framework, we may not be able to use any of EDAC framework APIs as they are mostly suitable for memory controller devices sitting on PCI BUS. We will end up using debugfs entries just via a pci_driver placed inside EDAC framework.

Please wrap your replies to 80 characters.

There is no need to bind the edac driver to VID:PID of the device. The edac
driver can be a platform driver and you can instantiate the platform device
from the DWC driver. This way, the PCI device can be assocaited with whatever
driver, but still there can be a separate edac driver for handling errors.

Regarding API limitation, you should ask the maintainer about the possibility of
extending them.

>
> Please let me know if my understanding is wrong.
>
> > But first check with the perf driver author if they have any plans on adding the
> > proposed functionality. If they do not have any plan or not working on it, then
> > look into EDAC.
> >
> > - Mani
> >
>
> Since we already worked and posted patches [1], [2], we will continue to work on this and based on consent from community we will adopt to most suitable framework.
> We see many subsystems like ethernet, usb, gpu, cxl having debugfs files that give information about the current status of the running system and as of now based on our findings, we still feel there is no harm in having debugfs entry based support in DesignWare controller driver itself.

There is no issue in exposing the debug information through debugfs, that's the
sole purpose of the interface. But here, you are trying to add support for DWC
RAS feature for which a dedicated framework already exists.

And there will be more similar requests coming for vendor specific error
protocols as well. So your investigation could benefit everyone.

From your above investigation, looks like there are some shortcomings of the
EDAC framework. So let's get that clarified by writing to the EDAC maintainers
(keep us in CC). If the EDAC maintainer suggests you to add support for this
feature in DWC driver itself citing some reasons, then no issues with me.

- Mani

--
மணிவண்ணன் சதாசிவம்

2024-02-22 11:02:59

by Shradha Todi

[permalink] [raw]
Subject: RE: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

+ Borislav, Tony, James, Mauro, Robert

Hi All,

Synopsys DesignWare PCIe controllers have a vendor specific capability (which
means that this set of registers are only present in DesignWare controllers)
to perform debug operations called "RASDES".
The functionalities provided by this extended capability are:

1. Debug: This has some debug related diagnostic features like holding LTSSM
in certain states, reading the status of lane detection, checking if any PCIe
lanes are broken (RX Valid) and so on. It's a debug only feature used for diagnostic
use-cases.

2. Error Injection: This is a way to inject certain errors in PCIe like LCRC, ECRC,
Bad TLPs and so on. Again, this is a debug feature and generally not used in
functional use-case.

3. Statistical counters: This has 3 parts
- Error counters
- Non error counters (covered as part of perf [1])
- Time based analysis counters (covered as part of perf [1])

Selective features of the above functionality has been implemented
by vendor specific PCIe controller drivers (pcie-tegra194.c) that use
Synopsys DesignWare PCIe controllers.
In order to make it useful to all vendors using DWC controller, we had
proposed a common implementation in DWC PCIe controller directory
(drivers/pci/controller/dwc/) and our original idea was based on debugfs
filesystem. v1 and v2 are mentioned in [2] and [3].

We got a suggestion to implement this as part of EDAC framework [3] and
we looked into the same. But as far as I understood, what I am trying to
implement is a very specific feature (only valid for Synopsys DWC PCIe controllers).
This doesn't seem to fit in very well with the EDAC framework and we can
hardly use any of the EDAC framework APIs. We tried implementing a
"pci_driver" but since a function driver will already be running on the EP and
portdrv on the root-complex, we will not be able to bind 2 drivers to a single
PCI device (root-complex or endpoint). Ultimately, what I will be doing is
writing a platform driver with debugfs entries which will be present in EDAC
directory instead of DWC directory.

Can you please help us out by going through this thread [3] and letting us
know if our understanding is wrong at any point. If you think it is a better
idea to integrate this in the EDAC framework, can you guide me as
to how I can utilize the framework better?
Please let me know if you need any other information to conclude.

[1] https://lore.kernel.org/linux-pci/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/T/
[3] https://lore.kernel.org/all/[email protected]/

Thanks,
Shradha

> -----Original Message-----
> From: 'Manivannan Sadhasivam' <[email protected]>
> Sent: 16 February 2024 19:19
> To: Shradha Todi <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected]
> Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> controller
>
> On Thu, Feb 15, 2024 at 02:55:06PM +0530, Shradha Todi wrote:
> >
> >
>
> [...]
>
> > > For the error injection and counters, we already have the EDAC
> > > framework. So adding them in the DWC driver doesn't make sense to me.
> > >
> >
> > Sorry for late response, was going through the EDAC framework to understand
> better how we can fit RAS DES support in it. Below are some technical challenges
> found so far:
> > 1: This debugfs framework proposed [1] can run on both side of the link i.e. RC
> and EP as it will be a part of the link controller platform driver. Here for the EP
> side the assumption is that it has Linux running, which is primarily a use case for
> chip-to-chip communication. After your suggestion to migrate to EDAC
> framework we studied and here are the findings:
> > - If we move to EDAC framework, we need to have RAS DES as a
> > pci_driver which will be binded based on vendor_id and device_id. Our
> > observation is that on EP side system we are unable to bind two
> > function driver (pci_driver), as pci_endpoint_test function driver or
> > some other chip-to-chip function driver will already be bound. On the
> > other hand, on RC side we observed that if we have portdrv enabled in
> > Linux running on RC system, it gets bound to RC controller and then it
> > does not allow EDAC pci_driver to bind. So basically we see a problem
> > here, that we can't have two pci_driver binding to same PCI device
> > 2: Another point is even though we use EDAC driver framework, we may not be
> able to use any of EDAC framework APIs as they are mostly suitable for memory
> controller devices sitting on PCI BUS. We will end up using debugfs entries just via
> a pci_driver placed inside EDAC framework.
>
> Please wrap your replies to 80 characters.
>
> There is no need to bind the edac driver to VID:PID of the device. The edac driver
> can be a platform driver and you can instantiate the platform device from the
> DWC driver. This way, the PCI device can be assocaited with whatever driver, but
> still there can be a separate edac driver for handling errors.
>
> Regarding API limitation, you should ask the maintainer about the possibility of
> extending them.
>
> >
> > Please let me know if my understanding is wrong.
> >
> > > But first check with the perf driver author if they have any plans
> > > on adding the proposed functionality. If they do not have any plan
> > > or not working on it, then look into EDAC.
> > >
> > > - Mani
> > >
> >
> > Since we already worked and posted patches [1], [2], we will continue to work
> on this and based on consent from community we will adopt to most suitable
> framework.
> > We see many subsystems like ethernet, usb, gpu, cxl having debugfs files that
> give information about the current status of the running system and as of now
> based on our findings, we still feel there is no harm in having debugfs entry based
> support in DesignWare controller driver itself.
>
> There is no issue in exposing the debug information through debugfs, that's the
> sole purpose of the interface. But here, you are trying to add support for DWC
> RAS feature for which a dedicated framework already exists.
>
> And there will be more similar requests coming for vendor specific error protocols
> as well. So your investigation could benefit everyone.
>
> From your above investigation, looks like there are some shortcomings of the
> EDAC framework. So let's get that clarified by writing to the EDAC maintainers
> (keep us in CC). If the EDAC maintainer suggests you to add support for this
> feature in DWC driver itself citing some reasons, then no issues with me.
>
> - Mani
>
> --
> மணிவண்ணன் சதாசிவம்



2024-03-19 16:43:24

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

On Thu, Feb 22, 2024 at 04:30:47PM +0530, Shradha Todi wrote:
> + Borislav, Tony, James, Mauro, Robert
>
> Hi All,
>
> Synopsys DesignWare PCIe controllers have a vendor specific capability (which
> means that this set of registers are only present in DesignWare controllers)
> to perform debug operations called "RASDES".
> The functionalities provided by this extended capability are:
>
> 1. Debug: This has some debug related diagnostic features like holding LTSSM
> in certain states, reading the status of lane detection, checking if any PCIe
> lanes are broken (RX Valid) and so on. It's a debug only feature used for diagnostic
> use-cases.
>
> 2. Error Injection: This is a way to inject certain errors in PCIe like LCRC, ECRC,
> Bad TLPs and so on. Again, this is a debug feature and generally not used in
> functional use-case.
>
> 3. Statistical counters: This has 3 parts
> - Error counters
> - Non error counters (covered as part of perf [1])
> - Time based analysis counters (covered as part of perf [1])
>
> Selective features of the above functionality has been implemented
> by vendor specific PCIe controller drivers (pcie-tegra194.c) that use
> Synopsys DesignWare PCIe controllers.
> In order to make it useful to all vendors using DWC controller, we had
> proposed a common implementation in DWC PCIe controller directory
> (drivers/pci/controller/dwc/) and our original idea was based on debugfs
> filesystem. v1 and v2 are mentioned in [2] and [3].
>
> We got a suggestion to implement this as part of EDAC framework [3] and
> we looked into the same. But as far as I understood, what I am trying to
> implement is a very specific feature (only valid for Synopsys DWC PCIe controllers).
> This doesn't seem to fit in very well with the EDAC framework and we can
> hardly use any of the EDAC framework APIs. We tried implementing a
> "pci_driver" but since a function driver will already be running on the EP and
> portdrv on the root-complex, we will not be able to bind 2 drivers to a single
> PCI device (root-complex or endpoint). Ultimately, what I will be doing is
> writing a platform driver with debugfs entries which will be present in EDAC
> directory instead of DWC directory.
>
> Can you please help us out by going through this thread [3] and letting us
> know if our understanding is wrong at any point. If you think it is a better
> idea to integrate this in the EDAC framework, can you guide me as
> to how I can utilize the framework better?
> Please let me know if you need any other information to conclude.
>
> [1] https://lore.kernel.org/linux-pci/[email protected]/
> [2] https://lore.kernel.org/all/[email protected]/T/
> [3] https://lore.kernel.org/all/[email protected]/
>

Gentle ping for the EDAC maintainers.

- Mani

> Thanks,
> Shradha
>
> > -----Original Message-----
> > From: 'Manivannan Sadhasivam' <[email protected]>
> > Sent: 16 February 2024 19:19
> > To: Shradha Todi <[email protected]>
> > Cc: [email protected]; [email protected]; [email protected];
> > [email protected]; [email protected];
> > [email protected]; [email protected];
> > [email protected]; [email protected];
> > [email protected]; [email protected]; linux-
> > [email protected]; [email protected]; [email protected]
> > Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> > controller
> >
> > On Thu, Feb 15, 2024 at 02:55:06PM +0530, Shradha Todi wrote:
> > >
> > >
> >
> > [...]
> >
> > > > For the error injection and counters, we already have the EDAC
> > > > framework. So adding them in the DWC driver doesn't make sense to me.
> > > >
> > >
> > > Sorry for late response, was going through the EDAC framework to understand
> > better how we can fit RAS DES support in it. Below are some technical challenges
> > found so far:
> > > 1: This debugfs framework proposed [1] can run on both side of the link i.e. RC
> > and EP as it will be a part of the link controller platform driver. Here for the EP
> > side the assumption is that it has Linux running, which is primarily a use case for
> > chip-to-chip communication. After your suggestion to migrate to EDAC
> > framework we studied and here are the findings:
> > > - If we move to EDAC framework, we need to have RAS DES as a
> > > pci_driver which will be binded based on vendor_id and device_id. Our
> > > observation is that on EP side system we are unable to bind two
> > > function driver (pci_driver), as pci_endpoint_test function driver or
> > > some other chip-to-chip function driver will already be bound. On the
> > > other hand, on RC side we observed that if we have portdrv enabled in
> > > Linux running on RC system, it gets bound to RC controller and then it
> > > does not allow EDAC pci_driver to bind. So basically we see a problem
> > > here, that we can't have two pci_driver binding to same PCI device
> > > 2: Another point is even though we use EDAC driver framework, we may not be
> > able to use any of EDAC framework APIs as they are mostly suitable for memory
> > controller devices sitting on PCI BUS. We will end up using debugfs entries just via
> > a pci_driver placed inside EDAC framework.
> >
> > Please wrap your replies to 80 characters.
> >
> > There is no need to bind the edac driver to VID:PID of the device. The edac driver
> > can be a platform driver and you can instantiate the platform device from the
> > DWC driver. This way, the PCI device can be assocaited with whatever driver, but
> > still there can be a separate edac driver for handling errors.
> >
> > Regarding API limitation, you should ask the maintainer about the possibility of
> > extending them.
> >
> > >
> > > Please let me know if my understanding is wrong.
> > >
> > > > But first check with the perf driver author if they have any plans
> > > > on adding the proposed functionality. If they do not have any plan
> > > > or not working on it, then look into EDAC.
> > > >
> > > > - Mani
> > > >
> > >
> > > Since we already worked and posted patches [1], [2], we will continue to work
> > on this and based on consent from community we will adopt to most suitable
> > framework.
> > > We see many subsystems like ethernet, usb, gpu, cxl having debugfs files that
> > give information about the current status of the running system and as of now
> > based on our findings, we still feel there is no harm in having debugfs entry based
> > support in DesignWare controller driver itself.
> >
> > There is no issue in exposing the debug information through debugfs, that's the
> > sole purpose of the interface. But here, you are trying to add support for DWC
> > RAS feature for which a dedicated framework already exists.
> >
> > And there will be more similar requests coming for vendor specific error protocols
> > as well. So your investigation could benefit everyone.
> >
> > From your above investigation, looks like there are some shortcomings of the
> > EDAC framework. So let's get that clarified by writing to the EDAC maintainers
> > (keep us in CC). If the EDAC maintainer suggests you to add support for this
> > feature in DWC driver itself citing some reasons, then no issues with me.
> >
> > - Mani
> >
> > --
> > மணிவண்ணன் சதாசிவம்
>
>

--
மணிவண்ணன் சதாசிவம்

2024-03-20 10:02:07

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

On Tue, 19 Mar 2024 22:03:15 +0530
'Manivannan Sadhasivam' <[email protected]> wrote:

> On Thu, Feb 22, 2024 at 04:30:47PM +0530, Shradha Todi wrote:
> > + Borislav, Tony, James, Mauro, Robert
> >
> > Hi All,
> >
> > Synopsys DesignWare PCIe controllers have a vendor specific capability (which
> > means that this set of registers are only present in DesignWare controllers)
> > to perform debug operations called "RASDES".
> > The functionalities provided by this extended capability are:
> >
> > 1. Debug: This has some debug related diagnostic features like holding LTSSM
> > in certain states, reading the status of lane detection, checking if any PCIe
> > lanes are broken (RX Valid) and so on. It's a debug only feature used for diagnostic
> > use-cases.
> >
> > 2. Error Injection: This is a way to inject certain errors in PCIe like LCRC, ECRC,
> > Bad TLPs and so on. Again, this is a debug feature and generally not used in
> > functional use-case.
> >
> > 3. Statistical counters: This has 3 parts
> > - Error counters
> > - Non error counters (covered as part of perf [1])
> > - Time based analysis counters (covered as part of perf [1])
> >
> > Selective features of the above functionality has been implemented
> > by vendor specific PCIe controller drivers (pcie-tegra194.c) that use
> > Synopsys DesignWare PCIe controllers.
> > In order to make it useful to all vendors using DWC controller, we had
> > proposed a common implementation in DWC PCIe controller directory
> > (drivers/pci/controller/dwc/) and our original idea was based on debugfs
> > filesystem. v1 and v2 are mentioned in [2] and [3].
> >
> > We got a suggestion to implement this as part of EDAC framework [3] and
> > we looked into the same. But as far as I understood, what I am trying to
> > implement is a very specific feature (only valid for Synopsys DWC PCIe controllers).

For error part there are (at least superficially) similar features in the PCIe
standard that we've started thinking about how to support.

See Flit Logging Extended capablity (7.7.8 in PCIe Base Spec rev6.
That has the benefit that they are part of the standard so we can
support them directly in portdrv / EP drivers using some library code in the
PCI core.

There are other interconnect and PCI PMU drivers that log retries etc which are also basically error
counts. At least some of that is done through perf today.


> > This doesn't seem to fit in very well with the EDAC framework and we can
> > hardly use any of the EDAC framework APIs. We tried implementing a
> > "pci_driver" but since a function driver will already be running on the EP and
> > portdrv on the root-complex, we will not be able to bind 2 drivers to a single
> > PCI device (root-complex or endpoint). Ultimately, what I will be doing is
> > writing a platform driver with debugfs entries which will be present in EDAC
> > directory instead of DWC directory.

The addition of this type of functionality to pordrv is a long running question.
Everyone wants a solution, I believe some people are looking at it (+CC Terry)

Terry, another case for your long list.

For the EP end, this should be fired up by the EP driver, whilst it might be
infrastructure used on a bunch of devices, it is a feature of that particular
EP - so you'd want to provide any functionality in a form that could be used
by both the EP driver and a nice shiny new portdrv replacement.

> >
> > Can you please help us out by going through this thread [3] and letting us
> > know if our understanding is wrong at any point. If you think it is a better
> > idea to integrate this in the EDAC framework, can you guide me as
> > to how I can utilize the framework better?
> > Please let me know if you need any other information to conclude.
> >
> > [1] https://lore.kernel.org/linux-pci/[email protected]/
> > [2] https://lore.kernel.org/all/[email protected]/T/
> > [3] https://lore.kernel.org/all/[email protected]/
> >
>
> Gentle ping for the EDAC maintainers.
>
> - Mani
>
> > Thanks,
> > Shradha
> >
> > > -----Original Message-----
> > > From: 'Manivannan Sadhasivam' <[email protected]>
> > > Sent: 16 February 2024 19:19
> > > To: Shradha Todi <[email protected]>
> > > Cc: [email protected]; [email protected]; [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; [email protected]; linux-
> > > [email protected]; [email protected]; [email protected]
> > > Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> > > controller
> > >
> > > On Thu, Feb 15, 2024 at 02:55:06PM +0530, Shradha Todi wrote:
> > > >
> > > >
> > >
> > > [...]
> > >
> > > > > For the error injection and counters, we already have the EDAC
> > > > > framework. So adding them in the DWC driver doesn't make sense to me.
> > > > >
> > > >
> > > > Sorry for late response, was going through the EDAC framework to understand
> > > better how we can fit RAS DES support in it. Below are some technical challenges
> > > found so far:
> > > > 1: This debugfs framework proposed [1] can run on both side of the link i.e. RC
> > > and EP as it will be a part of the link controller platform driver. Here for the EP
> > > side the assumption is that it has Linux running, which is primarily a use case for
> > > chip-to-chip communication. After your suggestion to migrate to EDAC
> > > framework we studied and here are the findings:
> > > > - If we move to EDAC framework, we need to have RAS DES as a
> > > > pci_driver which will be binded based on vendor_id and device_id. Our
> > > > observation is that on EP side system we are unable to bind two
> > > > function driver (pci_driver), as pci_endpoint_test function driver or
> > > > some other chip-to-chip function driver will already be bound. On the
> > > > other hand, on RC side we observed that if we have portdrv enabled in
> > > > Linux running on RC system, it gets bound to RC controller and then it
> > > > does not allow EDAC pci_driver to bind. So basically we see a problem
> > > > here, that we can't have two pci_driver binding to same PCI device
> > > > 2: Another point is even though we use EDAC driver framework, we may not be
> > > able to use any of EDAC framework APIs as they are mostly suitable for memory
> > > controller devices sitting on PCI BUS. We will end up using debugfs entries just via
> > > a pci_driver placed inside EDAC framework.
> > >
> > > Please wrap your replies to 80 characters.
> > >
> > > There is no need to bind the edac driver to VID:PID of the device. The edac driver
> > > can be a platform driver and you can instantiate the platform device from the
> > > DWC driver. This way, the PCI device can be assocaited with whatever driver, but
> > > still there can be a separate edac driver for handling errors.
> > >
> > > Regarding API limitation, you should ask the maintainer about the possibility of
> > > extending them.
> > >
> > > >
> > > > Please let me know if my understanding is wrong.
> > > >
> > > > > But first check with the perf driver author if they have any plans
> > > > > on adding the proposed functionality. If they do not have any plan
> > > > > or not working on it, then look into EDAC.
> > > > >
> > > > > - Mani
> > > > >
> > > >
> > > > Since we already worked and posted patches [1], [2], we will continue to work
> > > on this and based on consent from community we will adopt to most suitable
> > > framework.
> > > > We see many subsystems like ethernet, usb, gpu, cxl having debugfs files that
> > > give information about the current status of the running system and as of now
> > > based on our findings, we still feel there is no harm in having debugfs entry based
> > > support in DesignWare controller driver itself.
> > >
> > > There is no issue in exposing the debug information through debugfs, that's the
> > > sole purpose of the interface. But here, you are trying to add support for DWC
> > > RAS feature for which a dedicated framework already exists.
> > >
> > > And there will be more similar requests coming for vendor specific error protocols
> > > as well. So your investigation could benefit everyone.
> > >
> > > From your above investigation, looks like there are some shortcomings of the
> > > EDAC framework. So let's get that clarified by writing to the EDAC maintainers
> > > (keep us in CC). If the EDAC maintainer suggests you to add support for this
> > > feature in DWC driver itself citing some reasons, then no issues with me.
> > >
> > > - Mani
> > >
> > > --
> > > மணிவண்ணன் சதாசிவம்
> >
> >
>


2024-03-22 10:46:39

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

On Wed, Mar 20, 2024 at 10:01:44AM +0000, Jonathan Cameron wrote:
> On Tue, 19 Mar 2024 22:03:15 +0530
> 'Manivannan Sadhasivam' <[email protected]> wrote:
>
> > On Thu, Feb 22, 2024 at 04:30:47PM +0530, Shradha Todi wrote:
> > > + Borislav, Tony, James, Mauro, Robert
> > >
> > > Hi All,
> > >
> > > Synopsys DesignWare PCIe controllers have a vendor specific capability (which
> > > means that this set of registers are only present in DesignWare controllers)
> > > to perform debug operations called "RASDES".
> > > The functionalities provided by this extended capability are:
> > >
> > > 1. Debug: This has some debug related diagnostic features like holding LTSSM
> > > in certain states, reading the status of lane detection, checking if any PCIe
> > > lanes are broken (RX Valid) and so on. It's a debug only feature used for diagnostic
> > > use-cases.
> > >
> > > 2. Error Injection: This is a way to inject certain errors in PCIe like LCRC, ECRC,
> > > Bad TLPs and so on. Again, this is a debug feature and generally not used in
> > > functional use-case.
> > >
> > > 3. Statistical counters: This has 3 parts
> > > - Error counters
> > > - Non error counters (covered as part of perf [1])
> > > - Time based analysis counters (covered as part of perf [1])
> > >
> > > Selective features of the above functionality has been implemented
> > > by vendor specific PCIe controller drivers (pcie-tegra194.c) that use
> > > Synopsys DesignWare PCIe controllers.
> > > In order to make it useful to all vendors using DWC controller, we had
> > > proposed a common implementation in DWC PCIe controller directory
> > > (drivers/pci/controller/dwc/) and our original idea was based on debugfs
> > > filesystem. v1 and v2 are mentioned in [2] and [3].
> > >
> > > We got a suggestion to implement this as part of EDAC framework [3] and
> > > we looked into the same. But as far as I understood, what I am trying to
> > > implement is a very specific feature (only valid for Synopsys DWC PCIe controllers).
>
> For error part there are (at least superficially) similar features in the PCIe
> standard that we've started thinking about how to support.
>
> See Flit Logging Extended capablity (7.7.8 in PCIe Base Spec rev6.
> That has the benefit that they are part of the standard so we can
> support them directly in portdrv / EP drivers using some library code in the
> PCI core.
>

Sounds good. But v6 is a relatively new version and the DWC RAS predates that.
So we still need to support it somehow (either in EDAC or in
drivers/pci/controller/dwc).

> There are other interconnect and PCI PMU drivers that log retries etc which are also basically error
> counts. At least some of that is done through perf today.
>

IMO all the RAS support should be exposed through EDAC, otherwise it defeats the
purpose of the subsystem.

- Mani

>
> > > This doesn't seem to fit in very well with the EDAC framework and we can
> > > hardly use any of the EDAC framework APIs. We tried implementing a
> > > "pci_driver" but since a function driver will already be running on the EP and
> > > portdrv on the root-complex, we will not be able to bind 2 drivers to a single
> > > PCI device (root-complex or endpoint). Ultimately, what I will be doing is
> > > writing a platform driver with debugfs entries which will be present in EDAC
> > > directory instead of DWC directory.
>
> The addition of this type of functionality to pordrv is a long running question.
> Everyone wants a solution, I believe some people are looking at it (+CC Terry)
>
> Terry, another case for your long list.
>
> For the EP end, this should be fired up by the EP driver, whilst it might be
> infrastructure used on a bunch of devices, it is a feature of that particular
> EP - so you'd want to provide any functionality in a form that could be used
> by both the EP driver and a nice shiny new portdrv replacement.
>
> > >
> > > Can you please help us out by going through this thread [3] and letting us
> > > know if our understanding is wrong at any point. If you think it is a better
> > > idea to integrate this in the EDAC framework, can you guide me as
> > > to how I can utilize the framework better?
> > > Please let me know if you need any other information to conclude.
> > >
> > > [1] https://lore.kernel.org/linux-pci/[email protected]/
> > > [2] https://lore.kernel.org/all/[email protected]/T/
> > > [3] https://lore.kernel.org/all/[email protected]/
> > >
> >
> > Gentle ping for the EDAC maintainers.
> >
> > - Mani
> >
> > > Thanks,
> > > Shradha
> > >
> > > > -----Original Message-----
> > > > From: 'Manivannan Sadhasivam' <[email protected]>
> > > > Sent: 16 February 2024 19:19
> > > > To: Shradha Todi <[email protected]>
> > > > Cc: [email protected]; [email protected]; [email protected];
> > > > [email protected]; [email protected];
> > > > [email protected]; [email protected];
> > > > [email protected]; [email protected];
> > > > [email protected]; [email protected]; linux-
> > > > [email protected]; [email protected]; [email protected]
> > > > Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> > > > controller
> > > >
> > > > On Thu, Feb 15, 2024 at 02:55:06PM +0530, Shradha Todi wrote:
> > > > >
> > > > >
> > > >
> > > > [...]
> > > >
> > > > > > For the error injection and counters, we already have the EDAC
> > > > > > framework. So adding them in the DWC driver doesn't make sense to me.
> > > > > >
> > > > >
> > > > > Sorry for late response, was going through the EDAC framework to understand
> > > > better how we can fit RAS DES support in it. Below are some technical challenges
> > > > found so far:
> > > > > 1: This debugfs framework proposed [1] can run on both side of the link i.e. RC
> > > > and EP as it will be a part of the link controller platform driver. Here for the EP
> > > > side the assumption is that it has Linux running, which is primarily a use case for
> > > > chip-to-chip communication. After your suggestion to migrate to EDAC
> > > > framework we studied and here are the findings:
> > > > > - If we move to EDAC framework, we need to have RAS DES as a
> > > > > pci_driver which will be binded based on vendor_id and device_id. Our
> > > > > observation is that on EP side system we are unable to bind two
> > > > > function driver (pci_driver), as pci_endpoint_test function driver or
> > > > > some other chip-to-chip function driver will already be bound. On the
> > > > > other hand, on RC side we observed that if we have portdrv enabled in
> > > > > Linux running on RC system, it gets bound to RC controller and then it
> > > > > does not allow EDAC pci_driver to bind. So basically we see a problem
> > > > > here, that we can't have two pci_driver binding to same PCI device
> > > > > 2: Another point is even though we use EDAC driver framework, we may not be
> > > > able to use any of EDAC framework APIs as they are mostly suitable for memory
> > > > controller devices sitting on PCI BUS. We will end up using debugfs entries just via
> > > > a pci_driver placed inside EDAC framework.
> > > >
> > > > Please wrap your replies to 80 characters.
> > > >
> > > > There is no need to bind the edac driver to VID:PID of the device. The edac driver
> > > > can be a platform driver and you can instantiate the platform device from the
> > > > DWC driver. This way, the PCI device can be assocaited with whatever driver, but
> > > > still there can be a separate edac driver for handling errors.
> > > >
> > > > Regarding API limitation, you should ask the maintainer about the possibility of
> > > > extending them.
> > > >
> > > > >
> > > > > Please let me know if my understanding is wrong.
> > > > >
> > > > > > But first check with the perf driver author if they have any plans
> > > > > > on adding the proposed functionality. If they do not have any plan
> > > > > > or not working on it, then look into EDAC.
> > > > > >
> > > > > > - Mani
> > > > > >
> > > > >
> > > > > Since we already worked and posted patches [1], [2], we will continue to work
> > > > on this and based on consent from community we will adopt to most suitable
> > > > framework.
> > > > > We see many subsystems like ethernet, usb, gpu, cxl having debugfs files that
> > > > give information about the current status of the running system and as of now
> > > > based on our findings, we still feel there is no harm in having debugfs entry based
> > > > support in DesignWare controller driver itself.
> > > >
> > > > There is no issue in exposing the debug information through debugfs, that's the
> > > > sole purpose of the interface. But here, you are trying to add support for DWC
> > > > RAS feature for which a dedicated framework already exists.
> > > >
> > > > And there will be more similar requests coming for vendor specific error protocols
> > > > as well. So your investigation could benefit everyone.
> > > >
> > > > From your above investigation, looks like there are some shortcomings of the
> > > > EDAC framework. So let's get that clarified by writing to the EDAC maintainers
> > > > (keep us in CC). If the EDAC maintainer suggests you to add support for this
> > > > feature in DWC driver itself citing some reasons, then no issues with me.
> > > >
> > > > - Mani
> > > >
> > > > --
> > > > மணிவண்ணன் சதாசிவம்
> > >
> > >
> >
>

--
மணிவண்ணன் சதாசிவம்

2024-03-22 11:37:34

by Shradha Todi

[permalink] [raw]
Subject: RE: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller



> -----Original Message-----
> From: 'Manivannan Sadhasivam' <[email protected]>
> Sent: 22 March 2024 16:10
> To: Jonathan Cameron <[email protected]>
> Cc: Shradha Todi <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]; robh@kernelorg;
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; Terry Bowman
> <[email protected]>
> Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> controller
>
> On Wed, Mar 20, 2024 at 10:01:44AM +0000, Jonathan Cameron wrote:
> > On Tue, 19 Mar 2024 22:03:15 +0530
> > 'Manivannan Sadhasivam' <[email protected]> wrote:
> >
> > > On Thu, Feb 22, 2024 at 04:30:47PM +0530, Shradha Todi wrote:
> > > > + Borislav, Tony, James, Mauro, Robert
> > > >
> > > > Hi All,
> > > >
> > > > Synopsys DesignWare PCIe controllers have a vendor specific
> > > > capability (which means that this set of registers are only
> > > > present in DesignWare controllers) to perform debug operations called
> "RASDES".
> > > > The functionalities provided by this extended capability are:
> > > >
> > > > 1. Debug: This has some debug related diagnostic features like
> > > > holding LTSSM in certain states, reading the status of lane
> > > > detection, checking if any PCIe lanes are broken (RX Valid) and so
> > > > on. It's a debug only feature used for diagnostic use-cases.
> > > >
> > > > 2. Error Injection: This is a way to inject certain errors in PCIe
> > > > like LCRC, ECRC, Bad TLPs and so on. Again, this is a debug
> > > > feature and generally not used in functional use-case.
> > > >
> > > > 3. Statistical counters: This has 3 parts
> > > > - Error counters
> > > > - Non error counters (covered as part of perf [1])
> > > > - Time based analysis counters (covered as part of perf [1])
> > > >
> > > > Selective features of the above functionality has been
> > > > implemented by vendor specific PCIe controller drivers
> > > > (pcie-tegra194.c) that use Synopsys DesignWare PCIe controllers.
> > > > In order to make it useful to all vendors using DWC controller, we
> > > > had proposed a common implementation in DWC PCIe controller
> > > > directory
> > > > (drivers/pci/controller/dwc/) and our original idea was based on
> > > > debugfs filesystem. v1 and v2 are mentioned in [2] and [3].
> > > >
> > > > We got a suggestion to implement this as part of EDAC framework
> > > > [3] and we looked into the same. But as far as I understood, what
> > > > I am trying to implement is a very specific feature (only valid for Synopsys
> DWC PCIe controllers).
> >
> > For error part there are (at least superficially) similar features in
> > the PCIe standard that we've started thinking about how to support.
> >
> > See Flit Logging Extended capablity (7.7.8 in PCIe Base Spec rev6.
> > That has the benefit that they are part of the standard so we can
> > support them directly in portdrv / EP drivers using some library code
> > in the PCI core.
> >
>
> Sounds good. But v6 is a relatively new version and the DWC RAS predates that.
> So we still need to support it somehow (either in EDAC or in
> drivers/pci/controller/dwc).
>

For all PCIe spec standard errors (All of those mentioned in 6.2 in PCIe Base Spec rev6 like AER,
FLIT logging, DPC, SR-IOV Baseline Error Handling), extending the EDAC / portdrv / EP drivers
seems like the best way. But this particular RAS-DES functionality exists only for Synopsys
DesignWare controller, hence I feel the *DWC controller driver* should extend this as it is a
controller specific feature. What would you suggest?
Unless there a plan to support vendor specific implementations through EDAC or
portdrv callbacks? Because writing a new pci driver would cause 2 driver binding problem.

> > There are other interconnect and PCI PMU drivers that log retries etc
> > which are also basically error counts. At least some of that is done through perf
> today.
> >
>
> IMO all the RAS support should be exposed through EDAC, otherwise it defeats
> the purpose of the subsystem.
>
> - Mani
>

Yes, I see that lot of non-error based counters like memory packet tx/rx received,
nak dllp received, etc is done as a part of perf today. But I believe, the functionality I want to
implement has features like error injection which would not be suitable to be a part of perf.

> >
> > > > This doesn't seem to fit in very well with the EDAC framework and
> > > > we can hardly use any of the EDAC framework APIs. We tried
> > > > implementing a "pci_driver" but since a function driver will
> > > > already be running on the EP and portdrv on the root-complex, we
> > > > will not be able to bind 2 drivers to a single PCI device
> > > > (root-complex or endpoint). Ultimately, what I will be doing is
> > > > writing a platform driver with debugfs entries which will be present in EDAC
> directory instead of DWC directory.
> >
> > The addition of this type of functionality to pordrv is a long running question.
> > Everyone wants a solution, I believe some people are looking at it
> > (+CC Terry)
> >
> > Terry, another case for your long list.
> >
> > For the EP end, this should be fired up by the EP driver, whilst it
> > might be infrastructure used on a bunch of devices, it is a feature
> > of that particular EP - so you'd want to provide any functionality in
> > a form that could be used by both the EP driver and a nice shiny new portdrv
> replacement.
> >
> > > >
> > > > Can you please help us out by going through this thread [3] and
> > > > letting us know if our understanding is wrong at any point. If you
> > > > think it is a better idea to integrate this in the EDAC framework,
> > > > can you guide me as to how I can utilize the framework better?
> > > > Please let me know if you need any other information to conclude.
> > > >
> > > > [1]
> > > > https://lore.kernel.org/linux-pci/20231121013400.18367-1-xueshuai@
> > > > linux.alibaba.com/ [2]
> > > > https://lore.kernel.org/all/20210518174618.42089-1-shradha.t@samsu
> > > > ng.com/T/ [3]
> > > > https://lore.kernel.org/all/20231130115044.53512-1-shradha.t@samsu
> > > > ng.com/
> > > >
> > >
> > > Gentle ping for the EDAC maintainers.
> > >
> > > - Mani
> > >
> > > > Thanks,
> > > > Shradha
> > > >
> > > > > -----Original Message-----
> > > > > From: 'Manivannan Sadhasivam' <[email protected]>
> > > > > Sent: 16 February 2024 19:19
> > > > > To: Shradha Todi <[email protected]>
> > > > > Cc: [email protected]; [email protected]; [email protected];
> > > > > [email protected]; [email protected];
> > > > > [email protected]; [email protected];
> > > > > [email protected]; [email protected];
> > > > > [email protected]; [email protected]; linux-
> > > > > [email protected]; [email protected]; [email protected]
> > > > > Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in
> > > > > PCIe DW controller
> > > > >
> > > > > On Thu, Feb 15, 2024 at 02:55:06PM +0530, Shradha Todi wrote:
> > > > > >
> > > > > >
> > > > >
> > > > > [...]
> > > > >
> > > > > > > For the error injection and counters, we already have the
> > > > > > > EDAC framework. So adding them in the DWC driver doesn't make
> sense to me.
> > > > > > >
> > > > > >
> > > > > > Sorry for late response, was going through the EDAC framework
> > > > > > to understand
> > > > > better how we can fit RAS DES support in it. Below are some
> > > > > technical challenges found so far:
> > > > > > 1: This debugfs framework proposed [1] can run on both side of
> > > > > > the link i.e. RC
> > > > > and EP as it will be a part of the link controller platform
> > > > > driver. Here for the EP side the assumption is that it has Linux
> > > > > running, which is primarily a use case for chip-to-chip
> > > > > communication. After your suggestion to migrate to EDAC framework we
> studied and here are the findings:
> > > > > > - If we move to EDAC framework, we need to have RAS DES as a
> > > > > > pci_driver which will be binded based on vendor_id and
> > > > > > device_id. Our observation is that on EP side system we are
> > > > > > unable to bind two function driver (pci_driver), as
> > > > > > pci_endpoint_test function driver or some other chip-to-chip
> > > > > > function driver will already be bound. On the other hand, on
> > > > > > RC side we observed that if we have portdrv enabled in Linux
> > > > > > running on RC system, it gets bound to RC controller and then
> > > > > > it does not allow EDAC pci_driver to bind. So basically we see
> > > > > > a problem here, that we can't have two pci_driver binding to
> > > > > > same PCI device
> > > > > > 2: Another point is even though we use EDAC driver framework,
> > > > > > we may not be
> > > > > able to use any of EDAC framework APIs as they are mostly
> > > > > suitable for memory controller devices sitting on PCI BUS. We
> > > > > will end up using debugfs entries just via a pci_driver placed inside EDAC
> framework.
> > > > >
> > > > > Please wrap your replies to 80 characters.
> > > > >
> > > > > There is no need to bind the edac driver to VID:PID of the
> > > > > device. The edac driver can be a platform driver and you can
> > > > > instantiate the platform device from the DWC driver. This way,
> > > > > the PCI device can be assocaited with whatever driver, but still there can
> be a separate edac driver for handling errors.
> > > > >
> > > > > Regarding API limitation, you should ask the maintainer about
> > > > > the possibility of extending them.
> > > > >
> > > > > >
> > > > > > Please let me know if my understanding is wrong.
> > > > > >
> > > > > > > But first check with the perf driver author if they have any
> > > > > > > plans on adding the proposed functionality. If they do not
> > > > > > > have any plan or not working on it, then look into EDAC.
> > > > > > >
> > > > > > > - Mani
> > > > > > >
> > > > > >
> > > > > > Since we already worked and posted patches [1], [2], we will
> > > > > > continue to work
> > > > > on this and based on consent from community we will adopt to
> > > > > most suitable framework.
> > > > > > We see many subsystems like ethernet, usb, gpu, cxl having
> > > > > > debugfs files that
> > > > > give information about the current status of the running system
> > > > > and as of now based on our findings, we still feel there is no
> > > > > harm in having debugfs entry based support in DesignWare controller
> driver itself.
> > > > >
> > > > > There is no issue in exposing the debug information through
> > > > > debugfs, that's the sole purpose of the interface. But here, you
> > > > > are trying to add support for DWC RAS feature for which a dedicated
> framework already exists.
> > > > >
> > > > > And there will be more similar requests coming for vendor
> > > > > specific error protocols as well. So your investigation could benefit
> everyone.
> > > > >
> > > > > From your above investigation, looks like there are some
> > > > > shortcomings of the EDAC framework. So let's get that clarified
> > > > > by writing to the EDAC maintainers (keep us in CC). If the EDAC
> > > > > maintainer suggests you to add support for this feature in DWC driver
> itself citing some reasons, then no issues with me.
> > > > >
> > > > > - Mani
> > > > >
> > > > > --
> > > > > மணிவண்ணன் சதாசிவம்
> > > >
> > > >
> > >
> >
>
> --
> மணிவண்ணன் சதாசிவம்



2024-03-22 15:31:29

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

On Fri, Mar 22, 2024 at 12:58:17PM +0000, Jonathan Cameron wrote:
> On Fri, 22 Mar 2024 16:09:35 +0530
> 'Manivannan Sadhasivam' <[email protected]> wrote:
>
> > On Wed, Mar 20, 2024 at 10:01:44AM +0000, Jonathan Cameron wrote:
> > > On Tue, 19 Mar 2024 22:03:15 +0530
> > > 'Manivannan Sadhasivam' <[email protected]> wrote:
> > >
> > > > On Thu, Feb 22, 2024 at 04:30:47PM +0530, Shradha Todi wrote:
> > > > > + Borislav, Tony, James, Mauro, Robert
> > > > >
> > > > > Hi All,
> > > > >
> > > > > Synopsys DesignWare PCIe controllers have a vendor specific capability (which
> > > > > means that this set of registers are only present in DesignWare controllers)
> > > > > to perform debug operations called "RASDES".
> > > > > The functionalities provided by this extended capability are:
> > > > >
> > > > > 1. Debug: This has some debug related diagnostic features like holding LTSSM
> > > > > in certain states, reading the status of lane detection, checking if any PCIe
> > > > > lanes are broken (RX Valid) and so on. It's a debug only feature used for diagnostic
> > > > > use-cases.
> > > > >
> > > > > 2. Error Injection: This is a way to inject certain errors in PCIe like LCRC, ECRC,
> > > > > Bad TLPs and so on. Again, this is a debug feature and generally not used in
> > > > > functional use-case.
> > > > >
> > > > > 3. Statistical counters: This has 3 parts
> > > > > - Error counters
> > > > > - Non error counters (covered as part of perf [1])
> > > > > - Time based analysis counters (covered as part of perf [1])
> > > > >
> > > > > Selective features of the above functionality has been implemented
> > > > > by vendor specific PCIe controller drivers (pcie-tegra194.c) that use
> > > > > Synopsys DesignWare PCIe controllers.
> > > > > In order to make it useful to all vendors using DWC controller, we had
> > > > > proposed a common implementation in DWC PCIe controller directory
> > > > > (drivers/pci/controller/dwc/) and our original idea was based on debugfs
> > > > > filesystem. v1 and v2 are mentioned in [2] and [3].
> > > > >
> > > > > We got a suggestion to implement this as part of EDAC framework [3] and
> > > > > we looked into the same. But as far as I understood, what I am trying to
> > > > > implement is a very specific feature (only valid for Synopsys DWC PCIe controllers).
> > >
> > > For error part there are (at least superficially) similar features in the PCIe
> > > standard that we've started thinking about how to support.
> > >
> > > See Flit Logging Extended capablity (7.7.8 in PCIe Base Spec rev6.
> > > That has the benefit that they are part of the standard so we can
> > > support them directly in portdrv / EP drivers using some library code in the
> > > PCI core.
> > >
> >
> > Sounds good. But v6 is a relatively new version and the DWC RAS predates that.
> > So we still need to support it somehow (either in EDAC or in
> > drivers/pci/controller/dwc).
> >
> > > There are other interconnect and PCI PMU drivers that log retries etc which are also basically error
> > > counts. At least some of that is done through perf today.
> > >
> >
> > IMO all the RAS support should be exposed through EDAC, otherwise it defeats the
> > purpose of the subsystem.
>
> Some RAS flows go nowhere near EDAC today. Individual drivers poke out
> the tracepoints and userspace tooling deals with it. One example is
> CXL but there are plenty of others.
>

That's the problem. You'd end up with custom tooling for each individual
drivers leading to userspace fragmentation.

- Mani

> Perhaps that should not be the case, but there is definitely
> precedence for ignoring edac when considering error flows.
>
> Jonathan
>
> >
> > - Mani
> >
> > >
> > > > > This doesn't seem to fit in very well with the EDAC framework and we can
> > > > > hardly use any of the EDAC framework APIs. We tried implementing a
> > > > > "pci_driver" but since a function driver will already be running on the EP and
> > > > > portdrv on the root-complex, we will not be able to bind 2 drivers to a single
> > > > > PCI device (root-complex or endpoint). Ultimately, what I will be doing is
> > > > > writing a platform driver with debugfs entries which will be present in EDAC
> > > > > directory instead of DWC directory.
> > >
> > > The addition of this type of functionality to pordrv is a long running question.
> > > Everyone wants a solution, I believe some people are looking at it (+CC Terry)
> > >
> > > Terry, another case for your long list.
> > >
> > > For the EP end, this should be fired up by the EP driver, whilst it might be
> > > infrastructure used on a bunch of devices, it is a feature of that particular
> > > EP - so you'd want to provide any functionality in a form that could be used
> > > by both the EP driver and a nice shiny new portdrv replacement.
> > >
> > > > >
> > > > > Can you please help us out by going through this thread [3] and letting us
> > > > > know if our understanding is wrong at any point. If you think it is a better
> > > > > idea to integrate this in the EDAC framework, can you guide me as
> > > > > to how I can utilize the framework better?
> > > > > Please let me know if you need any other information to conclude.
> > > > >
> > > > > [1] https://lore.kernel.org/linux-pci/[email protected]/
> > > > > [2] https://lore.kernel.org/all/[email protected]/T/
> > > > > [3] https://lore.kernel.org/all/[email protected]/
> > > > >
> > > >
> > > > Gentle ping for the EDAC maintainers.
> > > >
> > > > - Mani
> > > >
> > > > > Thanks,
> > > > > Shradha
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: 'Manivannan Sadhasivam' <[email protected]>
> > > > > > Sent: 16 February 2024 19:19
> > > > > > To: Shradha Todi <[email protected]>
> > > > > > Cc: [email protected]; [email protected]; [email protected];
> > > > > > [email protected]; [email protected];
> > > > > > [email protected]; [email protected];
> > > > > > [email protected]; [email protected];
> > > > > > [email protected]; [email protected]; linux-
> > > > > > [email protected]; [email protected]; [email protected]
> > > > > > Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> > > > > > controller
> > > > > >
> > > > > > On Thu, Feb 15, 2024 at 02:55:06PM +0530, Shradha Todi wrote:
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > > > For the error injection and counters, we already have the EDAC
> > > > > > > > framework. So adding them in the DWC driver doesn't make sense to me.
> > > > > > > >
> > > > > > >
> > > > > > > Sorry for late response, was going through the EDAC framework to understand
> > > > > > better how we can fit RAS DES support in it. Below are some technical challenges
> > > > > > found so far:
> > > > > > > 1: This debugfs framework proposed [1] can run on both side of the link i.e. RC
> > > > > > and EP as it will be a part of the link controller platform driver. Here for the EP
> > > > > > side the assumption is that it has Linux running, which is primarily a use case for
> > > > > > chip-to-chip communication. After your suggestion to migrate to EDAC
> > > > > > framework we studied and here are the findings:
> > > > > > > - If we move to EDAC framework, we need to have RAS DES as a
> > > > > > > pci_driver which will be binded based on vendor_id and device_id. Our
> > > > > > > observation is that on EP side system we are unable to bind two
> > > > > > > function driver (pci_driver), as pci_endpoint_test function driver or
> > > > > > > some other chip-to-chip function driver will already be bound. On the
> > > > > > > other hand, on RC side we observed that if we have portdrv enabled in
> > > > > > > Linux running on RC system, it gets bound to RC controller and then it
> > > > > > > does not allow EDAC pci_driver to bind. So basically we see a problem
> > > > > > > here, that we can't have two pci_driver binding to same PCI device
> > > > > > > 2: Another point is even though we use EDAC driver framework, we may not be
> > > > > > able to use any of EDAC framework APIs as they are mostly suitable for memory
> > > > > > controller devices sitting on PCI BUS. We will end up using debugfs entries just via
> > > > > > a pci_driver placed inside EDAC framework.
> > > > > >
> > > > > > Please wrap your replies to 80 characters.
> > > > > >
> > > > > > There is no need to bind the edac driver to VID:PID of the device. The edac driver
> > > > > > can be a platform driver and you can instantiate the platform device from the
> > > > > > DWC driver. This way, the PCI device can be assocaited with whatever driver, but
> > > > > > still there can be a separate edac driver for handling errors.
> > > > > >
> > > > > > Regarding API limitation, you should ask the maintainer about the possibility of
> > > > > > extending them.
> > > > > >
> > > > > > >
> > > > > > > Please let me know if my understanding is wrong.
> > > > > > >
> > > > > > > > But first check with the perf driver author if they have any plans
> > > > > > > > on adding the proposed functionality. If they do not have any plan
> > > > > > > > or not working on it, then look into EDAC.
> > > > > > > >
> > > > > > > > - Mani
> > > > > > > >
> > > > > > >
> > > > > > > Since we already worked and posted patches [1], [2], we will continue to work
> > > > > > on this and based on consent from community we will adopt to most suitable
> > > > > > framework.
> > > > > > > We see many subsystems like ethernet, usb, gpu, cxl having debugfs files that
> > > > > > give information about the current status of the running system and as of now
> > > > > > based on our findings, we still feel there is no harm in having debugfs entry based
> > > > > > support in DesignWare controller driver itself.
> > > > > >
> > > > > > There is no issue in exposing the debug information through debugfs, that's the
> > > > > > sole purpose of the interface. But here, you are trying to add support for DWC
> > > > > > RAS feature for which a dedicated framework already exists.
> > > > > >
> > > > > > And there will be more similar requests coming for vendor specific error protocols
> > > > > > as well. So your investigation could benefit everyone.
> > > > > >
> > > > > > From your above investigation, looks like there are some shortcomings of the
> > > > > > EDAC framework. So let's get that clarified by writing to the EDAC maintainers
> > > > > > (keep us in CC). If the EDAC maintainer suggests you to add support for this
> > > > > > feature in DWC driver itself citing some reasons, then no issues with me.
> > > > > >
> > > > > > - Mani
> > > > > >
> > > > > > --
> > > > > > மணிவண்ணன் சதாசிவம்
> > > > >
> > > > >
> > > >
> > >
> >
>

--
மணிவண்ணன் சதாசிவம்

2024-03-22 12:58:36

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

On Fri, 22 Mar 2024 16:09:35 +0530
'Manivannan Sadhasivam' <[email protected]> wrote:

> On Wed, Mar 20, 2024 at 10:01:44AM +0000, Jonathan Cameron wrote:
> > On Tue, 19 Mar 2024 22:03:15 +0530
> > 'Manivannan Sadhasivam' <[email protected]> wrote:
> >
> > > On Thu, Feb 22, 2024 at 04:30:47PM +0530, Shradha Todi wrote:
> > > > + Borislav, Tony, James, Mauro, Robert
> > > >
> > > > Hi All,
> > > >
> > > > Synopsys DesignWare PCIe controllers have a vendor specific capability (which
> > > > means that this set of registers are only present in DesignWare controllers)
> > > > to perform debug operations called "RASDES".
> > > > The functionalities provided by this extended capability are:
> > > >
> > > > 1. Debug: This has some debug related diagnostic features like holding LTSSM
> > > > in certain states, reading the status of lane detection, checking if any PCIe
> > > > lanes are broken (RX Valid) and so on. It's a debug only feature used for diagnostic
> > > > use-cases.
> > > >
> > > > 2. Error Injection: This is a way to inject certain errors in PCIe like LCRC, ECRC,
> > > > Bad TLPs and so on. Again, this is a debug feature and generally not used in
> > > > functional use-case.
> > > >
> > > > 3. Statistical counters: This has 3 parts
> > > > - Error counters
> > > > - Non error counters (covered as part of perf [1])
> > > > - Time based analysis counters (covered as part of perf [1])
> > > >
> > > > Selective features of the above functionality has been implemented
> > > > by vendor specific PCIe controller drivers (pcie-tegra194.c) that use
> > > > Synopsys DesignWare PCIe controllers.
> > > > In order to make it useful to all vendors using DWC controller, we had
> > > > proposed a common implementation in DWC PCIe controller directory
> > > > (drivers/pci/controller/dwc/) and our original idea was based on debugfs
> > > > filesystem. v1 and v2 are mentioned in [2] and [3].
> > > >
> > > > We got a suggestion to implement this as part of EDAC framework [3] and
> > > > we looked into the same. But as far as I understood, what I am trying to
> > > > implement is a very specific feature (only valid for Synopsys DWC PCIe controllers).
> >
> > For error part there are (at least superficially) similar features in the PCIe
> > standard that we've started thinking about how to support.
> >
> > See Flit Logging Extended capablity (7.7.8 in PCIe Base Spec rev6.
> > That has the benefit that they are part of the standard so we can
> > support them directly in portdrv / EP drivers using some library code in the
> > PCI core.
> >
>
> Sounds good. But v6 is a relatively new version and the DWC RAS predates that.
> So we still need to support it somehow (either in EDAC or in
> drivers/pci/controller/dwc).
>
> > There are other interconnect and PCI PMU drivers that log retries etc which are also basically error
> > counts. At least some of that is done through perf today.
> >
>
> IMO all the RAS support should be exposed through EDAC, otherwise it defeats the
> purpose of the subsystem.

Some RAS flows go nowhere near EDAC today. Individual drivers poke out
the tracepoints and userspace tooling deals with it. One example is
CXL but there are plenty of others.

Perhaps that should not be the case, but there is definitely
precedence for ignoring edac when considering error flows.

Jonathan

>
> - Mani
>
> >
> > > > This doesn't seem to fit in very well with the EDAC framework and we can
> > > > hardly use any of the EDAC framework APIs. We tried implementing a
> > > > "pci_driver" but since a function driver will already be running on the EP and
> > > > portdrv on the root-complex, we will not be able to bind 2 drivers to a single
> > > > PCI device (root-complex or endpoint). Ultimately, what I will be doing is
> > > > writing a platform driver with debugfs entries which will be present in EDAC
> > > > directory instead of DWC directory.
> >
> > The addition of this type of functionality to pordrv is a long running question.
> > Everyone wants a solution, I believe some people are looking at it (+CC Terry)
> >
> > Terry, another case for your long list.
> >
> > For the EP end, this should be fired up by the EP driver, whilst it might be
> > infrastructure used on a bunch of devices, it is a feature of that particular
> > EP - so you'd want to provide any functionality in a form that could be used
> > by both the EP driver and a nice shiny new portdrv replacement.
> >
> > > >
> > > > Can you please help us out by going through this thread [3] and letting us
> > > > know if our understanding is wrong at any point. If you think it is a better
> > > > idea to integrate this in the EDAC framework, can you guide me as
> > > > to how I can utilize the framework better?
> > > > Please let me know if you need any other information to conclude.
> > > >
> > > > [1] https://lore.kernel.org/linux-pci/[email protected]/
> > > > [2] https://lore.kernel.org/all/[email protected]/T/
> > > > [3] https://lore.kernel.org/all/[email protected]/
> > > >
> > >
> > > Gentle ping for the EDAC maintainers.
> > >
> > > - Mani
> > >
> > > > Thanks,
> > > > Shradha
> > > >
> > > > > -----Original Message-----
> > > > > From: 'Manivannan Sadhasivam' <[email protected]>
> > > > > Sent: 16 February 2024 19:19
> > > > > To: Shradha Todi <[email protected]>
> > > > > Cc: [email protected]; [email protected]; [email protected];
> > > > > [email protected]; [email protected];
> > > > > [email protected]; [email protected];
> > > > > [email protected]; [email protected];
> > > > > [email protected]; [email protected]; linux-
> > > > > [email protected]; [email protected]; [email protected]
> > > > > Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> > > > > controller
> > > > >
> > > > > On Thu, Feb 15, 2024 at 02:55:06PM +0530, Shradha Todi wrote:
> > > > > >
> > > > > >
> > > > >
> > > > > [...]
> > > > >
> > > > > > > For the error injection and counters, we already have the EDAC
> > > > > > > framework. So adding them in the DWC driver doesn't make sense to me.
> > > > > > >
> > > > > >
> > > > > > Sorry for late response, was going through the EDAC framework to understand
> > > > > better how we can fit RAS DES support in it. Below are some technical challenges
> > > > > found so far:
> > > > > > 1: This debugfs framework proposed [1] can run on both side of the link i.e. RC
> > > > > and EP as it will be a part of the link controller platform driver. Here for the EP
> > > > > side the assumption is that it has Linux running, which is primarily a use case for
> > > > > chip-to-chip communication. After your suggestion to migrate to EDAC
> > > > > framework we studied and here are the findings:
> > > > > > - If we move to EDAC framework, we need to have RAS DES as a
> > > > > > pci_driver which will be binded based on vendor_id and device_id. Our
> > > > > > observation is that on EP side system we are unable to bind two
> > > > > > function driver (pci_driver), as pci_endpoint_test function driver or
> > > > > > some other chip-to-chip function driver will already be bound. On the
> > > > > > other hand, on RC side we observed that if we have portdrv enabled in
> > > > > > Linux running on RC system, it gets bound to RC controller and then it
> > > > > > does not allow EDAC pci_driver to bind. So basically we see a problem
> > > > > > here, that we can't have two pci_driver binding to same PCI device
> > > > > > 2: Another point is even though we use EDAC driver framework, we may not be
> > > > > able to use any of EDAC framework APIs as they are mostly suitable for memory
> > > > > controller devices sitting on PCI BUS. We will end up using debugfs entries just via
> > > > > a pci_driver placed inside EDAC framework.
> > > > >
> > > > > Please wrap your replies to 80 characters.
> > > > >
> > > > > There is no need to bind the edac driver to VID:PID of the device The edac driver
> > > > > can be a platform driver and you can instantiate the platform device from the
> > > > > DWC driver. This way, the PCI device can be assocaited with whatever driver, but
> > > > > still there can be a separate edac driver for handling errors.
> > > > >
> > > > > Regarding API limitation, you should ask the maintainer about the possibility of
> > > > > extending them.
> > > > >
> > > > > >
> > > > > > Please let me know if my understanding is wrong.
> > > > > >
> > > > > > > But first check with the perf driver author if they have any plans
> > > > > > > on adding the proposed functionality. If they do not have any plan
> > > > > > > or not working on it, then look into EDAC.
> > > > > > >
> > > > > > > - Mani
> > > > > > >
> > > > > >
> > > > > > Since we already worked and posted patches [1], [2], we will continue to work
> > > > > on this and based on consent from community we will adopt to most suitable
> > > > > framework.
> > > > > > We see many subsystems like ethernet, usb, gpu, cxl having debugfs files that
> > > > > give information about the current status of the running system and as of now
> > > > > based on our findings, we still feel there is no harm in having debugfs entry based
> > > > > support in DesignWare controller driver itself.
> > > > >
> > > > > There is no issue in exposing the debug information through debugfs, that's the
> > > > > sole purpose of the interface. But here, you are trying to add support for DWC
> > > > > RAS feature for which a dedicated framework already exists.
> > > > >
> > > > > And there will be more similar requests coming for vendor specific error protocols
> > > > > as well. So your investigation could benefit everyone.
> > > > >
> > > > > From your above investigation, looks like there are some shortcomings of the
> > > > > EDAC framework. So let's get that clarified by writing to the EDAC maintainers
> > > > > (keep us in CC). If the EDAC maintainer suggests you to add support for this
> > > > > feature in DWC driver itself citing some reasons, then no issues with me.
> > > > >
> > > > > - Mani
> > > > >
> > > > > --
> > > > > மணிவண்ணன் சதாசிவம்
> > > >
> > > >
> > >
> >
>


2024-04-24 15:34:18

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

On Tue, Mar 19, 2024 at 10:03:15PM +0530, 'Manivannan Sadhasivam' wrote:
> On Thu, Feb 22, 2024 at 04:30:47PM +0530, Shradha Todi wrote:
> > + Borislav, Tony, James, Mauro, Robert
> >
> > Hi All,
> >
> > Synopsys DesignWare PCIe controllers have a vendor specific capability (which
> > means that this set of registers are only present in DesignWare controllers)
> > to perform debug operations called "RASDES".
> > The functionalities provided by this extended capability are:
> >
> > 1. Debug: This has some debug related diagnostic features like holding LTSSM
> > in certain states, reading the status of lane detection, checking if any PCIe
> > lanes are broken (RX Valid) and so on. It's a debug only feature used for diagnostic
> > use-cases.
> >
> > 2. Error Injection: This is a way to inject certain errors in PCIe like LCRC, ECRC,
> > Bad TLPs and so on. Again, this is a debug feature and generally not used in
> > functional use-case.
> >
> > 3. Statistical counters: This has 3 parts
> > - Error counters
> > - Non error counters (covered as part of perf [1])
> > - Time based analysis counters (covered as part of perf [1])
> >
> > Selective features of the above functionality has been implemented
> > by vendor specific PCIe controller drivers (pcie-tegra194.c) that use
> > Synopsys DesignWare PCIe controllers.
> > In order to make it useful to all vendors using DWC controller, we had
> > proposed a common implementation in DWC PCIe controller directory
> > (drivers/pci/controller/dwc/) and our original idea was based on debugfs
> > filesystem. v1 and v2 are mentioned in [2] and [3].
> >
> > We got a suggestion to implement this as part of EDAC framework [3] and
> > we looked into the same. But as far as I understood, what I am trying to
> > implement is a very specific feature (only valid for Synopsys DWC PCIe controllers).
> > This doesn't seem to fit in very well with the EDAC framework and we can
> > hardly use any of the EDAC framework APIs. We tried implementing a
> > "pci_driver" but since a function driver will already be running on the EP and
> > portdrv on the root-complex, we will not be able to bind 2 drivers to a single
> > PCI device (root-complex or endpoint). Ultimately, what I will be doing is
> > writing a platform driver with debugfs entries which will be present in EDAC
> > directory instead of DWC directory.
> >
> > Can you please help us out by going through this thread [3] and letting us
> > know if our understanding is wrong at any point. If you think it is a better
> > idea to integrate this in the EDAC framework, can you guide me as
> > to how I can utilize the framework better?
> > Please let me know if you need any other information to conclude.
> >
> > [1] https://lore.kernel.org/linux-pci/[email protected]/
> > [2] https://lore.kernel.org/all/[email protected]/T/
> > [3] https://lore.kernel.org/all/[email protected]/
> >
>
> Gentle ping for the EDAC maintainers.
>

Since there is no response from the EDAC maintainers for a while, I think we
should go ahead and add the RAS feature in the PCI DWC driver itself.

- Mani

> - Mani
>
> > Thanks,
> > Shradha
> >
> > > -----Original Message-----
> > > From: 'Manivannan Sadhasivam' <[email protected]>
> > > Sent: 16 February 2024 19:19
> > > To: Shradha Todi <[email protected]>
> > > Cc: [email protected]; [email protected]; [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; [email protected]; linux-
> > > [email protected]; [email protected]; [email protected]
> > > Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> > > controller
> > >
> > > On Thu, Feb 15, 2024 at 02:55:06PM +0530, Shradha Todi wrote:
> > > >
> > > >
> > >
> > > [...]
> > >
> > > > > For the error injection and counters, we already have the EDAC
> > > > > framework. So adding them in the DWC driver doesn't make sense to me.
> > > > >
> > > >
> > > > Sorry for late response, was going through the EDAC framework to understand
> > > better how we can fit RAS DES support in it. Below are some technical challenges
> > > found so far:
> > > > 1: This debugfs framework proposed [1] can run on both side of the link i.e. RC
> > > and EP as it will be a part of the link controller platform driver. Here for the EP
> > > side the assumption is that it has Linux running, which is primarily a use case for
> > > chip-to-chip communication. After your suggestion to migrate to EDAC
> > > framework we studied and here are the findings:
> > > > - If we move to EDAC framework, we need to have RAS DES as a
> > > > pci_driver which will be binded based on vendor_id and device_id. Our
> > > > observation is that on EP side system we are unable to bind two
> > > > function driver (pci_driver), as pci_endpoint_test function driver or
> > > > some other chip-to-chip function driver will already be bound. On the
> > > > other hand, on RC side we observed that if we have portdrv enabled in
> > > > Linux running on RC system, it gets bound to RC controller and then it
> > > > does not allow EDAC pci_driver to bind. So basically we see a problem
> > > > here, that we can't have two pci_driver binding to same PCI device
> > > > 2: Another point is even though we use EDAC driver framework, we may not be
> > > able to use any of EDAC framework APIs as they are mostly suitable for memory
> > > controller devices sitting on PCI BUS. We will end up using debugfs entries just via
> > > a pci_driver placed inside EDAC framework.
> > >
> > > Please wrap your replies to 80 characters.
> > >
> > > There is no need to bind the edac driver to VID:PID of the device. The edac driver
> > > can be a platform driver and you can instantiate the platform device from the
> > > DWC driver. This way, the PCI device can be assocaited with whatever driver, but
> > > still there can be a separate edac driver for handling errors.
> > >
> > > Regarding API limitation, you should ask the maintainer about the possibility of
> > > extending them.
> > >
> > > >
> > > > Please let me know if my understanding is wrong.
> > > >
> > > > > But first check with the perf driver author if they have any plans
> > > > > on adding the proposed functionality. If they do not have any plan
> > > > > or not working on it, then look into EDAC.
> > > > >
> > > > > - Mani
> > > > >
> > > >
> > > > Since we already worked and posted patches [1], [2], we will continue to work
> > > on this and based on consent from community we will adopt to most suitable
> > > framework.
> > > > We see many subsystems like ethernet, usb, gpu, cxl having debugfs files that
> > > give information about the current status of the running system and as of now
> > > based on our findings, we still feel there is no harm in having debugfs entry based
> > > support in DesignWare controller driver itself.
> > >
> > > There is no issue in exposing the debug information through debugfs, that's the
> > > sole purpose of the interface. But here, you are trying to add support for DWC
> > > RAS feature for which a dedicated framework already exists.
> > >
> > > And there will be more similar requests coming for vendor specific error protocols
> > > as well. So your investigation could benefit everyone.
> > >
> > > From your above investigation, looks like there are some shortcomings of the
> > > EDAC framework. So let's get that clarified by writing to the EDAC maintainers
> > > (keep us in CC). If the EDAC maintainer suggests you to add support for this
> > > feature in DWC driver itself citing some reasons, then no issues with me.
> > >
> > > - Mani
> > >
> > > --
> > > மணிவண்ணன் சதாசிவம்
> >
> >
>
> --
> மணிவண்ணன் சதாசிவம்

--
மணிவண்ணன் சதாசிவம்