2021-06-24 14:41:31

by Guangbin Huang

Subject: [PATCH net-next 3/3] net: hns3: add support for link diagnosis info in debugfs

To help diagnose why a link is down, add a debugfs file
"link_diagnosis_info" that reports link faults queried from firmware.
Each bit of the firmware response represents one kind of fault.

Usage example:
$ cat link_diagnosis_info
Reference clock lost
SFP is absent

Signed-off-by: Guangbin Huang <[email protected]>
---
drivers/net/ethernet/hisilicon/hns3/hnae3.h | 1 +
drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c | 8 +++
.../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h | 3 ++
.../ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c | 58 ++++++++++++++++++++++
4 files changed, 70 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index e0b7c3c44e7b..34e5eb65c7b1 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -294,6 +294,7 @@ enum hnae3_dbg_cmd {
	HNAE3_DBG_CMD_MAC_TNL_STATUS,
	HNAE3_DBG_CMD_SERV_INFO,
	HNAE3_DBG_CMD_UMV_INFO,
+	HNAE3_DBG_CMD_LINK_DIAGNOSIS_INFO,
	HNAE3_DBG_CMD_UNKNOWN,
};

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
index 532523069d74..6a0385b5f80a 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
@@ -337,6 +337,14 @@ static struct hns3_dbg_cmd_info hns3_dbg_cmd[] = {
		.buf_len = HNS3_DBG_READ_LEN,
		.init = hns3_dbg_common_file_init,
	},
+	{
+		.name = "link_diagnosis_info",
+		.cmd = HNAE3_DBG_CMD_LINK_DIAGNOSIS_INFO,
+		.dentry = HNS3_DBG_DENTRY_COMMON,
+		.buf_len = HNS3_DBG_READ_LEN,
+		.init = hns3_dbg_common_file_init,
+	},
+
};

static struct hns3_dbg_cap_info hns3_dbg_cap[] = {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
index 18bde77ef944..8e5be127909b 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
@@ -316,6 +316,9 @@ enum hclge_opcode_type {
	/* PHY command */
	HCLGE_OPC_PHY_LINK_KSETTING = 0x7025,
	HCLGE_OPC_PHY_REG = 0x7026,
+
+	/* Query link diagnosis info command */
+	HCLGE_OPC_QUERY_LINK_DIAGNOSIS = 0x702A,
};

#define HCLGE_TQP_REG_OFFSET 0x80000
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
index 288788186ecc..e6d8d070711d 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
@@ -2301,6 +2301,60 @@ static int hclge_dbg_dump_mac_mc(struct hclge_dev *hdev, char *buf, int len)
	return 0;
}

+/* The order of each reason is defined by firmware, so don't change the order */
+static const char * const link_down_reason[] = {
+	"Reference clock lost",
+	"SFP tx is disabled",
+	"SFP is absent",
+	"PHY power down",
+	"Serdes analog loss of signal",
+	"Auto negotiation failed",
+	"Link training failed",
+	"Remote fault",
+	"I2C bus error",
+	"BER is too high",
+};
+
+static int hclge_dbg_dump_link_diagnosis_info(struct hclge_dev *hdev, char *buf,
+					      int len)
+{
+#define HCLGE_DBG_BIT_LEN_PER_WORD 32
+
+	u16 word_index, bit_index, i;
+	struct hclge_desc desc;
+	int pos = 0;
+	u32 data;
+	int ret;
+
+	if (hdev->ae_dev->dev_version <= HNAE3_DEVICE_VERSION_V2) {
+		dev_err(&hdev->pdev->dev, "Operation not supported\n");
+		return -EOPNOTSUPP;
+	}
+
+	hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_QUERY_LINK_DIAGNOSIS, true);
+	ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+	if (ret) {
+		dev_err(&hdev->pdev->dev,
+			"failed to query link diagnosis info, ret = %d\n", ret);
+		return ret;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(link_down_reason); i++) {
+		word_index = i / HCLGE_DBG_BIT_LEN_PER_WORD;
+		bit_index = i % HCLGE_DBG_BIT_LEN_PER_WORD;
+
+		data = le32_to_cpu(desc.data[word_index]);
+		if (hnae3_get_bit(data, bit_index))
+			pos += scnprintf(buf + pos, len - pos, "%s\n",
+					 link_down_reason[i]);
+	}
+
+	if (!pos)
+		pos += scnprintf(buf + pos, len - pos, "No error\n");
+
+	return 0;
+}
+
static const struct hclge_dbg_func hclge_dbg_cmd_func[] = {
	{
		.cmd = HNAE3_DBG_CMD_TM_NODES,
@@ -2446,6 +2500,10 @@ static const struct hclge_dbg_func hclge_dbg_cmd_func[] = {
		.cmd = HNAE3_DBG_CMD_UMV_INFO,
		.dbg_dump = hclge_dbg_dump_umv_info,
	},
+	{
+		.cmd = HNAE3_DBG_CMD_LINK_DIAGNOSIS_INFO,
+		.dbg_dump = hclge_dbg_dump_link_diagnosis_info,
+	},
};

int hclge_dbg_read_cmd(struct hnae3_handle *handle, enum hnae3_dbg_cmd cmd,
--
2.8.1
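
Editor's sketch: the bit-per-fault decoding done by hclge_dbg_dump_link_diagnosis_info() can be tried outside the kernel. The userspace version below is only an illustration under stated assumptions. The function name decode_link_faults() and the macros BITS_PER_WORD and ARRAY_LEN are hypothetical and not part of the patch, the bitmap is assumed to be already in host byte order (the driver converts with le32_to_cpu()), and snprintf() with a clamp stands in for the kernel's scnprintf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define BITS_PER_WORD 32
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Same strings, in the same firmware-defined order, as in the patch. */
static const char *const link_down_reason[] = {
	"Reference clock lost",
	"SFP tx is disabled",
	"SFP is absent",
	"PHY power down",
	"Serdes analog loss of signal",
	"Auto negotiation failed",
	"Link training failed",
	"Remote fault",
	"I2C bus error",
	"BER is too high",
};

/*
 * Hypothetical helper (not in the patch): write one line per set fault bit.
 * words:  fault bitmap, assumed to be in host byte order
 * nwords: number of 32-bit words in the bitmap
 * Returns the number of fault bits found; buf is always NUL-terminated.
 */
static int decode_link_faults(const uint32_t *words, size_t nwords,
			      char *buf, size_t len)
{
	size_t pos = 0;
	int faults = 0;

	assert(words != NULL && buf != NULL && len > 0);
	buf[0] = '\0';

	for (size_t i = 0; i < ARRAY_LEN(link_down_reason); i++) {
		size_t word = i / BITS_PER_WORD;
		unsigned int bit = i % BITS_PER_WORD;

		if (word >= nwords)
			break;
		if (!(words[word] & (UINT32_C(1) << bit)))
			continue;

		faults++;
		if (pos < len - 1) {
			int n = snprintf(buf + pos, len - pos, "%s\n",
					 link_down_reason[i]);

			/* snprintf() reports the untruncated length, so clamp it. */
			pos += ((size_t)n < len - pos) ? (size_t)n : len - pos - 1;
		}
	}

	if (!faults)
		snprintf(buf, len, "No error\n");

	return faults;
}
```

With bits 0 and 2 set, the buffer holds "Reference clock lost" and "SFP is absent" on separate lines, matching the usage example in the commit message. With no bits set it holds "No error".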


2021-06-24 19:28:27

by Jakub Kicinski

Subject: Re: [PATCH net-next 3/3] net: hns3: add support for link diagnosis info in debugfs

On Thu, 24 Jun 2021 22:36:45 +0800 Guangbin Huang wrote:
> In order to know reason why link down, add a debugfs file
> "link_diagnosis_info" to get link faults from firmware, and each bit
> represents one kind of fault.
>
> usage example:
> $ cat link_diagnosis_info
> Reference clock lost

Please use ethtool->get_link_ext_state instead.

2021-06-25 07:09:41

by Guangbin Huang

Subject: Re: [PATCH net-next 3/3] net: hns3: add support for link diagnosis info in debugfs



On 2021/6/25 3:25, Jakub Kicinski wrote:
> On Thu, 24 Jun 2021 22:36:45 +0800 Guangbin Huang wrote:
>> In order to know reason why link down, add a debugfs file
>> "link_diagnosis_info" to get link faults from firmware, and each bit
>> represents one kind of fault.
>>
>> usage example:
>> $ cat link_diagnosis_info
>> Reference clock lost
>
> Please use ethtool->get_link_ext_state instead.
> .
>
Ok, thanks.

2021-07-01 09:04:41

by Guangbin Huang

Subject: Re: [PATCH net-next 3/3] net: hns3: add support for link diagnosis info in debugfs



On 2021/6/25 3:25, Jakub Kicinski wrote:
> On Thu, 24 Jun 2021 22:36:45 +0800 Guangbin Huang wrote:
>> In order to know reason why link down, add a debugfs file
>> "link_diagnosis_info" to get link faults from firmware, and each bit
>> represents one kind of fault.
>>
>> usage example:
>> $ cat link_diagnosis_info
>> Reference clock lost
>
> Please use ethtool->get_link_ext_state instead.
> .
>
Hi Jakub, I have a question for you.
Some of the faults in our patch do not exist in the current ethtool extended
link states, for example:
"Serdes reference clock lost"
"Serdes analog loss of signal"
"SFP tx is disabled"
"PHY power down"
"Remote fault"

Do you think these faults can be added to the ethtool extended link states?

Thanks,
Guangbin

2021-07-01 15:55:37

by Jakub Kicinski

Subject: Re: [PATCH net-next 3/3] net: hns3: add support for link diagnosis info in debugfs

On Thu, 1 Jul 2021 17:03:32 +0800 huangguangbin (A) wrote:
> On 2021/6/25 3:25, Jakub Kicinski wrote:
> > On Thu, 24 Jun 2021 22:36:45 +0800 Guangbin Huang wrote:
> >> In order to know reason why link down, add a debugfs file
> >> "link_diagnosis_info" to get link faults from firmware, and each bit
> >> represents one kind of fault.
> >>
> >> usage example:
> >> $ cat link_diagnosis_info
> >> Reference clock lost
> >
> > Please use ethtool->get_link_ext_state instead.
> > .
> >
> Hi Jakub, I have a question to consult you.
> Some fault information in our patch are not existed in current ethtool extended
> link states, for examples:
> "Serdes reference clock lost"
> "Serdes analog loss of signal"
> "SFP tx is disabled"
> "PHY power down"

Why would the PHY be powered down if user requested port to be up?

> "Remote fault"

I think we do have remote fault:

state: ETHTOOL_LINK_EXT_STATE_LINK_TRAINING_FAILURE
substate: ETHTOOL_LINK_EXT_SUBSTATE_LT_REMOTE_FAULT

> Do you think these fault information can be added to ethtool extended link states?

Yes, would you mind categorizing them into state/substate and sharing
the proposed additions with Amit, Ido, Andrew and other PHY experts?
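
Editor's sketch: one way to picture the state/substate categorization requested above is a lookup table from firmware fault bits to a (state, substate) pair. Everything in the block is an assumption made for illustration. The enums are local stand-ins rather than the kernel's uapi definitions, lookup_ext_state() and fault_map[] are hypothetical names, the bit numbers follow the order of link_down_reason[] in the patch, and only the "Remote fault" pairing comes from this thread. Faults that the thread says have no existing extended state are deliberately left out of the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for illustration; NOT the kernel's uapi enums. */
enum ext_state {
	EXT_STATE_AUTONEG,
	EXT_STATE_LINK_TRAINING_FAILURE,
};

enum ext_substate {
	EXT_SUBSTATE_NONE,
	EXT_SUBSTATE_LT_REMOTE_FAULT,
};

struct ext_state_info {
	enum ext_state state;
	enum ext_substate substate;
};

struct fault_map {
	unsigned int bit;		/* index into link_down_reason[] */
	struct ext_state_info info;
};

/* Table order doubles as priority: the first matching entry is reported. */
static const struct fault_map fault_map[] = {
	/* Assumed pairing: "Auto negotiation failed" */
	{ 5, { EXT_STATE_AUTONEG, EXT_SUBSTATE_NONE } },
	/* Assumed pairing: "Link training failed" */
	{ 6, { EXT_STATE_LINK_TRAINING_FAILURE, EXT_SUBSTATE_NONE } },
	/* From the thread: "Remote fault" */
	{ 7, { EXT_STATE_LINK_TRAINING_FAILURE, EXT_SUBSTATE_LT_REMOTE_FAULT } },
};

/*
 * Hypothetical helper: pick one extended state for a fault bitmap.
 * Returns 0 and fills *out for the first mapped fault, or -1 if no
 * set bit has a mapping.
 */
static int lookup_ext_state(uint32_t faults, struct ext_state_info *out)
{
	assert(out != NULL);

	for (size_t i = 0; i < sizeof(fault_map) / sizeof(fault_map[0]); i++) {
		if (faults & (UINT32_C(1) << fault_map[i].bit)) {
			*out = fault_map[i].info;
			return 0;
		}
	}

	return -1;
}
```

Unlike the debugfs file, which prints every set bit, this sketch returns a single state per query, so the table order decides which fault wins when several bits are set.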

2021-07-04 01:45:29

by Guangbin Huang

Subject: Re: [PATCH net-next 3/3] net: hns3: add support for link diagnosis info in debugfs



On 2021/7/1 23:54, Jakub Kicinski wrote:
> On Thu, 1 Jul 2021 17:03:32 +0800 huangguangbin (A) wrote:
>> On 2021/6/25 3:25, Jakub Kicinski wrote:
>>> On Thu, 24 Jun 2021 22:36:45 +0800 Guangbin Huang wrote:
>>>> In order to know reason why link down, add a debugfs file
>>>> "link_diagnosis_info" to get link faults from firmware, and each bit
>>>> represents one kind of fault.
>>>>
>>>> usage example:
>>>> $ cat link_diagnosis_info
>>>> Reference clock lost
>>>
>>> Please use ethtool->get_link_ext_state instead.
>>> .
>>>
>> Hi Jakub, I have a question to consult you.
>> Some fault information in our patch are not existed in current ethtool extended
>> link states, for examples:
>> "Serdes reference clock lost"
>> "Serdes analog loss of signal"
>> "SFP tx is disabled"
>> "PHY power down"
>
> Why would the PHY be powered down if user requested port to be up?
>
Another user may use an MDIO tool to write PHY registers directly and power the
PHY down. If the link state can display this information, I think it would be helpful.

>> "Remote fault"
>
> I think we do have remote fault:
>
> state: ETHTOOL_LINK_EXT_STATE_LINK_TRAINING_FAILURE
> substate: ETHTOOL_LINK_EXT_SUBSTATE_LT_REMOTE_FAULT
>
OK.

>> Do you think these fault information can be added to ethtool extended link states?
>
> Yes, would you mind categorizing them into state/substate and sharing
> the proposed additions with Amit, Ido, Andrew and other PHY experts?
> .
>
OK.

2021-07-04 20:08:26

by Andrew Lunn

Subject: Re: [PATCH net-next 3/3] net: hns3: add support for link diagnosis info in debugfs

> > > Hi Jakub, I have a question to consult you.
> > > Some fault information in our patch are not existed in current ethtool extended
> > > link states, for examples:
> > > "Serdes reference clock lost"
> > > "Serdes analog loss of signal"
> > > "SFP tx is disabled"
> > > "PHY power down"
> >
> > Why would the PHY be powered down if user requested port to be up?
> >
> In the case of other user may use MDIO tool to write PHY register directly to make
> PHY power down, if link state can display this information, I think it is helpful.

If the user directly writes to PHY registers, they should expect bad
things to happen. They can do a lot more than power the PHY down. They
could configure it into loopback mode, turn off autoneg and force a
mode which is compatible with the peer, etc.

I don't think you need to tell the user they have pointed a foot gun
at their feet and pulled the trigger.

Andrew

2021-07-05 01:20:01

by Guangbin Huang

Subject: Re: [PATCH net-next 3/3] net: hns3: add support for link diagnosis info in debugfs



On 2021/7/5 3:57, Andrew Lunn wrote:
>>>> Hi Jakub, I have a question to consult you.
>>>> Some fault information in our patch are not existed in current ethtool extended
>>>> link states, for examples:
>>>> "Serdes reference clock lost"
>>>> "Serdes analog loss of signal"
>>>> "SFP tx is disabled"
>>>> "PHY power down"
>>>
>>> Why would the PHY be powered down if user requested port to be up?
>>>
>> In the case of other user may use MDIO tool to write PHY register directly to make
>> PHY power down, if link state can display this information, I think it is helpful.
>
> If the user directly writes to PHY registers, they should expect bad
> things to happen. They can do a lot more than power the PHY down. They
> could configure it into loopback mode, turn off autoneg and force a
> mode which is compatible with the peer, etc.
>
> I don't think you need to tell the user they have pointed a foot gun
> at their feet and pulled the trigger.
>
> Andrew
> .
>
OK, I accept your point, thanks!