2018-09-14 00:27:27

by Qing Huang

Subject: [PATCH] net/mlx4_core: print firmware version during driver loading

When debugging firmware related issues, it's very helpful to have
the installed FW version info in the kernel log when the driver is
loaded. It's easier to match error/warning messages with different
FW versions in the log than to run a separate tool back and forth
to get the information.

Signed-off-by: Qing Huang <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/fw.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index babcfd9..e1c5218 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -1686,11 +1686,11 @@ int mlx4_QUERY_FW(struct mlx4_dev *dev)
MLX4_GET(lg, outbox, QUERY_FW_MAX_CMD_OFFSET);
cmd->max_cmds = 1 << lg;

- mlx4_dbg(dev, "FW version %d.%d.%03d (cmd intf rev %d), max commands %d\n",
- (int) (dev->caps.fw_ver >> 32),
- (int) (dev->caps.fw_ver >> 16) & 0xffff,
- (int) dev->caps.fw_ver & 0xffff,
- cmd_if_rev, cmd->max_cmds);
+ mlx4_info(dev, "FW version %d.%d.%03d (cmd intf rev %d), max commands %d\n",
+ (int)(dev->caps.fw_ver >> 32),
+ (int)(dev->caps.fw_ver >> 16) & 0xffff,
+ (int)dev->caps.fw_ver & 0xffff,
+ cmd_if_rev, cmd->max_cmds);

MLX4_GET(fw->catas_offset, outbox, QUERY_FW_ERR_START_OFFSET);
MLX4_GET(fw->catas_size, outbox, QUERY_FW_ERR_SIZE_OFFSET);
--
2.9.3
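
For readers unfamiliar with the format string in the hunk above: the print treats fw_ver as a packed 64-bit value, with the major number in the bits from 32 upward, the minor number in bits 16..31, and the sub-minor number in bits 0..15. The following is a minimal user-space sketch of that decoding, not driver code; the helper name fw_ver_to_str() is hypothetical and is introduced only for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of mlx4): format a packed 64-bit
 * firmware version with the same layout the patch's print assumes.
 * Returns what snprintf() returns.
 */
static int fw_ver_to_str(uint64_t fw_ver, char *buf, size_t len)
{
	unsigned int major = (unsigned int)(fw_ver >> 32);          /* bits 32 and up */
	unsigned int minor = (unsigned int)(fw_ver >> 16) & 0xffff; /* bits 16..31 */
	unsigned int sub   = (unsigned int)fw_ver & 0xffff;         /* bits 0..15 */

	/* Same "%d.%d.%03d" shape as the driver message, with unsigned fields. */
	return snprintf(buf, len, "%u.%u.%03u", major, minor, sub);
}
```

As in the patch, the cast is applied before the mask, so each field is truncated to 32 bits first and then reduced to its low 16 bits.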



2018-09-14 04:43:54

by Leon Romanovsky

Subject: Re: [PATCH] net/mlx4_core: print firmware version during driver loading

On Thu, Sep 13, 2018 at 05:25:14PM -0700, Qing Huang wrote:
> When debugging firmware related issues, it's very helpful to have

^^^^^^^^^^ exactly, this is why we set this print as mlx4_dbg and
not mlx4_info.

> the installed FW version info in the kernel log when the driver is
> loaded. It's easier to match error/warning messages with different
> FW versions in the log than to run a separate tool back and forth
> to get the information.
>
> Signed-off-by: Qing Huang <[email protected]>
> ---
> drivers/net/ethernet/mellanox/mlx4/fw.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
> index babcfd9..e1c5218 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/fw.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
> @@ -1686,11 +1686,11 @@ int mlx4_QUERY_FW(struct mlx4_dev *dev)
> MLX4_GET(lg, outbox, QUERY_FW_MAX_CMD_OFFSET);
> cmd->max_cmds = 1 << lg;
>
> - mlx4_dbg(dev, "FW version %d.%d.%03d (cmd intf rev %d), max commands %d\n",
> - (int) (dev->caps.fw_ver >> 32),
> - (int) (dev->caps.fw_ver >> 16) & 0xffff,
> - (int) dev->caps.fw_ver & 0xffff,
> - cmd_if_rev, cmd->max_cmds);
> + mlx4_info(dev, "FW version %d.%d.%03d (cmd intf rev %d), max commands %d\n",
> + (int)(dev->caps.fw_ver >> 32),
> + (int)(dev->caps.fw_ver >> 16) & 0xffff,
> + (int)dev->caps.fw_ver & 0xffff,
> + cmd_if_rev, cmd->max_cmds);
>
> MLX4_GET(fw->catas_offset, outbox, QUERY_FW_ERR_START_OFFSET);
> MLX4_GET(fw->catas_size, outbox, QUERY_FW_ERR_SIZE_OFFSET);
> --
> 2.9.3
>



2018-09-14 17:17:51

by Qing Huang

Subject: Re: [PATCH] net/mlx4_core: print firmware version during driver loading

The FW version is actually a crucial piece of information and is only
printed once here, when the driver is loaded. People tend to get confused
when switching multiple FW files back and forth without running separate
utility tools, especially at customer sites. IMHO, this information is
very useful and takes up very little log file space. :-)

I was also thinking of doing something slightly different. Maybe we just
trim down the output string and add something like this?
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2208,6 +2208,11 @@ static int mlx4_init_fw(struct mlx4_dev *dev)
 			return err;
 		}

+		mlx4_info(dev, "Installed FW version is %d.%d.%03d.\n",
+			  (int) (dev->caps.fw_ver >> 32),
+			  (int) (dev->caps.fw_ver >> 16) & 0xffff,
+			  (int) dev->caps.fw_ver & 0xffff);
+
 		err = mlx4_load_fw(dev);
 		if (err) {
 			mlx4_err(dev, "Failed to start FW, aborting\n");

Thanks,
Qing

On 9/13/2018 9:43 PM, Leon Romanovsky wrote:
> On Thu, Sep 13, 2018 at 05:25:14PM -0700, Qing Huang wrote:
>> When debugging firmware related issues, it's very helpful to have
> ^^^^^^^^^^ exactly, this is why we set this print as mlx4_dbg and
> not mlx4_info.
>
>> the installed FW version info in the kernel log when the driver is
>> loaded. It's easier to match error/warning messages with different
>> FW versions in the log than to run a separate tool back and forth
>> to get the information.
>>
>> Signed-off-by: Qing Huang <[email protected]>
>> ---
>> drivers/net/ethernet/mellanox/mlx4/fw.c | 10 +++++-----
>> 1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
>> index babcfd9..e1c5218 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/fw.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
>> @@ -1686,11 +1686,11 @@ int mlx4_QUERY_FW(struct mlx4_dev *dev)
>> MLX4_GET(lg, outbox, QUERY_FW_MAX_CMD_OFFSET);
>> cmd->max_cmds = 1 << lg;
>>
>> - mlx4_dbg(dev, "FW version %d.%d.%03d (cmd intf rev %d), max commands %d\n",
>> - (int) (dev->caps.fw_ver >> 32),
>> - (int) (dev->caps.fw_ver >> 16) & 0xffff,
>> - (int) dev->caps.fw_ver & 0xffff,
>> - cmd_if_rev, cmd->max_cmds);
>> + mlx4_info(dev, "FW version %d.%d.%03d (cmd intf rev %d), max commands %d\n",
>> + (int)(dev->caps.fw_ver >> 32),
>> + (int)(dev->caps.fw_ver >> 16) & 0xffff,
>> + (int)dev->caps.fw_ver & 0xffff,
>> + cmd_if_rev, cmd->max_cmds);
>>
>> MLX4_GET(fw->catas_offset, outbox, QUERY_FW_ERR_START_OFFSET);
>> MLX4_GET(fw->catas_size, outbox, QUERY_FW_ERR_SIZE_OFFSET);
>> --
>> 2.9.3
>>


2018-09-14 18:17:57

by Andrew Lunn

Subject: Re: [PATCH] net/mlx4_core: print firmware version during driver loading

On Fri, Sep 14, 2018 at 10:15:48AM -0700, Qing Huang wrote:
> The FW version is actually a very crucial piece of information and only
> printed once here
> when the driver is loaded. People tend to get confused when switching
> multiple FW files
> back and forth without running separate utility tools, especially at
> customer sites.
> IMHO, this information is very useful and only takes up very little log file
> space. :-)

Why not use ethtool -i ?

$ sudo ethtool -i eth0
driver: r8169
version: 2.3LK-NAPI
firmware-version: rtl8168g-2_0.0.1 02/06/13

Andrew

2018-09-14 18:34:21

by Qing Huang

Subject: Re: [PATCH] net/mlx4_core: print firmware version during driver loading



On 9/14/2018 11:17 AM, Andrew Lunn wrote:
> On Fri, Sep 14, 2018 at 10:15:48AM -0700, Qing Huang wrote:
>> The FW version is actually a very crucial piece of information and only
>> printed once here
>> when the driver is loaded. People tend to get confused when switching
>> multiple FW files
>> back and forth without running separate utility tools, especially at
>> customer sites.
>> IMHO, this information is very useful and only takes up very little log file
>> space. :-)
> Why not use ethtool -i ?
>
> $ sudo ethtool -i eth0
> driver: r8169
> version: 2.3LK-NAPI
> firmware-version: rtl8168g-2_0.0.1 02/06/13
>
> Andrew
Sure. You can also use the ibstat or ibv_devinfo tools if they are
installed. But that's not very convenient in some cases.

E.g.
A customer upgrades FW on HCAs and encounters issues. During triage,
it's much easier to study customer-uploaded log files when remotely
testing different FW files.

Thanks.




2018-09-14 20:14:20

by Andrew Lunn

Subject: Re: [PATCH] net/mlx4_core: print firmware version during driver loading

> >$ sudo ethtool -i eth0
> >driver: r8169
> >version: 2.3LK-NAPI
> >firmware-version: rtl8168g-2_0.0.1 02/06/13
> >
> > Andrew

> Sure. You can also use ibstat or ibv_devinfo tool if they are installed. But
> it's not very convenient in some cases.

This is the standardised way to do this. It should work for any
Ethernet driver, so long as it fills in the needed information.
Anything else is non-standard, and so inconvenient by definition.

Andrew

2018-09-14 21:10:01

by David Miller

Subject: Re: [PATCH] net/mlx4_core: print firmware version during driver loading

From: Qing Huang <[email protected]>
Date: Fri, 14 Sep 2018 10:15:48 -0700

> IMHO, this information is very useful and only takes up very little
> log file space. :-)

If it's critical then the log is the wrong place for it as the log
is lossy.

The proper place to obtain this information is via the fw_version
field of the ethtool_drvinfo struct. This can be obtained at any time
and is reliable. And if it isn't reliable or correct, we must fix
that.

2018-09-14 21:14:34

by David Miller

Subject: Re: [PATCH] net/mlx4_core: print firmware version during driver loading

From: Qing Huang <[email protected]>
Date: Fri, 14 Sep 2018 11:33:40 -0700

>
>
> On 9/14/2018 11:17 AM, Andrew Lunn wrote:
>> On Fri, Sep 14, 2018 at 10:15:48AM -0700, Qing Huang wrote:
>>> The FW version is actually a very crucial piece of information and
>>> only
>>> printed once here
>>> when the driver is loaded. People tend to get confused when switching
>>> multiple FW files
>>> back and forth without running separate utility tools, especially at
>>> customer sites.
>>> IMHO, this information is very useful and only takes up very little
>>> log file
>>> space. :-)
>> Why not use ethtool -i ?
>>
>> $ sudo ethtool -i eth0
>> driver: r8169
>> version: 2.3LK-NAPI
>> firmware-version: rtl8168g-2_0.0.1 02/06/13
>>
>> Andrew
> Sure. You can also use ibstat or ibv_devinfo tool if they are
> installed. But it's not very
> convenient in some cases.
>
> E.g.
> A customer upgrades FW on HCAs and encounters issues. During triage,
> it's much easier
> to study customer uploaded log files when remotely testing different
> FW files.

Not a valid argument. You can print the ethtool output from initramfs
if necessary for triage.

I still stand by the fact that ethtool is the only fully reliable way
to obtain this information, the kernel log is not.

2018-09-14 21:14:37

by David Miller

Subject: Re: [PATCH] net/mlx4_core: print firmware version during driver loading

From: Andrew Lunn <[email protected]>
Date: Fri, 14 Sep 2018 20:17:18 +0200

> On Fri, Sep 14, 2018 at 10:15:48AM -0700, Qing Huang wrote:
>> The FW version is actually a very crucial piece of information and only
>> printed once here
>> when the driver is loaded. People tend to get confused when switching
>> multiple FW files
>> back and forth without running separate utility tools, especially at
>> customer sites.
>> IMHO, this information is very useful and only takes up very little log file
>> space. :-)
>
> Why not use ethtool -i ?
>
> $ sudo ethtool -i eth0
> driver: r8169
> version: 2.3LK-NAPI
> firmware-version: rtl8168g-2_0.0.1 02/06/13

+1

2018-09-14 22:37:24

by Qing Huang

Subject: Re: [PATCH] net/mlx4_core: print firmware version during driver loading



On 9/14/2018 2:14 PM, David Miller wrote:
> From: Qing Huang<[email protected]>
> Date: Fri, 14 Sep 2018 11:33:40 -0700
>
>> On 9/14/2018 11:17 AM, Andrew Lunn wrote:
>>> On Fri, Sep 14, 2018 at 10:15:48AM -0700, Qing Huang wrote:
>>>> The FW version is actually a very crucial piece of information and
>>>> only
>>>> printed once here
>>>> when the driver is loaded. People tend to get confused when switching
>>>> multiple FW files
>>>> back and forth without running separate utility tools, especially at
>>>> customer sites.
>>>> IMHO, this information is very useful and only takes up very little
>>>> log file
>>>> space. :-)
>>> Why not use ethtool -i ?
>>>
>>> $ sudo ethtool -i eth0
>>> driver: r8169
>>> version: 2.3LK-NAPI
>>> firmware-version: rtl8168g-2_0.0.1 02/06/13
>>>
>>> Andrew
>> Sure. You can also use ibstat or ibv_devinfo tool if they are
>> installed. But it's not very
>> convenient in some cases.
>>
>> E.g.
>> A customer upgrades FW on HCAs and encounters issues. During triage,
>> it's much easier
>> to study customer uploaded log files when remotely testing different
>> FW files.
> Not a valid argument. You can print the ethtool output from initramfs
> if necessary for triage.
>
> I still stand by the fact that ethtool is the only fully reliable way
> to obtain this information, the kernel log is not.

This is more for InfiniBand mode, which depends more on features and
functionality provided in firmware and gets much more frequent FW bug
fixes than typical Ethernet devices. This is not meant to replace other
ways of getting the information; it is more of an enhancement for
checking log history.

This can provide valuable information when tracing through system log
history to discover what happened with a specific HCA driver version and
FW version combination in the past.

Regards,
Qing

2018-09-15 08:50:39

by Leon Romanovsky

Subject: Re: [PATCH] net/mlx4_core: print firmware version during driver loading

On Fri, Sep 14, 2018 at 03:36:46PM -0700, Qing Huang wrote:
>
>
> On 9/14/2018 2:14 PM, David Miller wrote:
> > From: Qing Huang<[email protected]>
> > Date: Fri, 14 Sep 2018 11:33:40 -0700
> >
> > > On 9/14/2018 11:17 AM, Andrew Lunn wrote:
> > > > On Fri, Sep 14, 2018 at 10:15:48AM -0700, Qing Huang wrote:
> > > > > The FW version is actually a very crucial piece of information and
> > > > > only
> > > > > printed once here
> > > > > when the driver is loaded. People tend to get confused when switching
> > > > > multiple FW files
> > > > > back and forth without running separate utility tools, especially at
> > > > > customer sites.
> > > > > IMHO, this information is very useful and only takes up very little
> > > > > log file
> > > > > space. :-)
> > > > Why not use ethtool -i ?
> > > >
> > > > $ sudo ethtool -i eth0
> > > > driver: r8169
> > > > version: 2.3LK-NAPI
> > > > firmware-version: rtl8168g-2_0.0.1 02/06/13
> > > >
> > > > Andrew
> > > Sure. You can also use ibstat or ibv_devinfo tool if they are
> > > installed. But it's not very
> > > convenient in some cases.
> > >
> > > E.g.
> > > A customer upgrades FW on HCAs and encounters issues. During triage,
> > > it's much easier
> > > to study customer uploaded log files when remotely testing different
> > > FW files.
> > Not a valid argument. You can print the ethtool output from initramfs
> > if necessary for triage.
> >
> > I still stand by the fact that ethtool is the only fully reliable way
> > to obtain this information, the kernel log is not.
>
> This is more for Infiniband mode which depends more on features and
> functionalities

For pure infiniband devices you have rdmatool, part of iproute2.
[leonro@server-14-015 ~]$ rdma dev
1: mlx5_0: node_type ca fw 3.8.9999 node_guid 5254:00c0:fe12:3455 sys_image_guid 5254:00c0:fe12:3455

> provided in firmware and get much more frequent FW bug fixes than typical
> Ethernet
> devices. This is not meant to replace other ways of getting the information,
> more like
> an enhancement for checking log history.
>
> This can provide valuable information when tracing through system log
> history to
> discover what happened with a specific HCA drv ver and fw ver combination in
> the past.
>
> Regards,
> Qing

