2020-02-03 23:03:03

by Ray Jui

Subject: RFC: Use of devlink/health report for non-Ethernet devices

Hi Jiri/Eran/David,

I've been investigating the health report feature of devlink, and have a
couple of related questions:

1. Based on my investigation, the devlink health report mechanism
provides hooks for a device driver to report errors, dump debug
information, trigger object dumps, initiate self-recovery, etc.
The current users of health report are all Ethernet-based drivers.
However, nothing in the health report framework seems to prohibit its
use by non-Ethernet device drivers. Is my understanding correct?

2. Following my first question, in this case, do you think it makes any
sense to use devlink health report as a generic error reporting and
recovery mechanism, for other devices, e.g., NVMe and Virt I/O?

3. In the Ethernet driver use case, if one has a "smart NIC" type of
platform, i.e., one running Linux on the embedded processor of the NIC,
it seems to make a lot of sense to also use devlink health report to
deal with non-Ethernet-specific errors originating from the embedded
Linux (or any other OS). The front-end driver that registers the
various health reporters will still be an Ethernet-based device driver
running on the host server system. Does this make sense to you?

Thanks in advance for your feedback!

Thanks,

Ray



2020-02-04 06:50:17

by Jiri Pirko

Subject: Re: RFC: Use of devlink/health report for non-Ethernet devices

Tue, Feb 04, 2020 at 12:01:37AM CET, [email protected] wrote:
>Hi Jiri/Eran/David,
>
>I've been investigating the health report feature of devlink, and have a
>couple of related questions:
>
>1. Based on my investigation, the devlink health report mechanism provides
>hooks for a device driver to report errors, dump debug information,
>trigger object dumps, initiate self-recovery, etc. The current users of
>health report are all Ethernet-based drivers. However, nothing in the
>health report framework seems to prohibit its use by non-Ethernet device
>drivers. Is my understanding correct?

The whole devlink framework is designed to be independent of
Ethernet/networking.


>
>2. Following my first question, in this case, do you think it makes any sense
>to use devlink health report as a generic error reporting and recovery
>mechanism, for other devices, e.g., NVMe and Virt I/O?

Sure.


>
>3. In the Ethernet driver use case, if one has a "smart NIC" type of
>platform, i.e., one running Linux on the embedded processor of the NIC, it
>seems to make a lot of sense to also use devlink health report to deal with
>non-Ethernet-specific errors originating from the embedded Linux (or any
>other OS). The front-end driver that registers the various health reporters
>will still be an Ethernet-based device driver running on the host server
>system. Does this make sense to you?

It should not be an Ethernet-based driver. You should create the devlink
instance in the driver for the particular device whose health you want
to report.


>
>Thanks in advance for your feedback!
>
>Thanks,
>
>Ray
>
>

2020-02-04 17:48:15

by Ray Jui

Subject: Re: RFC: Use of devlink/health report for non-Ethernet devices

Hi Jiri,

On 2020-02-03 10:48 p.m., Jiri Pirko wrote:
> Tue, Feb 04, 2020 at 12:01:37AM CET, [email protected] wrote:
>> Hi Jiri/Eran/David,
>>
>> I've been investigating the health report feature of devlink, and have a
>> couple of related questions:
>>
>> 1. Based on my investigation, the devlink health report mechanism provides
>> hooks for a device driver to report errors, dump debug information,
>> trigger object dumps, initiate self-recovery, etc. The current users of
>> health report are all Ethernet-based drivers. However, nothing in the
>> health report framework seems to prohibit its use by non-Ethernet device
>> drivers. Is my understanding correct?
>
> The whole devlink framework is designed to be independent of
> Ethernet/networking.
>
>

Great. This is what I thought. Thanks for confirming.

>>
>> 2. Following my first question, in this case, do you think it makes any sense
>> to use devlink health report as a generic error reporting and recovery
>> mechanism, for other devices, e.g., NVMe and Virt I/O?
>
> Sure.
>
>

Thanks.

>>
>> 3. In the Ethernet driver use case, if one has a "smart NIC" type of
>> platform, i.e., one running Linux on the embedded processor of the NIC, it
>> seems to make a lot of sense to also use devlink health report to deal with
>> non-Ethernet-specific errors originating from the embedded Linux (or any
>> other OS). The front-end driver that registers the various health reporters
>> will still be an Ethernet-based device driver running on the host server
>> system. Does this make sense to you?
>
> It should not be an Ethernet-based driver. You should create the devlink
> instance in the driver for the particular device whose health you want
> to report.
>
>

Okay thanks!

>>
>> Thanks in advance for your feedback!
>>
>> Thanks,
>>
>> Ray
>>
>>
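
As a rough illustration of the reporter model discussed in this thread
(a driver reports an error, a dump callback captures debug state, and a
recover callback attempts self-recovery), here is a minimal userspace
sketch in C. It is not the kernel devlink API: every name in it
(toy_reporter, toy_health_report, toy_nvme, and so on) is hypothetical,
and the NVMe-like device is invented purely to show that nothing in the
pattern is Ethernet-specific.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct toy_reporter;

struct toy_reporter_ops {
    const char *name;
    /* Write debug state into buf; return 0 on success. */
    int (*dump)(struct toy_reporter *r, char *buf, size_t len);
    /* Try to bring the device back; return 0 on success. */
    int (*recover)(struct toy_reporter *r);
};

struct toy_reporter {
    const struct toy_reporter_ops *ops;
    void *priv;             /* driver-private device state */
    int auto_recover;       /* non-zero: recover right after a report */
    unsigned error_count;
    unsigned recover_count;
    int healthy;
    char last_dump[128];
};

/*
 * Report an error: count it, mark the reporter unhealthy, capture a dump,
 * and attempt recovery if enabled. Returns 0 only if recovery succeeded.
 */
static int toy_health_report(struct toy_reporter *r, const char *msg)
{
    r->error_count++;
    r->healthy = 0;
    fprintf(stderr, "%s: %s\n", r->ops->name, msg);

    if (r->ops->dump)
        r->ops->dump(r, r->last_dump, sizeof(r->last_dump));

    if (!r->auto_recover || !r->ops->recover)
        return -1;
    if (r->ops->recover(r) != 0)
        return -1;

    r->recover_count++;
    r->healthy = 1;
    return 0;
}

/* A made-up, non-Ethernet device owned by its own driver. */
struct toy_nvme {
    int queue_stalled;
};

static int toy_nvme_dump(struct toy_reporter *r, char *buf, size_t len)
{
    struct toy_nvme *dev = r->priv;

    snprintf(buf, len, "queue_stalled=%d", dev->queue_stalled);
    return 0;
}

static int toy_nvme_recover(struct toy_reporter *r)
{
    struct toy_nvme *dev = r->priv;

    dev->queue_stalled = 0;   /* pretend a queue reset fixed it */
    return 0;
}

static const struct toy_reporter_ops toy_nvme_ops = {
    .name = "toy_nvme_queue",
    .dump = toy_nvme_dump,
    .recover = toy_nvme_recover,
};
```

In this sketch the reporter belongs to the driver of the device whose
health is being reported, which mirrors the advice given above. A real
driver would use the kernel's devlink interfaces instead of these toy
types.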