On Mon, Mar 21, 2022 at 03:13:55PM +0700, Quan Nguyen wrote:
> Add documentation for the Ampere(R)'s Altra(R) SMpro sysfs interfaces
>
> Signed-off-by: Quan Nguyen <[email protected]>
> ---
> Changes in v7:
> + First introduce in v7 [Greg]
>
> .../sysfs-bus-platform-devices-ampere-smpro | 133 ++++++++++++++++++
> 1 file changed, 133 insertions(+)
> create mode 100644 Documentation/ABI/testing/sysfs-bus-platform-devices-ampere-smpro
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-platform-devices-ampere-smpro b/Documentation/ABI/testing/sysfs-bus-platform-devices-ampere-smpro
> new file mode 100644
> index 000000000000..9bfd8d6d0f71
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-bus-platform-devices-ampere-smpro
> @@ -0,0 +1,133 @@
> +What: /sys/bus/platform/devices/smpro-errmon.*/errors_[core|mem|pcie|other]_[ce|ue]
Please split this out as one entry per file.
> +KernelVersion: 5.14
5.14 is a long time ago.
> +Contact: [email protected]
> +Description:
> + (RO) Contains the 48-byte Ampere (Vendor-Specific) Error Record, see [1]
> + printed in hex format as below:
> +
> + AA BB CCCC DDDDDDDD DDDDDDDDDDDDDDDD DDDDDDDDDDDDDDDD \
> + DDDDDDDDDDDDDDDD DDDDDDDDDDDDDDDD DDDDDDDDDDDDDDDD
> + Where:
> + AA : Error Type
> + BB : Subtype
> + CCCC : Instance
> + DDD...DDD: Similar to the Arm RAS standard error record
No, this is not a valid sysfs file, sorry. This should just be one
value per file.
> +
> + See [1] below for the format details.
> +
> + The detail of each sysfs entries is as below:
> + +-------------+---------------------------------------------------------+
> + | Error | Sysfs entry |
> + +-------------+---------------------------------------------------------+
> + | Core's CE | /sys/bus/platform/devices/smpro-errmon.*/errors_core_ce |
> + | Core's UE | /sys/bus/platform/devices/smpro-errmon.*/errors_core_ue |
> + | Memory's CE | /sys/bus/platform/devices/smpro-errmon.*/errors_mem_ce |
> + | Memory's UE | /sys/bus/platform/devices/smpro-errmon.*/errors_mem_ue |
> + | PCIe's CE | /sys/bus/platform/devices/smpro-errmon.*/errors_pcie_ce |
> + | PCIe's UE | /sys/bus/platform/devices/smpro-errmon.*/errors_pcie_ue |
> + | Other's CE | /sys/bus/platform/devices/smpro-errmon.*/errors_other_ce|
> + | Other's UE | /sys/bus/platform/devices/smpro-errmon.*/errors_other_ue|
> + +-------------+---------------------------------------------------------+
> + UE: Uncorrect-able Error
> + CE: Correct-able Error
> +
> + [1] Section 3.3 Ampere (Vendor-Specific) Error Record Formats,
> + Altra Family RAS Supplement.
> +
> +
> +What: /sys/bus/platform/devices/smpro-errmon.*/errors_[smpro|pmpro]
> +KernelVersion: 5.14
> +Contact: [email protected]
> +Description:
> + (RO) Contains the internal firmware error record printed as hex format
> + as below:
> +
> + A BB C DD EEEE FFFFFFFF
Again this isn't a good sysfs entry. You should never have to parse a
sysfs file except for a single value.
thanks,
greg k-h
On 21/03/2022 15:23, Greg Kroah-Hartman wrote:
> On Mon, Mar 21, 2022 at 03:13:55PM +0700, Quan Nguyen wrote:
>> Add documentation for the Ampere(R)'s Altra(R) SMpro sysfs interfaces
>>
>> Signed-off-by: Quan Nguyen <[email protected]>
>> ---
>> Changes in v7:
>> + First introduce in v7 [Greg]
>>
>> .../sysfs-bus-platform-devices-ampere-smpro | 133 ++++++++++++++++++
>> 1 file changed, 133 insertions(+)
>> create mode 100644 Documentation/ABI/testing/sysfs-bus-platform-devices-ampere-smpro
>>
>> diff --git a/Documentation/ABI/testing/sysfs-bus-platform-devices-ampere-smpro b/Documentation/ABI/testing/sysfs-bus-platform-devices-ampere-smpro
>> new file mode 100644
>> index 000000000000..9bfd8d6d0f71
>> --- /dev/null
>> +++ b/Documentation/ABI/testing/sysfs-bus-platform-devices-ampere-smpro
>> @@ -0,0 +1,133 @@
>> +What: /sys/bus/platform/devices/smpro-errmon.*/errors_[core|mem|pcie|other]_[ce|ue]
>
> Please split this out as one entry per file.
>
These sysfs files all share the same HW error format (the 48-byte Arm
vendor-specific HW error record) but cover separate HW domains: Core,
PCIe, Mem, etc.
>> +KernelVersion: 5.14
>
> 5.14 is a long time ago.
>
>> +Contact: [email protected]
>> +Description:
>> + (RO) Contains the 48-byte Ampere (Vendor-Specific) Error Record, see [1]
>> + printed in hex format as below:
>> +
>> + AA BB CCCC DDDDDDDD DDDDDDDDDDDDDDDD DDDDDDDDDDDDDDDD \
>> + DDDDDDDDDDDDDDDD DDDDDDDDDDDDDDDD DDDDDDDDDDDDDDDD
>> + Where:
>> + AA : Error Type
>> + BB : Subtype
>> + CCCC : Instance
>> + DDD...DDD: Similar to the Arm RAS standard error record
>
> No, this is not a valid sysfs file, sorry. This should just be one
> value per file.
>
This 48-byte value cannot be split into smaller values because it
contains all the information needed to describe a single HW error, as
per the Arm RAS supplement document [1]. The spacing is only there to
make it more readable than a single 48-byte hex string.
[1] https://developer.arm.com/documentation/ddi0587/latest/
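
For illustration only (not part of the patch): a minimal userspace sketch
of what a consumer would have to do with the layout quoted above. The
field groups add up to 1 + 1 + 2 + 4 + 5 * 8 = 48 bytes. The struct name,
the function name, and the split of the D... part into "word0" and
"data[]" are assumptions made for this example; they come from neither
the driver nor the RAS supplement, and byte order is not considered.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace view of one record; all names are illustrative. */
struct smpro_err_record {
	uint8_t  type;      /* AA   : Error Type */
	uint8_t  subtype;   /* BB   : Subtype */
	uint16_t instance;  /* CCCC : Instance */
	uint32_t word0;     /* DDDDDDDD : first 4-byte group of the data */
	uint64_t data[5];   /* five 8-byte groups of the data */
};

/*
 * Parse one line in the "AA BB CCCC DDDDDDDD D...D (x5)" hex layout.
 * Returns 0 on success, -1 if fewer than nine fields were converted.
 */
static int smpro_parse_record(const char *line, struct smpro_err_record *rec)
{
	int n = sscanf(line,
		       "%2" SCNx8 " %2" SCNx8 " %4" SCNx16 " %8" SCNx32
		       " %16" SCNx64 " %16" SCNx64 " %16" SCNx64
		       " %16" SCNx64 " %16" SCNx64,
		       &rec->type, &rec->subtype, &rec->instance, &rec->word0,
		       &rec->data[0], &rec->data[1], &rec->data[2],
		       &rec->data[3], &rec->data[4]);

	return n == 9 ? 0 : -1;
}
```

A caller would read one line from, e.g., errors_core_ce and pass it to
smpro_parse_record(). Nine whitespace-separated fields have to be parsed
out of a single file here, which is the kind of parsing the review
comment above objects to.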
>
>> +
>> + See [1] below for the format details.
>> +
>> + The detail of each sysfs entries is as below:
>> + +-------------+---------------------------------------------------------+
>> + | Error | Sysfs entry |
>> + +-------------+---------------------------------------------------------+
>> + | Core's CE | /sys/bus/platform/devices/smpro-errmon.*/errors_core_ce |
>> + | Core's UE | /sys/bus/platform/devices/smpro-errmon.*/errors_core_ue |
>> + | Memory's CE | /sys/bus/platform/devices/smpro-errmon.*/errors_mem_ce |
>> + | Memory's UE | /sys/bus/platform/devices/smpro-errmon.*/errors_mem_ue |
>> + | PCIe's CE | /sys/bus/platform/devices/smpro-errmon.*/errors_pcie_ce |
>> + | PCIe's UE | /sys/bus/platform/devices/smpro-errmon.*/errors_pcie_ue |
>> + | Other's CE | /sys/bus/platform/devices/smpro-errmon.*/errors_other_ce|
>> + | Other's UE | /sys/bus/platform/devices/smpro-errmon.*/errors_other_ue|
>> + +-------------+---------------------------------------------------------+
>> + UE: Uncorrect-able Error
>> + CE: Correct-able Error
>> +
>> + [1] Section 3.3 Ampere (Vendor-Specific) Error Record Formats,
>> + Altra Family RAS Supplement.
>> +
>> +
>> +What: /sys/bus/platform/devices/smpro-errmon.*/errors_[smpro|pmpro]
>> +KernelVersion: 5.14
>> +Contact: [email protected]
>> +Description:
>> + (RO) Contains the internal firmware error record printed as hex format
>> + as below:
>> +
>> + A BB C DD EEEE FFFFFFFF
>
> Again this isn't a good sysfs entry. You should never have to parse a
> sysfs file except for a single value.
>
> thanks,
>
> greg k-h
This error record cannot be split further either.
Thanks Greg for the review.
- Quan