2017-11-08 03:17:47

by Tyler Baicar

Subject: Re: [PATCH] PCI/AER: update AER status string print to match other AER logs

On 10/20/2017 7:55 PM, Bjorn Helgaas wrote:
> On Tue, Oct 17, 2017 at 09:42:02AM -0600, Tyler Baicar wrote:
>> Currently the AER driver uses cper_print_bits() to print the AER status
>> string. This causes the status string to not include the proper PCI device
>> name prefix that the other AER prints include. Also, it has a different
>> print level than all the other AER prints.
>>
>> Update the AER driver to print the AER status string with the proper string
>> prefix and proper print level.
>>
>> Previous log example:
>>
>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>> Receiver Error, Bad TLP
>> e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
>> pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
>> Replay Timer Timeout
>> pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID
>>
>> New log:
>>
>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>> e1000e 0003:01:00.1: Receiver Error
>> e1000e 0003:01:00.1: Bad TLP
>> e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
>> pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
>> pcieport 0003:00:00.0: Replay Timer Timeout
>> pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID
> I definitely think it's MUCH better to use dev_err() as you do.
>
> I don't like the cper_print_bits() strategy of inserting line breaks
> to fit in 80 columns. That leads to atomicity issues, e.g., other
> printk output getting inserted in the middle of a single AER log, and
> suggests an ordering ("Receiver Error" occurred before "Bad TLP") that
> isn't real. It'd be ideal if everything fit on one line per event,
> but that might not be practical.
>
> I'm not necessarily attached to the actual strings. These messages
> are for sophisticated users and maybe could be abbreviated as in lspci
> output. It might actually be kind of neat if the output here matched
> up with the output of "lspci -vv" (lspci prints all the bits; here you
> probably want only the set bits). Or maybe not.
>
> But even what you have here is a huge improvement. I *hate*
> unattached things in dmesg like we currently get. There's no reliable
> way to connect that "Receiver Error, Bad TLP" with the device.
Hello Bjorn,

Thanks for the feedback. Do you think this can get into 4.15?

Thanks,
Tyler
>> Signed-off-by: Tyler Baicar <[email protected]>
>> ---
>> drivers/pci/pcie/aer/aerdrv_errprint.c | 15 ++++++++++++++-
>> 1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
>> index 54c4b69..b718daa 100644
>> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
>> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
>> @@ -206,6 +206,19 @@ void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
>> }
>>
>> #ifdef CONFIG_ACPI_APEI_PCIEAER
>> +void dev_print_bits(struct pci_dev *dev, unsigned int bits,
>> +		    const char * const strs[], unsigned int strs_size)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < strs_size; i++) {
>> +		if (!(bits & (1U << i)))
>> +			continue;
>> +		if (strs[i])
>> +			dev_err(&dev->dev, "%s\n", strs[i]);
>> +	}
>> +}
>> +
>> int cper_severity_to_aer(int cper_severity)
>> {
>>  	switch (cper_severity) {
>> @@ -243,7 +256,7 @@ void cper_print_aer(struct pci_dev *dev, int aer_severity,
>>  	agent = AER_GET_AGENT(aer_severity, status);
>>
>>  	dev_err(&dev->dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
>> -	cper_print_bits("", status, status_strs, status_strs_size);
>> +	dev_print_bits(dev, status, status_strs, status_strs_size);
>>  	dev_err(&dev->dev, "aer_layer=%s, aer_agent=%s\n",
>>  		aer_error_layer[layer], aer_agent_string[agent]);
>>
>> --
>> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
>> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
>> a Linux Foundation Collaborative Project.
>>
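
The per-bit loop in the quoted patch can be tried outside the kernel with a
small userspace model. This is only a sketch under stated assumptions:
model_print_bits(), model_status_strs and the prefix argument are hypothetical
stand-ins, dev_err() is replaced by snprintf() into a caller-supplied buffer,
and the table carries only the three status names that appear in the logs
above, at the bit positions implied by the status values 0x00000041 and
0x00001000.

```c
#include <assert.h>
#include <stdio.h>

/* Only the names seen in the example logs; every other slot stays NULL. */
static const char *const model_status_strs[] = {
    [0]  = "Receiver Error",
    [6]  = "Bad TLP",
    [12] = "Replay Timer Timeout",
};
#define MODEL_STATUS_STRS_SIZE \
    (sizeof(model_status_strs) / sizeof(model_status_strs[0]))

/*
 * Userspace model of the loop in dev_print_bits(): one output line per set
 * bit that has a name, each line starting with "prefix: ". The prefix stands
 * in for the driver and device name that dev_err() would add.
 * Returns the number of lines written into out.
 */
static int model_print_bits(char *out, size_t out_size, const char *prefix,
                            unsigned int bits,
                            const char *const strs[], unsigned int strs_size)
{
    size_t used = 0;
    int lines = 0;
    unsigned int i;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    /* The i < 32 guard keeps the shift within the width of unsigned int. */
    for (i = 0; i < strs_size && i < 32; i++) {
        int n;

        if (!(bits & (1U << i)))
            continue;
        if (!strs[i])
            continue;

        n = snprintf(out + used, out_size - used, "%s: %s\n",
                     prefix, strs[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0';   /* drop the partial line and stop */
            break;
        }
        used += (size_t)n;
        lines++;
    }
    return lines;
}
```

Called with the prefix "e1000e 0003:01:00.1" and status 0x00000041, the model
fills the buffer with the same two status lines as the "New log" example.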

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
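
Bjorn's point about keeping one line per event could be approached by joining
the names of all set bits into a single buffer and printing it once. The
following is a rough userspace sketch of that idea, not something from the
patch: format_bits_one_line(), sketch_status_strs and the "prefix: a, b" layout
are hypothetical, and the table again holds only the names and bit positions
visible in the example logs.

```c
#include <assert.h>
#include <stdio.h>

/* Only the names seen in the example logs; every other slot stays NULL. */
static const char *const sketch_status_strs[] = {
    [0]  = "Receiver Error",
    [6]  = "Bad TLP",
    [12] = "Replay Timer Timeout",
};
#define SKETCH_STATUS_STRS_SIZE \
    (sizeof(sketch_status_strs) / sizeof(sketch_status_strs[0]))

/*
 * Build "prefix: name1, name2" for every set bit that has a name, so the
 * whole status could go out in a single print call.
 * Returns the number of names written, or -1 if out is too small.
 */
static int format_bits_one_line(char *out, size_t out_size, const char *prefix,
                                unsigned int bits,
                                const char *const strs[],
                                unsigned int strs_size)
{
    size_t used;
    int count = 0;
    unsigned int i;
    int n;

    assert(out != NULL && out_size > 0);

    n = snprintf(out, out_size, "%s:", prefix);
    if (n < 0 || (size_t)n >= out_size) {
        out[0] = '\0';
        return -1;
    }
    used = (size_t)n;

    /* The i < 32 guard keeps the shift within the width of unsigned int. */
    for (i = 0; i < strs_size && i < 32; i++) {
        if (!(bits & (1U << i)) || !strs[i])
            continue;

        /* First name follows a space, later names follow ", ". */
        n = snprintf(out + used, out_size - used, "%s%s",
                     count ? ", " : " ", strs[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0';   /* drop the partial name */
            return -1;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

In the kernel the resulting buffer would be handed to one dev_err() call, which
would keep the names attached to the device and avoid other printk output
landing between them. Whether the line stays short enough when many bits are
set is the practicality question raised in the review.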

