2009-12-11 16:13:19

by Youquan Song

Subject: [PATCH]PCIe AER: reject aer inject if hardware mask error reporting

The Correctable/Uncorrectable Error Mask Registers are used by the PCIe AER
driver; they control the reporting of individual errors to the PCIe RC via
PCIe error messages.

If the hardware masks reporting of a specific error to the RC, the aer_inject
driver should not inject that AER error.

Signed-off-by: Youquan, Song <[email protected]>
Acked-by: Ying, Huang <[email protected]>
---

diff --git a/drivers/pci/pcie/aer/aer_inject.c b/drivers/pci/pcie/aer/aer_inject.c
index ad77f0c..fa2bc22 100644
--- a/drivers/pci/pcie/aer/aer_inject.c
+++ b/drivers/pci/pcie/aer/aer_inject.c
@@ -302,7 +302,7 @@ static int aer_inject(struct aer_error_inj *einj)
 	unsigned long flags;
 	unsigned int devfn = PCI_DEVFN(einj->dev, einj->fn);
 	int pos_cap_err, rp_pos_cap_err;
-	u32 sever;
+	u32 sever, mask;
 	int ret = 0;

 	dev = pci_get_bus_and_slot(einj->bus, devfn);
@@ -354,6 +354,22 @@ static int aer_inject(struct aer_error_inj *einj)
 	err->header_log2 = einj->header_log2;
 	err->header_log3 = einj->header_log3;

+	pci_read_config_dword(dev, pos_cap_err + PCI_ERR_COR_MASK, &mask);
+	if (einj->cor_status && !(einj->cor_status & ~mask)) {
+		ret = -EINVAL;
+		printk(KERN_WARNING "The correctable error is masked by device\n");
+		spin_unlock_irqrestore(&inject_lock, flags);
+		goto out_put;
+	}
+
+	pci_read_config_dword(dev, pos_cap_err + PCI_ERR_UNCOR_MASK, &mask);
+	if (einj->uncor_status && !(einj->uncor_status & ~mask)) {
+		ret = -EINVAL;
+		printk(KERN_WARNING "The uncorrectable error is masked by device\n");
+		spin_unlock_irqrestore(&inject_lock, flags);
+		goto out_put;
+	}
+
 	rperr = __find_aer_error_by_dev(rpdev);
 	if (!rperr) {
 		rperr = rperr_alloc;
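
Editorial sketch (not part of the patch): the condition added above can be
read in isolation as a small predicate. The helper name
all_requested_errors_masked() is hypothetical and exists only for
illustration. It shows that the request is rejected only when at least one
status bit is requested and every requested bit is set in the mask register;
a request that is only partially masked still passes this check.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the patch's condition
 *     status && !(status & ~mask)
 * Returns true only when something was requested and none of the
 * requested bits survive after the masked bits are cleared.
 */
static bool all_requested_errors_masked(uint32_t status, uint32_t mask)
{
	return status != 0 && (status & ~mask) == 0;
}
```

For example, status 0x3 with mask 0x1 is not rejected, because bit 1 is
still unmasked.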


2009-12-15 00:29:20

by Andrew Patterson

Subject: Re: [PATCH]PCIe AER: reject aer inject if hardware mask error reporting

On Fri, 2009-12-11 at 18:48 -0500, Youquan,Song wrote:
> The Correctable/Uncorrectable Error Mask Registers are used by the PCIe AER
> driver; they control the reporting of individual errors to the PCIe RC via
> PCIe error messages.
>
> If the hardware masks reporting of a specific error to the RC, the aer_inject
> driver should not inject that AER error.
>
> Signed-off-by: Youquan, Song <[email protected]>
> Acked-by: Ying, Huang <[email protected]>
> ---
>
> diff --git a/drivers/pci/pcie/aer/aer_inject.c b/drivers/pci/pcie/aer/aer_inject.c
> index ad77f0c..fa2bc22 100644
> --- a/drivers/pci/pcie/aer/aer_inject.c
> +++ b/drivers/pci/pcie/aer/aer_inject.c
> @@ -302,7 +302,7 @@ static int aer_inject(struct aer_error_inj *einj)
> unsigned long flags;
> unsigned int devfn = PCI_DEVFN(einj->dev, einj->fn);
> int pos_cap_err, rp_pos_cap_err;
> - u32 sever;
> + u32 sever, mask;
> int ret = 0;
>
> dev = pci_get_bus_and_slot(einj->bus, devfn);

This does not apply. Please respin against latest linux-2.6 or pci-2.6.

> @@ -354,6 +354,22 @@ static int aer_inject(struct aer_error_inj *einj)
> err->header_log2 = einj->header_log2;
> err->header_log3 = einj->header_log3;
>
> + pci_read_config_dword(dev, pos_cap_err + PCI_ERR_COR_MASK, &mask);
> + if (einj->cor_status && !(einj->cor_status & ~mask)) {
> + ret = -EINVAL;
> + printk(KERN_WARNING "The correctable error is masked by device\n");

You can inject multiple correctable errors with the aer-inject user-land
tool, so perhaps this should be re-worded as:

"The correctable error(s) are masked by the device\n"

> + spin_unlock_irqrestore(&inject_lock, flags);
> + goto out_put;
> + }
> +
> + pci_read_config_dword(dev, pos_cap_err + PCI_ERR_UNCOR_MASK, &mask);
> + if (einj->uncor_status && !(einj->uncor_status & ~mask)) {
> + ret = -EINVAL;
> + printk(KERN_WARNING "The uncorrectable error is masked by device\n");

Same as above

> + spin_unlock_irqrestore(&inject_lock, flags);
> + goto out_put;
> + }
> +

You can also simultaneously inject correctable and uncorrectable errors,
so I don't particularly like returning errors here. Perhaps you should
just print the warning message out and just not inject the masked
errors.
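
Editorial sketch (not from the thread's code): one way to read the
suggestion above is to warn about the masked bits and continue with only the
unmasked ones. The helper drop_masked_errors() and its message text are
assumptions made for illustration. The sketch uses fprintf() so that it
builds outside the kernel, where the driver would use printk().

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: warn about requested bits that the device masks
 * and return only the bits that remain eligible for injection.
 */
static uint32_t drop_masked_errors(uint32_t status, uint32_t mask,
				   const char *kind)
{
	uint32_t masked = status & mask;

	if (masked)
		fprintf(stderr,
			"The %s error(s) 0x%08x are masked by the device\n",
			kind, (unsigned int)masked);

	/* Clear every requested bit that is set in the mask register. */
	return status & ~mask;
}
```

With this approach a request that mixes masked and unmasked errors would
still inject the unmasked part, at the cost of not returning an error code
to the caller.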

> rperr = __find_aer_error_by_dev(rpdev);
> if (!rperr) {
> rperr = rperr_alloc;


--
Andrew Patterson
Hewlett-Packard

2009-12-17 08:43:56

by Andi Kleen

Subject: Re: [Resend PATCH]PCIe AER: reject aer inject if hardware mask error reporting

On Thu, Dec 17, 2009 at 08:22:48AM -0500, Youquan,Song wrote:
> The Correctable/Uncorrectable Error Mask Registers are used by the PCIe AER
> driver; they control the reporting of individual errors to the PCIe RC via
> PCIe error messages.
>
> If the hardware masks reporting of a specific error to the RC, the aer_inject
> driver should not inject that AER error.
>
> Signed-off-by: Youquan, Song <[email protected]>
> Acked-by: Ying, Huang <[email protected]>

Acked-by: Andi Kleen <[email protected]>

-Andi

--
[email protected] -- Speaking for myself only.

2009-12-17 05:47:35

by Youquan Song

Subject: [Resend PATCH]PCIe AER: reject aer inject if hardware mask error reporting

The Correctable/Uncorrectable Error Mask Registers are used by the PCIe AER
driver; they control the reporting of individual errors to the PCIe RC via
PCIe error messages.

If the hardware masks reporting of a specific error to the RC, the aer_inject
driver should not inject that AER error.

Signed-off-by: Youquan, Song <[email protected]>
Acked-by: Ying, Huang <[email protected]>
---


diff --git a/drivers/pci/pcie/aer/aer_inject.c b/drivers/pci/pcie/aer/aer_inject.c
index 7fcd533..d002cd9 100644
--- a/drivers/pci/pcie/aer/aer_inject.c
+++ b/drivers/pci/pcie/aer/aer_inject.c
@@ -321,7 +321,7 @@ static int aer_inject(struct aer_error_inj *einj)
 	unsigned long flags;
 	unsigned int devfn = PCI_DEVFN(einj->dev, einj->fn);
 	int pos_cap_err, rp_pos_cap_err;
-	u32 sever;
+	u32 sever, mask;
 	int ret = 0;

 	dev = pci_get_domain_bus_and_slot((int)einj->domain, einj->bus, devfn);
@@ -374,6 +374,24 @@ static int aer_inject(struct aer_error_inj *einj)
 	err->header_log2 = einj->header_log2;
 	err->header_log3 = einj->header_log3;

+	pci_read_config_dword(dev, pos_cap_err + PCI_ERR_COR_MASK, &mask);
+	if (einj->cor_status && !(einj->cor_status & ~mask)) {
+		ret = -EINVAL;
+		printk(KERN_WARNING "The correctable error(s) is masked "
+				"by device\n");
+		spin_unlock_irqrestore(&inject_lock, flags);
+		goto out_put;
+	}
+
+	pci_read_config_dword(dev, pos_cap_err + PCI_ERR_UNCOR_MASK, &mask);
+	if (einj->uncor_status && !(einj->uncor_status & ~mask)) {
+		ret = -EINVAL;
+		printk(KERN_WARNING "The uncorrectable error(s) is masked "
+				"by device\n");
+		spin_unlock_irqrestore(&inject_lock, flags);
+		goto out_put;
+	}
+
 	rperr = __find_aer_error_by_dev(rpdev);
 	if (!rperr) {
 		rperr = rperr_alloc;
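
Editorial sketch (not part of the patch): the two pci_read_config_dword()
calls above read the mask registers at fixed offsets from the start of the
device's AER extended capability (pos_cap_err). The code below imitates this
against an in-memory snapshot of configuration space so it can run in user
space. cfg_read_dword(), struct aer_masks and read_aer_masks() are
hypothetical names; the two offsets are the values defined in the kernel's
pci_regs.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Offsets from the start of the AER capability, as in pci_regs.h. */
#define PCI_ERR_UNCOR_MASK	8	/* Uncorrectable Error Mask */
#define PCI_ERR_COR_MASK	20	/* Correctable Error Mask */

/*
 * Hypothetical stand-in for pci_read_config_dword(): assemble a
 * little-endian 32-bit value from a configuration space snapshot.
 */
static uint32_t cfg_read_dword(const uint8_t *cfg, size_t off)
{
	return (uint32_t)cfg[off] |
	       ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) |
	       ((uint32_t)cfg[off + 3] << 24);
}

struct aer_masks {
	uint32_t cor;	/* Correctable Error Mask register */
	uint32_t uncor;	/* Uncorrectable Error Mask register */
};

/* Read both mask registers relative to the AER capability offset. */
static struct aer_masks read_aer_masks(const uint8_t *cfg, size_t pos_cap_err)
{
	struct aer_masks m;

	m.cor = cfg_read_dword(cfg, pos_cap_err + PCI_ERR_COR_MASK);
	m.uncor = cfg_read_dword(cfg, pos_cap_err + PCI_ERR_UNCOR_MASK);
	return m;
}
```

In the driver itself the capability offset comes from the device rather
than from a fixed value; the snapshot here only makes the offset arithmetic
visible.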

2009-12-17 05:59:38

by Youquan Song

Subject: Re: [PATCH]PCIe AER: reject aer inject if hardware mask error reporting

Hi Andrew,

I have updated and resent the patch according to your comments.

> > dev = pci_get_bus_and_slot(einj->bus, devfn);
>
> This does not apply. Please respin against latest linux-2.6 or pci-2.6.
Yes, that is true; that code was updated as of 32 final. Thanks.

> >
> > + pci_read_config_dword(dev, pos_cap_err + PCI_ERR_COR_MASK, &mask);
> > + if (einj->cor_status && !(einj->cor_status & ~mask)) {
> > + ret = -EINVAL;
> > + printk(KERN_WARNING "The correctable error is masked by device\n");
>
> You can inject multiple correctable errors with the aer-inject user-land
> tool, so perhaps this should be re-worded as:
>
> "The correctable error(s) are masked by the device\n"
Yes, it is updated.

> You can also simultaneously inject correctable and uncorrectable errors,
> so I don't particularly like returning errors here. Perhaps you should
> just print the warning message out and just not inject the masked
> errors.
I do not agree with you on this point. If the hardware does not support
reporting some error, that should be reported directly to the user of the
aer_inject userspace tool. He needs to change his injection configuration
file, rather than have the kernel report a successful AER injection with
no useful information in the console or dmesg output.

Anyway, in my mind, this should not be an issue important enough to stop
this patch from going to mainline.

Thanks.

-Youquan

2010-01-04 23:53:21

by Jesse Barnes

Subject: Re: [Resend PATCH]PCIe AER: reject aer inject if hardware mask error reporting

On Thu, 17 Dec 2009 08:22:48 -0500
"Youquan,Song" <[email protected]> wrote:

> The Correctable/Uncorrectable Error Mask Registers are used by the
> PCIe AER driver; they control the reporting of individual errors to
> the PCIe RC via PCIe error messages.
>
> If the hardware masks reporting of a specific error to the RC, the
> aer_inject driver should not inject that AER error.
>
> Signed-off-by: Youquan, Song <[email protected]>
> Acked-by: Ying, Huang <[email protected]>
> ---

Applied this to my for-linus branch, thanks.

--
Jesse Barnes, Intel Open Source Technology Center