LinuxLists.cc - [PATCH v5 0/2] PCI: mediatek: Fixups for the IRQ handle routine and MT7622's class code

2017-12-27 01:00:07

Subject: [PATCH v5 0/2] PCI: mediatek: Fixups for the IRQ handle routine and MT7622's class code

From: Honghui Zhang <[email protected]>

Two fixups for mediatek's host bridge:
The first patch fixup the IRQ handle routine to avoid IRQ reentry which
may exist for both MT2712 and MT7622.
The second patch fixup class type for MT7622.

Change since v4:
- Only setup vendor ID for MT7622, igorning the device ID since mediatek's
host bridge driver does not cares about the device ID.

Change since v3:
- Setup the class type and vendor ID at the beginning of startup instead
of in a quirk.
- Add mediatek's vendor ID, it could be found in:
https://pcisig.com/membership/member-companies?combine=&page=4

Change since v2:
- Move the initialize of the iterate before the loop to fix an
INTx IRQ issue in the first patch

Change since v1:
- Add the second patch.
- Make the first patch's commit message more standard.

Honghui Zhang (2):
PCI: mediatek: Clear IRQ status after IRQ dispatched to avoid reentry
PCI: mediatek: Set up class type and vendor ID for MT7622

drivers/pci/host/pcie-mediatek.c | 23 ++++++++++++++++++-----
include/linux/pci_ids.h | 2 ++
2 files changed, 20 insertions(+), 5 deletions(-)

--
2.6.4

2017-12-27 01:00:05

by Honghui Zhang

[permalink] [raw]

Subject: [PATCH v5 1/2] PCI: mediatek: Clear IRQ status after IRQ dispatched to avoid reentry

From: Honghui Zhang <[email protected]>

There maybe a same IRQ reentry scenario after IRQ received in current
IRQ handle flow:
EP device PCIe host driver EP driver
1. issue an IRQ
2. received IRQ
3. clear IRQ status
4. dispatch IRQ
5. clear IRQ source
The IRQ status was not successfully cleared at step 2 since the IRQ
source was not cleared yet. So the PCIe host driver may receive the
same IRQ after step 5. Then there's an IRQ reentry occurred.
Even worse, if the reentry IRQ was not an IRQ that EP driver expected,
it may not handle the IRQ. Then we may run into the infinite loop from
step 2 to step 4.
Clear the IRQ status after IRQ have been dispatched to avoid the IRQ
reentry.
This patch also fix another INTx IRQ issue by initialize the iterate
before the loop. If an INTx IRQ re-occurred while we are dispatching
the INTx IRQ, then iterate may start from PCI_NUM_INTX + INTX_SHIFT
instead of INTX_SHIFT for the second time entering the
for_each_set_bit_from() loop.

Signed-off-by: Honghui Zhang <[email protected]>
Acked-by: Ryder Lee <[email protected]>
---
drivers/pci/host/pcie-mediatek.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
index db93efd..fc29a9a 100644
--- a/drivers/pci/host/pcie-mediatek.c
+++ b/drivers/pci/host/pcie-mediatek.c
@@ -601,15 +601,16 @@ static irqreturn_t mtk_pcie_intr_handler(int irq, void *data)
struct mtk_pcie_port *port = (struct mtk_pcie_port *)data;
unsigned long status;
u32 virq;
- u32 bit = INTX_SHIFT;
+ u32 bit;

while ((status = readl(port->base + PCIE_INT_STATUS)) & INTX_MASK) {
+ bit = INTX_SHIFT;
for_each_set_bit_from(bit, &status, PCI_NUM_INTX + INTX_SHIFT) {
- /* Clear the INTx */
- writel(1 << bit, port->base + PCIE_INT_STATUS);
virq = irq_find_mapping(port->irq_domain,
bit - INTX_SHIFT);
generic_handle_irq(virq);
+ /* Clear the INTx */
+ writel(1 << bit, port->base + PCIE_INT_STATUS);
}
}

@@ -619,10 +620,10 @@ static irqreturn_t mtk_pcie_intr_handler(int irq, void *data)

while ((imsi_status = readl(port->base + PCIE_IMSI_STATUS))) {
for_each_set_bit(bit, &imsi_status, MTK_MSI_IRQS_NUM) {
- /* Clear the MSI */
- writel(1 << bit, port->base + PCIE_IMSI_STATUS);
virq = irq_find_mapping(port->msi_domain, bit);
generic_handle_irq(virq);
+ /* Clear the MSI */
+ writel(1 << bit, port->base + PCIE_IMSI_STATUS);
}
}
/* Clear MSI interrupt status */
--
2.6.4

2017-12-27 01:00:30

by Honghui Zhang

[permalink] [raw]

Subject: [PATCH v5 2/2] PCI: mediatek: Set up class type and vendor ID for MT7622

From: Honghui Zhang <[email protected]>

The hardware default value of IDs and class type is not correct,
fix that by setup the correct values before start up.

Signed-off-by: Honghui Zhang <[email protected]>
---
drivers/pci/host/pcie-mediatek.c | 12 ++++++++++++
include/linux/pci_ids.h | 2 ++
2 files changed, 14 insertions(+)

diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
index fc29a9a..62aac0ea 100644
--- a/drivers/pci/host/pcie-mediatek.c
+++ b/drivers/pci/host/pcie-mediatek.c
@@ -74,6 +74,10 @@

/* PCIe V2 per-port registers */
#define PCIE_MSI_VECTOR 0x0c0
+
+#define PCIE_CONF_ID 0x100
+#define PCIE_CONF_CLASS 0x104
+
#define PCIE_INT_MASK 0x420
#define INTX_MASK GENMASK(19, 16)
#define INTX_SHIFT 16
@@ -393,6 +397,14 @@ static int mtk_pcie_startup_port_v2(struct mtk_pcie_port *port)
val |= PCIE_CSR_LTSSM_EN(port->slot) |
PCIE_CSR_ASPM_L1_EN(port->slot);
writel(val, pcie->base + PCIE_SYS_CFG_V2);
+
+ /* Set up vendor ID and device ID for MT7622*/
+ val = PCI_VENDOR_ID_MEDIATEK;
+ writel(val, port->base + PCIE_CONF_ID);
+
+ /* Set up class code for MT7622 */
+ val = PCI_CLASS_BRIDGE_PCI << 16;
+ writel(val, port->base + PCIE_CONF_CLASS);
}

/* Assert all reset signals */
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index ab20dc5..2480b0e 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2113,6 +2113,8 @@

#define PCI_VENDOR_ID_MYRICOM 0x14c1

+#define PCI_VENDOR_ID_MEDIATEK 0x14c3
+
#define PCI_VENDOR_ID_TITAN 0x14D2
#define PCI_DEVICE_ID_TITAN_010L 0x8001
#define PCI_DEVICE_ID_TITAN_100L 0x8010
--
2.6.4

2017-12-27 18:45:55

by Bjorn Helgaas

[permalink] [raw]

Subject: Re: [PATCH v5 2/2] PCI: mediatek: Set up class type and vendor ID for MT7622

On Wed, Dec 27, 2017 at 08:59:54AM +0800, [email protected] wrote:
> From: Honghui Zhang <[email protected]>
>
> The hardware default value of IDs and class type is not correct,
> fix that by setup the correct values before start up.
>
> Signed-off-by: Honghui Zhang <[email protected]>
> ---
> drivers/pci/host/pcie-mediatek.c | 12 ++++++++++++
> include/linux/pci_ids.h | 2 ++
> 2 files changed, 14 insertions(+)
>
> diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
> index fc29a9a..62aac0ea 100644
> --- a/drivers/pci/host/pcie-mediatek.c
> +++ b/drivers/pci/host/pcie-mediatek.c
> @@ -74,6 +74,10 @@
>
> /* PCIe V2 per-port registers */
> #define PCIE_MSI_VECTOR 0x0c0
> +
> +#define PCIE_CONF_ID 0x100
> +#define PCIE_CONF_CLASS 0x104
> +
> #define PCIE_INT_MASK 0x420
> #define INTX_MASK GENMASK(19, 16)
> #define INTX_SHIFT 16
> @@ -393,6 +397,14 @@ static int mtk_pcie_startup_port_v2(struct mtk_pcie_port *port)
> val |= PCIE_CSR_LTSSM_EN(port->slot) |
> PCIE_CSR_ASPM_L1_EN(port->slot);
> writel(val, pcie->base + PCIE_SYS_CFG_V2);
> +
> + /* Set up vendor ID and device ID for MT7622*/
> + val = PCI_VENDOR_ID_MEDIATEK;
> + writel(val, port->base + PCIE_CONF_ID);
> +
> + /* Set up class code for MT7622 */
> + val = PCI_CLASS_BRIDGE_PCI << 16;
> + writel(val, port->base + PCIE_CONF_CLASS);

1) Your comments mention MT7622 specifically, but this code is run for
both mt2712-pcie and mt7622-pcie. If this code is safe and necessary
for both mt2712-pcie and mt7622-pcie, please remove the mention of
MT7622.

2) The first comment mentions both "vendor ID and device ID" but you
don't write the device ID. Since this code applies to both
mt2712-pcie and mt7622-pcie, my guess is that you don't *want* to
write the device ID. If that's the case, please fix the comment.

3) If you only need to set the vendor ID, you're performing a 32-bit
write (writel()) to update a 16-bit value. Please use writew()
instead.

4) If you only need to set the vendor ID, please use a definition like
"PCIE_CONF_VENDOR_ID" instead of the ambiguous "PCIE_CONF_ID".

5) If you only need to set the vendor ID, please update the changelog
to mention "vendor ID" specifically instead of the ambiguous "IDs".

6) Please add a space before the closing "*/" of the first comment.

7) PCI_CLASS_BRIDGE_PCI is for a PCI-to-PCI bridge, i.e., one that has
PCI on both the primary (upstream) side and the secondary (downstream)
side. That kind of bridge has a type 1 config header (see
PCI_HEADER_TYPE) and the PCI_PRIMARY_BUS and PCI_SECONDARY_BUS
registers tell us the bus number of the primary and secondary sides.

I don't believe this device is a PCI-to-PCI bridge. I think it's a
*host* bridge that has some non-PCI interface on the upstream side and
should have a type 0 config header. If that's the case you should use
PCI_CLASS_BRIDGE_HOST instead.

> }
>
> /* Assert all reset signals */
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index ab20dc5..2480b0e 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -2113,6 +2113,8 @@
>
> #define PCI_VENDOR_ID_MYRICOM 0x14c1
>
> +#define PCI_VENDOR_ID_MEDIATEK 0x14c3
> +
> #define PCI_VENDOR_ID_TITAN 0x14D2
> #define PCI_DEVICE_ID_TITAN_010L 0x8001
> #define PCI_DEVICE_ID_TITAN_100L 0x8010
> --
> 2.6.4
>

2017-12-28 01:39:35

by Honghui Zhang

[permalink] [raw]

Subject: Re: [PATCH v5 2/2] PCI: mediatek: Set up class type and vendor ID for MT7622

On Wed, 2017-12-27 at 12:45 -0600, Bjorn Helgaas wrote:
> On Wed, Dec 27, 2017 at 08:59:54AM +0800, [email protected] wrote:
> > From: Honghui Zhang <[email protected]>
> >
> > The hardware default value of IDs and class type is not correct,
> > fix that by setup the correct values before start up.
> >
> > Signed-off-by: Honghui Zhang <[email protected]>
> > ---
> > drivers/pci/host/pcie-mediatek.c | 12 ++++++++++++
> > include/linux/pci_ids.h | 2 ++
> > 2 files changed, 14 insertions(+)
> >
> > diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
> > index fc29a9a..62aac0ea 100644
> > --- a/drivers/pci/host/pcie-mediatek.c
> > +++ b/drivers/pci/host/pcie-mediatek.c
> > @@ -74,6 +74,10 @@
> >
> > /* PCIe V2 per-port registers */
> > #define PCIE_MSI_VECTOR 0x0c0
> > +
> > +#define PCIE_CONF_ID 0x100
> > +#define PCIE_CONF_CLASS 0x104
> > +
> > #define PCIE_INT_MASK 0x420
> > #define INTX_MASK GENMASK(19, 16)
> > #define INTX_SHIFT 16
> > @@ -393,6 +397,14 @@ static int mtk_pcie_startup_port_v2(struct mtk_pcie_port *port)
> > val |= PCIE_CSR_LTSSM_EN(port->slot) |
> > PCIE_CSR_ASPM_L1_EN(port->slot);
> > writel(val, pcie->base + PCIE_SYS_CFG_V2);
> > +
> > + /* Set up vendor ID and device ID for MT7622*/
> > + val = PCI_VENDOR_ID_MEDIATEK;
> > + writel(val, port->base + PCIE_CONF_ID);
> > +
> > + /* Set up class code for MT7622 */
> > + val = PCI_CLASS_BRIDGE_PCI << 16;
> > + writel(val, port->base + PCIE_CONF_CLASS);
>
> 1) Your comments mention MT7622 specifically, but this code is run for
> both mt2712-pcie and mt7622-pcie. If this code is safe and necessary
> for both mt2712-pcie and mt7622-pcie, please remove the mention of
> MT7622.

Hmm, the code snippet added here will only be executed by MT7622, since
MT2712 will not enter this "if (pcie->base) {" condition.
Should the mention of MT7622 must be removed in this case?

>
> 2) The first comment mentions both "vendor ID and device ID" but you
> don't write the device ID. Since this code applies to both
> mt2712-pcie and mt7622-pcie, my guess is that you don't *want* to
> write the device ID. If that's the case, please fix the comment.
>

My bad, I did not check the comments carefully.
Thanks.

> 3) If you only need to set the vendor ID, you're performing a 32-bit
> write (writel()) to update a 16-bit value. Please use writew()
> instead.
>

Ok, thanks, I guess I could use the following code snippet in the next
version:
val = readl(port->base + PCIE_CONF_VENDOR_ID)
val &= ~GENMASK(15, 0);
val |= PCI_VENDOR_ID_MEDIATEK;
writel(val, port->base + PCIE_CONF_VENDOR_ID);

> 4) If you only need to set the vendor ID, please use a definition like
> "PCIE_CONF_VENDOR_ID" instead of the ambiguous "PCIE_CONF_ID".
>
> 5) If you only need to set the vendor ID, please update the changelog
> to mention "vendor ID" specifically instead of the ambiguous "IDs".

> 6) Please add a space before the closing "*/" of the first comment.
>
> 7) PCI_CLASS_BRIDGE_PCI is for a PCI-to-PCI bridge, i.e., one that has
> PCI on both the primary (upstream) side and the secondary (downstream)
> side. That kind of bridge has a type 1 config header (see
> PCI_HEADER_TYPE) and the PCI_PRIMARY_BUS and PCI_SECONDARY_BUS
> registers tell us the bus number of the primary and secondary sides.
>
> I don't believe this device is a PCI-to-PCI bridge. I think it's a
> *host* bridge that has some non-PCI interface on the upstream side and
> should have a type 0 config header. If that's the case you should use
> PCI_CLASS_BRIDGE_HOST instead.
>

Thanks very much for your help with the review, I will fix the other
issue in the next version.

> > }
> >
> > /* Assert all reset signals */
> > diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> > index ab20dc5..2480b0e 100644
> > --- a/include/linux/pci_ids.h
> > +++ b/include/linux/pci_ids.h
> > @@ -2113,6 +2113,8 @@
> >
> > #define PCI_VENDOR_ID_MYRICOM 0x14c1
> >
> > +#define PCI_VENDOR_ID_MEDIATEK 0x14c3
> > +
> > #define PCI_VENDOR_ID_TITAN 0x14D2
> > #define PCI_DEVICE_ID_TITAN_010L 0x8001
> > #define PCI_DEVICE_ID_TITAN_100L 0x8010
> > --
> > 2.6.4
> >

2018-01-02 10:56:08

by Lorenzo Pieralisi

[permalink] [raw]

Subject: Re: [PATCH v5 2/2] PCI: mediatek: Set up class type and vendor ID for MT7622

On Thu, Dec 28, 2017 at 09:39:12AM +0800, Honghui Zhang wrote:
> On Wed, 2017-12-27 at 12:45 -0600, Bjorn Helgaas wrote:
> > On Wed, Dec 27, 2017 at 08:59:54AM +0800, [email protected] wrote:
> > > From: Honghui Zhang <[email protected]>
> > >
> > > The hardware default value of IDs and class type is not correct,
> > > fix that by setup the correct values before start up.
> > >
> > > Signed-off-by: Honghui Zhang <[email protected]>
> > > ---
> > > drivers/pci/host/pcie-mediatek.c | 12 ++++++++++++
> > > include/linux/pci_ids.h | 2 ++
> > > 2 files changed, 14 insertions(+)
> > >
> > > diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
> > > index fc29a9a..62aac0ea 100644
> > > --- a/drivers/pci/host/pcie-mediatek.c
> > > +++ b/drivers/pci/host/pcie-mediatek.c
> > > @@ -74,6 +74,10 @@
> > >
> > > /* PCIe V2 per-port registers */
> > > #define PCIE_MSI_VECTOR 0x0c0
> > > +
> > > +#define PCIE_CONF_ID 0x100
> > > +#define PCIE_CONF_CLASS 0x104
> > > +
> > > #define PCIE_INT_MASK 0x420
> > > #define INTX_MASK GENMASK(19, 16)
> > > #define INTX_SHIFT 16
> > > @@ -393,6 +397,14 @@ static int mtk_pcie_startup_port_v2(struct mtk_pcie_port *port)
> > > val |= PCIE_CSR_LTSSM_EN(port->slot) |
> > > PCIE_CSR_ASPM_L1_EN(port->slot);
> > > writel(val, pcie->base + PCIE_SYS_CFG_V2);
> > > +
> > > + /* Set up vendor ID and device ID for MT7622*/
> > > + val = PCI_VENDOR_ID_MEDIATEK;
> > > + writel(val, port->base + PCIE_CONF_ID);
> > > +
> > > + /* Set up class code for MT7622 */
> > > + val = PCI_CLASS_BRIDGE_PCI << 16;
> > > + writel(val, port->base + PCIE_CONF_CLASS);
> >
> > 1) Your comments mention MT7622 specifically, but this code is run for
> > both mt2712-pcie and mt7622-pcie. If this code is safe and necessary
> > for both mt2712-pcie and mt7622-pcie, please remove the mention of
> > MT7622.
>
> Hmm, the code snippet added here will only be executed by MT7622, since
> MT2712 will not enter this "if (pcie->base) {" condition.
> Should the mention of MT7622 must be removed in this case?

You should add an explicit way (eg of_device_is_compatible() match for
instance) to apply the quirk just on the platform that requires it.

Checking for "if (pcie->base)" is really not the way to do it.

> > 2) The first comment mentions both "vendor ID and device ID" but you
> > don't write the device ID. Since this code applies to both
> > mt2712-pcie and mt7622-pcie, my guess is that you don't *want* to
> > write the device ID. If that's the case, please fix the comment.
> >
>
> My bad, I did not check the comments carefully.
> Thanks.
>
> > 3) If you only need to set the vendor ID, you're performing a 32-bit
> > write (writel()) to update a 16-bit value. Please use writew()
> > instead.
> >
>
> Ok, thanks, I guess I could use the following code snippet in the next
> version:
> val = readl(port->base + PCIE_CONF_VENDOR_ID)
> val &= ~GENMASK(15, 0);
> val |= PCI_VENDOR_ID_MEDIATEK;
> writel(val, port->base + PCIE_CONF_VENDOR_ID);

Have you read Bjorn's comment ? Or there is a problem with using
a writew() ?

Lorenzo

> > 4) If you only need to set the vendor ID, please use a definition like
> > "PCIE_CONF_VENDOR_ID" instead of the ambiguous "PCIE_CONF_ID".
> >
> > 5) If you only need to set the vendor ID, please update the changelog
> > to mention "vendor ID" specifically instead of the ambiguous "IDs".
>
> > 6) Please add a space before the closing "*/" of the first comment.
> >
> > 7) PCI_CLASS_BRIDGE_PCI is for a PCI-to-PCI bridge, i.e., one that has
> > PCI on both the primary (upstream) side and the secondary (downstream)
> > side. That kind of bridge has a type 1 config header (see
> > PCI_HEADER_TYPE) and the PCI_PRIMARY_BUS and PCI_SECONDARY_BUS
> > registers tell us the bus number of the primary and secondary sides.
> >
> > I don't believe this device is a PCI-to-PCI bridge. I think it's a
> > *host* bridge that has some non-PCI interface on the upstream side and
> > should have a type 0 config header. If that's the case you should use
> > PCI_CLASS_BRIDGE_HOST instead.
> >
>
> Thanks very much for your help with the review, I will fix the other
> issue in the next version.
>
> > > }
> > >
> > > /* Assert all reset signals */
> > > diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> > > index ab20dc5..2480b0e 100644
> > > --- a/include/linux/pci_ids.h
> > > +++ b/include/linux/pci_ids.h
> > > @@ -2113,6 +2113,8 @@
> > >
> > > #define PCI_VENDOR_ID_MYRICOM 0x14c1
> > >
> > > +#define PCI_VENDOR_ID_MEDIATEK 0x14c3
> > > +
> > > #define PCI_VENDOR_ID_TITAN 0x14D2
> > > #define PCI_DEVICE_ID_TITAN_010L 0x8001
> > > #define PCI_DEVICE_ID_TITAN_100L 0x8010
> > > --
> > > 2.6.4
> > >
>
>

2018-01-03 06:39:26

by Honghui Zhang

[permalink] [raw]

Subject: Re: [PATCH v5 2/2] PCI: mediatek: Set up class type and vendor ID for MT7622

On Tue, 2018-01-02 at 10:56 +0000, Lorenzo Pieralisi wrote:
> On Thu, Dec 28, 2017 at 09:39:12AM +0800, Honghui Zhang wrote:
> > On Wed, 2017-12-27 at 12:45 -0600, Bjorn Helgaas wrote:
> > > On Wed, Dec 27, 2017 at 08:59:54AM +0800, [email protected] wrote:
> > > > From: Honghui Zhang <[email protected]>
> > > >

> > > > + /* Set up class code for MT7622 */
> > > > + val = PCI_CLASS_BRIDGE_PCI << 16;
> > > > + writel(val, port->base + PCIE_CONF_CLASS);
> > >
> > > 1) Your comments mention MT7622 specifically, but this code is run for
> > > both mt2712-pcie and mt7622-pcie. If this code is safe and necessary
> > > for both mt2712-pcie and mt7622-pcie, please remove the mention of
> > > MT7622.
> >
> > Hmm, the code snippet added here will only be executed by MT7622, since
> > MT2712 will not enter this "if (pcie->base) {" condition.
> > Should the mention of MT7622 must be removed in this case?
>
> You should add an explicit way (eg of_device_is_compatible() match for
> instance) to apply the quirk just on the platform that requires it.
>
> Checking for "if (pcie->base)" is really not the way to do it.
>

hi, Lorenzo,
Thanks very much for your advise.
Passing the compatible string or platform data into this function needed
to add some new field in the struct mtk_pcie_port, then I guess both set
it for MT2712 and MT7622 is an easy way, since re-setting those values
for MT2712 is safe.

> > > 2) The first comment mentions both "vendor ID and device ID" but you
> > > don't write the device ID. Since this code applies to both
> > > mt2712-pcie and mt7622-pcie, my guess is that you don't *want* to
> > > write the device ID. If that's the case, please fix the comment.
> > >
> >
> > My bad, I did not check the comments carefully.
> > Thanks.
> >
> > > 3) If you only need to set the vendor ID, you're performing a 32-bit
> > > write (writel()) to update a 16-bit value. Please use writew()
> > > instead.
> > >
> >
> > Ok, thanks, I guess I could use the following code snippet in the next
> > version:
> > val = readl(port->base + PCIE_CONF_VENDOR_ID)
> > val &= ~GENMASK(15, 0);
> > val |= PCI_VENDOR_ID_MEDIATEK;
> > writel(val, port->base + PCIE_CONF_VENDOR_ID);
>
> Have you read Bjorn's comment ? Or there is a problem with using
> a writew() ?
>

This control register is a 32bit register, I'm not sure whether the apb
bus support write an 16bit value with 16bit but not 32bit address
alignment. I prefer the more safety old way of writel.

I need to do more test about the writew if the code elegant is more
important.

thanks.

> Lorenzo
>
> > > 4) If you only need to set the vendor ID, please use a definition like
> > > "PCIE_CONF_VENDOR_ID" instead of the ambiguous "PCIE_CONF_ID".
> > >
> > > 5) If you only need to set the vendor ID, please update the changelog
> > > to mention "vendor ID" specifically instead of the ambiguous "IDs".
> >
> > > 6) Please add a space before the closing "*/" of the first comment.
> > >
> > > 7) PCI_CLASS_BRIDGE_PCI is for a PCI-to-PCI bridge, i.e., one that has
> > > PCI on both the primary (upstream) side and the secondary (downstream)
> > > side. That kind of bridge has a type 1 config header (see
> > > PCI_HEADER_TYPE) and the PCI_PRIMARY_BUS and PCI_SECONDARY_BUS
> > > registers tell us the bus number of the primary and secondary sides.
> > >
> > > I don't believe this device is a PCI-to-PCI bridge. I think it's a
> > > *host* bridge that has some non-PCI interface on the upstream side and
> > > should have a type 0 config header. If that's the case you should use
> > > PCI_CLASS_BRIDGE_HOST instead.
> > >
> >
> > Thanks very much for your help with the review, I will fix the other
> > issue in the next version.
> >
> > > > }
> > > >
> > > > /* Assert all reset signals */
> > > > diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> > > > index ab20dc5..2480b0e 100644
> > > > --- a/include/linux/pci_ids.h
> > > > +++ b/include/linux/pci_ids.h
> > > > @@ -2113,6 +2113,8 @@
> > > >
> > > > #define PCI_VENDOR_ID_MYRICOM 0x14c1
> > > >
> > > > +#define PCI_VENDOR_ID_MEDIATEK 0x14c3
> > > > +
> > > > #define PCI_VENDOR_ID_TITAN 0x14D2
> > > > #define PCI_DEVICE_ID_TITAN_010L 0x8001
> > > > #define PCI_DEVICE_ID_TITAN_100L 0x8010
> > > > --
> > > > 2.6.4
> > > >
> >
> >

2018-01-03 12:14:29

by Lorenzo Pieralisi

[permalink] [raw]

Subject: Re: [PATCH v5 2/2] PCI: mediatek: Set up class type and vendor ID for MT7622

On Wed, Jan 03, 2018 at 02:39:04PM +0800, Honghui Zhang wrote:
> On Tue, 2018-01-02 at 10:56 +0000, Lorenzo Pieralisi wrote:
> > On Thu, Dec 28, 2017 at 09:39:12AM +0800, Honghui Zhang wrote:
> > > On Wed, 2017-12-27 at 12:45 -0600, Bjorn Helgaas wrote:
> > > > On Wed, Dec 27, 2017 at 08:59:54AM +0800, [email protected] wrote:
> > > > > From: Honghui Zhang <[email protected]>
> > > > >
>
> > > > > + /* Set up class code for MT7622 */
> > > > > + val = PCI_CLASS_BRIDGE_PCI << 16;
> > > > > + writel(val, port->base + PCIE_CONF_CLASS);
> > > >
> > > > 1) Your comments mention MT7622 specifically, but this code is run for
> > > > both mt2712-pcie and mt7622-pcie. If this code is safe and necessary
> > > > for both mt2712-pcie and mt7622-pcie, please remove the mention of
> > > > MT7622.
> > >
> > > Hmm, the code snippet added here will only be executed by MT7622, since
> > > MT2712 will not enter this "if (pcie->base) {" condition.
> > > Should the mention of MT7622 must be removed in this case?
> >
> > You should add an explicit way (eg of_device_is_compatible() match for
> > instance) to apply the quirk just on the platform that requires it.
> >
> > Checking for "if (pcie->base)" is really not the way to do it.
> >
>
> hi, Lorenzo,
> Thanks very much for your advise.
> Passing the compatible string or platform data into this function needed
> to add some new field in the struct mtk_pcie_port, then I guess both set
> it for MT2712 and MT7622 is an easy way, since re-setting those values
> for MT2712 is safe.

You have a pointer to the host bridge DT node through the
mtk_pcie_port.pcie->dev pointer, use it to carry out the compatible
match, you have to apply quirks only if needed - I do not want to
guess it.

Lorenzo

> > > > 2) The first comment mentions both "vendor ID and device ID" but you
> > > > don't write the device ID. Since this code applies to both
> > > > mt2712-pcie and mt7622-pcie, my guess is that you don't *want* to
> > > > write the device ID. If that's the case, please fix the comment.
> > > >
> > >
> > > My bad, I did not check the comments carefully.
> > > Thanks.
> > >
> > > > 3) If you only need to set the vendor ID, you're performing a 32-bit
> > > > write (writel()) to update a 16-bit value. Please use writew()
> > > > instead.
> > > >
> > >
> > > Ok, thanks, I guess I could use the following code snippet in the next
> > > version:
> > > val = readl(port->base + PCIE_CONF_VENDOR_ID)
> > > val &= ~GENMASK(15, 0);
> > > val |= PCI_VENDOR_ID_MEDIATEK;
> > > writel(val, port->base + PCIE_CONF_VENDOR_ID);
> >
> > Have you read Bjorn's comment ? Or there is a problem with using
> > a writew() ?
> >
>
> This control register is a 32bit register, I'm not sure whether the apb
> bus support write an 16bit value with 16bit but not 32bit address
> alignment. I prefer the more safety old way of writel.
>
> I need to do more test about the writew if the code elegant is more
> important.
>
> thanks.
>
> > Lorenzo
> >
> > > > 4) If you only need to set the vendor ID, please use a definition like
> > > > "PCIE_CONF_VENDOR_ID" instead of the ambiguous "PCIE_CONF_ID".
> > > >
> > > > 5) If you only need to set the vendor ID, please update the changelog
> > > > to mention "vendor ID" specifically instead of the ambiguous "IDs".
> > >
> > > > 6) Please add a space before the closing "*/" of the first comment.
> > > >
> > > > 7) PCI_CLASS_BRIDGE_PCI is for a PCI-to-PCI bridge, i.e., one that has
> > > > PCI on both the primary (upstream) side and the secondary (downstream)
> > > > side. That kind of bridge has a type 1 config header (see
> > > > PCI_HEADER_TYPE) and the PCI_PRIMARY_BUS and PCI_SECONDARY_BUS
> > > > registers tell us the bus number of the primary and secondary sides.
> > > >
> > > > I don't believe this device is a PCI-to-PCI bridge. I think it's a
> > > > *host* bridge that has some non-PCI interface on the upstream side and
> > > > should have a type 0 config header. If that's the case you should use
> > > > PCI_CLASS_BRIDGE_HOST instead.
> > > >
> > >
> > > Thanks very much for your help with the review, I will fix the other
> > > issue in the next version.
> > >
> > > > > }
> > > > >
> > > > > /* Assert all reset signals */
> > > > > diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> > > > > index ab20dc5..2480b0e 100644
> > > > > --- a/include/linux/pci_ids.h
> > > > > +++ b/include/linux/pci_ids.h
> > > > > @@ -2113,6 +2113,8 @@
> > > > >
> > > > > #define PCI_VENDOR_ID_MYRICOM 0x14c1
> > > > >
> > > > > +#define PCI_VENDOR_ID_MEDIATEK 0x14c3
> > > > > +
> > > > > #define PCI_VENDOR_ID_TITAN 0x14D2
> > > > > #define PCI_DEVICE_ID_TITAN_010L 0x8001
> > > > > #define PCI_DEVICE_ID_TITAN_100L 0x8010
> > > > > --
> > > > > 2.6.4
> > > > >
> > >
> > >
>
>

2018-01-04 18:39:47

by Lorenzo Pieralisi

[permalink] [raw]

Subject: Re: [PATCH v5 1/2] PCI: mediatek: Clear IRQ status after IRQ dispatched to avoid reentry

[+Marc]

On Wed, Dec 27, 2017 at 08:59:53AM +0800, [email protected] wrote:
> From: Honghui Zhang <[email protected]>
>
> There maybe a same IRQ reentry scenario after IRQ received in current
> IRQ handle flow:
> EP device PCIe host driver EP driver
> 1. issue an IRQ
> 2. received IRQ
> 3. clear IRQ status
> 4. dispatch IRQ
> 5. clear IRQ source
> The IRQ status was not successfully cleared at step 2 since the IRQ
> source was not cleared yet. So the PCIe host driver may receive the
> same IRQ after step 5. Then there's an IRQ reentry occurred.
> Even worse, if the reentry IRQ was not an IRQ that EP driver expected,
> it may not handle the IRQ. Then we may run into the infinite loop from
> step 2 to step 4.
> Clear the IRQ status after IRQ have been dispatched to avoid the IRQ
> reentry.
> This patch also fix another INTx IRQ issue by initialize the iterate
> before the loop. If an INTx IRQ re-occurred while we are dispatching
> the INTx IRQ, then iterate may start from PCI_NUM_INTX + INTX_SHIFT
> instead of INTX_SHIFT for the second time entering the
> for_each_set_bit_from() loop.

This looks like two different issues that should be fixed with two
patches.

> Signed-off-by: Honghui Zhang <[email protected]>
> Acked-by: Ryder Lee <[email protected]>
> ---
> drivers/pci/host/pcie-mediatek.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)

For the sake of uniformity, I first want to understand why this
driver does not call:

chained_irq_enter/exit()

in the primary handler (mtk_pcie_intr_handler()).

With the GIC as a primary interrupt controller we have not
even figured out how current code can actually work without
calling the chained_* API.

I want to come up with a consistent handling of IRQ domains for
all host bridges and any discrepancy should be explained.

> diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
> index db93efd..fc29a9a 100644
> --- a/drivers/pci/host/pcie-mediatek.c
> +++ b/drivers/pci/host/pcie-mediatek.c
> @@ -601,15 +601,16 @@ static irqreturn_t mtk_pcie_intr_handler(int irq, void *data)
> struct mtk_pcie_port *port = (struct mtk_pcie_port *)data;
> unsigned long status;
> u32 virq;
> - u32 bit = INTX_SHIFT;
> + u32 bit;
>
> while ((status = readl(port->base + PCIE_INT_STATUS)) & INTX_MASK) {
> + bit = INTX_SHIFT;
> for_each_set_bit_from(bit, &status, PCI_NUM_INTX + INTX_SHIFT) {
> - /* Clear the INTx */
> - writel(1 << bit, port->base + PCIE_INT_STATUS);
> virq = irq_find_mapping(port->irq_domain,
> bit - INTX_SHIFT);
> generic_handle_irq(virq);
> + /* Clear the INTx */
> + writel(1 << bit, port->base + PCIE_INT_STATUS);

I think that these masking/acking should actually be done through
the irq_chip hooks (see for instance pci-ftpci100.c) - that would
make this kind of bugs much easier to prevent (because the IRQ
layer does the sequencing for you).

Marc (CC'ed) has a more comprehensive view on this than me - I would
like to get to a point where all host bridges uses a consistent
approach for chained IRQ handling and I hope this bug fix can be
a starting point.

Thanks,
Lorenzo

> }
> }
>
> @@ -619,10 +620,10 @@ static irqreturn_t mtk_pcie_intr_handler(int irq, void *data)
>
> while ((imsi_status = readl(port->base + PCIE_IMSI_STATUS))) {
> for_each_set_bit(bit, &imsi_status, MTK_MSI_IRQS_NUM) {
> - /* Clear the MSI */
> - writel(1 << bit, port->base + PCIE_IMSI_STATUS);
> virq = irq_find_mapping(port->msi_domain, bit);
> generic_handle_irq(virq);
> + /* Clear the MSI */
> + writel(1 << bit, port->base + PCIE_IMSI_STATUS);
> }
> }
> /* Clear MSI interrupt status */
> --
> 2.6.4
>

2018-01-04 19:04:10

by Marc Zyngier

[permalink] [raw]

Subject: Re: [PATCH v5 1/2] PCI: mediatek: Clear IRQ status after IRQ dispatched to avoid reentry

On 04/01/18 18:40, Lorenzo Pieralisi wrote:
> [+Marc]
>
> On Wed, Dec 27, 2017 at 08:59:53AM +0800, [email protected] wrote:
>> From: Honghui Zhang <[email protected]>
>>
>> There maybe a same IRQ reentry scenario after IRQ received in current
>> IRQ handle flow:
>> EP device PCIe host driver EP driver
>> 1. issue an IRQ
>> 2. received IRQ
>> 3. clear IRQ status
>> 4. dispatch IRQ
>> 5. clear IRQ source
>> The IRQ status was not successfully cleared at step 2 since the IRQ
>> source was not cleared yet. So the PCIe host driver may receive the
>> same IRQ after step 5. Then there's an IRQ reentry occurred.
>> Even worse, if the reentry IRQ was not an IRQ that EP driver expected,
>> it may not handle the IRQ. Then we may run into the infinite loop from
>> step 2 to step 4.
>> Clear the IRQ status after IRQ have been dispatched to avoid the IRQ
>> reentry.
>> This patch also fix another INTx IRQ issue by initialize the iterate
>> before the loop. If an INTx IRQ re-occurred while we are dispatching
>> the INTx IRQ, then iterate may start from PCI_NUM_INTX + INTX_SHIFT
>> instead of INTX_SHIFT for the second time entering the
>> for_each_set_bit_from() loop.
>
> This looks like two different issues that should be fixed with two
> patches.
>
>> Signed-off-by: Honghui Zhang <[email protected]>
>> Acked-by: Ryder Lee <[email protected]>
>> ---
>> drivers/pci/host/pcie-mediatek.c | 11 ++++++-----
>> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> For the sake of uniformity, I first want to understand why this
> driver does not call:
>
> chained_irq_enter/exit()
>
> in the primary handler (mtk_pcie_intr_handler()).
>
> With the GIC as a primary interrupt controller we have not
> even figured out how current code can actually work without
> calling the chained_* API.
>
> I want to come up with a consistent handling of IRQ domains for
> all host bridges and any discrepancy should be explained.

That's because this driver is a huge hack, see below:

>
>> diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
>> index db93efd..fc29a9a 100644
>> --- a/drivers/pci/host/pcie-mediatek.c
>> +++ b/drivers/pci/host/pcie-mediatek.c
>> @@ -601,15 +601,16 @@ static irqreturn_t mtk_pcie_intr_handler(int irq, void *data)

This function is not a chained irqchip, but an interrupt handler...

>> struct mtk_pcie_port *port = (struct mtk_pcie_port *)data;
>> unsigned long status;
>> u32 virq;
>> - u32 bit = INTX_SHIFT;
>> + u32 bit;
>>
>> while ((status = readl(port->base + PCIE_INT_STATUS)) & INTX_MASK) {
>> + bit = INTX_SHIFT;
>> for_each_set_bit_from(bit, &status, PCI_NUM_INTX + INTX_SHIFT) {
>> - /* Clear the INTx */
>> - writel(1 << bit, port->base + PCIE_INT_STATUS);
>> virq = irq_find_mapping(port->irq_domain,
>> bit - INTX_SHIFT);
>> generic_handle_irq(virq);

and nonetheless, this calls into generic_handle_irq(). That's a complete
violation of the interrupt layering. Maybe there is a good reason for
it, but I'd like to know which one.

Which means that all of the ack/mask has to be done outside of the
irqchip framework too... Disgusting.

>> + /* Clear the INTx */
>> + writel(1 << bit, port->base + PCIE_INT_STATUS);
>
> I think that these masking/acking should actually be done through
> the irq_chip hooks (see for instance pci-ftpci100.c) - that would
> make this kind of bugs much easier to prevent (because the IRQ
> layer does the sequencing for you).

+1.

> Marc (CC'ed) has a more comprehensive view on this than me - I would
> like to get to a point where all host bridges uses a consistent
> approach for chained IRQ handling and I hope this bug fix can be
> a starting point.

+1 again. We definitely need to come up with some form of common
approach for all these host drivers, and maybe turn that into a library...

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-01-05 11:52:02

by Honghui Zhang

[permalink] [raw]

Subject: Re: [PATCH v5 1/2] PCI: mediatek: Clear IRQ status after IRQ dispatched to avoid reentry

On Thu, 2018-01-04 at 19:04 +0000, Marc Zyngier wrote:
> On 04/01/18 18:40, Lorenzo Pieralisi wrote:
> > [+Marc]
> >
> > On Wed, Dec 27, 2017 at 08:59:53AM +0800, [email protected] wrote:
> >> From: Honghui Zhang <[email protected]>
> >>
> >> There maybe a same IRQ reentry scenario after IRQ received in current
> >> IRQ handle flow:
> >> EP device PCIe host driver EP driver
> >> 1. issue an IRQ
> >> 2. received IRQ
> >> 3. clear IRQ status
> >> 4. dispatch IRQ
> >> 5. clear IRQ source
> >> The IRQ status was not successfully cleared at step 2 since the IRQ
> >> source was not cleared yet. So the PCIe host driver may receive the
> >> same IRQ after step 5. Then there's an IRQ reentry occurred.
> >> Even worse, if the reentry IRQ was not an IRQ that EP driver expected,
> >> it may not handle the IRQ. Then we may run into the infinite loop from
> >> step 2 to step 4.
> >> Clear the IRQ status after IRQ have been dispatched to avoid the IRQ
> >> reentry.
> >> This patch also fix another INTx IRQ issue by initialize the iterate
> >> before the loop. If an INTx IRQ re-occurred while we are dispatching
> >> the INTx IRQ, then iterate may start from PCI_NUM_INTX + INTX_SHIFT
> >> instead of INTX_SHIFT for the second time entering the
> >> for_each_set_bit_from() loop.
> >
> > This looks like two different issues that should be fixed with two
> > patches.

Ok, I split this into two patches and figure out a more reasonable
approach by using irq_chip solution.

> >
> >> Signed-off-by: Honghui Zhang <[email protected]>
> >> Acked-by: Ryder Lee <[email protected]>
> >> ---
> >> drivers/pci/host/pcie-mediatek.c | 11 ++++++-----
> >> 1 file changed, 6 insertions(+), 5 deletions(-)
> >
> > For the sake of uniformity, I first want to understand why this
> > driver does not call:
> >
> > chained_irq_enter/exit()
> >
> > in the primary handler (mtk_pcie_intr_handler()).
> >
> > With the GIC as a primary interrupt controller we have not
> > even figured out how current code can actually work without
> > calling the chained_* API.
> >
> > I want to come up with a consistent handling of IRQ domains for
> > all host bridges and any discrepancy should be explained.
>
> That's because this driver is a huge hack, see below:
>
> >
> >> diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
> >> index db93efd..fc29a9a 100644
> >> --- a/drivers/pci/host/pcie-mediatek.c
> >> +++ b/drivers/pci/host/pcie-mediatek.c
> >> @@ -601,15 +601,16 @@ static irqreturn_t mtk_pcie_intr_handler(int irq, void *data)
>
> This function is not a chained irqchip, but an interrupt handler...
>
> >> struct mtk_pcie_port *port = (struct mtk_pcie_port *)data;
> >> unsigned long status;
> >> u32 virq;
> >> - u32 bit = INTX_SHIFT;
> >> + u32 bit;
> >>
> >> while ((status = readl(port->base + PCIE_INT_STATUS)) & INTX_MASK) {
> >> + bit = INTX_SHIFT;
> >> for_each_set_bit_from(bit, &status, PCI_NUM_INTX + INTX_SHIFT) {
> >> - /* Clear the INTx */
> >> - writel(1 << bit, port->base + PCIE_INT_STATUS);
> >> virq = irq_find_mapping(port->irq_domain,
> >> bit - INTX_SHIFT);
> >> generic_handle_irq(virq);
>
> and nonetheless, this calls into generic_handle_irq(). That's a complete
> violation of the interrupt layering. Maybe there is a good reason for
> it, but I'd like to know which one.
>
> Which means that all of the ack/mask has to be done outside of the
> irqchip framework too... Disgusting.
>
> >> + /* Clear the INTx */
> >> + writel(1 << bit, port->base + PCIE_INT_STATUS);
> >
> > I think that these masking/acking should actually be done through
> > the irq_chip hooks (see for instance pci-ftpci100.c) - that would
> > make this kind of bugs much easier to prevent (because the IRQ
> > layer does the sequencing for you).
>
> +1.
>

Thanks for your advice, I need to do some homework to have a better
understanding of the irq_chip approach.

> > Marc (CC'ed) has a more comprehensive view on this than me - I would
> > like to get to a point where all host bridges uses a consistent
> > approach for chained IRQ handling and I hope this bug fix can be
> > a starting point.
>
> +1 again. We definitely need to come up with some form of common
> approach for all these host drivers, and maybe turn that into a library...
>

Well, this is beyond my knowledge now, I guess I can figure out how to
using irq_chip for the first step, then I may following this "common
approach" after we have a solution for that?

thanks.
> Thanks,
>
> M.

2018-01-05 17:42:34

by Marc Zyngier

[permalink] [raw]

Subject: Re: [PATCH v5 1/2] PCI: mediatek: Clear IRQ status after IRQ dispatched to avoid reentry

On 05/01/18 11:51, Honghui Zhang wrote:
> On Thu, 2018-01-04 at 19:04 +0000, Marc Zyngier wrote:
>> On 04/01/18 18:40, Lorenzo Pieralisi wrote:
>>> [+Marc]
>>>
>>> On Wed, Dec 27, 2017 at 08:59:53AM +0800, [email protected] wrote:
>>>> From: Honghui Zhang <[email protected]>
>>>>
>>>> There maybe a same IRQ reentry scenario after IRQ received in current
>>>> IRQ handle flow:
>>>> EP device PCIe host driver EP driver
>>>> 1. issue an IRQ
>>>> 2. received IRQ
>>>> 3. clear IRQ status
>>>> 4. dispatch IRQ
>>>> 5. clear IRQ source
>>>> The IRQ status was not successfully cleared at step 2 since the IRQ
>>>> source was not cleared yet. So the PCIe host driver may receive the
>>>> same IRQ after step 5. Then there's an IRQ reentry occurred.
>>>> Even worse, if the reentry IRQ was not an IRQ that EP driver expected,
>>>> it may not handle the IRQ. Then we may run into the infinite loop from
>>>> step 2 to step 4.
>>>> Clear the IRQ status after IRQ have been dispatched to avoid the IRQ
>>>> reentry.
>>>> This patch also fix another INTx IRQ issue by initialize the iterate
>>>> before the loop. If an INTx IRQ re-occurred while we are dispatching
>>>> the INTx IRQ, then iterate may start from PCI_NUM_INTX + INTX_SHIFT
>>>> instead of INTX_SHIFT for the second time entering the
>>>> for_each_set_bit_from() loop.
>>>
>>> This looks like two different issues that should be fixed with two
>>> patches.
>
> Ok, I split this into two patches and figure out a more reasonable
> approach by using irq_chip solution.
>
>>>
>>>> Signed-off-by: Honghui Zhang <[email protected]>
>>>> Acked-by: Ryder Lee <[email protected]>
>>>> ---
>>>> drivers/pci/host/pcie-mediatek.c | 11 ++++++-----
>>>> 1 file changed, 6 insertions(+), 5 deletions(-)
>>>
>>> For the sake of uniformity, I first want to understand why this
>>> driver does not call:
>>>
>>> chained_irq_enter/exit()
>>>
>>> in the primary handler (mtk_pcie_intr_handler()).
>>>
>>> With the GIC as a primary interrupt controller we have not
>>> even figured out how current code can actually work without
>>> calling the chained_* API.
>>>
>>> I want to come up with a consistent handling of IRQ domains for
>>> all host bridges and any discrepancy should be explained.
>>
>> That's because this driver is a huge hack, see below:
>>
>>>
>>>> diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
>>>> index db93efd..fc29a9a 100644
>>>> --- a/drivers/pci/host/pcie-mediatek.c
>>>> +++ b/drivers/pci/host/pcie-mediatek.c
>>>> @@ -601,15 +601,16 @@ static irqreturn_t mtk_pcie_intr_handler(int irq, void *data)
>>
>> This function is not a chained irqchip, but an interrupt handler...
>>
>>>> struct mtk_pcie_port *port = (struct mtk_pcie_port *)data;
>>>> unsigned long status;
>>>> u32 virq;
>>>> - u32 bit = INTX_SHIFT;
>>>> + u32 bit;
>>>>
>>>> while ((status = readl(port->base + PCIE_INT_STATUS)) & INTX_MASK) {
>>>> + bit = INTX_SHIFT;
>>>> for_each_set_bit_from(bit, &status, PCI_NUM_INTX + INTX_SHIFT) {
>>>> - /* Clear the INTx */
>>>> - writel(1 << bit, port->base + PCIE_INT_STATUS);
>>>> virq = irq_find_mapping(port->irq_domain,
>>>> bit - INTX_SHIFT);
>>>> generic_handle_irq(virq);
>>
>> and nonetheless, this calls into generic_handle_irq(). That's a complete
>> violation of the interrupt layering. Maybe there is a good reason for
>> it, but I'd like to know which one.
>>
>> Which means that all of the ack/mask has to be done outside of the
>> irqchip framework too... Disgusting.
>>
>>>> + /* Clear the INTx */
>>>> + writel(1 << bit, port->base + PCIE_INT_STATUS);
>>>
>>> I think that these masking/acking should actually be done through
>>> the irq_chip hooks (see for instance pci-ftpci100.c) - that would
>>> make this kind of bugs much easier to prevent (because the IRQ
>>> layer does the sequencing for you).
>>
>> +1.
>>
>
> Thanks for your advice, I need to do some homework to have a better
> understanding of the irq_chip approach.
>
>>> Marc (CC'ed) has a more comprehensive view on this than me - I would
>>> like to get to a point where all host bridges uses a consistent
>>> approach for chained IRQ handling and I hope this bug fix can be
>>> a starting point.
>>
>> +1 again. We definitely need to come up with some form of common
>> approach for all these host drivers, and maybe turn that into a library...
>>
>
> Well, this is beyond my knowledge now, I guess I can figure out how to
> using irq_chip for the first step, then I may following this "common
> approach" after we have a solution for that?

We can help you with that at a later time indeed. the urgent thing is to
fix this driver so that it does the right thing, and we can then look at
using a common approach for a number of them.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-03-16 11:23:33

by Lorenzo Pieralisi

[permalink] [raw]

Subject: Re: [PATCH v5 1/2] PCI: mediatek: Clear IRQ status after IRQ dispatched to avoid reentry

On Fri, Jan 05, 2018 at 07:51:47PM +0800, Honghui Zhang wrote:
> On Thu, 2018-01-04 at 19:04 +0000, Marc Zyngier wrote:
> > On 04/01/18 18:40, Lorenzo Pieralisi wrote:
> > > [+Marc]
> > >
> > > On Wed, Dec 27, 2017 at 08:59:53AM +0800, [email protected] wrote:
> > >> From: Honghui Zhang <[email protected]>
> > >>
> > >> There maybe a same IRQ reentry scenario after IRQ received in current
> > >> IRQ handle flow:
> > >> EP device PCIe host driver EP driver
> > >> 1. issue an IRQ
> > >> 2. received IRQ
> > >> 3. clear IRQ status
> > >> 4. dispatch IRQ
> > >> 5. clear IRQ source
> > >> The IRQ status was not successfully cleared at step 2 since the IRQ
> > >> source was not cleared yet. So the PCIe host driver may receive the
> > >> same IRQ after step 5. Then there's an IRQ reentry occurred.
> > >> Even worse, if the reentry IRQ was not an IRQ that EP driver expected,
> > >> it may not handle the IRQ. Then we may run into the infinite loop from
> > >> step 2 to step 4.
> > >> Clear the IRQ status after IRQ have been dispatched to avoid the IRQ
> > >> reentry.
> > >> This patch also fix another INTx IRQ issue by initialize the iterate
> > >> before the loop. If an INTx IRQ re-occurred while we are dispatching
> > >> the INTx IRQ, then iterate may start from PCI_NUM_INTX + INTX_SHIFT
> > >> instead of INTX_SHIFT for the second time entering the
> > >> for_each_set_bit_from() loop.
> > >
> > > This looks like two different issues that should be fixed with two
> > > patches.
>
> Ok, I split this into two patches and figure out a more reasonable
> approach by using irq_chip solution.

For the time being, I will mark this patch as "Changes Requested"
waiting for a new version.

Thanks,
Lorenzo

>
> > >
> > >> Signed-off-by: Honghui Zhang <[email protected]>
> > >> Acked-by: Ryder Lee <[email protected]>
> > >> ---
> > >> drivers/pci/host/pcie-mediatek.c | 11 ++++++-----
> > >> 1 file changed, 6 insertions(+), 5 deletions(-)
> > >
> > > For the sake of uniformity, I first want to understand why this
> > > driver does not call:
> > >
> > > chained_irq_enter/exit()
> > >
> > > in the primary handler (mtk_pcie_intr_handler()).
> > >
> > > With the GIC as a primary interrupt controller we have not
> > > even figured out how current code can actually work without
> > > calling the chained_* API.
> > >
> > > I want to come up with a consistent handling of IRQ domains for
> > > all host bridges and any discrepancy should be explained.
> >
> > That's because this driver is a huge hack, see below:
> >
> > >
> > >> diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
> > >> index db93efd..fc29a9a 100644
> > >> --- a/drivers/pci/host/pcie-mediatek.c
> > >> +++ b/drivers/pci/host/pcie-mediatek.c
> > >> @@ -601,15 +601,16 @@ static irqreturn_t mtk_pcie_intr_handler(int irq, void *data)
> >
> > This function is not a chained irqchip, but an interrupt handler...
> >
> > >> struct mtk_pcie_port *port = (struct mtk_pcie_port *)data;
> > >> unsigned long status;
> > >> u32 virq;
> > >> - u32 bit = INTX_SHIFT;
> > >> + u32 bit;
> > >>
> > >> while ((status = readl(port->base + PCIE_INT_STATUS)) & INTX_MASK) {
> > >> + bit = INTX_SHIFT;
> > >> for_each_set_bit_from(bit, &status, PCI_NUM_INTX + INTX_SHIFT) {
> > >> - /* Clear the INTx */
> > >> - writel(1 << bit, port->base + PCIE_INT_STATUS);
> > >> virq = irq_find_mapping(port->irq_domain,
> > >> bit - INTX_SHIFT);
> > >> generic_handle_irq(virq);
> >
> > and nonetheless, this calls into generic_handle_irq(). That's a complete
> > violation of the interrupt layering. Maybe there is a good reason for
> > it, but I'd like to know which one.
> >
> > Which means that all of the ack/mask has to be done outside of the
> > irqchip framework too... Disgusting.
> >
> > >> + /* Clear the INTx */
> > >> + writel(1 << bit, port->base + PCIE_INT_STATUS);
> > >
> > > I think that these masking/acking should actually be done through
> > > the irq_chip hooks (see for instance pci-ftpci100.c) - that would
> > > make this kind of bugs much easier to prevent (because the IRQ
> > > layer does the sequencing for you).
> >
> > +1.
> >
>
> Thanks for your advice, I need to do some homework to have a better
> understanding of the irq_chip approach.
>
> > > Marc (CC'ed) has a more comprehensive view on this than me - I would
> > > like to get to a point where all host bridges uses a consistent
> > > approach for chained IRQ handling and I hope this bug fix can be
> > > a starting point.
> >
> > +1 again. We definitely need to come up with some form of common
> > approach for all these host drivers, and maybe turn that into a library...
> >
>
> Well, this is beyond my knowledge now, I guess I can figure out how to
> using irq_chip for the first step, then I may following this "common
> approach" after we have a solution for that?
>
> thanks.
> > Thanks,
> >
> > M.
>
>

2018-03-16 12:15:16

by Lorenzo Pieralisi

[permalink] [raw]

Subject: Re: [PATCH v5 2/2] PCI: mediatek: Set up class type and vendor ID for MT7622

On Wed, Dec 27, 2017 at 12:45:42PM -0600, Bjorn Helgaas wrote:

[...]

> > + /* Set up class code for MT7622 */
> > + val = PCI_CLASS_BRIDGE_PCI << 16;
> > + writel(val, port->base + PCIE_CONF_CLASS);
>
> 1) Your comments mention MT7622 specifically, but this code is run for
> both mt2712-pcie and mt7622-pcie. If this code is safe and necessary
> for both mt2712-pcie and mt7622-pcie, please remove the mention of
> MT7622.
>
> 2) The first comment mentions both "vendor ID and device ID" but you
> don't write the device ID. Since this code applies to both
> mt2712-pcie and mt7622-pcie, my guess is that you don't *want* to
> write the device ID. If that's the case, please fix the comment.
>
> 3) If you only need to set the vendor ID, you're performing a 32-bit
> write (writel()) to update a 16-bit value. Please use writew()
> instead.
>
> 4) If you only need to set the vendor ID, please use a definition like
> "PCIE_CONF_VENDOR_ID" instead of the ambiguous "PCIE_CONF_ID".
>
> 5) If you only need to set the vendor ID, please update the changelog
> to mention "vendor ID" specifically instead of the ambiguous "IDs".
>
> 6) Please add a space before the closing "*/" of the first comment.
>
> 7) PCI_CLASS_BRIDGE_PCI is for a PCI-to-PCI bridge, i.e., one that has
> PCI on both the primary (upstream) side and the secondary (downstream)
> side. That kind of bridge has a type 1 config header (see
> PCI_HEADER_TYPE) and the PCI_PRIMARY_BUS and PCI_SECONDARY_BUS
> registers tell us the bus number of the primary and secondary sides.
>
> I don't believe this device is a PCI-to-PCI bridge. I think it's a
> *host* bridge that has some non-PCI interface on the upstream side and
> should have a type 0 config header. If that's the case you should use
> PCI_CLASS_BRIDGE_HOST instead.

I think these registers actually programme the root port register space
in the RC (whether real or fake - that depends on the RC design) so
the class is PCI_CLASS_BRIDGE_PCI, that's what most of host bridge
drivers in drivers/pci/host do.

I would like to get to the bottom of this since it is indeed misleading
(and I do not have HW to test it myself to check my understanding).

Thanks,
Lorenzo

2018-03-16 12:16:29

by Lorenzo Pieralisi

[permalink] [raw]