2020-12-15 10:28:54

by Yousaf Kaukab

Subject: dwc: tegra194: issue with card containing a bridge

Hi,
I am seeing an issue with next-20201211 with a USB3380[1]-based PCIe card
(vid:pid 10b5:3380) on Jetson AGX Xavier. The card doesn't show up in the
lspci output.

In the non-working case (next-20201211):
# lspci
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)

In the working case (v5.10-rc7):
# lspci
0001:00:00.0 PCI bridge: Molex Incorporated Device 1ad2 (rev a1)
0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)
0005:00:00.0 PCI bridge: Molex Incorporated Device 1ad0 (rev a1)
0005:01:00.0 PCI bridge: PLX Technology, Inc. Device 3380 (rev ab)
0005:02:02.0 PCI bridge: PLX Technology, Inc. Device 3380 (rev ab)
0005:03:00.0 USB controller: PLX Technology, Inc. Device 3380 (rev ab)
# lspci -t
-+-[0005:00]---00.0-[01-ff]----00.0-[02-03]----02.0-[03]----00.0
+-[0001:00]---00.0-[01-ff]----00.0
\-[0000:00]-
# lspci -v
https://paste.opensuse.org/87573209
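
For readers comparing the two listings: each lspci line starts with an address of the form domain:bus:device.function, all in hexadecimal, so the USB controller at 0005:03:00.0 sits on bus 03 of domain 0005. A minimal sketch of splitting such an address is shown here; `struct pci_addr` and `parse_pci_addr()` are hypothetical names for illustration and are not part of lspci or the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for one lspci-style address. */
struct pci_addr {
	unsigned int domain;
	unsigned int bus;
	unsigned int dev;
	unsigned int fn;
};

/*
 * Parse the leading "dddd:bb:dd.f" of an lspci line.
 * Returns true only if all four hexadecimal fields were found.
 */
static bool parse_pci_addr(const char *line, struct pci_addr *out)
{
	assert(line != NULL && out != NULL);

	return sscanf(line, "%x:%x:%x.%x",
		      &out->domain, &out->bus, &out->dev, &out->fn) == 4;
}
```

Parsing both listings this way makes it easy to see that, in the non-working case, nothing below bus 00 of domain 0005 was enumerated.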

git-bisect points to commit b9ac0f9dc8ea ("PCI: dwc: Move dw_pcie_setup_rc() to DWC common code").
dw_pcie_setup_rc() is not removed from pcie-tegra194.c in this commit.

Could the failure be caused by dw_pcie_setup_rc() now being called twice on tegra194?

BR,
Yousaf

[1]: https://www.broadcom.com/products/pcie-switches-bridges/usb-pci/usb-controllers/usb3380


2020-12-15 12:22:35

by Vidya Sagar

Subject: Re: dwc: tegra194: issue with card containing a bridge

Thanks, Mian, for bringing this to our notice.
Have you tried removing the dw_pcie_setup_rc(pp); call from
pcie-tegra194.c on top of linux-next? Does that solve the issue?

diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index 5597b2a49598..1c9e9c054592 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -907,7 +907,7 @@ static void tegra_pcie_prepare_host(struct pcie_port *pp)
 		dw_pcie_writel_dbi(pci, CFG_TIMER_CTRL_MAX_FUNC_NUM_OFF, val);
 	}

-	dw_pcie_setup_rc(pp);
+	//dw_pcie_setup_rc(pp);

 	clk_set_rate(pcie->core_clk, GEN4_CORE_CLK_FREQ);

I took a quick look at the dw_pcie_setup_rc() implementation and I'm not
sure why calling it a second time should create any issue for the
enumeration of devices behind a switch. Perhaps I need to spend more
time debugging that part.
In any case, since dw_pcie_setup_rc() is already part of
dw_pcie_host_init(), I think it can be removed from the
tegra_pcie_prepare_host() implementation.

Thanks,
Vidya Sagar

On 12/15/2020 3:54 PM, Mian Yousaf Kaukab wrote:
> Hi,
> I am seeing an issue with next-20201211 with USB3380[1] based PCIe card
> (vid:pid 10b5:3380) on Jetson AGX Xavier. Card doesn't show up in the
> lspci output.
>
> In non working case (next-20201211):
> # lspci
> 0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
> 0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)
> 0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
>
> In working case (v5.10-rc7):
> # lspci
> 0001:00:00.0 PCI bridge: Molex Incorporated Device 1ad2 (rev a1)
> 0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)
> 0005:00:00.0 PCI bridge: Molex Incorporated Device 1ad0 (rev a1)
> 0005:01:00.0 PCI bridge: PLX Technology, Inc. Device 3380 (rev ab)
> 0005:02:02.0 PCI bridge: PLX Technology, Inc. Device 3380 (rev ab)
> 0005:03:00.0 USB controller: PLX Technology, Inc. Device 3380 (rev ab)
> # lspci -t
> -+-[0005:00]---00.0-[01-ff]----00.0-[02-03]----02.0-[03]----00.0
> +-[0001:00]---00.0-[01-ff]----00.0
> \-[0000:00]-
> #lspci -v
> https://paste.opensuse.org/87573209
>
> git-bisect points to commit b9ac0f9dc8ea ("PCI: dwc: Move dw_pcie_setup_rc() to DWC common code").
> dw_pcie_setup_rc() is not removed from pcie-tegra194.c in this commit.
>
> Could the failure be caused because dw_pcie_setup_rc() is called twice now in case of tegra194?
>
> BR,
> Yousaf
>
> [1]: https://www.broadcom.com/products/pcie-switches-bridges/usb-pci/usb-controllers/usb3380
>

2020-12-15 13:30:11

by Yousaf Kaukab

Subject: Re: dwc: tegra194: issue with card containing a bridge

On Tue, Dec 15, 2020 at 05:45:59PM +0530, Vidya Sagar wrote:
> Thanks Mian for bringing it to our notice.
> Have you tried removing the dw_pcie_setup_rc(pp); call from pcie-tegra194.c
> file on top of linux-next? and does that solve the issue?
>
> diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c
> b/drivers/pci/controller/dwc/pcie-tegra194.c
> index 5597b2a49598..1c9e9c054592 100644
> --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> @@ -907,7 +907,7 @@ static void tegra_pcie_prepare_host(struct pcie_port
> *pp)
> dw_pcie_writel_dbi(pci, CFG_TIMER_CTRL_MAX_FUNC_NUM_OFF,
> val);
> }
>
> - dw_pcie_setup_rc(pp);
> + //dw_pcie_setup_rc(pp);
I still see the same issue with this change.
Reverting b9ac0f9dc8ea works though.
>
> clk_set_rate(pcie->core_clk, GEN4_CORE_CLK_FREQ);
>
> I took a quick look at the dw_pcie_setup_rc() implementation and I'm not
> sure why calling it second time should create any issue for the enumeration
> of devices behind a switch. Perhaps I need to spend more time to debug that
> part.
> In any case, since dw_pcie_setup_rc() is already part of
> dw_pcie_host_init(), I think it can be removed from
> tegra_pcie_prepare_host() implemention.
>
> Thanks,
> Vidya Sagar
>
BR,
Yousaf

2020-12-15 15:47:00

by Rob Herring

Subject: Re: dwc: tegra194: issue with card containing a bridge

On Tue, Dec 15, 2020 at 02:25:04PM +0100, Mian Yousaf Kaukab wrote:
> On Tue, Dec 15, 2020 at 05:45:59PM +0530, Vidya Sagar wrote:
> > Thanks Mian for bringing it to our notice.
> > Have you tried removing the dw_pcie_setup_rc(pp); call from pcie-tegra194.c
> > file on top of linux-next? and does that solve the issue?
> >
> > diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c
> > b/drivers/pci/controller/dwc/pcie-tegra194.c
> > index 5597b2a49598..1c9e9c054592 100644
> > --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> > +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> > @@ -907,7 +907,7 @@ static void tegra_pcie_prepare_host(struct pcie_port
> > *pp)
> > dw_pcie_writel_dbi(pci, CFG_TIMER_CTRL_MAX_FUNC_NUM_OFF,
> > val);
> > }
> >
> > - dw_pcie_setup_rc(pp);
> > + //dw_pcie_setup_rc(pp);
> I still see the same issue with this change.
> Reverting b9ac0f9dc8ea works though.
> >
> > clk_set_rate(pcie->core_clk, GEN4_CORE_CLK_FREQ);
> >
> > I took a quick look at the dw_pcie_setup_rc() implementation and I'm not
> > sure why calling it second time should create any issue for the enumeration
> > of devices behind a switch. Perhaps I need to spend more time to debug that
> > part.
> > In any case, since dw_pcie_setup_rc() is already part of
> > dw_pcie_host_init(), I think it can be removed from
> > tegra_pcie_prepare_host() implemention.

My guess is that the second call is making the link go down. Tegra was
odd in that its start/stop link functions don't do link handling, so I
didn't implement those functions and left the link handling in the Tegra
driver.

Can you try the patch below? It needs some more work, as it breaks
endpoint mode.

8<--------------------------------------------------------------------

diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index 648e731bccfa..49bb487b16ae 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -907,9 +907,32 @@ static void tegra_pcie_prepare_host(struct pcie_port *pp)
dw_pcie_writel_dbi(pci, CFG_TIMER_CTRL_MAX_FUNC_NUM_OFF, val);
}

- dw_pcie_setup_rc(pp);
-
clk_set_rate(pcie->core_clk, GEN4_CORE_CLK_FREQ);
+}
+
+static int tegra_pcie_dw_host_init(struct pcie_port *pp)
+{
+ pp->bridge->ops = &tegra_pci_ops;
+
+ tegra_pcie_prepare_host(pp);
+ tegra_pcie_enable_interrupts(pp);
+
+ return 0;
+}
+
+static int tegra_pcie_dw_link_up(struct dw_pcie *pci)
+{
+ struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
+ u32 val = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base + PCI_EXP_LNKSTA);
+
+ return !!(val & PCI_EXP_LNKSTA_DLLLA);
+}
+
+static int tegra_pcie_dw_start_link(struct dw_pcie *pci)
+{
+ u32 val, offset, speed, tmp;
+ struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
+ struct pcie_port *pp = &pci->pp;

/* Assert RST */
val = appl_readl(pcie, APPL_PINMUX);
@@ -929,17 +952,6 @@ static void tegra_pcie_prepare_host(struct pcie_port *pp)
appl_writel(pcie, val, APPL_PINMUX);

msleep(100);
-}
-
-static int tegra_pcie_dw_host_init(struct pcie_port *pp)
-{
- struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
- struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
- u32 val, tmp, offset, speed;
-
- pp->bridge->ops = &tegra_pci_ops;
-
- tegra_pcie_prepare_host(pp);

if (dw_pcie_wait_for_link(pci)) {
/*
@@ -975,7 +987,8 @@ static int tegra_pcie_dw_host_init(struct pcie_port *pp)
val &= ~PCI_DLF_EXCHANGE_ENABLE;
dw_pcie_writel_dbi(pci, offset, val);

- tegra_pcie_prepare_host(pp);
+ tegra_pcie_dw_host_init(pp);
+ dw_pcie_setup_rc(pp);

if (dw_pcie_wait_for_link(pci))
return 0;
@@ -985,25 +998,6 @@ static int tegra_pcie_dw_host_init(struct pcie_port *pp)
PCI_EXP_LNKSTA_CLS;
clk_set_rate(pcie->core_clk, pcie_gen_freq[speed - 1]);

- tegra_pcie_enable_interrupts(pp);
-
- return 0;
-}
-
-static int tegra_pcie_dw_link_up(struct dw_pcie *pci)
-{
- struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
- u32 val = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base + PCI_EXP_LNKSTA);
-
- return !!(val & PCI_EXP_LNKSTA_DLLLA);
-}
-
-static int tegra_pcie_dw_start_link(struct dw_pcie *pci)
-{
- struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
-
- enable_irq(pcie->pex_rst_irq);
-
return 0;
}
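
As an illustration of the register read that tegra_pcie_dw_link_up() and the speed lookup rely on, here is a standalone sketch of decoding a PCIe Link Status value. The mask values follow the standard Link Status register layout (the same values the kernel's pci_regs.h uses for PCI_EXP_LNKSTA_CLS, PCI_EXP_LNKSTA_NLW and PCI_EXP_LNKSTA_DLLLA); the `LNKSTA_*` macro names and the helper functions are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field masks of the PCIe Link Status register (local names). */
#define LNKSTA_CLS	0x000f	/* Current Link Speed (1 = Gen1, 2 = Gen2, ...) */
#define LNKSTA_NLW	0x03f0	/* Negotiated Link Width */
#define LNKSTA_NLW_SHIFT	4
#define LNKSTA_DLLLA	0x2000	/* Data Link Layer Link Active */

/* Same test as in the patch: the link counts as up when DLLLA is set. */
static bool link_is_up(uint16_t lnksta)
{
	return !!(lnksta & LNKSTA_DLLLA);
}

/* Negotiated speed as a generation number. */
static unsigned int link_speed_gen(uint16_t lnksta)
{
	return lnksta & LNKSTA_CLS;
}

/* Negotiated width in lanes. */
static unsigned int link_width(uint16_t lnksta)
{
	return (lnksta & LNKSTA_NLW) >> LNKSTA_NLW_SHIFT;
}

/*
 * Index into a per-generation table, in the style of
 * pcie_gen_freq[speed - 1]. The assert is an addition of this sketch:
 * a speed field of 0 would otherwise index before the table.
 */
static unsigned int gen_table_index(uint16_t lnksta)
{
	unsigned int speed = link_speed_gen(lnksta);

	assert(speed >= 1);
	return speed - 1;
}
```

For example, a Link Status value of 0x2043 decodes as link active, Gen3, x4.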

2020-12-15 19:48:30

by Rob Herring

Subject: Re: dwc: tegra194: issue with card containing a bridge

On Tue, Dec 15, 2020 at 09:41:47AM -0600, Rob Herring wrote:
> On Tue, Dec 15, 2020 at 02:25:04PM +0100, Mian Yousaf Kaukab wrote:
> > On Tue, Dec 15, 2020 at 05:45:59PM +0530, Vidya Sagar wrote:
> > > Thanks Mian for bringing it to our notice.
> > > Have you tried removing the dw_pcie_setup_rc(pp); call from pcie-tegra194.c
> > > file on top of linux-next? and does that solve the issue?
> > >
> > > diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c
> > > b/drivers/pci/controller/dwc/pcie-tegra194.c
> > > index 5597b2a49598..1c9e9c054592 100644
> > > --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> > > +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> > > @@ -907,7 +907,7 @@ static void tegra_pcie_prepare_host(struct pcie_port
> > > *pp)
> > > dw_pcie_writel_dbi(pci, CFG_TIMER_CTRL_MAX_FUNC_NUM_OFF,
> > > val);
> > > }
> > >
> > > - dw_pcie_setup_rc(pp);
> > > + //dw_pcie_setup_rc(pp);
> > I still see the same issue with this change.
> > Reverting b9ac0f9dc8ea works though.
> > >
> > > clk_set_rate(pcie->core_clk, GEN4_CORE_CLK_FREQ);
> > >
> > > I took a quick look at the dw_pcie_setup_rc() implementation and I'm not
> > > sure why calling it second time should create any issue for the enumeration
> > > of devices behind a switch. Perhaps I need to spend more time to debug that
> > > part.
> > > In any case, since dw_pcie_setup_rc() is already part of
> > > dw_pcie_host_init(), I think it can be removed from
> > > tegra_pcie_prepare_host() implemention.
>
> I think the 2nd time is making the link go down is my guess. Tegra was
> odd in that its start/stop link functions don't do link handling, so I
> didn't implement those functions and left the link handling in the Tegra
> driver.
>
> Can you try the below patch. It needs some more work as it breaks
> endpoint mode.

That one missed some re-init. Try this one instead:

8<--------------------------------------------------------------------

diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index 5597b2a49598..d8fed3561e91 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -933,14 +933,24 @@ static void tegra_pcie_prepare_host(struct pcie_port *pp)

static int tegra_pcie_dw_host_init(struct pcie_port *pp)
{
- struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
- struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
- u32 val, tmp, offset, speed;
-
pp->bridge->ops = &tegra_pci_ops;

tegra_pcie_prepare_host(pp);

+ return 0;
+}
+
+static int tegra_pcie_dw_start_link(struct dw_pcie *pci)
+{
+ u32 val, offset, speed, tmp;
+ struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
+ struct pcie_port *pp = &pci->pp;
+
+ if (pcie->mode == DW_PCIE_EP_TYPE) {
+ enable_irq(pcie->pex_rst_irq);
+ return 0;
+ }
+
if (dw_pcie_wait_for_link(pci)) {
/*
* There are some endpoints which can't get the link up if
@@ -998,15 +1008,6 @@ static int tegra_pcie_dw_link_up(struct dw_pcie *pci)
return !!(val & PCI_EXP_LNKSTA_DLLLA);
}

-static int tegra_pcie_dw_start_link(struct dw_pcie *pci)
-{
- struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
-
- enable_irq(pcie->pex_rst_irq);
-
- return 0;
-}
-
static void tegra_pcie_dw_stop_link(struct dw_pcie *pci)
{
struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);

2020-12-15 20:59:44

by Yousaf Kaukab

Subject: Re: dwc: tegra194: issue with card containing a bridge

On Tue, Dec 15, 2020 at 01:44:21PM -0600, Rob Herring wrote:
> On Tue, Dec 15, 2020 at 09:41:47AM -0600, Rob Herring wrote:
> > On Tue, Dec 15, 2020 at 02:25:04PM +0100, Mian Yousaf Kaukab wrote:
> > > On Tue, Dec 15, 2020 at 05:45:59PM +0530, Vidya Sagar wrote:
> > > > Thanks Mian for bringing it to our notice.
> > > > Have you tried removing the dw_pcie_setup_rc(pp); call from pcie-tegra194.c
> > > > file on top of linux-next? and does that solve the issue?
> > > >
> > > > diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > b/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > index 5597b2a49598..1c9e9c054592 100644
> > > > --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > @@ -907,7 +907,7 @@ static void tegra_pcie_prepare_host(struct pcie_port
> > > > *pp)
> > > > dw_pcie_writel_dbi(pci, CFG_TIMER_CTRL_MAX_FUNC_NUM_OFF,
> > > > val);
> > > > }
> > > >
> > > > - dw_pcie_setup_rc(pp);
> > > > + //dw_pcie_setup_rc(pp);
> > > I still see the same issue with this change.
> > > Reverting b9ac0f9dc8ea works though.
> > > >
> > > > clk_set_rate(pcie->core_clk, GEN4_CORE_CLK_FREQ);
> > > >
> > > > I took a quick look at the dw_pcie_setup_rc() implementation and I'm not
> > > > sure why calling it second time should create any issue for the enumeration
> > > > of devices behind a switch. Perhaps I need to spend more time to debug that
> > > > part.
> > > > In any case, since dw_pcie_setup_rc() is already part of
> > > > dw_pcie_host_init(), I think it can be removed from
> > > > tegra_pcie_prepare_host() implemention.
> >
> > I think the 2nd time is making the link go down is my guess. Tegra was
> > odd in that its start/stop link functions don't do link handling, so I
> > didn't implement those functions and left the link handling in the Tegra
> > driver.
> >
> > Can you try the below patch. It needs some more work as it breaks
> > endpoint mode.
>
> That one missed some re-init. Try this one instead
>
> 8<--------------------------------------------------------------------
>
> diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
> index 5597b2a49598..d8fed3561e91 100644
> --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> @@ -933,14 +933,24 @@ static void tegra_pcie_prepare_host(struct pcie_port *pp)
>
> static int tegra_pcie_dw_host_init(struct pcie_port *pp)
> {
> - struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
> - struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
> - u32 val, tmp, offset, speed;
> -
> pp->bridge->ops = &tegra_pci_ops;
>
> tegra_pcie_prepare_host(pp);
>
> + return 0;
> +}
> +
> +static int tegra_pcie_dw_start_link(struct dw_pcie *pci)
> +{
> + u32 val, offset, speed, tmp;
> + struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
> + struct pcie_port *pp = &pci->pp;
> +
> + if (pcie->mode == DW_PCIE_EP_TYPE) {
> + enable_irq(pcie->pex_rst_irq);
> + return 0;
> + }
> +
> if (dw_pcie_wait_for_link(pci)) {
> /*
> * There are some endpoints which can't get the link up if
> @@ -998,15 +1008,6 @@ static int tegra_pcie_dw_link_up(struct dw_pcie *pci)
> return !!(val & PCI_EXP_LNKSTA_DLLLA);
> }
>
> -static int tegra_pcie_dw_start_link(struct dw_pcie *pci)
> -{
> - struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
> -
> - enable_irq(pcie->pex_rst_irq);
> -
> - return 0;
> -}
> -
> static void tegra_pcie_dw_stop_link(struct dw_pcie *pci)
> {
> struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
With the USB3380 card, "PTM enabled..." is the last message I see. It
doesn't boot any further:
[ 9.124500] tegra194-pcie 141a0000.pcie: iATU unroll: enabled
[ 9.130310] tegra194-pcie 141a0000.pcie: Detected iATU regions: 8 outbound, 2 inbound
[ 9.138915] tegra194-pcie 141a0000.pcie: Link up
[ 9.144940] tegra194-pcie 141a0000.pcie: PCI host bridge to bus 0005:00
[ 9.151849] pci_bus 0005:00: root bus resource [bus 00-ff]
[ 9.157595] pci_bus 0005:00: root bus resource [mem 0x1c00000000-0x1f3fffffff pref]
[ 9.165509] pci_bus 0005:00: root bus resource [mem 0x1f40000000-0x1ffffeffff] (bus address [0x40000000-0xfffeffff])
[ 9.176444] pci_bus 0005:00: root bus resource [io 0x30000-0x3ffff] (bus address [0x0000-0xffff])
[ 9.186189] pci 0005:00:00.0: [10de:1ad0] type 01 class 0x060400
[ 9.193638] pci 0005:00:00.0: PME# supported from D0 D3hot D3cold
[ 9.200626] pci 0005:00:00.0: PTM enabled (root), 16ns granularity

With an Nvidia GT530 it boots to the prompt:
[ 9.133576] tegra194-pcie 141a0000.pcie: iATU unroll: enabled
[ 9.139371] tegra194-pcie 141a0000.pcie: Detected iATU regions: 8 outbound, 2 inbound
[ 10.152565] tegra194-pcie 141a0000.pcie: Phy link never came up
[ 11.164571] tegra194-pcie 141a0000.pcie: Phy link never came up
[ 11.172598] tegra194-pcie 141a0000.pcie: PCI host bridge to bus 0005:00
[ 11.179999] pci_bus 0005:00: root bus resource [bus 00-ff]
[ 11.185998] pci_bus 0005:00: root bus resource [mem 0x1c00000000-0x1f3fffffff pref]
[ 11.194120] pci_bus 0005:00: root bus resource [mem 0x1f40000000-0x1ffffeffff] (bus address [0x40000000-0xfffeffff])
[ 11.205339] pci_bus 0005:00: root bus resource [io 0x30000-0x3ffff] (bus address [0x0000-0xffff])
[ 11.215464] pci 0005:00:00.0: [10de:1ad0] type 01 class 0x060400
[ 11.224039] pci 0005:00:00.0: PME# supported from D0 D3hot D3cold
[ 11.231086] pci 0005:00:00.0: PTM enabled (root), 16ns granularity
[ 11.268132] pci 0005:00:00.0: PCI bridge to [bus 01-ff]
[ 11.275849] pcieport 0005:00:00.0: PME: Signaling with IRQ 54
[ 11.284553] pcieport 0005:00:00.0: AER: enabled with IRQ 54
[ 11.291322] pcieport 0005:00:00.0: bw_notification: enabled with IRQ 54
[ 11.299108] pci_bus 0005:01: busn_res: [bus 01-ff] is released
[ 11.305119] pci_bus 0005:00: busn_res: [bus 00-ff] is released
...
However, lspci doesn't list the GT530. Before the patch, the GT530 was
working properly.

BR,
Yousaf

2020-12-17 15:01:13

by Rob Herring

Subject: Re: dwc: tegra194: issue with card containing a bridge

On Tue, Dec 15, 2020 at 09:52:35PM +0100, Mian Yousaf Kaukab wrote:
> On Tue, Dec 15, 2020 at 09:41:47AM -0600, Rob Herring wrote:
> > On Tue, Dec 15, 2020 at 02:25:04PM +0100, Mian Yousaf Kaukab wrote:
> > > On Tue, Dec 15, 2020 at 05:45:59PM +0530, Vidya Sagar wrote:
> > > > Thanks Mian for bringing it to our notice.
> > > > Have you tried removing the dw_pcie_setup_rc(pp); call from pcie-tegra194.c
> > > > file on top of linux-next? and does that solve the issue?
> > > >
> > > > diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > b/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > index 5597b2a49598..1c9e9c054592 100644
> > > > --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > @@ -907,7 +907,7 @@ static void tegra_pcie_prepare_host(struct pcie_port
> > > > *pp)
> > > > dw_pcie_writel_dbi(pci, CFG_TIMER_CTRL_MAX_FUNC_NUM_OFF,
> > > > val);
> > > > }
> > > >
> > > > - dw_pcie_setup_rc(pp);
> > > > + //dw_pcie_setup_rc(pp);
> > > I still see the same issue with this change.
> > > Reverting b9ac0f9dc8ea works though.
> > > >
> > > > clk_set_rate(pcie->core_clk, GEN4_CORE_CLK_FREQ);
> > > >
> > > > I took a quick look at the dw_pcie_setup_rc() implementation and I'm not
> > > > sure why calling it second time should create any issue for the enumeration
> > > > of devices behind a switch. Perhaps I need to spend more time to debug that
> > > > part.
> > > > In any case, since dw_pcie_setup_rc() is already part of
> > > > dw_pcie_host_init(), I think it can be removed from
> > > > tegra_pcie_prepare_host() implemention.
> >
> > I think the 2nd time is making the link go down is my guess. Tegra was
> > odd in that its start/stop link functions don't do link handling, so I
> > didn't implement those functions and left the link handling in the Tegra
> > driver.
> >
> > Can you try the below patch. It needs some more work as it breaks
> > endpoint mode.

[...]

> Boot is ok with this patch. Some improvement in lspci as well:

Some improvement? Meaning it's still not completely working?

> # lspci
> 0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
> 0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)
> 0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
> 0005:01:00.0 PCI bridge: PLX Technology, Inc. Device 3380 (rev ab)

This patch was closer to the original flow, but would not have worked if
the DLFE-disabled mode was needed.

Please give this patch a try:

8<--------------------------------------------------------
diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index 5597b2a49598..0515897b2f3a 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -853,12 +853,14 @@ static void config_gen3_gen4_eq_presets(struct tegra_pcie_dw *pcie)
dw_pcie_writel_dbi(pci, GEN3_RELATED_OFF, val);
}

-static void tegra_pcie_prepare_host(struct pcie_port *pp)
+static int tegra_pcie_dw_host_init(struct pcie_port *pp)
{
struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
u32 val;

+ pp->bridge->ops = &tegra_pci_ops;
+
if (!pcie->pcie_cap_base)
pcie->pcie_cap_base = dw_pcie_find_capability(&pcie->pci,
PCI_CAP_ID_EXP);
@@ -907,10 +909,24 @@ static void tegra_pcie_prepare_host(struct pcie_port *pp)
dw_pcie_writel_dbi(pci, CFG_TIMER_CTRL_MAX_FUNC_NUM_OFF, val);
}

- dw_pcie_setup_rc(pp);
-
clk_set_rate(pcie->core_clk, GEN4_CORE_CLK_FREQ);

+ return 0;
+}
+
+static int tegra_pcie_dw_start_link(struct dw_pcie *pci)
+{
+ u32 val, offset, speed, tmp;
+ struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
+ struct pcie_port *pp = &pci->pp;
+ bool retry = true;
+
+ if (pcie->mode == DW_PCIE_EP_TYPE) {
+ enable_irq(pcie->pex_rst_irq);
+ return 0;
+ }
+
+retry_link:
/* Assert RST */
val = appl_readl(pcie, APPL_PINMUX);
val &= ~APPL_PINMUX_PEX_RST;
@@ -929,19 +945,10 @@ static void tegra_pcie_prepare_host(struct pcie_port *pp)
appl_writel(pcie, val, APPL_PINMUX);

msleep(100);
-}
-
-static int tegra_pcie_dw_host_init(struct pcie_port *pp)
-{
- struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
- struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
- u32 val, tmp, offset, speed;
-
- pp->bridge->ops = &tegra_pci_ops;
-
- tegra_pcie_prepare_host(pp);

if (dw_pcie_wait_for_link(pci)) {
+ if (!retry)
+ return 0;
/*
* There are some endpoints which can't get the link up if
* root port has Data Link Feature (DLF) enabled.
@@ -975,10 +982,11 @@ static int tegra_pcie_dw_host_init(struct pcie_port *pp)
val &= ~PCI_DLF_EXCHANGE_ENABLE;
dw_pcie_writel_dbi(pci, offset, val);

- tegra_pcie_prepare_host(pp);
+ tegra_pcie_dw_host_init(pp);
+ dw_pcie_setup_rc(pp);

- if (dw_pcie_wait_for_link(pci))
- return 0;
+ retry = false;
+ goto retry_link;
}

speed = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base + PCI_EXP_LNKSTA) &
@@ -998,15 +1006,6 @@ static int tegra_pcie_dw_link_up(struct dw_pcie *pci)
return !!(val & PCI_EXP_LNKSTA_DLLLA);
}

-static int tegra_pcie_dw_start_link(struct dw_pcie *pci)
-{
- struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
-
- enable_irq(pcie->pex_rst_irq);
-
- return 0;
-}
-
static void tegra_pcie_dw_stop_link(struct dw_pcie *pci)
{
struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
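
The control flow this patch gives tegra_pcie_dw_start_link() can be sketched outside the kernel as follows: toggle reset and wait for the link, and if it does not come up, disable DLF once and go through the same reset-and-wait path a second time. This is a simplified model under stated assumptions, not the driver code: `struct fake_port` and the `fake_*` helpers are hypothetical stand-ins, the re-initialisation calls are omitted, and the sketch returns a bool so the outcome is observable, whereas the patch returns 0 in both cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for controller and peer state. */
struct fake_port {
	bool dlf_enabled;	/* Data Link Feature exchange enabled */
	bool peer_needs_no_dlf;	/* peer only links up with DLF disabled */
	bool peer_present;	/* something is plugged into the slot */
	int reset_cycles;	/* how many times reset was toggled */
};

/* Stand-in for asserting and de-asserting reset around LTSSM enable. */
static void fake_toggle_reset(struct fake_port *p)
{
	p->reset_cycles++;
}

/* Stand-in for dw_pcie_wait_for_link(): 0 means the link came up. */
static int fake_wait_for_link(const struct fake_port *p)
{
	if (!p->peer_present)
		return -1;
	if (p->peer_needs_no_dlf && p->dlf_enabled)
		return -1;
	return 0;
}

/* One normal attempt, then at most one retry with DLF disabled. */
static bool fake_start_link(struct fake_port *p)
{
	bool retry = true;

	assert(p != NULL);

retry_link:
	fake_toggle_reset(p);

	if (fake_wait_for_link(p)) {
		if (!retry)
			return false;	/* second failure: give up */

		p->dlf_enabled = false;
		retry = false;
		goto retry_link;
	}

	return true;
}
```

The point of the `retry` flag is that the second attempt reuses exactly the same reset sequence as the first, instead of duplicating it in a separate helper as the earlier tegra_pcie_prepare_host() did.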

2020-12-17 17:08:30

by Yousaf Kaukab

Subject: Re: dwc: tegra194: issue with card containing a bridge

On Thu, Dec 17, 2020 at 08:58:57AM -0600, Rob Herring wrote:
> On Tue, Dec 15, 2020 at 09:52:35PM +0100, Mian Yousaf Kaukab wrote:
> > On Tue, Dec 15, 2020 at 09:41:47AM -0600, Rob Herring wrote:
> > > On Tue, Dec 15, 2020 at 02:25:04PM +0100, Mian Yousaf Kaukab wrote:
> > > > On Tue, Dec 15, 2020 at 05:45:59PM +0530, Vidya Sagar wrote:
> > > > > Thanks Mian for bringing it to our notice.
> > > > > Have you tried removing the dw_pcie_setup_rc(pp); call from pcie-tegra194.c
> > > > > file on top of linux-next? and does that solve the issue?
> > > > >
> > > > > diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > > b/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > > index 5597b2a49598..1c9e9c054592 100644
> > > > > --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > > +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > > @@ -907,7 +907,7 @@ static void tegra_pcie_prepare_host(struct pcie_port
> > > > > *pp)
> > > > > dw_pcie_writel_dbi(pci, CFG_TIMER_CTRL_MAX_FUNC_NUM_OFF,
> > > > > val);
> > > > > }
> > > > >
> > > > > - dw_pcie_setup_rc(pp);
> > > > > + //dw_pcie_setup_rc(pp);
> > > > I still see the same issue with this change.
> > > > Reverting b9ac0f9dc8ea works though.
> > > > >
> > > > > clk_set_rate(pcie->core_clk, GEN4_CORE_CLK_FREQ);
> > > > >
> > > > > I took a quick look at the dw_pcie_setup_rc() implementation and I'm not
> > > > > sure why calling it second time should create any issue for the enumeration
> > > > > of devices behind a switch. Perhaps I need to spend more time to debug that
> > > > > part.
> > > > > In any case, since dw_pcie_setup_rc() is already part of
> > > > > dw_pcie_host_init(), I think it can be removed from
> > > > > tegra_pcie_prepare_host() implemention.
> > >
> > > I think the 2nd time is making the link go down is my guess. Tegra was
> > > odd in that its start/stop link functions don't do link handling, so I
> > > didn't implement those functions and left the link handling in the Tegra
> > > driver.
> > >
> > > Can you try the below patch. It needs some more work as it breaks
> > > endpoint mode.
>
> [...]
>
> > Boot is ok with this patch. Some improvement in lspci as well:
>
> Some improvement? Meaning not completely working still?
>
> > # lspci
> > 0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
> > 0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)
> > 0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
> > 0005:01:00.0 PCI bridge: PLX Technology, Inc. Device 3380 (rev ab)
>
> This patch was closer to the original flow, but would not have worked if
> DLFE disabled mode was needed.
>
> Please give this patch a try:
Thank you for the patch! Initial results with it look very promising.
I’ll get back to you tomorrow after running a few more tests.

BR,
Yousaf

2020-12-18 10:15:28

by Yousaf Kaukab

Subject: Re: dwc: tegra194: issue with card containing a bridge

On Thu, Dec 17, 2020 at 06:06:35PM +0100, Mian Yousaf Kaukab wrote:
> On Thu, Dec 17, 2020 at 08:58:57AM -0600, Rob Herring wrote:
> > On Tue, Dec 15, 2020 at 09:52:35PM +0100, Mian Yousaf Kaukab wrote:
> > > On Tue, Dec 15, 2020 at 09:41:47AM -0600, Rob Herring wrote:
> > > > On Tue, Dec 15, 2020 at 02:25:04PM +0100, Mian Yousaf Kaukab wrote:
> > > > > On Tue, Dec 15, 2020 at 05:45:59PM +0530, Vidya Sagar wrote:
> > > > > > Thanks Mian for bringing it to our notice.
> > > > > > Have you tried removing the dw_pcie_setup_rc(pp); call from pcie-tegra194.c
> > > > > > file on top of linux-next? and does that solve the issue?
> > > > > >
> > > > > > diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > > > b/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > > > index 5597b2a49598..1c9e9c054592 100644
> > > > > > --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > > > +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> > > > > > @@ -907,7 +907,7 @@ static void tegra_pcie_prepare_host(struct pcie_port
> > > > > > *pp)
> > > > > > dw_pcie_writel_dbi(pci, CFG_TIMER_CTRL_MAX_FUNC_NUM_OFF,
> > > > > > val);
> > > > > > }
> > > > > >
> > > > > > - dw_pcie_setup_rc(pp);
> > > > > > + //dw_pcie_setup_rc(pp);
> > > > > I still see the same issue with this change.
> > > > > Reverting b9ac0f9dc8ea works though.
> > > > > >
> > > > > > clk_set_rate(pcie->core_clk, GEN4_CORE_CLK_FREQ);
> > > > > >
> > > > > > I took a quick look at the dw_pcie_setup_rc() implementation and I'm not
> > > > > > sure why calling it second time should create any issue for the enumeration
> > > > > > of devices behind a switch. Perhaps I need to spend more time to debug that
> > > > > > part.
> > > > > > In any case, since dw_pcie_setup_rc() is already part of
> > > > > > dw_pcie_host_init(), I think it can be removed from
> > > > > > tegra_pcie_prepare_host() implemention.
> > > >
> > > > I think the 2nd time is making the link go down is my guess. Tegra was
> > > > odd in that its start/stop link functions don't do link handling, so I
> > > > didn't implement those functions and left the link handling in the Tegra
> > > > driver.
> > > >
> > > > Can you try the below patch. It needs some more work as it breaks
> > > > endpoint mode.
> >
> > [...]
> >
> > > Boot is ok with this patch. Some improvement in lspci as well:
> >
> > Some improvement? Meaning not completely working still?
> >
> > > # lspci
> > > 0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
> > > 0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)
> > > 0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
> > > 0005:01:00.0 PCI bridge: PLX Technology, Inc. Device 3380 (rev ab)
> >
> > This patch was closer to the original flow, but would not have worked if
> > DLFE disabled mode was needed.
> >
> > Please give this patch a try:
> Thank you for the patch! Initial results with it looks very promising.
> I’ll get back to you tomorrow after running a few more tests.
Rob, thank you for your efforts! This patch fixed the issue I was seeing. FWIW:

Tested-by: Mian Yousaf Kaukab <[email protected]>

BR,
Yousaf