2022-11-07 08:46:41

by Nathan Rossi

Subject: [PATCH] PCI: mvebu: Set Target Link Speed for 2.5GT downstream devices

From: Nathan Rossi <[email protected]>

There is a known issue with the mvebu PCIe controller when triggering
retraining of the link (via Link Control) where the link is dropped
completely causing significant delay in the renegotiation of the link.
This occurs only when the downstream device is 2.5GT and the upstream
port is configured to support both 2.5GT and 5GT.

It is possible to prevent this link dropping by setting the associated
link speed in Target Link Speed of the Link Control 2 register. This
only needs to be done when the downstream is specifically 2.5GT.

This change applies the required Target Link Speed value during
mvebu_pcie_setup_hw conditionally depending on the current link speed
from the Link Status register, only applying the change when the link
is configured to 2.5GT already.

Signed-off-by: Nathan Rossi <[email protected]>
---
drivers/pci/controller/pci-mvebu.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
index 1ced73726a..6a869a33ba 100644
--- a/drivers/pci/controller/pci-mvebu.c
+++ b/drivers/pci/controller/pci-mvebu.c
@@ -248,7 +248,7 @@ static void mvebu_pcie_setup_wins(struct mvebu_pcie_port *port)

static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port)
{
- u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl;
+ u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl, lnksta, lnkctl2;

/* Setup PCIe controller to Root Complex mode. */
ctrl = mvebu_readl(port, PCIE_CTRL_OFF);
@@ -339,6 +339,22 @@ static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port)
unmask |= PCIE_INT_INTX(0) | PCIE_INT_INTX(1) |
PCIE_INT_INTX(2) | PCIE_INT_INTX(3);
mvebu_writel(port, unmask, PCIE_INT_UNMASK_OFF);
+
+ /*
+ * Set Target Link Speed within the Link Control 2 register when the
+ * linked downstream device is connected at 2.5GT. This is configured
+ * in order to avoid issues with the controller when the upstream port
+ * is configured to support 2.5GT and 5GT and the downstream device is
+ * linked at 2.5GT, retraining the link in this case causes the link to
+ * drop taking significant time to retrain.
+ */
+ lnksta = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL) >> 16;
+ if ((lnksta & PCI_EXP_LNKSTA_CLS) == PCI_EXP_LNKSTA_CLS_2_5GB) {
+ lnkctl2 = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
+ lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS;
+ lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
+ mvebu_writel(port, lnkctl2, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
+ }
}

static struct mvebu_pcie_port *mvebu_pcie_find_port(struct mvebu_pcie *pcie,
---
2.37.2


2022-11-07 09:20:38

by Pali Rohár

Subject: Re: [PATCH] PCI: mvebu: Set Target Link Speed for 2.5GT downstream devices

On Monday 07 November 2022 08:13:27 Nathan Rossi wrote:
> From: Nathan Rossi <[email protected]>
>
> There is a known issue with the mvebu PCIe controller when triggering
> retraining of the link (via Link Control) where the link is dropped
> completely causing significant delay in the renegotiation of the link.
> This occurs only when the downstream device is 2.5GT and the upstream
> port is configured to support both 2.5GT and 5GT.
>
> It is possible to prevent this link dropping by setting the associated
> link speed in Target Link Speed of the Link Control 2 register. This
> only needs to be done when the downstream is specifically 2.5GT.
>
> This change applies the required Target Link Speed value during
> mvebu_pcie_setup_hw conditionally depending on the current link speed
> from the Link Status register, only applying the change when the link
> is configured to 2.5GT already.
>
> Signed-off-by: Nathan Rossi <[email protected]>
> ---
> drivers/pci/controller/pci-mvebu.c | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
> index 1ced73726a..6a869a33ba 100644
> --- a/drivers/pci/controller/pci-mvebu.c
> +++ b/drivers/pci/controller/pci-mvebu.c
> @@ -248,7 +248,7 @@ static void mvebu_pcie_setup_wins(struct mvebu_pcie_port *port)
>
> static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port)
> {
> - u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl;
> + u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl, lnksta, lnkctl2;
>
> /* Setup PCIe controller to Root Complex mode. */
> ctrl = mvebu_readl(port, PCIE_CTRL_OFF);
> @@ -339,6 +339,22 @@ static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port)
> unmask |= PCIE_INT_INTX(0) | PCIE_INT_INTX(1) |
> PCIE_INT_INTX(2) | PCIE_INT_INTX(3);
> mvebu_writel(port, unmask, PCIE_INT_UNMASK_OFF);
> +
> + /*
> + * Set Target Link Speed within the Link Control 2 register when the
> + * linked downstream device is connected at 2.5GT. This is configured
> + * in order to avoid issues with the controller when the upstream port
> + * is configured to support 2.5GT and 5GT and the downstream device is
> + * linked at 2.5GT, retraining the link in this case causes the link to
> + * drop taking significant time to retrain.
> + */
> + lnksta = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL) >> 16;
> + if ((lnksta & PCI_EXP_LNKSTA_CLS) == PCI_EXP_LNKSTA_CLS_2_5GB) {

This code does not work because at this stage the endpoint device may
not be ready yet, and therefore the link may not be established.

Also, this code does not run when the kernel issues a PCIe Hot Reset
via the PCI Secondary Bus Reset bit.

And it does not handle a possible hot-plug situation.

The check and the code below have to be done _after_ the kernel
enumerates the device. The PCI core code already has logic to handle
delays for "slow" devices.

And the reverse operation (setting the lnkctl2 target speed back to
its original value) has to be done after the device is unplugged -
when the link goes down.

If you want to work on this, I can try to find the notes I made while
investigating this issue... including where the best place in the
kernel PCI core code is for handling it.

> + lnkctl2 = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
> + lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS;
> + lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
> + mvebu_writel(port, lnkctl2, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
> + }
> }
>
> static struct mvebu_pcie_port *mvebu_pcie_find_port(struct mvebu_pcie *pcie,
> ---
> 2.37.2

2022-11-07 10:11:42

by Nathan Rossi

Subject: Re: [PATCH] PCI: mvebu: Set Target Link Speed for 2.5GT downstream devices

On Mon, 7 Nov 2022 at 18:43, Pali Rohár <[email protected]> wrote:
>
> On Monday 07 November 2022 08:13:27 Nathan Rossi wrote:
> > From: Nathan Rossi <[email protected]>
> >
> > There is a known issue with the mvebu PCIe controller when triggering
> > retraining of the link (via Link Control) where the link is dropped
> > completely causing significant delay in the renegotiation of the link.
> > This occurs only when the downstream device is 2.5GT and the upstream
> > port is configured to support both 2.5GT and 5GT.
> >
> > It is possible to prevent this link dropping by setting the associated
> > link speed in Target Link Speed of the Link Control 2 register. This
> > only needs to be done when the downstream is specifically 2.5GT.
> >
> > This change applies the required Target Link Speed value during
> > mvebu_pcie_setup_hw conditionally depending on the current link speed
> > from the Link Status register, only applying the change when the link
> > is configured to 2.5GT already.
> >
> > Signed-off-by: Nathan Rossi <[email protected]>
> > ---
> > drivers/pci/controller/pci-mvebu.c | 18 +++++++++++++++++-
> > 1 file changed, 17 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
> > index 1ced73726a..6a869a33ba 100644
> > --- a/drivers/pci/controller/pci-mvebu.c
> > +++ b/drivers/pci/controller/pci-mvebu.c
> > @@ -248,7 +248,7 @@ static void mvebu_pcie_setup_wins(struct mvebu_pcie_port *port)
> >
> > static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port)
> > {
> > - u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl;
> > + u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl, lnksta, lnkctl2;
> >
> > /* Setup PCIe controller to Root Complex mode. */
> > ctrl = mvebu_readl(port, PCIE_CTRL_OFF);
> > @@ -339,6 +339,22 @@ static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port)
> > unmask |= PCIE_INT_INTX(0) | PCIE_INT_INTX(1) |
> > PCIE_INT_INTX(2) | PCIE_INT_INTX(3);
> > mvebu_writel(port, unmask, PCIE_INT_UNMASK_OFF);
> > +
> > + /*
> > + * Set Target Link Speed within the Link Control 2 register when the
> > + * linked downstream device is connected at 2.5GT. This is configured
> > + * in order to avoid issues with the controller when the upstream port
> > + * is configured to support 2.5GT and 5GT and the downstream device is
> > + * linked at 2.5GT, retraining the link in this case causes the link to
> > + * drop taking significant time to retrain.
> > + */
> > + lnksta = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL) >> 16;
> > + if ((lnksta & PCI_EXP_LNKSTA_CLS) == PCI_EXP_LNKSTA_CLS_2_5GB) {
>
> This code does not work because at this stage the endpoint device may
> not be ready yet, and therefore the link may not be established.
>
> Also, this code does not run when the kernel issues a PCIe Hot Reset
> via the PCI Secondary Bus Reset bit.
>
> And it does not handle a possible hot-plug situation.
>
> The check and the code below have to be done _after_ the kernel
> enumerates the device. The PCI core code already has logic to handle
> delays for "slow" devices.
>
> And the reverse operation (setting the lnkctl2 target speed back to
> its original value) has to be done after the device is unplugged -
> when the link goes down.
>
> If you want to work on this, I can try to find the notes I made while
> investigating this issue... including where the best place in the
> kernel PCI core code is for handling it.

Some notes/direction for implementation would be much appreciated. I
am not particularly familiar with the PCI core code, so I don't have a
good idea of how best to implement this workaround.

Thanks,
Nathan

>
> > + lnkctl2 = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
> > + lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS;
> > + lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
> > + mvebu_writel(port, lnkctl2, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
> > + }
> > }
> >
> > static struct mvebu_pcie_port *mvebu_pcie_find_port(struct mvebu_pcie *pcie,
> > ---
> > 2.37.2

2022-11-07 20:10:04

by Pali Rohár

Subject: Re: [PATCH] PCI: mvebu: Set Target Link Speed for 2.5GT downstream devices

On Monday 07 November 2022 19:10:02 Nathan Rossi wrote:
> On Mon, 7 Nov 2022 at 18:43, Pali Rohár <[email protected]> wrote:
> >
> > On Monday 07 November 2022 08:13:27 Nathan Rossi wrote:
> > > From: Nathan Rossi <[email protected]>
> > >
> > > There is a known issue with the mvebu PCIe controller when triggering
> > > retraining of the link (via Link Control) where the link is dropped
> > > completely causing significant delay in the renegotiation of the link.
> > > This occurs only when the downstream device is 2.5GT and the upstream
> > > port is configured to support both 2.5GT and 5GT.
> > >
> > > It is possible to prevent this link dropping by setting the associated
> > > link speed in Target Link Speed of the Link Control 2 register. This
> > > only needs to be done when the downstream is specifically 2.5GT.
> > >
> > > This change applies the required Target Link Speed value during
> > > mvebu_pcie_setup_hw conditionally depending on the current link speed
> > > from the Link Status register, only applying the change when the link
> > > is configured to 2.5GT already.
> > >
> > > Signed-off-by: Nathan Rossi <[email protected]>
> > > ---
> > > drivers/pci/controller/pci-mvebu.c | 18 +++++++++++++++++-
> > > 1 file changed, 17 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
> > > index 1ced73726a..6a869a33ba 100644
> > > --- a/drivers/pci/controller/pci-mvebu.c
> > > +++ b/drivers/pci/controller/pci-mvebu.c
> > > @@ -248,7 +248,7 @@ static void mvebu_pcie_setup_wins(struct mvebu_pcie_port *port)
> > >
> > > static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port)
> > > {
> > > - u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl;
> > > + u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl, lnksta, lnkctl2;
> > >
> > > /* Setup PCIe controller to Root Complex mode. */
> > > ctrl = mvebu_readl(port, PCIE_CTRL_OFF);
> > > @@ -339,6 +339,22 @@ static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port)
> > > unmask |= PCIE_INT_INTX(0) | PCIE_INT_INTX(1) |
> > > PCIE_INT_INTX(2) | PCIE_INT_INTX(3);
> > > mvebu_writel(port, unmask, PCIE_INT_UNMASK_OFF);
> > > +
> > > + /*
> > > + * Set Target Link Speed within the Link Control 2 register when the
> > > + * linked downstream device is connected at 2.5GT. This is configured
> > > + * in order to avoid issues with the controller when the upstream port
> > > + * is configured to support 2.5GT and 5GT and the downstream device is
> > > + * linked at 2.5GT, retraining the link in this case causes the link to
> > > + * drop taking significant time to retrain.
> > > + */
> > > + lnksta = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL) >> 16;
> > > + if ((lnksta & PCI_EXP_LNKSTA_CLS) == PCI_EXP_LNKSTA_CLS_2_5GB) {
> >
> > This code does not work because at this stage the endpoint device may
> > not be ready yet, and therefore the link may not be established.
> >
> > Also, this code does not run when the kernel issues a PCIe Hot Reset
> > via the PCI Secondary Bus Reset bit.
> >
> > And it does not handle a possible hot-plug situation.
> >
> > The check and the code below have to be done _after_ the kernel
> > enumerates the device. The PCI core code already has logic to handle
> > delays for "slow" devices.
> >
> > And the reverse operation (setting the lnkctl2 target speed back to
> > its original value) has to be done after the device is unplugged -
> > when the link goes down.
> >
> > If you want to work on this, I can try to find the notes I made while
> > investigating this issue... including where the best place in the
> > kernel PCI core code is for handling it.
>
> Some notes/direction for implementation would be much appreciated. I
> am not particularly familiar with the PCI core code, so I don't have a
> good idea of how best to implement this workaround.

Ok, I have checked and it seems that I have deleted my notes :-(

So I am trying to reconstruct the information from memory...

The target link speed in the Root Port's lnkctl2 register must be set
to the _correct_ value before configuring ASPM, because link retraining
(part of the ASPM configuration) fails otherwise. ASPM is initialized
by calling pcie_aspm_init_link_state() from the _non-endpoint_ device,
and that function is called at the end of pci_scan_slot().

Look also at the tree-traversal functions pci_scan_child_bus_extend()
and pci_scan_bridge_extend() and try to find the best place where this
"fix" should be called.

Because the same issue you are trying to fix also exists in
pci-aardvark.c hardware (Marvell too), I think you can introduce a flag
in struct pci_host_bridge, set it in pci-mvebu.c (later I can do the
same in pci-aardvark.c), and then in the core PCI code (in one of the
above-mentioned functions, once you find the proper place in the tree
traversal) add code which "fixes" the lnkctl2 register.

Because both PCI hotplug and static initialization call those PCI core
scan functions, this should fix the init-probe part.

The second thing is fixing the unplug part. Because in a hotplug setup
you can connect a 2.5GT/s GEN1 card (which requires this workaround),
then disconnect it and connect some 5GT/s GEN2 card, it is necessary to
set the target link speed back to 5GT/s to use the full speed of the
GEN2 card.

For this second part, I think it is necessary to change the target link
speed back to 5GT/s after the card is disconnected. Good candidates for
where to do it are probably the pci_stop_dev() or pci_destroy_dev()
functions. Beware that it is necessary to change the link speed of the
device on the other end of the link - not the device which is being
removed/unregistered. And check that it is the last kernel device being
unregistered from the bus (the endpoint card may be a multifunction
device).

I hope this information helps you. I'm really sorry that I no longer
have the notes where I documented this issue. Anyway, I will try to
provide other information if needed.

> Thanks,
> Nathan
>
> >
> > > + lnkctl2 = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
> > > + lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS;
> > > + lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
> > > + mvebu_writel(port, lnkctl2, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
> > > + }
> > > }
> > >
> > > static struct mvebu_pcie_port *mvebu_pcie_find_port(struct mvebu_pcie *pcie,
> > > ---
> > > 2.37.2