It turns out that at least some r8169 CardBus cards don't operate correctly
when CLKRUN protocol is enabled - the symptoms are recurring timeouts
during PHY reads / writes and a very high packet drop rate.
This is true of at least RTL8169sc/8110sc (XID 18000000) chip in
Sunrich C-160 CardBus NIC.

Such behavior was observed on two separate laptops, the first one has
TI PCIxx12 CardBus bridge, while the second one has Ricoh RL5c476II.

Setting CLKRUN_En bit in CONFIG 3 register via an EEPROM write didn't
improve things in either case (this is probably why it wasn't set by the
card manufacturer).
The only way to fix the issue was to disable the CLKRUN protocol either
in the CardBus bridge (only possible in the TI one) or in the southbridge.

Since the problem takes some time to debug let's warn people that have
the suspect configuration (Conventional PCI r8169 NIC behind a CardBus
bridge) so they know what they can do if they encounter it.

Signed-off-by: Maciej S. Szmigiero <[email protected]>
---
drivers/net/ethernet/realtek/r8169.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index b08d51bf7a20..b935a18358cb 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7254,6 +7254,22 @@ static int rtl_jumbo_max(struct rtl8169_private *tp)
 	}
 }

+static void rtl_pci_cardbus_check(struct pci_dev *pdev)
+{
+	struct pci_dev *parent = pdev;
+
+	while ((parent = pci_upstream_bridge(parent)) != NULL) {
+		if (parent->hdr_type != PCI_HEADER_TYPE_CARDBUS)
+			continue;
+
+		dev_info(&pdev->dev,
+			 "device is behind a CardBus bridge\n");
+		dev_info(&pdev->dev,
+			 "in case of erratic or no operation try disabling CLKRUN protocol in the CardBus bridge or in the southbridge\n");
+		break;
+	}
+}
+
 static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	const struct rtl_cfg_info *cfg = rtl_cfg_infos + ent->driver_data;
@@ -7305,8 +7321,10 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	tp->mmio_addr = pcim_iomap_table(pdev)[region];

-	if (!pci_is_pcie(pdev))
+	if (!pci_is_pcie(pdev)) {
 		dev_info(&pdev->dev, "not PCI Express\n");
+		rtl_pci_cardbus_check(pdev);
+	}

 	/* Identify chip attached to board */
 	rtl8169_get_mac_version(tp, cfg->default_ver);
--
2.17.0
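
For readers who want to try the bridge walk outside the kernel, here is a minimal user-space sketch of the same idea. It is not part of the patch: `struct mock_dev`, `HDR_TYPE_CARDBUS` and `behind_cardbus_bridge()` are hypothetical stand-ins for `struct pci_dev`, `PCI_HEADER_TYPE_CARDBUS` and `rtl_pci_cardbus_check()`, and the function returns a flag instead of logging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same numeric value as the kernel's PCI_HEADER_TYPE_CARDBUS. */
#define HDR_TYPE_CARDBUS 2

/* Hypothetical stand-in for struct pci_dev: only what the walk needs. */
struct mock_dev {
	struct mock_dev *upstream;	/* parent bridge, NULL on the root bus */
	unsigned char hdr_type;		/* header type, multi-function bit masked off */
};

/*
 * Climb from the device towards the root and report whether any bridge
 * on the way is a CardBus bridge. The device itself is not inspected,
 * matching the loop in the patch, which starts at the first upstream bridge.
 */
static bool behind_cardbus_bridge(const struct mock_dev *dev)
{
	const struct mock_dev *parent;

	assert(dev != NULL);

	for (parent = dev->upstream; parent != NULL; parent = parent->upstream) {
		if (parent->hdr_type == HDR_TYPE_CARDBUS)
			return true;
	}

	return false;
}
```

A NIC whose parent chain contains a type 2 header is reported as being behind a CardBus bridge; a device sitting directly under an ordinary PCI-to-PCI bridge is not.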
From: "Maciej S. Szmigiero" <[email protected]>
Date: Thu, 6 Sep 2018 18:10:53 +0200
> It turns out that at least some r8169 CardBus cards don't operate correctly
> when CLKRUN protocol is enabled - the symptoms are recurring timeouts
> during PHY reads / writes and a very high packet drop rate.
> This is true of at least RTL8169sc/8110sc (XID 18000000) chip in
> Sunrich C-160 CardBus NIC.
>
> Such behavior was observed on two separate laptops, the first one has
> TI PCIxx12 CardBus bridge, while the second one has Ricoh RL5c476II.
>
> Setting CLKRUN_En bit in CONFIG 3 register via an EEPROM write didn't
> improve things in either case (this is probably why it wasn't set by the
> card manufacturer).
> The only way to fix the issue was to disable the CLKRUN protocol either
> in the CardBus bridge (only possible in the TI one) or in the southbridge.
>
> Since the problem takes some time to debug let's warn people that have
> the suspect configuration (Conventional PCI r8169 NIC behind a CardBus
> bridge) so they know what they can do if they encounter it.
>
> Signed-off-by: Maciej S. Szmigiero <[email protected]>

I don't know about this.

Barking at the user in the kernel log about an obscure knob (which btw
doesn't exist for all cardbus bridges without other patches you are
posting elsewhere) is rarely effective.

We should just disable clkrun automatically when we know it causes problems.

Sorry, I don't think this is the right approach and therefore I am not
applying this.

On 09.09.2018 17:09, David Miller wrote:
> From: "Maciej S. Szmigiero" <[email protected]>
> Date: Thu, 6 Sep 2018 18:10:53 +0200
>
>> It turns out that at least some r8169 CardBus cards don't operate correctly
>> when CLKRUN protocol is enabled - the symptoms are recurring timeouts
>> during PHY reads / writes and a very high packet drop rate.
>> This is true of at least RTL8169sc/8110sc (XID 18000000) chip in
>> Sunrich C-160 CardBus NIC.
>>
>> Such behavior was observed on two separate laptops, the first one has
>> TI PCIxx12 CardBus bridge, while the second one has Ricoh RL5c476II.
>>
>> Setting CLKRUN_En bit in CONFIG 3 register via an EEPROM write didn't
>> improve things in either case (this is probably why it wasn't set by the
>> card manufacturer).
>> The only way to fix the issue was to disable the CLKRUN protocol either
>> in the CardBus bridge (only possible in the TI one) or in the southbridge.
>>
>> Since the problem takes some time to debug let's warn people that have
>> the suspect configuration (Conventional PCI r8169 NIC behind a CardBus
>> bridge) so they know what they can do if they encounter it.
>>
>> Signed-off-by: Maciej S. Szmigiero <[email protected]>
>
> I don't know about this.
>
> Barking at the user in the kernel log about an obscure knob (which btw
> doesn't exist for all cardbus bridges without other patches you are
> posting elsewhere) is rarely effective.
>
> We should just disable clkrun automatically when we know it causes problems.

Unfortunately, as you wrote above, this workaround is only available on
TI CardBus bridges (and I hope will be available for two Ricoh ones soon,
too), while for other CardBus bridges it is either not implemented or
not available at all.
So we can't reliably turn the workaround on automatically when needed.

BTW, it seems that my RTL8169 card isn't the only model affected.
In fact, the original CLKRUN protocol disabling workaround on TI bridges
was implemented in 2005 because somebody's RTL8139 also had this
problem: https://lkml.org/lkml/2005/2/5/129

The main reason I wanted to add this warning is to save people time
debugging this issue, as it is far from obvious.
But if this solution is unacceptable then I hope at least this
description will show up in search results when searching for some
related keywords.

Thanks,
Maciej
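
For anyone who finds this thread while debugging similar symptoms, the check the proposed warning relies on can also be done by hand. What follows is only a sketch, not code from the patch: it assumes you already have the raw configuration-space bytes of a bridge (for example, read from the device's `config` file under `/sys/bus/pci/devices/`), and `config_is_cardbus_bridge()` plus the `CFG_*` names are made up for illustration. The header type lives at offset 0x0e, and its low seven bits are 0x02 for a CardBus bridge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CFG_HEADER_TYPE_OFFSET	0x0e	/* header type byte in config space */
#define CFG_HEADER_TYPE_MASK	0x7f	/* bit 7 is the multi-function flag */
#define CFG_HEADER_TYPE_CARDBUS	0x02	/* CardBus bridge header layout */

/*
 * Return true if the given configuration-space dump describes a CardBus
 * bridge. Dumps too short to contain the header type byte are rejected.
 */
static bool config_is_cardbus_bridge(const unsigned char *cfg, size_t len)
{
	assert(cfg != NULL);

	if (len <= CFG_HEADER_TYPE_OFFSET)
		return false;

	return (cfg[CFG_HEADER_TYPE_OFFSET] & CFG_HEADER_TYPE_MASK) ==
	       CFG_HEADER_TYPE_CARDBUS;
}
```

If this returns true for a bridge upstream of the NIC, you have the configuration described in this thread, and disabling CLKRUN in that bridge or in the southbridge is worth trying.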