Hello,
I recently did some test and found out something interesting about the b44
problem I wrote earlier.
The problem is the following:
When I use my BCM4401 with the b44 driver in wireless-dev I get very high ping
times looking like this:
64 bytes from 172.30.10.1: icmp_seq=1 ttl=64 time=1863 ms
64 bytes from 172.30.10.1: icmp_seq=2 ttl=64 time=855 ms
64 bytes from 172.30.10.1: icmp_seq=3 ttl=64 time=1855 ms
64 bytes from 172.30.10.1: icmp_seq=4 ttl=64 time=855 ms
64 bytes from 172.30.10.1: icmp_seq=5 ttl=64 time=1854 ms
64 bytes from 172.30.10.1: icmp_seq=6 ttl=64 time=854 ms
64 bytes from 172.30.10.1: icmp_seq=7 ttl=64 time=1851 ms
64 bytes from 172.30.10.1: icmp_seq=8 ttl=64 time=851 ms
64 bytes from 172.30.10.1: icmp_seq=9 ttl=64 time=1851 ms
64 bytes from 172.30.10.1: icmp_seq=10 ttl=64 time=851 ms
I also found out that shortly after I boot my laptop and log into kde ping
times are not that high but start to increase very quickly:
64 bytes from 172.30.10.1: icmp_seq=53 ttl=64 time=2.19 ms
64 bytes from 172.30.10.1: icmp_seq=54 ttl=64 time=2.22 ms
64 bytes from 172.30.10.1: icmp_seq=55 ttl=64 time=2.20 ms
64 bytes from 172.30.10.1: icmp_seq=56 ttl=64 time=2.20 ms
64 bytes from 172.30.10.1: icmp_seq=57 ttl=64 time=18.6 ms
64 bytes from 172.30.10.1: icmp_seq=58 ttl=64 time=1268 ms
64 bytes from 172.30.10.1: icmp_seq=59 ttl=64 time=268 ms
64 bytes from 172.30.10.1: icmp_seq=60 ttl=64 time=1268 ms
64 bytes from 172.30.10.1: icmp_seq=61 ttl=64 time=268 ms
64 bytes from 172.30.10.1: icmp_seq=62 ttl=64 time=6.08 ms
64 bytes from 172.30.10.1: icmp_seq=63 ttl=64 time=268 ms
64 bytes from 172.30.10.1: icmp_seq=64 ttl=64 time=1264 ms
64 bytes from 172.30.10.1: icmp_seq=65 ttl=64 time=264 ms
After some time digging around I found out something really interesting. When
I play some music ping times are immediately lower. If I stop playing music
they are back to the same times as they were before.
I guess that there is a problem with interrupts so I post some information of
my system in hope it will be usefull.
maxi@koala:~$ cat /proc/interrupts
CPU0
0: 126317 XT-PIC-XT timer
1: 3600 XT-PIC-XT i8042
2: 0 XT-PIC-XT cascade
7: 1 XT-PIC-XT parport0
8: 1 XT-PIC-XT rtc
9: 17371 XT-PIC-XT acpi
10: 13237 XT-PIC-XT firewire_ohci, yenta, yenta, ehci_hcd:usb1,
uhci_hcd:usb3, uhci_hcd:usb4, Intel 82801DB-ICH4, Intel 82801DB-ICH4 Modem,
eth0
11: 89059 XT-PIC-XT uhci_hcd:usb2, i915@pci:0000:00:02.0
12: 632 XT-PIC-XT i8042
14: 10354 XT-PIC-XT libata
15: 7408 XT-PIC-XT libata
NMI: 0
ERR: 0
[...]
ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
ACPI: PCI Interrupt 0000:02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low) ->
IRQ 10
ssb: Sonics Silicon Backplane found on PCI device 0000:02:02.0
b44.c:v2.0
eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
[...]
This problem did only happen with wireless-dev (checkout this evening) and
with -mm kernels I used some time ago for testing. Currently I'm running
2.6.22-rc4 that works perfectly fine and doesn't show that problem.
Maxi
On Sat, 16 Jun 2007 23:27:43 +0200
Maximilian Engelhardt <[email protected]> wrote:
> Hello,
>
> I recently did some test and found out something interesting about the b44
> problem I wrote earlier.
>
> The problem is the following:
> When I use my BCM4401 with the b44 driver in wireless-dev I get very high ping
> times looking like this:
>
> 64 bytes from 172.30.10.1: icmp_seq=1 ttl=64 time=1863 ms
> 64 bytes from 172.30.10.1: icmp_seq=2 ttl=64 time=855 ms
> 64 bytes from 172.30.10.1: icmp_seq=3 ttl=64 time=1855 ms
> 64 bytes from 172.30.10.1: icmp_seq=4 ttl=64 time=855 ms
> 64 bytes from 172.30.10.1: icmp_seq=5 ttl=64 time=1854 ms
> 64 bytes from 172.30.10.1: icmp_seq=6 ttl=64 time=854 ms
> 64 bytes from 172.30.10.1: icmp_seq=7 ttl=64 time=1851 ms
> 64 bytes from 172.30.10.1: icmp_seq=8 ttl=64 time=851 ms
> 64 bytes from 172.30.10.1: icmp_seq=9 ttl=64 time=1851 ms
> 64 bytes from 172.30.10.1: icmp_seq=10 ttl=64 time=851 ms
>
> I also found out that shortly after I boot my laptop and log into kde ping
> times are not that high but start to increase very quickly:
>
> 64 bytes from 172.30.10.1: icmp_seq=53 ttl=64 time=2.19 ms
> 64 bytes from 172.30.10.1: icmp_seq=54 ttl=64 time=2.22 ms
> 64 bytes from 172.30.10.1: icmp_seq=55 ttl=64 time=2.20 ms
> 64 bytes from 172.30.10.1: icmp_seq=56 ttl=64 time=2.20 ms
> 64 bytes from 172.30.10.1: icmp_seq=57 ttl=64 time=18.6 ms
> 64 bytes from 172.30.10.1: icmp_seq=58 ttl=64 time=1268 ms
> 64 bytes from 172.30.10.1: icmp_seq=59 ttl=64 time=268 ms
> 64 bytes from 172.30.10.1: icmp_seq=60 ttl=64 time=1268 ms
> 64 bytes from 172.30.10.1: icmp_seq=61 ttl=64 time=268 ms
> 64 bytes from 172.30.10.1: icmp_seq=62 ttl=64 time=6.08 ms
> 64 bytes from 172.30.10.1: icmp_seq=63 ttl=64 time=268 ms
> 64 bytes from 172.30.10.1: icmp_seq=64 ttl=64 time=1264 ms
> 64 bytes from 172.30.10.1: icmp_seq=65 ttl=64 time=264 ms
>
> After some time digging around I found out something really interesting. When
> I play some music ping times are immediately lower. If I stop playing music
> they are back to the same times as they were before.
>
> I guess that there is a problem with interrupts so I post some information of
> my system in hope it will be usefull.
>
> maxi@koala:~$ cat /proc/interrupts
> CPU0
> 0: 126317 XT-PIC-XT timer
> 1: 3600 XT-PIC-XT i8042
> 2: 0 XT-PIC-XT cascade
> 7: 1 XT-PIC-XT parport0
> 8: 1 XT-PIC-XT rtc
> 9: 17371 XT-PIC-XT acpi
> 10: 13237 XT-PIC-XT firewire_ohci, yenta, yenta, ehci_hcd:usb1,
> uhci_hcd:usb3, uhci_hcd:usb4, Intel 82801DB-ICH4, Intel 82801DB-ICH4 Modem,
> eth0
> 11: 89059 XT-PIC-XT uhci_hcd:usb2, i915@pci:0000:00:02.0
> 12: 632 XT-PIC-XT i8042
> 14: 10354 XT-PIC-XT libata
> 15: 7408 XT-PIC-XT libata
> NMI: 0
> ERR: 0
>
>
> [...]
> ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
> ACPI: PCI Interrupt 0000:02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low) ->
> IRQ 10
> ssb: Sonics Silicon Backplane found on PCI device 0000:02:02.0
> b44.c:v2.0
> eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
> [...]
>
> This problem did only happen with wireless-dev (checkout this evening) and
> with -mm kernels I used some time ago for testing. Currently I'm running
> 2.6.22-rc4 that works perfectly fine and doesn't show that problem.
>
> Maxi
Can you build with APIC for uniprocessor.
There is lots of IRQ sharing, so
- one of the other device's may be not handling shared IRQ properly.
Try unloading firewhire modem and yenta devices.
- IRQ might be set edge triggered which doesn't work with NAPI
or shared IRQ.
--
Stephen Hemminger <[email protected]>
On Sunday 17 June 2007, Stephen Hemminger wrote:
> On Sat, 16 Jun 2007 23:27:43 +0200
>
> Maximilian Engelhardt <[email protected]> wrote:
> > Hello,
> >
> > I recently did some test and found out something interesting about the
> > b44 problem I wrote earlier.
> >
> > The problem is the following:
> > When I use my BCM4401 with the b44 driver in wireless-dev I get very high
> > ping times looking like this:
> >
> > 64 bytes from 172.30.10.1: icmp_seq=1 ttl=64 time=1863 ms
> > 64 bytes from 172.30.10.1: icmp_seq=2 ttl=64 time=855 ms
> > 64 bytes from 172.30.10.1: icmp_seq=3 ttl=64 time=1855 ms
> > 64 bytes from 172.30.10.1: icmp_seq=4 ttl=64 time=855 ms
> > 64 bytes from 172.30.10.1: icmp_seq=5 ttl=64 time=1854 ms
> > 64 bytes from 172.30.10.1: icmp_seq=6 ttl=64 time=854 ms
> > 64 bytes from 172.30.10.1: icmp_seq=7 ttl=64 time=1851 ms
> > 64 bytes from 172.30.10.1: icmp_seq=8 ttl=64 time=851 ms
> > 64 bytes from 172.30.10.1: icmp_seq=9 ttl=64 time=1851 ms
> > 64 bytes from 172.30.10.1: icmp_seq=10 ttl=64 time=851 ms
> >
> > I also found out that shortly after I boot my laptop and log into kde
> > ping times are not that high but start to increase very quickly:
> >
> > 64 bytes from 172.30.10.1: icmp_seq=53 ttl=64 time=2.19 ms
> > 64 bytes from 172.30.10.1: icmp_seq=54 ttl=64 time=2.22 ms
> > 64 bytes from 172.30.10.1: icmp_seq=55 ttl=64 time=2.20 ms
> > 64 bytes from 172.30.10.1: icmp_seq=56 ttl=64 time=2.20 ms
> > 64 bytes from 172.30.10.1: icmp_seq=57 ttl=64 time=18.6 ms
> > 64 bytes from 172.30.10.1: icmp_seq=58 ttl=64 time=1268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=59 ttl=64 time=268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=60 ttl=64 time=1268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=61 ttl=64 time=268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=62 ttl=64 time=6.08 ms
> > 64 bytes from 172.30.10.1: icmp_seq=63 ttl=64 time=268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=64 ttl=64 time=1264 ms
> > 64 bytes from 172.30.10.1: icmp_seq=65 ttl=64 time=264 ms
> >
> > After some time digging around I found out something really interesting.
> > When I play some music ping times are immediately lower. If I stop
> > playing music they are back to the same times as they were before.
> >
> > I guess that there is a problem with interrupts so I post some
> > information of my system in hope it will be usefull.
> >
> > maxi@koala:~$ cat /proc/interrupts
> > CPU0
> > 0: 126317 XT-PIC-XT timer
> > 1: 3600 XT-PIC-XT i8042
> > 2: 0 XT-PIC-XT cascade
> > 7: 1 XT-PIC-XT parport0
> > 8: 1 XT-PIC-XT rtc
> > 9: 17371 XT-PIC-XT acpi
> > 10: 13237 XT-PIC-XT firewire_ohci, yenta, yenta,
> > ehci_hcd:usb1, uhci_hcd:usb3, uhci_hcd:usb4, Intel 82801DB-ICH4, Intel
> > 82801DB-ICH4 Modem, eth0
> > 11: 89059 XT-PIC-XT uhci_hcd:usb2, i915@pci:0000:00:02.0
> > 12: 632 XT-PIC-XT i8042
> > 14: 10354 XT-PIC-XT libata
> > 15: 7408 XT-PIC-XT libata
> > NMI: 0
> > ERR: 0
> >
> >
> > [...]
> > ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
> > ACPI: PCI Interrupt 0000:02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low)
> > -> IRQ 10
> > ssb: Sonics Silicon Backplane found on PCI device 0000:02:02.0
> > b44.c:v2.0
> > eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
> > [...]
> >
> > This problem did only happen with wireless-dev (checkout this evening)
> > and with -mm kernels I used some time ago for testing. Currently I'm
> > running 2.6.22-rc4 that works perfectly fine and doesn't show that
> > problem.
> >
> > Maxi
>
> Can you build with APIC for uniprocessor.
I did enable CONFIG_X86_UP_APIC and CONFIG_X86_UP_IOAPIC and tried with lapic
and apic=force but couldn't get APIC working.
>
> There is lots of IRQ sharing, so
> - one of the other device's may be not handling shared IRQ properly.
> Try unloading firewhire modem and yenta devices.
>
> - IRQ might be set edge triggered which doesn't work with NAPI
> or shared IRQ.
I did build a kernel without the three mentioned above but the problem is
still the same. I also did remove everything but eth0 on interrupt 10 so the
only device using that interrupt is eth0 and then the card completely stopped
working.
Maxi
On Sunday 17 June 2007 02:42:18 Maximilian Engelhardt wrote:
> I did build a kernel without the three mentioned above but the problem is
> still the same. I also did remove everything but eth0 on interrupt 10 so the
> only device using that interrupt is eth0 and then the card completely stopped
> working.
_That_ is interesting...
Maybe the IRQ isn't correctly wired up on the backplane and
it doesn't generate IRQs at all (so it only worked, because other devices
on the same IRQ line triggered the line).
I'll take a look at that code.
--
Greetings Michael.
On Saturday 16 June 2007 23:27:43 Maximilian Engelhardt wrote:
> [...]
> ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
> ACPI: PCI Interrupt 0000:02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low) ->
> IRQ 10
> ssb: Sonics Silicon Backplane found on PCI device 0000:02:02.0
> b44.c:v2.0
> eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
> [...]
Ok, I prepared two debugging patches.
Please enable SonicsSiliconBackplane Debugging in the kernel kconfig,
so I can get more detail information about your card.
Device Drivers/Sonics Silicon Backplane/SSB debugging
(Must disable "No SSB kernel messages")
Please apply and test the attached debugging patches in a row.
So apply patch 1 and test if it works again. If not, apply
patch 2 and test if it works.
Always save complete dmesg log on each test run and send it to me.
Thanks for testing.
(This time it seems we are actually getting somewhere, when
dealing with sane people. :D )
--
Greetings Michael.
On Sunday 17 June 2007 12:55:39 Michael Buesch wrote:
> On Saturday 16 June 2007 23:27:43 Maximilian Engelhardt wrote:
> > [...]
> > ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
> > ACPI: PCI Interrupt 0000:02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low) ->
> > IRQ 10
> > ssb: Sonics Silicon Backplane found on PCI device 0000:02:02.0
> > b44.c:v2.0
> > eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
> > [...]
>
> Ok, I prepared two debugging patches.
>
> Please enable SonicsSiliconBackplane Debugging in the kernel kconfig,
> so I can get more detail information about your card.
> Device Drivers/Sonics Silicon Backplane/SSB debugging
> (Must disable "No SSB kernel messages")
>
> Please apply and test the attached debugging patches in a row.
> So apply patch 1 and test if it works again. If not, apply
> patch 2 and test if it works.
> Always save complete dmesg log on each test run and send it to me.
>
> Thanks for testing.
> (This time it seems we are actually getting somewhere, when
> dealing with sane people. :D )
>
Ah, forgot to say. Apply patch 2 on top of patch 1.
--
Greetings Michael.
On Sunday 17 June 2007, Michael Buesch wrote:
> On Saturday 16 June 2007 23:27:43 Maximilian Engelhardt wrote:
> > [...]
> > ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
> > ACPI: PCI Interrupt 0000:02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low)
> > -> IRQ 10
> > ssb: Sonics Silicon Backplane found on PCI device 0000:02:02.0
> > b44.c:v2.0
> > eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
> > [...]
>
> Ok, I prepared two debugging patches.
>
> Please enable SonicsSiliconBackplane Debugging in the kernel kconfig,
> so I can get more detail information about your card.
> Device Drivers/Sonics Silicon Backplane/SSB debugging
> (Must disable "No SSB kernel messages")
>
> Please apply and test the attached debugging patches in a row.
> So apply patch 1 and test if it works again. If not, apply
> patch 2 and test if it works.
> Always save complete dmesg log on each test run and send it to me.
>
> Thanks for testing.
> (This time it seems we are actually getting somewhere, when
> dealing with sane people. :D )
I did the tests with my kernel where only the card is on interrupt 10. dmesg
is attached.
With the first patch applied networking does work again. I also additionally
tried patch2 and it also does work.
Maxi
On Sunday 17 June 2007 14:03:30 Maximilian Engelhardt wrote:
> On Sunday 17 June 2007, Michael Buesch wrote:
> > On Saturday 16 June 2007 23:27:43 Maximilian Engelhardt wrote:
> > > [...]
> > > ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
> > > ACPI: PCI Interrupt 0000:02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low)
> > > -> IRQ 10
> > > ssb: Sonics Silicon Backplane found on PCI device 0000:02:02.0
> > > b44.c:v2.0
> > > eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
> > > [...]
> >
> > Ok, I prepared two debugging patches.
> >
> > Please enable SonicsSiliconBackplane Debugging in the kernel kconfig,
> > so I can get more detail information about your card.
> > Device Drivers/Sonics Silicon Backplane/SSB debugging
> > (Must disable "No SSB kernel messages")
> >
> > Please apply and test the attached debugging patches in a row.
> > So apply patch 1 and test if it works again. If not, apply
> > patch 2 and test if it works.
> > Always save complete dmesg log on each test run and send it to me.
> >
> > Thanks for testing.
> > (This time it seems we are actually getting somewhere, when
> > dealing with sane people. :D )
>
> I did the tests with my kernel where only the card is on interrupt 10. dmesg
> is attached.
> With the first patch applied networking does work again. I also additionally
> tried patch2 and it also does work.
Great!
To me this seems to be a silicon bug. The IRQ routing value, which is read
from the chip here, is hardcoded in the old b44 driver.
I'll implement a workaround for this and submit a patch.
--
Greetings Michael.