Since commit 136d8f377e1575463b47840bc5f1b22d94bf8f63 we have a kernel
panic on:
01:05.0 Ethernet controller [0200]: Marvell Technology Group Ltd.
Screen: https://www.dropbox.com/s/mu3t3wxpxbn4ou5/IMAG0507.jpg
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1008323
01:05.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller [11ab:4320] (rev 13)
	Subsystem: ASUSTeK Computer Inc. Marvell 88E8001 Gigabit Ethernet Controller (Asus) [1043:811a]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64 (5750ns min, 7750ns max), Cache Line Size: 16 bytes
	Interrupt: pin A routed to IRQ 21
	Region 0: Memory at ddcfc000 (32-bit, non-prefetchable) [size=16K]
	Region 1: I/O ports at a800 [size=256]
	Expansion ROM at ddcc0000 [disabled] [size=128K]
	Capabilities: [48] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=7 DScale=1 PME-
	Capabilities: [50] Vital Product Data
		Product Name: Yukon Gigabit Ethernet 10/100/1000Base-T Adapter
		Read-only fields:
			[PN] Part number: Yukon 88E8001
			[EC] Engineering changes: Rev. 1.3
			[MN] Manufacture ID: 4d 61 72 76 65 6c 6c
			[SN] Serial number: AbCdEfG000002
			[CP] Extended capability: 01 10 cc 03
			[RV] Reserved: checksum good, 10 byte(s) reserved
		Read/write fields:
			[RW] Read-write area: 121 byte(s) free
		End
	Kernel driver in use: skge
--
Igor Gnatenko
Fedora release 20 (Heisenbug)
Linux 3.11.1-300.fc20.x86_64
On Wed, 2013-09-18 at 13:00 +0400, Igor Gnatenko wrote:
> Since commit 136d8f377e1575463b47840bc5f1b22d94bf8f63 we have a kernel
> panic on:
[...]
At first glance, it looks like this bit is wrong:
> @@ -3058,13 +3090,17 @@ static struct sk_buff *skge_rx_get(struct net_device *dev,
>  		if (!nskb)
>  			goto resubmit;
>
> +		if (skge_rx_setup(skge, e, nskb, skge->rx_buf_size) < 0) {
> +			dev_kfree_skb(nskb);
> +			goto resubmit;
> +		}
> +
>  		pci_unmap_single(skge->hw->pdev,
>  				 dma_unmap_addr(e, mapaddr),
>  				 dma_unmap_len(e, maplen),
>  				 PCI_DMA_FROMDEVICE);
>  		skb = e->skb;
>  		prefetch(skb->data);
> -		skge_rx_setup(skge, e, nskb, skge->rx_buf_size);
>  	}
>
>  	skb_put(skb, len);
That pci_unmap_single() appears to unmap the *new*, rather than old, DMA
mapping. I think you need to copy out the old DMA address and length
before doing skge_rx_setup().
Try it with an IOMMU in strict mode (intel_iommu=on,strict or
amd_iommu=fullflush).
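
To make the ordering problem concrete, here is a minimal user-space sketch. It is not driver code: every toy_* name is hypothetical, and the "mapping" is just a slot in a boolean table rather than a real DMA mapping. It only illustrates why reading the element's address after the setup call unmaps the new mapping, and how saving the old address first avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these names exist in the skge driver. */
#define TOY_MAX_MAPPINGS 8

struct toy_element {
	int buf_id;	/* which buffer this ring slot currently owns */
	size_t mapaddr;	/* handle of the slot's live "mapping" */
	size_t maplen;	/* length recorded for that mapping */
};

static bool toy_live[TOY_MAX_MAPPINGS];	/* true while a handle is mapped */
static size_t toy_next;			/* next unused handle */

/* Hand out a fresh handle and mark it as mapped. */
static size_t toy_map(void)
{
	assert(toy_next < TOY_MAX_MAPPINGS);
	toy_live[toy_next] = true;
	return toy_next++;
}

/* Mark a handle as no longer mapped. */
static void toy_unmap(size_t addr)
{
	assert(addr < TOY_MAX_MAPPINGS);
	toy_live[addr] = false;
}

static bool toy_is_mapped(size_t addr)
{
	return addr < TOY_MAX_MAPPINGS && toy_live[addr];
}

/* Refill step: point the slot at a new buffer, overwriting its fields. */
static void toy_rx_setup(struct toy_element *e, int new_buf, size_t len)
{
	e->buf_id = new_buf;
	e->mapaddr = toy_map();
	e->maplen = len;
}

/* Buggy order: e->mapaddr is read after setup, so the NEW mapping is torn down. */
static void toy_swap_buggy(struct toy_element *e, int new_buf, size_t len)
{
	toy_rx_setup(e, new_buf, len);
	toy_unmap(e->mapaddr);
}

/* Fixed order: save the old handle, set up the new buffer, unmap the OLD handle. */
static void toy_swap_fixed(struct toy_element *e, int new_buf, size_t len)
{
	size_t old_addr = e->mapaddr;

	toy_rx_setup(e, new_buf, len);
	toy_unmap(old_addr);
}
```

After toy_swap_buggy() the old handle is still marked as mapped and the handle now stored in the element is not, which corresponds to the state described above. After toy_swap_fixed() it is the other way round.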
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
Igor Gnatenko <[email protected]> :
> Since commit 136d8f377e1575463b47840bc5f1b22d94bf8f63 we have a kernel
> panic on:
> 01:05.0 Ethernet controller [0200]: Marvell Technology Group Ltd.
>
> Screen: https://www.dropbox.com/s/mu3t3wxpxbn4ou5/IMAG0507.jpg
>
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1008323
Does the ugly stuff below against mainline make a difference?
Note to testers: use a size argument above 500 for 'ping' to
exercise the relevant code path.
diff --git a/drivers/net/ethernet/marvell/skge.c b/drivers/net/ethernet/marvell/skge.c
index ef94a59..aa85a3f 100644
--- a/drivers/net/ethernet/marvell/skge.c
+++ b/drivers/net/ethernet/marvell/skge.c
@@ -3086,6 +3086,7 @@ static struct sk_buff *skge_rx_get(struct net_device *dev,
 					       PCI_DMA_FROMDEVICE);
 		skge_rx_reuse(e, skge->rx_buf_size);
 	} else {
+		struct skge_element ee = *e;
 		struct sk_buff *nskb;

 		nskb = netdev_alloc_skb_ip_align(dev, skge->rx_buf_size);
@@ -3098,10 +3099,10 @@ static struct sk_buff *skge_rx_get(struct net_device *dev,
 		}

 		pci_unmap_single(skge->hw->pdev,
-				 dma_unmap_addr(e, mapaddr),
-				 dma_unmap_len(e, maplen),
+				 dma_unmap_addr(&ee, mapaddr),
+				 dma_unmap_len(&ee, maplen),
 				 PCI_DMA_FROMDEVICE);
-		skb = e->skb;
+		skb = ee.skb;
 		prefetch(skb->data);
 	}
On Thu, 2013-09-19 at 00:08 +0200, Francois Romieu wrote:
> Igor Gnatenko <[email protected]> :
> > Since commit 136d8f377e1575463b47840bc5f1b22d94bf8f63 we have a kernel
> > panic on:
> > 01:05.0 Ethernet controller [0200]: Marvell Technology Group Ltd.
> >
> > Screen: https://www.dropbox.com/s/mu3t3wxpxbn4ou5/IMAG0507.jpg
> >
> > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1008323
>
> Does the ugly stuff below against mainline make a difference?
>
> Note to testers: use a size argument above 500 for 'ping' to
> exercise the relevant code path.
Reported-and-tested-by: Vasiliy Glazov <[email protected]>
Bisected-by: Igor Gnatenko <[email protected]>
>
> diff --git a/drivers/net/ethernet/marvell/skge.c b/drivers/net/ethernet/marvell/skge.c
> index ef94a59..aa85a3f 100644
> --- a/drivers/net/ethernet/marvell/skge.c
> +++ b/drivers/net/ethernet/marvell/skge.c
> @@ -3086,6 +3086,7 @@ static struct sk_buff *skge_rx_get(struct net_device *dev,
>  					       PCI_DMA_FROMDEVICE);
>  		skge_rx_reuse(e, skge->rx_buf_size);
>  	} else {
> +		struct skge_element ee = *e;
>  		struct sk_buff *nskb;
>
>  		nskb = netdev_alloc_skb_ip_align(dev, skge->rx_buf_size);
> @@ -3098,10 +3099,10 @@ static struct sk_buff *skge_rx_get(struct net_device *dev,
>  		}
>
>  		pci_unmap_single(skge->hw->pdev,
> -				 dma_unmap_addr(e, mapaddr),
> -				 dma_unmap_len(e, maplen),
> +				 dma_unmap_addr(&ee, mapaddr),
> +				 dma_unmap_len(&ee, maplen),
>  				 PCI_DMA_FROMDEVICE);
> -		skb = e->skb;
> +		skb = ee.skb;
>  		prefetch(skb->data);
>  	}
>
Yes. This patch fixes the problem. ping www.ru -s 500 works fine.
--
Igor Gnatenko
Fedora release 20 (Heisenbug)
Linux 3.11.1-300.fc20.x86_64
Please send the patch.
--
Igor Gnatenko
Fedora release 20 (Heisenbug)
Linux 3.11.1-300.fc20.x86_64
On 09/19/2013 05:03 AM, Igor Gnatenko wrote:
> Please send the patch.
>
The patch is in mainline as of 3.12-rc2 as commit:
Author: Mikulas Patocka <[email protected]>
Date: Thu Sep 19 14:13:17 2013 -0400
skge: fix broken driver
I don't see that the commit was Cc'd to stable. Mikulas, we might need
to ask the stable maintainers directly to pull the commit into stable,
in case they didn't notice the request in the commit message.
On Tue, 24 Sep 2013, Joseph Salisbury wrote:
> On 09/19/2013 05:03 AM, Igor Gnatenko wrote:
> > Please send the patch.
> >
> The patch is in mainline as of 3.12-rc2 as commit:
>
> Author: Mikulas Patocka <[email protected]>
> Date: Thu Sep 19 14:13:17 2013 -0400
>
> skge: fix broken driver
>
> I don't see that the commit was Cc'd to stable. Mikulas, we might need
> to ask the stable maintainers directly to pull the commit into stable,
> in case they didn't notice the request in the commit message.
The patch needs to be added to the 3.11 stable tree. David Miller's
networking tree has a rule that it doesn't want "Cc: [email protected]" in
patches submitted through it.
So hopefully David will submit the patch to the stable tree on his own.
Mikulas