2013-09-18 09:00:50

by Igor Gnatenko

Subject: [REGRESSION][BISECTED] skge: add dma_mapping check

Since commit 136d8f377e1575463b47840bc5f1b22d94bf8f63 we get a kernel
panic on:
01:05.0 Ethernet controller [0200]: Marvell Technology Group Ltd.

Screen: https://www.dropbox.com/s/mu3t3wxpxbn4ou5/IMAG0507.jpg

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1008323


01:05.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller [11ab:4320] (rev 13)
	Subsystem: ASUSTeK Computer Inc. Marvell 88E8001 Gigabit Ethernet Controller (Asus) [1043:811a]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64 (5750ns min, 7750ns max), Cache Line Size: 16 bytes
	Interrupt: pin A routed to IRQ 21
	Region 0: Memory at ddcfc000 (32-bit, non-prefetchable) [size=16K]
	Region 1: I/O ports at a800 [size=256]
	Expansion ROM at ddcc0000 [disabled] [size=128K]
	Capabilities: [48] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=7 DScale=1 PME-
	Capabilities: [50] Vital Product Data
		Product Name: Yukon Gigabit Ethernet 10/100/1000Base-T Adapter
		Read-only fields:
			[PN] Part number: Yukon 88E8001
			[EC] Engineering changes: Rev. 1.3
			[MN] Manufacture ID: 4d 61 72 76 65 6c 6c
			[SN] Serial number: AbCdEfG000002
			[CP] Extended capability: 01 10 cc 03
			[RV] Reserved: checksum good, 10 byte(s) reserved
		Read/write fields:
			[RW] Read-write area: 121 byte(s) free
		End
	Kernel driver in use: skge

--
Igor Gnatenko
Fedora release 20 (Heisenbug)
Linux 3.11.1-300.fc20.x86_64


2013-09-18 15:22:41

by Ben Hutchings

Subject: Re: [REGRESSION][BISECTED] skge: add dma_mapping check

On Wed, 2013-09-18 at 13:00 +0400, Igor Gnatenko wrote:
> Since 136d8f377e1575463b47840bc5f1b22d94bf8f63 commit we have kernel
> panic on:
[...]

At first glance, it looks like this bit is wrong:

> @@ -3058,13 +3090,17 @@ static struct sk_buff *skge_rx_get(struct net_device *dev,
>  		if (!nskb)
>  			goto resubmit;
>
> +		if (skge_rx_setup(skge, e, nskb, skge->rx_buf_size) < 0) {
> +			dev_kfree_skb(nskb);
> +			goto resubmit;
> +		}
> +
>  		pci_unmap_single(skge->hw->pdev,
>  				 dma_unmap_addr(e, mapaddr),
>  				 dma_unmap_len(e, maplen),
>  				 PCI_DMA_FROMDEVICE);
>  		skb = e->skb;
>  		prefetch(skb->data);
> -		skge_rx_setup(skge, e, nskb, skge->rx_buf_size);
>  	}
>
>  	skb_put(skb, len);

That pci_unmap_single() appears to unmap the *new*, rather than old, DMA
mapping. I think you need to copy out the old DMA address and length
before doing skge_rx_setup().
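
To make the ordering problem concrete, here is a minimal userspace sketch.
It is not driver code: struct rx_elem, mock_unmap(), mock_rx_setup() and the
two replace_*() helpers are hypothetical stand-ins, and the only property
assumed of skge_rx_setup() is that it overwrites the mapping fields (and the
skb pointer) stored in the ring element.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: hypothetical stand-ins, not the skge definitions. */
struct rx_elem {
	uintptr_t mapaddr;	/* DMA address of the buffer currently on the ring */
	size_t maplen;		/* length of that mapping */
	int buf_id;		/* stands in for e->skb */
};

/* Address most recently passed to the mock unmap call. */
uintptr_t last_unmapped;

static void mock_unmap(uintptr_t addr, size_t len)
{
	(void)len;
	last_unmapped = addr;
}

/* Assumed behaviour of the setup step: install the new buffer and
 * overwrite the mapping fields in the element. */
static int mock_rx_setup(struct rx_elem *e, int new_buf,
			 uintptr_t new_addr, size_t len)
{
	e->buf_id = new_buf;
	e->mapaddr = new_addr;
	e->maplen = len;
	return 0;
}

/* Order in the quoted hunk: setup first, then unmap through the element.
 * Both the unmap and the returned buffer refer to the NEW buffer. */
int replace_buggy(struct rx_elem *e, int new_buf, uintptr_t new_addr)
{
	if (mock_rx_setup(e, new_buf, new_addr, e->maplen) < 0)
		return -1;
	mock_unmap(e->mapaddr, e->maplen);
	return e->buf_id;
}

/* Suggested order: copy out the old address, length and buffer first,
 * then run setup, then unmap the saved (old) mapping. */
int replace_fixed(struct rx_elem *e, int new_buf, uintptr_t new_addr)
{
	uintptr_t old_addr = e->mapaddr;
	size_t old_len = e->maplen;
	int old_buf = e->buf_id;

	if (mock_rx_setup(e, new_buf, new_addr, old_len) < 0)
		return -1;
	mock_unmap(old_addr, old_len);
	return old_buf;
}
```

With an element that starts out as { 0x1000, 1536, 1 } and a replacement
buffer 2 mapped at 0x2000, replace_buggy() unmaps 0x2000 and returns buffer 2,
while replace_fixed() unmaps 0x1000 and returns buffer 1, leaving buffer 2
mapped on the ring.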

Try it with an IOMMU in strict mode (intel_iommu=on,strict or
amd_iommu=fullflush).

Ben.

--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

2013-09-18 22:08:30

by Francois Romieu

Subject: Re: [REGRESSION][BISECTED] skge: add dma_mapping check

Igor Gnatenko <[email protected]> :
> Since 136d8f377e1575463b47840bc5f1b22d94bf8f63 commit we have kernel
> panic on:
> 01:05.0 Ethernet controller [0200]: Marvell Technology Group Ltd.
>
> Screen: https://www.dropbox.com/s/mu3t3wxpxbn4ou5/IMAG0507.jpg
>
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1008323

Does the ugly stuff below against mainline make a difference?

Note to testers: use a size argument above 500 for 'ping' to
exercise the relevant code path.
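
One possible reading of why the ping size matters, as a hedged sketch: the
receive function has a branch that copies the frame and reuses the ring buffer
(the skge_rx_reuse() call) and an else branch that swaps in a new buffer, and
only the latter contains the unmap in question. The sketch assumes the branch
is chosen by frame length against a threshold; ASSUMED_COPY_THRESHOLD and its
value of 128 are assumptions, as are the helper names. The header sizes assume
plain Ethernet and IPv4 without options.

```c
#include <stddef.h>

/* Assumption: frames shorter than this are copied and the ring buffer is
 * reused; the value is not stated anywhere in this thread. */
#define ASSUMED_COPY_THRESHOLD 128

enum rx_path { RX_PATH_COPY, RX_PATH_REPLACE };

/* Frame length for an ICMP echo with the given payload size (ping -s):
 * 14 bytes Ethernet header + 20 bytes IPv4 header + 8 bytes ICMP header. */
size_t icmp_echo_frame_len(size_t payload)
{
	return 14 + 20 + 8 + payload;
}

/* Which receive branch a frame of this length would take in the model. */
enum rx_path rx_path_for(size_t frame_len)
{
	return frame_len < ASSUMED_COPY_THRESHOLD ? RX_PATH_COPY
						  : RX_PATH_REPLACE;
}
```

Under these assumptions a default-sized ping (56 payload bytes, 98-byte frame)
stays on the copy branch, while any payload above 500 bytes lands on the
buffer-replacement branch.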

diff --git a/drivers/net/ethernet/marvell/skge.c b/drivers/net/ethernet/marvell/skge.c
index ef94a59..aa85a3f 100644
--- a/drivers/net/ethernet/marvell/skge.c
+++ b/drivers/net/ethernet/marvell/skge.c
@@ -3086,6 +3086,7 @@ static struct sk_buff *skge_rx_get(struct net_device *dev,
 					       PCI_DMA_FROMDEVICE);
 		skge_rx_reuse(e, skge->rx_buf_size);
 	} else {
+		struct skge_element ee = *e;
 		struct sk_buff *nskb;

 		nskb = netdev_alloc_skb_ip_align(dev, skge->rx_buf_size);
@@ -3098,10 +3099,10 @@ static struct sk_buff *skge_rx_get(struct net_device *dev,
 		}

 		pci_unmap_single(skge->hw->pdev,
-				 dma_unmap_addr(e, mapaddr),
-				 dma_unmap_len(e, maplen),
+				 dma_unmap_addr(&ee, mapaddr),
+				 dma_unmap_len(&ee, maplen),
 				 PCI_DMA_FROMDEVICE);
-		skb = e->skb;
+		skb = ee.skb;
 		prefetch(skb->data);
 	}
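
The snapshot idiom in the patch above (struct skge_element ee = *e) can be
modelled in userspace roughly as follows. This is a sketch under stated
assumptions, not the driver: struct elem, mock_setup() and rx_replace() are
hypothetical names, and the mock setup is written so that a failed mapping
(modelled here as new_addr == 0) leaves the element untouched, which is what
makes the resubmit path safe in this model.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a receive ring element. */
struct elem {
	uintptr_t mapaddr;	/* DMA address currently on the ring */
	size_t maplen;		/* length of that mapping */
	int buf_id;		/* stands in for the skb pointer */
};

enum rx_result { RX_DELIVERED, RX_RESUBMIT };

/* Mock setup: fails without modifying *e when the "mapping" fails,
 * otherwise overwrites the element with the new buffer. */
static int mock_setup(struct elem *e, int new_buf,
		      uintptr_t new_addr, size_t len)
{
	if (new_addr == 0)
		return -1;	/* models a DMA mapping error */
	e->buf_id = new_buf;
	e->mapaddr = new_addr;
	e->maplen = len;
	return 0;
}

/* Take a by-value snapshot of the element before setup can overwrite it.
 * On success, *done describes the OLD buffer: the caller would unmap
 * done->mapaddr and pass done->buf_id up the stack. */
enum rx_result rx_replace(struct elem *e, int new_buf,
			  uintptr_t new_addr, struct elem *done)
{
	struct elem ee = *e;	/* snapshot, like "ee = *e" in the patch */

	if (mock_setup(e, new_buf, new_addr, ee.maplen) < 0)
		return RX_RESUBMIT;	/* old buffer stays on the ring */

	*done = ee;
	return RX_DELIVERED;
}
```

Copying the whole structure is the blunt option; saving only the address,
length and buffer pointer would do the same job with less stack traffic.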

2013-09-19 08:18:59

by Igor Gnatenko

Subject: Re: [REGRESSION][BISECTED] skge: add dma_mapping check

On Thu, 2013-09-19 at 00:08 +0200, Francois Romieu wrote:
> Igor Gnatenko <[email protected]> :
> > Since 136d8f377e1575463b47840bc5f1b22d94bf8f63 commit we have kernel
> > panic on:
> > 01:05.0 Ethernet controller [0200]: Marvell Technology Group Ltd.
> >
> > Screen: https://www.dropbox.com/s/mu3t3wxpxbn4ou5/IMAG0507.jpg
> >
> > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1008323
>
> Does the ugly stuff below against mainline make a difference?
>
> Note to testers: use a size argument above 500 for 'ping' to
> exercise the relevant code path.
Reported-and-tested-by: Vasiliy Glazov <[email protected]>
Bisected-by: Igor Gnatenko <[email protected]>
>
> diff --git a/drivers/net/ethernet/marvell/skge.c b/drivers/net/ethernet/marvell/skge.c
> index ef94a59..aa85a3f 100644
> --- a/drivers/net/ethernet/marvell/skge.c
> +++ b/drivers/net/ethernet/marvell/skge.c
> @@ -3086,6 +3086,7 @@ static struct sk_buff *skge_rx_get(struct net_device *dev,
>  					       PCI_DMA_FROMDEVICE);
>  		skge_rx_reuse(e, skge->rx_buf_size);
>  	} else {
> +		struct skge_element ee = *e;
>  		struct sk_buff *nskb;
>
>  		nskb = netdev_alloc_skb_ip_align(dev, skge->rx_buf_size);
> @@ -3098,10 +3099,10 @@ static struct sk_buff *skge_rx_get(struct net_device *dev,
>  		}
>
>  		pci_unmap_single(skge->hw->pdev,
> -				 dma_unmap_addr(e, mapaddr),
> -				 dma_unmap_len(e, maplen),
> +				 dma_unmap_addr(&ee, mapaddr),
> +				 dma_unmap_len(&ee, maplen),
>  				 PCI_DMA_FROMDEVICE);
> -		skb = e->skb;
> +		skb = ee.skb;
>  		prefetch(skb->data);
>  	}
>
Yes, this patch fixes the problem. "ping www.ru -s 500" works fine.


--
Igor Gnatenko
Fedora release 20 (Heisenbug)
Linux 3.11.1-300.fc20.x86_64

2013-09-19 09:03:18

by Igor Gnatenko

Subject: Re: [REGRESSION][BISECTED] skge: add dma_mapping check

Please send the patch.

--
Igor Gnatenko
Fedora release 20 (Heisenbug)
Linux 3.11.1-300.fc20.x86_64

2013-09-24 15:17:13

by Joseph Salisbury

Subject: Re: [REGRESSION][BISECTED] skge: add dma_mapping check

On 09/19/2013 05:03 AM, Igor Gnatenko wrote:
> Please, send patch.
>
The patch is in mainline as of 3.12-rc2 as commit:

Author: Mikulas Patocka <[email protected]>
Date: Thu Sep 19 14:13:17 2013 -0400

skge: fix broken driver

I don't see that the commit was Cc'd to stable. Mikulas, we might need
to send a request directly to the stable maintainers and request that
the commit be pulled into stable, in case they didn't notice the request
in the commit message.

2013-09-24 17:50:44

by Mikulas Patocka

Subject: Re: [REGRESSION][BISECTED] skge: add dma_mapping check



On Tue, 24 Sep 2013, Joseph Salisbury wrote:

> On 09/19/2013 05:03 AM, Igor Gnatenko wrote:
> > Please, send patch.
> >
> The patch is in mainline as of 3.12-rc2 as commit:
>
> Author: Mikulas Patocka <[email protected]>
> Date: Thu Sep 19 14:13:17 2013 -0400
>
> skge: fix broken driver
>
> I don't see that the commit was Cc'd to stable. Mikulas, we might need
> to send a request directly to the stable maintainers and request that
> the commit be pulled into stable, in case they didn't notice the request
> in the commit message.

The patch needs to be added to the 3.11 stable tree. David Miller's
networking tree has a rule that it doesn't want "Cc: [email protected]" in
patches submitted through it.

So, hopefully, David will submit the patch to the stable tree on his own.

Mikulas