2008-06-03 20:48:05

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

On Sat, 31 May 2008 18:46:15 -0700
Arjan van de Ven <[email protected]> wrote:

>
> From: Arjan van de Ven <[email protected]>
> Subject: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change
>
> The via-velocity.c driver reinitializes (frees/allocates) several
> metadata structures during an MTU change. Unfortunately the allocations
> of the new versions of the metadata is done with GFP_KERNEL, even
> though this change of datastructures is (and needs to be) done while
> holding a spinlock (with irqs off).
>
> Clearly that isn't a good thing, and kerneloops.org has trapped a large
> deal of the resulting warnings. The fix is to use GFP_ATOMIC.
> While not elegant, avoiding the lock is going to be extremely complex.
> In addition, this is a "static", long lived allocation (after all, how
> often do you actually change your mtu) and not something that happens
> on an ongoing basis.
>
> ...
>
> diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c
> index 6b8d882..4bf08fd 100644
> --- a/drivers/net/via-velocity.c
> +++ b/drivers/net/via-velocity.c
> @@ -1,4 +1,4 @@
> -/*
> +;/*

Cat sat on your keyboard?

> * This code is derived from the VIA reference driver (copyright message
> * below) provided to Red Hat by VIA Networking Technologies, Inc. for
> * addition to the Linux kernel.
> @@ -1241,6 +1241,9 @@ static int velocity_rx_refill(struct velocity_info *vptr)
> *
> * Allocate and set up the receive buffers for each ring slot and
> * assign them to the network adapter.
> + *
> + * Note: This function gets called with irqs off/lock held
> + * from velocity_change_mtu()
> */
>
> static int velocity_init_rd_ring(struct velocity_info *vptr)
> @@ -1251,7 +1254,7 @@ static int velocity_init_rd_ring(struct velocity_info *vptr)
> vptr->rx_buf_sz = (mtu <= ETH_DATA_LEN) ? PKT_BUF_SZ : mtu + 32;
>
> vptr->rd_info = kcalloc(vptr->options.numrx,
> - sizeof(struct velocity_rd_info), GFP_KERNEL);
> + sizeof(struct velocity_rd_info), GFP_ATOMIC);

What happens if this allocation fails? I think the driver is dead?

We've gone and freed the rd_ring and the td_ring and we _might_ have
allocated a new rd_ring and not a new td_ring.

And we've set vptr->rx_buf_sz, which may or may not be a problem.

And we've gone and left the interface in a downed state.

So hrm. It could all be a lot better. Just looking quickly at the
code I _think_ we might be able to do all the needed allocations
outside the lock and then swizzle them into place after taking the
lock. ie, something as simple as:

struct velocity_info *temp_vptr;

...

velocity_init_rd_ring(temp_vptr); /* Can use GFP_KERNEL! */

spin_lock_irqsave(&vptr->lock, flags);

velocity_free_td_ring(vptr);
velocity_free_rd_ring(vptr);

vptr->foo = temp_vptr->foo;
vptr->bar = temp_vptr->bar;
...

spin_unlock_irqrestore(&vptr->lock, flags);

?


2008-06-03 22:19:22

by Francois Romieu

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Andrew Morton <[email protected]> :
[...]
> So hrm. It could all be a lot better. Just looking quickly at the
> code I _think_ we might be able to do all the needed allocations
> outside the lock and then swizzle them into place after taking the
> lock. ie, something as simple as:
[...]
> velocity_init_rd_ring(temp_vptr); /* Can use GFP_KERNEL! */
>
> spin_lock_irqsave(&vptr->lock, flags);
>
> velocity_free_td_ring(vptr);
> velocity_free_rd_ring(vptr);
>
> vptr->foo = temp_vptr->foo;
> vptr->bar = temp_vptr->bar;
> ...
>
> spin_unlock_irqrestore(&vptr->lock, flags);
>
> ?

Almost.

I'll wrap it in the next 24 hours.

--
Ueimor

2008-06-04 22:26:57

by Francois Romieu

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Francois Romieu <[email protected]> :
[...]
> I'll wrap it in the next 24 hours.

... add 24 more as it starts to be ugly and I have not even considered
velocity_init_rings. :o/

--
Ueimor

2008-06-14 21:53:44

by Francois Romieu

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Francois Romieu <[email protected]> :
[...context available from http://lkml.org/lkml/2008/5/31/251 ...]

Régis, I need your help.

Can you give the patchkit below a try and check if the change of mtu
works correctly ?

http://userweb.kernel.org/~romieu/via-velocity/2.6.26-rc6/

or:

git://git.kernel.org/pub/scm/linux/kernel/git/romieu/netdev-2.6.git velocity

--
Ueimor

2008-06-16 21:49:28

by Séguier Régis

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Francois Romieu a ?crit :
> Francois Romieu <[email protected]> :
> [...context available from http://lkml.org/lkml/2008/5/31/251 ...]
>
> R?gis, I need your help.
>
> Can you give the patchkit below a try and check if the change of mtu
> works correctly ?
>
> http://userweb.kernel.org/~romieu/via-velocity/2.6.26-rc6/
>
> or:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/romieu/netdev-2.6.git velocity
>
>
The mtu change doesn't work :

Jun 16 23:22:34 apollo kernel: ------------[ cut here ]------------
Jun 16 23:22:34 apollo kernel: WARNING: at arch/x86/kernel/pci-dma.c:482
dma_free_coherent+0x3a/0x9c()
Jun 16 23:22:34 apollo kernel: Pid: 1527, comm: ip Tainted: G W
2.6.26-rc6EPIA_SN_VB7001 #3
Jun 16 23:22:34 apollo kernel: [<c0115fe5>] warn_on_slowpath+0x3b/0x5f
Jun 16 23:22:34 apollo kernel: [<c0133940>]
get_page_from_freelist+0x24a/0x36f
Jun 16 23:22:34 apollo kernel: [<c012d86b>] handle_IRQ_event+0x1a/0x3f
Jun 16 23:22:34 apollo kernel: [<c0241193>] velocity_free_rd_ring+0xa5/0xb6
Jun 16 23:22:34 apollo kernel: [<c0144752>] kfree+0x6a/0x72
Jun 16 23:22:34 apollo kernel: [<c0241193>] velocity_free_rd_ring+0xa5/0xb6
Jun 16 23:22:34 apollo kernel: [<c0105816>] dma_free_coherent+0x3a/0x9c
Jun 16 23:22:34 apollo kernel: [<c02411eb>]
velocity_free_dma_rings+0x47/0x4d
Jun 16 23:22:34 apollo kernel: [<c024225b>] velocity_change_mtu+0xf6/0x157
Jun 16 23:22:34 apollo kernel: [<c02856c4>] dev_set_mtu+0x2a/0x4f
Jun 16 23:22:34 apollo kernel: [<c0286642>] dev_ioctl+0x4ab/0x530
Jun 16 23:22:34 apollo kernel: [<c012e4cb>] handle_fasteoi_irq+0x74/0x77
Jun 16 23:22:34 apollo kernel: [<c010479b>] do_IRQ+0x50/0x60
Jun 16 23:22:34 apollo kernel: [<c027d785>] sock_ioctl+0x0/0x177
Jun 16 23:22:34 apollo kernel: [<c01032eb>] common_interrupt+0x23/0x28
Jun 16 23:22:34 apollo kernel: [<c027d785>] sock_ioctl+0x0/0x177
Jun 16 23:22:34 apollo kernel: [<c0150b4a>] vfs_ioctl+0x16/0x48
Jun 16 23:22:34 apollo kernel: [<c0150d5a>] do_vfs_ioctl+0x1de/0x1f1
Jun 16 23:22:34 apollo kernel: [<c0150dae>] sys_ioctl+0x41/0x5b
Jun 16 23:22:34 apollo kernel: [<c010291d>] sysenter_past_esp+0x6a/0x91
Jun 16 23:22:34 apollo kernel: =======================
Jun 16 23:22:34 apollo kernel: ---[ end trace 17457a054a24b83c ]---


--
R?gis

2008-06-17 21:46:23

by Francois Romieu

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Séguier Régis <[email protected]> :
[...]
> The mtu change doesn't work :
>
> Jun 16 23:22:34 apollo kernel: ------------[ cut here ]------------
> Jun 16 23:22:34 apollo kernel: WARNING: at arch/x86/kernel/pci-dma.c:482
> dma_free_coherent+0x3a/0x9c()
> Jun 16 23:22:34 apollo kernel: Pid: 1527, comm: ip Tainted: G W
> 2.6.26-rc6EPIA_SN_VB7001 #3
> Jun 16 23:22:34 apollo kernel: [<c0115fe5>] warn_on_slowpath+0x3b/0x5f
> Jun 16 23:22:34 apollo kernel: [<c0133940>]
> get_page_from_freelist+0x24a/0x36f
> Jun 16 23:22:34 apollo kernel: [<c012d86b>] handle_IRQ_event+0x1a/0x3f
> Jun 16 23:22:34 apollo kernel: [<c0241193>] velocity_free_rd_ring+0xa5/0xb6
> Jun 16 23:22:34 apollo kernel: [<c0144752>] kfree+0x6a/0x72
> Jun 16 23:22:34 apollo kernel: [<c0241193>] velocity_free_rd_ring+0xa5/0xb6
> Jun 16 23:22:34 apollo kernel: [<c0105816>] dma_free_coherent+0x3a/0x9c
> Jun 16 23:22:34 apollo kernel: [<c02411eb>]
> velocity_free_dma_rings+0x47/0x4d
> Jun 16 23:22:34 apollo kernel: [<c024225b>] velocity_change_mtu+0xf6/0x157
> Jun 16 23:22:34 apollo kernel: [<c02856c4>] dev_set_mtu+0x2a/0x4f
> Jun 16 23:22:34 apollo kernel: [<c0286642>] dev_ioctl+0x4ab/0x530
> Jun 16 23:22:34 apollo kernel: [<c012e4cb>] handle_fasteoi_irq+0x74/0x77
> Jun 16 23:22:34 apollo kernel: [<c010479b>] do_IRQ+0x50/0x60
> Jun 16 23:22:34 apollo kernel: [<c027d785>] sock_ioctl+0x0/0x177

Does the patch below make a difference ?

diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c
index 71a5133..fa303da 100644
--- a/drivers/net/via-velocity.c
+++ b/drivers/net/via-velocity.c
@@ -1274,7 +1274,7 @@ static void velocity_free_rd_ring(struct velocity_info *vptr)
PCI_DMA_FROMDEVICE);
rd_info->skb_dma = (dma_addr_t) NULL;

- dev_kfree_skb(rd_info->skb);
+ dev_kfree_skb_any(rd_info->skb);
rd_info->skb = NULL;
}

@@ -1336,7 +1336,7 @@ static void velocity_free_td_ring_entry(struct velocity_info *vptr,
td_info->skb_dma[i] = (dma_addr_t) NULL;
}
}
- dev_kfree_skb(td_info->skb);
+ dev_kfree_skb_any(td_info->skb);
td_info->skb = NULL;
}
}
--
Ueimor

2008-06-17 23:37:28

by Séguier Régis

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Francois Romieu a ?crit :
> S?guier R?gis <[email protected]> :
> [...]
>
>> The mtu change doesn't work :
>>
>> Jun 16 23:22:34 apollo kernel: ------------[ cut here ]------------
>> Jun 16 23:22:34 apollo kernel: WARNING: at arch/x86/kernel/pci-dma.c:482
>> dma_free_coherent+0x3a/0x9c()
>> Jun 16 23:22:34 apollo kernel: Pid: 1527, comm: ip Tainted: G W
>> 2.6.26-rc6EPIA_SN_VB7001 #3
>> Jun 16 23:22:34 apollo kernel: [<c0115fe5>] warn_on_slowpath+0x3b/0x5f
>> Jun 16 23:22:34 apollo kernel: [<c0133940>]
>> get_page_from_freelist+0x24a/0x36f
>> Jun 16 23:22:34 apollo kernel: [<c012d86b>] handle_IRQ_event+0x1a/0x3f
>> Jun 16 23:22:34 apollo kernel: [<c0241193>] velocity_free_rd_ring+0xa5/0xb6
>> Jun 16 23:22:34 apollo kernel: [<c0144752>] kfree+0x6a/0x72
>> Jun 16 23:22:34 apollo kernel: [<c0241193>] velocity_free_rd_ring+0xa5/0xb6
>> Jun 16 23:22:34 apollo kernel: [<c0105816>] dma_free_coherent+0x3a/0x9c
>> Jun 16 23:22:34 apollo kernel: [<c02411eb>]
>> velocity_free_dma_rings+0x47/0x4d
>> Jun 16 23:22:34 apollo kernel: [<c024225b>] velocity_change_mtu+0xf6/0x157
>> Jun 16 23:22:34 apollo kernel: [<c02856c4>] dev_set_mtu+0x2a/0x4f
>> Jun 16 23:22:34 apollo kernel: [<c0286642>] dev_ioctl+0x4ab/0x530
>> Jun 16 23:22:34 apollo kernel: [<c012e4cb>] handle_fasteoi_irq+0x74/0x77
>> Jun 16 23:22:34 apollo kernel: [<c010479b>] do_IRQ+0x50/0x60
>> Jun 16 23:22:34 apollo kernel: [<c027d785>] sock_ioctl+0x0/0x177
>>
>
> Does the patch below make a difference ?
>
> diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c
> index 71a5133..fa303da 100644
> --- a/drivers/net/via-velocity.c
> +++ b/drivers/net/via-velocity.c
> @@ -1274,7 +1274,7 @@ static void velocity_free_rd_ring(struct velocity_info *vptr)
> PCI_DMA_FROMDEVICE);
> rd_info->skb_dma = (dma_addr_t) NULL;
>
> - dev_kfree_skb(rd_info->skb);
> + dev_kfree_skb_any(rd_info->skb);
> rd_info->skb = NULL;
> }
>
> @@ -1336,7 +1336,7 @@ static void velocity_free_td_ring_entry(struct velocity_info *vptr,
> td_info->skb_dma[i] = (dma_addr_t) NULL;
> }
> }
> - dev_kfree_skb(td_info->skb);
> + dev_kfree_skb_any(td_info->skb);
> td_info->skb = NULL;
> }
> }
>
Idem.

Jun 18 01:30:15 apollo kernel: ------------[ cut here ]------------
Jun 18 01:30:15 apollo kernel: WARNING: at arch/x86/kernel/pci-dma.c:482
dma_free_coherent+0x3a/0x9c()
Jun 18 01:30:15 apollo kernel: Pid: 1401, comm: ip Not tainted
2.6.26-rc6EPIA_SN_VB7001 #4
Jun 18 01:30:15 apollo kernel: [<c0115fe5>] warn_on_slowpath+0x3b/0x5f
Jun 18 01:30:15 apollo kernel: [<c0133b19>]
__alloc_pages_internal+0xb4/0x349
Jun 18 01:30:15 apollo kernel: [<c0112bfd>] hrtick_start_fair+0x67/0xfd
Jun 18 01:30:15 apollo kernel: [<c011414f>] check_preempt_wakeup+0x9d/0xbb
Jun 18 01:30:15 apollo kernel: [<c0144752>] kfree+0x6a/0x72
Jun 18 01:30:15 apollo kernel: [<c0241193>] velocity_free_rd_ring+0xa5/0xb6
Jun 18 01:30:15 apollo kernel: [<c0105816>] dma_free_coherent+0x3a/0x9c
Jun 18 01:30:15 apollo kernel: [<c02411eb>]
velocity_free_dma_rings+0x47/0x4d
Jun 18 01:30:15 apollo kernel: [<c024225b>] velocity_change_mtu+0xf6/0x157
Jun 18 01:30:15 apollo kernel: [<c02856c4>] dev_set_mtu+0x2a/0x4f
Jun 18 01:30:15 apollo kernel: [<c0286642>] dev_ioctl+0x4ab/0x530
Jun 18 01:30:15 apollo kernel: [<c013964a>] __do_fault+0x256/0x28e
Jun 18 01:30:15 apollo kernel: [<c027d8d7>] sock_ioctl+0x152/0x177
Jun 18 01:30:15 apollo kernel: [<c027d785>] sock_ioctl+0x0/0x177
Jun 18 01:30:15 apollo kernel: [<c0150b4a>] vfs_ioctl+0x16/0x48
Jun 18 01:30:15 apollo kernel: [<c0150d5a>] do_vfs_ioctl+0x1de/0x1f1
Jun 18 01:30:15 apollo kernel: [<c0150dae>] sys_ioctl+0x41/0x5b
Jun 18 01:30:15 apollo kernel: [<c010291d>] sysenter_past_esp+0x6a/0x91
Jun 18 01:30:15 apollo kernel: =======================
Jun 18 01:30:15 apollo kernel: ---[ end trace abc2c54f7fac91dc ]---

--
R?gis

2008-06-18 16:56:35

by Séguier Régis

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

S?guier R?gis a ?crit :
> Francois Romieu a ?crit :
>> S?guier R?gis <[email protected]> :
>> [...]
>>
>>> The mtu change doesn't work :
>>>
>>> Jun 16 23:22:34 apollo kernel: ------------[ cut here ]------------
>>> Jun 16 23:22:34 apollo kernel: WARNING: at
>>> arch/x86/kernel/pci-dma.c:482 dma_free_coherent+0x3a/0x9c()
>>> Jun 16 23:22:34 apollo kernel: Pid: 1527, comm: ip Tainted: G
>>> W 2.6.26-rc6EPIA_SN_VB7001 #3
>>> Jun 16 23:22:34 apollo kernel: [<c0115fe5>] warn_on_slowpath+0x3b/0x5f
>>> Jun 16 23:22:34 apollo kernel: [<c0133940>]
>>> get_page_from_freelist+0x24a/0x36f
>>> Jun 16 23:22:34 apollo kernel: [<c012d86b>] handle_IRQ_event+0x1a/0x3f
>>> Jun 16 23:22:34 apollo kernel: [<c0241193>]
>>> velocity_free_rd_ring+0xa5/0xb6
>>> Jun 16 23:22:34 apollo kernel: [<c0144752>] kfree+0x6a/0x72
>>> Jun 16 23:22:34 apollo kernel: [<c0241193>]
>>> velocity_free_rd_ring+0xa5/0xb6
>>> Jun 16 23:22:34 apollo kernel: [<c0105816>]
>>> dma_free_coherent+0x3a/0x9c
>>> Jun 16 23:22:34 apollo kernel: [<c02411eb>]
>>> velocity_free_dma_rings+0x47/0x4d
>>> Jun 16 23:22:34 apollo kernel: [<c024225b>]
>>> velocity_change_mtu+0xf6/0x157
>>> Jun 16 23:22:34 apollo kernel: [<c02856c4>] dev_set_mtu+0x2a/0x4f
>>> Jun 16 23:22:34 apollo kernel: [<c0286642>] dev_ioctl+0x4ab/0x530
>>> Jun 16 23:22:34 apollo kernel: [<c012e4cb>]
>>> handle_fasteoi_irq+0x74/0x77
>>> Jun 16 23:22:34 apollo kernel: [<c010479b>] do_IRQ+0x50/0x60
>>> Jun 16 23:22:34 apollo kernel: [<c027d785>] sock_ioctl+0x0/0x177
>>>
>>
>> Does the patch below make a difference ?
>>
>> diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c
>> index 71a5133..fa303da 100644
>> --- a/drivers/net/via-velocity.c
>> +++ b/drivers/net/via-velocity.c
>> @@ -1274,7 +1274,7 @@ static void velocity_free_rd_ring(struct
>> velocity_info *vptr)
>> PCI_DMA_FROMDEVICE);
>> rd_info->skb_dma = (dma_addr_t) NULL;
>>
>> - dev_kfree_skb(rd_info->skb);
>> + dev_kfree_skb_any(rd_info->skb);
>> rd_info->skb = NULL;
>> }
>>
>> @@ -1336,7 +1336,7 @@ static void velocity_free_td_ring_entry(struct
>> velocity_info *vptr,
>> td_info->skb_dma[i] = (dma_addr_t) NULL;
>> }
>> }
>> - dev_kfree_skb(td_info->skb);
>> + dev_kfree_skb_any(td_info->skb);
>> td_info->skb = NULL;
>> }
>> }
>>
> Idem.
>
> Jun 18 01:30:15 apollo kernel: ------------[ cut here ]------------
> Jun 18 01:30:15 apollo kernel: WARNING: at
> arch/x86/kernel/pci-dma.c:482 dma_free_coherent+0x3a/0x9c()
> Jun 18 01:30:15 apollo kernel: Pid: 1401, comm: ip Not tainted
> 2.6.26-rc6EPIA_SN_VB7001 #4
> Jun 18 01:30:15 apollo kernel: [<c0115fe5>] warn_on_slowpath+0x3b/0x5f
> Jun 18 01:30:15 apollo kernel: [<c0133b19>]
> __alloc_pages_internal+0xb4/0x349
> Jun 18 01:30:15 apollo kernel: [<c0112bfd>] hrtick_start_fair+0x67/0xfd
> Jun 18 01:30:15 apollo kernel: [<c011414f>]
> check_preempt_wakeup+0x9d/0xbb
> Jun 18 01:30:15 apollo kernel: [<c0144752>] kfree+0x6a/0x72
> Jun 18 01:30:15 apollo kernel: [<c0241193>]
> velocity_free_rd_ring+0xa5/0xb6
> Jun 18 01:30:15 apollo kernel: [<c0105816>] dma_free_coherent+0x3a/0x9c
> Jun 18 01:30:15 apollo kernel: [<c02411eb>]
> velocity_free_dma_rings+0x47/0x4d
> Jun 18 01:30:15 apollo kernel: [<c024225b>]
> velocity_change_mtu+0xf6/0x157
> Jun 18 01:30:15 apollo kernel: [<c02856c4>] dev_set_mtu+0x2a/0x4f
> Jun 18 01:30:15 apollo kernel: [<c0286642>] dev_ioctl+0x4ab/0x530
> Jun 18 01:30:15 apollo kernel: [<c013964a>] __do_fault+0x256/0x28e
> Jun 18 01:30:15 apollo kernel: [<c027d8d7>] sock_ioctl+0x152/0x177
> Jun 18 01:30:15 apollo kernel: [<c027d785>] sock_ioctl+0x0/0x177
> Jun 18 01:30:15 apollo kernel: [<c0150b4a>] vfs_ioctl+0x16/0x48
> Jun 18 01:30:15 apollo kernel: [<c0150d5a>] do_vfs_ioctl+0x1de/0x1f1
> Jun 18 01:30:15 apollo kernel: [<c0150dae>] sys_ioctl+0x41/0x5b
> Jun 18 01:30:15 apollo kernel: [<c010291d>] sysenter_past_esp+0x6a/0x91
> Jun 18 01:30:15 apollo kernel: =======================
> Jun 18 01:30:15 apollo kernel: ---[ end trace abc2c54f7fac91dc ]---
>
This correction seems to work.

I read somewhere pci_free_consistent (in velocity_free_dma_rings)
couldn't be use in a spin_lock.
When I move velocity_free_rings outside the spin_lock, error desappeare
but data transfert doesn't work after the mtu change.
When I add velocity_give_many_rx_descs all work.

I will make some more tests.


--- drivers/net/via-velocity.c 2008-06-18 20:17:18.000000000 +0200
+++ ../linux-2.6.26-rc6-mtu/drivers/net/via-velocity.c 2008-06-18
20:28:14.000000000 +0200
@@ -1968,6 +1968,8 @@ static int velocity_change_mtu(struct ne
if (dev->mtu != new_mtu) {
struct velocity_info *tmp_vptr;
unsigned long flags;
+ struct tx_info tx;
+ struct rx_info rx;

tmp_vptr = kzalloc(sizeof(*tmp_vptr), GFP_KERNEL);
if (!tmp_vptr) {
@@ -1989,13 +1991,16 @@ static int velocity_change_mtu(struct ne
netif_stop_queue(dev);
velocity_shutdown(vptr);

- velocity_free_rings(vptr);
-
+ rx=vptr->rx;
+ tx=vptr->tx;
vptr->rx = tmp_vptr->rx;
vptr->tx = tmp_vptr->tx;
+ tmp_vptr->rx=rx;
+ tmp_vptr->tx=tx;

dev->mtu = new_mtu;

+ velocity_give_many_rx_descs(vptr);
velocity_init_registers(vptr, VELOCITY_INIT_COLD);

mac_enable_int(vptr->mac_regs);
@@ -2003,6 +2008,7 @@ static int velocity_change_mtu(struct ne

spin_unlock_irqrestore(&vptr->lock, flags);

+ velocity_free_rings(tmp_vptr);
out_free_tmp_vptr_1:
kfree(tmp_vptr);
}


--
R?gis

2008-06-18 20:11:22

by Francois Romieu

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Séguier Régis <[email protected]> :
[...]
> This correction seems to work.
>
> I read somewhere pci_free_consistent (in velocity_free_dma_rings) couldn't
> be use in a spin_lock.

Yup.

> When I move velocity_free_rings outside the spin_lock, error desappeare but
> data transfert doesn't work after the mtu change.
> When I add velocity_give_many_rx_descs all work.

I have updated http://userweb.kernel.org/~romieu/via-velocity/2.6.26-rc6/
accordingly. Can you check whether it is fine or not ?

--
Ueimor

2008-06-18 22:09:27

by Séguier Régis

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Francois Romieu a ?crit :
> S?guier R?gis <[email protected]> :
> [...]
>
>> This correction seems to work.
>>
>> I read somewhere pci_free_consistent (in velocity_free_dma_rings) couldn't
>> be use in a spin_lock.
>>
>
> Yup.
>
>
>> When I move velocity_free_rings outside the spin_lock, error desappeare but
>> data transfert doesn't work after the mtu change.
>> When I add velocity_give_many_rx_descs all work.
>>
>
> I have updated http://userweb.kernel.org/~romieu/via-velocity/2.6.26-rc6/
> accordingly. Can you check whether it is fine or not ?
>
>
Patch 0001 doesn't work,

if I don't make a mistake skb->len is not modified in skb_pad{,to}when
the padding is applied. In this case, we need to attribute ETH_ZLEN to len.

--
R?gis

2008-06-19 20:43:52

by Francois Romieu

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Séguier Régis <[email protected]> :
>> I have updated http://userweb.kernel.org/~romieu/via-velocity/2.6.26-rc6/
[...]
> Patch 0001 doesn't work,
>
> if I don't make a mistake skb->len is not modified in skb_pad{,to}when the
> padding is applied. In this case, we need to attribute ETH_ZLEN to len.

Yes, that's what the driver did beforehand.

The updated serie is at the usual place. Does it perform better ?

--
Ueimor

2008-06-20 10:24:19

by Séguier Régis

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Francois Romieu a ?crit :
> S?guier R?gis <[email protected]> :
>
>>> I have updated http://userweb.kernel.org/~romieu/via-velocity/2.6.26-rc6/
>>>
> [...]
>
>> Patch 0001 doesn't work,
>>
>> if I don't make a mistake skb->len is not modified in skb_pad{,to}when the
>> padding is applied. In this case, we need to attribute ETH_ZLEN to len.
>>
>
> Yes, that's what the driver did beforehand.
>
> The updated serie is at the usual place. Does it perform better ?
>
>
Ok for the mtu change, no error messages, I try multiples mtu
(1400,2000,4500,9000).

But if I generate a paquet with a size more than around 3825, the driver
don't send it and no paquet is send after, work again after an interface
down/up and until no paquet > 3825.

--
R?gis

2008-06-20 10:40:10

by Francois Romieu

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Séguier Régis <[email protected]> :
[...]
> But if I generate a paquet with a size more than around 3825, the driver
> don't send it and no paquet is send after, work again after an interface
> down/up and until no paquet > 3825.

Does it qualify as a regression ?

--
Ueimor

2008-06-20 16:51:20

by Séguier Régis

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Francois Romieu a ?crit :
> S?guier R?gis <[email protected]> :
> [...]
>
>> But if I generate a paquet with a size more than around 3825, the driver
>> don't send it and no paquet is send after, work again after an interface
>> down/up and until no paquet > 3825.
>>
>
> Does it qualify as a regression ?
>
>
I couldn't say yes now.

To be honest, I never test mtu > 1500 with this driver before.

This can be detected before with the difficulty of mtu change?

I'll test the driver before patch with a bigger MTU. (i'll try to do
this next week)

--
R?gis

2008-06-23 17:05:00

by Séguier Régis

[permalink] [raw]
Subject: Re: [PATCH] net: via-velocity.c fix sleep-with-spinlock bug during MTU change

Francois Romieu a ?crit :
> S?guier R?gis <[email protected]> :
> [...]
>
>> But if I generate a paquet with a size more than around 3825, the driver
>> don't send it and no paquet is send after, work again after an interface
>> down/up and until no paquet > 3825.
>>
>
> Does it qualify as a regression ?
>
>
No,

I test with the version of the driver before patches and I have the same
error.

I think error is in TX. (TX and RX equipments for the test have the same
chip, I don't have other kind of chip available at the moment, that why
I not sure)

If theses informations help :

When I transmit a paquet of ~ 4000 (for a mtu 4500),

When I reboot and PCIE-AER is compiled, I got this message :

Jun 23 18:28:31 apollo kernel: +------ PCI-Express Device Error ------+
Jun 23 18:28:31 apollo kernel: Error Severity^I^I: Uncorrected (Non-Fatal)
Jun 23 18:28:31 apollo kernel: PCIE Bus Error type^I: Transaction Layer
Jun 23 18:28:31 apollo kernel: Unsupported Request ^I: Multiple
Jun 23 18:28:31 apollo kernel: Requester ID^I^I: 0300
Jun 23 18:28:31 apollo kernel: VendorID=1106h, DeviceID=3119h, Bus=03h,
Device=00h, Function=00h
Jun 23 18:28:31 apollo kernel: TLB Header:
Jun 23 18:28:31 apollo kernel: 40000001 0018000c febffc7c 0000fdff
Jun 23 18:28:31 apollo kernel: Broadcast error_detected message
Jun 23 18:28:31 apollo kernel: Broadcast mmio_enabled message
Jun 23 18:28:31 apollo kernel: Broadcast resume message

When I reboot and "PCIE-AER and MSI" is compiled, I got this message :

Jun 23 18:35:59 apollo kernel: irq 163, desc: c0495dc8, depth: 1, count:
0, unhandled: 0
Jun 23 18:35:59 apollo kernel: ->handle_irq(): c012d997,
handle_bad_irq+0x0/0x199
Jun 23 18:35:59 apollo kernel: ->chip(): c0496b20, 0xc0496b20
Jun 23 18:35:59 apollo kernel: ->action(): 00000000
Jun 23 18:35:59 apollo kernel: IRQ_DISABLED set
Jun 23 18:35:59 apollo kernel: unexpected IRQ trap at vector a3
--
R?gis