2010-06-12 10:28:12

by Markus Trippelsdorf

Subject: mpd client timeouts (bisected) 2.6.35-rc3

Commit 597a264b1a9c7e36d1728f677c66c5c1f7e3b837:
»net: deliver skbs on inactive slaves to exact matches«

causes large timeouts when mpd clients try to connect to a locally
running mpd (music player daemon) on my machine. This makes it
impossible to control mpd.

I bisected this down to the commit mentioned above.
Reverting the commit from 2.6.35-rc3 also solves the problem.

--
Markus


2010-06-12 21:58:27

by David Miller

Subject: Re: mpd client timeouts (bisected) 2.6.35-rc3

From: Markus Trippelsdorf <[email protected]>
Date: Sat, 12 Jun 2010 12:28:02 +0200

> Commit 597a264b1a9c7e36d1728f677c66c5c1f7e3b837:
> »net: deliver skbs on inactive slaves to exact matches«
>
> causes large timeouts when mpd clients try to connect to a locally
> running mpd (music player daemon) on my machine. This makes it
> impossible to control mpd.
>
> I bisected this down to the commit mentioned above.
> Reverting the commit from 2.6.35-rc3 also solves the problem.

John, find an easy and fast way to fix this or else I am
going to revert.

Thanks.

2010-06-13 08:05:44

by John Fastabend

Subject: Re: mpd client timeouts (bisected) 2.6.35-rc3

David Miller wrote:
> From: Markus Trippelsdorf <[email protected]>
> Date: Sat, 12 Jun 2010 12:28:02 +0200
>
>> Commit 597a264b1a9c7e36d1728f677c66c5c1f7e3b837:
>> »net: deliver skbs on inactive slaves to exact matches«
>>
>> causes large timeouts when mpd clients try to connect to a locally
>> running mpd (music player daemon) on my machine. This makes it
>> impossible to control mpd.
>>
>> I bisected this down to the commit mentioned above.
>> Reverting the commit from 2.6.35-rc3 also solves the problem.
>
> John, find an easy and fast way to fix this or else I am
> going to revert.
>
> Thanks.

Looks like skbs are hitting loopback_xmit() with deliver_no_wcard set. Then in
the receive path these skbs are only delivered to exact matches. Not sure why
this bit is set here, I'll track this down first thing tomorrow.

Thanks,
John.
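
The delivery rule described above can be sketched in userspace C. This is only a toy model under stated assumptions, not the kernel's receive path: `toy_handler` and `toy_should_deliver` are hypothetical names, and a NULL `dev` is assumed to stand for a wildcard protocol handler.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a protocol handler; dev == NULL means wildcard. */
struct toy_handler {
	const void *dev;
};

/*
 * Decide whether a handler sees the packet.
 * With deliver_no_wcard set, only handlers bound to the receiving
 * device (exact matches) get the packet; wildcard handlers are skipped.
 */
static bool toy_should_deliver(const struct toy_handler *h,
			       const void *skb_dev, bool deliver_no_wcard)
{
	if (h->dev == NULL)
		return !deliver_no_wcard;
	return h->dev == skb_dev;
}
```

In this model a wildcard handler never sees a packet that arrives with the bit set, which would match the symptom of locally generated traffic on loopback going nowhere.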

2010-06-13 20:36:33

by John Fastabend

Subject: Re: mpd client timeouts (bisected) 2.6.35-rc3

John Fastabend wrote:
> David Miller wrote:
>> From: Markus Trippelsdorf <[email protected]>
>> Date: Sat, 12 Jun 2010 12:28:02 +0200
>>
>>> Commit 597a264b1a9c7e36d1728f677c66c5c1f7e3b837:
>>> »net: deliver skbs on inactive slaves to exact matches«
>>>
>>> causes large timeouts when mpd clients try to connect to a locally
>>> running mpd (music player daemon) on my machine. This makes it
>>> impossible to control mpd.
>>>
>>> I bisected this down to the commit mentioned above.
>>> Reverting the commit from 2.6.35-rc3 also solves the problem.
>> John, find an easy and fast way to fix this or else I am
>> going to revert.
>>
>> Thanks.
>
> Looks like skbs are hitting loopback_xmit() with deliver_no_wcard set. Then in
> the receive path these skbs are only delivered to exact matches. Not sure why
> this bit is set here, I'll track this down first thing tomorrow.
>
> Thanks,
> John.
> --

The wcard bit needs to be set in __copy_skb_header(); otherwise it will not be
cleared when called from skb_clone(). The clone then hits the loopback device,
gets pushed into the rx path, and is eventually dropped. The following patch
fixes this. Hopefully, this is easy and fast enough for you, Dave.


[PATCH] net: fix deliver_no_wcard regression on loopback device

deliver_no_wcard is not being set in skb_copy_header.
In the skb_cloned case it is not being cleared and
may cause the skb to be dropped when the loopback device
pushes it back up the stack.

Signed-off-by: John Fastabend <[email protected]>
---

net/core/skbuff.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9f07e74..bcf2fa3 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -532,6 +532,7 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
 	new->ip_summed = old->ip_summed;
 	skb_copy_queue_mapping(new, old);
 	new->priority = old->priority;
+	new->deliver_no_wcard = old->deliver_no_wcard;
 #if defined(CONFIG_IP_VS) || defined(CONFIG_IP_VS_MODULE)
 	new->ipvs_property = old->ipvs_property;
 #endif
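
Why one missing assignment matters can be shown with a small userspace sketch. It assumes the clone's memory is reused without being zeroed, which is how the explanation above reads; `toy_skb` and the `toy_*` helpers are hypothetical and only mimic the shape of the patch.

```c
#include <stdbool.h>

/* Toy stand-in for struct sk_buff; only the fields needed here. */
struct toy_skb {
	unsigned int priority;
	bool deliver_no_wcard;
};

/* Header copy as it behaved before the fix: the flag is skipped. */
static void toy_copy_header_buggy(struct toy_skb *new_skb,
				  const struct toy_skb *old_skb)
{
	new_skb->priority = old_skb->priority;
}

/* Header copy with the one-line fix applied. */
static void toy_copy_header_fixed(struct toy_skb *new_skb,
				  const struct toy_skb *old_skb)
{
	new_skb->priority = old_skb->priority;
	new_skb->deliver_no_wcard = old_skb->deliver_no_wcard;
}

/*
 * Clone into memory that still holds an earlier packet's state and
 * report the flag the clone ends up with.
 */
static bool toy_clone_flag(void (*copy)(struct toy_skb *,
					const struct toy_skb *))
{
	/* Assumption: recycled memory with a stale flag left set. */
	struct toy_skb recycled = { .priority = 7, .deliver_no_wcard = true };
	struct toy_skb src = { .priority = 1, .deliver_no_wcard = false };

	copy(&recycled, &src);
	return recycled.deliver_no_wcard;
}
```

With the buggy copy the clone keeps the stale flag even though the source skb never had it set; with the fixed copy the clone takes the source's value.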

2010-06-13 20:50:55

by Eric Dumazet

Subject: Re: mpd client timeouts (bisected) 2.6.35-rc3

On Sunday, 13 June 2010 at 13:36 -0700, John Fastabend wrote:
> John Fastabend wrote:
> > David Miller wrote:
> >> From: Markus Trippelsdorf <[email protected]>
> >> Date: Sat, 12 Jun 2010 12:28:02 +0200
> >>
> >>> Commit 597a264b1a9c7e36d1728f677c66c5c1f7e3b837:
> >>> »net: deliver skbs on inactive slaves to exact matches«
> >>>
> >>> causes large timeouts when mpd clients try to connect to a locally
> >>> running mpd (music player daemon) on my machine. This makes it
> >>> impossible to control mpd.
> >>>
> >>> I bisected this down to the commit mentioned above.
> >>> Reverting the commit from 2.6.35-rc3 also solves the problem.
> >> John, find an easy and fast way to fix this or else I am
> >> going to revert.
> >>
> >> Thanks.
> >
> > Looks like skbs are hitting loopback_xmit() with deliver_no_wcard set. Then in
> > the receive path these skbs are only delivered to exact matches. Not sure why
> > this bit is set here, I'll track this down first thing tomorrow.
> >
> > Thanks,
> > John.
> > --
>
> Needed to set the wcard bit in copy_skb_header otherwise it will not be cleared
> when called from skb_clone. Which then hits the loopback device gets pushed
> into the rx path and is eventually dropped. The following patch fixes this.
> Hopefully, this is easy and fast enough for you Dave.
>
>
> [PATCH] net: fix deliver_no_wcard regression on loopback device
>
> deliver_no_wcard is not being set in skb_copy_header.
> In the skb_cloned case it is not being cleared and
> may cause the skb to be dropped when the loopback device
> pushes it back up the stack.
>
> Signed-off-by: John Fastabend <[email protected]>
> ---
>
> net/core/skbuff.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 9f07e74..bcf2fa3 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -532,6 +532,7 @@ static void __copy_skb_header(struct sk_buff *new, const
> struct sk_buff *old)
> new->ip_summed = old->ip_summed;
> skb_copy_queue_mapping(new, old);
> new->priority = old->priority;
> + new->deliver_no_wcard = old->deliver_no_wcard;
> #if defined(CONFIG_IP_VS) || defined(CONFIG_IP_VS_MODULE)
> new->ipvs_property = old->ipvs_property;
> #endif
> --

Acked-by: Eric Dumazet <[email protected]>

BTW, David, it seems there is a double rxhash copy...

[PATCH] net: rxhash already set in __copy_skb_header

No need to copy rxhash again in __skb_clone()

Signed-off-by: Eric Dumazet <[email protected]>
---
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9f07e74..a58e63b 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -569,7 +569,6 @@ static struct sk_buff *__skb_clone(struct sk_buff *n, struct sk_buff *skb)
 	C(len);
 	C(data_len);
 	C(mac_len);
-	C(rxhash);
 	n->hdr_len = skb->nohdr ? skb_headroom(skb) : skb->hdr_len;
 	n->cloned = 1;
 	n->nohdr = 0;

2010-06-13 20:59:28

by Markus Trippelsdorf

Subject: Re: mpd client timeouts (bisected) 2.6.35-rc3

On Sun, Jun 13, 2010 at 01:36:30PM -0700, John Fastabend wrote:
> John Fastabend wrote:
> >David Miller wrote:
> >>From: Markus Trippelsdorf <[email protected]>
> >>Date: Sat, 12 Jun 2010 12:28:02 +0200
> >>
> >>>Commit 597a264b1a9c7e36d1728f677c66c5c1f7e3b837:
> >>>»net: deliver skbs on inactive slaves to exact matches«
> >>>
> >>>causes large timeouts when mpd clients try to connect to a locally
> >>>running mpd (music player daemon) on my machine. This makes it
> >>>impossible to control mpd.
> >>>
> >>>I bisected this down to the commit mentioned above.
> >>>Reverting the commit from 2.6.35-rc3 also solves the problem.
> >>John, find an easy and fast way to fix this or else I am
> >>going to revert.
> >>
> >>Thanks.
> >
> >Looks like skbs are hitting loopback_xmit() with deliver_no_wcard
> >set. Then in the receive path these skbs are only delivered to
> >exact matches. Not sure why this bit is set here, I'll track this
> >down first thing tomorrow.
> >
> >Thanks,
> >John.
> >--
>
> Needed to set the wcard bit in copy_skb_header otherwise it will not
> be cleared when called from skb_clone. Which then hits the loopback
> device gets pushed into the rx path and is eventually dropped. The
> following patch fixes this. Hopefully, this is easy and fast enough
> for you Dave.
>
>
> [PATCH] net: fix deliver_no_wcard regression on loopback device

This solves the problem here. Thanks.

Tested-by: Markus Trippelsdorf <[email protected]>
--
Markus

2010-06-14 00:13:10

by David Miller

Subject: Re: mpd client timeouts (bisected) 2.6.35-rc3

From: John Fastabend <[email protected]>
Date: Sun, 13 Jun 2010 13:36:30 -0700

> Needed to set the wcard bit in copy_skb_header otherwise it will not
> be cleared when called from skb_clone. Which then hits the loopback
> device gets pushed into the rx path and is eventually dropped. The
> following patch fixes this. Hopefully, this is easy and fast enough
> for you Dave.
>
>
> [PATCH] net: fix deliver_no_wcard regression on loopback device
>
> deliver_no_wcard is not being set in skb_copy_header.
> In the skb_cloned case it is not being cleared and
> may cause the skb to be dropped when the loopback device
> pushes it back up the stack.
>
> Signed-off-by: John Fastabend <[email protected]>

Applied, but your email client corrupted this patch in many
ways. Please correct this for next time, thanks.

2010-06-14 00:14:32

by David Miller

Subject: Re: mpd client timeouts (bisected) 2.6.35-rc3

From: Eric Dumazet <[email protected]>
Date: Sun, 13 Jun 2010 22:50:46 +0200

> Acked-by: Eric Dumazet <[email protected]>

Eric, please never ACK a patch in the same mail in which you are
posting a new patch. Patchwork won't add your ACK to the patchwork
entry you are ACK'ing because all of your text will go into a new
patchwork entry for the patch you are posting.

> BTW, David, it seems there is a double rxhash copy...
>
> [PATCH] net: rxhash already set in __copy_skb_header
>
> No need to copy rxhash again in __skb_clone()
>
> Signed-off-by: Eric Dumazet <[email protected]>

I'll apply this, thanks.

2010-06-14 00:14:45

by David Miller

Subject: Re: mpd client timeouts (bisected) 2.6.35-rc3

From: "[email protected]" <[email protected]>
Date: Sun, 13 Jun 2010 22:59:22 +0200

> On Sun, Jun 13, 2010 at 01:36:30PM -0700, John Fastabend wrote:
>> [PATCH] net: fix deliver_no_wcard regression on loopback device
>
> This solves the problem here. Thanks.
>
> Tested-by: Markus Trippelsdorf <[email protected]>

Thanks for testing.

2010-06-14 14:17:56

by Michael S. Tsirkin

Subject: Re: mpd client timeouts (bisected) 2.6.35-rc3

On Sun, Jun 13, 2010 at 05:13:18PM -0700, David Miller wrote:
> From: John Fastabend <[email protected]>
> Date: Sun, 13 Jun 2010 13:36:30 -0700
>
> > Needed to set the wcard bit in copy_skb_header otherwise it will not
> > be cleared when called from skb_clone. Which then hits the loopback
> > device gets pushed into the rx path and is eventually dropped. The
> > following patch fixes this. Hopefully, this is easy and fast enough
> > for you Dave.
> >
> >
> > [PATCH] net: fix deliver_no_wcard regression on loopback device
> >
> > deliver_no_wcard is not being set in skb_copy_header.
> > In the skb_cloned case it is not being cleared and
> > may cause the skb to be dropped when the loopback device
> > pushes it back up the stack.
> >
> > Signed-off-by: John Fastabend <[email protected]>
>
> Applied, but your email client corrupted this patch in many
> ways. Please correct this for next time, thanks.

FWIW:

Tested-by: Michael S. Tsirkin <[email protected]>

--
MST

2010-06-16 10:01:29

by Christian Kujau

Subject: Re: mpd client timeouts (bisected) 2.6.35-rc3

On Sun, 13 Jun 2010 at 22:59, [email protected] wrote:
> This solves the problem here. Thanks.

Not sure if this is related, but I've noticed connection timeouts and
connections going into FIN_WAIT2 state (most of them SSH tunnels) with
2.6.35. Going back to 2.6.34 or applying John's (and Eric's) patch
does seem to fix this issue.

Thanks,
Christian.
--
BOFH excuse #168:

le0: no carrier: transceiver cable problem?

2010-06-17 05:17:08

by Alex Shi

Subject: RE: mpd client timeouts (bisected) 2.6.35-rc3

It fixed the aim7/specjbb2005 issues!

Tested-by: Alex Shi <[email protected]>


Regards!
Alex

>
>Needed to set the wcard bit in copy_skb_header otherwise it will not be cleared
>when called from skb_clone. Which then hits the loopback device gets pushed
>into the rx path and is eventually dropped. The following patch fixes this.
>Hopefully, this is easy and fast enough for you Dave.
>
>
>[PATCH] net: fix deliver_no_wcard regression on loopback device
>
>deliver_no_wcard is not being set in skb_copy_header.
>In the skb_cloned case it is not being cleared and
>may cause the skb to be dropped when the loopback device
>pushes it back up the stack.
>
>Signed-off-by: John Fastabend <[email protected]>
>---
>
> net/core/skbuff.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
>diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>index 9f07e74..bcf2fa3 100644
>--- a/net/core/skbuff.c
>+++ b/net/core/skbuff.c
>@@ -532,6 +532,7 @@ static void __copy_skb_header(struct sk_buff *new, const
>struct sk_buff *old)
> new->ip_summed = old->ip_summed;
> skb_copy_queue_mapping(new, old);
> new->priority = old->priority;
>+ new->deliver_no_wcard = old->deliver_no_wcard;
> #if defined(CONFIG_IP_VS) || defined(CONFIG_IP_VS_MODULE)
> new->ipvs_property = old->ipvs_property;
> #endif

2010-06-17 15:05:10

by Christoph Fritz

Subject: Re: mpd client timeouts (bisected) 2.6.35-rc3

On Sun, 2010-06-13 at 22:59 +0200, [email protected] wrote:
> On Sun, Jun 13, 2010 at 01:36:30PM -0700, John Fastabend wrote:

> > [PATCH] net: fix deliver_no_wcard regression on loopback device
>
> This solves the problem here. Thanks.

here too

Tested-by: Christoph Fritz <[email protected]>