2005-11-28 20:57:44

by Pradeep Vincent

Subject: [Patch] 2.4.32 - Neighbour Cache (ARP) State machine bug Fixed

In 2.4.21, the ARP code uses gc_timer to check for stale ARP cache
entries. In 2.6, each entry has its own timer for this check. Kernels
2.4.29 through 2.4.32 (at least) use neither of these timers.
This causes problems in environments where IPs or MACs are reassigned;
I saw this problem on networks built around load-balancing routers that
use VMACs. I tested this code on such networks as well as on peer Linux
systems.

Let me know if I need to contact someone else about this,

Thanks,

Pradeep Vincent


diff -Naur old/net/core/neighbour.c new/net/core/neighbour.c
--- old/net/core/neighbour.c Wed Nov 23 17:15:30 2005
+++ new/net/core/neighbour.c Wed Nov 23 17:26:01 2005
@@ -14,6 +14,7 @@
* Vitaly E. Lavrov releasing NULL neighbor in neigh_add.
* Harald Welte Add neighbour cache statistics like rtstat
* Harald Welte port neighbour cache rework from 2.6.9-rcX
+ * Pradeep Vincent Move neighbour cache entry to stale state
*/

#include <linux/config.h>
@@ -705,6 +706,14 @@
neigh_release(n);
continue;
}
+
+ /* Mark it stale - To be reconfirmed later when used */
+ if (n->nud_state&NUD_REACHABLE &&
+ now - n->confirmed > n->parms->reachable_time) {
+ n->nud_state = NUD_STALE;
+ neigh_suspect(n);
+ }
+
write_unlock(&n->lock);

next_elt:
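
For readers following the thread, here is a minimal userspace sketch of the check the hunk above adds. It is not kernel code: struct sketch_neigh, the SK_NUD_* constants and sketch_age_entry() are hypothetical stand-ins, and locking and neigh_suspect() are left out. It only illustrates the condition "REACHABLE and not confirmed within reachable_time", assuming tick counters behave like jiffies.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's NUD_* state flags (illustrative values). */
enum { SK_NUD_REACHABLE = 0x02, SK_NUD_STALE = 0x04 };

/* Hypothetical, trimmed-down view of a neighbour cache entry. */
struct sketch_neigh {
	unsigned int  nud_state;      /* current state flag */
	unsigned long confirmed;      /* tick of the last reachability confirmation */
	unsigned long reachable_time; /* ticks an entry may stay REACHABLE unconfirmed */
};

/*
 * Demote a REACHABLE entry to STALE once its last confirmation is older
 * than reachable_time. Returns true if the state was changed.
 * The real patch also calls neigh_suspect() and runs under n->lock;
 * both are omitted here.
 */
static bool sketch_age_entry(struct sketch_neigh *n, unsigned long now)
{
	if ((n->nud_state & SK_NUD_REACHABLE) &&
	    now - n->confirmed > n->reachable_time) {
		n->nud_state = SK_NUD_STALE;
		return true;
	}
	return false;
}
```

Because the subtraction is done on unsigned long values, the age comparison stays correct when the tick counter wraps around, which is the same property the patch relies on for now - n->confirmed.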


2005-11-28 21:40:13

by Roberto Nibali

Subject: Re: [Patch] 2.4.32 - Neighbour Cache (ARP) State machine bug Fixed

> In 2.4.21, arp code uses gc_timer to check for stale arp cache
> entries. In 2.6, each entry has its own timer to check for stale arp
> cache. 2.4.29 to 2.4.32 kernels (at least) use neither of these timers.

Regarding the NUD_REACHABLE <-> NUD_STALE transition: it has a timer check.
Because of the fast path it is not enabled by default. Use neigh_sync() to
check, although I believe the installed tasklet already does this for you.

The knowledgeable netdev people will know better.

> This causes problems in environments where IPs or MACs are reassigned
> - saw this problem on load balancing router based networks that use
> VMACs. Tested this code on load balancing router based networks as
> well as peer-linux systems.

How do you use VMACs in 2.4.x?

> diff -Naur old/net/core/neighbour.c new/net/core/neighbour.c
> --- old/net/core/neighbour.c Wed Nov 23 17:15:30 2005
> +++ new/net/core/neighbour.c Wed Nov 23 17:26:01 2005
> @@ -14,6 +14,7 @@
> * Vitaly E. Lavrov releasing NULL neighbor in neigh_add.
> * Harald Welte Add neighbour cache statistics like rtstat
> * Harald Welte port neighbour cache rework from 2.6.9-rcX
> + * Pradeep Vincent Move neighbour cache entry to stale state
> */
>
> #include <linux/config.h>
> @@ -705,6 +706,14 @@
> neigh_release(n);
> continue;
> }
> +
> + /* Mark it stale - To be reconfirmed later when used */
> + if (n->nud_state&NUD_REACHABLE &&
> + now - n->confirmed > n->parms->reachable_time) {
> + n->nud_state = NUD_STALE;
> + neigh_suspect(n);
> + }
> +

If this is really a problem, why not simply call neigh_sync()? Your
patch also seems to be whitespace damaged.

> write_unlock(&n->lock);
>
> next_elt:

I've cc'd netdev since this is where such patches should go for
discussion; left Linus in the loop (netiquette) although he has nothing
to do with this ;).

Cheers,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc

2006-02-04 02:06:57

by Pradeep Vincent

Subject: Fwd: [Patch] 2.4.32 - Neighbour Cache (ARP) State machine bug Fixed

Resending..

---------- Forwarded message ----------
From: Pradeep Vincent <[email protected]>
Date: Nov 28, 2005 12:57 PM
Subject: [Patch] 2.4.32 - Neighbour Cache (ARP) State machine bug Fixed
To: [email protected], [email protected]
Cc: [email protected]


In 2.4.21, the ARP code uses gc_timer to check for stale ARP cache
entries. In 2.6, each entry has its own timer for this check. Kernels
2.4.29 through 2.4.32 (at least) use neither of these timers.
This causes problems in environments where IPs or MACs are reassigned;
I saw this problem on networks built around load-balancing routers that
use VMACs. I tested this code on such networks as well as on peer Linux
systems.

Let me know if I need to contact someone else about this,

Thanks,

Pradeep Vincent


diff -Naur old/net/core/neighbour.c new/net/core/neighbour.c
--- old/net/core/neighbour.c Wed Nov 23 17:15:30 2005
+++ new/net/core/neighbour.c Wed Nov 23 17:26:01 2005
@@ -14,6 +14,7 @@
* Vitaly E. Lavrov releasing NULL neighbor in neigh_add.
* Harald Welte Add neighbour cache statistics like rtstat
* Harald Welte port neighbour cache rework from 2.6.9-rcX
+ * Pradeep Vincent Move neighbour cache entry to stale state
*/

#include <linux/config.h>
@@ -705,6 +706,14 @@
neigh_release(n);
continue;
}
+
+ /* Mark it stale - To be reconfirmed later when used */
+ if (n->nud_state&NUD_REACHABLE &&
+ now - n->confirmed > n->parms->reachable_time) {
+ n->nud_state = NUD_STALE;
+ neigh_suspect(n);
+ }
+
write_unlock(&n->lock);

next_elt:

2006-02-04 02:18:42

by David Miller

Subject: Re: [Patch] 2.4.32 - Neighbour Cache (ARP) State machine bug Fixed

From: Pradeep Vincent <[email protected]>
Date: Fri, 3 Feb 2006 18:06:53 -0800

> Resending..

Your email client has tab and newline mangled the patch so it
cannot be applied. Please fix this up and also supply an
appropriate "Signed-off-by: " line.

Thanks.

2006-02-07 07:57:54

by Pradeep Vincent

Subject: Re: [Patch] 2.4.32 - Neighbour Cache (ARP) State machine bug Fixed

In 2.4.21, the ARP code uses gc_timer to check for stale ARP cache
entries. In 2.6, each entry has its own timer for this check. Kernels
2.4.29 through 2.4.32 (at least) use neither of these timers.
This causes problems in environments where IPs or MACs are reassigned;
I saw this problem on networks built around load-balancing routers that
use VMACs. I tested this code on such networks as well as on peer Linux
systems.


Thanks,


Signed-off-by: Pradeep Vincent <[email protected]>

diff -Naur old/net/core/neighbour.c new/net/core/neighbour.c
--- old/net/core/neighbour.c Wed Nov 23 17:15:30 2005
+++ new/net/core/neighbour.c Wed Nov 23 17:26:01 2005
@@ -14,6 +14,7 @@
* Vitaly E. Lavrov releasing NULL neighbor in neigh_add.
* Harald Welte Add neighbour cache statistics like rtstat
* Harald Welte port neighbour cache rework from 2.6.9-rcX
+ * Pradeep Vincent Move neighbour cache entry to stale state
*/

#include <linux/config.h>
@@ -705,6 +706,14 @@
neigh_release(n);
continue;
}
+
+ /* Mark it stale - To be reconfirmed later when used */
+ if (n->nud_state&NUD_REACHABLE &&
+ now - n->confirmed > n->parms->reachable_time) {
+ n->nud_state = NUD_STALE;
+ neigh_suspect(n);
+ }
+
write_unlock(&n->lock);


On 2/3/06, David S. Miller <[email protected]> wrote:
> From: Pradeep Vincent <[email protected]>
> Date: Fri, 3 Feb 2006 18:06:53 -0800
>
> > Resending..
>
> Your email client has tab and newline mangled the patch so it
> cannot be applied. Please fix this up and also supply an
> appropriate "Signed-off-by: " line.
>
> Thanks.
>

2006-02-07 21:53:52

by Willy Tarreau

Subject: Re: [Patch] 2.4.32 - Neighbour Cache (ARP) State machine bug Fixed

Hi,

On Tue, Feb 07, 2006 at 12:57:43AM -0700, Pradeep Vincent wrote:
> In 2.4.21, arp code uses gc_timer to check for stale arp cache
> entries. In 2.6, each entry has its own timer to check for stale arp
> cache. 2.4.29 to 2.4.32 kernels (at least) use neither of these timers.
> This causes problems in environments where IPs or MACs are reassigned
> - saw this problem on load balancing router based networks that use
> VMACs. Tested this code on load balancing router based networks as
> well as peer-linux systems.
>
>
> Thanks,
>
>
> Signed off by: Pradeep Vincent <[email protected]>
>
> diff -Naur old/net/core/neighbour.c new/net/core/neighbour.c
> --- old/net/core/neighbour.c Wed Nov 23 17:15:30 2005
> +++ new/net/core/neighbour.c Wed Nov 23 17:26:01 2005
> @@ -14,6 +14,7 @@
> * Vitaly E. Lavrov releasing NULL neighbor in neigh_add.
> * Harald Welte Add neighbour cache statistics like rtstat
> * Harald Welte port neighbour cache rework from 2.6.9-rcX
> + * Pradeep Vincent Move neighbour cache entry to stale state
> */

As you can see above, your mailer is still broken. Leading spaces get
removed and it seems like tabs are replaced with spaces. This makes it
really annoying to fix by hand because we all have to do your work again.
You should try to fix your mailer options, possibly by sending a few
mails to yourself or someone else (if you send *a few* mails to me, I
can confirm which one looks OK). If your mailer is definitely broken,
then you may send it as plain text first (for review), with a text
attachment for people to apply it without trouble.
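
One way to catch the stripped-leading-space kind of damage before sending is to scan the diff for hunk lines that lack a legal prefix. The following is only a rough sketch under stated assumptions: sketch_count_damaged() is a hypothetical helper, it treats everything after the first "@@" line as hunk body (so it suits a single-file diff), and it cannot detect tabs that were converted to spaces inside a line.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count lines inside unified-diff hunks that do not begin with one of the
 * legal prefixes: ' ' (context), '+' (added), '-' (removed) or '\'
 * ("\ No newline at end of file"). An empty line inside a hunk is counted
 * too, since an intact blank context line still carries its leading space.
 * Assumption: every line after the first "@@" header is hunk body or
 * another "@@" header.
 */
static int sketch_count_damaged(const char *diff)
{
	int damaged = 0;
	int in_hunk = 0;
	const char *p = diff;

	while (*p != '\0') {
		const char *eol = strchr(p, '\n');
		size_t len = eol ? (size_t)(eol - p) : strlen(p);

		if (len >= 2 && p[0] == '@' && p[1] == '@') {
			in_hunk = 1;	/* hunk header line */
		} else if (in_hunk) {
			if (len == 0 || strchr(" +-\\", p[0]) == NULL)
				damaged++;
		}

		p += len;
		if (*p == '\n')
			p++;
	}
	return damaged;
}
```

A result of zero does not prove the patch will apply; it only rules out this one kind of mangling. Mailing the patch to yourself and applying the received copy remains the more reliable check.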

Thanks,
Willy

2006-02-08 01:50:06

by Pradeep Vincent

Subject: Re: [Patch] 2.4.32 - Neighbour Cache (ARP) State machine bug Fixed

One more attempt. Attaching the diff file as well.

Signed-off-by: Pradeep Vincent <[email protected]>

--- old/net/core/neighbour.c Wed Nov 9 16:48:10 2005
+++ new/net/core/neighbour.c Tue Feb 7 17:38:26 2006
@@ -14,6 +14,7 @@
* Vitaly E. Lavrov releasing NULL neighbor in neigh_add.
* Harald Welte Add neighbour cache statistics like rtstat
* Harald Welte port neighbour cache rework from 2.6.9-rcX
+ * Pradeep Vincent fix neighbour cache state machine
*/

#include <linux/config.h>
@@ -705,6 +706,13 @@
neigh_release(n);
continue;
}
+ /* Move to NUD_STALE state */
+ if (n->nud_state&NUD_REACHABLE &&
+ now - n->confirmed > n->parms->reachable_time) {
+ n->nud_state = NUD_STALE;
+ neigh_suspect(n);
+ }
+
write_unlock(&n->lock);

next_elt:

Thanks,

Pradeep

On 2/7/06, Willy Tarreau <[email protected]> wrote:
> Hi,
>
> On Tue, Feb 07, 2006 at 12:57:43AM -0700, Pradeep Vincent wrote:
> > In 2.4.21, arp code uses gc_timer to check for stale arp cache
> > entries. In 2.6, each entry has its own timer to check for stale arp
> > cache. 2.4.29 to 2.4.32 kernels (at least) use neither of these timers.
> > This causes problems in environments where IPs or MACs are reassigned
> > - saw this problem on load balancing router based networks that use
> > VMACs. Tested this code on load balancing router based networks as
> > well as peer-linux systems.
> >
> >
> > Thanks,
> >
> >
> > Signed off by: Pradeep Vincent <[email protected]>
> >
> > diff -Naur old/net/core/neighbour.c new/net/core/neighbour.c
> > --- old/net/core/neighbour.c Wed Nov 23 17:15:30 2005
> > +++ new/net/core/neighbour.c Wed Nov 23 17:26:01 2005
> > @@ -14,6 +14,7 @@
> > * Vitaly E. Lavrov releasing NULL neighbor in neigh_add.
> > * Harald Welte Add neighbour cache statistics like rtstat
> > * Harald Welte port neighbour cache rework from 2.6.9-rcX
> > + * Pradeep Vincent Move neighbour cache entry to stale state
> > */
>
> As you can see above, your mailer is still broken. Leading spaces get
> removed and it seems like tabs are replaced with spaces. This makes it
> really annoying to fix by hand because we all have to do your work again.
> You should try to fix your mailer options, possibly by sending a few
> mails to yourself or someone else (if you send *a few* mails to me, I
> can confirm which one looks OK). If your mailer is definitely broken,
> then you may send it as plain text first (for review), with a text
> attachment for people to apply it without trouble.
>
> Thanks,
> Willy
>
>


Attachments:
(No filename) (2.75 kB)
linux-2.4.29-arp-fix.patch (691.00 B)

2006-02-08 02:13:13

by Grant Coady

Subject: Re: [Patch] 2.4.32 - Neighbour Cache (ARP) State machine bug Fixed

On Tue, 7 Feb 2006 17:50:03 -0800, Pradeep Vincent <[email protected]> wrote:

>One more attempt. Attaching the diff file as well.
>
>Signed off by: Pradeep Vincent <[email protected]>
>
>--- old/net/core/neighbour.c Wed Nov 9 16:48:10 2005
>+++ new/net/core/neighbour.c Tue Feb 7 17:38:26 2006
>@@ -14,6 +14,7 @@
> * Vitaly E. Lavrov releasing NULL neighbor in neigh_add.
> * Harald Welte Add neighbour cache statistics like rtstat
> * Harald Welte port neighbour cache rework from 2.6.9-rcX
>+ * Pradeep Vincent fix neighbour cache state machine
> */
>
> #include <linux/config.h>
>@@ -705,6 +706,13 @@
> neigh_release(n);
> continue;
> }
>+ /* Move to NUD_STALE state */
>+ if (n->nud_state&NUD_REACHABLE &&
>+ now - n->confirmed > n->parms->reachable_time) {

Hmm, you're suffering tab -> space conversion syndrome :(

Grant.

2006-02-10 20:32:51

by Bill Davidsen

Subject: Re: [Patch] 2.4.32 - Neighbour Cache (ARP) State machine bug Fixed

Grant Coady wrote:
> On Tue, 7 Feb 2006 17:50:03 -0800, Pradeep Vincent <[email protected]> wrote:
>
>
>>One more attempt. Attaching the diff file as well.
>>
>>Signed off by: Pradeep Vincent <[email protected]>
>>
>>--- old/net/core/neighbour.c Wed Nov 9 16:48:10 2005
>>+++ new/net/core/neighbour.c Tue Feb 7 17:38:26 2006
>>@@ -14,6 +14,7 @@
>> * Vitaly E. Lavrov releasing NULL neighbor in neigh_add.
>> * Harald Welte Add neighbour cache statistics like rtstat
>> * Harald Welte port neighbour cache rework from 2.6.9-rcX
>>+ * Pradeep Vincent fix neighbour cache state machine
>> */
>>
>>#include <linux/config.h>
>>@@ -705,6 +706,13 @@
>> neigh_release(n);
>> continue;
>> }
>>+ /* Move to NUD_STALE state */
>>+ if (n->nud_state&NUD_REACHABLE &&
>>+ now - n->confirmed > n->parms->reachable_time) {
>
>
> Hmm, you're suffering tab -> space conversion syndrome :(
>
> Grant.

The attachment has tabs here, don't know what you're seeing.

--
bill davidsen <[email protected]>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979