2006-03-11 00:32:13

by Greg Scott

Subject: Router stops routing after changing MAC Address

Hello - This feels like a kernel issue. I spent hours and hours and
hours looking for documentation and archives around this but did not
find anything.

I have a Linux router and I need the ability to swap hardware without
causing downtime. The problem, of course, is ARPs. The NICs in the
replacement system need the same MAC Addresses as the NICs in the
original system. I'd like this all to be in the kernel and not depend
on a daemon process that can die.

How to change MAC addresses is documented well enough - and it works -
but when I change MAC addresses, my router stops routing. From the
router, I can see the systems on both sides - but the router just
refuses to forward packets. Here are my little test scripts to change
MAC Addresses.

First - ip-fudge-mac.sh
[root@test-fw2 gregs]# more ip-fudge-mac.sh
ip link set eth0 down
ip link set eth0 address 01:02:03:04:05:06
ip link set eth0 up

ip link set eth1 down
ip link set eth1 address 17:20:16:01:60:03
ip link set eth1 up

echo "1" > /proc/sys/net/ipv4/ip_forward



Now original-mac.sh

[root@test-fw2 gregs]# more original-mac.sh
ifdown eth0
ifconfig eth0 hw ether 00:c1:28:01:d8:07
ifup eth0

ifdown eth1
ifconfig eth1 hw ether 00:50:da:90:e4:aa
ifup eth1

echo "1" > /proc/sys/net/ipv4/ip_forward

I have systems both on the left and right side of my test router. Here
is some output from the router with tcpdump showing what happens. I
replaced the first 3 real public IP Address octets with "1.2.3". The
first set of tcpdump records shows it forwarding when the original
hardware MAC Addresses are set. We see round trips from the left side to
the right side and back with echo request and reply packets.

The second set shows what happens after changing MAC Addresses. We only
see packets come in on the left side - but nothing happening on the
right side.

Packet forwarding must somehow depend on MAC Addresses but I cannot find
anything anywhere that tells me how this works.

I reproduced this problem on at least two different Linux routers - one
running 2.4.27, the other running 2.6.11-1. Am I asking the kernel to
do something bad? What would it take to put together a patch to change
this behavior?


[root@test-fw2 gregs]# ./original-mac.sh
[root@test-fw2 gregs]# /usr/sbin/tcpdump -i eth1 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol
decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
17:14:51.010439 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 479
17:14:51.010537 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 479
17:14:52.010448 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 480
17:14:52.010621 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 480
17:14:53.010531 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 481
17:14:53.010696 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 481
17:14:54.010716 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 482
17:14:54.010882 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 482

8 packets captured
8 packets received by filter
0 packets dropped by kernel
[root@test-fw2 gregs]# ./ip-fudge-mac.sh
[root@test-fw2 gregs]# /usr/sbin/tcpdump -i eth1 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol
decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
17:15:10.031945 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 498
17:15:11.031980 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 499
17:15:11.806487 fe80::1520:16ff:fe01:6003 > ff02::2: icmp6: router
solicitation
17:15:12.032062 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 500
17:15:13.032154 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 501
17:15:14.032222 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 502
17:15:15.032305 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 503
17:15:15.805873 fe80::1520:16ff:fe01:6003 > ff02::2: icmp6: router
solicitation
17:15:16.032394 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 504
17:15:17.032465 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 505

10 packets captured
10 packets received by filter
0 packets dropped by kernel
[root@test-fw2 gregs]#
[root@test-fw2 gregs]# /usr/sbin/tcpdump -i eth0 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol
decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes

0 packets captured
0 packets received by filter
0 packets dropped by kernel
[root@test-fw2 gregs]#


Thanks

- Greg Scott


2006-03-11 00:40:05

by Stephen Hemminger

Subject: Re: Router stops routing after changing MAC Address

On Fri, 10 Mar 2006 18:33:15 -0600
"Greg Scott" <[email protected]> wrote:

> Hello - This feels like a kernel issue. I spent hours and hours and
> hours looking for documentation and archives around this but did not
> find anything.
>
> I have a Linux router and I need the ability to swap hardware without
> causing downtime. The problem, of course, is ARPs. The NICs in the
> replacement system need the same MAC Addresses as the NICs in the
> original system. I'd like this all to be in the kernel and not depend
> on a daemon process that can die.
>
> How to change MAC addresses is documented well enough - and it works -
> but when I change MAC addresses, my router stops routing. From the
> router, I can see the systems on both sides - but the router just
> refuses to forward packets. Here are my little test scripts to change
> MAC Addresses.

You probably just need to flush the route cache after the address change?

2006-03-11 00:43:59

by Michael Clark

Subject: Re: Router stops routing after changing MAC Address

Stephen Hemminger wrote:

>On Fri, 10 Mar 2006 18:33:15 -0600
>"Greg Scott" <[email protected]> wrote:
>
>
>
>>Hello - This feels like a kernel issue. I spent hours and hours and
>>hours looking for documentation and archives around this but did not
>>find anything.
>>
>>I have a Linux router and I need the ability to swap hardware without
>>causing downtime. The problem, of course, is ARPs. The NICs in the
>>replacement system need the same MAC Addresses as the NICs in the
>>original system. I'd like this all to be in the kernel and not depend
>>on a daemon process that can die.
>>
>>How to change MAC addresses is documented well enough - and it works -
>>but when I change MAC addresses, my router stops routing. From the
>>router, I can see the systems on both sides - but the router just
>>refuses to forward packets. Here are my little test scripts to change
>>MAC Addresses.
>>
>>
>
>You probably just need to flush the route cache after the address change?
>
>
Or do a gratuitous ARP for the address with the new HW address (a tool to
do this is included in HA failover software such as heartbeat and others).

~mc

2006-03-11 00:50:12

by Bart Samwel

Subject: Re: Router stops routing after changing MAC Address

Greg Scott wrote:
> Hello - This feels like a kernel issue. I spent hours and hours and
> hours looking for documentation and archives around this but did not
> find anything.
>
> I have a Linux router and I need the ability to swap hardware without
> causing downtime. The problem, of course, is ARPs. The NICs in the
> replacement system need the same MAC Addresses as the NICs in the
> original system. I'd like this all to be in the kernel and not depend
> on a daemon process that can die.
>
> How to change MAC addresses is documented well enough - and it works -
> but when I change MAC addresses, my router stops routing. From the
> router, I can see the systems on both sides - but the router just
> refuses to forward packets. Here are my little test scripts to change
> MAC Addresses.
>
> First - ip-fudge-mac.sh
> [root@test-fw2 gregs]# more ip-fudge-mac.sh
> ip link set eth0 down
> ip link set eth0 address 01:02:03:04:05:06
> ip link set eth0 up
>
> ip link set eth1 down
> ip link set eth1 address 17:20:16:01:60:03
> ip link set eth1 up
>
> echo "1" > /proc/sys/net/ipv4/ip_forward
>
>
>
> Now original-mac.sh
>
> [root@test-fw2 gregs]# more original-mac.sh
> ifdown eth0
> ifconfig eth0 hw ether 00:c1:28:01:d8:07
> ifup eth0
>
> ifdown eth1
> ifconfig eth1 hw ether 00:50:da:90:e4:aa
> ifup eth1
>
> echo "1" > /proc/sys/net/ipv4/ip_forward
>
> I have systems both on the left and right side of my test router. Here
> is some output from the router with tcpdump showing what happens. I
> replaced the first 3 real public IP Address octets with "1.2.3". The
> first set of tcpdump records shows it forwarding when the original
> hardware MAC Addresses are set. We see round trips from the left side to
> the right side and back with echo request and reply packets.
>
> The second set shows what happens after changing MAC Addresses. We only
> see packets come in on the left side - but nothing happening on the
> right side.
>
> Packet forwarding must somehow depend on MAC Addresses but I cannot find
> anything anywhere that tells me how this works.
>
> I reproduced this problem on at least two different Linux routers - one
> running 2.4.27, the other running 2.6.11-1. Am I asking the kernel to
> do something bad? What would it take to put together a patch to change
> this behavior?
>
>
> [root@test-fw2 gregs]# ./original-mac.sh
> [root@test-fw2 gregs]# /usr/sbin/tcpdump -i eth1 -n
> tcpdump: verbose output suppressed, use -v or -vv for full protocol
> decode
> listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
> 17:14:51.010439 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 479
> 17:14:51.010537 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 479
> 17:14:52.010448 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 480
> 17:14:52.010621 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 480
> 17:14:53.010531 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 481
> 17:14:53.010696 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 481
> 17:14:54.010716 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 482
> 17:14:54.010882 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 482
>
> 8 packets captured
> 8 packets received by filter
> 0 packets dropped by kernel
> [root@test-fw2 gregs]# ./ip-fudge-mac.sh
> [root@test-fw2 gregs]# /usr/sbin/tcpdump -i eth1 -n
> tcpdump: verbose output suppressed, use -v or -vv for full protocol
> decode
> listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
> 17:15:10.031945 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 498
> 17:15:11.031980 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 499
> 17:15:11.806487 fe80::1520:16ff:fe01:6003 > ff02::2: icmp6: router
> solicitation
> 17:15:12.032062 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 500
> 17:15:13.032154 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 501
> 17:15:14.032222 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 502
> 17:15:15.032305 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 503
> 17:15:15.805873 fe80::1520:16ff:fe01:6003 > ff02::2: icmp6: router
> solicitation
> 17:15:16.032394 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 504
> 17:15:17.032465 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 505
>
> 10 packets captured
> 10 packets received by filter
> 0 packets dropped by kernel
> [root@test-fw2 gregs]#
> [root@test-fw2 gregs]# /usr/sbin/tcpdump -i eth0 -n
> tcpdump: verbose output suppressed, use -v or -vv for full protocol
> decode
> listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
>
> 0 packets captured
> 0 packets received by filter
> 0 packets dropped by kernel
> [root@test-fw2 gregs]#

I think you're not testing hotswapping machines with equal MAC addresses
here, you're testing hot-changing the MAC address for a gateway IP. The
machine on the "right side" that the machine on the left side is pinging
probably still has the old MAC address for its gateway in its ARP
cache, so the echo reply will be sent to the wrong MAC address. (Or am I
talking nonsense here?)

--Bart

2006-03-11 01:17:05

by David Miller

Subject: Re: Router stops routing after changing MAC Address

From: Stephen Hemminger <[email protected]>
Date: Fri, 10 Mar 2006 16:39:58 -0800

> You probably just need to flush the route cache after the address change?

It's probably a good idea for the routing cache to catch
this event, if that's all that needs to be done.

2006-03-11 02:32:26

by Greg Scott

Subject: RE: Router stops routing after changing MAC Address

The ARP caches on both the left and right side systems make sense and
show what I would expect.

When I do arp -a on both the left and right side systems, both systems
show the original MAC Addresses for the router when the router uses the
originals, and both show the fudged MAC Addresses when the router is set
with the fudged MAC Addresses. Besides, both the left and right systems
can ping the router regardless of the MAC Address of the router NICs.
To my mind, this rules out ARP cache issues.

Flushing the route cache on the router has no effect. I flushed and
reset everything after putting in the fudged MAC Address like this:

[root@test-fw2 gregs]# more fix-routes.sh
#!/bin/sh

echo "Routes before updating"
/sbin/ip route show

echo "Default route"
/sbin/ip route flush 0.0.0.0/0
/sbin/ip route add 0.0.0.0/0 via 1.2.3.49 dev eth0

echo "Route to the right"
/sbin/ip route flush 1.2.3.48/29
/sbin/ip route add 1.2.3.48/29 dev eth0

echo "Route to the left"
/sbin/ip route flush 172.16.0.0/16
/sbin/ip route add 172.16.0.0/16 dev eth1

echo "Routes after updating"
/sbin/ip route show

exit


Here are the results - with my pings still running from the system on
the left. And then, going back to the original MAC Address - even
without flushing any routes - the router forwards packets again when the
MAC Addresses match the hardware. I don't think the problem has
anything to do with ARP caches or routes or anything I know how to set
external to the kernel. I think the issue is deeper than routing
caches, unless there are routes someplace it won't show me.


[root@test-fw2 gregs]# ./fix-routes.sh
Routes before updating
1.2.3.48/29 dev eth0 scope link
1.2.3.48/28 dev eth0 proto kernel scope link src 1.2.3.50
10.10.10.0/24 dev eth2 proto kernel scope link src 10.10.10.76
172.16.0.0/16 dev eth1 scope link
default via 1.2.3.49 dev eth0
Default route
Route to the right
Route to the left
Routes after updating
1.2.3.48/29 dev eth0 scope link
1.2.3.48/28 dev eth0 proto kernel scope link src 1.2.3.50
10.10.10.0/24 dev eth2 proto kernel scope link src 10.10.10.76
172.16.0.0/16 dev eth1 scope link
default via 1.2.3.49 dev eth0
[root@test-fw2 gregs]#
[root@test-fw2 gregs]#
[root@test-fw2 gregs]# /usr/sbin/tcpdump -i eth1 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol
decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
19:19:37.714471 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq
7962
19:19:38.714497 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq
7963
19:19:39.714575 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq
7964
19:19:40.714668 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq
7965

4 packets captured
4 packets received by filter
0 packets dropped by kernel
[root@test-fw2 gregs]# ./original-mac.sh
[root@test-fw2 gregs]# /usr/sbin/tcpdump -i eth1 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol
decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
19:20:00.726334 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq
7985
19:20:00.726437 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 7985
19:20:01.176923 fe80::250:daff:fe90:e4aa > ff02::2: icmp6: router
solicitation
19:20:01.726354 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq
7986
19:20:01.726528 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 7986
19:20:02.726445 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq
7987
19:20:02.726614 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 7987
19:20:03.726504 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq
7988
19:20:03.726678 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 7988

9 packets captured
9 packets received by filter
0 packets dropped by kernel
[root@test-fw2 gregs]#




2006-03-11 03:07:20

by Greg Scott

Subject: RE: Router stops routing after changing MAC Address


> I think you're not testing hotswapping machines with equal
> MAC addresses here, you're testing hot-changing the MAC
> address for a gateway IP. The machine on the "right side"
> that the machine on the left side is pinging probably still
> has the old MAC address for its gateway in its ARP cache,
> so the echo reply will be sent to the wrong MAC address. (
> Or am I talking nonsense here?)
>
> --Bart

I sometimes wonder if I'm going crazy myself. :)

My ultimate goal is to hotswap machines with equal MAC Addresses. I
built up two machines, hotswapped, and pinged to each one - and it all
worked. Who would have believed they would refuse to forward packets
when I tried to put them into production? After my installation went
haywire, my little testbed right now has one gateway in the middle and
one system on the right and one on the left. So, yes, right now I am
hot-changing MAC addresses on this gateway, trying to get closer to the
problem where routers don't route when the MAC Address is different than
the hardware MAC Address.

Anyway, just to be completely anal about this I put the original MAC
Addresses on the middle gateway and set up the system on the right to
ping the middle gateway. This worked. With the pings still going, I
fudged the MAC Addresses in the middle gateway. The echo replies to the
right stopped for about 30 seconds while its ARP cache cleared. After
about 30 seconds, the echo replies started coming again as expected.
But the left could never ping the right after fudging MAC Addresses on
the middle gateway.

So far, no matter what, left and right do not see each other when the
middle has fudged MAC Addresses. But when middle has hardware MAC
Addresses, left and right see each other just fine.

I don't see any way the problem could be related to ARP caches.

- Greg

2006-03-12 15:33:14

by Greg Scott

Subject: RE: Router stops routing after changing MAC Address

Bart and I had a private discussion about this. I was able to prove
that routing stops when "fudged" MAC Addresses on the router don't match
the hardware MAC Addresses. And routing starts back up again when I
change the "fudged" MAC Addresses back to match the hardware MAC
Addresses.

There are lots of Linux router failover implementations. They all seem
to use gratuitous ARPs instead of a virtual MAC Address and now I see
why - it doesn't look like the kernel completely supports virtual MAC
Addresses. But I don't think gratuitous ARPs are the right answer -
they are still disruptive in a failover and something somewhere will
break. If the kernel could do routing with virtual MAC Addresses, we
wouldn't need gratuitous ARP.

There must be a kernel data structure someplace that has the "fudged"
MAC Address. I wonder how difficult it would be to do a patch to make
routing use this same data structure instead of the hardware MAC
Address?

- Greg Scott




2006-03-12 18:08:23

by Alan

Subject: RE: Router stops routing after changing MAC Address

On Sun, 2006-03-12 at 09:34 -0600, Greg Scott wrote:
> Bart and I had a private discussion about this. I was able to prove
> that routing stops when "fudged" MAC Addresses on the router don't match
> the hardware MAC Addresses. And routing starts back up again when the I
> change the "fudged" MAC Addresses back to match the hardware MAC
> Addresses.

Which driver, and does it occur with other drivers. Also you really want
to move this to [email protected] to get the best network folks to see
it.

2006-03-12 19:37:19

by Greg Scott

Subject: RE: Router stops routing after changing MAC Address

I think the NICs on all the systems are 3c905b's. The system with the
2.4 kernel on it has them and I think that is what I put in my 2.6.11
test system as well. My 2.6 system doesn't have a modules.conf file so
I will need to dig a little deeper. I suppose I could just open it up
and look. But I am almost sure I put 3c905b cards in both test systems.

- Greg




2006-03-12 20:42:25

by Randy Dunlap

Subject: Re: Router stops routing after changing MAC Address

On Sun, 12 Mar 2006 18:14:13 +0000 Alan Cox wrote:

> On Sun, 2006-03-12 at 09:34 -0600, Greg Scott wrote:
> > Bart and I had a private discussion about this. I was able to prove
> > that routing stops when "fudged" MAC Addresses on the router don't match
> > the hardware MAC Addresses. And routing starts back up again when the I
> > change the "fudged" MAC Addresses back to match the hardware MAC
> > Addresses.
>
> Which driver, and does it occur with other drivers. Also you really want
> to move this to [email protected] to get the best network folks to see
> it.

that is now [email protected] ....


---
~Randy
Please use an email client that implements proper (compliant) threading.
(You know who you are.)

2006-03-12 20:48:44

by Alan

Subject: RE: Router stops routing after changing MAC Address

On Sun, 2006-03-12 at 13:38 -0600, Greg Scott wrote:
> I think the NICs on all the systems are 3c905b's. The system with the
> 2.4 kernel on it has them and I think that is what I put in my 2.6-11
> test system as well. My 2.6 system doesn't have a modules.conf file so
> I will need to dig a little deeper. I suppose I could just open it up
> and look. But I am almost sure I put 3c905b cards in both test systems.

Humm - do they start routing correctly if you "ifconfig eth0
promisc" (where eth0 is each interface whose mac you changed) ?

2006-03-12 20:50:59

by Greg Scott

Subject: RE: Router stops routing after changing MAC Address

No - tried that earlier, no effect. I also deleted and recreated all
the routes. Again, no effect.

- Greg



2006-03-13 06:15:50

by Chuck Ebbert

Subject: Re: Router stops routing after changing MAC Address

In-Reply-To: <925A849792280C4E80C5461017A4B8A20321CC@mail733.InfraSupportEtc.com>

On Fri, 10 Mar 2006 18:33:15 -0600, Greg Scott wrote:

> How to change MAC addresses is documented well enough - and it works -
> but when I change MAC addresses, my router stops routing. From the
> router, I can see the systems on both sides - but the router just
> refuses to forward packets. Here are my little test scripts to change
> MAC Addresses.
>
> First - ip-fudge-mac.sh
> [root@test-fw2 gregs]# more ip-fudge-mac.sh
> ip link set eth0 down
> ip link set eth0 address 01:02:03:04:05:06
                            ^
Bit zero is set, so this is a multicast address. Is that intentional?

> ip link set eth0 up
>
> ip link set eth1 down
> ip link set eth1 address 17:20:16:01:60:03
                            ^
Ditto.

> ip link set eth1 up
>
> echo "1" > /proc/sys/net/ipv4/ip_forward


--
Chuck
"Penguins don't come from next door, they come from the Antarctic!"

2006-03-13 12:14:30

by Greg Scott

Subject: RE: Router stops routing after changing MAC Address

On eth0 - no. My "fudged" MAC Address is based on the IP Address. So
1.2.3.50 becomes 001.002.003.050, which turns into 00:10:02:00:30:50.
But 1.2.3 is fake - it isn't the one I really use. The other one,
172.16.16.3 - that is a real IP Address that turns into
17:20:16:01:60:03. And here I thought I was pretty clever - it never
dawned on me in my wildest dreams that those bits had any special
meaning! I will do some homework about what all the bits mean and then
put together another scheme for my fudged MAC Addresses and post the
results here.

- Greg




2006-03-13 17:16:20

by Greg Scott

Subject: RE: Router stops routing after changing MAC Address

HOT DOGGIES!!!!!!!!!!

I think Chuck found the problem. It turns out that the OUI portion of
the MAC Address - those leftmost 6 hex digits that identify the vendor -
also has some other special meaning built in. Chuck, I am indebted
to you and the list. If the second hex digit is odd, this means the
low-order bit of the first octet is set (it is the first bit sent on
the wire) and that means it's a multicast address. I think I have my
bits right. Here is an excerpt from
http://www.iana.org/assignments/ethernet-numbers.

> These addresses are physical station addresses, not multicast nor
> broadcast, so the second hex digit (reading from the left) will be
> even, not odd.

There are also other sources describing how the bits are arranged and
how we display MAC Addresses. Google is our friend.

Anyway, one of my fudged MAC Addresses had an odd number in that second
hex digit - and that's why the router did not route. The solution -
just make sure my fudged MAC Addresses are real unicast MAC Addresses
and not multicast addresses.
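
A minimal sketch of that check, in plain POSIX shell. The helper name
is_multicast_mac is hypothetical and is not part of the scripts in this
thread; it only tests the low-order bit of the first octet, which is the
bit Chuck pointed at.

```shell
#!/bin/sh
# is_multicast_mac MAC
# Hypothetical helper for illustration only.
# Succeeds (exit status 0) when the low-order bit of the first octet is
# set, meaning the address is a multicast (group) address.
is_multicast_mac() (
    first_octet=${1%%:*}                   # text before the first colon, e.g. "17"
    [ $(( 0x$first_octet & 1 )) -eq 1 ]    # test bit zero of that octet
)

# Check the addresses that appear in this thread.
for mac in 01:02:03:04:05:06 17:20:16:01:60:03 12:34:56:00:30:50; do
    if is_multicast_mac "$mac"; then
        echo "$mac multicast"
    else
        echo "$mac unicast"
    fi
done
```

The first two addresses report multicast and the third reports unicast.
As a side note, 0x12 also has the next bit up (0x02) set, which marks an
address as locally administered rather than vendor assigned.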

Here is my modified ip-fudge-mac.sh script - note that I also turned
rp_filter back on:


[root@test-fw2 gregs]# more ip-fudge-mac.sh
/sbin/ip link set eth0 down
/sbin/ip link set eth0 address 12:34:56:00:30:50
/sbin/ip link set eth0 up

/sbin/ip link set eth1 down
/sbin/ip link set eth1 address 12:34:56:01:60:03
/sbin/ip link set eth1 up

echo "1" > /proc/sys/net/ipv4/ip_forward
echo "1" > /proc/sys/net/ipv4/conf/eth0/rp_filter
echo "1" > /proc/sys/net/ipv4/conf/eth1/rp_filter


##6: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
## link/ether 00:10:4b:71:20:60 brd ff:ff:ff:ff:ff:ff
##7: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
## link/ether 00:60:97:b6:f9:4a brd ff:ff:ff:ff:ff:ff
[root@test-fw2 gregs]#


I also learned the IEEE has an easy way for anyone to register their own
OUI. You fill out a web form and pay $1650 and 7 days later, you're the
proud owner of your own OUI block - with 24 bits to use as you see fit.
If $1650 is too steep, you can pay $550 and buy 12 bits of MAC
Addresses.

For now, I decided to use a fudged OUI of 12-34-56 and then use the
rightmost 2 octets of the IP Address with leading zeros to fill out the
rest of the MAC Address. I will buy some official numbers from the IEEE
later.
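
The mapping described above can be sketched as a small shell function.
This is an illustration under assumptions, not the script actually in
use: the name fudge_mac is hypothetical, the 12:34:56 prefix is the
placeholder OUI mentioned above, and the input is assumed to be a plain
dotted quad without leading zeros.

```shell
#!/bin/sh
# fudge_mac IPV4_ADDRESS
# Hypothetical helper for illustration only.
# Builds a MAC address from the placeholder OUI 12:34:56 followed by the
# last two octets of the IP address, each padded to three decimal digits
# and then regrouped into three pairs.
fudge_mac() (
    oui=12:34:56
    IFS=.
    set -- $1                                # split the dotted quad into $1..$4
    digits=$(printf '%03d%03d' "$3" "$4")    # e.g. 3 and 50 give "003050"
    tail=$(printf '%s' "$digits" | sed 's/\(..\)\(..\)\(..\)/\1:\2:\3/')
    printf '%s:%s\n' "$oui" "$tail"
)

fudge_mac 1.2.3.50       # prints 12:34:56:00:30:50
fudge_mac 172.16.16.3    # prints 12:34:56:01:60:03
```

Because the first octet is fixed at 0x12, which is even, every address
produced this way is unicast.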

It is proper to give back when given a gift from the community. So here
is my failover-monitor.sh script in its state right now. I will
probably do a few more tweaks before going into production. The .conf
file referenced defines a bunch of IP Addresses and interface names
specific to this site.

This little script starts up as a daemon at boot time and sends its
output to a log file. It polls the heartbeat NIC every 10 seconds. If
the other end does not respond to a ping, it checks all the other NIC
interfaces. If no response from the other NICs either, it checks the
gateway - the router to the Internet. If the gateway DOES respond, then
it assumes the primary role. After assuming the primary role, it polls
the gateway every 10 seconds. If the gateway goes offline, it takes
itself offline and assumes a backup role - polling every 10 seconds to
determine if it should take control again. This hopefully minimizes the
probability that both members of the failover pair will try to take
control, or that both will assume a backup role with nobody taking
control. But I may have to tweak the algorithm a bit more after more
testing.

- Greg Scott



#!/bin/bash
# failover-monitor.sh
# First find out if this node or its partner should be primary by
# checking the flag file. If the file exists then this node thinks
# it is supposed to be primary, so take control if its partner is
# unreachable on all interfaces.
# If the flag file does not exist then assume a backup role.
# Poll its partner. If its partner is offline then take control.
# If its partner is online then sleep for a few seconds and repeat.
#
# Greg Scott, March 8, 2006

. /firewall-scripts/rcfirewall.conf

#
# Figure out who we are
#

if [ "$(hostname)" = "$FW1_HOST" ]
then
ME_HOST=$FW1_HOST
ME_HBEAT=$FW1_HBEAT
ME_INET=$FW1_INET
ME_INETMAC=$FW1_INETMAC
ME_TRUSTED=$FW1_TRUSTED
ME_TRUSTEDMAC=$FW1_TRUSTEDMAC
YOU_HOST=$FW2_HOST
YOU_HBEAT=$FW2_HBEAT
YOU_INET=$FW2_INET
YOU_TRUSTED=$FW2_TRUSTED
else
ME_HOST=$FW2_HOST
ME_HBEAT=$FW2_HBEAT
ME_INET=$FW2_INET
ME_INETMAC=$FW2_INETMAC
ME_TRUSTED=$FW2_TRUSTED
ME_TRUSTEDMAC=$FW2_TRUSTEDMAC
YOU_HOST=$FW1_HOST
YOU_HBEAT=$FW1_HBEAT
YOU_INET=$FW1_INET
YOU_TRUSTED=$FW1_TRUSTED
fi

function take_control {
# This function is called when the failover partner does not reply
# on the YOU_HBEAT IP Address.
#
# Take over the firewall IP address and special MAC address iff:
# This node, "ME", can see the Internet gateway and YOU_TRUSTED
# and INET_IP do not answer. Remember that INET_IP is the
# IP Address of the primary firewall. That is why we test
# for INET_IP and not YOU_INET.

echo "Investigating taking control"

#
# Ping our partner's other interfaces and the gateway and check
# the status codes. Status of 0 is success. 1 is no reply, 2 is
# any other error. See ping man pages.
#

echo "Checking to see if $YOU_HOST answers on its other interfaces"

ST=0

ping -c 1 -q -w 3 $INET_IP &> /dev/null
ST=$?
# Ping INET_IP instead of YOU_INET because INET_IP is the
# primary IP Address.
if [ $ST = 0 ]
then
echo "$YOU_HOST is alive on $INET_IP. Not assuming primary role."
ST=$YOU_PARTONLINE
else
echo "$YOU_HOST does not answer on $INET_IP"
ping -c 1 -q -w 3 $YOU_TRUSTED &> /dev/null
ST=$?
if [ $ST = 0 ]
then
echo "$YOU_HOST is alive on $YOU_TRUSTED. Not assuming primary role."
ST=$YOU_PARTONLINE
else
echo "$YOU_HOST does not answer on $YOU_TRUSTED"
ping -c 1 -q -w 3 $GATEWAY_IP &> /dev/null
ST=$?
if [ $ST != 0 ]
then
echo "Gateway at $GATEWAY_IP does not answer. Not assuming primary role."
else
echo "I see gateway $GATEWAY_IP."
echo "$(date) $ME_HOST Assuming primary firewall role"
assume_primary
echo "$(date) $ME_HOST relinquished primary firewall role."
fi
fi
fi

return $ST
}

function assume_primary {
# Create FLAGFILE noting that this node is primary.
# Set up the IP Addresses on the INET and TRUSTED1 interfaces.
# run rc.firewall.
# Poll GATEWAY_IP periodically.
# If it does not answer
# then reset all interfaces and firewall rules back to their
# initial state and return.

echo "$(date) $ME_HOST assuming primary firewall role." >> $FLAGFILE

/sbin/ifdown $INET_IFACE
/sbin/ifconfig $INET_IFACE hw ether $INET_MAC
/sbin/ifconfig $INET_IFACE $INET_IP netmask $INET_NETMASK \
    broadcast $INET_BCAST_ADDRESS
/sbin/ifup $INET_IFACE

/sbin/ifdown $TRUSTED1_IFACE
/sbin/ifconfig $TRUSTED1_IFACE hw ether $TRUSTED1_MAC
/sbin/ifconfig $TRUSTED1_IFACE $TRUSTED1_IP netmask $TRUSTED1_NETMASK \
    broadcast $TRUSTED1_BCAST_ADDRESS
/sbin/ifup $TRUSTED1_IFACE

echo "Running rc.firewall"

/firewall-scripts/rc.firewall

#
# So now this node is primary and handling all firewall duties. Poll the
# gateway every 10 seconds and resume a backup role if this node and the
# gateway lose touch with each other. This is a safety mechanism to reduce
# the odds that both nodes will try to become primary at the same time.
#

while true ; do

# echo "$(date) sleeping 10 seconds"
sleep 10

# Ping the gateway and check the status code
ping -c 1 -q -w 3 $GATEWAY_IP &> /dev/null

if [ $? != 0 ]
then
# We lost contact with the gateway so reset everything
echo ""
echo "$(date) The gateway at $GATEWAY_IP appears to be offline."
# DO NOT remove_flagfile
# because if the gateway comes back somebody has to take control.
reset_interfaces
break
fi

done

return 0
}


function reset_interfaces {

echo "Resetting $INET_IFACE to $ME_INET with MAC $ME_INETMAC"
/sbin/ifdown $INET_IFACE
/sbin/ifconfig $INET_IFACE hw ether $ME_INETMAC
/sbin/ifconfig $INET_IFACE $ME_INET netmask $INET_NETMASK \
    broadcast $INET_BCAST_ADDRESS
/sbin/ifup $INET_IFACE

echo "Resetting $TRUSTED1_IFACE to $ME_TRUSTED with MAC $ME_TRUSTEDMAC"
/sbin/ifdown $TRUSTED1_IFACE
/sbin/ifconfig $TRUSTED1_IFACE hw ether $ME_TRUSTEDMAC
/sbin/ifconfig $TRUSTED1_IFACE $ME_TRUSTED netmask $TRUSTED1_NETMASK \
    broadcast $TRUSTED1_BCAST_ADDRESS
/sbin/ifup $TRUSTED1_IFACE

echo "Resetting to initial firewall rules."
/firewall-scripts/initial_rc.firewall

return 0
}


function remove_flagfile {
echo "$(date) Removing ${FLAGFILE}"
rm -f $FLAGFILE
return 0
}


echo "$(date) starting up failover-monitor.sh on $ME_HOST"

echo "Me"
echo "ME_HOST is $ME_HOST"
echo "ME_HBEAT is $ME_HBEAT"
echo "ME_INET is $ME_INET"
echo "ME_TRUSTED is $ME_TRUSTED"

echo
echo "You"
echo "YOU_HOST is $YOU_HOST"
echo "YOU_HBEAT is $YOU_HBEAT"
echo "YOU_INET is $YOU_INET"
echo "YOU_TRUSTED is $YOU_TRUSTED"

echo

reset_interfaces

echo "Initialization complete. Starting loop"

#
# Initialization is now complete
#

HBEAT_FLG=0

while true ; do

# echo "$(date) sleeping 10 seconds"
sleep 10

if [ -f $FLAGFILE ]
then
echo "$FLAGFILE found; attempting to seize control regardless of heartbeat"
take_control
if [ $? != 0 ]
then
echo "Unable to take control; removing $FLAGFILE"
remove_flagfile
fi
fi

#
# Check for heartbeat
#
ping -c 1 -q -w 3 $YOU_HBEAT &> /dev/null
if [ $? != 0 ]
then
HBEAT_FLG=1
echo "$(date) No heartbeat detected from $YOU_HOST"
take_control
continue
else
if [ $HBEAT_FLG != 0 ]
then
HBEAT_FLG=0
echo "$(date) Heartbeat with $YOU_HOST restored"
fi
fi

done

exit 0



-----Original Message-----
From: Chuck Ebbert [mailto:[email protected]]
Sent: Monday, March 13, 2006 12:11 AM
To: Greg Scott
Cc: linux-kernel; David S. Miller
Subject: Re: Router stops routing after changing MAC Address

In-Reply-To:
<925A849792280C4E80C5461017A4B8A20321CC@mail733.InfraSupportEtc.com>

On Fri, 10 Mar 2006 18:33:15 -0600, Greg Scott wrote:

> How to change MAC addresses is documented well enough - and it works -
> but when I change MAC addresses, my router stops routing. From the
> router, I can see the systems on both sides - but the router just
> refuses to forward packets. Here are my little test scripts to change
> MAC Addresses.
>
> First - ip-fudge-mac.sh
> [root@test-fw2 gregs]# more ip-fudge-mac.sh
> ip link set eth0 down
> ip link set eth0 address 01:02:03:04:05:06
                            ^
Bit zero is set, so this is a multicast address. Is that intentional?

> ip link set eth0 up
>
> ip link set eth1 down
> ip link set eth1 address 17:20:16:01:60:03
                            ^
Ditto.

> ip link set eth1 up
>
> echo "1" > /proc/sys/net/ipv4/ip_forward


--
Chuck
"Penguins don't come from next door, they come from the Antarctic!"

2006-03-13 18:08:00

by Stephen Hemminger

Subject: Re: Router stops routing after changing MAC Address

There still is a bug in the 3c59x driver. It doesn't include any code
to handle changing the mac address. It will work if you take the device
down, change address, then bring it up. But you shouldn't have to do that.

Also, if the driver handles setting mac address, it could have prevented
you from using a multicast address.

Something like this is needed (untested, I don't have that hardware).


--- linux-2.6/drivers/net/3c59x.c.orig 2006-03-13 09:58:25.000000000 -0800
+++ linux-2.6/drivers/net/3c59x.c 2006-03-13 09:52:47.000000000 -0800
@@ -895,6 +895,7 @@ static void dump_tx_ring(struct net_devi
static void update_stats(void __iomem *ioaddr, struct net_device *dev);
static struct net_device_stats *vortex_get_stats(struct net_device *dev);
static void set_rx_mode(struct net_device *dev);
+static int set_rx_address(struct net_device *dev, void *addr);
#ifdef CONFIG_PCI
static int vortex_ioctl(struct net_device *dev, struct ifreq *rq, int cmd);
#endif
@@ -1563,6 +1564,7 @@ static int __devinit vortex_probe1(struc
#endif
dev->ethtool_ops = &vortex_ethtool_ops;
dev->set_multicast_list = set_rx_mode;
+ dev->set_mac_address = set_rx_address;
dev->tx_timeout = vortex_tx_timeout;
dev->watchdog_timeo = (watchdog * HZ) / 1000;
#ifdef CONFIG_NET_POLL_CONTROLLER
@@ -3150,6 +3152,28 @@ static void set_rx_mode(struct net_devic
iowrite16(new_mode, ioaddr + EL3_CMD);
}

+
+static int set_rx_address(struct net_device *dev, void *p)
+{
+ struct vortex_private *vp = netdev_priv(dev);
+ void __iomem *ioaddr = vp->ioaddr;
+ const struct sockaddr *addr = p;
+ int i;
+
+ if (!is_valid_ether_addr(addr->sa_data))
+ return -EADDRNOTAVAIL;
+
+ spin_lock_bh(&vp->lock);
+ memcpy(dev->dev_addr, addr->sa_data, ETH_ALEN);
+
+ EL3WINDOW(2);
+ for (i = 0; i < ETH_ALEN; i++)
+ iowrite8(dev->dev_addr[i], ioaddr + i);
+ spin_unlock_bh(&vp->lock);
+
+ return 0;
+}
+
#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
/* Setup the card so that it can receive frames with an 802.1q VLAN tag.
Note that this must be done after each RxReset due to some backwards

2006-03-13 20:27:51

by linux-os (Dick Johnson)

Subject: Re: Router stops routing after changing MAC Address


On Mon, 13 Mar 2006, Stephen Hemminger wrote:

> There still is a bug in the 3c59x driver. It doesn't include any code
> to handle changing the mac address. It will work if you take the device
> down, change address, then bring it up. But you shouldn't have to do that.
>
> Also, if the driver handles setting mac address, it could have prevented
> you from using a multicast address.
>
> Something like this is needed (untested, I don't have that hardware).
>
>
> --- linux-2.6/drivers/net/3c59x.c.orig 2006-03-13 09:58:25.000000000 -0800
> +++ linux-2.6/drivers/net/3c59x.c 2006-03-13 09:52:47.000000000 -0800
> @@ -895,6 +895,7 @@ static void dump_tx_ring(struct net_devi
> static void update_stats(void __iomem *ioaddr, struct net_device *dev);
> static struct net_device_stats *vortex_get_stats(struct net_device *dev);
> static void set_rx_mode(struct net_device *dev);
> +static int set_rx_address(struct net_device *dev, void *addr);
> #ifdef CONFIG_PCI
> static int vortex_ioctl(struct net_device *dev, struct ifreq *rq, int cmd);
> #endif
> @@ -1563,6 +1564,7 @@ static int __devinit vortex_probe1(struc
> #endif
> dev->ethtool_ops = &vortex_ethtool_ops;
> dev->set_multicast_list = set_rx_mode;
> + dev->set_mac_address = set_rx_address;
> dev->tx_timeout = vortex_tx_timeout;
> dev->watchdog_timeo = (watchdog * HZ) / 1000;
> #ifdef CONFIG_NET_POLL_CONTROLLER
> @@ -3150,6 +3152,27 @@ static void set_rx_mode(struct net_devic
> iowrite16(new_mode, ioaddr + EL3_CMD);
> }
>
> +
> +static int set_rx_address(struct net_device *dev, void *p)
> +{
> + struct vortex_private *vp = netdev_priv(dev);
> + void __iomem *ioaddr = vp->ioaddr;
> + const struct sockaddr *addr = p;
> +
> + if (!is_valid_ether_addr(addr->sa_data))
> + return -EADDRNOTAVAIL;
> +
> + spin_lock_bh(&vp->lock);
> + memcpy(dev->dev_addr, addr->sa_data, ETH_ALEN);
> +
> + EL3WINDOW(2);
> + for (i = 0; i < ETH_ALEN; i++)
> + iowrite8(dev->dev_addr[i], ioaddr + i);
> + spin_unlock_bh(&vp->lock);
> +
> + return 0;
> +}
> +
> #if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
> /* Setup the card so that it can receive frames with an 802.1q VLAN tag.
> Note that this must be done after each RxReset due to some backwards
> -

Actually, it doesn't make any difference. Changing the IEEE station
(physical) address is not an allowed procedure even though hooks are
available in many drivers to do this. According to the IEEE 802
physical media specification, this 48-bit address must be unique and
must be one of a group assigned by IEEE. Failure to follow this
simple protocol can (will) cause an entire network to fail. If
you don't care, then you certainly don't care about multicast
bits either, basically let them set it to all ones as well.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.15.4 on an i686 machine (5589.54 BogoMips).
Warning : 98.36% of all statistics are fiction, book release in April.
_


****************************************************************
The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to [email protected] - and destroy all copies of this information, including any attachments, without reading or disclosing them.

Thank you.

2006-03-13 20:55:57

by Greg Scott

Subject: RE: Router stops routing after changing MAC Address

But in a failover scenario you want two devices to have the same IEEE
(station) Address (or MAC Address or hardware address). So many names
for the same thing!

When the primary unit fails, you want the backup unit to completely
assume the failed unit's identity - right down to the MAC Address. The
other way to do it using gratuitous ARPs is not good enough because some
cheap router someplace with an ARP cache of several hours will not
listen and will never update its own ARP cache.
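
For comparison, the gratuitous ARP approach usually amounts to announcing the taken-over IP Address from the new hardware. The sketch below assumes the iputils version of arping, where -U selects unsolicited ARP mode, -c sets the packet count and -I names the interface. The function name announce_ip, the RUN switch and the 192.0.2.1 address are invented for this example. It only prints the command unless RUN=1 is set, and devices that ignore unsolicited ARPs will not be helped by it.

```shell
# Hypothetical helper: print (or, with RUN=1, execute) an unsolicited
# ARP announcement for an IP address on the given interface.
announce_ip() {
    iface=$1
    ip=$2
    cmd="arping -U -c 3 -I $iface $ip"
    if [ "${RUN:-0}" = "1" ]; then
        $cmd          # needs root and the iputils arping binary
    else
        echo "$cmd"   # dry run: only show what would be sent
    fi
}

announce_ip eth0 192.0.2.1
```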

I like to think of this as bending the rules a little bit, not really
breaking them. :)

- Greg



>Actually, it doesn't make any difference. Changing the IEEE station
>(physical) address is not an allowed procedure even though hooks are
>available in many drivers to do this. According to the IEEE 802
>physical media specification, this 48-bit address must be unique
>and must be one of a group assigned by IEEE. Failure to follow this
>simple protocol can (will) cause an entire network to fail. If you
>don't care, then you certainly don't care about multicast bits either,
>basically let them set it to all ones as well.

>Cheers,
>Dick Johnson
>Penguin : Linux version 2.6.15.4 on an i686 machine (5589.54 BogoMips).
>Warning : 98.36% of all statistics are fiction, book release in April.

2006-03-13 21:39:29

by linux-os (Dick Johnson)

Subject: RE: Router stops routing after changing MAC Address


On Mon, 13 Mar 2006, Greg Scott wrote:

> But in a failover scenario you want two devices to have the same IEEE
> (station) Address (or MAC Address or hardware address). So many names
> for the same thing!
>
> When the primary unit fails, you want the backup unit to completely
> assume the failed unit's identity - right down to the MAC Address. The
> other way to do it using gratuitous ARPs is not good enough because some
> cheap router someplace with an ARP cache of several hours will not
> listen and will never update its own ARP cache.
>
> I like to think of this as bending the rules a little bit, not really
> breaking them. :)
>
> - Greg
>

Top posting, NotGood(tm). Anyway, if the device fails, you have
routers and hosts ARPing the interface, trying to establish a
route anyway.

>
>
>> Actually, it doesn't make any difference. Changing the IEEE station
>> (physical) address is not an allowed procedure even though hooks are
>> available in many drivers to do this. According to the IEEE 802
>> physical media specification, this 48-bit address must be unique
>> and must be one of a group assigned by IEEE. Failure to follow this
>> simple protocol can (will) cause an entire network to fail. If you
>> don't care, then you certainly don't care about multicast bits either,
>> basically let them set it to all ones as well.
>
>> Cheers,
>> Dick Johnson
>> Penguin : Linux version 2.6.15.4 on an i686 machine (5589.54 BogoMips).
>> Warning : 98.36% of all statistics are fiction, book release in April.
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.15.4 on an i686 machine (5589.54 BogoMips).
Warning : 98.36% of all statistics are fiction, book release in April.
_



2006-03-13 21:50:13

by Rick Jones

Subject: Re: Router stops routing after changing MAC Address

> Anyway, if the device fails, you have
> routers and hosts ARPing the interface, trying to establish a
> route anyway.

But only after what may be a much longer time than the customer is
willing to accept or able to configure. I know of a number of HA
situations where the "new" device is given the "old" MAC just to avoid
that specific situation of ARP caches not being updated except after
quite some time. Not necessarily on the end-systems, the issue can be
with intermediate devices (routers).

And if one has to work with static ARP entries to deal (however
imperfectly) with ARP poisoning or whatnot...

Indeed, there is a large onus on the software doing the MAC override to
make sure it does not break the required uniqueness. Just as if one
were using locally administered MAC addresses.

rick jones

2006-03-13 22:08:50

by Randy Dunlap

Subject: Re: Router stops routing after changing MAC Address

On Mon, 13 Mar 2006 15:27:26 -0500 linux-os (Dick Johnson) wrote:

>
> On Mon, 13 Mar 2006, Stephen Hemminger wrote:
>
> > There still is a bug in the 3c59x driver. It doesn't include any code
> > to handle changing the mac address. It will work if you take the device
> > down, change address, then bring it up. But you shouldn't have to do that.
> >
> > Also, if the driver handles setting mac address, it could have prevented
> > you from using a multicast address.
> >
> > Something like this is needed (untested, I don't have that hardware).
> >
[cut patch]

> Actually, it doesn't make any difference. Changing the IEEE station
> (physical) address is not an allowed procedure even though hooks are
> available in many drivers to do this. According to the IEEE 802
> physical media specification, this 48-bit address must be unique and
> must be one of a group assigned by IEEE. Failure to follow this
> simple protocol can (will) cause an entire network to fail. If
> you don't care, then you certainly don't care about multicast
> bits either, basically let them set it to all ones as well.

They used to allow "Locally Administered Addresses." Hrm,
google still finds 18,000 hits for that phrase. Is that now
outlawed?

Even ieee.org has hit(s) for it:
http://standards.ieee.org/regauth/groupmac/tutorial.html

http://en.wikipedia.org/wiki/MAC_address
http://www.mynetwatchman.com/pckidiot/chap04.htm
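
For reference, a locally administered address is marked by the second-least-significant bit (value 0x02) of the first octet, next to the multicast bit (0x01). A small sketch of a check follows; the function name is_locally_administered is made up for this example.

```shell
# Hypothetical helper: succeed (return 0) if the locally administered
# bit (0x02 in the first octet) is set, fail (return 1) otherwise.
is_locally_administered() {
    first_octet=${1%%:*}                  # text before the first colon
    [ $(( 0x$first_octet & 2 )) -ne 0 ]
}

for mac in 12:34:56:00:30:50 00:10:4b:71:20:60; do
    if is_locally_administered "$mac"; then
        echo "$mac locally administered"
    else
        echo "$mac universally administered"
    fi
done
```

By this test the 12:34:56 prefix used earlier in the thread already counts as locally administered, since 0x12 has the 0x02 bit set and the 0x01 bit clear.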

---
~Randy
You can't do anything without having to do something else first.
-- Belefant's Law

2006-03-13 22:13:57

by Greg Scott

Subject: RE: Router stops routing after changing MAC Address

Yup.

I had a situation 2 weeks ago where a customer connected a system to the
Internet with an IP Address he should not have used. And the little
Cisco router on the frontend dutifully recorded it in its ARP cache -
forever, with no TTL! This took down their webmail for most of a day
until we finally had to cycle the power on that nasty little Cisco 678.

Bigger routers do it too. I've had several situations over the years
where I replaced an older firewall with a newer one with the same IP
Addresses. All the internal servers find it soon enough. But I've
waited literally hours for the routers to finally purge their ARP caches
so they would see my replacement systems - often with the customer
looking over my shoulders getting more and more nervous by the minute.

And sometimes the routers are not accessible - you can't cycle them even
if you had permission. Consider the cases of bridged DSL service -
where the real router could be on the other side of the country. Try
calling an ISP and asking the tech on the other end to purge an ARP
cache on a router. So with the same IP Addresses but different MAC
Addresses, all you can do is wait for the passage of (lots of) time.
That happened to me in my own network once. I accidentally took down my
email server for something like 4 hours one time when I got careless.

> Indeed, there is a large onus on the software doing the MAC
> override to make sure it does not break the required uniqueness.
> Just as if one were using locally administered MAC addresses.

Yes. My 12:34:56 OUI scheme will work for this project but it is
definitely not good for the long term. I really really hope I have to
spend some money with the IEEE soon to support lots and lots of
rollouts. :)

- Greg Scott



-----Original Message-----
From: Rick Jones [mailto:[email protected]]
Sent: Monday, March 13, 2006 3:50 PM
To: linux-os (Dick Johnson)
Cc: Greg Scott; Chuck Ebbert; linux-kernel; [email protected]; Bart
Samwel; Alan Cox; Simon Mackinlay
Subject: Re: Router stops routing after changing MAC Address

> Anyway, if the device fails, you have
> routers and hosts ARPing the interface, trying to establish a route
> anyway.

But only after what may be a much longer time than the customer is
willing to accept or able to configure. I know of a number of HA
situations where the "new" device is given the "old" MAC just to avoid
that speicific situation of ARP caches not being updated except after
quite some time. Not necessarily on the end-systems, the issue can be
with intermediate devices (routers).

And if one has to work with static ARP entries to deal (however
imperfectly) with ARP poisioning or whatnot...

Indeed, there is a large onus on the software doing the MAC override to
make sure it does not break the required uniqueness. Just as if one
were using locally administered MAC addresses.

rick jones

2006-03-13 22:36:17

by linux-os (Dick Johnson)

Subject: RE: Router stops routing after changing MAC Address


On Mon, 13 Mar 2006, Greg Scott wrote:

> Yup.
>
> I had a situation 2 weeks ago where a customer connected a system to the
> Internet with an IP Address he should not have used. And the little
> Cisco router on the frontend dutifully recorded it in its ARP cache -
> forever, with no TTL! This took down their webmail for most of a day
> until we finally had to cycle the power on that nasty little Cisco 678.
>
> Bigger routers do it too. I've had several situations over the years
> where I replaced an older firewall with a newer one with the same IP
> Addresses. All the internal servers find it soon enough. But I've
> waited literally hours for the routers to finally purge their ARP caches
> so they would see my replacement systems - often with the customer
> looking over my shoulders getting more and more nervous by the minute.
>
> And sometimes the routers are not accessible - you can't cycle them even
> if you had permission. Consider the cases of bridged DSL service -

Bzzzzst... Not! There are not any MAC addresses associated with any
of the intercity links, usually not even in WANs! MAC is for
Ethernet! Once you go to fiber, ATM, T-N, etc., there are no
MAC addresses. That's why there are bridges and routers, you
got to "connect" your tiny time-slot to your LAN and that
first device contains the MAC address that all your other stuff
talks to.

> where the real router could be on the other side of the country. Try
> calling an ISP and asking the tech on the other end to purge an ARP
> cache on a router. So the same IP Addresses but different MAC
> addresses, all you can do is wait for the passage of (lots of) time.
> That happened to me in my own network once. I accidently took down my
> email server for something like 4 hours one time when I got careless.
>
>> Indeed, there is a large onus on the software doing the MAC
>> override to make sure it does not break the required uniqueness.
>> Just as if one were using locally administered MAC addresses.
>
> Yes. My 12:34:56 OUI scheme will work for this project but it is
> definitely not good for the long term. I really really hope I have to
> spend some money with the IEEE soon to support lots and lots of
> rollouts. :)
>
> - Greg Scott
>
>
>
> -----Original Message-----
> From: Rick Jones [mailto:[email protected]]
> Sent: Monday, March 13, 2006 3:50 PM
> To: linux-os (Dick Johnson)
> Cc: Greg Scott; Chuck Ebbert; linux-kernel; [email protected]; Bart
> Samwel; Alan Cox; Simon Mackinlay
> Subject: Re: Router stops routing after changing MAC Address
>
> > Anyway, if the device fails, you have
>> routers and hosts ARPing the interface, trying to establish a route
>> anyway.
>
> But only after what may be a much longer time than the customer is
> willing to accept or able to configure. I know of a number of HA
> situations where the "new" device is given the "old" MAC just to avoid
> that speicific situation of ARP caches not being updated except after
> quite some time. Not necessarily on the end-systems, the issue can be
> with intermediate devices (routers).
>
> And if one has to work with static ARP entries to deal (however
> imperfectly) with ARP poisioning or whatnot...
>
> Indeed, there is a large onus on the software doing the MAC override to
> make sure it does not break the required uniqueness. Just as if one
> were using locally administered MAC addresses.
>
> rick jones
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.15.4 on an i686 machine (5589.54 BogoMips).
Warning : 98.36% of all statistics are fiction, book release in April.
_



2006-03-14 00:25:19

by Robert Hancock

Subject: Re: Router stops routing after changing MAC Address

linux-os (Dick Johnson) wrote:
>> And sometimes the routers are not accessible - you can't cycle them even
>> if you had permission. Consider the cases of bridged DSL service -
>
> Bzzzzst... Not! There are not any MAC addresses associated with any
> of the intercity links, usually not even in WANs! MAC is for
> Ethernet!

Bzzzt.. Plenty of DSL and cable services have a MAC address for both
ends of the connection that can be seen from the other end. Essentially
the link appears like a restricted Ethernet link.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/


2006-03-14 11:41:28

by Bart Samwel

Subject: Re: Router stops routing after changing MAC Address

linux-os (Dick Johnson) wrote:
> On Mon, 13 Mar 2006, Greg Scott wrote:
> Bzzzzst... Not! There are not any MAC addresses associated with any
> of the intercity links, usually not even in WANs! MAC is for
> Ethernet! Once you go to fiber, ATM, T-N, etc., there are no MAC addresses.

Bzzzzt. According to WikiPedia:

http://en.wikipedia.org/wiki/MAC_address

MAC addresses are used for:

- Token ring
- 802.11 wireless networks
- Bluetooth
- FDDI
- ATM (switched virtual connections only, as part of an NSAP address)
- SCSI and Fibre Channel (as part of a World Wide Name)

FDDI = fiber, ATM = ATM.

--Bart

2006-03-14 12:12:28

by Simon Mackinlay

Subject: Re: Router stops routing after changing MAC Address

> Bzzzzt. According to WikiPedia:
>
> http://en.wikipedia.org/wiki/MAC_address
>
> MAC addresses are used for:
>
> - Token ring
> - 802.11 wireless networks
> - Bluetooth
> - FDDI
> - ATM (switched virtual connections only, as part of an NSAP address)
> - SCSI and Fibre Channel (as part of a World Wide Name)
>
> FDDI = fiber, ATM = ATM.

http://developer.intel.com/design/network/products/optical/framers/ixf18104.htm

It works too.

Cheers,

Simon


2006-03-14 12:53:12

by linux-os (Dick Johnson)

Subject: Re: Router stops routing after changing MAC Address


On Tue, 14 Mar 2006, Bart Samwel wrote:

> linux-os (Dick Johnson) wrote:
>> On Mon, 13 Mar 2006, Greg Scott wrote:
>> Bzzzzst... Not! There are not any MAC addresses associated with any
>> of the intercity links, usually not even in WANs! MAC is for
>> Ethernet! Once you go to fiber, ATM, T-N, etc., there are no MAC addresses.
>
> Bzzzzt. According to WikiPedia:
>
> http://en.wikipedia.org/wiki/MAC_address
>
> MAC addresses are used for:
>
> - Token ring
> - 802.11 wireless networks
> - Bluetooth
> - FDDI
> - ATM (switched virtual connections only, as part of an NSAP address)
> - SCSI and Fibre Channel (as part of a World Wide Name)
>
> FDDI = fiber, ATM = ATM.
>
> --Bart
>

A name is NOT. I can call my mail route number RFD#2 a MAC
address. Also token-ring is a form of Ethernet as are all
known wireless networks unless they use light. Even cable
modems use Ethernet, with FDM on the cable side and baseband
on the customer side. Calling SCSI MAC is absurd. All of the
above, except the ethernets are forms of point-to-point
communications links. IP (over/under or through) these
links uses a source and destination IP and any hardware
addressing scheme is incidental.


Cheers,
Dick Johnson
Penguin : Linux version 2.6.15.4 on an i686 machine (5589.54 BogoMips).
Warning : 98.36% of all statistics are fiction, book release in April.
_



2006-03-14 14:18:35

by Bjørn Mork

Subject: Re: Router stops routing after changing MAC Address

"linux-os (Dick Johnson)" <[email protected]> writes:

> Actually, it doesn't make any difference. Changing the IEEE station
> (physical) address is not an allowed procedure even though hooks are
> available in many drivers to do this.

Of course it is. It's even required to support some obsolete
networking protocols. You could start with
Documentation/networking/decnet.txt if you don't want to STFW


Bjørn
--
I mean, you're always totally wrong.

2006-03-14 15:29:18

by Greg Scott

[permalink] [raw]
Subject: RE: Router stops routing after changing MAC Address

Yet I have real-world examples I've seen with my own eyes where MAC
Address problems have messed up bridged networks. I posted some of
those here yesterday. Good old Ethernet MAC Addresses can and do play a
real role in these wide area networks.

Don't believe me? Try it yourself. Find a LAN connected to the
Internet via bridged DSL or cablemodem with a real firewall in place.
Swap the firewall and wait...and wait...and wait some more for ARP
caches to clear on the other end.

When nothing changes but the passage of time and traffic starts to flow
again - and the Internet service is bridged not routed - give me another
explanation besides ARP caches.

- Greg



-----Original Message-----
From: linux-os (Dick Johnson) [mailto:[email protected]]
Sent: Tuesday, March 14, 2006 6:53 AM
To: Bart Samwel
Cc: Greg Scott; Rick Jones; Chuck Ebbert; linux-kernel;
[email protected]; Alan Cox; Simon Mackinlay
Subject: Re: Router stops routing after changing MAC Address


On Tue, 14 Mar 2006, Bart Samwel wrote:

> linux-os (Dick Johnson) wrote:
>> On Mon, 13 Mar 2006, Greg Scott wrote:
>> Bzzzzst... Not! There are not any MAC addresses associated with any
>> of the intercity links, usually not even in WANs! MAC is for
>> Ethernet! Once you go to fiber, ATM, T-N, etc., there are no MAC
>> addresses.
>
> Bzzzzt. According to WikiPedia:
>
> http://en.wikipedia.org/wiki/MAC_address
>
> MAC addresses are used for:
>
> - Token ring
> - 802.11 wireless networks
> - Bluetooth
> - FDDI
> - ATM (switched virtual connections only, as part of an NSAP address)
> - SCSI and Fibre Channel (as part of a World Wide Name)
>
> FDDI = fiber, ATM = ATM.
>
> --Bart
>

A name is NOT. I can call my mail route number RFD#2 a MAC address.
Also token-ring is a form of Ethernet as are all known wireless networks
unless they use light. Even cable modems use Ethernet, with FDM on the
cable side and baseband on the customer side. Calling SCSI MAC is
absurd. All of the above, except the ethernets are forms of
point-to-point communications links. IP (over/under or through) these
links uses a source and destination IP and any hardware addressing
scheme is incidental.


Cheers,
Dick Johnson
Penguin : Linux version 2.6.15.4 on an i686 machine (5589.54 BogoMips).
Warning : 98.36% of all statistics are fiction, book release in April.
_



2006-03-14 23:58:30

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: Router stops routing after changing MAC Address

On Mon, 13 Mar 2006 17:35:50 EST, "linux-os (Dick Johnson)" said:

> Bzzzzst... Not! There are not any MAC addresses associated with any
> of the intercity links, usually not even in WANs! MAC is for
> Ethernet! Once you go to fiber, ATM, T-N, etc., there are no
> MAC addresses.

This will come as a big surprise to those places running Gig-E and 10G-E
links into a fiber for long-haul cross-country connectivity.....



2006-03-16 16:07:48

by Chris Wedgwood

[permalink] [raw]
Subject: Re: Router stops routing after changing MAC Address

On Mon, Mar 13, 2006 at 10:00:41AM -0800, Stephen Hemminger wrote:

> There still is a bug in the 3c59x driver. It doesn't include any
> code to handle changing the mac address. It will work if you take
> the device down, change address, then bring it up. But you shouldn't
> have to do that.

I sent a patch to do this probably a year or two back and it was
rejected (by akpm if I recall) because of the argument that you could
and should take it down, change the MAC and bring it back up.

Is this no longer a requirement?

2006-03-16 17:58:32

by Stephen Hemminger

[permalink] [raw]
Subject: Re: Router stops routing after changing MAC Address

On Thu, 16 Mar 2006 08:07:43 -0800
Chris Wedgwood <[email protected]> wrote:

> On Mon, Mar 13, 2006 at 10:00:41AM -0800, Stephen Hemminger wrote:
>
> > There still is a bug in the 3c59x driver. It doesn't include any
> > code to handle changing the mac address. It will work if you take
> > the device down, change address, then bring it up. But you shouldn't
> > have to do that.
>
> I sent a patch to do this probably a year or two back and it was
> rejected (by akpm if I recall) because of the argument that you could
> and should take it down, change the MAC and bring it back up.
>
> Is this no longer a requirement?

No. Most drivers allow changes on the fly.
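
For a driver such as 3c59x that, per the discussion above, does not pick up
an address change while the interface is running, the conservative sequence
is still down, change, up. A minimal sketch of that sequence, assuming
iproute2's ip command; the function name mac_change_cmds and the two sanity
checks are illustrative and not from this thread. It only prints the
commands so they can be reviewed before anything touches the interface.

```shell
#!/bin/sh
# Print the commands that take an interface down, set a new MAC address,
# and bring it back up. mac_change_cmds is a hypothetical helper name.
mac_change_cmds() {
    dev=$1
    mac=$2

    # Require six colon-separated hex octets.
    if ! printf '%s\n' "$mac" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
        echo "invalid MAC address: $mac" >&2
        return 1
    fi

    # The low bit of the first octet marks a group (multicast) address,
    # which should not be used as a station address.
    first=${mac%%:*}
    if [ $(( 0x$first & 1 )) -ne 0 ]; then
        echo "group (multicast) MAC address: $mac" >&2
        return 1
    fi

    echo "ip link set dev $dev down"
    echo "ip link set dev $dev address $mac"
    echo "ip link set dev $dev up"
}
```

To apply the change, review the output and then run it as root, for example:
mac_change_cmds eth0 00:c1:28:01:d8:07 | sh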

2006-03-16 18:31:25

by Greg Scott

[permalink] [raw]
Subject: RE: Router stops routing after changing MAC Address

I wonder if they would be more open to accepting that patch now?

- Greg Scott


-----Original Message-----
From: Stephen Hemminger [mailto:[email protected]]
Sent: Thursday, March 16, 2006 11:55 AM
To: Chris Wedgwood
Cc: Greg Scott; Chuck Ebbert; linux-kernel; David S. Miller;
[email protected]; Bart Samwel; Alan Cox; Simon Mackinlay
Subject: Re: Router stops routing after changing MAC Address

On Thu, 16 Mar 2006 08:07:43 -0800
Chris Wedgwood <[email protected]> wrote:

> On Mon, Mar 13, 2006 at 10:00:41AM -0800, Stephen Hemminger wrote:
>
> > There still is a bug in the 3c59x driver. It doesn't include any
> > code to handle changing the mac address. It will work if you take
> > the device down, change address, then bring it up. But you shouldn't
> > have to do that.
>
> I sent a patch to do this probably a year or two back and it was
> rejected (by akpm if I recall) because of the argument that you could
> and should take it down, change the MAC and bring it back up.
>
> Is this no longer a requirement?

No. Most drivers allow changes on the fly.