2001-10-28 17:23:27

by Laurent Deniel

Subject: Ethernet NIC dual homing


Hi,

Does someone know if there is some work in the area of NIC dual homing ?
By NIC dual homing, I mean two network devices (e.g. Ethernet) that are
connected to the same IP subnet but only one is active (at IP level) at a
time. When a faulty condition is detected (e.g. link down or lack of I/O),
the kernel switches to the second NIC. Such a similar feature exists in
Tru64 UNIX (NetRAIN), HP-UX (APA) and Solaris (Sun Cluster pnmd).
What is the best way to handle that in Linux? I thought about an IP virtual
device that could be mapped onto two Ethernet NICs, with some ioctl to switch
from one NIC to the other, or a generic virtual Ethernet driver that could
handle two real Ethernet drivers.

Laurent

PS: please CC me, since I do not read lkml on a regular basis. TIA.


2001-10-28 22:23:19

by Laurent Deniel

Subject: Re: Ethernet NIC dual homing

Mark Hahn wrote:
>
> > the kernel switches to the second NIC. Such a similar feature exists in
>
> why not user-space?

Good question. The switch could be initiated by a user-space daemon but
the switch itself should be implemented at kernel level for performance
and atomicity reasons (to avoid losing too many packets)?

2001-10-28 22:42:32

by Laurent Deniel

Subject: Re: Ethernet NIC dual homing

Mark Hahn wrote:
>
> > > > the kernel switches to the second NIC. Such a similar feature exists in
> > >
> > > why not user-space?
> >
> > Good question. The switch could be initiated by a user-space daemon but
> > the switch itself should be implemented at kernel level for performance
> > and atomicity reasons (to avoid losing too many packets)?
>
> nets are lossy; no protocol can't tolerate losing a few packets.
> in any event, there's no way the kernel is going to contain elaborate
> heuristics that try to diagnose a failing NIC. this sort of thing
> really has to be done by user-land.

Yes. NetRAIN (Tru64 UNIX), for instance, has a user-space daemon that
monitors the device input and output statistics and tries to generate some
traffic (e.g. by pinging some multicast groups) when the rx and tx packet
counts have not increased over some period of time. A switch is then
initiated if the counts still do not change. But some failure conditions can
be detected at driver level with some cards (e.g. Ethernet link down).
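
As an illustration of the counter-based check described above, here is a
minimal Python sketch. It is not NetRAIN's implementation: it assumes the
Linux /proc/net/dev text layout, and the function names and the rule that an
interface counts as stalled only when neither counter moved are assumptions
made for the example.

```python
def parse_packet_counts(proc_net_dev_text):
    """Return {iface: (rx_packets, tx_packets)} from /proc/net/dev-style text."""
    counts = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) < 10:
            continue  # not a full statistics line
        # Field 1 is received packets, field 9 is transmitted packets.
        counts[name.strip()] = (int(fields[1]), int(fields[9]))
    return counts


def is_stalled(previous, current):
    """True when neither the rx nor the tx packet count has increased.

    Assumption for this sketch: both counters must be idle to call it stalled.
    """
    return current[0] <= previous[0] and current[1] <= previous[1]
```

A daemon built on this would sample the counters periodically, generate some
test traffic the first time is_stalled() returns True, and request a
switch-over only if the counters still have not moved on the next sample.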

> note that afaik, the switch
> could probably be done atomically simply by changing a route metric,
> and might even happen from the kernel's builtin load-balancing.
> it does depend on the failure mode you're expecting. frankly, I've
> never had a NIC fail - what are you expecting, lightning damage or
> something?
> do you model that the device would give some sign that it's not working,
> or even some kind of warning?

The device could give the Ethernet link down indication. But most failures
would be detected at user-space level by the lack of rx/tx packets over
some period of time. Such a feature would not only provide a working
redundant NIC but would also allow a fully redundant Ethernet connection
(i.e. the two Ethernet NICs could be connected to different switches in the
same IP network): if one NIC or switch fails, the other route would be used
by switching the NIC.

Laurent

2001-10-29 13:22:30

by Martin Eriksson

Subject: Re: Ethernet NIC dual homing

----- Original Message -----
From: "Laurent Deniel" <[email protected]>
To: <[email protected]>
Sent: Sunday, October 28, 2001 6:23 PM
Subject: Ethernet NIC dual homing


>
> Hi,
>
> Does someone know if there is some work in the area of NIC dual homing ?
> By NIC dual homing, I mean two network devices (e.g. Ethernet) that are
> connected to the same IP subnet but only one is active (at IP level) at a
> time. When a faulty condition is detected (e.g. link down or lack of I/O),
> the kernel switches to the second NIC. Such a similar feature exists in
> Tru64 UNIX (NetRAIN), HP-UX (APA) and Solaris (Sun Cluster pnmd).
> What is the best way to handle that in Linux? I thought about an IP virtual
> device that could be mapped onto two Ethernet NICs, with some ioctl to switch
> from one NIC to the other, or a generic virtual Ethernet driver that could
> handle two real Ethernet drivers.

Well, it shouldn't be too hard to modify the bonding driver to do something
like this (?); most of the work should (and will) go into the user-space
daemon instead. That way it would be possible to detect not only faulty
NIC hardware but also, for example, a faulty network segment.

Anyone want to take a shot at this? I'm gonna look into it a bit, because
it sounds like a nice project for me as a linux-kernel-programmer newbie.

_____________________________________________________
| Martin Eriksson <[email protected]>
| MSc CSE student, department of Computing Science
| Umeå University, Sweden


2001-10-29 13:39:01

by willy tarreau

Subject: Re: Ethernet NIC dual homing

Hi Laurent,

> Does someone know if there is some work in the area of NIC
> dual homing ?

I have implemented this for 2.2 kernel a while ago, and Chad
Tindel has completed the port to 2.4. Some other contributors
have added features such as XOR distribution. You can take
a look at it, kernel 2.4 patches are on :

http://sf.net/projects/bonding/

and 2.2 patches are on :


http://www-miaif.lip6.fr/willy/linux-patches/bonding/

Regards,
Willy


___________________________________________________________
Do You Yahoo!? -- Une adresse @yahoo.fr gratuite et en français !
Yahoo! Courrier : http://courrier.yahoo.fr

2001-10-29 19:59:39

by Laurent Deniel

Subject: Re: Ethernet NIC dual homing

willy tarreau wrote:
>
> Hi Laurent,
>
> > Does someone know if there is some work in the area of NIC
> > dual homing ?
>
> I have implemented this for 2.2 kernel a while ago,
> and Chad Tindel has completed the port to 2.4. Some other
> contributors have added features such as XOR distribution. You can
> take a look at it, kernel 2.4 patches are on :
>
> http://sf.net/projects/bonding/
>
> and 2.2 patches are on :
>
> http://www-miaif.lip6.fr/willy/linux-patches/bonding/
>

Thanks for the pointers.

Currently only the link status is used to monitor a NIC.
So it would be nice if an ioctl was available to force a NIC switch-over
(especially in active-backup policy). This could be used by a user-space
daemon in case for instance no traffic is detected.

I see that the bonding driver is included in 2.2.18, what is its status
in 2.4.x ?

Regards,

Laurent

2001-10-29 22:17:18

by Laurent Deniel

Subject: Re: Ethernet NIC dual homing

Laurent Deniel wrote:
>
> Currently only the link status is used to monitor a NIC.
> So it would be nice if an ioctl was available to force a NIC switch-over
> (especially in active-backup policy). This could be used by a user-space
> daemon in case for instance no traffic is detected.
>
> I see that the bonding driver is included in 2.2.18, what is its status
> in 2.4.x ?
>
> Regards,
>
> Laurent

Hmm, it seems that a lot of good stuff (e.g. ARP monitoring and
SIOCBONDCHANGEACTIVE ioctl) are implemented in the bonding patch for 2.4.13.
Will it be included in the mainstream 2.4.x kernel or is it a 2.5 thing ?

2001-10-29 22:26:01

by willy tarreau

Subject: Re: Ethernet NIC dual homing

> Currently only the link status is used to monitor a
> NIC.
> So it would be nice if an ioctl was available to
> force a NIC switch-over
> (especially in active-backup policy). This could be
> used by a user-space
> daemon in case for instance no traffic is detected.

the 2.4 driver provides a mode which sends ARP packets
to test the link (far more reliable than MII), and the
appropriate ioctl for the NIC switch-over you need. It
is available to user through ifenslave -c bond0 eth0
for example.
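
The switch-over command mentioned above could be driven from a monitoring
daemon roughly as follows. This is a sketch only: the ifenslave -c invocation
is taken from this message, while the helper names are hypothetical, and
actually running it assumes root privileges and a configured bond0.

```python
import subprocess


def change_active_command(bond, slave):
    """Build the argv for `ifenslave -c <bond> <slave>` as quoted in the message."""
    return ["ifenslave", "-c", bond, slave]


def change_active(bond, slave, run=subprocess.run):
    """Ask the bonding driver to make `slave` the active interface of `bond`.

    `run` is injectable (hypothetical parameter) so the call can be exercised
    without touching the network; by default the real command is executed and
    a non-zero exit status raises CalledProcessError.
    """
    return run(change_active_command(bond, slave), check=True)
```

A daemon that has decided the active NIC is dead would call, for instance,
change_active("bond0", "eth1").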

> I see that the bonding driver is included in 2.2.18,
> what is its status in 2.4.x ?

well, it works for me each time I download a new
release, but I don't have prod servers on it to test
for a long time, nor have I passed all the intensive
tests that Constantine Gavrilov did when I was working
on 2.2. BTW, the last release (20011026) seems to have
a buggy ifenslave which activates all the NICs flags
(to be confirmed, just seen this today and replaced
with the previous one). If this is confirmed, I'll mail
Chad about this problem.

If there are still people interested in 2.2, I'll try
to find some time to back-port the switch-over ioctl
from 2.4.

Regards,
Willy



2001-10-29 22:32:11

by willy tarreau

Subject: Re: Ethernet NIC dual homing

> Hmm, it seems that a lot of good stuff (e.g. ARP
> monitoring and
> SIOCBONDCHANGEACTIVE ioctl) are implemented in the
> bonding patch for 2.4.13.
> Will it be included in the mainstream 2.4.x kernel
> or is it a 2.5 thing ?

Personally, I don't know. Chad now maintains the
project so he may have better opinions about this.
But I'd like to see it in 2.4 once it is well tested,
since it interests lots of people and it allows us to
put Linux boxes in more critical environments.

Regards,
Willy



2001-10-29 22:57:30

by Chris Friesen

Subject: Re: Ethernet NIC dual homing

willy tarreau wrote:

> the 2.4 driver provides a mode which sends ARP packets
> to test the link (far more reliable than MII), and the
> appropriate ioctl for the NIC switch-over you need. It
> is available to user through ifenslave -c bond0 eth0
> for example.

Are there issues with using MII to detect link state? I thought it was fairly
reliable...

How are you using arp packets to detect if the link is up? Sending it out to
your own MAC address?

Chris

--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]

2001-10-29 23:07:40

by willy tarreau

Subject: Re: Ethernet NIC dual homing

> Are there issues with using MII to detect link
> state? I thought it was fairly reliable...

No, it is not: there are cases with cables that are too long, or with
scratched wires, where the link stays up but even ARP
doesn't work.

> How are you using arp packets to detect if the link
> is up? Sending it out to your own MAC address?

No, simply sending ARP requests for a known IP address
that will reply, which generates traffic that can be
counted to tell whether a NIC seems to be working or not.

I didn't try to set my own IP address though. Perhaps
with arp_filter properly set this could be useful...

Regards,
Willy



2001-10-30 02:32:57

by Dan Hollis

Subject: Re: Ethernet NIC dual homing

On Mon, 29 Oct 2001, Christopher Friesen wrote:
> Are there issues with using MII to detect link state? I thought it was fairly
> reliable...

It doesn't work to detect link state through bridging device (say, bridged
ethernet over T3). The T3 might go down, but your MII link to the local
router will remain "up", so you will never know about the loss of link and
your packets will happily go into the void...

-Dan
--
[-] Omae no subete no kichi wa ore no mono da. [-]

2001-10-30 02:56:23

by Jonathan Lundell

Subject: Re: Ethernet NIC dual homing

At 6:33 PM -0800 10/29/01, Dan Hollis wrote:
>On Mon, 29 Oct 2001, Christopher Friesen wrote:
>> Are there issues with using MII to detect link state? I thought
>>it was fairly
>> reliable...
>
>It doesn't work to detect link state through bridging device (say, bridged
>ethernet over T3). The T3 might go down, but your MII link to the local
>router will remain "up", so you will never know about the loss of link and
>your packets will happily go into the void...

ARP isn't going to do much for you once the failure is beyond the
local segment, is it?
--
/Jonathan Lundell.

2001-10-30 03:15:37

by H. Peter Anvin

Subject: Re: Ethernet NIC dual homing

Followup to: <p05100304b803c6908755@[10.128.7.49]>
By author: Jonathan Lundell <[email protected]>
In newsgroup: linux.dev.kernel
> >
> >It doesn't work to detect link state through bridging device (say, bridged
> >ethernet over T3). The T3 might go down, but your MII link to the local
> >router will remain "up", so you will never know about the loss of link and
> >your packets will happily go into the void...
>
> ARP isn't going to do much for you once the failure is beyond the
> local segment, is it?
>

ARP is broadcast to the layer 2 local segment; link detection refers
to the layer 1 local segment, which is not necessarily the same.

On the other hand, doing link detection is extremely useful for a
portable computer: when I plug in my Ethernet cable in a portable
system I want it to try to start doing DHCP detection and anything
else that is normally associated with the interface being "up" at that
time.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2001-10-30 03:31:08

by Jonathan Lundell

Subject: Re: Ethernet NIC dual homing

At 7:15 PM -0800 10/29/01, H. Peter Anvin wrote:
> > ARP isn't going to do much for you once the failure is beyond the
>> local segment, is it?
>>
>
>ARP is broadcast to the layer 2 local segment; link detection refers
>to the layer 1 local segment, which is not necessarily the same.
>
>On the other hand, doing link detection is extremely useful for a
>portable computer: when I plug in my Ethernet cable in a portable
>system I want it to try to start doing DHCP detection and anything
>else that is normally associated with the interface being "up" at that
>time.

I'm not planning to use bonding on my notebook any time soon.

But what I meant was bonding's use of ARP to determine whether the
connection is good (or rather, bad, even when the link is up), when
the connection is routed via level 3. Seems to me you'd need a level
3 protocol (say ICMP) rather than ARP.
--
/Jonathan Lundell.

2001-10-30 04:55:20

by Dan Hollis

Subject: Re: Ethernet NIC dual homing

On Mon, 29 Oct 2001, Jonathan Lundell wrote:
> At 6:33 PM -0800 10/29/01, Dan Hollis wrote:
> >On Mon, 29 Oct 2001, Christopher Friesen wrote:
> >> Are there issues with using MII to detect link state? I thought
> >>it was fairly
> >> reliable...
> >It doesn't work to detect link state through bridging device (say, bridged
> >ethernet over T3). The T3 might go down, but your MII link to the local
> >router will remain "up", so you will never know about the loss of link and
> >your packets will happily go into the void...
> ARP isn't going to do much for you once the failure is beyond the
> local segment, is it?

But you can use it to determine end-to-end link status. MII is useless for
that when you're going through a bridge.

So ARP is *perfect* for this situation and is *exactly* what is needed.
When you determine link is down using end-to-end status like ARP then you
can take the device out of the bonding queue. Presto, 100% perfect
failover.

-Dan
--
[-] Omae no subete no kichi wa ore no mono da. [-]

2001-10-30 04:57:40

by Dan Hollis

Subject: Re: Ethernet NIC dual homing

On Mon, 29 Oct 2001, Jonathan Lundell wrote:
> But what I meant was bonding's use of ARP to determine whether the
> connection is good (or rather, bad, even when the link is up), when
> the connection is routed via level 3. Seems to me you'd need a level
> 3 protocol (say ICMP) rather than ARP.

bonding isn't for layer 3. it's layer 2. layer 3 you use equal cost
multipath or other method for load balancing.

-Dan
--
[-] Omae no subete no kichi wa ore no mono da. [-]

2001-10-30 10:54:23

by Chemolli Francesco (USI)

Subject: RE: Ethernet NIC dual homing

> Hi,
>
> Does someone know if there is some work in the area of NIC
> dual homing ?
> By NIC dual homing, I mean two network devices (e.g.
> Ethernet) that are
> connected to the same IP subnet but only one is active (at IP
> level) at a
> time.
[...]

Intel eepro100 cards, using Intel drivers (e100) and the ANS subsystem
(all available from Intel for free - as in beer) allow this
at the kernel-level, using link-detection to determine whether
to fail over.
They allow for failover, dual-active (only when sending) and Fast
EtherChannel.


Generally speaking, it shouldn't be hard to do it.
A shell script would be inefficient computation-wise, but should be simple
and quite reliable: arping your default gateway and if it fails more than
X times ifconfig down; ifconfig up.
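
The script idea above boils down to counting failed probes. A minimal Python
sketch of that logic follows. The class name, the choice to count consecutive
failures, and the injected probe and reset callables are assumptions for
illustration: the probe would wrap something like an arping of the default
gateway, and the reset would wrap ifconfig down; ifconfig up.

```python
class GatewayWatchdog:
    """Bounce an interface after too many failed gateway probes in a row."""

    def __init__(self, probe, reset, max_failures):
        self.probe = probe                # callable: True when the gateway answered
        self.reset = reset                # callable: brings the interface down and up
        self.max_failures = max_failures  # the "X" from the message
        self.failures = 0

    def tick(self):
        """Run one probe; return True if the interface was reset on this tick."""
        if self.probe():
            self.failures = 0             # any success clears the failure streak
            return False
        self.failures += 1
        if self.failures > self.max_failures:
            self.reset()
            self.failures = 0
            return True
        return False
```

Calling tick() from a loop with a sleep between iterations gives the
behaviour described, without hard-coding any particular arping options.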

--
/kinkie

2001-10-30 11:03:45

by Chemolli Francesco (USI)

Subject: RE: Ethernet NIC dual homing

> Intel eepro100 cards, using Intel drivers (e100) and the ANS subsystem
> (all available from Intel for free - as in beer) allow this

Sorry to follow myself up.
I've re-checked the license; it seems to be BSD-ish (with a copyright
publicity requirement clause), so they're free or not depending on
whom you ask.

--
/kinkie

2001-10-30 11:27:03

by Alan

Subject: Re: Ethernet NIC dual homing

> Intel eepro100 cards, using Intel drivers (e100) and the ANS subsystem
> (all available from Intel for free - as in beer) allow this
> at the kernel-level, using link-detection to determine whether
> to fail over.

Any current 2.4 kernel with the bonding driver and the ethtool stuff will
do the job too. The only thing you do want to add is the Red Hat patch
for ethtool on eepro100, which should end up in -ac soon

2001-10-30 15:30:53

by Chris Friesen

Subject: Re: Ethernet NIC dual homing

Jonathan Lundell wrote:

> But what I meant was bonding's use of ARP to determine whether the
> connection is good (or rather, bad, even when the link is up), when
> the connection is routed via level 3. Seems to me you'd need a level
> 3 protocol (say ICMP) rather than ARP.

This is what we've done here at work. We use a combination of MII for fast
detection of local link loss, and ICMP ping packets to highly available hosts to
test the network path (with somewhat slower response time).
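
The two-tier check described above might be combined along these lines. This
is a hedged Python sketch, not the code actually used: the function name, the
state labels, and the rule that the path is declared down only when every
monitored host fails to answer are assumptions.

```python
def classify(link_up, ping_ok_by_host):
    """Combine a fast local link check with slower end-to-end ping results.

    link_up          -- bool from the MII link status of the active NIC
    ping_ok_by_host  -- dict mapping each monitored host to True/False
    """
    if not link_up:
        return "link-down"   # local carrier lost: can be acted on immediately
    if ping_ok_by_host and not any(ping_ok_by_host.values()):
        return "path-down"   # link is up but no monitored host answered
    return "ok"
```

Requiring all hosts to fail before reporting "path-down" avoids failing over
because a single monitored host is itself down.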

Chris

--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]