2002-02-25 08:39:30

by Jan Kasprzak

Subject: Equal cost multipath crash

Hello network hackers,

I had a strange failure of my Linux router yesterday. It is quite an
uncommon setup, but I wonder what could have caused this. The router
started to dump the following messages into the syslog, and it stopped
routing so our network was not reachable from the outside world:

Feb 24 21:26:49 router kernel: impossible 888
Feb 24 21:39:20 router kernel: ible 888
Feb 24 21:39:20 router kernel: impossible 888
Feb 24 21:39:20 router last message repeated 42 times
Feb 24 21:39:20 router kernel: impossible 888
Feb 24 21:39:21 router kernel: NET: 344 messages suppressed.
Feb 24 21:39:21 router kernel: dst cache overflow
Feb 24 21:39:21 router kernel: impossible 888
Feb 24 21:39:21 router last message repeated 275 times
[... and so on ...]
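
To gauge how often the message fired, a small filter along these lines could be run over the syslog. This is only a sketch: the function name tally_impossible is made up for illustration, and it assumes syslogd's "last message repeated N times" line always refers to the line directly before it.

```shell
#!/bin/sh
# Sketch: count "impossible 888" kernel messages read from stdin,
# including those folded by syslogd into "last message repeated N times".
# tally_impossible is a hypothetical helper name, not an existing tool.
tally_impossible() {
    awk '
        # A direct hit: count it and remember that it was the last message.
        /kernel: impossible 888/ { count++; last = 1; next }
        # A repeat line only counts if the previous message was a hit.
        /last message repeated [0-9]+ times/ {
            if (last) {
                for (i = 1; i < NF; i++)
                    if ($i == "repeated") count += $(i + 1)
            }
            next
        }
        # Any other message breaks the run.
        { last = 0 }
        END { print count + 0 }
    '
}

# Example: feed it a short excerpt of the log above.
tally_impossible <<'EOF'
Feb 24 21:39:20 router kernel: impossible 888
Feb 24 21:39:20 router last message repeated 42 times
Feb 24 21:39:21 router kernel: dst cache overflow
EOF
```

Messages dropped by the kernel's own rate limiting ("NET: 344 messages suppressed.") are not attributed to any particular message, so the tally is a lower bound.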

After a few minutes, a co-worker of mine pressed the big red button.

The box is a dual Athlon 1200 MP (Tyan Thunder K7 board), with two
on-board NICs (some kind of 3c90x) - eth0 and eth1 - running IPv4 over
ethernet, and one 3c985B (Tigon II, eth2) gigabit NIC, running IPv4 over
802.1Q VLANs. I have five VLANs on the gigabit NIC.

Routing was set up statically, with about 100 IP rules
(used mainly for blocking IP addresses -- ip rule ... blackhole).
In the routing table "main", there were static routes
to directly connected LANs or VLANs, and one default route, which
did equal cost multipath over eth0 and one VLAN - eth2.61:

ip route add default table main nexthop via IP1 dev eth0 \
nexthop via IP2 dev eth2.61

There was one more routing table used mainly for testing
and experiments. Most of the time (and at the time of the crash) it looked
the same way as the table "main", and some IPs were directed to this
table using IP rules.
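
For anyone trying to reproduce this, the configuration described above could be approximated roughly as follows. This is a dry-run sketch that only prints the ip commands; the gateway addresses (standing in for IP1 and IP2), the blocked addresses, the table number 100, and the helper function names are all placeholders I made up, not the real values.

```shell
#!/bin/sh
# Dry-run sketch: prints ip(8) commands instead of executing them.
# Every address, the table number, and the helper names are hypothetical.

GW_ETH0="192.0.2.1"      # stands in for IP1 (placeholder)
GW_VLAN="198.51.100.1"   # stands in for IP2 (placeholder)
TEST_TABLE="100"         # assumed number of the experimental table

# Emit one blackhole rule per blocked source address.
blackhole_rules() {
    for addr in "$@"; do
        echo "ip rule add from $addr blackhole"
    done
}

# Emit the equal cost multipath default route for the given table.
multipath_default() {
    echo "ip route add default table $1 nexthop via $GW_ETH0 dev eth0 nexthop via $GW_VLAN dev eth2.61"
}

# Emit a rule sending one source prefix to the experimental table.
divert_to_table() {
    echo "ip rule add from $1 table $2"
}

blackhole_rules 203.0.113.7 203.0.113.9
multipath_default main
multipath_default "$TEST_TABLE"
divert_to_table 203.0.113.64/26 "$TEST_TABLE"
```

Piping the output to sh as root would apply the commands; review the output first, since the values are invented.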

Kernel 2.4.17, RedHat 7.2 with updates.

What could cause this problem? I am willing to send more information
on request.

Thanks,

-Yenya

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ |
|\ As anyone can tell you trying to force things on Linux developers /|
|\\ generally works out pretty badly. (Alan Cox in lkml) //|


2002-02-25 08:54:01

by Eric Krout

Subject: Re: Equal cost multipath crash

On Mon, 2002-02-25 at 03:39, Jan Kasprzak wrote:
> Hello network hackers,
>
> I had a strange failure of my Linux router yesterday.
...
> Feb 24 21:26:49 router kernel: impossible 888
> Feb 24 21:39:20 router kernel: ible 888
> Feb 24 21:39:20 router kernel: impossible 888
> Feb 24 21:39:20 router last message repeated 42 times
> Feb 24 21:39:20 router kernel: impossible 888
> Feb 24 21:39:21 router kernel: NET: 344 messages suppressed.
> Feb 24 21:39:21 router kernel: dst cache overflow
> Feb 24 21:39:21 router kernel: impossible 888
> Feb 24 21:39:21 router last message repeated 275 times
> [... and so on ...]
>
> After a few minutes, a co-worker of mine pressed the big red button.
>
...


Most code I've seen that prints "impossible" looks like this:


/* This is a disaster if it occurs */
printk("impossible");


I'd say it's a Good Thing(tm) that your co-worker "pressed the big red
button" ;-)

(Sorry, that's the best I could come up with at 3:51am)

2002-02-25 14:56:02

by Kristian Peters

Subject: Re: Equal cost multipath crash

Jan Kasprzak wrote:
>
> I had a strange failure of my Linux router yesterday. It is quite an
> uncommon setup, but I wonder what could have caused this. The router
> started to dump the following messages into the syslog, and it stopped
> routing so our network was not reachable from the outside world:
>
> Feb 24 21:26:49 router kernel: impossible 888
> Feb 24 21:39:20 router kernel: ible 888
> Feb 24 21:39:20 router kernel: impossible 888
> Feb 24 21:39:20 router last message repeated 42 times
> Feb 24 21:39:20 router kernel: impossible 888
> Feb 24 21:39:21 router kernel: NET: 344 messages suppressed.
> Feb 24 21:39:21 router kernel: dst cache overflow
> Feb 24 21:39:21 router kernel: impossible 888
> Feb 24 21:39:21 router last message repeated 275 times
> [... and so on ...]

Have you applied the grsecurity patches? I get the same messages with them from time to time when hosts forget to log off. Most of them are harmless and only useful for debugging your firewall rules.

*Kristian

:... [snd.science] ...:
::
:: http://www.korseby.net
:: http://gsmp.sf.net
:..........................:

2002-02-25 17:28:49

by Jan Kasprzak

Subject: Re: Equal cost multipath crash

Kristian wrote:
: Jan Kasprzak wrote:
: >
: > I had a strange failure of my Linux router yesterday. It is quite an
: > uncommon setup, but I wonder what could have caused this. The router
: > started to dump the following messages into the syslog, and it stopped
: > routing so our network was not reachable from the outside world:
: >
: > Feb 24 21:26:49 router kernel: impossible 888
: > Feb 24 21:39:20 router kernel: ible 888
: > Feb 24 21:39:20 router kernel: impossible 888
: > Feb 24 21:39:20 router last message repeated 42 times
: > Feb 24 21:39:20 router kernel: impossible 888
: > Feb 24 21:39:21 router kernel: NET: 344 messages suppressed.
: > Feb 24 21:39:21 router kernel: dst cache overflow
: > Feb 24 21:39:21 router kernel: impossible 888
: > Feb 24 21:39:21 router last message repeated 275 times
: > [... and so on ...]
:
: Have you applied the grsecurity patches? I get the same messages with them from time to time when hosts forget to log off. Most of them are harmless and only useful for debugging your firewall rules.
:

No. What are the grsecurity patches? This is a stock 2.4.17 kernel.

-Y.
