2001-07-19 15:45:01

by Cornel Ciocirlan

Subject: Request for comments

Hi,

I was thinking of starting a project to implement a Cisco-like
"NetFlow" architecture for Linux. This would be relevant for edge routers
and/or network monitoring devices.

What this would do is keep a "cache" of all the "flows" that are passing
through the system; a flow is defined as the set of packets that have the
same headers - or header fields. For example we could choose "ip source,
ip destination, ip protocol, ip source port [if relevant], ip destination
port [if relevant]", and maintain a cache of all distinct such
"flows" that pass through the system. The flows would have to be
"expired" from the cache (LRU) and there should be a limit on the size of
the cache.
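
A minimal sketch of such a cache, under stated assumptions: IPv4 only, a
deliberately tiny fixed-size table scanned linearly, and a logical clock
for LRU. All names here (flow_key, flow_entry, flow_cache, flow_lookup,
FLOW_CACHE_SIZE) are hypothetical and are not taken from any existing
kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FLOW_CACHE_SIZE 4          /* tiny on purpose; the size limit of the cache */

/* The header fields that define a flow (hypothetical layout). */
struct flow_key {
    uint32_t saddr, daddr;         /* IPv4 source / destination address */
    uint16_t sport, dport;         /* 0 when the protocol has no ports */
    uint8_t  proto;                /* IP protocol number */
};

struct flow_entry {
    struct flow_key key;
    uint64_t last_used;            /* logical clock value of the last hit */
    uint64_t packets;              /* packets seen for this flow */
    int      in_use;
};

struct flow_cache {
    struct flow_entry slot[FLOW_CACHE_SIZE];
    uint64_t clock;                /* incremented on every lookup */
};

/* Compare field by field so struct padding never matters. */
static int flow_key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport &&
           a->proto == b->proto;
}

/*
 * Return the entry for key k. On a miss, reuse a free slot if there is
 * one, otherwise evict the least recently used entry.
 */
static struct flow_entry *flow_lookup(struct flow_cache *c,
                                      const struct flow_key *k)
{
    struct flow_entry *victim = &c->slot[0];
    int i;

    assert(c != NULL && k != NULL);
    c->clock++;

    for (i = 0; i < FLOW_CACHE_SIZE; i++) {
        struct flow_entry *e = &c->slot[i];

        if (e->in_use && flow_key_equal(&e->key, k)) {
            e->last_used = c->clock;       /* hit: refresh LRU position */
            e->packets++;
            return e;
        }
        if (!e->in_use) {
            if (victim->in_use)
                victim = e;                /* first free slot wins */
        } else if (victim->in_use && e->last_used < victim->last_used) {
            victim = e;                    /* older than current candidate */
        }
    }

    /* Miss: (re)initialise the chosen slot for the new flow. */
    victim->key = *k;
    victim->last_used = c->clock;
    victim->packets = 1;
    victim->in_use = 1;
    return victim;
}
```

A caller would zero a struct flow_cache once (for example with memset)
and call flow_lookup() for every packet. The linear scan only keeps the
sketch short; it would not survive real traffic.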

What can we use the cache for:

a) more efficient packet filtering. After a cache entry is created for a
flow, we apply the ACLs for the packet and associate the action with the
flow. All subsequent packets belonging to the same flow will be
dropped/accepted without re-applying the packet filtering rules.
b) traffic statistics. When expiring a flow in the cache we could send a
special "message" to a user-space process with the
* flow characteristics (ip src, ip dest etc)
* total number of packets that were associated with this flow
* flow start timestamp, flow last-activity timestamp
* avg pkts/second while the flow was active
* total bytes transmitted for this flow
c) we could make routing decisions by looking at the flow cache, e.g. when
we first create the flow we look into the routing table and save the
index of the output interface in the flow cache. Subsequent packets
matching the flow will not cause a search through the routing table.
d) prevent denial-of-service by configuring, for example, automatic
filtering of any flow that exceeds some high pps threshold (most flows
will probably be 1000 pps max, while packet floods can easily be 5k-25k pps).
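
Points a), b) and d) can be sketched together as per-flow state: a verdict
cached from the first ACL evaluation, the counters that would be reported
on expiry, and a pps check that flips the verdict to drop. This is only an
illustration; the names (flow_stats, flow_start, flow_account,
flow_avg_pps, FLOW_MIN_WINDOW_MS), the millisecond timestamps and the
100 ms minimum measurement window are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Ignore the rate until the flow has lived this long (assumed value). */
#define FLOW_MIN_WINDOW_MS 100

enum verdict { VERDICT_ACCEPT, VERDICT_DROP };

struct flow_stats {
    enum verdict verdict;   /* decision cached from the first ACL pass */
    uint64_t packets;       /* total packets in this flow */
    uint64_t bytes;         /* total bytes in this flow */
    uint64_t first_ms;      /* flow start timestamp */
    uint64_t last_ms;       /* last-activity timestamp */
};

/* Called once, for the first packet, after the ACLs have been applied. */
static void flow_start(struct flow_stats *f, enum verdict v,
                       uint64_t now_ms, uint32_t len)
{
    f->verdict = v;
    f->packets = 1;
    f->bytes = len;
    f->first_ms = now_ms;
    f->last_ms = now_ms;
}

/* Average packets per second over the life of the flow. */
static uint64_t flow_avg_pps(const struct flow_stats *f)
{
    uint64_t span_ms = f->last_ms - f->first_ms;

    if (span_ms < FLOW_MIN_WINDOW_MS)
        return 0;           /* too short to measure meaningfully */
    return f->packets * 1000 / span_ms;
}

/*
 * Called for every later packet of the flow. Updates the counters and
 * returns the cached verdict without consulting the ACLs again. If
 * flood_pps is non-zero and the average rate exceeds it, the flow is
 * switched to VERDICT_DROP from then on.
 */
static enum verdict flow_account(struct flow_stats *f, uint64_t now_ms,
                                 uint32_t len, uint64_t flood_pps)
{
    assert(now_ms >= f->last_ms);   /* timestamps must not go backwards */

    f->packets++;
    f->bytes += len;
    f->last_ms = now_ms;

    if (flood_pps != 0 && flow_avg_pps(f) > flood_pps)
        f->verdict = VERDICT_DROP;
    return f->verdict;
}
```

With a threshold of 5000 pps, a flow sending one packet per millisecond
keeps its accept verdict, while a burst of a few thousand packets inside
100 ms gets dropped. A lifetime average is slow to react to a flow that
turns hostile late; a sliding window would be the obvious refinement.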

Problems:
- some overhead will be added, however if we implement a) and c) above we
can reduce it. d) will also make the system perform better under high
load.
- we need to come up with a pretty efficient data structure to search
through it very quickly - if we route 20k pps, too much overhead will kill
us. I was thinking of a hash table with AVL trees instead of linked lists,
which I think the buffer cache is using (other options: maybe splay trees
would be useful?)
- in all cases we'll need something like an expiry thread that actively
removes inactive flows from the cache
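
To make the last two points concrete, here is a rough sketch of a hashed
flow table with an expiry sweep. It deliberately uses plain linked-list
chains rather than the AVL trees suggested above, and a function that
something like a timer or thread would call periodically. The hash is
FNV-1a over the key fields; every name (flow_table, flow_node, flow_touch,
flow_expire, FLOW_BUCKETS) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define FLOW_BUCKETS 256            /* power of two, so masking works */

struct flow_key {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t  proto;
};

struct flow_node {
    struct flow_key key;
    uint64_t last_seen;             /* timestamp of the last packet */
    struct flow_node *next;         /* collision chain */
};

struct flow_table {
    struct flow_node *bucket[FLOW_BUCKETS];
    unsigned long count;
};

/* Feed the four bytes of v into a 32-bit FNV-1a hash state. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t v)
{
    int i;

    for (i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xff;
        h *= 16777619u;             /* FNV prime */
    }
    return h;
}

static unsigned flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;       /* FNV offset basis */

    h = fnv1a_mix(h, k->saddr);
    h = fnv1a_mix(h, k->daddr);
    h = fnv1a_mix(h, ((uint32_t)k->sport << 16) | k->dport);
    h = fnv1a_mix(h, k->proto);
    return h & (FLOW_BUCKETS - 1);
}

static int flow_key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport &&
           a->proto == b->proto;
}

/* Find or create the node for k and stamp it with the current time. */
static struct flow_node *flow_touch(struct flow_table *t,
                                    const struct flow_key *k, uint64_t now)
{
    struct flow_node **head = &t->bucket[flow_hash(k)];
    struct flow_node *n;

    for (n = *head; n != NULL; n = n->next) {
        if (flow_key_equal(&n->key, k)) {
            n->last_seen = now;
            return n;
        }
    }
    n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;                /* out of memory: caller decides */
    n->key = *k;
    n->last_seen = now;
    n->next = *head;
    *head = n;
    t->count++;
    return n;
}

/* Remove every flow idle for more than max_idle; return how many. */
static unsigned long flow_expire(struct flow_table *t, uint64_t now,
                                 uint64_t max_idle)
{
    unsigned long removed = 0;
    unsigned i;

    assert(t != NULL);
    for (i = 0; i < FLOW_BUCKETS; i++) {
        struct flow_node **pp = &t->bucket[i];

        while (*pp != NULL) {
            struct flow_node *n = *pp;

            if (now - n->last_seen > max_idle) {
                *pp = n->next;      /* unlink, then free */
                free(n);
                removed++;
            } else {
                pp = &n->next;
            }
        }
    }
    t->count -= removed;
    return removed;
}
```

The sweep assumes a monotonic clock and visits every bucket, so its cost
grows with the table size. In a kernel setting the table would also need
locking, which this sketch omits entirely.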

Is it useful at all ? Point b) above could be implemented in userspace
(Actually I've done a basic skeleton a while ago). Are the others worth
the trouble ?

What do you gurus think ?

Kind regards,
Cornel.



2001-07-19 16:31:14

by Crutcher Dunnavant

Subject: Re: Request for comments

++ 19/07/01 18:44 +0300 - Cornel Ciocirlan:
> a) more efficient packet filtering. After a cache entry is created for a
> flow, we apply the ACLs for the packet and associate the action with the
> flow. All subsequent packets belonging to the same flow will be
> dropped/accepted without re-applying the packet filtering rules.

I'm seeing an identification problem arise here. You have to be able to
identify packets in a flow robustly, and you have to be able to spot packets
trying to fake it. I don't see any way in which you will be able to avoid
the packet filtering rules here.

--
Crutcher <[email protected]>
GCS d--- s+:>+:- a-- C++++$ UL++++$ L+++$>++++ !E PS+++ PE Y+ PGP+>++++
R-(+++) !tv(+++) b+(++++) G+ e>++++ h+>++ r* y+>*$

2001-07-19 17:24:38

by James Lewis Nance

Subject: Re: Request for comments

On Thu, Jul 19, 2001 at 06:44:52PM +0300, Cornel Ciocirlan wrote:

> What this would do is keep a "cache" of all the "flows" that are passing
> through the system; a flow is defined as the set of packets that have the
> same headers - or header fields. For example we could choose "ip source,
> ip destination, ip protocol, ip source port [if relevant], ip destination
> port [if relevant]", and maintain a cache of all distinct such
> "flows" that pass through the system. The flows would have to be
> "expired" from the cache (LRU) and there should be a limit on the size of
> the cache.

This sounds a lot like what MPLS does. I believe that someone has MPLS
patches for the kernel, but I don't know who. You might want to find them
and take a look.

Jim

2001-07-19 17:25:48

by Francois Romieu

Subject: Re: Request for comments

Cornel Ciocirlan <[email protected]> wrote:
[heavy linux networking rewrite in sight]
> Is it useful at all ? Point b) above could be implemented in userspace
> (Actually I've done a basic skeleton a while ago). Are the others worth
> the trouble ?
>
> What do you gurus think ?

* Are you sure where the cycles are spent when routing >> 20 kpps?
I have always been told that the lack of polling/batch processing is what
kills the software router.
* What is wrong with the (not widely used) CONFIG_NET_FASTROUTE?
* <mantra> mpls is good </mantra> Did you browse mpls at http://www.ietf.org
(and sourceforge for the current state of mpls-linux)?
* IANAG, but this looks like a wait-for-2.5 project, btw.

--
Ueimor

2001-07-19 17:33:28

by Andi Kleen

Subject: Re: Request for comments

Cornel Ciocirlan <[email protected]> writes:

> Hi,
>
> I was thinking of starting a project to implement a Cisco-like
> "NetFlow" architecture for Linux. This would be relevant for edge routers
> and/or network monitoring devices.

Linux 2.1+ has already had such a cache, in the form of the rtcache, for
several years.

-Andi

2001-07-19 17:52:52

by Jakob Oestergaard

Subject: Re: Request for comments

On Thu, Jul 19, 2001 at 07:33:02PM +0200, Andi Kleen wrote:
> Cornel Ciocirlan <[email protected]> writes:
>
> > Hi,
> >
> > I was thinking of starting a project to implement a Cisco-like
> > "NetFlow" architecture for Linux. This would be relevant for edge routers
> > and/or network monitoring devices.
>
> Linux 2.1+ has already had such a cache, in the form of the rtcache, for
> several years.

NeTraMet is a project that will give you NetFlow-like data. You set up traffic
meters on your routers, and gather data centrally from the meters using SNMP.

Works great.

See: http://www2.auckland.ac.nz/net/Accounting/ntm.Release.note.html

--
................................................................
: [email protected] : And I see the elder races, :
:.........................: putrid forms of man :
: Jakob Østergaard : See him rise and claim the earth, :
: OZ9ABN : his downfall is at hand. :
:.........................:............{Konkhra}...............: