2002-09-26 08:54:42

by Roberto Nibali

Subject: Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification

Hello DaveM and others,

> It sounds more like it would include the FIB too.
>
> That's the second level cache, not the top level lookup which
> is what hits %99 of the time.

I've done extensive testing in this field trying to achieve fast packet
filtering with a huge set of unordered rules loaded into the kernel.

According to my findings I had reason to believe that after around 1000
rules for ipchains and around 4800 rules for iptables the L2 cache was
the limiting factor (of course given the slowish iptables/conntrack
table lookup).

Those are rule thresholds I achieved with a PIII Tualatin and 512KB L2
cache. With a sluggish Celeron with, I think, 128KB L2 cache I achieved
about 1/8 of the above threshold. That's why I thought the L2 cache plays
a bigger role in this than the CPU FSB clock.

I concluded that if the ruleset to be matched exceeds the threshold
of what can be loaded into the L2 cache, we see cache thrashing and that's
why performance goes right to hell. I wanted to test this using oprofile
but haven't found the correct cpu performance counter yet :).
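
As a rough sanity check of the figures above, the per-rule cache budget they imply can be worked out directly. This is only back-of-the-envelope arithmetic; it assumes (unrealistically) that the whole L2 cache is available to the ruleset, and the 128KB Celeron figure is itself a guess in the text above.

```python
# Back-of-the-envelope check of the thresholds quoted above.
# Assumption (not measured): the entire L2 cache holds nothing but rules.
KIB = 1024


def bytes_per_rule(l2_kib: int, rule_threshold: int) -> float:
    """Cache bytes available per rule at the point where throughput collapsed."""
    return l2_kib * KIB / rule_threshold


iptables_budget = bytes_per_rule(512, 4800)  # PIII Tualatin, iptables
ipchains_budget = bytes_per_rule(512, 1000)  # PIII Tualatin, ipchains

# The Celeron has 1/4 of the L2 cache but reached only about 1/8 of the threshold.
cache_ratio = 128 / 512
threshold_ratio = 1 / 8

print(f"iptables: {iptables_budget:.0f} B/rule, ipchains: {ipchains_budget:.0f} B/rule")
print(f"cache ratio {cache_ratio}, threshold ratio {threshold_ratio}")
```

Note that the threshold shrinks faster than the cache does (1/8 versus 1/4), so under these assumptions cache size alone does not account for the whole difference between the two machines.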

> Also not necessary, only the top level cache really needs to be
> top performance.

I will do a new round of testing this weekend for a speech I'll be
giving. This time I will include ipchains, iptables (of course I am
willing to apply every interesting patch regarding hash table
optimisation and whatnot you want me to test), nf-hipac, the OpenBSD pf
and of course the work done by Jamal.

Dave, is the work Jamal has been doing (I think Werner and others
contributed too) before OLS, mostly during it, and probably still now,
the one you're referring to? Hadi showed it to me at OLS and I saw great
potential in it.

I'm asking because the company I work for builds rather big packet
filters (with up to 24 NICs per node) for special purpose networks.
Due to policies and automated ruleset generation (mapping a port
matrix into a weighted graph and then extrapolating the ruleset with
basic algebra, Dijkstra and all this cruft), these end up with a huge set
of rules. Two problems we're facing on a daily basis:

o we can't filter more than 13Mbit/s anymore after loading around 3000
  rules into the kernel (problem is gone with nf-hipac for example).
o we can't log all the messages we would like to because the user space
  log daemon (syslog-ng in our case, but we've tried others too) doesn't
  get enough CPU time anymore to read the buffer before it is
  overwritten by the printk's again. This leads to a log entry loss
  almost proportional to N^2 with an increasing number of rules that do
  not match. This is the worst thing that can happen to you working in
  the security business: not having an appropriate log trace during a
  possible incident.
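
The second problem can be illustrated with a toy model: a fixed-size ring buffer whose writer overwrites the oldest unread entries when the reader falls behind. This is a minimal sketch, not the kernel's printk buffer; the capacity and per-tick rates are made-up parameters, and it only shows the overwrite mechanism, not the N^2 behaviour reported above.

```python
from collections import deque


def simulate(capacity, produced_per_tick, consumed_per_tick, ticks):
    """Return (logged, lost) for a ring buffer that overwrites unread entries."""
    ring = deque(maxlen=capacity)
    logged = lost = 0
    seq = 0
    for _ in range(ticks):
        # Writer side (stands in for the kernel emitting log lines).
        for _ in range(produced_per_tick):
            if len(ring) == capacity:
                lost += 1  # the oldest unread entry is about to be overwritten
            ring.append(seq)
            seq += 1
        # Reader side (stands in for the starved user space log daemon).
        for _ in range(min(consumed_per_tick, len(ring))):
            ring.popleft()
            logged += 1
    return logged, lost


print(simulate(capacity=4, produced_per_tick=2, consumed_per_tick=2, ticks=10))
print(simulate(capacity=4, produced_per_tick=3, consumed_per_tick=1, ticks=10))
```

Once the reader gets fewer turns than the writer, the buffer size only delays the first loss; it does not prevent it.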

AFAICR Jamal did modify the routing and FIB code and hacked iproute2 to
achieve that. We spoke about this at the OLS. Until I had seen his code
my approach to test the speed was to (don't laugh):

o blackhole everything (POLICY DROP)
o generate routing rules (selectors) for matching packets
o add routes which would allow just that specific flow into the
  corresponding routing tables
o '-j <CHAIN>' was implemented using bounce table walking
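
The idea behind the steps above can be modelled in a few lines. This is a user space sketch of the lookup logic only, not the actual routing code or iproute2 configuration; the networks, table name, and interface name are hypothetical.

```python
import ipaddress

# Hypothetical selectors: (source net, destination net) -> routing table name.
RULES = [
    (ipaddress.ip_network("10.0.1.0/24"), ipaddress.ip_network("10.0.2.0/24"), "flow_a"),
]

# Hypothetical per-table routes: destination net -> outgoing interface.
TABLES = {
    "flow_a": {ipaddress.ip_network("10.0.2.0/24"): "eth1"},
}


def forward(src: str, dst: str):
    """Return the outgoing interface, or None if the packet is blackholed."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for src_net, dst_net, table in RULES:
        if s in src_net and d in dst_net:
            # The selector matched: only this table's routes may carry the flow.
            for route_net, dev in TABLES[table].items():
                if d in route_net:
                    return dev
    return None  # default policy: blackhole everything else


print(forward("10.0.1.5", "10.0.2.9"))
print(forward("10.0.3.5", "10.0.2.9"))
```

Filtering falls out of routing here: a flow with no selector and no route simply has nowhere to go.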

This was just a test to see the potential speed improvement of moving
the most simplistic things from netfilter (like raw packet filtering
without conntrack and ports) a 'layer' down to the routing code. A lot
of work has to be done in this field and the filtering code is just
about the simplest part AFAICT, but conntrack and proper n:m NAPT
incorporated into the routing code is IMHO a tricky thing.

Best regards,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc


2002-09-26 09:07:02

by David Miller

Subject: Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification

   From: Roberto Nibali <[email protected]>
   Date: Thu, 26 Sep 2002 11:00:53 +0200

   Hello DaveM and others,

   > That's the second level cache, not the top level lookup which
   > is what hits %99 of the time.
   ...
   the L2 cache was the limiting factor

I'm not talking about cpu second level cache, I'm talking about
a second level lookup table that backs up a front end routing
hash. A software data structure.
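
To make the distinction concrete, here is a minimal sketch of such a two-level software lookup: an exact-match hash in front, backed by a slower table that is only consulted on a miss. The class and its linear longest-prefix scan are illustrative assumptions, not the kernel's routing cache or FIB implementation.

```python
import ipaddress


class TwoLevelLookup:
    def __init__(self, routes):
        # Second level: (network, next_hop) pairs, longest prefix first.
        self.table = sorted(
            ((ipaddress.ip_network(net), hop) for net, hop in routes),
            key=lambda entry: entry[0].prefixlen,
            reverse=True,
        )
        self.cache = {}  # first level: exact-match hash on the destination
        self.hits = self.misses = 0

    def lookup(self, dst: str):
        if dst in self.cache:
            self.hits += 1  # fast path, expected to serve most packets
            return self.cache[dst]
        self.misses += 1
        addr = ipaddress.ip_address(dst)
        # Slow path: first (longest) matching prefix wins.
        hop = next((h for net, h in self.table if addr in net), None)
        self.cache[dst] = hop
        return hop


t = TwoLevelLookup([("10.0.0.0/8", "gw1"), ("10.1.0.0/16", "gw2")])
print(t.lookup("10.1.2.3"), t.lookup("10.1.2.3"), t.hits, t.misses)
```

Only the front-end hash has to be fast on every packet; the second level is paid for once per new destination.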

You are talking about a lot of independent things, but I'm going
to defer my contributions until we have actual code people can
start plugging netfilter into if they want.

About using syslog to record messages: that is doomed to failure.
Implement log messages via netlink and use that to log the events
instead.

2002-09-26 12:05:28

by jamal

Subject: Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification


It would be nice if people would start ccing networking related
discussions to netdev. I missed the first part of the discussion
but i take it the NF-HIPAC authors posted a patch. BTW, I emailed the
authors when i read the paper but never heard back.
What i wanted from the authors was a comparison against one of the tc
classifiers, not iptables.

On Thu, 26 Sep 2002, David S. Miller wrote:

> You are talking about a lot of independent things, but I'm going
> to defer my contributions until we have actual code people can
> start plugging netfilter into if they want.
>

I hacked some code using the traffic control framework around OLS time;
there are a lot of ideas i haven't incorporated yet. Too many hacks, too
little time ;-> I think this is what i may have shown Roberto on my
laptop over a drink.
I probably wouldn't have put this code out if my complaints about
netfilter weren't ignored.
And you know what happens when you start writing poetry: I ended up
worrying about more than just the performance problems of iptables; for
example the code i have now makes it easy to extend the path a packet
takes using simple policies.
The code i have is based around the tc framework. One thing i liked about
netfilter is the idea of targets being separate modules; so the code i
have in fact makes use of netfilter targets.
I plan on revisiting this code at some point, maybe this weekend now that
i am reminded of it ;->
Take a look:
http://www.cyberus.ca/~hadi/patches/action.DESCRIPTION

> About using syslog to record messages, that is doomed to failure,
> implement log messages via netlink and use that to log the events
> instead.
>

Agreed, you need a netlink to syslog converter.
Netlink is king -- all the policies in the above code are netlink
controlled. All events are also netlink transported. You don't have to send
every little message you see; netlink allows you to batch and you could
easily do a Nagle-like algorithm. Next steps are a distributed version
of netlink..
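
A Nagle-like batching policy for events could look roughly like the sketch below: hold events back until either enough have accumulated or the oldest one has waited too long, then send them as one batch. The `Batcher` class, its parameters, and the `send` callback are hypothetical; no real netlink API is used, and for simplicity the delay is only checked when a new event arrives rather than by a timer.

```python
class Batcher:
    def __init__(self, send, max_batch=8, max_delay=0.2):
        self.send = send            # callback that receives a list of events
        self.max_batch = max_batch  # flush once this many events are pending
        self.max_delay = max_delay  # or once the oldest event is this old (seconds)
        self.pending = []
        self.first_ts = None

    def add(self, event, now):
        """Queue an event; `now` is the caller's clock, passed in for testability."""
        if not self.pending:
            self.first_ts = now
        self.pending.append(event)
        if len(self.pending) >= self.max_batch or now - self.first_ts >= self.max_delay:
            self.flush()

    def flush(self):
        if self.pending:
            self.send(list(self.pending))
            self.pending.clear()


sent = []
b = Batcher(sent.append, max_batch=3, max_delay=1.0)
for ts, ev in [(0.0, "a"), (0.1, "b"), (0.2, "c"), (5.0, "d"), (6.5, "e")]:
    b.add(ev, ts)
print(sent)
```

The trade-off is the usual one: fewer, larger messages to the log consumer in exchange for a bounded delay before an event becomes visible.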

cheers,
jamal

2002-09-26 11:59:14

by Andi Kleen

Subject: Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification

On Thu, Sep 26, 2002 at 11:00:53AM +0200, Roberto Nibali wrote:
> o we can't filter more than 13Mbit/s anymore after loading around 3000
> rules into the kernel (problem is gone with nf-hipac for example).

For iptables/ipchains you need to write hierarchical/port range rules
in this case and try to terminate searches early.
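
The effect of hierarchical rules can be sketched by counting comparisons for a flat rule list versus rules grouped into per-subnet chains. This is a toy model with made-up networks, not iptables itself; the early termination assumes the parent networks do not overlap.

```python
import ipaddress


def flat_match(rules, addr):
    """Linear scan over all rules; returns (verdict, comparisons)."""
    ip = ipaddress.ip_address(addr)
    for n, (net, verdict) in enumerate(rules, start=1):
        if ip in net:
            return verdict, n
    return "DROP", len(rules)


def tree_match(chains, addr):
    """Pick the chain of the enclosing parent net, then scan only that chain."""
    ip = ipaddress.ip_address(addr)
    checks = 0
    for parent, rules in chains:
        checks += 1
        if ip in parent:
            for net, verdict in rules:
                checks += 1
                if ip in net:
                    return verdict, checks
            return "DROP", checks  # terminate early: no other chain can match
    return "DROP", checks


# 256 hypothetical /24 rules, either flat or grouped under 16 /16 parents.
rules = [(ipaddress.ip_network(f"10.{p}.{s}.0/24"), "ACCEPT")
         for p in range(16) for s in range(16)]
chains = [(ipaddress.ip_network(f"10.{p}.0.0/16"),
           [(ipaddress.ip_network(f"10.{p}.{s}.0/24"), "ACCEPT") for s in range(16)])
          for p in range(16)]

print(flat_match(rules, "10.15.15.1"))
print(tree_match(chains, "10.15.15.1"))
```

For the worst-case address the flat list needs one comparison per rule, while the grouped layout needs one per parent plus one per rule in a single chain.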

But yes, we also found that the L2 cache is limiting here
(ip_conntrack has the same problem)

> o we can't log all the messages we would like to because the user space
> log daemon (syslog-ng in our case, but we've tried others too) doesn't
> get enough CPU time anymore to read the buffer before it will be
> overwritten by the printk's again. This leads to an almost proportional to
> N^2 log entry loss with increasing number of rules that do not match.
> This is the worst thing that can happen to you working in the
> security business: not having an appropriate log trace during a
> possible incident.

At least that is easily fixed. Just increase the LOG_BUF_LEN parameter
in kernel/printk.c

Alternatively don't use slow printk, but nfnetlink to report bad packets
and print from user space. That should scale much better.

-Andi

2002-09-30 17:52:28

by Bill Davidsen

Subject: Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification

On Thu, 26 Sep 2002, Roberto Nibali wrote:

> I've done extensive testing in this field trying to achieve fast packet
> filtering with a huge set of not ordered rules loaded into the kernel.
>
> According to my findings I had reason to believe that after around 1000
> rules for ipchains and around 4800 rules for iptables the L2 cache was
> the limiting factor (of course given the slowish iptables/conntrack
> table lookup).
>
> Those are rule thresholds I achieved with a PIII Tualatin and 512KB L2
> cache. With a sluggish Celeron with I think 128KB L2 cache I achieved
> about 1/8 of the above threshold. That's why I thought the L2 cache plays
> a bigger role in this than the CPU FSB clock.
>
> I concluded that if the ruleset to be matched would exceed the threshold
> of what can be loaded into the L2 cache we see cache thrashing and that's
> why performance goes right to hell. I wanted to test this using oprofile
> but haven't found the correct cpu performance counter yet :).
>
> > Also not necessary, only the top level cache really needs to be
> > top performance.
>
> I will do a new round of testing this weekend for a speech I'll be
> giving. This time I will include ipchains, iptables (of course I am
> willing to apply every interesting patch regarding hash table
> optimisation and whatnot you want me to test), nf-hipac, the OpenBSD pf
> and of course the work done by Jamal.

I look forward to any info you can provide.

I particularly like that nf-hipac can be put in and tried in a one-to-one
comparison, which leaves an easy route to testing and getting confidence in
the code.

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.