2007-05-09 07:01:17

by Vasily Averin

Subject: [NETFILTER] early_drop() improvement (v3)

When the number of conntracks reaches the nf_conntrack_max limit, early_drop()
tries to free one of the conntracks already in use. If it cannot find any
conntrack that may be freed, transmission errors follow.
In the current implementation the conntracks are searched in one hash bucket
only. This has some drawbacks: if that hash bucket is empty, we have no chance
of finding anything; on the other hand, the bucket can contain a huge number of
conntracks, and checking it can take a long time.
The proposed patch limits the number of checked conntracks to the default number
of conntracks per hash bucket (NF_CT_PER_BUCKET) and allows the search to
continue into other hash buckets. As a result the search has the same chance of
freeing a conntrack in every case, and the check cannot lead to long delays.

Signed-off-by: Vasily Averin <[email protected]>

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index e132c8a..d984bce 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -76,6 +76,8 @@ static unsigned int nf_conntrack_next_id;
DEFINE_PER_CPU(struct ip_conntrack_stat, nf_conntrack_stat);
EXPORT_PER_CPU_SYMBOL(nf_conntrack_stat);

+#define NF_CT_PER_BUCKET 8U
+
/*
* This scheme offers various size of "struct nf_conn" dependent on
* features(helper, nat, ...)
@@ -525,7 +527,7 @@ EXPORT_SYMBOL_GPL(nf_conntrack_tuple_taken);

/* There's a small race here where we may free a just-assured
connection. Too bad: we're in trouble anyway. */
-static int early_drop(struct list_head *chain)
+static int __early_drop(struct list_head *chain, unsigned int *cnt)
{
/* Traverse backwards: gives us oldest, which is roughly LRU */
struct nf_conntrack_tuple_hash *h;
@@ -540,6 +542,8 @@ static int early_drop(struct list_head *chain)
atomic_inc(&ct->ct_general.use);
break;
}
+ if (!--(*cnt))
+ break;
}
read_unlock_bh(&nf_conntrack_lock);

@@ -555,6 +559,21 @@ static int early_drop(struct list_head *chain)
return dropped;
}

+static int early_drop(const struct nf_conntrack_tuple *orig)
+{
+ unsigned int i, hash, cnt;
+ int ret = 0;
+
+ hash = hash_conntrack(orig);
+ cnt = NF_CT_PER_BUCKET;
+
+ for (i = 0;
+ !ret && cnt && i < nf_conntrack_htable_size;
+ ++i, hash = ++hash % nf_conntrack_htable_size)
+ ret = __early_drop(&nf_conntrack_hash[hash], &cnt);
+ return ret;
+}
+
static struct nf_conn *
__nf_conntrack_alloc(const struct nf_conntrack_tuple *orig,
const struct nf_conntrack_tuple *repl,
@@ -574,9 +593,7 @@ __nf_conntrack_alloc(const struct nf_conntrack_tuple *orig,

if (nf_conntrack_max
&& atomic_read(&nf_conntrack_count) > nf_conntrack_max) {
- unsigned int hash = hash_conntrack(orig);
- /* Try dropping from this hash chain. */
- if (!early_drop(&nf_conntrack_hash[hash])) {
+ if (!early_drop(orig)) {
atomic_dec(&nf_conntrack_count);
if (net_ratelimit())
printk(KERN_WARNING
@@ -1226,7 +1243,7 @@ int __init nf_conntrack_init(void)
if (nf_conntrack_htable_size < 16)
nf_conntrack_htable_size = 16;
}
- nf_conntrack_max = 8 * nf_conntrack_htable_size;
+ nf_conntrack_max = NF_CT_PER_BUCKET * nf_conntrack_htable_size;

printk("nf_conntrack version %s (%u buckets, %d max)\n",
NF_CONNTRACK_VERSION, nf_conntrack_htable_size,



2007-06-25 13:53:34

by Patrick McHardy

Subject: Re: [NETFILTER] early_drop() improvement (v3)

Vasily Averin wrote:
> +static int early_drop(const struct nf_conntrack_tuple *orig)
> +{
> + unsigned int i, hash, cnt;
> + int ret = 0;
> +
> + hash = hash_conntrack(orig);
> + cnt = NF_CT_PER_BUCKET;
> +
> + for (i = 0;
> + !ret && cnt && i < nf_conntrack_htable_size;
> + ++i, hash = ++hash % nf_conntrack_htable_size)
> + ret = __early_drop(&nf_conntrack_hash[hash], &cnt);

Formatting is a bit ugly, looks much nicer as:

for (i = 0; i < nf_conntrack_htable_size; i++) {

ret = __early_drop(&nf_conntrack_hash[hash], &cnt);
if (ret || !cnt)
break;
hash = ++hash % nf_conntrack_htable_size;
}

> @@ -1226,7 +1243,7 @@ int __init nf_conntrack_init(void)
> if (nf_conntrack_htable_size < 16)
> nf_conntrack_htable_size = 16;
> }
> - nf_conntrack_max = 8 * nf_conntrack_htable_size;
> + nf_conntrack_max = NF_CT_PER_BUCKET * nf_conntrack_htable_size;


I don't like the NF_CT_PER_BUCKET constant. First of all, each
conntrack is hashed twice, so it's really only half of the average
number of conntracks per bucket. Secondly, it's only a default, and
many people use nf_conntrack_max = nf_conntrack_htable_size / 2, so
using this constant for early_drop seems wrong.

Perhaps make it 2 * nf_conntrack_max / nf_conntrack_htable_size
or even add a nf_conntrack_eviction_range sysctl.

2007-06-25 14:36:59

by Jan Engelhardt

Subject: Re: [NETFILTER] early_drop() improvement (v3)


On Jun 25 2007 15:53, Patrick McHardy wrote:
>Vasily Averin wrote:
>> +static int early_drop(const struct nf_conntrack_tuple *orig)
>> +{
>> + unsigned int i, hash, cnt;
>> + int ret = 0;
>> +
>> + hash = hash_conntrack(orig);
>> + cnt = NF_CT_PER_BUCKET;
>> +
>> + for (i = 0;
>> + !ret && cnt && i < nf_conntrack_htable_size;
>> + ++i, hash = ++hash % nf_conntrack_htable_size)
>> + ret = __early_drop(&nf_conntrack_hash[hash], &cnt);
>
>Formatting is a bit ugly, looks much nicer as:
>
> for (i = 0; i < nf_conntrack_htable_size; i++) {
>
> ret = __early_drop(&nf_conntrack_hash[hash], &cnt);
> if (ret || !cnt)
> break;
> hash = ++hash % nf_conntrack_htable_size;
> }

gcc warning: operation on ‘hash’ may be undefined



Jan
--

2007-06-26 13:22:35

by Vasily Averin

Subject: Re: [NETFILTER] early_drop() improvement (v3)

Patrick McHardy wrote:

Patrick, thank you for your tips; I'll rework the patch.

> I don't like the NF_CT_PER_BUCKET constant. First of all, each
> conntrack is hashed twice, so it's really only half of the average
> number of conntracks per bucket. Secondly, it's only a default, and
> many people use nf_conntrack_max = nf_conntrack_htable_size / 2, so
> using this constant for early_drop seems wrong.

> Perhaps make it 2 * nf_conntrack_max / nf_conntrack_htable_size
> or even add a nf_conntrack_eviction_range sysctl.

IMHO the number of conntracks checked in early_drop() has the following restrictions:
- it should not be too low, to reduce the chances of transmission failures;
- it should be limited by some reasonable value, to prevent long check delays.

Also I believe it makes sense to keep it constant (how about the name NF_CT_EVICTION?)
so that all nodes behave the same way. However, I strongly doubt that anybody will
want to change this value. Do you think it is really required?

thank you,
Vasily Averin

2007-06-26 13:28:36

by Patrick McHardy

Subject: Re: [NETFILTER] early_drop() improvement (v3)

Vasily Averin wrote:
> Patrick McHardy wrote:
>> I don't like the NF_CT_PER_BUCKET constant. First of all, each
>> conntrack is hashed twice, so it's really only half of the average
>> number of conntracks per bucket. Secondly, it's only a default, and
>> many people use nf_conntrack_max = nf_conntrack_htable_size / 2, so
>> using this constant for early_drop seems wrong.
>>
>> Perhaps make it 2 * nf_conntrack_max / nf_conntrack_htable_size
>> or even add a nf_conntrack_eviction_range sysctl.
>>
>
> IMHO the number of conntracks checked in early_drop() has the following restrictions:
> - it should not be too low, to reduce the chances of transmission failures;
> - it should be limited by some reasonable value, to prevent long check delays.

Agreed.

> Also I believe it makes sense to keep it constant (how about the name NF_CT_EVICTION?)
> so that all nodes behave the same way. However, I strongly doubt that anybody will
> want to change this value. Do you think it is really required?
>

I don't know. The current behaviour will scan 16 entries on average.
For people manually tuning their hash to saner settings it will scan
a single entry, so we already cover quite a wide range of values.
A single entry with sane hash settings is too little IMO; maybe use
some middle ground: make it 8 by default, as you did, and rename the
constant. NF_CT_EVICTION_RANGE sounds fine.


2007-06-27 08:46:22

by Vasily Averin

Subject: [NETFILTER] early_drop() improvement (v4)

When the number of conntracks reaches the nf_conntrack_max limit, early_drop()
tries to free one of the conntracks already in use. If it cannot find any
conntrack that may be freed, transmission errors follow.
In the current implementation the conntracks are searched in one hash bucket
only. This has some drawbacks: if that hash bucket is empty, we have no chance
of finding anything; on the other hand, the bucket can contain a huge number of
conntracks, and checking it can take a long time.
The proposed patch limits the number of checked conntracks and allows the
search to continue into other hash buckets. As a result the search has the same
chance of freeing a conntrack in every case, and the check cannot lead to long
delays.

Signed-off-by: Vasily Averin <[email protected]>

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 7a15e30..0540a88 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -526,7 +526,7 @@ EXPORT_SYMBOL_GPL(nf_conntrack_tuple_taken);

/* There's a small race here where we may free a just-assured
connection. Too bad: we're in trouble anyway. */
-static int early_drop(struct list_head *chain)
+static int __early_drop(struct list_head *chain, unsigned int *cnt)
{
/* Traverse backwards: gives us oldest, which is roughly LRU */
struct nf_conntrack_tuple_hash *h;
@@ -541,6 +541,8 @@ static int early_drop(struct list_head *chain)
atomic_inc(&ct->ct_general.use);
break;
}
+ if (!--(*cnt))
+ break;
}
read_unlock_bh(&nf_conntrack_lock);

@@ -556,6 +558,25 @@ static int early_drop(struct list_head *chain)
return dropped;
}

+#define NF_CT_EVICTION_RANGE 8U
+
+static int early_drop(const struct nf_conntrack_tuple *orig)
+{
+ unsigned int i, hash, cnt;
+ int ret = 0;
+
+ hash = hash_conntrack(orig);
+ cnt = NF_CT_EVICTION_RANGE;
+
+ for (i = 0; i < nf_conntrack_htable_size; i++) {
+ ret = __early_drop(&nf_conntrack_hash[hash], &cnt);
+ if (ret || !cnt)
+ break;
+ hash++; hash %= nf_conntrack_htable_size;
+ }
+ return ret;
+}
+
static struct nf_conn *
__nf_conntrack_alloc(const struct nf_conntrack_tuple *orig,
const struct nf_conntrack_tuple *repl,
@@ -575,9 +596,7 @@ __nf_conntrack_alloc(const struct nf_conntrack_tuple *orig,

if (nf_conntrack_max
&& atomic_read(&nf_conntrack_count) > nf_conntrack_max) {
- unsigned int hash = hash_conntrack(orig);
- /* Try dropping from this hash chain. */
- if (!early_drop(&nf_conntrack_hash[hash])) {
+ if (!early_drop(orig)) {
atomic_dec(&nf_conntrack_count);
if (net_ratelimit())
printk(KERN_WARNING


2007-06-27 08:52:49

by Patrick McHardy

Subject: Re: [NETFILTER] early_drop() improvement (v4)

Vasily Averin wrote:
> When the number of conntracks reaches the nf_conntrack_max limit, early_drop()
> tries to free one of the conntracks already in use. If it cannot find any
> conntrack that may be freed, transmission errors follow.
> In the current implementation the conntracks are searched in one hash bucket
> only. This has some drawbacks: if that hash bucket is empty, we have no chance
> of finding anything; on the other hand, the bucket can contain a huge number of
> conntracks, and checking it can take a long time.
> The proposed patch limits the number of checked conntracks and allows the
> search to continue into other hash buckets. As a result the search has the same
> chance of freeing a conntrack in every case, and the check cannot lead to long
> delays.


Thanks Vasily. I have some patches queued to convert all conntrack
hashes to hlists, which conflict with your patches. They need a bit
more work, I'll integrate your changes on top of them once I'm done.

BTW, I played around with your last patch yesterday and it shows
a big improvement when flooding the machine with new connections.
Previously about 5% of the (valid) new connections would get
dropped, with your patch not a single one :)

2007-06-27 12:04:52

by Patrick McHardy

Subject: Re: [NETFILTER] early_drop() improvement (v4)

Patrick McHardy wrote:
> Vasily Averin wrote:
>
>>When the number of conntracks reaches the nf_conntrack_max limit, early_drop()
>>tries to free one of the conntracks already in use. If it cannot find any
>>conntrack that may be freed, transmission errors follow.
>>In the current implementation the conntracks are searched in one hash bucket
>>only. This has some drawbacks: if that hash bucket is empty, we have no chance
>>of finding anything; on the other hand, the bucket can contain a huge number of
>>conntracks, and checking it can take a long time.
>>The proposed patch limits the number of checked conntracks and allows the
>>search to continue into other hash buckets. As a result the search has the same
>>chance of freeing a conntrack in every case, and the check cannot lead to long
>>delays.
>
>
>
> Thanks Vasily. I have some patches queued to convert all conntrack
> hashes to hlists, which conflict with your patches. They need a bit
> more work, I'll integrate your changes on top of them once I'm done.


I've added this patch to my tree at

http://people.netfilter.org/kaber/nf-2.6.23.git/

I've joined the two loops from your patch since that avoids an
otherwise useless function and doesn't take the lock up to 8
times in a row.



2007-06-27 12:29:50

by Vasily Averin

Subject: Re: [NETFILTER] early_drop() improvement (v4)

Patrick McHardy wrote:
> + for (i = 0; i < NF_CT_EVICTION_RANGE; i++) {
> + hlist_for_each_entry(h, n, &nf_conntrack_hash[hash], hnode) {
> + tmp = nf_ct_tuplehash_to_ctrack(h);
> + if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
> + ct = tmp;
> + }
> + if (ct) {
> + atomic_inc(&ct->ct_general.use);
> + break;
> + }
> + hash = (hash + 1) % nf_conntrack_htable_size;

This is incorrect: we should count the number of checked _conntracks_, but you
count the number of hash buckets. I.e. "i" should be incremented/checked inside
the nested loop.

Thank you,
Vasily Averin

2007-06-27 12:52:31

by Patrick McHardy

Subject: Re: [NETFILTER] early_drop() improvement (v4)

[NETFILTER]: nf_conntrack: early_drop improvement

When the maximum number of conntrack entries is reached and a new
one needs to be allocated, conntrack tries to drop an unassured
connection from the same hash bucket the new conntrack would hash
to. Since with a properly sized hash the average number of entries
per bucket is 1, the chances of actually finding one are not very
good. This patch increases those chances by walking over the hash
until 8 entries are checked.

Based on patch by Vasily Averin <[email protected]>.

Signed-off-by: Patrick McHardy <[email protected]>

---
commit df9f4fc41d7d6a7a51d2fe4b28db2557cb9a0d05
tree 8beb115ce12126b28ce3e5eb3f95b36b71462ea5
parent 665d98d03473cab252830129f414e1b38fb2b038
author Patrick McHardy <[email protected]> Wed, 27 Jun 2007 14:51:38 +0200
committer Patrick McHardy <[email protected]> Wed, 27 Jun 2007 14:51:38 +0200

net/netfilter/nf_conntrack_core.c | 23 +++++++++++++++--------
1 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index d7e62ad..bbb52e5 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -377,21 +377,29 @@ nf_conntrack_tuple_taken(const struct nf_conntrack_tuple *tuple,
}
EXPORT_SYMBOL_GPL(nf_conntrack_tuple_taken);

+#define NF_CT_EVICTION_RANGE 8
+
/* There's a small race here where we may free a just-assured
connection. Too bad: we're in trouble anyway. */
-static int early_drop(struct hlist_head *chain)
+static int early_drop(unsigned int hash)
{
/* Use oldest entry, which is roughly LRU */
struct nf_conntrack_tuple_hash *h;
struct nf_conn *ct = NULL, *tmp;
struct hlist_node *n;
- int dropped = 0;
+ unsigned int i;
+ int dropped = 0, cnt = NF_CT_EVICTION_RANGE;

read_lock_bh(&nf_conntrack_lock);
- hlist_for_each_entry(h, n, chain, hnode) {
- tmp = nf_ct_tuplehash_to_ctrack(h);
- if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
- ct = tmp;
+ for (i = 0; i < nf_conntrack_htable_size; i++) {
+ hlist_for_each_entry(h, n, &nf_conntrack_hash[hash], hnode) {
+ tmp = nf_ct_tuplehash_to_ctrack(h);
+ if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
+ ct = tmp;
+ if (--cnt <= 0)
+ break;
+ }
+ hash = (hash + 1) % nf_conntrack_htable_size;
}
if (ct)
atomic_inc(&ct->ct_general.use);
@@ -425,8 +433,7 @@ struct nf_conn *nf_conntrack_alloc(const struct nf_conntrack_tuple *orig,
if (nf_conntrack_max
&& atomic_read(&nf_conntrack_count) > nf_conntrack_max) {
unsigned int hash = hash_conntrack(orig);
- /* Try dropping from this hash chain. */
- if (!early_drop(&nf_conntrack_hash[hash])) {
+ if (!early_drop(hash)) {
atomic_dec(&nf_conntrack_count);
if (net_ratelimit())
printk(KERN_WARNING



2007-06-27 13:02:36

by Vasily Averin

Subject: Re: [NETFILTER] early_drop() improvement (v4)

Patrick McHardy wrote:
> Vasily Averin wrote:
>> it is incorrect,
>> We should count the number of checked _conntracks_, but you count the number of
>> hash buckets. I.e "i" should be incremented/checked inside the nested loop.
>
>
> I misunderstood your patch then. This one should be better.
> +static int early_drop(unsigned int hash)
> {
> /* Use oldest entry, which is roughly LRU */
> struct nf_conntrack_tuple_hash *h;
> struct nf_conn *ct = NULL, *tmp;
> struct hlist_node *n;
> - int dropped = 0;
> + unsigned int i;
> + int dropped = 0, cnt = NF_CT_EVICTION_RANGE;
>
> read_lock_bh(&nf_conntrack_lock);
> - hlist_for_each_entry(h, n, chain, hnode) {
> - tmp = nf_ct_tuplehash_to_ctrack(h);
> - if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
> - ct = tmp;
> + for (i = 0; i < nf_conntrack_htable_size; i++) {
> + hlist_for_each_entry(h, n, &nf_conntrack_hash[hash], hnode) {
> + tmp = nf_ct_tuplehash_to_ctrack(h);
> + if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
> + ct = tmp;
> + if (--cnt <= 0)
> + break;
> + }
> + hash = (hash + 1) % nf_conntrack_htable_size;
> }

This is incorrect again: when cnt reaches 0 you should break out of both loops.

thank you,
Vasily Averin

2007-06-27 13:18:49

by Patrick McHardy

Subject: Re: [NETFILTER] early_drop() improvement (v4)

[NETFILTER]: nf_conntrack: early_drop improvement

When the maximum number of conntrack entries is reached and a new
one needs to be allocated, conntrack tries to drop an unassured
connection from the same hash bucket the new conntrack would hash
to. Since with a properly sized hash the average number of entries
per bucket is 1, the chances of actually finding one are not very
good. This patch increases those chances by walking over the hash
until 8 entries are checked.

Based on patch by Vasily Averin <[email protected]>.

Signed-off-by: Patrick McHardy <[email protected]>

---
commit 047d8f088a71f30d3042cc7615a7a25aa60a668b
tree c83c4a356caf608fe21a646c24678bef74023270
parent 665d98d03473cab252830129f414e1b38fb2b038
author Patrick McHardy <[email protected]> Wed, 27 Jun 2007 15:17:17 +0200
committer Patrick McHardy <[email protected]> Wed, 27 Jun 2007 15:17:17 +0200

net/netfilter/nf_conntrack_core.c | 26 ++++++++++++++++++--------
1 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index d7e62ad..2d5c10f 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -377,22 +377,33 @@ nf_conntrack_tuple_taken(const struct nf_conntrack_tuple *tuple,
}
EXPORT_SYMBOL_GPL(nf_conntrack_tuple_taken);

+#define NF_CT_EVICTION_RANGE 8
+
/* There's a small race here where we may free a just-assured
connection. Too bad: we're in trouble anyway. */
-static int early_drop(struct hlist_head *chain)
+static int early_drop(unsigned int hash)
{
/* Use oldest entry, which is roughly LRU */
struct nf_conntrack_tuple_hash *h;
struct nf_conn *ct = NULL, *tmp;
struct hlist_node *n;
- int dropped = 0;
+ unsigned int i;
+ int dropped = 0, cnt = NF_CT_EVICTION_RANGE;

read_lock_bh(&nf_conntrack_lock);
- hlist_for_each_entry(h, n, chain, hnode) {
- tmp = nf_ct_tuplehash_to_ctrack(h);
- if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
- ct = tmp;
+ for (i = 0; i < nf_conntrack_htable_size; i++) {
+ hlist_for_each_entry(h, n, &nf_conntrack_hash[hash], hnode) {
+ tmp = nf_ct_tuplehash_to_ctrack(h);
+ if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
+ ct = tmp;
+ if (--cnt <= 0)
+ goto stop;
+ }
+ if (ct)
+ break;
+ hash = (hash + 1) & (nf_conntrack_htable_size - 1);
}
+stop:
if (ct)
atomic_inc(&ct->ct_general.use);
read_unlock_bh(&nf_conntrack_lock);
@@ -425,8 +436,7 @@ struct nf_conn *nf_conntrack_alloc(const struct nf_conntrack_tuple *orig,
if (nf_conntrack_max
&& atomic_read(&nf_conntrack_count) > nf_conntrack_max) {
unsigned int hash = hash_conntrack(orig);
- /* Try dropping from this hash chain. */
- if (!early_drop(&nf_conntrack_hash[hash])) {
+ if (!early_drop(hash)) {
atomic_dec(&nf_conntrack_count);
if (net_ratelimit())
printk(KERN_WARNING



2007-06-27 13:26:18

by Vasily Averin

Subject: Re: [NETFILTER] early_drop() improvement (v4)

Patrick McHardy wrote:
> Vasily Averin wrote:
>> Patrick McHardy wrote:
> -static int early_drop(struct hlist_head *chain)
> +static int early_drop(unsigned int hash)
> {
> /* Use oldest entry, which is roughly LRU */
> struct nf_conntrack_tuple_hash *h;
> struct nf_conn *ct = NULL, *tmp;
> struct hlist_node *n;
> - int dropped = 0;
> + unsigned int i;
> + int dropped = 0, cnt = NF_CT_EVICTION_RANGE;
>
> read_lock_bh(&nf_conntrack_lock);
> - hlist_for_each_entry(h, n, chain, hnode) {
> - tmp = nf_ct_tuplehash_to_ctrack(h);
> - if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
> - ct = tmp;
> + for (i = 0; i < nf_conntrack_htable_size; i++) {
> + hlist_for_each_entry(h, n, &nf_conntrack_hash[hash], hnode) {
> + tmp = nf_ct_tuplehash_to_ctrack(h);
> + if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
> + ct = tmp;

This is incorrect: you should break out of the nested loop here too.

> + if (--cnt <= 0)
> + goto stop;
> + }
> + if (ct)
> + break;
> + hash = (hash + 1) & (nf_conntrack_htable_size - 1);
> }
> +stop:
> if (ct)
> atomic_inc(&ct->ct_general.use);
> read_unlock_bh(&nf_conntrack_lock);

2007-06-27 13:29:17

by Patrick McHardy

Subject: Re: [NETFILTER] early_drop() improvement (v4)

Vasily Averin wrote:
> Patrick McHardy wrote:
>
>>+ for (i = 0; i < nf_conntrack_htable_size; i++) {
>>+ hlist_for_each_entry(h, n, &nf_conntrack_hash[hash], hnode) {
>>+ tmp = nf_ct_tuplehash_to_ctrack(h);
>>+ if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
>>+ ct = tmp;
>
>
> It is incorrect: you should break nested loop here too.


No, as I said, we want the last entry of the chain.