2006-05-07 02:25:20

by Jesper Juhl

[permalink] [raw]
Subject: [PATCH] fix mem-leak in netfilter

The Coverity checker spotted that we may leak 'hold' in
net/ipv4/netfilter/ipt_recent.c::checkentry() when the following
is true :
if (!curr_table->status_proc) {
...
if(!curr_table) {
...
return 0; <-- here we leak.
Simply moving an existing vfree(hold); up a bit avoids the possible leak.


(please keep me on CC when replying since I'm not subscribed
to netfilter-devel)


Signed-off-by: Jesper Juhl <[email protected]>
---

net/ipv4/netfilter/ipt_recent.c | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.17-rc3-git12-orig/net/ipv4/netfilter/ipt_recent.c 2006-05-07 03:25:38.000000000 +0200
+++ linux-2.6.17-rc3-git12/net/ipv4/netfilter/ipt_recent.c 2006-05-07 04:16:26.000000000 +0200
@@ -821,6 +821,7 @@ checkentry(const char *tablename,
/* Create our proc 'status' entry. */
curr_table->status_proc = create_proc_entry(curr_table->name, ip_list_perms, proc_net_ipt_recent);
if (!curr_table->status_proc) {
+ vfree(hold);
printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for /proc entry.\n");
/* Destroy the created table */
spin_lock_bh(&recent_lock);
@@ -845,7 +846,6 @@ checkentry(const char *tablename,
spin_unlock_bh(&recent_lock);
vfree(curr_table->time_info);
vfree(curr_table->hash_table);
- vfree(hold);
vfree(curr_table->table);
vfree(curr_table);
return 0;



2006-05-07 09:36:57

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

On Sun, May 07, 2006 at 04:26:10AM +0200, Jesper Juhl wrote:
> The Coverity checker spotted that we may leak 'hold' in
> net/ipv4/netfilter/ipt_recent.c::checkentry() when the following
> is true :
> if (!curr_table->status_proc) {
> ...
> if(!curr_table) {
> ...
> return 0; <-- here we leak.
> Simply moving an existing vfree(hold); up a bit avoids the possible leak.
>
>
> (please keep me on CC when replying since I'm not subscribed
> to netfilter-devel)
>
>
> Signed-off-by: Jesper Juhl <[email protected]>
> ---
>
> net/ipv4/netfilter/ipt_recent.c | 2 +-
> 1 files changed, 1 insertion(+), 1 deletion(-)
>
> --- linux-2.6.17-rc3-git12-orig/net/ipv4/netfilter/ipt_recent.c 2006-05-07 03:25:38.000000000 +0200
> +++ linux-2.6.17-rc3-git12/net/ipv4/netfilter/ipt_recent.c 2006-05-07 04:16:26.000000000 +0200
> @@ -821,6 +821,7 @@ checkentry(const char *tablename,
> /* Create our proc 'status' entry. */
> curr_table->status_proc = create_proc_entry(curr_table->name, ip_list_perms, proc_net_ipt_recent);
> if (!curr_table->status_proc) {
> + vfree(hold);
> printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for /proc entry.\n");
> /* Destroy the created table */
> spin_lock_bh(&recent_lock);
> @@ -845,7 +846,6 @@ checkentry(const char *tablename,
> spin_unlock_bh(&recent_lock);
> vfree(curr_table->time_info);
> vfree(curr_table->hash_table);
> - vfree(hold);
> vfree(curr_table->table);
> vfree(curr_table);
> return 0;

Seems valid for 2.4.32 too. I'm queuing it up for Marcelo.

Regards,
Willy

2006-05-07 22:43:00

by Grant Coady

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

On Sun, 7 May 2006 11:36:40 +0200, Willy Tarreau <[email protected]> wrote:

>On Sun, May 07, 2006 at 04:26:10AM +0200, Jesper Juhl wrote:
>> The Coverity checker spotted that we may leak 'hold' in
>> net/ipv4/netfilter/ipt_recent.c::checkentry() when the following
>> is true :
>> if (!curr_table->status_proc) {
>> ...
>> if(!curr_table) {
>> ...
>> return 0; <-- here we leak.
>> Simply moving an existing vfree(hold); up a bit avoids the possible leak.
>>
>>
>> (please keep me on CC when replying since I'm not subscribed
>> to netfilter-devel)
>>
>>
>> Signed-off-by: Jesper Juhl <[email protected]>
>> ---
>>
>> net/ipv4/netfilter/ipt_recent.c | 2 +-
>> 1 files changed, 1 insertion(+), 1 deletion(-)
>>
>> --- linux-2.6.17-rc3-git12-orig/net/ipv4/netfilter/ipt_recent.c 2006-05-07 03:25:38.000000000 +0200
>> +++ linux-2.6.17-rc3-git12/net/ipv4/netfilter/ipt_recent.c 2006-05-07 04:16:26.000000000 +0200
>> @@ -821,6 +821,7 @@ checkentry(const char *tablename,
>> /* Create our proc 'status' entry. */
>> curr_table->status_proc = create_proc_entry(curr_table->name, ip_list_perms, proc_net_ipt_recent);
>> if (!curr_table->status_proc) {
>> + vfree(hold);
>> printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for /proc entry.\n");
>> /* Destroy the created table */
>> spin_lock_bh(&recent_lock);
>> @@ -845,7 +846,6 @@ checkentry(const char *tablename,
>> spin_unlock_bh(&recent_lock);
>> vfree(curr_table->time_info);
>> vfree(curr_table->hash_table);
>> - vfree(hold);
>> vfree(curr_table->table);
>> vfree(curr_table);
>> return 0;
>
>Seems valid for 2.4.32 too. I'm queuing it up for Marcelo.

When CONFIG_PROC_FS is not set the function looks like it may exit
without doing the vfree()s for stuff allocated above the #ifdef
CONFIG_PROC_FS.

I wonder if the larger view of the function is also correct? The
coding style is difficult to work with as my terminal only goes to
156 characters wide ;)

Grant.

2006-05-08 05:08:00

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

Hi Grant,

On Mon, May 08, 2006 at 08:42:53AM +1000, Grant Coady wrote:
> On Sun, 7 May 2006 11:36:40 +0200, Willy Tarreau <[email protected]> wrote:
>
> >On Sun, May 07, 2006 at 04:26:10AM +0200, Jesper Juhl wrote:
> >> The Coverity checker spotted that we may leak 'hold' in
> >> net/ipv4/netfilter/ipt_recent.c::checkentry() when the following
> >> is true :
> >> if (!curr_table->status_proc) {
> >> ...
> >> if(!curr_table) {
> >> ...
> >> return 0; <-- here we leak.
> >> Simply moving an existing vfree(hold); up a bit avoids the possible leak.
> >>
> >>
> >> (please keep me on CC when replying since I'm not subscribed
> >> to netfilter-devel)
> >>
> >>
> >> Signed-off-by: Jesper Juhl <[email protected]>
> >> ---
> >>
> >> net/ipv4/netfilter/ipt_recent.c | 2 +-
> >> 1 files changed, 1 insertion(+), 1 deletion(-)
> >>
> >> --- linux-2.6.17-rc3-git12-orig/net/ipv4/netfilter/ipt_recent.c 2006-05-07 03:25:38.000000000 +0200
> >> +++ linux-2.6.17-rc3-git12/net/ipv4/netfilter/ipt_recent.c 2006-05-07 04:16:26.000000000 +0200
> >> @@ -821,6 +821,7 @@ checkentry(const char *tablename,
> >> /* Create our proc 'status' entry. */
> >> curr_table->status_proc = create_proc_entry(curr_table->name, ip_list_perms, proc_net_ipt_recent);
> >> if (!curr_table->status_proc) {
> >> + vfree(hold);
> >> printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for /proc entry.\n");
> >> /* Destroy the created table */
> >> spin_lock_bh(&recent_lock);
> >> @@ -845,7 +846,6 @@ checkentry(const char *tablename,
> >> spin_unlock_bh(&recent_lock);
> >> vfree(curr_table->time_info);
> >> vfree(curr_table->hash_table);
> >> - vfree(hold);
> >> vfree(curr_table->table);
> >> vfree(curr_table);
> >> return 0;
> >
> >Seems valid for 2.4.32 too. I'm queuing it up for Marcelo.
>
> When CONFIG_PROC_FS is not set the function looks like it may exit
> without doing the vfree()s for stuff allocated above the #ifdef
> CONFIG_PROC_FS.

At first, I thought you were right. But after a night long rest,
I'm doubting. In fact, I'm not even sure that we can free 'hold' :

753 for(c = 0; c < ip_list_tot; c++) {
754 curr_table->table[c].last_pkts = hold + c*ip_pkt_list_tot;
755 }
756

So it seems like the vfree(hold) must not be performed if curr_table
is not unlinked. If this is the case, even Jesper's patch might be
wrong. Otherwise, vfree(hold) should be called unconditionnally
after #endif CONFIG_PROC_FS.

> I wonder if the larger view of the function is also correct? The
> coding style is difficult to work with as my terminal only goes to
> 156 characters wide ;)

Agreed ! Reading this code is really painful. Even after one long
night, I have huge trouble understanding it. Here are some good
excerpts, that we might honnestly call 'obfuscation' :

799 while( (last_table = find_table) && strncmp(info->name,find_table->name,IPT_RECENT_NAME_LEN) && (find_table = find_table->next) );
836 while( strncmp(info->name,curr_table->name,IPT_RECENT_NAME_LEN) && (last_table = curr_table) && (curr_table = curr_table->next) );
844 if(last_table) last_table->next = curr_table->next; else r_tables = curr_table->next;

I wonder how such unmaintainable code has been merged in the first
place. Obviously, Davem has never seen it ! He has already annoyed
me for 81-chars wide lines because his terminal is 80 columns. Or
he has given up from the very beginning. The fact is it's a tool
which has found a potential memory leak.

> Grant.

Regards,
Willy

2006-05-08 05:43:38

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

From: Willy Tarreau <[email protected]>
Date: Mon, 8 May 2006 07:07:48 +0200

> I wonder how such unmaintainable code has been merged in the first
> place. Obviously, Davem has never seen it !

Oh I've seen ipt_recent.c, it's one huge pile of trash
that needs to be rewritten. It has all sorts of problems.

This is well understood on the netfilter-devel list and
I am to understand that someone has taken up the task to
finally rewrite the thing.

2006-05-08 08:36:24

by Amin Azez

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

David S. Miller wrote:
> From: Willy Tarreau <[email protected]>
> Date: Mon, 8 May 2006 07:07:48 +0200
>
>> I wonder how such unmaintainable code has been merged in the first
>> place. Obviously, Davem has never seen it !
>
> Oh I've seen ipt_recent.c, it's one huge pile of trash
> that needs to be rewritten. It has all sorts of problems.
>
> This is well understood on the netfilter-devel list and
> I am to understand that someone has taken up the task to
> finally rewrite the thing.


Is that [email protected] ?
...just checking... he seemed to volunteer in December last year but
Stephen Frost has been taking recent questions.

Sam

2006-05-08 09:09:31

by Juergen Kreileder

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

Amin Azez wrote:
> David S. Miller wrote:
>> From: Willy Tarreau <[email protected]>
>> Date: Mon, 8 May 2006 07:07:48 +0200
>>
>>> I wonder how such unmaintainable code has been merged in the first
>>> place. Obviously, Davem has never seen it !
>> Oh I've seen ipt_recent.c, it's one huge pile of trash
>> that needs to be rewritten. It has all sorts of problems.
>>
>> This is well understood on the netfilter-devel list and
>> I am to understand that someone has taken up the task to
>> finally rewrite the thing.
>
>
> Is that [email protected] ?

Please use [email protected] (@empolis.com is just an address at
a client's site).

> ...just checking... he seemed to volunteer in December

but not for a rewrite. Anyhow, if somebody is planning to do that
I'll gladly help.

> last year but Stephen Frost has been taking recent questions.


Juergen

--
Juergen Kreileder, Blackdown Java-Linux Team
http://blog.blackdown.de/

2006-05-12 07:40:48

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

David S. Miller wrote:
> From: Willy Tarreau <[email protected]>
> Date: Mon, 8 May 2006 07:07:48 +0200
>
>
>>I wonder how such unmaintainable code has been merged in the first
>>place. Obviously, Davem has never seen it !
>
>
> Oh I've seen ipt_recent.c, it's one huge pile of trash
> that needs to be rewritten. It has all sorts of problems.
>
> This is well understood on the netfilter-devel list and
> I am to understand that someone has taken up the task to
> finally rewrite the thing.


I haven't seen any cleanup patches so far, so I think I'm
going to start my nth try at cleaning up this mess.
Unfortunately its even immune to Lindent ..

2006-05-12 11:11:05

by Jesper Juhl

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

On 5/12/06, Patrick McHardy <[email protected]> wrote:
> David S. Miller wrote:
> > From: Willy Tarreau <[email protected]>
> > Date: Mon, 8 May 2006 07:07:48 +0200
> >
> >
> >>I wonder how such unmaintainable code has been merged in the first
> >>place. Obviously, Davem has never seen it !
> >
> >
> > Oh I've seen ipt_recent.c, it's one huge pile of trash
> > that needs to be rewritten. It has all sorts of problems.
> >
> > This is well understood on the netfilter-devel list and
> > I am to understand that someone has taken up the task to
> > finally rewrite the thing.
>
>
> I haven't seen any cleanup patches so far, so I think I'm
> going to start my nth try at cleaning up this mess.
> Unfortunately its even immune to Lindent ..
>

If you get too fed up with it, let me know, and I'll give it a go as well.

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-05-12 11:33:23

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

diff --git a/net/ipv4/netfilter/ipt_recent.c b/net/ipv4/netfilter/ipt_recent.c
index 1438432..b8850a2 100644
--- a/net/ipv4/netfilter/ipt_recent.c
+++ b/net/ipv4/netfilter/ipt_recent.c
@@ -438,11 +438,15 @@ #endif
(!r_list[hash_table[hash_result]].ttl || r_list[hash_table[hash_result]].ttl == ttl))) {
/* Collision in hash table */
hash_result = (hash_result + 1) % ip_list_hash_size;
+ if (hash_result == orig_hash_result)
+ break;
}
} else {
while(hash_table[hash_result] != -1 && r_list[hash_table[hash_result]].addr != addr) {
/* Collision in hash table */
hash_result = (hash_result + 1) % ip_list_hash_size;
+ if (hash_result == orig_hash_result)
+ break;
}
}


Attachments:
x (712.00 B)

2006-05-12 12:13:34

by Jesper Juhl

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

On 5/12/06, Patrick McHardy <[email protected]> wrote:
> Jesper Juhl wrote:
> > On 5/12/06, Patrick McHardy <[email protected]> wrote:
> >
> >> I haven't seen any cleanup patches so far, so I think I'm
> >> going to start my nth try at cleaning up this mess.
> >> Unfortunately its even immune to Lindent ..
> >>
> >
> > If you get too fed up with it, let me know, and I'll give it a go as well.
>
> Thanks, I'm about half-way through (and about to kill someone),
> just started with the biggest pile of crap (the match function)
> and already noticed a possible endless loop within the first
> couple of lines.
>
> Unfortunately this stuff is so unreadable that I'm not exactly
> sure if the loop really won't terminate, an extra pair of eyes
> would be appreciated.
>

Sure thing.

I don't have time to look at it today (friends comming over for
dinner), but I should have plenty of time for it tomorrow. So, if you
could send me your patch once you are done for the day, then I'll look
it over and see if I can find anything to add on top of your work (or
have anything to comment on) and bounce it back to you sometime during
tomorrow.


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-05-12 12:41:12

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

On Fri, May 12, 2006 at 02:13:32PM +0200, Jesper Juhl wrote:
> On 5/12/06, Patrick McHardy <[email protected]> wrote:
> >Jesper Juhl wrote:
> >> On 5/12/06, Patrick McHardy <[email protected]> wrote:
> >>
> >>> I haven't seen any cleanup patches so far, so I think I'm
> >>> going to start my nth try at cleaning up this mess.
> >>> Unfortunately its even immune to Lindent ..
> >>>
> >>
> >> If you get too fed up with it, let me know, and I'll give it a go as
> >well.
> >
> >Thanks, I'm about half-way through (and about to kill someone),
> >just started with the biggest pile of crap (the match function)
> >and already noticed a possible endless loop within the first
> >couple of lines.
> >
> >Unfortunately this stuff is so unreadable that I'm not exactly
> >sure if the loop really won't terminate, an extra pair of eyes
> >would be appreciated.
> >
>
> Sure thing.
>
> I don't have time to look at it today (friends comming over for
> dinner), but I should have plenty of time for it tomorrow. So, if you
> could send me your patch once you are done for the day, then I'll look
> it over and see if I can find anything to add on top of your work (or
> have anything to comment on) and bounce it back to you sometime during
> tomorrow.

Please post it to the list, this coding style needs far more than two
pairs of eyes to be fixed. It has already discouraged several people,
the more we will be, the least pain we will feel :-)

Cheers
Willy

2006-05-12 12:49:34

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

Willy Tarreau wrote:
> Please post it to the list, this coding style needs far more than two
> pairs of eyes to be fixed. It has already discouraged several people,
> the more we will be, the least pain we will feel :-)

:)

I actually just got to fed up with this garbage (once again) and started
rewriting it from scratch, which looks like a lot less pain. I'll look
into these loops again for 2.4 and 2.6.17 once I'm done doing that.

2006-05-15 08:25:40

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

[NETFILTER]: Replace ipt_recent module

Replace the totally unmaintainable ipt_recent module by a rewritten
version that should be fully compatible.

Signed-off-by: Patrick McHardy <[email protected]>

---
commit e84375630c59ad10eac6235f32ba4beb6921ff9e
tree d58afb0f5e552b3fefae5144414346a43ec9eeec
parent d8c3291c73b958243b33f8509d4507e76dafd055
author Patrick McHardy <[email protected]> Mon, 15 May 2006 10:10:20 +0200
committer Patrick McHardy <[email protected]> Mon, 15 May 2006 10:10:20 +0200

net/ipv4/netfilter/ipt_recent.c | 1276 ++++++++++++---------------------------
1 files changed, 382 insertions(+), 894 deletions(-)

diff --git a/net/ipv4/netfilter/ipt_recent.c b/net/ipv4/netfilter/ipt_recent.c
index 1438432..9c844d8 100644
--- a/net/ipv4/netfilter/ipt_recent.c
+++ b/net/ipv4/netfilter/ipt_recent.c
@@ -1,1007 +1,495 @@
-/* Kernel module to check if the source address has been seen recently. */
-/* Copyright 2002-2003, Stephen Frost, 2.5.x port by [email protected] */
-/* Author: Stephen Frost <[email protected]> */
-/* Project Page: http://snowman.net/projects/ipt_recent/ */
-/* This software is distributed under the terms of the GPL, Version 2 */
-/* This copyright does not cover user programs that use kernel services
- * by normal system calls. */
-
-#include <linux/module.h>
-#include <linux/skbuff.h>
+/*
+ * Copyright (c) 2006 Patrick McHardy <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This is a replacement of the old ipt_recent module, which carried the
+ * following copyright notice:
+ *
+ * Author: Stephen Frost <[email protected]>
+ * Copyright 2002-2003, Stephen Frost, 2.5.x port by [email protected]
+ */
+#include <linux/init.h>
+#include <linux/moduleparam.h>
#include <linux/proc_fs.h>
-#include <linux/spinlock.h>
-#include <linux/interrupt.h>
-#include <asm/uaccess.h>
+#include <linux/seq_file.h>
+#include <linux/string.h>
#include <linux/ctype.h>
-#include <linux/ip.h>
-#include <linux/vmalloc.h>
-#include <linux/moduleparam.h>
+#include <linux/list.h>
+#include <linux/random.h>
+#include <linux/jhash.h>
+#include <linux/bitops.h>
+#include <linux/skbuff.h>
+#include <linux/inet.h>

#include <linux/netfilter_ipv4/ip_tables.h>
#include <linux/netfilter_ipv4/ipt_recent.h>

-#undef DEBUG
-#define HASH_LOG 9
+MODULE_AUTHOR("Patrick McHardy <[email protected]>");
+MODULE_DESCRIPTION("IP tables recently seen matching module");
+MODULE_LICENSE("GPL");

-/* Defaults, these can be overridden on the module command-line. */
static unsigned int ip_list_tot = 100;
static unsigned int ip_pkt_list_tot = 20;
static unsigned int ip_list_hash_size = 0;
static unsigned int ip_list_perms = 0644;
-#ifdef DEBUG
-static int debug = 1;
-#endif
-
-static char version[] =
-KERN_INFO RECENT_NAME " " RECENT_VER ": Stephen Frost <[email protected]>. http://snowman.net/projects/ipt_recent/\n";
-
-MODULE_AUTHOR("Stephen Frost <[email protected]>");
-MODULE_DESCRIPTION("IP tables recently seen matching module " RECENT_VER);
-MODULE_LICENSE("GPL");
module_param(ip_list_tot, uint, 0400);
module_param(ip_pkt_list_tot, uint, 0400);
module_param(ip_list_hash_size, uint, 0400);
module_param(ip_list_perms, uint, 0400);
-#ifdef DEBUG
-module_param(debug, bool, 0600);
-MODULE_PARM_DESC(debug,"enable debugging output");
-#endif
-MODULE_PARM_DESC(ip_list_tot,"number of IPs to remember per list");
-MODULE_PARM_DESC(ip_pkt_list_tot,"number of packets per IP to remember");
-MODULE_PARM_DESC(ip_list_hash_size,"size of hash table used to look up IPs");
-MODULE_PARM_DESC(ip_list_perms,"permissions on /proc/net/ipt_recent/* files");
-
-/* Structure of our list of recently seen addresses. */
-struct recent_ip_list {
- u_int32_t addr;
- u_int8_t ttl;
- unsigned long last_seen;
- unsigned long *last_pkts;
- u_int32_t oldest_pkt;
- u_int32_t hash_entry;
- u_int32_t time_pos;
-};
-
-struct time_info_list {
- u_int32_t position;
- u_int32_t time;
+MODULE_PARM_DESC(ip_list_tot, "number of IPs to remember per list");
+MODULE_PARM_DESC(ip_pkt_list_tot, "number of packets per IP to remember");
+MODULE_PARM_DESC(ip_list_hash_size, "size of hash table used to look up IPs");
+MODULE_PARM_DESC(ip_list_perms, "permissions on /proc/net/ipt_recent/* files");
+
+
+struct recent_entry {
+ struct list_head list;
+ u_int32_t addr;
+ u_int8_t ttl;
+ unsigned int index;
+ unsigned int nstamps;
+ unsigned long stamps[0];
};

-/* Structure of our linked list of tables of recent lists. */
-struct recent_ip_tables {
- char name[IPT_RECENT_NAME_LEN];
- int count;
- int time_pos;
- struct recent_ip_list *table;
- struct recent_ip_tables *next;
- spinlock_t list_lock;
- int *hash_table;
- struct time_info_list *time_info;
+struct recent_table {
+ struct list_head list;
+ char name[IPT_RECENT_NAME_LEN];
#ifdef CONFIG_PROC_FS
- struct proc_dir_entry *status_proc;
-#endif /* CONFIG_PROC_FS */
+ struct proc_dir_entry *proc;
+#endif
+ unsigned int refcnt;
+ unsigned int entries;
+ struct list_head iphash[0];
};

-/* Our current list of addresses we have recently seen.
- * Only added to on a --set, and only updated on --set || --update
- */
-static struct recent_ip_tables *r_tables = NULL;
-
-/* We protect r_list with this spinlock so two processors are not modifying
- * the list at the same time.
- */
+static LIST_HEAD(tables);
static DEFINE_SPINLOCK(recent_lock);

#ifdef CONFIG_PROC_FS
-/* Our /proc/net/ipt_recent entry */
-static struct proc_dir_entry *proc_net_ipt_recent = NULL;
+static struct proc_dir_entry *proc_dir;
+static struct file_operations recent_fops;
#endif

-/* Function declaration for later. */
-static int
-match(const struct sk_buff *skb,
- const struct net_device *in,
- const struct net_device *out,
- const struct xt_match *match,
- const void *matchinfo,
- int offset,
- unsigned int protoff,
- int *hotdrop);
-
-/* Function to hash a given address into the hash table of table_size size */
-static int hash_func(unsigned int addr, int table_size)
-{
- int result = 0;
- unsigned int value = addr;
- do { result ^= value; } while((value >>= HASH_LOG));
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": %d = hash_func(%u,%d)\n",
- result & (table_size - 1),
- addr,
- table_size);
-#endif
+static u_int32_t hash_rnd;
+static int hash_rnd_initted;

- return(result & (table_size - 1));
-}
-
-#ifdef CONFIG_PROC_FS
-/* This is the function which produces the output for our /proc output
- * interface which lists each IP address, the last seen time and the
- * other recent times the address was seen.
- */
-
-static int ip_recent_get_info(char *buffer, char **start, off_t offset, int length, int *eof, void *data)
+static u_int32_t recent_entry_hash(u_int32_t addr)
{
- int len = 0, count, last_len = 0, pkt_count;
- off_t pos = 0;
- off_t begin = 0;
- struct recent_ip_tables *curr_table;
-
- curr_table = (struct recent_ip_tables*) data;
-
- spin_lock_bh(&curr_table->list_lock);
- for(count = 0; count < ip_list_tot; count++) {
- if(!curr_table->table[count].addr) continue;
- last_len = len;
- len += sprintf(buffer+len,"src=%u.%u.%u.%u ",NIPQUAD(curr_table->table[count].addr));
- len += sprintf(buffer+len,"ttl: %u ",curr_table->table[count].ttl);
- len += sprintf(buffer+len,"last_seen: %lu ",curr_table->table[count].last_seen);
- len += sprintf(buffer+len,"oldest_pkt: %u ",curr_table->table[count].oldest_pkt);
- len += sprintf(buffer+len,"last_pkts: %lu",curr_table->table[count].last_pkts[0]);
- for(pkt_count = 1; pkt_count < ip_pkt_list_tot; pkt_count++) {
- if(!curr_table->table[count].last_pkts[pkt_count]) break;
- len += sprintf(buffer+len,", %lu",curr_table->table[count].last_pkts[pkt_count]);
- }
- len += sprintf(buffer+len,"\n");
- pos = begin + len;
- if(pos < offset) { len = 0; begin = pos; }
- if(pos > offset + length) { len = last_len; break; }
+ if (!hash_rnd_initted) {
+ get_random_bytes(&hash_rnd, 4);
+ hash_rnd_initted = 1;
}
-
- *start = buffer + (offset - begin);
- len -= (offset - begin);
- if(len > length) len = length;
-
- spin_unlock_bh(&curr_table->list_lock);
- return len;
+ return jhash_1word(addr, hash_rnd) & (ip_list_hash_size - 1);
}

-/* ip_recent_ctrl provides an interface for users to modify the table
- * directly. This allows adding entries, removing entries, and
- * flushing the entire table.
- * This is done by opening up the appropriate table for writing and
- * sending one of:
- * xx.xx.xx.xx -- Add entry to table with current time
- * +xx.xx.xx.xx -- Add entry to table with current time
- * -xx.xx.xx.xx -- Remove entry from table
- * clear -- Flush table, remove all entries
- */
-
-static int ip_recent_ctrl(struct file *file, const char __user *input, unsigned long size, void *data)
+static struct recent_entry *
+recent_entry_lookup(const struct recent_table *table, u_int32_t addr, u_int8_t ttl)
{
- static const u_int32_t max[4] = { 0xffffffff, 0xffffff, 0xffff, 0xff };
- u_int32_t val;
- int base, used = 0;
- char c, *cp;
- union iaddr {
- uint8_t bytes[4];
- uint32_t word;
- } res;
- uint8_t *pp = res.bytes;
- int digit;
-
- char buffer[20];
- int len, check_set = 0, count;
- u_int32_t addr = 0;
- struct sk_buff *skb;
- struct ipt_recent_info *info;
- struct recent_ip_tables *curr_table;
-
- curr_table = (struct recent_ip_tables*) data;
-
- if(size > 20) len = 20; else len = size;
-
- if(copy_from_user(buffer,input,len)) return -EFAULT;
-
- if(len < 20) buffer[len] = '\0';
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": ip_recent_ctrl len: %d, input: `%.20s'\n",len,buffer);
-#endif
-
- cp = buffer;
- while(isspace(*cp)) { cp++; used++; if(used >= len-5) return used; }
-
- /* Check if we are asked to flush the entire table */
- if(!memcmp(cp,"clear",5)) {
- used += 5;
- spin_lock_bh(&curr_table->list_lock);
- curr_table->time_pos = 0;
- for(count = 0; count < ip_list_hash_size; count++) {
- curr_table->hash_table[count] = -1;
- }
- for(count = 0; count < ip_list_tot; count++) {
- curr_table->table[count].last_seen = 0;
- curr_table->table[count].addr = 0;
- curr_table->table[count].ttl = 0;
- memset(curr_table->table[count].last_pkts,0,ip_pkt_list_tot*sizeof(unsigned long));
- curr_table->table[count].oldest_pkt = 0;
- curr_table->table[count].time_pos = 0;
- curr_table->time_info[count].position = count;
- curr_table->time_info[count].time = 0;
- }
- spin_unlock_bh(&curr_table->list_lock);
- return used;
- }
+ struct recent_entry *e;
+ unsigned int h;
+
+ h = recent_entry_hash(addr);
+ list_for_each_entry(e, &table->iphash[h], list)
+ if (e->addr == addr && (!ttl || !e->ttl || ttl == e->ttl))
+ return e;
+ return NULL;
+}

- check_set = IPT_RECENT_SET;
- switch(*cp) {
- case '+': check_set = IPT_RECENT_SET; cp++; used++; break;
- case '-': check_set = IPT_RECENT_REMOVE; cp++; used++; break;
- default: if(!isdigit(*cp)) return (used+1); break;
- }
+static void recent_entry_remove(struct recent_table *t, struct recent_entry *e)
+{
+ list_del(&e->list);
+ kfree(e);
+ t->entries--;
+}

-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": ip_recent_ctrl cp: `%c', check_set: %d\n",*cp,check_set);
-#endif
- /* Get addr (effectively inet_aton()) */
- /* Shamelessly stolen from libc, a function in the kernel for doing
- * this would, of course, be greatly preferred, but our options appear
- * to be rather limited, so we will just do it ourselves here.
- */
- res.word = 0;
-
- c = *cp;
- for(;;) {
- if(!isdigit(c)) return used;
- val = 0; base = 10; digit = 0;
- if(c == '0') {
- c = *++cp;
- if(c == 'x' || c == 'X') base = 16, c = *++cp;
- else { base = 8; digit = 1; }
- }
- for(;;) {
- if(isascii(c) && isdigit(c)) {
- if(base == 8 && (c == '8' || c == '0')) return used;
- val = (val * base) + (c - '0');
- c = *++cp;
- digit = 1;
- } else if(base == 16 && isascii(c) && isxdigit(c)) {
- val = (val << 4) | (c + 10 - (islower(c) ? 'a' : 'A'));
- c = *++cp;
- digit = 1;
- } else break;
+static struct recent_entry *
+recent_entry_init(struct recent_table *t, u_int32_t addr, u_int8_t ttl)
+{
+ struct recent_entry *e;
+ unsigned int i, h;
+
+ h = recent_entry_hash(addr);
+ if (t->entries >= ip_list_tot) {
+ for (i = h; ; i = (i + 1) % ip_list_hash_size) {
+ if (list_empty(&t->iphash[i]))
+ continue;
+ e = list_entry(t->iphash[i].next, struct recent_entry,
+ list);
+ recent_entry_remove(t, e);
+ break;
}
- if(c == '.') {
- if(pp > res.bytes + 2 || val > 0xff) return used;
- *pp++ = val;
- c = *++cp;
- } else break;
}
- used = cp - buffer;
- if(c != '\0' && (!isascii(c) || !isspace(c))) return used;
- if(c == '\n') used++;
- if(!digit) return used;
+ e = kmalloc(sizeof(*e) + sizeof(e->stamps[0]) * ip_pkt_list_tot,
+ GFP_ATOMIC);
+ if (e == NULL)
+ return NULL;
+ e->addr = addr;
+ e->ttl = ttl;
+ e->stamps[0] = jiffies;
+ e->nstamps = 1;
+ e->index = 1;
+ list_add_tail(&e->list, &t->iphash[h]);
+ t->entries++;
+ return e;
+}

- if(val > max[pp - res.bytes]) return used;
- addr = res.word | htonl(val);
+static void recent_entry_update(struct recent_entry *e)
+{
+ e->stamps[e->index++] = jiffies;
+ if (e->index > e->nstamps)
+ e->nstamps = e->index;
+ e->index %= ip_pkt_list_tot;
+}

- if(!addr && check_set == IPT_RECENT_SET) return used;
+static struct recent_table *recent_table_lookup(const char *name)
+{
+ struct recent_table *t;

-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": ip_recent_ctrl c: %c, addr: %u used: %d\n",c,addr,used);
-#endif
+ list_for_each_entry(t, &tables, list)
+ if (!strcmp(t->name, name))
+ return t;
+ return NULL;
+}

- /* Set up and just call match */
- info = kmalloc(sizeof(struct ipt_recent_info),GFP_KERNEL);
- if(!info) { return -ENOMEM; }
- info->seconds = 0;
- info->hit_count = 0;
- info->check_set = check_set;
- info->invert = 0;
- info->side = IPT_RECENT_SOURCE;
- strncpy(info->name,curr_table->name,IPT_RECENT_NAME_LEN);
- info->name[IPT_RECENT_NAME_LEN-1] = '\0';
-
- skb = kmalloc(sizeof(struct sk_buff),GFP_KERNEL);
- if (!skb) {
- used = -ENOMEM;
- goto out_free_info;
- }
- skb->nh.iph = kmalloc(sizeof(struct iphdr),GFP_KERNEL);
- if (!skb->nh.iph) {
- used = -ENOMEM;
- goto out_free_skb;
+static void recent_table_flush(struct recent_table *t)
+{
+ struct recent_entry *e, *next;
+ unsigned int i;
+
+ for (i = 0; i < ip_list_hash_size; i++) {
+ list_for_each_entry_safe(e, next, &t->iphash[i], list)
+ recent_entry_remove(t, e);
}
-
- skb->nh.iph->saddr = addr;
- skb->nh.iph->daddr = 0;
- /* Clear ttl since we have no way of knowing it */
- skb->nh.iph->ttl = 0;
- match(skb,NULL,NULL,NULL,info,0,0,NULL);
-
- kfree(skb->nh.iph);
-out_free_skb:
- kfree(skb);
-out_free_info:
- kfree(info);
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": Leaving ip_recent_ctrl addr: %u used: %d\n",addr,used);
-#endif
- return used;
}

-#endif /* CONFIG_PROC_FS */
-
-/* 'match' is our primary function, called by the kernel whenever a rule is
- * hit with our module as an option to it.
- * What this function does depends on what was specifically asked of it by
- * the user:
- * --set -- Add or update last seen time of the source address of the packet
- * -- matchinfo->check_set == IPT_RECENT_SET
- * --rcheck -- Just check if the source address is in the list
- * -- matchinfo->check_set == IPT_RECENT_CHECK
- * --update -- If the source address is in the list, update last_seen
- * -- matchinfo->check_set == IPT_RECENT_UPDATE
- * --remove -- If the source address is in the list, remove it
- * -- matchinfo->check_set == IPT_RECENT_REMOVE
- * --seconds -- Option to --rcheck/--update, only match if last_seen within seconds
- * -- matchinfo->seconds
- * --hitcount -- Option to --rcheck/--update, only match if seen hitcount times
- * -- matchinfo->hit_count
- * --seconds and --hitcount can be combined
- */
static int
-match(const struct sk_buff *skb,
- const struct net_device *in,
- const struct net_device *out,
- const struct xt_match *match,
- const void *matchinfo,
- int offset,
- unsigned int protoff,
- int *hotdrop)
+ipt_recent_match(const struct sk_buff *skb,
+ const struct net_device *in, const struct net_device *out,
+ const struct xt_match *match, const void *matchinfo,
+ int offset, unsigned int protoff, int *hotdrop)
{
- int pkt_count, hits_found, ans;
- unsigned long now;
const struct ipt_recent_info *info = matchinfo;
- u_int32_t addr = 0, time_temp;
- u_int8_t ttl = skb->nh.iph->ttl;
- int *hash_table;
- int orig_hash_result, hash_result, temp, location = 0, time_loc, end_collision_chain = -1;
- struct time_info_list *time_info;
- struct recent_ip_tables *curr_table;
- struct recent_ip_tables *last_table;
- struct recent_ip_list *r_list;
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match() called\n");
-#endif
-
- /* Default is false ^ info->invert */
- ans = info->invert;
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): name = '%s'\n",info->name);
-#endif
-
- /* if out != NULL then routing has been done and TTL changed.
- * We change it back here internally for match what came in before routing. */
- if(out) ttl++;
-
- /* Find the right table */
- spin_lock_bh(&recent_lock);
- curr_table = r_tables;
- while( (last_table = curr_table) && strncmp(info->name,curr_table->name,IPT_RECENT_NAME_LEN) && (curr_table = curr_table->next) );
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): table found('%s')\n",info->name);
-#endif
-
- spin_unlock_bh(&recent_lock);
-
- /* Table with this name not found, match impossible */
- if(!curr_table) { return ans; }
-
- /* Make sure no one is changing the list while we work with it */
- spin_lock_bh(&curr_table->list_lock);
-
- r_list = curr_table->table;
- if(info->side == IPT_RECENT_DEST) addr = skb->nh.iph->daddr; else addr = skb->nh.iph->saddr;
-
- if(!addr) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match() address (%u) invalid, leaving.\n",addr);
-#endif
- spin_unlock_bh(&curr_table->list_lock);
- return ans;
- }
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): checking table, addr: %u, ttl: %u, orig_ttl: %u\n",addr,ttl,skb->nh.iph->ttl);
-#endif
-
- /* Get jiffies now in case they changed while we were waiting for a lock */
- now = jiffies;
- hash_table = curr_table->hash_table;
- time_info = curr_table->time_info;
-
- orig_hash_result = hash_result = hash_func(addr,ip_list_hash_size);
- /* Hash entry at this result used */
- /* Check for TTL match if requested. If TTL is zero then a match would never
- * happen, so match regardless of existing TTL in that case. Zero means the
- * entry was added via the /proc interface anyway, so we will just use the
- * first TTL we get for that IP address. */
- if(info->check_set & IPT_RECENT_TTL) {
- while(hash_table[hash_result] != -1 && !(r_list[hash_table[hash_result]].addr == addr &&
- (!r_list[hash_table[hash_result]].ttl || r_list[hash_table[hash_result]].ttl == ttl))) {
- /* Collision in hash table */
- hash_result = (hash_result + 1) % ip_list_hash_size;
- }
- } else {
- while(hash_table[hash_result] != -1 && r_list[hash_table[hash_result]].addr != addr) {
- /* Collision in hash table */
- hash_result = (hash_result + 1) % ip_list_hash_size;
- }
- }
-
- if(hash_table[hash_result] == -1 && !(info->check_set & IPT_RECENT_SET)) {
- /* IP not in list and not asked to SET */
- spin_unlock_bh(&curr_table->list_lock);
- return ans;
+ struct recent_table *t;
+ struct recent_entry *e;
+ u_int32_t addr;
+ u_int8_t ttl;
+ int ret = info->invert;
+
+ if (info->side == IPT_RECENT_DEST)
+ addr = skb->nh.iph->daddr;
+ else
+ addr = skb->nh.iph->saddr;
+
+ ttl = 0;
+ if (info->check_set & IPT_RECENT_TTL) {
+ ttl = skb->nh.iph->ttl;
+ /* use TTL as seen before forwaring */
+ if (out && !skb->sk)
+ ttl++;
}

- /* Check if we need to handle the collision, do not need to on REMOVE */
- if(orig_hash_result != hash_result && !(info->check_set & IPT_RECENT_REMOVE)) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): Collision in hash table. (or: %d,hr: %d,oa: %u,ha: %u)\n",
- orig_hash_result,
- hash_result,
- r_list[hash_table[orig_hash_result]].addr,
- addr);
-#endif
-
- /* We had a collision.
- * orig_hash_result is where we started, hash_result is where we ended up.
- * So, swap them because we are likely to see the same guy again sooner */
-#ifdef DEBUG
- if(debug) {
- printk(KERN_INFO RECENT_NAME ": match(): Collision; hash_table[orig_hash_result] = %d\n",hash_table[orig_hash_result]);
- printk(KERN_INFO RECENT_NAME ": match(): Collision; r_list[hash_table[orig_hash_result]].hash_entry = %d\n",
- r_list[hash_table[orig_hash_result]].hash_entry);
- }
-#endif
-
- r_list[hash_table[orig_hash_result]].hash_entry = hash_result;
-
-
- temp = hash_table[orig_hash_result];
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): Collision; hash_table[hash_result] = %d\n",hash_table[hash_result]);
-#endif
- hash_table[orig_hash_result] = hash_table[hash_result];
- hash_table[hash_result] = temp;
- temp = hash_result;
- hash_result = orig_hash_result;
- orig_hash_result = temp;
- time_info[r_list[hash_table[orig_hash_result]].time_pos].position = hash_table[orig_hash_result];
- if(hash_table[hash_result] != -1) {
- r_list[hash_table[hash_result]].hash_entry = hash_result;
- time_info[r_list[hash_table[hash_result]].time_pos].position = hash_table[hash_result];
- }
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): Collision handled.\n");
-#endif
+ spin_lock_bh(&recent_lock);
+ t = recent_table_lookup(info->name);
+ e = recent_entry_lookup(t, addr, ttl);
+ if (e == NULL) {
+ if (!(info->check_set & IPT_RECENT_SET))
+ goto out;
+ e = recent_entry_init(t, addr, ttl);
+ if (e == NULL)
+ *hotdrop = 1;
+ ret ^= 1;
+ goto out;
}

- if(hash_table[hash_result] == -1) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): New table entry. (hr: %d,ha: %u)\n",
- hash_result, addr);
-#endif
-
- /* New item found and IPT_RECENT_SET, so we need to add it */
- location = time_info[curr_table->time_pos].position;
- hash_table[r_list[location].hash_entry] = -1;
- hash_table[hash_result] = location;
- memset(r_list[location].last_pkts,0,ip_pkt_list_tot*sizeof(unsigned long));
- r_list[location].time_pos = curr_table->time_pos;
- r_list[location].addr = addr;
- r_list[location].ttl = ttl;
- r_list[location].last_seen = now;
- r_list[location].oldest_pkt = 1;
- r_list[location].last_pkts[0] = now;
- r_list[location].hash_entry = hash_result;
- time_info[curr_table->time_pos].time = r_list[location].last_seen;
- curr_table->time_pos = (curr_table->time_pos + 1) % ip_list_tot;
-
- ans = !info->invert;
- } else {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): Existing table entry. (hr: %d,ha: %u)\n",
- hash_result,
- addr);
-#endif
-
- /* Existing item found */
- location = hash_table[hash_result];
- /* We have a match on address, now to make sure it meets all requirements for a
- * full match. */
- if(info->check_set & IPT_RECENT_CHECK || info->check_set & IPT_RECENT_UPDATE) {
- if(!info->seconds && !info->hit_count) ans = !info->invert; else ans = info->invert;
- if(info->seconds && !info->hit_count) {
- if(time_before_eq(now,r_list[location].last_seen+info->seconds*HZ)) ans = !info->invert; else ans = info->invert;
- }
- if(info->seconds && info->hit_count) {
- for(pkt_count = 0, hits_found = 0; pkt_count < ip_pkt_list_tot; pkt_count++) {
- if(r_list[location].last_pkts[pkt_count] == 0) break;
- if(time_before_eq(now,r_list[location].last_pkts[pkt_count]+info->seconds*HZ)) hits_found++;
- }
- if(hits_found >= info->hit_count) ans = !info->invert; else ans = info->invert;
- }
- if(info->hit_count && !info->seconds) {
- for(pkt_count = 0, hits_found = 0; pkt_count < ip_pkt_list_tot; pkt_count++) {
- if(r_list[location].last_pkts[pkt_count] == 0) break;
- hits_found++;
- }
- if(hits_found >= info->hit_count) ans = !info->invert; else ans = info->invert;
- }
- }
-#ifdef DEBUG
- if(debug) {
- if(ans)
- printk(KERN_INFO RECENT_NAME ": match(): match addr: %u\n",addr);
- else
- printk(KERN_INFO RECENT_NAME ": match(): no match addr: %u\n",addr);
- }
-#endif
-
- /* If and only if we have been asked to SET, or to UPDATE (on match) do we add the
- * current timestamp to the last_seen. */
- if((info->check_set & IPT_RECENT_SET && (ans = !info->invert)) || (info->check_set & IPT_RECENT_UPDATE && ans)) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): SET or UPDATE; updating time info.\n");
-#endif
- /* Have to update our time info */
- time_loc = r_list[location].time_pos;
- time_info[time_loc].time = now;
- time_info[time_loc].position = location;
- while((time_info[(time_loc+1) % ip_list_tot].time < time_info[time_loc].time) && ((time_loc+1) % ip_list_tot) != curr_table->time_pos) {
- time_temp = time_info[time_loc].time;
- time_info[time_loc].time = time_info[(time_loc+1)%ip_list_tot].time;
- time_info[(time_loc+1)%ip_list_tot].time = time_temp;
- time_temp = time_info[time_loc].position;
- time_info[time_loc].position = time_info[(time_loc+1)%ip_list_tot].position;
- time_info[(time_loc+1)%ip_list_tot].position = time_temp;
- r_list[time_info[time_loc].position].time_pos = time_loc;
- r_list[time_info[(time_loc+1)%ip_list_tot].position].time_pos = (time_loc+1)%ip_list_tot;
- time_loc = (time_loc+1) % ip_list_tot;
- }
- r_list[location].time_pos = time_loc;
- r_list[location].ttl = ttl;
- r_list[location].last_pkts[r_list[location].oldest_pkt] = now;
- r_list[location].oldest_pkt = ++r_list[location].oldest_pkt % ip_pkt_list_tot;
- r_list[location].last_seen = now;
- }
- /* If we have been asked to remove the entry from the list, just set it to 0 */
- if(info->check_set & IPT_RECENT_REMOVE) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): REMOVE; clearing entry (or: %d, hr: %d).\n",orig_hash_result,hash_result);
-#endif
- /* Check if this is part of a collision chain */
- while(hash_table[(orig_hash_result+1) % ip_list_hash_size] != -1) {
- orig_hash_result++;
- if(hash_func(r_list[hash_table[orig_hash_result]].addr,ip_list_hash_size) == hash_result) {
- /* Found collision chain, how deep does this rabbit hole go? */
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): REMOVE; found collision chain.\n");
-#endif
- end_collision_chain = orig_hash_result;
- }
- }
- if(end_collision_chain != -1) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): REMOVE; part of collision chain, moving to end.\n");
-#endif
- /* Part of a collision chain, swap it with the end of the chain
- * before removing. */
- r_list[hash_table[end_collision_chain]].hash_entry = hash_result;
- temp = hash_table[end_collision_chain];
- hash_table[end_collision_chain] = hash_table[hash_result];
- hash_table[hash_result] = temp;
- time_info[r_list[hash_table[hash_result]].time_pos].position = hash_table[hash_result];
- hash_result = end_collision_chain;
- r_list[hash_table[hash_result]].hash_entry = hash_result;
- time_info[r_list[hash_table[hash_result]].time_pos].position = hash_table[hash_result];
- }
- location = hash_table[hash_result];
- hash_table[r_list[location].hash_entry] = -1;
- time_loc = r_list[location].time_pos;
- time_info[time_loc].time = 0;
- time_info[time_loc].position = location;
- while((time_info[(time_loc+1) % ip_list_tot].time < time_info[time_loc].time) && ((time_loc+1) % ip_list_tot) != curr_table->time_pos) {
- time_temp = time_info[time_loc].time;
- time_info[time_loc].time = time_info[(time_loc+1)%ip_list_tot].time;
- time_info[(time_loc+1)%ip_list_tot].time = time_temp;
- time_temp = time_info[time_loc].position;
- time_info[time_loc].position = time_info[(time_loc+1)%ip_list_tot].position;
- time_info[(time_loc+1)%ip_list_tot].position = time_temp;
- r_list[time_info[time_loc].position].time_pos = time_loc;
- r_list[time_info[(time_loc+1)%ip_list_tot].position].time_pos = (time_loc+1)%ip_list_tot;
- time_loc = (time_loc+1) % ip_list_tot;
+ if (info->check_set & IPT_RECENT_SET)
+ ret ^= 1;
+ else if (info->check_set & IPT_RECENT_REMOVE) {
+ recent_entry_remove(t, e);
+ ret ^= 1;
+ } else if (info->check_set & (IPT_RECENT_CHECK | IPT_RECENT_UPDATE)) {
+ unsigned long t = jiffies - info->seconds * HZ;
+ unsigned int i, hits = 0;
+
+ for (i = 0; i < e->nstamps; i++) {
+ if (info->seconds && time_after(t, e->stamps[i]))
+ continue;
+ if (!info->hit_count || ++hits >= info->hit_count) {
+ ret ^= 1;
+ break;
}
- r_list[location].time_pos = time_loc;
- r_list[location].last_seen = 0;
- r_list[location].addr = 0;
- r_list[location].ttl = 0;
- memset(r_list[location].last_pkts,0,ip_pkt_list_tot*sizeof(unsigned long));
- r_list[location].oldest_pkt = 0;
- ans = !info->invert;
}
- spin_unlock_bh(&curr_table->list_lock);
- return ans;
}

- spin_unlock_bh(&curr_table->list_lock);
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match() left.\n");
-#endif
- return ans;
+ if (info->check_set & IPT_RECENT_SET ||
+ (info->check_set & IPT_RECENT_UPDATE && ret)) {
+ recent_entry_update(e);
+ e->ttl = ttl;
+ }
+out:
+ spin_unlock_bh(&recent_lock);
+ return ret;
}

-/* This function is to verify that the rule given during the userspace iptables
- * command is correct.
- * If the command is valid then we check if the table name referred to by the
- * rule exists, if not it is created.
- */
static int
-checkentry(const char *tablename,
- const void *ip,
- const struct xt_match *match,
- void *matchinfo,
- unsigned int matchsize,
- unsigned int hook_mask)
+ipt_recent_checkentry(const char *tablename, const void *ip,
+ const struct xt_match *match, void *matchinfo,
+ unsigned int matchsize, unsigned int hook_mask)
{
- int flag = 0, c;
- unsigned long *hold;
const struct ipt_recent_info *info = matchinfo;
- struct recent_ip_tables *curr_table, *find_table, *last_table;
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry() entered.\n");
-#endif
+ struct recent_table *t;
+ unsigned i;
+ int ret = 0;

- /* seconds and hit_count only valid for CHECK/UPDATE */
- if(info->check_set & IPT_RECENT_SET) { flag++; if(info->seconds || info->hit_count) return 0; }
- if(info->check_set & IPT_RECENT_REMOVE) { flag++; if(info->seconds || info->hit_count) return 0; }
- if(info->check_set & IPT_RECENT_CHECK) flag++;
- if(info->check_set & IPT_RECENT_UPDATE) flag++;
-
- /* One and only one of these should ever be set */
- if(flag != 1) return 0;
-
- /* Name must be set to something */
- if(!info->name || !info->name[0]) return 0;
+ if (hweight8(info->check_set &
+ (IPT_RECENT_SET | IPT_RECENT_REMOVE |
+ IPT_RECENT_CHECK | IPT_RECENT_UPDATE)) != 1)
+ return 0;
+ if (info->check_set & (IPT_RECENT_SET | IPT_RECENT_REMOVE) &&
+ (info->seconds || info->hit_count))
+ return 0;
+ if (info->name[0] == '\0' ||
+ strnlen(info->name, IPT_RECENT_NAME_LEN) == IPT_RECENT_NAME_LEN)
+ return 0;

- /* Things look good, create a list for this if it does not exist */
- /* Lock the linked list while we play with it */
spin_lock_bh(&recent_lock);
-
- /* Look for an entry with this name already created */
- /* Finds the end of the list and the entry before the end if current name does not exist */
- find_table = r_tables;
- while( (last_table = find_table) && strncmp(info->name,find_table->name,IPT_RECENT_NAME_LEN) && (find_table = find_table->next) );
-
- /* If a table already exists just increment the count on that table and return */
- if(find_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: table found (%s), incrementing count.\n",info->name);
-#endif
- find_table->count++;
- spin_unlock_bh(&recent_lock);
- return 1;
+ t = recent_table_lookup(info->name);
+ if (t != NULL) {
+ t->refcnt++;
+ ret = 1;
+ goto out;
}

- spin_unlock_bh(&recent_lock);
-
- /* Table with this name not found */
- /* Allocate memory for new linked list item */
-
-#ifdef DEBUG
- if(debug) {
- printk(KERN_INFO RECENT_NAME ": checkentry: no table found (%s)\n",info->name);
- printk(KERN_INFO RECENT_NAME ": checkentry: Allocationg %d for link-list entry.\n",sizeof(struct recent_ip_tables));
+ t = kzalloc(sizeof(*t) + sizeof(t->iphash[0]) * ip_list_hash_size,
+ GFP_ATOMIC);
+ if (t == NULL)
+ goto out;
+ strcpy(t->name, info->name);
+ for (i = 0; i < ip_list_hash_size; i++)
+ INIT_LIST_HEAD(&t->iphash[i]);
+#ifdef CONFIG_PROC_FS
+ t->proc = create_proc_entry(t->name, ip_list_perms, proc_dir);
+ if (t->proc == NULL) {
+ kfree(t);
+ goto out;
}
+ t->proc->proc_fops = &recent_fops;
+ t->proc->data = t;
#endif
+ list_add_tail(&t->list, &tables);
+ ret = 1;
+out:
+ spin_unlock_bh(&recent_lock);
+ return ret;
+}

- curr_table = vmalloc(sizeof(struct recent_ip_tables));
- if(curr_table == NULL) return 0;
-
- spin_lock_init(&curr_table->list_lock);
- curr_table->next = NULL;
- curr_table->count = 1;
- curr_table->time_pos = 0;
- strncpy(curr_table->name,info->name,IPT_RECENT_NAME_LEN);
- curr_table->name[IPT_RECENT_NAME_LEN-1] = '\0';
-
- /* Allocate memory for this table and the list of packets in each entry. */
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: Allocating %d for table (%s).\n",
- sizeof(struct recent_ip_list)*ip_list_tot,
- info->name);
-#endif
-
- curr_table->table = vmalloc(sizeof(struct recent_ip_list)*ip_list_tot);
- if(curr_table->table == NULL) { vfree(curr_table); return 0; }
- memset(curr_table->table,0,sizeof(struct recent_ip_list)*ip_list_tot);
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: Allocating %d for pkt_list.\n",
- sizeof(unsigned long)*ip_pkt_list_tot*ip_list_tot);
-#endif
-
- hold = vmalloc(sizeof(unsigned long)*ip_pkt_list_tot*ip_list_tot);
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: After pkt_list allocation.\n");
-#endif
- if(hold == NULL) {
- printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for pkt_list.\n");
- vfree(curr_table->table);
- vfree(curr_table);
- return 0;
- }
- for(c = 0; c < ip_list_tot; c++) {
- curr_table->table[c].last_pkts = hold + c*ip_pkt_list_tot;
- }
+static void
+ipt_recent_destroy(const struct xt_match *match, void *matchinfo,
+ unsigned int matchsize)
+{
+ const struct ipt_recent_info *info = matchinfo;
+ struct recent_table *t;

- /* Allocate memory for the hash table */
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: Allocating %d for hash_table.\n",
- sizeof(int)*ip_list_hash_size);
+ spin_lock_bh(&recent_lock);
+ t = recent_table_lookup(info->name);
+ if (--t->refcnt == 0) {
+ list_del(&t->list);
+ recent_table_flush(t);
+#ifdef CONFIG_PROC_FS
+ remove_proc_entry(t->name, proc_dir);
#endif
-
- curr_table->hash_table = vmalloc(sizeof(int)*ip_list_hash_size);
- if(!curr_table->hash_table) {
- printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for hash_table.\n");
- vfree(hold);
- vfree(curr_table->table);
- vfree(curr_table);
- return 0;
- }
-
- for(c = 0; c < ip_list_hash_size; c++) {
- curr_table->hash_table[c] = -1;
+ kfree(t);
}
+ spin_unlock_bh(&recent_lock);
+}

- /* Allocate memory for the time info */
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: Allocating %d for time_info.\n",
- sizeof(struct time_info_list)*ip_list_tot);
-#endif
+#ifdef CONFIG_PROC_FS
+struct recent_iter_state {
+ struct recent_table *table;
+ unsigned int bucket;
+};

- curr_table->time_info = vmalloc(sizeof(struct time_info_list)*ip_list_tot);
- if(!curr_table->time_info) {
- printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for time_info.\n");
- vfree(curr_table->hash_table);
- vfree(hold);
- vfree(curr_table->table);
- vfree(curr_table);
- return 0;
- }
- for(c = 0; c < ip_list_tot; c++) {
- curr_table->time_info[c].position = c;
- curr_table->time_info[c].time = 0;
- }
+static void *recent_seq_start(struct seq_file *seq, loff_t *pos)
+{
+ struct recent_iter_state *st = seq->private;
+ struct recent_table *t = st->table;
+ struct recent_entry *e;
+ loff_t p = *pos;

- /* Put the new table in place */
spin_lock_bh(&recent_lock);
- find_table = r_tables;
- while( (last_table = find_table) && strncmp(info->name,find_table->name,IPT_RECENT_NAME_LEN) && (find_table = find_table->next) );
-
- /* If a table already exists just increment the count on that table and return */
- if(find_table) {
- find_table->count++;
- spin_unlock_bh(&recent_lock);
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: table found (%s), created by other process.\n",info->name);
-#endif
- vfree(curr_table->time_info);
- vfree(curr_table->hash_table);
- vfree(hold);
- vfree(curr_table->table);
- vfree(curr_table);
- return 1;
- }
- if(!last_table) r_tables = curr_table; else last_table->next = curr_table;
-
- spin_unlock_bh(&recent_lock);

-#ifdef CONFIG_PROC_FS
- /* Create our proc 'status' entry. */
- curr_table->status_proc = create_proc_entry(curr_table->name, ip_list_perms, proc_net_ipt_recent);
- if (!curr_table->status_proc) {
- printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for /proc entry.\n");
- /* Destroy the created table */
- spin_lock_bh(&recent_lock);
- last_table = NULL;
- curr_table = r_tables;
- if(!curr_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry() create_proc failed, no tables.\n");
-#endif
- spin_unlock_bh(&recent_lock);
- return 0;
- }
- while( strncmp(info->name,curr_table->name,IPT_RECENT_NAME_LEN) && (last_table = curr_table) && (curr_table = curr_table->next) );
- if(!curr_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry() create_proc failed, table already destroyed.\n");
-#endif
- spin_unlock_bh(&recent_lock);
- return 0;
+ for (st->bucket = 0; st->bucket < ip_list_hash_size; st->bucket++) {
+ list_for_each_entry(e, &t->iphash[st->bucket], list) {
+ if (p-- == 0)
+ return e;
}
- if(last_table) last_table->next = curr_table->next; else r_tables = curr_table->next;
- spin_unlock_bh(&recent_lock);
- vfree(curr_table->time_info);
- vfree(curr_table->hash_table);
- vfree(hold);
- vfree(curr_table->table);
- vfree(curr_table);
- return 0;
}
-
- curr_table->status_proc->owner = THIS_MODULE;
- curr_table->status_proc->data = curr_table;
- wmb();
- curr_table->status_proc->read_proc = ip_recent_get_info;
- curr_table->status_proc->write_proc = ip_recent_ctrl;
-#endif /* CONFIG_PROC_FS */
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry() left.\n");
-#endif
+ return NULL;
+}

- return 1;
+static void *recent_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+ struct recent_iter_state *st = seq->private;
+ struct recent_table *t = st->table;
+ struct recent_entry *e = v;
+ struct list_head *head = e->list.next;
+
+ while (head == &t->iphash[st->bucket]) {
+ if (++st->bucket >= ip_list_hash_size)
+ return NULL;
+ head = t->iphash[st->bucket].next;
+ }
+ (*pos)++;
+ return list_entry(head, struct recent_entry, list);
}

-/* This function is called in the event that a rule matching this module is
- * removed.
- * When this happens we need to check if there are no other rules matching
- * the table given. If that is the case then we remove the table and clean
- * up its memory.
- */
-static void
-destroy(const struct xt_match *match, void *matchinfo, unsigned int matchsize)
+static void recent_seq_stop(struct seq_file *s, void *v)
{
- const struct ipt_recent_info *info = matchinfo;
- struct recent_ip_tables *curr_table, *last_table;
+ spin_unlock_bh(&recent_lock);
+}

-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() entered.\n");
-#endif
+static int recent_seq_show(struct seq_file *seq, void *v)
+{
+ struct recent_entry *e = v;
+ unsigned int i;
+
+ i = (e->index - 1) % ip_pkt_list_tot;
+ seq_printf(seq, "src=%u.%u.%u.%u ttl: %u last_seen: %lu oldest_pkt: %u",
+ NIPQUAD(e->addr), e->ttl, e->stamps[i], e->index);
+ for (i = 0; i < e->nstamps; i++)
+ seq_printf(seq, "%s %lu", i ? "," : "", e->stamps[i]);
+ seq_printf(seq, "\n");
+ return 0;
+}

- if(matchsize != IPT_ALIGN(sizeof(struct ipt_recent_info))) return;
+static struct seq_operations recent_seq_ops = {
+ .start = recent_seq_start,
+ .next = recent_seq_next,
+ .stop = recent_seq_stop,
+ .show = recent_seq_show,
+};

- /* Lock the linked list while we play with it */
- spin_lock_bh(&recent_lock);
+static int recent_seq_open(struct inode *inode, struct file *file)
+{
+ struct proc_dir_entry *pde = PDE(inode);
+ struct seq_file *seq;
+ struct recent_iter_state *st;
+ int ret;
+
+ st = kzalloc(sizeof(*st), GFP_KERNEL);
+ if (st == NULL)
+ return -ENOMEM;
+ ret = seq_open(file, &recent_seq_ops);
+ if (ret)
+ kfree(st);
+ st->table = pde->data;
+ seq = file->private_data;
+ seq->private = st;
+ return ret;
+}

- /* Look for an entry with this name already created */
- /* Finds the end of the list and the entry before the end if current name does not exist */
- last_table = NULL;
- curr_table = r_tables;
- if(!curr_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() No tables found, leaving.\n");
-#endif
+static ssize_t recent_proc_write(struct file *file, const char __user *input,
+ size_t size, loff_t *loff)
+{
+ struct proc_dir_entry *pde = PDE(file->f_dentry->d_inode);
+ struct recent_table *t = pde->data;
+ struct recent_entry *e;
+ char buf[sizeof("+255.255.255.255")], *c = buf;
+ u_int32_t addr;
+ int add;
+
+ if (size > sizeof(buf))
+ size = sizeof(buf);
+ if (copy_from_user(buf, input, size))
+ return -EFAULT;
+ while (isspace(*c))
+ c++;
+
+ if (size - (c - buf) < 5)
+ return c - buf;
+ if (!memcmp(c, "clear", 5)) {
+ spin_lock_bh(&recent_lock);
+ recent_table_flush(t);
spin_unlock_bh(&recent_lock);
- return;
+ return c - buf;
}
- while( strncmp(info->name,curr_table->name,IPT_RECENT_NAME_LEN) && (last_table = curr_table) && (curr_table = curr_table->next) );

- /* If a table does not exist then do nothing and return */
- if(!curr_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() table not found, leaving.\n");
-#endif
- spin_unlock_bh(&recent_lock);
- return;
+ switch (*c) {
+ case '-':
+ add = 0;
+ c++;
+ break;
+ case '+':
+ c++;
+ default:
+ add = 1;
+ break;
}
+ addr = in_aton(c);

- curr_table->count--;
-
- /* If count is still non-zero then there are still rules referenceing it so we do nothing */
- if(curr_table->count) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() table found, non-zero count, leaving.\n");
-#endif
- spin_unlock_bh(&recent_lock);
- return;
+ spin_lock_bh(&recent_lock);
+ e = recent_entry_lookup(t, addr, 0);
+ if (e == NULL) {
+ if (add)
+ recent_entry_init(t, addr, 0);
+ } else {
+ if (add)
+ recent_entry_update(e);
+ else
+ recent_entry_remove(t, e);
}
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() table found, zero count, removing.\n");
-#endif
-
- /* Count must be zero so we remove this table from the list */
- if(last_table) last_table->next = curr_table->next; else r_tables = curr_table->next;
-
spin_unlock_bh(&recent_lock);
+ return size;
+}

- /* lock to make sure any late-runners still using this after we removed it from
- * the list finish up then remove everything */
- spin_lock_bh(&curr_table->list_lock);
- spin_unlock_bh(&curr_table->list_lock);
-
-#ifdef CONFIG_PROC_FS
- if(curr_table->status_proc) remove_proc_entry(curr_table->name,proc_net_ipt_recent);
+static struct file_operations recent_fops = {
+ .open = recent_seq_open,
+ .read = seq_read,
+ .write = recent_proc_write,
+ .release = seq_release_private,
+ .owner = THIS_MODULE,
+};
#endif /* CONFIG_PROC_FS */
- vfree(curr_table->table[0].last_pkts);
- vfree(curr_table->table);
- vfree(curr_table->hash_table);
- vfree(curr_table->time_info);
- vfree(curr_table);
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() left.\n");
-#endif

- return;
-}
-
-/* This is the structure we pass to ipt_register to register our
- * module with iptables.
- */
static struct ipt_match recent_match = {
.name = "recent",
- .match = match,
+ .match = ipt_recent_match,
.matchsize = sizeof(struct ipt_recent_info),
- .checkentry = checkentry,
- .destroy = destroy,
- .me = THIS_MODULE
+ .checkentry = ipt_recent_checkentry,
+ .destroy = ipt_recent_destroy,
+ .me = THIS_MODULE,
};

-/* Kernel module initialization. */
static int __init ipt_recent_init(void)
{
- int err, count;
+ int err;

- printk(version);
-#ifdef CONFIG_PROC_FS
- proc_net_ipt_recent = proc_mkdir("ipt_recent",proc_net);
- if(!proc_net_ipt_recent) return -ENOMEM;
-#endif
-
- if(ip_list_hash_size && ip_list_hash_size <= ip_list_tot) {
- printk(KERN_WARNING RECENT_NAME ": ip_list_hash_size too small, resetting to default.\n");
- ip_list_hash_size = 0;
- }
-
- if(!ip_list_hash_size) {
- ip_list_hash_size = ip_list_tot*3;
- count = 2*2;
- while(ip_list_hash_size > count) count = count*2;
- ip_list_hash_size = count;
- }
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": ip_list_hash_size: %d\n",ip_list_hash_size);
-#endif
+ if (!ip_list_tot || !ip_pkt_list_tot)
+ return -EINVAL;
+ ip_list_hash_size = 1 << fls(ip_list_tot);

err = ipt_register_match(&recent_match);
+#ifdef CONFIG_PROC_FS
if (err)
- remove_proc_entry("ipt_recent", proc_net);
+ return err;
+ proc_dir = proc_mkdir("ipt_recent", proc_net);
+ if (proc_dir == NULL) {
+ ipt_unregister_match(&recent_match);
+ err = -ENOMEM;
+ }
+#endif
return err;
}

-/* Kernel module destruction. */
-static void __exit ipt_recent_fini(void)
+static void __exit ipt_recent_exit(void)
{
ipt_unregister_match(&recent_match);
-
- remove_proc_entry("ipt_recent",proc_net);
+#ifdef CONFIG_PROC_FS
+ remove_proc_entry("ipt_recent", proc_net);
+#endif
}

-/* Register our module with the kernel. */
module_init(ipt_recent_init);
-module_exit(ipt_recent_fini);
+module_exit(ipt_recent_exit);


Attachments:
x (45.68 kB)

2006-05-15 14:28:36

by Stephen Frost

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

* Patrick McHardy ([email protected]) wrote:
> Anyway, here goes the first shot at a replacement, it should be fully
> compatible. Comments and testing welcome.

This patch didn't apply cleanly against 2.6.16; I didn't think there had
been other changes since then. As it was an entire replacement I just
pulled out the '[+ ]' lines from the patch. Hopefully this doesn't lead
to problems in my review.

It probably would have been better to integrate it with ipset, as I've
mentioned previously. Other comments:

recent_entry_init() appears to just look for something to delete when
the maximum number of entries has been reached, starting from the hash
position of the address. The original ipt_recent, quite intentionally,
looked for the *oldest* address to replace. This meant that the list
only had to be large enough to cover the number of addresses seen in a
given time-period. This change would mean that the list would need to
be large enough to hold all addresses seen always, to be able to enforce
the time-based rules ipt_recent was written for.

ie: List of 100 addresses. Highest timeout value in the ruleset is 60
seconds. Average of 100 individual addresses in a 60-second timeframe.
The old ipt_recent would correctly enforce the 60-second requirement in
the ruleset. With the new version, as soon as the list was full the
next address could replace any address in the list, even if that address
was only 15 seconds old.

One way to handle this would be to track the highest time value in the
rulesets but as the ruleset is dynamic you could end up throwing away an
address which would have been caught by a rule that was about to be
added. The old module was written with the expectation of the list
always being full and that it would only be less-than-full shortly after
booting. By then only removing the oldest entry in the table for each
new address seen the maximum amount of time possible for the given table
size and distinct addresses seen is achieved.

The rest looks good, thanks.

Stephen


Attachments:
(No filename) (1.98 kB)
signature.asc (189.00 B)
Digital signature
Download all attachments

2006-05-15 18:49:35

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

Stephen Frost wrote:
> * Patrick McHardy ([email protected]) wrote:
>
>>Anyway, here goes the first shot at a replacement, it should be fully
>>compatible. Comments and testing welcome.
>
>
> This patch didn't apply cleanly against 2.6.16; I didn't think there had
> been other changes since then. As it was an entire replacement I just
> pulled out the '[+ ]' lines from the patch. Hopefully this doesn't lead
> to problems in my review.


That should be fine. That patch applies on top of Jespers patch which
started this thread, which I plan to push to Dave today.

> It probably would have been better to integrate it with ipset, as I've
> mentioned previously. Other comments:


Unfortunately we need to provide compatibility.

> recent_entry_init() appears to just look for something to delete when
> the maximum number of entries has been reached, starting from the hash
> position of the address. The original ipt_recent, quite intentionally,
> looked for the *oldest* address to replace. This meant that the list
> only had to be large enough to cover the number of addresses seen in a
> given time-period. This change would mean that the list would need to
> be large enough to hold all addresses seen always, to be able to enforce
> the time-based rules ipt_recent was written for.
>
> ie: List of 100 addresses. Highest timeout value in the ruleset is 60
> seconds. Average of 100 individual addresses in a 60-second timeframe.
> The old ipt_recent would correctly enforce the 60-second requirement in
> the ruleset. With the new version, as soon as the list was full the
> next address could replace any address in the list, even if that address
> was only 15 seconds old.
>
> One way to handle this would be to track the highest time value in the
> rulesets but as the ruleset is dynamic you could end up throwing away an
> address which would have been caught by a rule that was about to be
> added. The old module was written with the expectation of the list
> always being full and that it would only be less-than-full shortly after
> booting. By then only removing the oldest entry in the table for each
> new address seen the maximum amount of time possible for the given table
> size and distinct addresses seen is achieved.


I wasn't sure whether eviction was happening intentional in the old code
at all - still not able to locate the code where this happens, just
noticed that it does do eviction when I manually tried to trigger
a table overflow by adding entries through /proc. Anyway, it should
be easy to fix by keeping an additional lru list. I'll post
an updated patch soon.

2006-05-15 19:27:41

by Stephen Frost

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

* Patrick McHardy ([email protected]) wrote:
> I wasn't sure whether eviction was happening intentional in the old code
> at all - still not able to locate the code where this happens, just
> noticed that it does do eviction when I manually tried to trigger
> a table overflow by adding entries through /proc. Anyway, it should
> be easy to fix by keeping an additional lru list. I'll post
> an updated patch soon.

It was always done intentionally; as I mentioned, it was originally
written with the expectation of the table always being full. That was
also why I used one large malloc'd table and the hash chaining that I
did- I always knew ahead of time exactly how much memory I'd be using as
a running-set and never needed to do any allocation during operation.
In hindsight I can see that the additional complexity from it was
perhaps not worth the benefit that I saw from it.

The eviction is handled through the 'time_info_list'. This is basically
just an always-ordered (by time) array of positions into the main table.
Line 504 (from stock 2.6.16) is where the list is used to add a new
entry at the end of the list (replacing the oldest address). 'time_pos'
points to the oldest entry. The 'position' is then used to clear out
the entry associated with this address from the hash table and the main
table. These are then replaced with the new address information and the
time_pos is adjusted accordingly. This didn't help the complexity as it
meant I was tracking through different systems the position of each
address in the time_info_list, the main table, and the hash table.
Using the lists might make this a bit easier to implement though.

Then on line 566, if a new packet has come in for an existing address,
we have to move that address up to the top of the time_info_list as it
is now the 'most recent'. As someone else mentioned, this might have
been better done using 'memmove' but I wasn't sure about its use or
performance in the kernel. This is done again on line 617 when removing
an address, which is expected to be a somewhat rare event (where an
address is explicitly removed instead of just expiring). One issue I
was concerned about was that I really didn't want the system to become
unhappy if a huge number of different addresses suddenly came in (more
than the list could support and/or more than would be sensible to try to
allocate memory to track).

I'm really not sure why I didn't break out this code into more
functions. It certainly would have made things much clearer/simpler. I
think I was (without any particular reason for it) concerned about
adding too many functions or calling things from the match() function.
As for why I didn't use existing kernel structures, well, I wasn't aware
of them in part and when I was asking about things I was asking about
more complicated things (such as a generic storage/hashing system) than
really made sense. I'm not sure I would have used the lists anyway
since I liked the general idea of just having the one 'main' table but
it does seem to make things cleaner.

Thanks,

Stephen


Attachments:
(No filename) (3.01 kB)
signature.asc (189.00 B)
Digital signature
Download all attachments

2006-05-15 20:09:34

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

[NETFILTER]: Replace ipt_recent module

Replace the totally unmaintainable ipt_recent module by a rewritten
version that should be fully compatible.

Signed-off-by: Patrick McHardy <[email protected]>

---
commit e4f33e5c65efaf65d558365fd49ad5d83b13813d
tree fc8a4681aad273bb4f9a6c9c9484d331d7aac064
parent d8c3291c73b958243b33f8509d4507e76dafd055
author Patrick McHardy <[email protected]> Mon, 15 May 2006 22:03:22 +0200
committer Patrick McHardy <[email protected]> Mon, 15 May 2006 22:03:22 +0200

net/ipv4/netfilter/ipt_recent.c | 1274 ++++++++++++---------------------------
1 files changed, 380 insertions(+), 894 deletions(-)

diff --git a/net/ipv4/netfilter/ipt_recent.c b/net/ipv4/netfilter/ipt_recent.c
index 1438432..2e54062 100644
--- a/net/ipv4/netfilter/ipt_recent.c
+++ b/net/ipv4/netfilter/ipt_recent.c
@@ -1,1007 +1,493 @@
-/* Kernel module to check if the source address has been seen recently. */
-/* Copyright 2002-2003, Stephen Frost, 2.5.x port by [email protected] */
-/* Author: Stephen Frost <[email protected]> */
-/* Project Page: http://snowman.net/projects/ipt_recent/ */
-/* This software is distributed under the terms of the GPL, Version 2 */
-/* This copyright does not cover user programs that use kernel services
- * by normal system calls. */
-
-#include <linux/module.h>
-#include <linux/skbuff.h>
+/*
+ * Copyright (c) 2006 Patrick McHardy <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This is a replacement of the old ipt_recent module, which carried the
+ * following copyright notice:
+ *
+ * Author: Stephen Frost <[email protected]>
+ * Copyright 2002-2003, Stephen Frost, 2.5.x port by [email protected]
+ */
+#include <linux/init.h>
+#include <linux/moduleparam.h>
#include <linux/proc_fs.h>
-#include <linux/spinlock.h>
-#include <linux/interrupt.h>
-#include <asm/uaccess.h>
+#include <linux/seq_file.h>
+#include <linux/string.h>
#include <linux/ctype.h>
-#include <linux/ip.h>
-#include <linux/vmalloc.h>
-#include <linux/moduleparam.h>
+#include <linux/list.h>
+#include <linux/random.h>
+#include <linux/jhash.h>
+#include <linux/bitops.h>
+#include <linux/skbuff.h>
+#include <linux/inet.h>

#include <linux/netfilter_ipv4/ip_tables.h>
#include <linux/netfilter_ipv4/ipt_recent.h>

-#undef DEBUG
-#define HASH_LOG 9
+MODULE_AUTHOR("Patrick McHardy <[email protected]>");
+MODULE_DESCRIPTION("IP tables recently seen matching module");
+MODULE_LICENSE("GPL");

-/* Defaults, these can be overridden on the module command-line. */
static unsigned int ip_list_tot = 100;
static unsigned int ip_pkt_list_tot = 20;
static unsigned int ip_list_hash_size = 0;
static unsigned int ip_list_perms = 0644;
-#ifdef DEBUG
-static int debug = 1;
-#endif
-
-static char version[] =
-KERN_INFO RECENT_NAME " " RECENT_VER ": Stephen Frost <[email protected]>. http://snowman.net/projects/ipt_recent/\n";
-
-MODULE_AUTHOR("Stephen Frost <[email protected]>");
-MODULE_DESCRIPTION("IP tables recently seen matching module " RECENT_VER);
-MODULE_LICENSE("GPL");
module_param(ip_list_tot, uint, 0400);
module_param(ip_pkt_list_tot, uint, 0400);
module_param(ip_list_hash_size, uint, 0400);
module_param(ip_list_perms, uint, 0400);
-#ifdef DEBUG
-module_param(debug, bool, 0600);
-MODULE_PARM_DESC(debug,"enable debugging output");
-#endif
-MODULE_PARM_DESC(ip_list_tot,"number of IPs to remember per list");
-MODULE_PARM_DESC(ip_pkt_list_tot,"number of packets per IP to remember");
-MODULE_PARM_DESC(ip_list_hash_size,"size of hash table used to look up IPs");
-MODULE_PARM_DESC(ip_list_perms,"permissions on /proc/net/ipt_recent/* files");
-
-/* Structure of our list of recently seen addresses. */
-struct recent_ip_list {
- u_int32_t addr;
- u_int8_t ttl;
- unsigned long last_seen;
- unsigned long *last_pkts;
- u_int32_t oldest_pkt;
- u_int32_t hash_entry;
- u_int32_t time_pos;
-};
-
-struct time_info_list {
- u_int32_t position;
- u_int32_t time;
+MODULE_PARM_DESC(ip_list_tot, "number of IPs to remember per list");
+MODULE_PARM_DESC(ip_pkt_list_tot, "number of packets per IP to remember");
+MODULE_PARM_DESC(ip_list_hash_size, "size of hash table used to look up IPs");
+MODULE_PARM_DESC(ip_list_perms, "permissions on /proc/net/ipt_recent/* files");
+
+
+struct recent_entry {
+ struct list_head list;
+ struct list_head lru_list;
+ u_int32_t addr;
+ u_int8_t ttl;
+ unsigned int index;
+ unsigned int nstamps;
+ unsigned long stamps[0];
};

-/* Structure of our linked list of tables of recent lists. */
-struct recent_ip_tables {
- char name[IPT_RECENT_NAME_LEN];
- int count;
- int time_pos;
- struct recent_ip_list *table;
- struct recent_ip_tables *next;
- spinlock_t list_lock;
- int *hash_table;
- struct time_info_list *time_info;
+struct recent_table {
+ struct list_head list;
+ char name[IPT_RECENT_NAME_LEN];
#ifdef CONFIG_PROC_FS
- struct proc_dir_entry *status_proc;
-#endif /* CONFIG_PROC_FS */
+ struct proc_dir_entry *proc;
+#endif
+ unsigned int refcnt;
+ unsigned int entries;
+ struct list_head lru_list;
+ struct list_head iphash[0];
};

-/* Our current list of addresses we have recently seen.
- * Only added to on a --set, and only updated on --set || --update
- */
-static struct recent_ip_tables *r_tables = NULL;
-
-/* We protect r_list with this spinlock so two processors are not modifying
- * the list at the same time.
- */
+static LIST_HEAD(tables);
static DEFINE_SPINLOCK(recent_lock);

#ifdef CONFIG_PROC_FS
-/* Our /proc/net/ipt_recent entry */
-static struct proc_dir_entry *proc_net_ipt_recent = NULL;
+static struct proc_dir_entry *proc_dir;
+static struct file_operations recent_fops;
#endif

-/* Function declaration for later. */
-static int
-match(const struct sk_buff *skb,
- const struct net_device *in,
- const struct net_device *out,
- const struct xt_match *match,
- const void *matchinfo,
- int offset,
- unsigned int protoff,
- int *hotdrop);
-
-/* Function to hash a given address into the hash table of table_size size */
-static int hash_func(unsigned int addr, int table_size)
-{
- int result = 0;
- unsigned int value = addr;
- do { result ^= value; } while((value >>= HASH_LOG));
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": %d = hash_func(%u,%d)\n",
- result & (table_size - 1),
- addr,
- table_size);
-#endif
+static u_int32_t hash_rnd;
+static int hash_rnd_initted;

- return(result & (table_size - 1));
-}
-
-#ifdef CONFIG_PROC_FS
-/* This is the function which produces the output for our /proc output
- * interface which lists each IP address, the last seen time and the
- * other recent times the address was seen.
- */
-
-static int ip_recent_get_info(char *buffer, char **start, off_t offset, int length, int *eof, void *data)
+static u_int32_t recent_entry_hash(u_int32_t addr)
{
- int len = 0, count, last_len = 0, pkt_count;
- off_t pos = 0;
- off_t begin = 0;
- struct recent_ip_tables *curr_table;
-
- curr_table = (struct recent_ip_tables*) data;
-
- spin_lock_bh(&curr_table->list_lock);
- for(count = 0; count < ip_list_tot; count++) {
- if(!curr_table->table[count].addr) continue;
- last_len = len;
- len += sprintf(buffer+len,"src=%u.%u.%u.%u ",NIPQUAD(curr_table->table[count].addr));
- len += sprintf(buffer+len,"ttl: %u ",curr_table->table[count].ttl);
- len += sprintf(buffer+len,"last_seen: %lu ",curr_table->table[count].last_seen);
- len += sprintf(buffer+len,"oldest_pkt: %u ",curr_table->table[count].oldest_pkt);
- len += sprintf(buffer+len,"last_pkts: %lu",curr_table->table[count].last_pkts[0]);
- for(pkt_count = 1; pkt_count < ip_pkt_list_tot; pkt_count++) {
- if(!curr_table->table[count].last_pkts[pkt_count]) break;
- len += sprintf(buffer+len,", %lu",curr_table->table[count].last_pkts[pkt_count]);
- }
- len += sprintf(buffer+len,"\n");
- pos = begin + len;
- if(pos < offset) { len = 0; begin = pos; }
- if(pos > offset + length) { len = last_len; break; }
+ if (!hash_rnd_initted) {
+ get_random_bytes(&hash_rnd, 4);
+ hash_rnd_initted = 1;
}
-
- *start = buffer + (offset - begin);
- len -= (offset - begin);
- if(len > length) len = length;
-
- spin_unlock_bh(&curr_table->list_lock);
- return len;
+ return jhash_1word(addr, hash_rnd) & (ip_list_hash_size - 1);
}

-/* ip_recent_ctrl provides an interface for users to modify the table
- * directly. This allows adding entries, removing entries, and
- * flushing the entire table.
- * This is done by opening up the appropriate table for writing and
- * sending one of:
- * xx.xx.xx.xx -- Add entry to table with current time
- * +xx.xx.xx.xx -- Add entry to table with current time
- * -xx.xx.xx.xx -- Remove entry from table
- * clear -- Flush table, remove all entries
- */
-
-static int ip_recent_ctrl(struct file *file, const char __user *input, unsigned long size, void *data)
+static struct recent_entry *
+recent_entry_lookup(const struct recent_table *table, u_int32_t addr, u_int8_t ttl)
{
- static const u_int32_t max[4] = { 0xffffffff, 0xffffff, 0xffff, 0xff };
- u_int32_t val;
- int base, used = 0;
- char c, *cp;
- union iaddr {
- uint8_t bytes[4];
- uint32_t word;
- } res;
- uint8_t *pp = res.bytes;
- int digit;
-
- char buffer[20];
- int len, check_set = 0, count;
- u_int32_t addr = 0;
- struct sk_buff *skb;
- struct ipt_recent_info *info;
- struct recent_ip_tables *curr_table;
-
- curr_table = (struct recent_ip_tables*) data;
-
- if(size > 20) len = 20; else len = size;
-
- if(copy_from_user(buffer,input,len)) return -EFAULT;
-
- if(len < 20) buffer[len] = '\0';
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": ip_recent_ctrl len: %d, input: `%.20s'\n",len,buffer);
-#endif
+ struct recent_entry *e;
+ unsigned int h;
+
+ h = recent_entry_hash(addr);
+ list_for_each_entry(e, &table->iphash[h], list)
+ if (e->addr == addr && (!ttl || !e->ttl || ttl == e->ttl))
+ return e;
+ return NULL;
+}

- cp = buffer;
- while(isspace(*cp)) { cp++; used++; if(used >= len-5) return used; }
+static void recent_entry_remove(struct recent_table *t, struct recent_entry *e)
+{
+ list_del(&e->list);
+ list_del(&e->lru_list);
+ kfree(e);
+ t->entries--;
+}

- /* Check if we are asked to flush the entire table */
- if(!memcmp(cp,"clear",5)) {
- used += 5;
- spin_lock_bh(&curr_table->list_lock);
- curr_table->time_pos = 0;
- for(count = 0; count < ip_list_hash_size; count++) {
- curr_table->hash_table[count] = -1;
- }
- for(count = 0; count < ip_list_tot; count++) {
- curr_table->table[count].last_seen = 0;
- curr_table->table[count].addr = 0;
- curr_table->table[count].ttl = 0;
- memset(curr_table->table[count].last_pkts,0,ip_pkt_list_tot*sizeof(unsigned long));
- curr_table->table[count].oldest_pkt = 0;
- curr_table->table[count].time_pos = 0;
- curr_table->time_info[count].position = count;
- curr_table->time_info[count].time = 0;
- }
- spin_unlock_bh(&curr_table->list_lock);
- return used;
- }
+static struct recent_entry *
+recent_entry_init(struct recent_table *t, u_int32_t addr, u_int8_t ttl)
+{
+ struct recent_entry *e;

- check_set = IPT_RECENT_SET;
- switch(*cp) {
- case '+': check_set = IPT_RECENT_SET; cp++; used++; break;
- case '-': check_set = IPT_RECENT_REMOVE; cp++; used++; break;
- default: if(!isdigit(*cp)) return (used+1); break;
+ if (t->entries >= ip_list_tot) {
+ e = list_entry(t->lru_list.next, struct recent_entry, lru_list);
+ recent_entry_remove(t, e);
}
+ e = kmalloc(sizeof(*e) + sizeof(e->stamps[0]) * ip_pkt_list_tot,
+ GFP_ATOMIC);
+ if (e == NULL)
+ return NULL;
+ e->addr = addr;
+ e->ttl = ttl;
+ e->stamps[0] = jiffies;
+ e->nstamps = 1;
+ e->index = 1;
+ list_add_tail(&e->list, &t->iphash[recent_entry_hash(addr)]);
+ list_add_tail(&e->lru_list, &t->lru_list);
+ t->entries++;
+ return e;
+}

-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": ip_recent_ctrl cp: `%c', check_set: %d\n",*cp,check_set);
-#endif
- /* Get addr (effectively inet_aton()) */
- /* Shamelessly stolen from libc, a function in the kernel for doing
- * this would, of course, be greatly preferred, but our options appear
- * to be rather limited, so we will just do it ourselves here.
- */
- res.word = 0;
-
- c = *cp;
- for(;;) {
- if(!isdigit(c)) return used;
- val = 0; base = 10; digit = 0;
- if(c == '0') {
- c = *++cp;
- if(c == 'x' || c == 'X') base = 16, c = *++cp;
- else { base = 8; digit = 1; }
- }
- for(;;) {
- if(isascii(c) && isdigit(c)) {
- if(base == 8 && (c == '8' || c == '0')) return used;
- val = (val * base) + (c - '0');
- c = *++cp;
- digit = 1;
- } else if(base == 16 && isascii(c) && isxdigit(c)) {
- val = (val << 4) | (c + 10 - (islower(c) ? 'a' : 'A'));
- c = *++cp;
- digit = 1;
- } else break;
- }
- if(c == '.') {
- if(pp > res.bytes + 2 || val > 0xff) return used;
- *pp++ = val;
- c = *++cp;
- } else break;
- }
- used = cp - buffer;
- if(c != '\0' && (!isascii(c) || !isspace(c))) return used;
- if(c == '\n') used++;
- if(!digit) return used;
-
- if(val > max[pp - res.bytes]) return used;
- addr = res.word | htonl(val);
+static void recent_entry_update(struct recent_entry *e)
+{
+ e->stamps[e->index++] = jiffies;
+ if (e->index > e->nstamps)
+ e->nstamps = e->index;
+ e->index %= ip_pkt_list_tot;
+}

- if(!addr && check_set == IPT_RECENT_SET) return used;
+static struct recent_table *recent_table_lookup(const char *name)
+{
+ struct recent_table *t;

-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": ip_recent_ctrl c: %c, addr: %u used: %d\n",c,addr,used);
-#endif
+ list_for_each_entry(t, &tables, list)
+ if (!strcmp(t->name, name))
+ return t;
+ return NULL;
+}

- /* Set up and just call match */
- info = kmalloc(sizeof(struct ipt_recent_info),GFP_KERNEL);
- if(!info) { return -ENOMEM; }
- info->seconds = 0;
- info->hit_count = 0;
- info->check_set = check_set;
- info->invert = 0;
- info->side = IPT_RECENT_SOURCE;
- strncpy(info->name,curr_table->name,IPT_RECENT_NAME_LEN);
- info->name[IPT_RECENT_NAME_LEN-1] = '\0';
-
- skb = kmalloc(sizeof(struct sk_buff),GFP_KERNEL);
- if (!skb) {
- used = -ENOMEM;
- goto out_free_info;
- }
- skb->nh.iph = kmalloc(sizeof(struct iphdr),GFP_KERNEL);
- if (!skb->nh.iph) {
- used = -ENOMEM;
- goto out_free_skb;
+static void recent_table_flush(struct recent_table *t)
+{
+ struct recent_entry *e, *next;
+ unsigned int i;
+
+ for (i = 0; i < ip_list_hash_size; i++) {
+ list_for_each_entry_safe(e, next, &t->iphash[i], list)
+ recent_entry_remove(t, e);
}
-
- skb->nh.iph->saddr = addr;
- skb->nh.iph->daddr = 0;
- /* Clear ttl since we have no way of knowing it */
- skb->nh.iph->ttl = 0;
- match(skb,NULL,NULL,NULL,info,0,0,NULL);
-
- kfree(skb->nh.iph);
-out_free_skb:
- kfree(skb);
-out_free_info:
- kfree(info);
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": Leaving ip_recent_ctrl addr: %u used: %d\n",addr,used);
-#endif
- return used;
}

-#endif /* CONFIG_PROC_FS */
-
-/* 'match' is our primary function, called by the kernel whenever a rule is
- * hit with our module as an option to it.
- * What this function does depends on what was specifically asked of it by
- * the user:
- * --set -- Add or update last seen time of the source address of the packet
- * -- matchinfo->check_set == IPT_RECENT_SET
- * --rcheck -- Just check if the source address is in the list
- * -- matchinfo->check_set == IPT_RECENT_CHECK
- * --update -- If the source address is in the list, update last_seen
- * -- matchinfo->check_set == IPT_RECENT_UPDATE
- * --remove -- If the source address is in the list, remove it
- * -- matchinfo->check_set == IPT_RECENT_REMOVE
- * --seconds -- Option to --rcheck/--update, only match if last_seen within seconds
- * -- matchinfo->seconds
- * --hitcount -- Option to --rcheck/--update, only match if seen hitcount times
- * -- matchinfo->hit_count
- * --seconds and --hitcount can be combined
- */
static int
-match(const struct sk_buff *skb,
- const struct net_device *in,
- const struct net_device *out,
- const struct xt_match *match,
- const void *matchinfo,
- int offset,
- unsigned int protoff,
- int *hotdrop)
+ipt_recent_match(const struct sk_buff *skb,
+ const struct net_device *in, const struct net_device *out,
+ const struct xt_match *match, const void *matchinfo,
+ int offset, unsigned int protoff, int *hotdrop)
{
- int pkt_count, hits_found, ans;
- unsigned long now;
const struct ipt_recent_info *info = matchinfo;
- u_int32_t addr = 0, time_temp;
- u_int8_t ttl = skb->nh.iph->ttl;
- int *hash_table;
- int orig_hash_result, hash_result, temp, location = 0, time_loc, end_collision_chain = -1;
- struct time_info_list *time_info;
- struct recent_ip_tables *curr_table;
- struct recent_ip_tables *last_table;
- struct recent_ip_list *r_list;
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match() called\n");
-#endif
-
- /* Default is false ^ info->invert */
- ans = info->invert;
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): name = '%s'\n",info->name);
-#endif
-
- /* if out != NULL then routing has been done and TTL changed.
- * We change it back here internally for match what came in before routing. */
- if(out) ttl++;
-
- /* Find the right table */
- spin_lock_bh(&recent_lock);
- curr_table = r_tables;
- while( (last_table = curr_table) && strncmp(info->name,curr_table->name,IPT_RECENT_NAME_LEN) && (curr_table = curr_table->next) );
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): table found('%s')\n",info->name);
-#endif
-
- spin_unlock_bh(&recent_lock);
-
- /* Table with this name not found, match impossible */
- if(!curr_table) { return ans; }
-
- /* Make sure no one is changing the list while we work with it */
- spin_lock_bh(&curr_table->list_lock);
-
- r_list = curr_table->table;
- if(info->side == IPT_RECENT_DEST) addr = skb->nh.iph->daddr; else addr = skb->nh.iph->saddr;
-
- if(!addr) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match() address (%u) invalid, leaving.\n",addr);
-#endif
- spin_unlock_bh(&curr_table->list_lock);
- return ans;
- }
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): checking table, addr: %u, ttl: %u, orig_ttl: %u\n",addr,ttl,skb->nh.iph->ttl);
-#endif
-
- /* Get jiffies now in case they changed while we were waiting for a lock */
- now = jiffies;
- hash_table = curr_table->hash_table;
- time_info = curr_table->time_info;
-
- orig_hash_result = hash_result = hash_func(addr,ip_list_hash_size);
- /* Hash entry at this result used */
- /* Check for TTL match if requested. If TTL is zero then a match would never
- * happen, so match regardless of existing TTL in that case. Zero means the
- * entry was added via the /proc interface anyway, so we will just use the
- * first TTL we get for that IP address. */
- if(info->check_set & IPT_RECENT_TTL) {
- while(hash_table[hash_result] != -1 && !(r_list[hash_table[hash_result]].addr == addr &&
- (!r_list[hash_table[hash_result]].ttl || r_list[hash_table[hash_result]].ttl == ttl))) {
- /* Collision in hash table */
- hash_result = (hash_result + 1) % ip_list_hash_size;
- }
- } else {
- while(hash_table[hash_result] != -1 && r_list[hash_table[hash_result]].addr != addr) {
- /* Collision in hash table */
- hash_result = (hash_result + 1) % ip_list_hash_size;
- }
- }
-
- if(hash_table[hash_result] == -1 && !(info->check_set & IPT_RECENT_SET)) {
- /* IP not in list and not asked to SET */
- spin_unlock_bh(&curr_table->list_lock);
- return ans;
+ struct recent_table *t;
+ struct recent_entry *e;
+ u_int32_t addr;
+ u_int8_t ttl;
+ int ret = info->invert;
+
+ if (info->side == IPT_RECENT_DEST)
+ addr = skb->nh.iph->daddr;
+ else
+ addr = skb->nh.iph->saddr;
+
+ ttl = 0;
+ if (info->check_set & IPT_RECENT_TTL) {
+ ttl = skb->nh.iph->ttl;
+ /* use TTL as seen before forwaring */
+ if (out && !skb->sk)
+ ttl++;
}

- /* Check if we need to handle the collision, do not need to on REMOVE */
- if(orig_hash_result != hash_result && !(info->check_set & IPT_RECENT_REMOVE)) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): Collision in hash table. (or: %d,hr: %d,oa: %u,ha: %u)\n",
- orig_hash_result,
- hash_result,
- r_list[hash_table[orig_hash_result]].addr,
- addr);
-#endif
-
- /* We had a collision.
- * orig_hash_result is where we started, hash_result is where we ended up.
- * So, swap them because we are likely to see the same guy again sooner */
-#ifdef DEBUG
- if(debug) {
- printk(KERN_INFO RECENT_NAME ": match(): Collision; hash_table[orig_hash_result] = %d\n",hash_table[orig_hash_result]);
- printk(KERN_INFO RECENT_NAME ": match(): Collision; r_list[hash_table[orig_hash_result]].hash_entry = %d\n",
- r_list[hash_table[orig_hash_result]].hash_entry);
- }
-#endif
-
- r_list[hash_table[orig_hash_result]].hash_entry = hash_result;
-
-
- temp = hash_table[orig_hash_result];
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): Collision; hash_table[hash_result] = %d\n",hash_table[hash_result]);
-#endif
- hash_table[orig_hash_result] = hash_table[hash_result];
- hash_table[hash_result] = temp;
- temp = hash_result;
- hash_result = orig_hash_result;
- orig_hash_result = temp;
- time_info[r_list[hash_table[orig_hash_result]].time_pos].position = hash_table[orig_hash_result];
- if(hash_table[hash_result] != -1) {
- r_list[hash_table[hash_result]].hash_entry = hash_result;
- time_info[r_list[hash_table[hash_result]].time_pos].position = hash_table[hash_result];
- }
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): Collision handled.\n");
-#endif
+ spin_lock_bh(&recent_lock);
+ t = recent_table_lookup(info->name);
+ e = recent_entry_lookup(t, addr, ttl);
+ if (e == NULL) {
+ if (!(info->check_set & IPT_RECENT_SET))
+ goto out;
+ e = recent_entry_init(t, addr, ttl);
+ if (e == NULL)
+ *hotdrop = 1;
+ ret ^= 1;
+ goto out;
}

- if(hash_table[hash_result] == -1) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): New table entry. (hr: %d,ha: %u)\n",
- hash_result, addr);
-#endif
-
- /* New item found and IPT_RECENT_SET, so we need to add it */
- location = time_info[curr_table->time_pos].position;
- hash_table[r_list[location].hash_entry] = -1;
- hash_table[hash_result] = location;
- memset(r_list[location].last_pkts,0,ip_pkt_list_tot*sizeof(unsigned long));
- r_list[location].time_pos = curr_table->time_pos;
- r_list[location].addr = addr;
- r_list[location].ttl = ttl;
- r_list[location].last_seen = now;
- r_list[location].oldest_pkt = 1;
- r_list[location].last_pkts[0] = now;
- r_list[location].hash_entry = hash_result;
- time_info[curr_table->time_pos].time = r_list[location].last_seen;
- curr_table->time_pos = (curr_table->time_pos + 1) % ip_list_tot;
-
- ans = !info->invert;
- } else {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): Existing table entry. (hr: %d,ha: %u)\n",
- hash_result,
- addr);
-#endif
-
- /* Existing item found */
- location = hash_table[hash_result];
- /* We have a match on address, now to make sure it meets all requirements for a
- * full match. */
- if(info->check_set & IPT_RECENT_CHECK || info->check_set & IPT_RECENT_UPDATE) {
- if(!info->seconds && !info->hit_count) ans = !info->invert; else ans = info->invert;
- if(info->seconds && !info->hit_count) {
- if(time_before_eq(now,r_list[location].last_seen+info->seconds*HZ)) ans = !info->invert; else ans = info->invert;
- }
- if(info->seconds && info->hit_count) {
- for(pkt_count = 0, hits_found = 0; pkt_count < ip_pkt_list_tot; pkt_count++) {
- if(r_list[location].last_pkts[pkt_count] == 0) break;
- if(time_before_eq(now,r_list[location].last_pkts[pkt_count]+info->seconds*HZ)) hits_found++;
- }
- if(hits_found >= info->hit_count) ans = !info->invert; else ans = info->invert;
- }
- if(info->hit_count && !info->seconds) {
- for(pkt_count = 0, hits_found = 0; pkt_count < ip_pkt_list_tot; pkt_count++) {
- if(r_list[location].last_pkts[pkt_count] == 0) break;
- hits_found++;
- }
- if(hits_found >= info->hit_count) ans = !info->invert; else ans = info->invert;
- }
- }
-#ifdef DEBUG
- if(debug) {
- if(ans)
- printk(KERN_INFO RECENT_NAME ": match(): match addr: %u\n",addr);
- else
- printk(KERN_INFO RECENT_NAME ": match(): no match addr: %u\n",addr);
- }
-#endif
-
- /* If and only if we have been asked to SET, or to UPDATE (on match) do we add the
- * current timestamp to the last_seen. */
- if((info->check_set & IPT_RECENT_SET && (ans = !info->invert)) || (info->check_set & IPT_RECENT_UPDATE && ans)) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): SET or UPDATE; updating time info.\n");
-#endif
- /* Have to update our time info */
- time_loc = r_list[location].time_pos;
- time_info[time_loc].time = now;
- time_info[time_loc].position = location;
- while((time_info[(time_loc+1) % ip_list_tot].time < time_info[time_loc].time) && ((time_loc+1) % ip_list_tot) != curr_table->time_pos) {
- time_temp = time_info[time_loc].time;
- time_info[time_loc].time = time_info[(time_loc+1)%ip_list_tot].time;
- time_info[(time_loc+1)%ip_list_tot].time = time_temp;
- time_temp = time_info[time_loc].position;
- time_info[time_loc].position = time_info[(time_loc+1)%ip_list_tot].position;
- time_info[(time_loc+1)%ip_list_tot].position = time_temp;
- r_list[time_info[time_loc].position].time_pos = time_loc;
- r_list[time_info[(time_loc+1)%ip_list_tot].position].time_pos = (time_loc+1)%ip_list_tot;
- time_loc = (time_loc+1) % ip_list_tot;
- }
- r_list[location].time_pos = time_loc;
- r_list[location].ttl = ttl;
- r_list[location].last_pkts[r_list[location].oldest_pkt] = now;
- r_list[location].oldest_pkt = ++r_list[location].oldest_pkt % ip_pkt_list_tot;
- r_list[location].last_seen = now;
- }
- /* If we have been asked to remove the entry from the list, just set it to 0 */
- if(info->check_set & IPT_RECENT_REMOVE) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): REMOVE; clearing entry (or: %d, hr: %d).\n",orig_hash_result,hash_result);
-#endif
- /* Check if this is part of a collision chain */
- while(hash_table[(orig_hash_result+1) % ip_list_hash_size] != -1) {
- orig_hash_result++;
- if(hash_func(r_list[hash_table[orig_hash_result]].addr,ip_list_hash_size) == hash_result) {
- /* Found collision chain, how deep does this rabbit hole go? */
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): REMOVE; found collision chain.\n");
-#endif
- end_collision_chain = orig_hash_result;
- }
- }
- if(end_collision_chain != -1) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): REMOVE; part of collision chain, moving to end.\n");
-#endif
- /* Part of a collision chain, swap it with the end of the chain
- * before removing. */
- r_list[hash_table[end_collision_chain]].hash_entry = hash_result;
- temp = hash_table[end_collision_chain];
- hash_table[end_collision_chain] = hash_table[hash_result];
- hash_table[hash_result] = temp;
- time_info[r_list[hash_table[hash_result]].time_pos].position = hash_table[hash_result];
- hash_result = end_collision_chain;
- r_list[hash_table[hash_result]].hash_entry = hash_result;
- time_info[r_list[hash_table[hash_result]].time_pos].position = hash_table[hash_result];
- }
- location = hash_table[hash_result];
- hash_table[r_list[location].hash_entry] = -1;
- time_loc = r_list[location].time_pos;
- time_info[time_loc].time = 0;
- time_info[time_loc].position = location;
- while((time_info[(time_loc+1) % ip_list_tot].time < time_info[time_loc].time) && ((time_loc+1) % ip_list_tot) != curr_table->time_pos) {
- time_temp = time_info[time_loc].time;
- time_info[time_loc].time = time_info[(time_loc+1)%ip_list_tot].time;
- time_info[(time_loc+1)%ip_list_tot].time = time_temp;
- time_temp = time_info[time_loc].position;
- time_info[time_loc].position = time_info[(time_loc+1)%ip_list_tot].position;
- time_info[(time_loc+1)%ip_list_tot].position = time_temp;
- r_list[time_info[time_loc].position].time_pos = time_loc;
- r_list[time_info[(time_loc+1)%ip_list_tot].position].time_pos = (time_loc+1)%ip_list_tot;
- time_loc = (time_loc+1) % ip_list_tot;
+ if (info->check_set & IPT_RECENT_SET)
+ ret ^= 1;
+ else if (info->check_set & IPT_RECENT_REMOVE) {
+ recent_entry_remove(t, e);
+ ret ^= 1;
+ } else if (info->check_set & (IPT_RECENT_CHECK | IPT_RECENT_UPDATE)) {
+ unsigned long t = jiffies - info->seconds * HZ;
+ unsigned int i, hits = 0;
+
+ for (i = 0; i < e->nstamps; i++) {
+ if (info->seconds && time_after(t, e->stamps[i]))
+ continue;
+ if (!info->hit_count || ++hits >= info->hit_count) {
+ ret ^= 1;
+ break;
}
- r_list[location].time_pos = time_loc;
- r_list[location].last_seen = 0;
- r_list[location].addr = 0;
- r_list[location].ttl = 0;
- memset(r_list[location].last_pkts,0,ip_pkt_list_tot*sizeof(unsigned long));
- r_list[location].oldest_pkt = 0;
- ans = !info->invert;
}
- spin_unlock_bh(&curr_table->list_lock);
- return ans;
}

- spin_unlock_bh(&curr_table->list_lock);
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match() left.\n");
-#endif
- return ans;
+ if (info->check_set & IPT_RECENT_SET ||
+ (info->check_set & IPT_RECENT_UPDATE && ret)) {
+ recent_entry_update(e);
+ if (info->check_set & IPT_RECENT_TTL)
+ e->ttl = ttl;
+ }
+out:
+ spin_unlock_bh(&recent_lock);
+ return ret;
}

-/* This function is to verify that the rule given during the userspace iptables
- * command is correct.
- * If the command is valid then we check if the table name referred to by the
- * rule exists, if not it is created.
- */
static int
-checkentry(const char *tablename,
- const void *ip,
- const struct xt_match *match,
- void *matchinfo,
- unsigned int matchsize,
- unsigned int hook_mask)
+ipt_recent_checkentry(const char *tablename, const void *ip,
+ const struct xt_match *match, void *matchinfo,
+ unsigned int matchsize, unsigned int hook_mask)
{
- int flag = 0, c;
- unsigned long *hold;
const struct ipt_recent_info *info = matchinfo;
- struct recent_ip_tables *curr_table, *find_table, *last_table;
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry() entered.\n");
-#endif
+ struct recent_table *t;
+ unsigned i;
+ int ret = 0;

- /* seconds and hit_count only valid for CHECK/UPDATE */
- if(info->check_set & IPT_RECENT_SET) { flag++; if(info->seconds || info->hit_count) return 0; }
- if(info->check_set & IPT_RECENT_REMOVE) { flag++; if(info->seconds || info->hit_count) return 0; }
- if(info->check_set & IPT_RECENT_CHECK) flag++;
- if(info->check_set & IPT_RECENT_UPDATE) flag++;
-
- /* One and only one of these should ever be set */
- if(flag != 1) return 0;
-
- /* Name must be set to something */
- if(!info->name || !info->name[0]) return 0;
+ if (hweight8(info->check_set &
+ (IPT_RECENT_SET | IPT_RECENT_REMOVE |
+ IPT_RECENT_CHECK | IPT_RECENT_UPDATE)) != 1)
+ return 0;
+ if (info->check_set & (IPT_RECENT_SET | IPT_RECENT_REMOVE) &&
+ (info->seconds || info->hit_count))
+ return 0;
+ if (info->name[0] == '\0' ||
+ strnlen(info->name, IPT_RECENT_NAME_LEN) == IPT_RECENT_NAME_LEN)
+ return 0;

- /* Things look good, create a list for this if it does not exist */
- /* Lock the linked list while we play with it */
spin_lock_bh(&recent_lock);
-
- /* Look for an entry with this name already created */
- /* Finds the end of the list and the entry before the end if current name does not exist */
- find_table = r_tables;
- while( (last_table = find_table) && strncmp(info->name,find_table->name,IPT_RECENT_NAME_LEN) && (find_table = find_table->next) );
-
- /* If a table already exists just increment the count on that table and return */
- if(find_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: table found (%s), incrementing count.\n",info->name);
-#endif
- find_table->count++;
- spin_unlock_bh(&recent_lock);
- return 1;
+ t = recent_table_lookup(info->name);
+ if (t != NULL) {
+ t->refcnt++;
+ ret = 1;
+ goto out;
}

- spin_unlock_bh(&recent_lock);
-
- /* Table with this name not found */
- /* Allocate memory for new linked list item */
-
-#ifdef DEBUG
- if(debug) {
- printk(KERN_INFO RECENT_NAME ": checkentry: no table found (%s)\n",info->name);
- printk(KERN_INFO RECENT_NAME ": checkentry: Allocationg %d for link-list entry.\n",sizeof(struct recent_ip_tables));
+ t = kzalloc(sizeof(*t) + sizeof(t->iphash[0]) * ip_list_hash_size,
+ GFP_ATOMIC);
+ if (t == NULL)
+ goto out;
+ strcpy(t->name, info->name);
+ INIT_LIST_HEAD(&t->lru_list);
+ for (i = 0; i < ip_list_hash_size; i++)
+ INIT_LIST_HEAD(&t->iphash[i]);
+#ifdef CONFIG_PROC_FS
+ t->proc = create_proc_entry(t->name, ip_list_perms, proc_dir);
+ if (t->proc == NULL) {
+ kfree(t);
+ goto out;
}
+ t->proc->proc_fops = &recent_fops;
+ t->proc->data = t;
#endif
+ list_add_tail(&t->list, &tables);
+ ret = 1;
+out:
+ spin_unlock_bh(&recent_lock);
+ return ret;
+}

- curr_table = vmalloc(sizeof(struct recent_ip_tables));
- if(curr_table == NULL) return 0;
-
- spin_lock_init(&curr_table->list_lock);
- curr_table->next = NULL;
- curr_table->count = 1;
- curr_table->time_pos = 0;
- strncpy(curr_table->name,info->name,IPT_RECENT_NAME_LEN);
- curr_table->name[IPT_RECENT_NAME_LEN-1] = '\0';
-
- /* Allocate memory for this table and the list of packets in each entry. */
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: Allocating %d for table (%s).\n",
- sizeof(struct recent_ip_list)*ip_list_tot,
- info->name);
-#endif
-
- curr_table->table = vmalloc(sizeof(struct recent_ip_list)*ip_list_tot);
- if(curr_table->table == NULL) { vfree(curr_table); return 0; }
- memset(curr_table->table,0,sizeof(struct recent_ip_list)*ip_list_tot);
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: Allocating %d for pkt_list.\n",
- sizeof(unsigned long)*ip_pkt_list_tot*ip_list_tot);
-#endif
-
- hold = vmalloc(sizeof(unsigned long)*ip_pkt_list_tot*ip_list_tot);
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: After pkt_list allocation.\n");
-#endif
- if(hold == NULL) {
- printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for pkt_list.\n");
- vfree(curr_table->table);
- vfree(curr_table);
- return 0;
- }
- for(c = 0; c < ip_list_tot; c++) {
- curr_table->table[c].last_pkts = hold + c*ip_pkt_list_tot;
- }
+static void
+ipt_recent_destroy(const struct xt_match *match, void *matchinfo,
+ unsigned int matchsize)
+{
+ const struct ipt_recent_info *info = matchinfo;
+ struct recent_table *t;

- /* Allocate memory for the hash table */
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: Allocating %d for hash_table.\n",
- sizeof(int)*ip_list_hash_size);
+ spin_lock_bh(&recent_lock);
+ t = recent_table_lookup(info->name);
+ if (--t->refcnt == 0) {
+ list_del(&t->list);
+ recent_table_flush(t);
+#ifdef CONFIG_PROC_FS
+ remove_proc_entry(t->name, proc_dir);
#endif
-
- curr_table->hash_table = vmalloc(sizeof(int)*ip_list_hash_size);
- if(!curr_table->hash_table) {
- printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for hash_table.\n");
- vfree(hold);
- vfree(curr_table->table);
- vfree(curr_table);
- return 0;
- }
-
- for(c = 0; c < ip_list_hash_size; c++) {
- curr_table->hash_table[c] = -1;
+ kfree(t);
}
+ spin_unlock_bh(&recent_lock);
+}

- /* Allocate memory for the time info */
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: Allocating %d for time_info.\n",
- sizeof(struct time_info_list)*ip_list_tot);
-#endif
+#ifdef CONFIG_PROC_FS
+struct recent_iter_state {
+ struct recent_table *table;
+ unsigned int bucket;
+};

- curr_table->time_info = vmalloc(sizeof(struct time_info_list)*ip_list_tot);
- if(!curr_table->time_info) {
- printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for time_info.\n");
- vfree(curr_table->hash_table);
- vfree(hold);
- vfree(curr_table->table);
- vfree(curr_table);
- return 0;
- }
- for(c = 0; c < ip_list_tot; c++) {
- curr_table->time_info[c].position = c;
- curr_table->time_info[c].time = 0;
- }
+static void *recent_seq_start(struct seq_file *seq, loff_t *pos)
+{
+ struct recent_iter_state *st = seq->private;
+ struct recent_table *t = st->table;
+ struct recent_entry *e;
+ loff_t p = *pos;

- /* Put the new table in place */
spin_lock_bh(&recent_lock);
- find_table = r_tables;
- while( (last_table = find_table) && strncmp(info->name,find_table->name,IPT_RECENT_NAME_LEN) && (find_table = find_table->next) );
-
- /* If a table already exists just increment the count on that table and return */
- if(find_table) {
- find_table->count++;
- spin_unlock_bh(&recent_lock);
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: table found (%s), created by other process.\n",info->name);
-#endif
- vfree(curr_table->time_info);
- vfree(curr_table->hash_table);
- vfree(hold);
- vfree(curr_table->table);
- vfree(curr_table);
- return 1;
- }
- if(!last_table) r_tables = curr_table; else last_table->next = curr_table;
-
- spin_unlock_bh(&recent_lock);

-#ifdef CONFIG_PROC_FS
- /* Create our proc 'status' entry. */
- curr_table->status_proc = create_proc_entry(curr_table->name, ip_list_perms, proc_net_ipt_recent);
- if (!curr_table->status_proc) {
- printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for /proc entry.\n");
- /* Destroy the created table */
- spin_lock_bh(&recent_lock);
- last_table = NULL;
- curr_table = r_tables;
- if(!curr_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry() create_proc failed, no tables.\n");
-#endif
- spin_unlock_bh(&recent_lock);
- return 0;
- }
- while( strncmp(info->name,curr_table->name,IPT_RECENT_NAME_LEN) && (last_table = curr_table) && (curr_table = curr_table->next) );
- if(!curr_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry() create_proc failed, table already destroyed.\n");
-#endif
- spin_unlock_bh(&recent_lock);
- return 0;
+ for (st->bucket = 0; st->bucket < ip_list_hash_size; st->bucket++) {
+ list_for_each_entry(e, &t->iphash[st->bucket], list) {
+ if (p-- == 0)
+ return e;
}
- if(last_table) last_table->next = curr_table->next; else r_tables = curr_table->next;
- spin_unlock_bh(&recent_lock);
- vfree(curr_table->time_info);
- vfree(curr_table->hash_table);
- vfree(hold);
- vfree(curr_table->table);
- vfree(curr_table);
- return 0;
}
-
- curr_table->status_proc->owner = THIS_MODULE;
- curr_table->status_proc->data = curr_table;
- wmb();
- curr_table->status_proc->read_proc = ip_recent_get_info;
- curr_table->status_proc->write_proc = ip_recent_ctrl;
-#endif /* CONFIG_PROC_FS */
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry() left.\n");
-#endif
+ return NULL;
+}

- return 1;
+static void *recent_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+ struct recent_iter_state *st = seq->private;
+ struct recent_table *t = st->table;
+ struct recent_entry *e = v;
+ struct list_head *head = e->list.next;
+
+ while (head == &t->iphash[st->bucket]) {
+ if (++st->bucket >= ip_list_hash_size)
+ return NULL;
+ head = t->iphash[st->bucket].next;
+ }
+ (*pos)++;
+ return list_entry(head, struct recent_entry, list);
}

-/* This function is called in the event that a rule matching this module is
- * removed.
- * When this happens we need to check if there are no other rules matching
- * the table given. If that is the case then we remove the table and clean
- * up its memory.
- */
-static void
-destroy(const struct xt_match *match, void *matchinfo, unsigned int matchsize)
+static void recent_seq_stop(struct seq_file *s, void *v)
{
- const struct ipt_recent_info *info = matchinfo;
- struct recent_ip_tables *curr_table, *last_table;
+ spin_unlock_bh(&recent_lock);
+}

-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() entered.\n");
-#endif
+static int recent_seq_show(struct seq_file *seq, void *v)
+{
+ struct recent_entry *e = v;
+ unsigned int i;
+
+ i = (e->index - 1) % ip_pkt_list_tot;
+ seq_printf(seq, "src=%u.%u.%u.%u ttl: %u last_seen: %lu oldest_pkt: %u",
+ NIPQUAD(e->addr), e->ttl, e->stamps[i], e->index);
+ for (i = 0; i < e->nstamps; i++)
+ seq_printf(seq, "%s %lu", i ? "," : "", e->stamps[i]);
+ seq_printf(seq, "\n");
+ return 0;
+}

- if(matchsize != IPT_ALIGN(sizeof(struct ipt_recent_info))) return;
+static struct seq_operations recent_seq_ops = {
+ .start = recent_seq_start,
+ .next = recent_seq_next,
+ .stop = recent_seq_stop,
+ .show = recent_seq_show,
+};

- /* Lock the linked list while we play with it */
- spin_lock_bh(&recent_lock);
+static int recent_seq_open(struct inode *inode, struct file *file)
+{
+ struct proc_dir_entry *pde = PDE(inode);
+ struct seq_file *seq;
+ struct recent_iter_state *st;
+ int ret;
+
+ st = kzalloc(sizeof(*st), GFP_KERNEL);
+ if (st == NULL)
+ return -ENOMEM;
+ ret = seq_open(file, &recent_seq_ops);
+ if (ret)
+ kfree(st);
+ st->table = pde->data;
+ seq = file->private_data;
+ seq->private = st;
+ return ret;
+}

- /* Look for an entry with this name already created */
- /* Finds the end of the list and the entry before the end if current name does not exist */
- last_table = NULL;
- curr_table = r_tables;
- if(!curr_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() No tables found, leaving.\n");
-#endif
+static ssize_t recent_proc_write(struct file *file, const char __user *input,
+ size_t size, loff_t *loff)
+{
+ struct proc_dir_entry *pde = PDE(file->f_dentry->d_inode);
+ struct recent_table *t = pde->data;
+ struct recent_entry *e;
+ char buf[sizeof("+255.255.255.255")], *c = buf;
+ u_int32_t addr;
+ int add;
+
+ if (size > sizeof(buf))
+ size = sizeof(buf);
+ if (copy_from_user(buf, input, size))
+ return -EFAULT;
+ while (isspace(*c))
+ c++;
+
+ if (size - (c - buf) < 5)
+ return c - buf;
+ if (!memcmp(c, "clear", 5)) {
+ spin_lock_bh(&recent_lock);
+ recent_table_flush(t);
spin_unlock_bh(&recent_lock);
- return;
+ return c - buf;
}
- while( strncmp(info->name,curr_table->name,IPT_RECENT_NAME_LEN) && (last_table = curr_table) && (curr_table = curr_table->next) );

- /* If a table does not exist then do nothing and return */
- if(!curr_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() table not found, leaving.\n");
-#endif
- spin_unlock_bh(&recent_lock);
- return;
+ switch (*c) {
+ case '-':
+ add = 0;
+ c++;
+ break;
+ case '+':
+ c++;
+ default:
+ add = 1;
+ break;
}
+ addr = in_aton(c);

- curr_table->count--;
-
- /* If count is still non-zero then there are still rules referenceing it so we do nothing */
- if(curr_table->count) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() table found, non-zero count, leaving.\n");
-#endif
- spin_unlock_bh(&recent_lock);
- return;
+ spin_lock_bh(&recent_lock);
+ e = recent_entry_lookup(t, addr, 0);
+ if (e == NULL) {
+ if (add)
+ recent_entry_init(t, addr, 0);
+ } else {
+ if (add)
+ recent_entry_update(e);
+ else
+ recent_entry_remove(t, e);
}
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() table found, zero count, removing.\n");
-#endif
-
- /* Count must be zero so we remove this table from the list */
- if(last_table) last_table->next = curr_table->next; else r_tables = curr_table->next;
-
spin_unlock_bh(&recent_lock);
+ return size;
+}

- /* lock to make sure any late-runners still using this after we removed it from
- * the list finish up then remove everything */
- spin_lock_bh(&curr_table->list_lock);
- spin_unlock_bh(&curr_table->list_lock);
-
-#ifdef CONFIG_PROC_FS
- if(curr_table->status_proc) remove_proc_entry(curr_table->name,proc_net_ipt_recent);
+static struct file_operations recent_fops = {
+ .open = recent_seq_open,
+ .read = seq_read,
+ .write = recent_proc_write,
+ .release = seq_release_private,
+ .owner = THIS_MODULE,
+};
#endif /* CONFIG_PROC_FS */
- vfree(curr_table->table[0].last_pkts);
- vfree(curr_table->table);
- vfree(curr_table->hash_table);
- vfree(curr_table->time_info);
- vfree(curr_table);
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() left.\n");
-#endif

- return;
-}
-
-/* This is the structure we pass to ipt_register to register our
- * module with iptables.
- */
static struct ipt_match recent_match = {
.name = "recent",
- .match = match,
+ .match = ipt_recent_match,
.matchsize = sizeof(struct ipt_recent_info),
- .checkentry = checkentry,
- .destroy = destroy,
- .me = THIS_MODULE
+ .checkentry = ipt_recent_checkentry,
+ .destroy = ipt_recent_destroy,
+ .me = THIS_MODULE,
};

-/* Kernel module initialization. */
static int __init ipt_recent_init(void)
{
- int err, count;
+ int err;

- printk(version);
-#ifdef CONFIG_PROC_FS
- proc_net_ipt_recent = proc_mkdir("ipt_recent",proc_net);
- if(!proc_net_ipt_recent) return -ENOMEM;
-#endif
-
- if(ip_list_hash_size && ip_list_hash_size <= ip_list_tot) {
- printk(KERN_WARNING RECENT_NAME ": ip_list_hash_size too small, resetting to default.\n");
- ip_list_hash_size = 0;
- }
-
- if(!ip_list_hash_size) {
- ip_list_hash_size = ip_list_tot*3;
- count = 2*2;
- while(ip_list_hash_size > count) count = count*2;
- ip_list_hash_size = count;
- }
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": ip_list_hash_size: %d\n",ip_list_hash_size);
-#endif
+ if (!ip_list_tot || !ip_pkt_list_tot)
+ return -EINVAL;
+ ip_list_hash_size = 1 << fls(ip_list_tot);

err = ipt_register_match(&recent_match);
+#ifdef CONFIG_PROC_FS
if (err)
- remove_proc_entry("ipt_recent", proc_net);
+ return err;
+ proc_dir = proc_mkdir("ipt_recent", proc_net);
+ if (proc_dir == NULL) {
+ ipt_unregister_match(&recent_match);
+ err = -ENOMEM;
+ }
+#endif
return err;
}

-/* Kernel module destruction. */
-static void __exit ipt_recent_fini(void)
+static void __exit ipt_recent_exit(void)
{
ipt_unregister_match(&recent_match);
-
- remove_proc_entry("ipt_recent",proc_net);
+#ifdef CONFIG_PROC_FS
+ remove_proc_entry("ipt_recent", proc_net);
+#endif
}

-/* Register our module with the kernel. */
module_init(ipt_recent_init);
-module_exit(ipt_recent_fini);
+module_exit(ipt_recent_exit);


Attachments:
x (45.72 kB)

2006-05-15 20:41:45

by Stephen Frost

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

* Patrick McHardy ([email protected]) wrote:
> This is the updated patch, it changes the eviction strategy
> to LRU and fixes a bug related to TTL handling, the TTL stored
> in the entry should only be overwritten if the IPT_RECENT_TTL
> flag is set.

This looks like least-recently-added as opposed to least-recently-used
(or, really, least-recently-updated). Not sure how you move an entry in
the lru list (perhaps just delete/add?) but I'm pretty sure
recent_entry_update() needs to be modified to move the updated entry to
the end of the list for correct operation.

You also don't appear to check if 't' (the table following the
recent_table_lookup() call) is valid in the 'match' (around
line 191). recent_entry_lookup() doesn't check that either. It seems
like you should be guarenteed to always get a table back but it might be
prudent to check anyway.

I thought that I had convinced myself that the TTL handling was okay and
that where it was overwritten wasn't harmful. Oh well.

Thanks,

Stephen


Attachments:
(No filename) (0.99 kB)
signature.asc (189.00 B)
Digital signature
Download all attachments

2006-05-15 20:45:06

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

Stephen Frost wrote:
> * Patrick McHardy ([email protected]) wrote:
>
>>This is the updated patch, it changes the eviction strategy
>>to LRU and fixes a bug related to TTL handling, the TTL stored
>>in the entry should only be overwritten if the IPT_RECENT_TTL
>>flag is set.
>
>
> This looks like least-recently-added as opposed to least-recently-used
> (or, really, least-recently-updated). Not sure how you move an entry in
> the lru list (perhaps just delete/add?) but I'm pretty sure
> recent_entry_update() needs to be modified to move the updated entry to
> the end of the list for correct operation.


Good point, I'll fix the patch.

> You also don't appear to check if 't' (the table following the
> recent_table_lookup() call) is valid in the 'match' (around
> line 191). recent_entry_lookup() doesn't check that either. It seems
> like you should be guarenteed to always get a table back but it might be
> prudent to check anyway.


It is guaranteed that we will get a valid table back, otherwise
there must be a serious bug somewhere else, in which case I
prefer to crash instead of hiding it away.

2006-05-15 21:03:43

by Stephen Frost

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

* Stephen Frost ([email protected]) wrote:
> * Patrick McHardy ([email protected]) wrote:
> > This is the updated patch, it changes the eviction strategy
> > to LRU and fixes a bug related to TTL handling, the TTL stored
> > in the entry should only be overwritten if the IPT_RECENT_TTL
> > flag is set.
>
> I thought that I had convinced myself that the TTL handling was okay and
> that where it was overwritten wasn't harmful. Oh well.

Looking at this again... The ttl isn't copied into 'ttl' unless the
check_set has TTL turned on. This means that the overwritting was fine,
if you accept that you can only ever match on TTL, or never match on it.
That doesn't seem right to me. The TTL in the table should always be
kept up-to-date and the only question is if the current rule requires it
for a match or not. This isn't a huge change, just set the local
variable always but check for if it's asked to match before calling the
lookup. Or you could move it into an if/else block.

Thanks,

Stephen


Attachments:
(No filename) (0.99 kB)
signature.asc (189.00 B)
Digital signature
Download all attachments

2006-05-17 06:26:09

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

[NETFILTER]: Replace ipt_recent module

Replace the totally unmaintainable ipt_recent module by a rewritten
version that should be fully compatible.

Signed-off-by: Patrick McHardy <[email protected]>

---
commit 791489887d0984df96ac098707993bf01a4804a9
tree ea9a218b86d9a922ba69f9ae87cca826d9b52d87
parent d8c3291c73b958243b33f8509d4507e76dafd055
author Patrick McHardy <[email protected]> Wed, 17 May 2006 08:22:14 +0200
committer Patrick McHardy <[email protected]> Wed, 17 May 2006 08:22:14 +0200

net/ipv4/netfilter/ipt_recent.c | 1268 ++++++++++++---------------------------
1 files changed, 377 insertions(+), 891 deletions(-)

diff --git a/net/ipv4/netfilter/ipt_recent.c b/net/ipv4/netfilter/ipt_recent.c
index 1438432..9dc4dea 100644
--- a/net/ipv4/netfilter/ipt_recent.c
+++ b/net/ipv4/netfilter/ipt_recent.c
@@ -1,1007 +1,493 @@
-/* Kernel module to check if the source address has been seen recently. */
-/* Copyright 2002-2003, Stephen Frost, 2.5.x port by [email protected] */
-/* Author: Stephen Frost <[email protected]> */
-/* Project Page: http://snowman.net/projects/ipt_recent/ */
-/* This software is distributed under the terms of the GPL, Version 2 */
-/* This copyright does not cover user programs that use kernel services
- * by normal system calls. */
-
-#include <linux/module.h>
-#include <linux/skbuff.h>
+/*
+ * Copyright (c) 2006 Patrick McHardy <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This is a replacement of the old ipt_recent module, which carried the
+ * following copyright notice:
+ *
+ * Author: Stephen Frost <[email protected]>
+ * Copyright 2002-2003, Stephen Frost, 2.5.x port by [email protected]
+ */
+#include <linux/init.h>
+#include <linux/moduleparam.h>
#include <linux/proc_fs.h>
-#include <linux/spinlock.h>
-#include <linux/interrupt.h>
-#include <asm/uaccess.h>
+#include <linux/seq_file.h>
+#include <linux/string.h>
#include <linux/ctype.h>
-#include <linux/ip.h>
-#include <linux/vmalloc.h>
-#include <linux/moduleparam.h>
+#include <linux/list.h>
+#include <linux/random.h>
+#include <linux/jhash.h>
+#include <linux/bitops.h>
+#include <linux/skbuff.h>
+#include <linux/inet.h>

#include <linux/netfilter_ipv4/ip_tables.h>
#include <linux/netfilter_ipv4/ipt_recent.h>

-#undef DEBUG
-#define HASH_LOG 9
+MODULE_AUTHOR("Patrick McHardy <[email protected]>");
+MODULE_DESCRIPTION("IP tables recently seen matching module");
+MODULE_LICENSE("GPL");

-/* Defaults, these can be overridden on the module command-line. */
static unsigned int ip_list_tot = 100;
static unsigned int ip_pkt_list_tot = 20;
static unsigned int ip_list_hash_size = 0;
static unsigned int ip_list_perms = 0644;
-#ifdef DEBUG
-static int debug = 1;
-#endif
-
-static char version[] =
-KERN_INFO RECENT_NAME " " RECENT_VER ": Stephen Frost <[email protected]>. http://snowman.net/projects/ipt_recent/\n";
-
-MODULE_AUTHOR("Stephen Frost <[email protected]>");
-MODULE_DESCRIPTION("IP tables recently seen matching module " RECENT_VER);
-MODULE_LICENSE("GPL");
module_param(ip_list_tot, uint, 0400);
module_param(ip_pkt_list_tot, uint, 0400);
module_param(ip_list_hash_size, uint, 0400);
module_param(ip_list_perms, uint, 0400);
-#ifdef DEBUG
-module_param(debug, bool, 0600);
-MODULE_PARM_DESC(debug,"enable debugging output");
-#endif
-MODULE_PARM_DESC(ip_list_tot,"number of IPs to remember per list");
-MODULE_PARM_DESC(ip_pkt_list_tot,"number of packets per IP to remember");
-MODULE_PARM_DESC(ip_list_hash_size,"size of hash table used to look up IPs");
-MODULE_PARM_DESC(ip_list_perms,"permissions on /proc/net/ipt_recent/* files");
-
-/* Structure of our list of recently seen addresses. */
-struct recent_ip_list {
- u_int32_t addr;
- u_int8_t ttl;
- unsigned long last_seen;
- unsigned long *last_pkts;
- u_int32_t oldest_pkt;
- u_int32_t hash_entry;
- u_int32_t time_pos;
-};
-
-struct time_info_list {
- u_int32_t position;
- u_int32_t time;
+MODULE_PARM_DESC(ip_list_tot, "number of IPs to remember per list");
+MODULE_PARM_DESC(ip_pkt_list_tot, "number of packets per IP to remember");
+MODULE_PARM_DESC(ip_list_hash_size, "size of hash table used to look up IPs");
+MODULE_PARM_DESC(ip_list_perms, "permissions on /proc/net/ipt_recent/* files");
+
+
+struct recent_entry {
+ struct list_head list;
+ struct list_head lru_list;
+ u_int32_t addr;
+ u_int8_t ttl;
+ unsigned int index;
+ unsigned int nstamps;
+ unsigned long stamps[0];
};

-/* Structure of our linked list of tables of recent lists. */
-struct recent_ip_tables {
- char name[IPT_RECENT_NAME_LEN];
- int count;
- int time_pos;
- struct recent_ip_list *table;
- struct recent_ip_tables *next;
- spinlock_t list_lock;
- int *hash_table;
- struct time_info_list *time_info;
+struct recent_table {
+ struct list_head list;
+ char name[IPT_RECENT_NAME_LEN];
#ifdef CONFIG_PROC_FS
- struct proc_dir_entry *status_proc;
-#endif /* CONFIG_PROC_FS */
+ struct proc_dir_entry *proc;
+#endif
+ unsigned int refcnt;
+ unsigned int entries;
+ struct list_head lru_list;
+ struct list_head iphash[0];
};

-/* Our current list of addresses we have recently seen.
- * Only added to on a --set, and only updated on --set || --update
- */
-static struct recent_ip_tables *r_tables = NULL;
-
-/* We protect r_list with this spinlock so two processors are not modifying
- * the list at the same time.
- */
+static LIST_HEAD(tables);
static DEFINE_SPINLOCK(recent_lock);

#ifdef CONFIG_PROC_FS
-/* Our /proc/net/ipt_recent entry */
-static struct proc_dir_entry *proc_net_ipt_recent = NULL;
-#endif
-
-/* Function declaration for later. */
-static int
-match(const struct sk_buff *skb,
- const struct net_device *in,
- const struct net_device *out,
- const struct xt_match *match,
- const void *matchinfo,
- int offset,
- unsigned int protoff,
- int *hotdrop);
-
-/* Function to hash a given address into the hash table of table_size size */
-static int hash_func(unsigned int addr, int table_size)
-{
- int result = 0;
- unsigned int value = addr;
- do { result ^= value; } while((value >>= HASH_LOG));
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": %d = hash_func(%u,%d)\n",
- result & (table_size - 1),
- addr,
- table_size);
+static struct proc_dir_entry *proc_dir;
+static struct file_operations recent_fops;
#endif

- return(result & (table_size - 1));
-}
+static u_int32_t hash_rnd;
+static int hash_rnd_initted;

-#ifdef CONFIG_PROC_FS
-/* This is the function which produces the output for our /proc output
- * interface which lists each IP address, the last seen time and the
- * other recent times the address was seen.
- */
-
-static int ip_recent_get_info(char *buffer, char **start, off_t offset, int length, int *eof, void *data)
+static unsigned int recent_entry_hash(u_int32_t addr)
{
- int len = 0, count, last_len = 0, pkt_count;
- off_t pos = 0;
- off_t begin = 0;
- struct recent_ip_tables *curr_table;
-
- curr_table = (struct recent_ip_tables*) data;
-
- spin_lock_bh(&curr_table->list_lock);
- for(count = 0; count < ip_list_tot; count++) {
- if(!curr_table->table[count].addr) continue;
- last_len = len;
- len += sprintf(buffer+len,"src=%u.%u.%u.%u ",NIPQUAD(curr_table->table[count].addr));
- len += sprintf(buffer+len,"ttl: %u ",curr_table->table[count].ttl);
- len += sprintf(buffer+len,"last_seen: %lu ",curr_table->table[count].last_seen);
- len += sprintf(buffer+len,"oldest_pkt: %u ",curr_table->table[count].oldest_pkt);
- len += sprintf(buffer+len,"last_pkts: %lu",curr_table->table[count].last_pkts[0]);
- for(pkt_count = 1; pkt_count < ip_pkt_list_tot; pkt_count++) {
- if(!curr_table->table[count].last_pkts[pkt_count]) break;
- len += sprintf(buffer+len,", %lu",curr_table->table[count].last_pkts[pkt_count]);
- }
- len += sprintf(buffer+len,"\n");
- pos = begin + len;
- if(pos < offset) { len = 0; begin = pos; }
- if(pos > offset + length) { len = last_len; break; }
+ if (!hash_rnd_initted) {
+ get_random_bytes(&hash_rnd, 4);
+ hash_rnd_initted = 1;
}
-
- *start = buffer + (offset - begin);
- len -= (offset - begin);
- if(len > length) len = length;
-
- spin_unlock_bh(&curr_table->list_lock);
- return len;
+ return jhash_1word(addr, hash_rnd) & (ip_list_hash_size - 1);
}

-/* ip_recent_ctrl provides an interface for users to modify the table
- * directly. This allows adding entries, removing entries, and
- * flushing the entire table.
- * This is done by opening up the appropriate table for writing and
- * sending one of:
- * xx.xx.xx.xx -- Add entry to table with current time
- * +xx.xx.xx.xx -- Add entry to table with current time
- * -xx.xx.xx.xx -- Remove entry from table
- * clear -- Flush table, remove all entries
- */
-
-static int ip_recent_ctrl(struct file *file, const char __user *input, unsigned long size, void *data)
+static struct recent_entry *
+recent_entry_lookup(const struct recent_table *table, u_int32_t addr, u_int8_t ttl)
{
- static const u_int32_t max[4] = { 0xffffffff, 0xffffff, 0xffff, 0xff };
- u_int32_t val;
- int base, used = 0;
- char c, *cp;
- union iaddr {
- uint8_t bytes[4];
- uint32_t word;
- } res;
- uint8_t *pp = res.bytes;
- int digit;
-
- char buffer[20];
- int len, check_set = 0, count;
- u_int32_t addr = 0;
- struct sk_buff *skb;
- struct ipt_recent_info *info;
- struct recent_ip_tables *curr_table;
-
- curr_table = (struct recent_ip_tables*) data;
-
- if(size > 20) len = 20; else len = size;
-
- if(copy_from_user(buffer,input,len)) return -EFAULT;
-
- if(len < 20) buffer[len] = '\0';
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": ip_recent_ctrl len: %d, input: `%.20s'\n",len,buffer);
-#endif
+ struct recent_entry *e;
+ unsigned int h;
+
+ h = recent_entry_hash(addr);
+ list_for_each_entry(e, &table->iphash[h], list)
+ if (e->addr == addr && (ttl == e->ttl || !ttl || !e->ttl))
+ return e;
+ return NULL;
+}

- cp = buffer;
- while(isspace(*cp)) { cp++; used++; if(used >= len-5) return used; }
+static void recent_entry_remove(struct recent_table *t, struct recent_entry *e)
+{
+ list_del(&e->list);
+ list_del(&e->lru_list);
+ kfree(e);
+ t->entries--;
+}

- /* Check if we are asked to flush the entire table */
- if(!memcmp(cp,"clear",5)) {
- used += 5;
- spin_lock_bh(&curr_table->list_lock);
- curr_table->time_pos = 0;
- for(count = 0; count < ip_list_hash_size; count++) {
- curr_table->hash_table[count] = -1;
- }
- for(count = 0; count < ip_list_tot; count++) {
- curr_table->table[count].last_seen = 0;
- curr_table->table[count].addr = 0;
- curr_table->table[count].ttl = 0;
- memset(curr_table->table[count].last_pkts,0,ip_pkt_list_tot*sizeof(unsigned long));
- curr_table->table[count].oldest_pkt = 0;
- curr_table->table[count].time_pos = 0;
- curr_table->time_info[count].position = count;
- curr_table->time_info[count].time = 0;
- }
- spin_unlock_bh(&curr_table->list_lock);
- return used;
- }
+static struct recent_entry *
+recent_entry_init(struct recent_table *t, u_int32_t addr, u_int8_t ttl)
+{
+ struct recent_entry *e;

- check_set = IPT_RECENT_SET;
- switch(*cp) {
- case '+': check_set = IPT_RECENT_SET; cp++; used++; break;
- case '-': check_set = IPT_RECENT_REMOVE; cp++; used++; break;
- default: if(!isdigit(*cp)) return (used+1); break;
+ if (t->entries >= ip_list_tot) {
+ e = list_entry(t->lru_list.next, struct recent_entry, lru_list);
+ recent_entry_remove(t, e);
}
+ e = kmalloc(sizeof(*e) + sizeof(e->stamps[0]) * ip_pkt_list_tot,
+ GFP_ATOMIC);
+ if (e == NULL)
+ return NULL;
+ e->addr = addr;
+ e->ttl = ttl;
+ e->stamps[0] = jiffies;
+ e->nstamps = 1;
+ e->index = 1;
+ INIT_LIST_HEAD(&e->lru_list);
+ list_add_tail(&e->list, &t->iphash[recent_entry_hash(addr)]);
+ t->entries++;
+ return e;
+}

-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": ip_recent_ctrl cp: `%c', check_set: %d\n",*cp,check_set);
-#endif
- /* Get addr (effectively inet_aton()) */
- /* Shamelessly stolen from libc, a function in the kernel for doing
- * this would, of course, be greatly preferred, but our options appear
- * to be rather limited, so we will just do it ourselves here.
- */
- res.word = 0;
-
- c = *cp;
- for(;;) {
- if(!isdigit(c)) return used;
- val = 0; base = 10; digit = 0;
- if(c == '0') {
- c = *++cp;
- if(c == 'x' || c == 'X') base = 16, c = *++cp;
- else { base = 8; digit = 1; }
- }
- for(;;) {
- if(isascii(c) && isdigit(c)) {
- if(base == 8 && (c == '8' || c == '0')) return used;
- val = (val * base) + (c - '0');
- c = *++cp;
- digit = 1;
- } else if(base == 16 && isascii(c) && isxdigit(c)) {
- val = (val << 4) | (c + 10 - (islower(c) ? 'a' : 'A'));
- c = *++cp;
- digit = 1;
- } else break;
- }
- if(c == '.') {
- if(pp > res.bytes + 2 || val > 0xff) return used;
- *pp++ = val;
- c = *++cp;
- } else break;
- }
- used = cp - buffer;
- if(c != '\0' && (!isascii(c) || !isspace(c))) return used;
- if(c == '\n') used++;
- if(!digit) return used;
+static void recent_entry_update(struct recent_table *t, struct recent_entry *e)
+{
+ e->stamps[e->index++] = jiffies;
+ if (e->index > e->nstamps)
+ e->nstamps = e->index;
+ e->index %= ip_pkt_list_tot;
+ list_move_tail(&e->lru_list, &t->lru_list);
+}

- if(val > max[pp - res.bytes]) return used;
- addr = res.word | htonl(val);
+static struct recent_table *recent_table_lookup(const char *name)
+{
+ struct recent_table *t;

- if(!addr && check_set == IPT_RECENT_SET) return used;
+ list_for_each_entry(t, &tables, list)
+ if (!strcmp(t->name, name))
+ return t;
+ return NULL;
+}

-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": ip_recent_ctrl c: %c, addr: %u used: %d\n",c,addr,used);
-#endif
+static void recent_table_flush(struct recent_table *t)
+{
+ struct recent_entry *e, *next;
+ unsigned int i;

- /* Set up and just call match */
- info = kmalloc(sizeof(struct ipt_recent_info),GFP_KERNEL);
- if(!info) { return -ENOMEM; }
- info->seconds = 0;
- info->hit_count = 0;
- info->check_set = check_set;
- info->invert = 0;
- info->side = IPT_RECENT_SOURCE;
- strncpy(info->name,curr_table->name,IPT_RECENT_NAME_LEN);
- info->name[IPT_RECENT_NAME_LEN-1] = '\0';
-
- skb = kmalloc(sizeof(struct sk_buff),GFP_KERNEL);
- if (!skb) {
- used = -ENOMEM;
- goto out_free_info;
- }
- skb->nh.iph = kmalloc(sizeof(struct iphdr),GFP_KERNEL);
- if (!skb->nh.iph) {
- used = -ENOMEM;
- goto out_free_skb;
+ for (i = 0; i < ip_list_hash_size; i++) {
+ list_for_each_entry_safe(e, next, &t->iphash[i], list)
+ recent_entry_remove(t, e);
}
-
- skb->nh.iph->saddr = addr;
- skb->nh.iph->daddr = 0;
- /* Clear ttl since we have no way of knowing it */
- skb->nh.iph->ttl = 0;
- match(skb,NULL,NULL,NULL,info,0,0,NULL);
-
- kfree(skb->nh.iph);
-out_free_skb:
- kfree(skb);
-out_free_info:
- kfree(info);
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": Leaving ip_recent_ctrl addr: %u used: %d\n",addr,used);
-#endif
- return used;
}

-#endif /* CONFIG_PROC_FS */
-
-/* 'match' is our primary function, called by the kernel whenever a rule is
- * hit with our module as an option to it.
- * What this function does depends on what was specifically asked of it by
- * the user:
- * --set -- Add or update last seen time of the source address of the packet
- * -- matchinfo->check_set == IPT_RECENT_SET
- * --rcheck -- Just check if the source address is in the list
- * -- matchinfo->check_set == IPT_RECENT_CHECK
- * --update -- If the source address is in the list, update last_seen
- * -- matchinfo->check_set == IPT_RECENT_UPDATE
- * --remove -- If the source address is in the list, remove it
- * -- matchinfo->check_set == IPT_RECENT_REMOVE
- * --seconds -- Option to --rcheck/--update, only match if last_seen within seconds
- * -- matchinfo->seconds
- * --hitcount -- Option to --rcheck/--update, only match if seen hitcount times
- * -- matchinfo->hit_count
- * --seconds and --hitcount can be combined
- */
static int
-match(const struct sk_buff *skb,
- const struct net_device *in,
- const struct net_device *out,
- const struct xt_match *match,
- const void *matchinfo,
- int offset,
- unsigned int protoff,
- int *hotdrop)
+ipt_recent_match(const struct sk_buff *skb,
+ const struct net_device *in, const struct net_device *out,
+ const struct xt_match *match, const void *matchinfo,
+ int offset, unsigned int protoff, int *hotdrop)
{
- int pkt_count, hits_found, ans;
- unsigned long now;
const struct ipt_recent_info *info = matchinfo;
- u_int32_t addr = 0, time_temp;
- u_int8_t ttl = skb->nh.iph->ttl;
- int *hash_table;
- int orig_hash_result, hash_result, temp, location = 0, time_loc, end_collision_chain = -1;
- struct time_info_list *time_info;
- struct recent_ip_tables *curr_table;
- struct recent_ip_tables *last_table;
- struct recent_ip_list *r_list;
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match() called\n");
-#endif
-
- /* Default is false ^ info->invert */
- ans = info->invert;
+ struct recent_table *t;
+ struct recent_entry *e;
+ u_int32_t addr;
+ u_int8_t ttl;
+ int ret = info->invert;

-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): name = '%s'\n",info->name);
-#endif
+ if (info->side == IPT_RECENT_DEST)
+ addr = skb->nh.iph->daddr;
+ else
+ addr = skb->nh.iph->saddr;

- /* if out != NULL then routing has been done and TTL changed.
- * We change it back here internally for match what came in before routing. */
- if(out) ttl++;
+ ttl = skb->nh.iph->ttl;
+ /* use TTL as seen before forwarding */
+ if (out && !skb->sk)
+ ttl++;

- /* Find the right table */
spin_lock_bh(&recent_lock);
- curr_table = r_tables;
- while( (last_table = curr_table) && strncmp(info->name,curr_table->name,IPT_RECENT_NAME_LEN) && (curr_table = curr_table->next) );
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): table found('%s')\n",info->name);
-#endif
-
- spin_unlock_bh(&recent_lock);
-
- /* Table with this name not found, match impossible */
- if(!curr_table) { return ans; }
-
- /* Make sure no one is changing the list while we work with it */
- spin_lock_bh(&curr_table->list_lock);
-
- r_list = curr_table->table;
- if(info->side == IPT_RECENT_DEST) addr = skb->nh.iph->daddr; else addr = skb->nh.iph->saddr;
-
- if(!addr) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match() address (%u) invalid, leaving.\n",addr);
-#endif
- spin_unlock_bh(&curr_table->list_lock);
- return ans;
+ t = recent_table_lookup(info->name);
+ e = recent_entry_lookup(t, addr,
+ info->check_set & IPT_RECENT_TTL ? ttl : 0);
+ if (e == NULL) {
+ if (!(info->check_set & IPT_RECENT_SET))
+ goto out;
+ e = recent_entry_init(t, addr, ttl);
+ if (e == NULL)
+ *hotdrop = 1;
+ ret ^= 1;
+ goto out;
}

-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): checking table, addr: %u, ttl: %u, orig_ttl: %u\n",addr,ttl,skb->nh.iph->ttl);
-#endif
-
- /* Get jiffies now in case they changed while we were waiting for a lock */
- now = jiffies;
- hash_table = curr_table->hash_table;
- time_info = curr_table->time_info;
-
- orig_hash_result = hash_result = hash_func(addr,ip_list_hash_size);
- /* Hash entry at this result used */
- /* Check for TTL match if requested. If TTL is zero then a match would never
- * happen, so match regardless of existing TTL in that case. Zero means the
- * entry was added via the /proc interface anyway, so we will just use the
- * first TTL we get for that IP address. */
- if(info->check_set & IPT_RECENT_TTL) {
- while(hash_table[hash_result] != -1 && !(r_list[hash_table[hash_result]].addr == addr &&
- (!r_list[hash_table[hash_result]].ttl || r_list[hash_table[hash_result]].ttl == ttl))) {
- /* Collision in hash table */
- hash_result = (hash_result + 1) % ip_list_hash_size;
- }
- } else {
- while(hash_table[hash_result] != -1 && r_list[hash_table[hash_result]].addr != addr) {
- /* Collision in hash table */
- hash_result = (hash_result + 1) % ip_list_hash_size;
- }
- }
-
- if(hash_table[hash_result] == -1 && !(info->check_set & IPT_RECENT_SET)) {
- /* IP not in list and not asked to SET */
- spin_unlock_bh(&curr_table->list_lock);
- return ans;
- }
-
- /* Check if we need to handle the collision, do not need to on REMOVE */
- if(orig_hash_result != hash_result && !(info->check_set & IPT_RECENT_REMOVE)) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): Collision in hash table. (or: %d,hr: %d,oa: %u,ha: %u)\n",
- orig_hash_result,
- hash_result,
- r_list[hash_table[orig_hash_result]].addr,
- addr);
-#endif
-
- /* We had a collision.
- * orig_hash_result is where we started, hash_result is where we ended up.
- * So, swap them because we are likely to see the same guy again sooner */
-#ifdef DEBUG
- if(debug) {
- printk(KERN_INFO RECENT_NAME ": match(): Collision; hash_table[orig_hash_result] = %d\n",hash_table[orig_hash_result]);
- printk(KERN_INFO RECENT_NAME ": match(): Collision; r_list[hash_table[orig_hash_result]].hash_entry = %d\n",
- r_list[hash_table[orig_hash_result]].hash_entry);
- }
-#endif
-
- r_list[hash_table[orig_hash_result]].hash_entry = hash_result;
-
-
- temp = hash_table[orig_hash_result];
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): Collision; hash_table[hash_result] = %d\n",hash_table[hash_result]);
-#endif
- hash_table[orig_hash_result] = hash_table[hash_result];
- hash_table[hash_result] = temp;
- temp = hash_result;
- hash_result = orig_hash_result;
- orig_hash_result = temp;
- time_info[r_list[hash_table[orig_hash_result]].time_pos].position = hash_table[orig_hash_result];
- if(hash_table[hash_result] != -1) {
- r_list[hash_table[hash_result]].hash_entry = hash_result;
- time_info[r_list[hash_table[hash_result]].time_pos].position = hash_table[hash_result];
- }
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): Collision handled.\n");
-#endif
- }
-
- if(hash_table[hash_result] == -1) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): New table entry. (hr: %d,ha: %u)\n",
- hash_result, addr);
-#endif
-
- /* New item found and IPT_RECENT_SET, so we need to add it */
- location = time_info[curr_table->time_pos].position;
- hash_table[r_list[location].hash_entry] = -1;
- hash_table[hash_result] = location;
- memset(r_list[location].last_pkts,0,ip_pkt_list_tot*sizeof(unsigned long));
- r_list[location].time_pos = curr_table->time_pos;
- r_list[location].addr = addr;
- r_list[location].ttl = ttl;
- r_list[location].last_seen = now;
- r_list[location].oldest_pkt = 1;
- r_list[location].last_pkts[0] = now;
- r_list[location].hash_entry = hash_result;
- time_info[curr_table->time_pos].time = r_list[location].last_seen;
- curr_table->time_pos = (curr_table->time_pos + 1) % ip_list_tot;
-
- ans = !info->invert;
- } else {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): Existing table entry. (hr: %d,ha: %u)\n",
- hash_result,
- addr);
-#endif
-
- /* Existing item found */
- location = hash_table[hash_result];
- /* We have a match on address, now to make sure it meets all requirements for a
- * full match. */
- if(info->check_set & IPT_RECENT_CHECK || info->check_set & IPT_RECENT_UPDATE) {
- if(!info->seconds && !info->hit_count) ans = !info->invert; else ans = info->invert;
- if(info->seconds && !info->hit_count) {
- if(time_before_eq(now,r_list[location].last_seen+info->seconds*HZ)) ans = !info->invert; else ans = info->invert;
- }
- if(info->seconds && info->hit_count) {
- for(pkt_count = 0, hits_found = 0; pkt_count < ip_pkt_list_tot; pkt_count++) {
- if(r_list[location].last_pkts[pkt_count] == 0) break;
- if(time_before_eq(now,r_list[location].last_pkts[pkt_count]+info->seconds*HZ)) hits_found++;
- }
- if(hits_found >= info->hit_count) ans = !info->invert; else ans = info->invert;
- }
- if(info->hit_count && !info->seconds) {
- for(pkt_count = 0, hits_found = 0; pkt_count < ip_pkt_list_tot; pkt_count++) {
- if(r_list[location].last_pkts[pkt_count] == 0) break;
- hits_found++;
- }
- if(hits_found >= info->hit_count) ans = !info->invert; else ans = info->invert;
- }
- }
-#ifdef DEBUG
- if(debug) {
- if(ans)
- printk(KERN_INFO RECENT_NAME ": match(): match addr: %u\n",addr);
- else
- printk(KERN_INFO RECENT_NAME ": match(): no match addr: %u\n",addr);
- }
-#endif
-
- /* If and only if we have been asked to SET, or to UPDATE (on match) do we add the
- * current timestamp to the last_seen. */
- if((info->check_set & IPT_RECENT_SET && (ans = !info->invert)) || (info->check_set & IPT_RECENT_UPDATE && ans)) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): SET or UPDATE; updating time info.\n");
-#endif
- /* Have to update our time info */
- time_loc = r_list[location].time_pos;
- time_info[time_loc].time = now;
- time_info[time_loc].position = location;
- while((time_info[(time_loc+1) % ip_list_tot].time < time_info[time_loc].time) && ((time_loc+1) % ip_list_tot) != curr_table->time_pos) {
- time_temp = time_info[time_loc].time;
- time_info[time_loc].time = time_info[(time_loc+1)%ip_list_tot].time;
- time_info[(time_loc+1)%ip_list_tot].time = time_temp;
- time_temp = time_info[time_loc].position;
- time_info[time_loc].position = time_info[(time_loc+1)%ip_list_tot].position;
- time_info[(time_loc+1)%ip_list_tot].position = time_temp;
- r_list[time_info[time_loc].position].time_pos = time_loc;
- r_list[time_info[(time_loc+1)%ip_list_tot].position].time_pos = (time_loc+1)%ip_list_tot;
- time_loc = (time_loc+1) % ip_list_tot;
- }
- r_list[location].time_pos = time_loc;
- r_list[location].ttl = ttl;
- r_list[location].last_pkts[r_list[location].oldest_pkt] = now;
- r_list[location].oldest_pkt = ++r_list[location].oldest_pkt % ip_pkt_list_tot;
- r_list[location].last_seen = now;
- }
- /* If we have been asked to remove the entry from the list, just set it to 0 */
- if(info->check_set & IPT_RECENT_REMOVE) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): REMOVE; clearing entry (or: %d, hr: %d).\n",orig_hash_result,hash_result);
-#endif
- /* Check if this is part of a collision chain */
- while(hash_table[(orig_hash_result+1) % ip_list_hash_size] != -1) {
- orig_hash_result++;
- if(hash_func(r_list[hash_table[orig_hash_result]].addr,ip_list_hash_size) == hash_result) {
- /* Found collision chain, how deep does this rabbit hole go? */
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): REMOVE; found collision chain.\n");
-#endif
- end_collision_chain = orig_hash_result;
- }
- }
- if(end_collision_chain != -1) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match(): REMOVE; part of collision chain, moving to end.\n");
-#endif
- /* Part of a collision chain, swap it with the end of the chain
- * before removing. */
- r_list[hash_table[end_collision_chain]].hash_entry = hash_result;
- temp = hash_table[end_collision_chain];
- hash_table[end_collision_chain] = hash_table[hash_result];
- hash_table[hash_result] = temp;
- time_info[r_list[hash_table[hash_result]].time_pos].position = hash_table[hash_result];
- hash_result = end_collision_chain;
- r_list[hash_table[hash_result]].hash_entry = hash_result;
- time_info[r_list[hash_table[hash_result]].time_pos].position = hash_table[hash_result];
- }
- location = hash_table[hash_result];
- hash_table[r_list[location].hash_entry] = -1;
- time_loc = r_list[location].time_pos;
- time_info[time_loc].time = 0;
- time_info[time_loc].position = location;
- while((time_info[(time_loc+1) % ip_list_tot].time < time_info[time_loc].time) && ((time_loc+1) % ip_list_tot) != curr_table->time_pos) {
- time_temp = time_info[time_loc].time;
- time_info[time_loc].time = time_info[(time_loc+1)%ip_list_tot].time;
- time_info[(time_loc+1)%ip_list_tot].time = time_temp;
- time_temp = time_info[time_loc].position;
- time_info[time_loc].position = time_info[(time_loc+1)%ip_list_tot].position;
- time_info[(time_loc+1)%ip_list_tot].position = time_temp;
- r_list[time_info[time_loc].position].time_pos = time_loc;
- r_list[time_info[(time_loc+1)%ip_list_tot].position].time_pos = (time_loc+1)%ip_list_tot;
- time_loc = (time_loc+1) % ip_list_tot;
+ if (info->check_set & IPT_RECENT_SET)
+ ret ^= 1;
+ else if (info->check_set & IPT_RECENT_REMOVE) {
+ recent_entry_remove(t, e);
+ ret ^= 1;
+ } else if (info->check_set & (IPT_RECENT_CHECK | IPT_RECENT_UPDATE)) {
+ unsigned long t = jiffies - info->seconds * HZ;
+ unsigned int i, hits = 0;
+
+ for (i = 0; i < e->nstamps; i++) {
+ if (info->seconds && time_after(t, e->stamps[i]))
+ continue;
+ if (++hits >= info->hit_count) {
+ ret ^= 1;
+ break;
}
- r_list[location].time_pos = time_loc;
- r_list[location].last_seen = 0;
- r_list[location].addr = 0;
- r_list[location].ttl = 0;
- memset(r_list[location].last_pkts,0,ip_pkt_list_tot*sizeof(unsigned long));
- r_list[location].oldest_pkt = 0;
- ans = !info->invert;
}
- spin_unlock_bh(&curr_table->list_lock);
- return ans;
}

- spin_unlock_bh(&curr_table->list_lock);
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": match() left.\n");
-#endif
- return ans;
+ if (info->check_set & IPT_RECENT_SET ||
+ (info->check_set & IPT_RECENT_UPDATE && ret)) {
+ recent_entry_update(t, e);
+ e->ttl = ttl;
+ }
+out:
+ spin_unlock_bh(&recent_lock);
+ return ret;
}

-/* This function is to verify that the rule given during the userspace iptables
- * command is correct.
- * If the command is valid then we check if the table name referred to by the
- * rule exists, if not it is created.
- */
static int
-checkentry(const char *tablename,
- const void *ip,
- const struct xt_match *match,
- void *matchinfo,
- unsigned int matchsize,
- unsigned int hook_mask)
+ipt_recent_checkentry(const char *tablename, const void *ip,
+ const struct xt_match *match, void *matchinfo,
+ unsigned int matchsize, unsigned int hook_mask)
{
- int flag = 0, c;
- unsigned long *hold;
const struct ipt_recent_info *info = matchinfo;
- struct recent_ip_tables *curr_table, *find_table, *last_table;
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry() entered.\n");
-#endif
-
- /* seconds and hit_count only valid for CHECK/UPDATE */
- if(info->check_set & IPT_RECENT_SET) { flag++; if(info->seconds || info->hit_count) return 0; }
- if(info->check_set & IPT_RECENT_REMOVE) { flag++; if(info->seconds || info->hit_count) return 0; }
- if(info->check_set & IPT_RECENT_CHECK) flag++;
- if(info->check_set & IPT_RECENT_UPDATE) flag++;
-
- /* One and only one of these should ever be set */
- if(flag != 1) return 0;
+ struct recent_table *t;
+ unsigned i;
+ int ret = 0;

- /* Name must be set to something */
- if(!info->name || !info->name[0]) return 0;
+ if (hweight8(info->check_set &
+ (IPT_RECENT_SET | IPT_RECENT_REMOVE |
+ IPT_RECENT_CHECK | IPT_RECENT_UPDATE)) != 1)
+ return 0;
+ if (info->check_set & (IPT_RECENT_SET | IPT_RECENT_REMOVE) &&
+ (info->seconds || info->hit_count))
+ return 0;
+ if (info->name[0] == '\0' ||
+ strnlen(info->name, IPT_RECENT_NAME_LEN) == IPT_RECENT_NAME_LEN)
+ return 0;

- /* Things look good, create a list for this if it does not exist */
- /* Lock the linked list while we play with it */
spin_lock_bh(&recent_lock);
-
- /* Look for an entry with this name already created */
- /* Finds the end of the list and the entry before the end if current name does not exist */
- find_table = r_tables;
- while( (last_table = find_table) && strncmp(info->name,find_table->name,IPT_RECENT_NAME_LEN) && (find_table = find_table->next) );
-
- /* If a table already exists just increment the count on that table and return */
- if(find_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: table found (%s), incrementing count.\n",info->name);
-#endif
- find_table->count++;
- spin_unlock_bh(&recent_lock);
- return 1;
+ t = recent_table_lookup(info->name);
+ if (t != NULL) {
+ t->refcnt++;
+ ret = 1;
+ goto out;
}

- spin_unlock_bh(&recent_lock);
-
- /* Table with this name not found */
- /* Allocate memory for new linked list item */
-
-#ifdef DEBUG
- if(debug) {
- printk(KERN_INFO RECENT_NAME ": checkentry: no table found (%s)\n",info->name);
- printk(KERN_INFO RECENT_NAME ": checkentry: Allocationg %d for link-list entry.\n",sizeof(struct recent_ip_tables));
+ t = kzalloc(sizeof(*t) + sizeof(t->iphash[0]) * ip_list_hash_size,
+ GFP_ATOMIC);
+ if (t == NULL)
+ goto out;
+ strcpy(t->name, info->name);
+ INIT_LIST_HEAD(&t->lru_list);
+ for (i = 0; i < ip_list_hash_size; i++)
+ INIT_LIST_HEAD(&t->iphash[i]);
+#ifdef CONFIG_PROC_FS
+ t->proc = create_proc_entry(t->name, ip_list_perms, proc_dir);
+ if (t->proc == NULL) {
+ kfree(t);
+ goto out;
}
+ t->proc->proc_fops = &recent_fops;
+ t->proc->data = t;
#endif
+ list_add_tail(&t->list, &tables);
+ ret = 1;
+out:
+ spin_unlock_bh(&recent_lock);
+ return ret;
+}

- curr_table = vmalloc(sizeof(struct recent_ip_tables));
- if(curr_table == NULL) return 0;
-
- spin_lock_init(&curr_table->list_lock);
- curr_table->next = NULL;
- curr_table->count = 1;
- curr_table->time_pos = 0;
- strncpy(curr_table->name,info->name,IPT_RECENT_NAME_LEN);
- curr_table->name[IPT_RECENT_NAME_LEN-1] = '\0';
-
- /* Allocate memory for this table and the list of packets in each entry. */
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: Allocating %d for table (%s).\n",
- sizeof(struct recent_ip_list)*ip_list_tot,
- info->name);
-#endif
-
- curr_table->table = vmalloc(sizeof(struct recent_ip_list)*ip_list_tot);
- if(curr_table->table == NULL) { vfree(curr_table); return 0; }
- memset(curr_table->table,0,sizeof(struct recent_ip_list)*ip_list_tot);
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: Allocating %d for pkt_list.\n",
- sizeof(unsigned long)*ip_pkt_list_tot*ip_list_tot);
-#endif
-
- hold = vmalloc(sizeof(unsigned long)*ip_pkt_list_tot*ip_list_tot);
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: After pkt_list allocation.\n");
-#endif
- if(hold == NULL) {
- printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for pkt_list.\n");
- vfree(curr_table->table);
- vfree(curr_table);
- return 0;
- }
- for(c = 0; c < ip_list_tot; c++) {
- curr_table->table[c].last_pkts = hold + c*ip_pkt_list_tot;
- }
+static void
+ipt_recent_destroy(const struct xt_match *match, void *matchinfo,
+ unsigned int matchsize)
+{
+ const struct ipt_recent_info *info = matchinfo;
+ struct recent_table *t;

- /* Allocate memory for the hash table */
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: Allocating %d for hash_table.\n",
- sizeof(int)*ip_list_hash_size);
+ spin_lock_bh(&recent_lock);
+ t = recent_table_lookup(info->name);
+ if (--t->refcnt == 0) {
+ list_del(&t->list);
+ recent_table_flush(t);
+#ifdef CONFIG_PROC_FS
+ remove_proc_entry(t->name, proc_dir);
#endif
-
- curr_table->hash_table = vmalloc(sizeof(int)*ip_list_hash_size);
- if(!curr_table->hash_table) {
- printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for hash_table.\n");
- vfree(hold);
- vfree(curr_table->table);
- vfree(curr_table);
- return 0;
- }
-
- for(c = 0; c < ip_list_hash_size; c++) {
- curr_table->hash_table[c] = -1;
+ kfree(t);
}
+ spin_unlock_bh(&recent_lock);
+}

- /* Allocate memory for the time info */
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: Allocating %d for time_info.\n",
- sizeof(struct time_info_list)*ip_list_tot);
-#endif
+#ifdef CONFIG_PROC_FS
+struct recent_iter_state {
+ struct recent_table *table;
+ unsigned int bucket;
+};

- curr_table->time_info = vmalloc(sizeof(struct time_info_list)*ip_list_tot);
- if(!curr_table->time_info) {
- printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for time_info.\n");
- vfree(curr_table->hash_table);
- vfree(hold);
- vfree(curr_table->table);
- vfree(curr_table);
- return 0;
- }
- for(c = 0; c < ip_list_tot; c++) {
- curr_table->time_info[c].position = c;
- curr_table->time_info[c].time = 0;
- }
+static void *recent_seq_start(struct seq_file *seq, loff_t *pos)
+{
+ struct recent_iter_state *st = seq->private;
+ struct recent_table *t = st->table;
+ struct recent_entry *e;
+ loff_t p = *pos;

- /* Put the new table in place */
spin_lock_bh(&recent_lock);
- find_table = r_tables;
- while( (last_table = find_table) && strncmp(info->name,find_table->name,IPT_RECENT_NAME_LEN) && (find_table = find_table->next) );
-
- /* If a table already exists just increment the count on that table and return */
- if(find_table) {
- find_table->count++;
- spin_unlock_bh(&recent_lock);
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry: table found (%s), created by other process.\n",info->name);
-#endif
- vfree(curr_table->time_info);
- vfree(curr_table->hash_table);
- vfree(hold);
- vfree(curr_table->table);
- vfree(curr_table);
- return 1;
- }
- if(!last_table) r_tables = curr_table; else last_table->next = curr_table;

- spin_unlock_bh(&recent_lock);
-
-#ifdef CONFIG_PROC_FS
- /* Create our proc 'status' entry. */
- curr_table->status_proc = create_proc_entry(curr_table->name, ip_list_perms, proc_net_ipt_recent);
- if (!curr_table->status_proc) {
- printk(KERN_INFO RECENT_NAME ": checkentry: unable to allocate for /proc entry.\n");
- /* Destroy the created table */
- spin_lock_bh(&recent_lock);
- last_table = NULL;
- curr_table = r_tables;
- if(!curr_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry() create_proc failed, no tables.\n");
-#endif
- spin_unlock_bh(&recent_lock);
- return 0;
- }
- while( strncmp(info->name,curr_table->name,IPT_RECENT_NAME_LEN) && (last_table = curr_table) && (curr_table = curr_table->next) );
- if(!curr_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry() create_proc failed, table already destroyed.\n");
-#endif
- spin_unlock_bh(&recent_lock);
- return 0;
+ for (st->bucket = 0; st->bucket < ip_list_hash_size; st->bucket++) {
+ list_for_each_entry(e, &t->iphash[st->bucket], list) {
+ if (p-- == 0)
+ return e;
}
- if(last_table) last_table->next = curr_table->next; else r_tables = curr_table->next;
- spin_unlock_bh(&recent_lock);
- vfree(curr_table->time_info);
- vfree(curr_table->hash_table);
- vfree(hold);
- vfree(curr_table->table);
- vfree(curr_table);
- return 0;
}
-
- curr_table->status_proc->owner = THIS_MODULE;
- curr_table->status_proc->data = curr_table;
- wmb();
- curr_table->status_proc->read_proc = ip_recent_get_info;
- curr_table->status_proc->write_proc = ip_recent_ctrl;
-#endif /* CONFIG_PROC_FS */
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": checkentry() left.\n");
-#endif
+ return NULL;
+}

- return 1;
+static void *recent_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+ struct recent_iter_state *st = seq->private;
+ struct recent_table *t = st->table;
+ struct recent_entry *e = v;
+ struct list_head *head = e->list.next;
+
+ while (head == &t->iphash[st->bucket]) {
+ if (++st->bucket >= ip_list_hash_size)
+ return NULL;
+ head = t->iphash[st->bucket].next;
+ }
+ (*pos)++;
+ return list_entry(head, struct recent_entry, list);
}

-/* This function is called in the event that a rule matching this module is
- * removed.
- * When this happens we need to check if there are no other rules matching
- * the table given. If that is the case then we remove the table and clean
- * up its memory.
- */
-static void
-destroy(const struct xt_match *match, void *matchinfo, unsigned int matchsize)
+static void recent_seq_stop(struct seq_file *s, void *v)
{
- const struct ipt_recent_info *info = matchinfo;
- struct recent_ip_tables *curr_table, *last_table;
+ spin_unlock_bh(&recent_lock);
+}

-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() entered.\n");
-#endif
+static int recent_seq_show(struct seq_file *seq, void *v)
+{
+ struct recent_entry *e = v;
+ unsigned int i;
+
+ i = (e->index - 1) % ip_pkt_list_tot;
+ seq_printf(seq, "src=%u.%u.%u.%u ttl: %u last_seen: %lu oldest_pkt: %u",
+ NIPQUAD(e->addr), e->ttl, e->stamps[i], e->index);
+ for (i = 0; i < e->nstamps; i++)
+ seq_printf(seq, "%s %lu", i ? "," : "", e->stamps[i]);
+ seq_printf(seq, "\n");
+ return 0;
+}

- if(matchsize != IPT_ALIGN(sizeof(struct ipt_recent_info))) return;
+static struct seq_operations recent_seq_ops = {
+ .start = recent_seq_start,
+ .next = recent_seq_next,
+ .stop = recent_seq_stop,
+ .show = recent_seq_show,
+};

- /* Lock the linked list while we play with it */
- spin_lock_bh(&recent_lock);
+static int recent_seq_open(struct inode *inode, struct file *file)
+{
+ struct proc_dir_entry *pde = PDE(inode);
+ struct seq_file *seq;
+ struct recent_iter_state *st;
+ int ret;
+
+ st = kzalloc(sizeof(*st), GFP_KERNEL);
+ if (st == NULL)
+ return -ENOMEM;
+ ret = seq_open(file, &recent_seq_ops);
+ if (ret)
+ kfree(st);
+ st->table = pde->data;
+ seq = file->private_data;
+ seq->private = st;
+ return ret;
+}

- /* Look for an entry with this name already created */
- /* Finds the end of the list and the entry before the end if current name does not exist */
- last_table = NULL;
- curr_table = r_tables;
- if(!curr_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() No tables found, leaving.\n");
-#endif
+static ssize_t recent_proc_write(struct file *file, const char __user *input,
+ size_t size, loff_t *loff)
+{
+ struct proc_dir_entry *pde = PDE(file->f_dentry->d_inode);
+ struct recent_table *t = pde->data;
+ struct recent_entry *e;
+ char buf[sizeof("+255.255.255.255")], *c = buf;
+ u_int32_t addr;
+ int add;
+
+ if (size > sizeof(buf))
+ size = sizeof(buf);
+ if (copy_from_user(buf, input, size))
+ return -EFAULT;
+ while (isspace(*c))
+ c++;
+
+ if (size - (c - buf) < 5)
+ return c - buf;
+ if (!strncmp(c, "clear", 5)) {
+ c += 5;
+ spin_lock_bh(&recent_lock);
+ recent_table_flush(t);
spin_unlock_bh(&recent_lock);
- return;
+ return c - buf;
}
- while( strncmp(info->name,curr_table->name,IPT_RECENT_NAME_LEN) && (last_table = curr_table) && (curr_table = curr_table->next) );

- /* If a table does not exist then do nothing and return */
- if(!curr_table) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() table not found, leaving.\n");
-#endif
- spin_unlock_bh(&recent_lock);
- return;
+ switch (*c) {
+ case '-':
+ add = 0;
+ c++;
+ break;
+ case '+':
+ c++;
+ default:
+ add = 1;
+ break;
}
+ addr = in_aton(c);

- curr_table->count--;
-
- /* If count is still non-zero then there are still rules referenceing it so we do nothing */
- if(curr_table->count) {
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() table found, non-zero count, leaving.\n");
-#endif
- spin_unlock_bh(&recent_lock);
- return;
+ spin_lock_bh(&recent_lock);
+ e = recent_entry_lookup(t, addr, 0);
+ if (e == NULL) {
+ if (add)
+ recent_entry_init(t, addr, 0);
+ } else {
+ if (add)
+ recent_entry_update(t, e);
+ else
+ recent_entry_remove(t, e);
}
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() table found, zero count, removing.\n");
-#endif
-
- /* Count must be zero so we remove this table from the list */
- if(last_table) last_table->next = curr_table->next; else r_tables = curr_table->next;
-
spin_unlock_bh(&recent_lock);
+ return size;
+}

- /* lock to make sure any late-runners still using this after we removed it from
- * the list finish up then remove everything */
- spin_lock_bh(&curr_table->list_lock);
- spin_unlock_bh(&curr_table->list_lock);
-
-#ifdef CONFIG_PROC_FS
- if(curr_table->status_proc) remove_proc_entry(curr_table->name,proc_net_ipt_recent);
+static struct file_operations recent_fops = {
+ .open = recent_seq_open,
+ .read = seq_read,
+ .write = recent_proc_write,
+ .release = seq_release_private,
+ .owner = THIS_MODULE,
+};
#endif /* CONFIG_PROC_FS */
- vfree(curr_table->table[0].last_pkts);
- vfree(curr_table->table);
- vfree(curr_table->hash_table);
- vfree(curr_table->time_info);
- vfree(curr_table);
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": destroy() left.\n");
-#endif

- return;
-}
-
-/* This is the structure we pass to ipt_register to register our
- * module with iptables.
- */
static struct ipt_match recent_match = {
.name = "recent",
- .match = match,
+ .match = ipt_recent_match,
.matchsize = sizeof(struct ipt_recent_info),
- .checkentry = checkentry,
- .destroy = destroy,
- .me = THIS_MODULE
+ .checkentry = ipt_recent_checkentry,
+ .destroy = ipt_recent_destroy,
+ .me = THIS_MODULE,
};

-/* Kernel module initialization. */
static int __init ipt_recent_init(void)
{
- int err, count;
+ int err;

- printk(version);
-#ifdef CONFIG_PROC_FS
- proc_net_ipt_recent = proc_mkdir("ipt_recent",proc_net);
- if(!proc_net_ipt_recent) return -ENOMEM;
-#endif
-
- if(ip_list_hash_size && ip_list_hash_size <= ip_list_tot) {
- printk(KERN_WARNING RECENT_NAME ": ip_list_hash_size too small, resetting to default.\n");
- ip_list_hash_size = 0;
- }
-
- if(!ip_list_hash_size) {
- ip_list_hash_size = ip_list_tot*3;
- count = 2*2;
- while(ip_list_hash_size > count) count = count*2;
- ip_list_hash_size = count;
- }
-
-#ifdef DEBUG
- if(debug) printk(KERN_INFO RECENT_NAME ": ip_list_hash_size: %d\n",ip_list_hash_size);
-#endif
+ if (!ip_list_tot || !ip_pkt_list_tot)
+ return -EINVAL;
+ ip_list_hash_size = 1 << fls(ip_list_tot);

err = ipt_register_match(&recent_match);
+#ifdef CONFIG_PROC_FS
if (err)
- remove_proc_entry("ipt_recent", proc_net);
+ return err;
+ proc_dir = proc_mkdir("ipt_recent", proc_net);
+ if (proc_dir == NULL) {
+ ipt_unregister_match(&recent_match);
+ err = -ENOMEM;
+ }
+#endif
return err;
}

-/* Kernel module destruction. */
-static void __exit ipt_recent_fini(void)
+static void __exit ipt_recent_exit(void)
{
+ BUG_ON(!list_empty(&tables));
ipt_unregister_match(&recent_match);
-
- remove_proc_entry("ipt_recent",proc_net);
+#ifdef CONFIG_PROC_FS
+ remove_proc_entry("ipt_recent", proc_net);
+#endif
}

-/* Register our module with the kernel. */
module_init(ipt_recent_init);
-module_exit(ipt_recent_fini);
+module_exit(ipt_recent_exit);


Attachments:
x (45.71 kB)

2006-05-17 06:59:28

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

From: Patrick McHardy <[email protected]>
Date: Wed, 17 May 2006 08:26:03 +0200

> Stephen Frost wrote:
> > Looking at this again... The ttl isn't copied into 'ttl' unless the
> > check_set has TTL turned on. This means that the overwritting was fine,
> > if you accept that you can only ever match on TTL, or never match on it.
> > That doesn't seem right to me. The TTL in the table should always be
> > kept up-to-date and the only question is if the current rule requires it
> > for a match or not.
>
>
> OK, updated patch attached. The TTL is now always kept up-to-date.

Looks nice.

Is there any reasonable reason to allow ip_pkt_list_tot to ever be
larger than say 255? If we can accept that limit, we can shrink
the recent_entry considerably by packing the index and nstamps
into a single word next to ttl.

2006-05-17 07:10:18

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

From: Patrick McHardy <[email protected]>
Date: Wed, 17 May 2006 08:26:03 +0200

> + if (info->check_set & (IPT_RECENT_SET | IPT_RECENT_REMOVE) &&
> + (info->seconds || info->hit_count))
> + return 0;

I'm feeling particularly dense today... but what is the relative
precedence of '&' vs '&&'?

I've been told that if you have to look up C operator precedence,
don't bother and add parenthesis instead :)

2006-05-17 07:13:59

by Roland Dreier

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

David> I'm feeling particularly dense today... but what is the
David> relative precedence of '&' vs '&&'?

& binds tighter than &&. "man operator" can be your friend...

David> I've been told that if you have to look up C operator
David> precedence, don't bother and add parenthesis instead :) -

Probably a good rule though.

- R.

2006-05-17 07:19:08

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

David S. Miller wrote:
> From: Patrick McHardy <[email protected]>
> Date: Wed, 17 May 2006 08:26:03 +0200
>
>>OK, updated patch attached. The TTL is now always kept up-to-date.
>
>
> Looks nice.
>
> Is there any reasonable reason to allow ip_pkt_list_tot to ever be
> larger than say 255? If we can accept that limit, we can shrink
> the recent_entry considerably by packing the index and nstamps
> into a single word next to ttl.


My primary goal was full compatibility, I have no idea about real-life
usage though. Maybe Stephen can answer this.

2006-05-17 07:19:17

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

David S. Miller wrote:
> From: Patrick McHardy <[email protected]>
> Date: Wed, 17 May 2006 08:26:03 +0200
>
>
>>+ if (info->check_set & (IPT_RECENT_SET | IPT_RECENT_REMOVE) &&
>>+ (info->seconds || info->hit_count))
>>+ return 0;
>
>
> I'm feeling particularly dense today... but what is the relative
> precedence of '&' vs '&&'?
>
> I've been told that if you have to look up C operator precedence,
> don't bother and add parenthesis instead :)


Bitwise AND has precedence, but I have no problems adding an extra
set of parenthesis around it :)

2006-05-17 10:55:06

by Stephen Frost

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

* Patrick McHardy ([email protected]) wrote:
> David S. Miller wrote:
> > Is there any reasonable reason to allow ip_pkt_list_tot to ever be
> > larger than say 255? If we can accept that limit, we can shrink
> > the recent_entry considerably by packing the index and nstamps
> > into a single word next to ttl.
>
> My primary goal was full compatibility, I have no idea about real-life
> usage though. Maybe Stephen can answer this.

I don't recall ever seeing > 255 usage. It's been pretty rare for it to
be changed from the default at all from what I've seen. Making the
limit be 255 seems perfectly reasonable to me.

Thanks,

Stephen


Attachments:
(No filename) (644.00 B)
signature.asc (189.00 B)
Digital signature
Download all attachments

2006-05-17 13:14:17

by Stephen Frost

[permalink] [raw]
Subject: Re: [PATCH] fix mem-leak in netfilter

* Patrick McHardy ([email protected]) wrote:
> OK, updated patch attached. The TTL is now always kept up-to-date.

Yup, that looks good. Unfortunately, it looks like the lru_list isn't
being kept track of correctly now. Perhaps I'm reading it wrong but it
*looks* like recent_entry_init() is only initializing the lru_list for
the local entry but doesn't ever add it to the main table lru_list. My
guess is you were expecting that to be done by recent_entry_update() but
it's never the case that recent_entry_update() is called directly after
recent_entry_init() due to the 'goto out' (my line 199). Therefore I'm
afraid that a new entry is never added to the lru_list with the current
setup and if nothing is ever updated you'll end up in a bad situation.

I think you can just drop lines 198 & 199 and modify recent_entry_init()
to not put the initial stamp in. This way, for a new entry to the list,
recent_entry_init() is called still on 195, the return value is updated
just like it would be for an existing entry, and recent_entry_update()
is called to handle adding the latest stamp and updating the lru_list.

Looking at list.h, I *think* that will work (wasn't sure if
list_move_tail() would be upset about the state of the e->lru_list
coming from INIT_LIST_HEAD but I think the __list_del will effectively
be a no-op and so it'll be fine).

Thanks,

Stephen


Attachments:
(No filename) (1.34 kB)
signature.asc (189.00 B)
Digital signature
Download all attachments