2002-11-05 03:28:14

by Davide Libenzi

[permalink] [raw]
Subject: [patch] epoll bits 0.30 ...


These are the latest few bits for epoll. Changes :

*) Some constant adjusted

*) Comments plus

*) Better hash initialization

*) Correct timeout setup




- Davide




eventpoll.c | 55 ++++++++++++++++++++++++++++++++++++++-----------------
1 files changed, 38 insertions, 17 deletions





diff -Nru linux-2.5.46.vanilla/fs/eventpoll.c linux-2.5.46.epoll/fs/eventpoll.c
--- linux-2.5.46.vanilla/fs/eventpoll.c Mon Nov 4 15:45:35 2002
+++ linux-2.5.46.epoll/fs/eventpoll.c Mon Nov 4 15:50:27 2002
@@ -61,8 +61,12 @@
/* Maximum storage for the eventpoll interest set */
#define EP_MAX_FDS_SIZE (1024 * 128)

-/* We don't want the hash to be smaller than this */
-#define EP_MIN_HASH_SIZE 101
+/*
+ * We don't want the hash to be smaller than this. This is basically
+ * the highest prime lower than (PAGE_SIZE / sizeof(struct list_head)).
+ */
+#define EP_MIN_HASH_SIZE 509
+
/*
* Event buffer dimension used to cache events before sending them in
* userspace with a __copy_to_user(). The event buffer is in stack,
@@ -188,6 +192,7 @@


static int ep_is_prime(int n);
+static int ep_size_hash(int hintsize);
static int ep_getfd(int *efd, struct inode **einode, struct file **efile);
static int ep_alloc_pages(char **pages, int numpages);
static int ep_free_pages(char **pages, int numpages);
@@ -252,14 +257,14 @@



-/* Report if the number is prime. Needed to correctly size the hash */
+/* Report if the number is prime. Needed to correctly size the hash */
static int ep_is_prime(int n)
{

if (n > 3) {
if (n & 1) {
int i, hn = n / 2;
-
+
for (i = 3; i < hn; i += 2)
if (!(n % i))
return 0;
@@ -296,9 +301,30 @@
}


+static int ep_size_hash(int hintsize)
+{
+ int maxsize = (EP_MAX_HPAGES - 1) * EP_HENTRY_X_PAGE;
+
+ /*
+ * Search the nearest prime number higher than "hintsize" and
+ * smaller than "maxsize".
+ */
+ for (; !ep_is_prime(hintsize); hintsize++);
+ if (hintsize < EP_MIN_HASH_SIZE)
+ hintsize = EP_MIN_HASH_SIZE;
+ else if (hintsize >= maxsize)
+ for (hintsize = maxsize - 1; !ep_is_prime(hintsize); hintsize--);
+
+ return hintsize;
+}
+
+
/*
* It opens an eventpoll file descriptor by suggesting a storage of "size"
- * file descriptors. It is the kernel part of the userspace epoll_create(2).
+ * file descriptors. The size parameter is just an hint about how to size
+ * data structures. It won't prevent the user to store more than "size"
+ * file descriptors inside the epoll interface. It is the kernel part of
+ * the userspace epoll_create(2).
*/
asmlinkage int sys_epoll_create(int size)
{
@@ -309,10 +335,8 @@
DNPRINTK(3, (KERN_INFO "[%p] eventpoll: sys_epoll_create(%d)\n",
current, size));

- /* Search the nearest prime number higher than "size" */
- for (; !ep_is_prime(size); size++);
- if (size < EP_MIN_HASH_SIZE)
- size = EP_MIN_HASH_SIZE;
+ /* Correctly size the hash */
+ size = ep_size_hash(size);

/*
* Creates all the items needed to setup an eventpoll file. That is,
@@ -567,7 +591,7 @@
eexit_2:
put_filp(file);
eexit_1:
- return error;
+ return error;
}


@@ -723,8 +747,7 @@
write_unlock_irqrestore(&eplock, flags);

/* Free hash pages */
- if (ep->nhpages > 0)
- ep_free_pages(ep->hpages, ep->nhpages);
+ ep_free_pages(ep->hpages, ep->nhpages);
}


@@ -1126,12 +1149,10 @@
wait_queue_t wait;

/*
- * Calculate the timeout by checking for the "infinite" value ( -1 )
- * and the overflow condition ( > MAX_SCHEDULE_TIMEOUT / HZ ). The
- * passed timeout is in milliseconds, that why (t * HZ) / 1000.
+ * Calculate the timeout by checking for the "infinite" value ( -1 ).
+ * The passed timeout is in milliseconds, that why (t * HZ) / 1000.
*/
- jtimeout = timeout == -1 || timeout > MAX_SCHEDULE_TIMEOUT / HZ ?
- MAX_SCHEDULE_TIMEOUT: (timeout * HZ) / 1000;
+ jtimeout = timeout == -1 ? MAX_SCHEDULE_TIMEOUT: (timeout * HZ) / 1000;

retry:
write_lock_irqsave(&ep->lock, flags);


2002-11-07 00:21:40

by Rusty Russell

[permalink] [raw]
Subject: Re: [patch] epoll bits 0.30 ...

On Mon, 4 Nov 2002 19:44:29 -0800 (PST)
Davide Libenzi <[email protected]> wrote:

>
> These are the latest few bits for epoll. Changes :
>
> *) Some constant adjusted
>
> *) Comments plus
>
> *) Better hash initialization

Um, why doesn't this use linux/hash.h? Haven't looked hard, but...

Cheers,
Rusty.
--
there are those who do and those who hang on and you don't see too
many doers quoting their contemporaries. -- Larry McVoy

2002-11-07 00:29:34

by Davide Libenzi

[permalink] [raw]
Subject: Re: [patch] epoll bits 0.30 ...

On Wed, 6 Nov 2002, Rusty Russell wrote:

> On Mon, 4 Nov 2002 19:44:29 -0800 (PST)
> Davide Libenzi <[email protected]> wrote:
>
> >
> > These are the latest few bits for epoll. Changes :
> >
> > *) Some constant adjusted
> >
> > *) Comments plus
> >
> > *) Better hash initialization
>
> Um, why doesn't this use linux/hash.h? Haven't looked hard, but...

Rusty, the hash is not under pressure over there. The only time seeks is
performed is at file removal ( from the set ) and eventually at file
modify. There's a direct link between the wait queue and its item during
the high frequency event delivery, so need seek is performed.




- Davide


2002-11-07 00:43:31

by Davide Libenzi

[permalink] [raw]
Subject: Re: [patch] epoll bits 0.30 ...

On Wed, 6 Nov 2002, Davide Libenzi wrote:

> Rusty, the hash is not under pressure over there. The only time seeks is
> performed is at file removal ( from the set ) and eventually at file
> modify. There's a direct link between the wait queue and its item during
> the high frequency event delivery, so need seek is performed.

s/so need seek is performed/so no seek is performed/



- Davide


2002-11-08 16:03:38

by Jamie Lokier

[permalink] [raw]
Subject: Re: [patch] epoll bits 0.30 ...

Davide Libenzi wrote:
> Rusty, the hash is not under pressure over there. The only time seeks is
> performed is at file removal ( from the set ) and eventually at file
> modify. There's a direct link between the wait queue and its item during
> the high frequency event delivery, so need seek is performed.

It does seem peculiar to use a prime-sized hash table, though. These
days, good power-of-two-sized hash functions are well known.

-- Jamie

2002-11-08 17:20:56

by Davide Libenzi

[permalink] [raw]
Subject: Re: [patch] epoll bits 0.30 ...

On Fri, 8 Nov 2002, Jamie Lokier wrote:

> Davide Libenzi wrote:
> > Rusty, the hash is not under pressure over there. The only time seeks is
> > performed is at file removal ( from the set ) and eventually at file
> > modify. There's a direct link between the wait queue and its item during
> > the high frequency event delivery, so need seek is performed.
>
> It does seem peculiar to use a prime-sized hash table, though. These
> days, good power-of-two-sized hash functions are well known.

To make everyone happy the latest code uses hash.h :)



- Davide