2005-09-06 21:25:24

by Nishanth Aravamudan

[permalink] [raw]
Subject: Re: [PATCH][Bug 5132] fix sys_poll() large timeout handling

On 31.08.2005 [13:01:09 -0700], Nishanth Aravamudan wrote:
> Sorry everybody, forgot the most important Cc: :)
>
> -Nish
>
> Hi Andrew,
>
> In looking at Bug 5132 and sys_poll(), I think there is a flaw in the
> current code.
>
> The @timeout parameter to sys_poll() is in milliseconds but we compare
> it to (MAX_SCHEDULE_TIMEOUT / HZ), which is jiffies/jiffies-per-sec or
> seconds. That seems blatantly broken. Also, I think we are better served
> by converting to jiffies first then comparing, as opposed to converting
> our maximum to milliseconds (or seconds, incorrectly) and comparing.
>
> Comments, suggestions for improvement?

I haven't got any responses (here or on the bug)... A silent NACK?
Anything I should change to make people happier?

Thanks,
Nish


2005-09-10 00:35:34

by Nishanth Aravamudan

[permalink] [raw]
Subject: [UPDATE PATCH][Bug 5132] fix sys_poll() large timeout handling

On 06.09.2005 [14:25:14 -0700], Nishanth Aravamudan wrote:
> On 31.08.2005 [13:01:09 -0700], Nishanth Aravamudan wrote:
> > Sorry everybody, forgot the most important Cc: :)
> >
> > -Nish
> >
> > Hi Andrew,
> >
> > In looking at Bug 5132 and sys_poll(), I think there is a flaw in the
> > current code.
> >
> > The @timeout parameter to sys_poll() is in milliseconds but we compare
> > it to (MAX_SCHEDULE_TIMEOUT / HZ), which is jiffies/jiffies-per-sec or
> > seconds. That seems blatantly broken. Also, I think we are better served
> > by converting to jiffies first then comparing, as opposed to converting
> > our maximum to milliseconds (or seconds, incorrectly) and comparing.
> >
> > Comments, suggestions for improvement?
>
> I haven't got any responses (here or on the bug)... A silent NACK?
> Anything I should change to make people happier?

Well, I found one thing to change. msecs_to_jiffies() deals in unsigned
quantities while the parameter we are passing in is signed (and with
reason). I mistakenly left the

if (timeout)

check in, when it should be

if (timeout > 0)

check. Fixed below.

Thanks,
Nish

Description: The current sys_poll() implementation does not seem to
handle large timeouts correctly. Any value in milliseconds (@timeout)
which exceeds the maximum representable jiffy value
(MAX_SCHEDULE_TIMEOUT) should result in a MAX_SCHEDULE_TIMEOUT
schedule_timeout() request. To achieve this, convert @timeout to jiffies
first, then compare to MAX_SCHEDULE_TIMEOUT.

Signed-off-by: Nishanth Aravamudan <[email protected]>

---

fs/select.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)

diff -urpN 2.6.13/fs/select.c 2.6.13-dev/fs/select.c
--- 2.6.13/fs/select.c 2005-08-28 17:46:14.000000000 -0700
+++ 2.6.13-dev/fs/select.c 2005-09-09 17:22:30.000000000 -0700
@@ -469,13 +469,16 @@ asmlinkage long sys_poll(struct pollfd _
if (nfds > current->files->max_fdset && nfds > OPEN_MAX)
return -EINVAL;

- if (timeout) {
- /* Careful about overflow in the intermediate values */
- if ((unsigned long) timeout < MAX_SCHEDULE_TIMEOUT / HZ)
- timeout = (unsigned long)(timeout*HZ+999)/1000+1;
- else /* Negative or overflow */
- timeout = MAX_SCHEDULE_TIMEOUT;
- }
+ if (timeout > 0)
+ /*
+ * Convert the value from msecs to jiffies - if overflow
+ * occurs we get a negative value, which gets handled by
+ * the next block
+ */
+ timeout = msecs_to_jiffies(timeout) + 1;
+ if (timeout < 0) /* Negative requests result in infinite timeouts */
+ timeout = MAX_SCHEDULE_TIMEOUT;
+ /* 0 case falls through */

poll_initwait(&table);

2005-09-10 01:20:57

by Andrew Morton

[permalink] [raw]
Subject: Re: [UPDATE PATCH][Bug 5132] fix sys_poll() large timeout handling

Nishanth Aravamudan <[email protected]> wrote:
>
> Description: The current sys_poll() implementation does not seem to
> handle large timeouts correctly. Any value in milliseconds (@timeout)
> which exceeds the maximum representable jiffy value
> (MAX_SCHEDULE_TIMEOUT) should result in a MAX_SCHEDULE_TIMEOUT
> schedule_timeout() request. To achieve this, convert @timeout to jiffies
> first, then compare to MAX_SCHEDULE_TIMEOUT.

The above doesn't describe the bug very well.

> Signed-off-by: Nishanth Aravamudan <[email protected]>
>
> ---
>
> fs/select.c | 17 ++++++++++-------
> 1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff -urpN 2.6.13/fs/select.c 2.6.13-dev/fs/select.c
> --- 2.6.13/fs/select.c 2005-08-28 17:46:14.000000000 -0700
> +++ 2.6.13-dev/fs/select.c 2005-09-09 17:22:30.000000000 -0700
> @@ -469,13 +469,16 @@ asmlinkage long sys_poll(struct pollfd _
> if (nfds > current->files->max_fdset && nfds > OPEN_MAX)
> return -EINVAL;
>
> - if (timeout) {
> - /* Careful about overflow in the intermediate values */
> - if ((unsigned long) timeout < MAX_SCHEDULE_TIMEOUT / HZ)

This is the problem to which you're referring, yes?

We're comparing milliseconds with jiffies/HZ, yes?

> - timeout = (unsigned long)(timeout*HZ+999)/1000+1;
> - else /* Negative or overflow */
> - timeout = MAX_SCHEDULE_TIMEOUT;
> - }
> + if (timeout > 0)
> + /*
> + * Convert the value from msecs to jiffies - if overflow
> + * occurs we get a negative value, which gets handled by
> + * the next block
> + */
> + timeout = msecs_to_jiffies(timeout) + 1;
> + if (timeout < 0) /* Negative requests result in infinite timeouts */
> + timeout = MAX_SCHEDULE_TIMEOUT;
> + /* 0 case falls through */

I don't particularly like the idea of relying on msecs_to_jiffies(too much)
returning a negative value.

Why can't we do

int too_much;

/*
* We compare HZ with 1000 to work out which side of the expression
* needs conversion. Because we want to avoid converting any value
* to a numerically higher value, which could overflow.
*/
#if HZ > 1000
too_much = timeout > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT);
#else
too_much = msecs_to_jiffies(timeout) > MAX_SCHEDULE_TIMEOUT;
#endif

if (too_much)
timeout = MAX_SCHEDULE_TIMEOUT;

And while we're there, let's stop using the same variable for two different
units - it's horrid. How about we nuke `timeout' and create timeout_msecs
and timeout_jiffies to show what units they're in?

2005-09-10 02:23:43

by Nishanth Aravamudan

[permalink] [raw]
Subject: [UPDATE PATCH][Bug 5132] fix sys_poll() large timeout handling

On 09.09.2005 [18:16:58 -0700], Andrew Morton wrote:
> Nishanth Aravamudan <[email protected]> wrote:
> >
> > Description: The current sys_poll() implementation does not seem to
> > handle large timeouts correctly. Any value in milliseconds (@timeout)
> > which exceeds the maximum representable jiffy value
> > (MAX_SCHEDULE_TIMEOUT) should result in a MAX_SCHEDULE_TIMEOUT
> > schedule_timeout() request. To achieve this, convert @timeout to jiffies
> > first, then compare to MAX_SCHEDULE_TIMEOUT.
>
> The above doesn't describe the bug very well.

Yes, sorry, in the e-mail I sent with the updated patch, the beginning
description had more detailed information. I meant to also change this
to an RFC, as no one is commenting on just the patch :) Fixed in the
description below. And I guess now that I have your attention I don't
need to bother changing the subject *again*...

> > Signed-off-by: Nishanth Aravamudan <[email protected]>
> >
> > ---
> >
> > fs/select.c | 17 ++++++++++-------
> > 1 file changed, 10 insertions(+), 7 deletions(-)
> >
> > diff -urpN 2.6.13/fs/select.c 2.6.13-dev/fs/select.c
> > --- 2.6.13/fs/select.c 2005-08-28 17:46:14.000000000 -0700
> > +++ 2.6.13-dev/fs/select.c 2005-09-09 17:22:30.000000000 -0700
> > @@ -469,13 +469,16 @@ asmlinkage long sys_poll(struct pollfd _
> > if (nfds > current->files->max_fdset && nfds > OPEN_MAX)
> > return -EINVAL;
> >
> > - if (timeout) {
> > - /* Careful about overflow in the intermediate values */
> > - if ((unsigned long) timeout < MAX_SCHEDULE_TIMEOUT / HZ)
>
> This is the problem to which you're referring, yes?
>
> We're comparing milliseconds with jiffies/HZ, yes?

Yes, exactly.

> > - timeout = (unsigned long)(timeout*HZ+999)/1000+1;
> > - else /* Negative or overflow */
> > - timeout = MAX_SCHEDULE_TIMEOUT;
> > - }
> > + if (timeout > 0)
> > + /*
> > + * Convert the value from msecs to jiffies - if overflow
> > + * occurs we get a negative value, which gets handled by
> > + * the next block
> > + */
> > + timeout = msecs_to_jiffies(timeout) + 1;
> > + if (timeout < 0) /* Negative requests result in infinite timeouts */
> > + timeout = MAX_SCHEDULE_TIMEOUT;
> > + /* 0 case falls through */
>
> I don't particularly like the idea of relying on msecs_to_jiffies(too much)
> returning a negative value.

I agree with your changes. Does the patch below reflect what you wanted?
I used >=, though, as we want to still +1 to the request, so that we
don't return early.

> Why can't we do
>
> int too_much;
>
> /*
> * We compare HZ with 1000 to work out which side of the expression
> * needs conversion. Because we want to avoid converting any value
> * to a numerically higher value, which could overflow.
> */
> #if HZ > 1000
> too_much = timeout > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT);
> #else
> too_much = msecs_to_jiffies(timeout) > MAX_SCHEDULE_TIMEOUT;
> #endif
>
> if (too_much)
> timeout = MAX_SCHEDULE_TIMEOUT;
>
> And while we're there, let's stop using the same variable for two different
> units - it's horrid. How about we nuke `timeout' and create timeout_msecs
> and timeout_jiffies to show what units they're in?

Done as well.

Description: The @timeout parameter to sys_poll() is in milliseconds but
we compare it to (MAX_SCHEDULE_TIMEOUT / HZ), which is
(jiffies/jiffies-per-sec) or seconds. That seems blatantly broken. This
led to improper overflow checking for @timeout. As Andrew Morton pointed
out, the best fix is to to check for potential overflow first, then
either select an indefinite value or convert @timeout.

To achieve this and clean-up the code, change the prototype of the
sys_poll to make it clear that the parameter is in milliseconds and
introduce a variable, timeout_jiffies to hold the corresonding jiffies
value.

Signed-off-by: Nishanth Aravamudan <[email protected]>

---

fs/select.c | 33 ++++++++++++++++++++++++---------
include/linux/syscalls.h | 2 +-
2 files changed, 25 insertions(+), 10 deletions(-)

diff -urpN 2.6.13/fs/select.c 2.6.13-dev/fs/select.c
--- 2.6.13/fs/select.c 2005-08-28 17:46:14.000000000 -0700
+++ 2.6.13-dev/fs/select.c 2005-09-09 19:20:53.000000000 -0700
@@ -457,25 +457,40 @@ static int do_poll(unsigned int nfds, s
return count;
}

-asmlinkage long sys_poll(struct pollfd __user * ufds, unsigned int nfds, long timeout)
+asmlinkage long sys_poll(struct pollfd __user * ufds, unsigned int nfds,
+ long timeout_msecs)
{
struct poll_wqueues table;
- int fdcount, err;
+ int fdcount, err, overflow;
unsigned int i;
struct poll_list *head;
struct poll_list *walk;
+ unsigned long timeout_jiffies;

/* Do a sanity check on nfds ... */
if (nfds > current->files->max_fdset && nfds > OPEN_MAX)
return -EINVAL;

- if (timeout) {
- /* Careful about overflow in the intermediate values */
- if ((unsigned long) timeout < MAX_SCHEDULE_TIMEOUT / HZ)
- timeout = (unsigned long)(timeout*HZ+999)/1000+1;
- else /* Negative or overflow */
- timeout = MAX_SCHEDULE_TIMEOUT;
- }
+ /*
+ * We compare HZ with 1000 to work out which side of the
+ * expression needs conversion. Because we want to avoid
+ * converting any value to a numerically higher value, which
+ * could overflow.
+ */
+#if HZ > 1000
+ overflow = timeout_msecs >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT);
+#else
+ overflow = msecs_to_jiffies(timeout_msecs) >= MAX_SCHEDULE_TIMEOUT;
+#endif
+
+ /*
+ * If we would overflow in the conversion or a negative timeout
+ * is requested, sleep indefinitely.
+ */
+ if (overflow || timeout_msecs < 0)
+ timeout_jiffies = MAX_SCHEDULE_TIMEOUT;
+ else
+ timeout_jiffies = msecs_to_jiffies(timeout_msecs) + 1;

poll_initwait(&table);

diff -urpN 2.6.13/include/linux/syscalls.h 2.6.13-dev/include/linux/syscalls.h
--- 2.6.13/include/linux/syscalls.h 2005-08-28 17:46:36.000000000 -0700
+++ 2.6.13-dev/include/linux/syscalls.h 2005-09-09 19:05:34.000000000 -0700
@@ -420,7 +420,7 @@ asmlinkage long sys_socketpair(int, int,
asmlinkage long sys_socketcall(int call, unsigned long __user *args);
asmlinkage long sys_listen(int, int);
asmlinkage long sys_poll(struct pollfd __user *ufds, unsigned int nfds,
- long timeout);
+ long timeout_msecs);
asmlinkage long sys_select(int n, fd_set __user *inp, fd_set __user *outp,
fd_set __user *exp, struct timeval __user *tvp);
asmlinkage long sys_epoll_create(int size);

2005-09-10 02:40:23

by Andrew Morton

[permalink] [raw]
Subject: Re: [UPDATE PATCH][Bug 5132] fix sys_poll() large timeout handling

Nishanth Aravamudan <[email protected]> wrote:
>
> + /*
> + * We compare HZ with 1000 to work out which side of the
> + * expression needs conversion. Because we want to avoid
> + * converting any value to a numerically higher value, which
> + * could overflow.
> + */
> +#if HZ > 1000
> + overflow = timeout_msecs >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT);
> +#else
> + overflow = msecs_to_jiffies(timeout_msecs) >= MAX_SCHEDULE_TIMEOUT;
> +#endif
> +
> + /*
> + * If we would overflow in the conversion or a negative timeout
> + * is requested, sleep indefinitely.
> + */
> + if (overflow || timeout_msecs < 0)
> + timeout_jiffies = MAX_SCHEDULE_TIMEOUT;

Do we need to test (timeout_msecs < 0) here? If we make timeout_msecs
unsigned long then I think `overflow' will always be correct.

2005-09-10 02:55:39

by Nishanth Aravamudan

[permalink] [raw]
Subject: Re: [UPDATE PATCH][Bug 5132] fix sys_poll() large timeout handling

On 09.09.2005 [19:36:21 -0700], Andrew Morton wrote:
> Nishanth Aravamudan <[email protected]> wrote:
> >
> > + /*
> > + * We compare HZ with 1000 to work out which side of the
> > + * expression needs conversion. Because we want to avoid
> > + * converting any value to a numerically higher value, which
> > + * could overflow.
> > + */
> > +#if HZ > 1000
> > + overflow = timeout_msecs >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT);
> > +#else
> > + overflow = msecs_to_jiffies(timeout_msecs) >= MAX_SCHEDULE_TIMEOUT;
> > +#endif
> > +
> > + /*
> > + * If we would overflow in the conversion or a negative timeout
> > + * is requested, sleep indefinitely.
> > + */
> > + if (overflow || timeout_msecs < 0)
> > + timeout_jiffies = MAX_SCHEDULE_TIMEOUT;
>
> Do we need to test (timeout_msecs < 0) here? If we make timeout_msecs
> unsigned long then I think `overflow' will always be correct.

Even though poll is explicitly allowed to take negative values, as per
my man-page:

"#include <sys/poll.h>

int poll(struct pollfd *ufds, unsigned int nfds, int timeout);

...

A negative value means infinite timeout."

Would we have a local variable to store timeout_msecs as well? Or do we
want to make a userspace-visible change like this? I don't have a
preference, I just want to make sure I understand.

Thanks,
Nish

2005-09-12 14:38:12

by Peter Staubach

[permalink] [raw]
Subject: Re: [UPDATE PATCH][Bug 5132] fix sys_poll() large timeout handling

--- linux-2.6.13/fs/select.c.org 2005-08-28 19:41:01.000000000 -0400
+++ linux-2.6.13/fs/select.c 2005-09-12 10:19:30.000000000 -0400
@@ -457,25 +457,34 @@ static int do_poll(unsigned int nfds, s
return count;
}

-asmlinkage long sys_poll(struct pollfd __user * ufds, unsigned int nfds, long timeout)
+asmlinkage long sys_poll(struct pollfd __user * ufds, unsigned int nfds, int timeout_msecs)
{
struct poll_wqueues table;
int fdcount, err;
unsigned int i;
struct poll_list *head;
struct poll_list *walk;
+ long timeout;
+ int64_t lltimeout;

/* Do a sanity check on nfds ... */
if (nfds > current->files->max_fdset && nfds > OPEN_MAX)
return -EINVAL;

- if (timeout) {
- /* Careful about overflow in the intermediate values */
- if ((unsigned long) timeout < MAX_SCHEDULE_TIMEOUT / HZ)
- timeout = (unsigned long)(timeout*HZ+999)/1000+1;
- else /* Negative or overflow */
+ if (timeout_msecs) {
+ if (timeout_msecs < 0)
timeout = MAX_SCHEDULE_TIMEOUT;
- }
+ else {
+ lltimeout = (int64_t)timeout_msecs * HZ + 999;
+ do_div(lltimeout, 1000);
+ lltimeout++;
+ if (lltimeout > MAX_SCHEDULE_TIMEOUT)
+ timeout = MAX_SCHEDULE_TIMEOUT;
+ else
+ timeout = (long)lltimeout;
+ }
+ } else
+ timeout = 0;

poll_initwait(&table);

--- linux-2.6.13/include/linux/syscalls.h.org 2005-08-28 19:41:01.000000000 -0400
+++ linux-2.6.13/include/linux/syscalls.h 2005-09-12 10:22:25.000000000 -0400
@@ -420,7 +420,7 @@ asmlinkage long sys_socketpair(int, int,
asmlinkage long sys_socketcall(int call, unsigned long __user *args);
asmlinkage long sys_listen(int, int);
asmlinkage long sys_poll(struct pollfd __user *ufds, unsigned int nfds,
- long timeout);
+ int timeout_msecs);
asmlinkage long sys_select(int n, fd_set __user *inp, fd_set __user *outp,
fd_set __user *exp, struct timeval __user *tvp);
asmlinkage long sys_epoll_create(int size);


Attachments:
select.devel (1.86 kB)

2005-09-12 15:06:15

by Nishanth Aravamudan

[permalink] [raw]
Subject: Re: [UPDATE PATCH][Bug 5132] fix sys_poll() large timeout handling

On 12.09.2005 [10:30:38 -0400], Peter Staubach wrote:
> Nishanth Aravamudan wrote:
>
> >On 09.09.2005 [19:36:21 -0700], Andrew Morton wrote:
> >
> >
> >>Nishanth Aravamudan <[email protected]> wrote:
> >>
> >>
> >>>+ /*
> >>>+ * We compare HZ with 1000 to work out which side of the
> >>>+ * expression needs conversion. Because we want to avoid
> >>>+ * converting any value to a numerically higher value, which
> >>>+ * could overflow.
> >>>+ */
> >>>+#if HZ > 1000
> >>>+ overflow = timeout_msecs >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT);
> >>>+#else
> >>>+ overflow = msecs_to_jiffies(timeout_msecs) >= MAX_SCHEDULE_TIMEOUT;
> >>>+#endif
> >>>+
> >>>+ /*
> >>>+ * If we would overflow in the conversion or a negative timeout
> >>>+ * is requested, sleep indefinitely.
> >>>+ */
> >>>+ if (overflow || timeout_msecs < 0)
> >>>+ timeout_jiffies = MAX_SCHEDULE_TIMEOUT;
> >>>
> >>>
> >>Do we need to test (timeout_msecs < 0) here? If we make timeout_msecs
> >>unsigned long then I think `overflow' will always be correct.
> >>
> >>
> >
> >Even though poll is explicitly allowed to take negative values, as per
> >my man-page:
> >
> >"#include <sys/poll.h>
> >
> >int poll(struct pollfd *ufds, unsigned int nfds, int timeout);
> >
> >...
> >
> >A negative value means infinite timeout."
> >
> >Would we have a local variable to store timeout_msecs as well? Or do we
> >want to make a userspace-visible change like this? I don't have a
> >preference, I just want to make sure I understand.
> >
>
> Actually, given this, isn't the interface for sys_poll() incorrectly
> defined?
> Shouldn't the timeout argument be an int, instead of a long?
>
> And, if we make it an int, then can't we do the math correctly for all
> possible values of the timeout? The patch could look like:
>
> Signed-off-by: Peter Staubach <[email protected]>
>

> --- linux-2.6.13/fs/select.c.org 2005-08-28 19:41:01.000000000 -0400
> +++ linux-2.6.13/fs/select.c 2005-09-12 10:19:30.000000000 -0400
> @@ -457,25 +457,34 @@ static int do_poll(unsigned int nfds, s
> return count;
> }
>
> -asmlinkage long sys_poll(struct pollfd __user * ufds, unsigned int nfds, long timeout)
> +asmlinkage long sys_poll(struct pollfd __user * ufds, unsigned int nfds, int timeout_msecs)
> {
> struct poll_wqueues table;
> int fdcount, err;
> unsigned int i;
> struct poll_list *head;
> struct poll_list *walk;
> + long timeout;
> + int64_t lltimeout;
>
> /* Do a sanity check on nfds ... */
> if (nfds > current->files->max_fdset && nfds > OPEN_MAX)
> return -EINVAL;
>
> - if (timeout) {
> - /* Careful about overflow in the intermediate values */
> - if ((unsigned long) timeout < MAX_SCHEDULE_TIMEOUT / HZ)
> - timeout = (unsigned long)(timeout*HZ+999)/1000+1;
> - else /* Negative or overflow */
> + if (timeout_msecs) {
> + if (timeout_msecs < 0)
> timeout = MAX_SCHEDULE_TIMEOUT;
> - }
> + else {
> + lltimeout = (int64_t)timeout_msecs * HZ + 999;
> + do_div(lltimeout, 1000);

I don't think the embedded folks are going to be ok with adding a 64-bit
div in the poll() common-path... But otherwise the patch looks pretty
sane, except I think you want s64, not int64_t? I can't ever remember
myself :)

I agree the interface mght be mis-defined. And changing timeout_msecs()
to an integer is consistent with the size of millisecond-unit variables
used elsewhere in the kernel.

Thanks,
Nish

2005-09-12 15:26:32

by Peter Staubach

[permalink] [raw]
Subject: Re: [UPDATE PATCH][Bug 5132] fix sys_poll() large timeout handling

--- linux-2.6.13/fs/select.c.org 2005-08-28 19:41:01.000000000 -0400
+++ linux-2.6.13/fs/select.c 2005-09-12 11:11:53.000000000 -0400
@@ -24,6 +24,7 @@
#include <linux/fs.h>

#include <asm/uaccess.h>
+#include <asm/div64.h>

#define ROUND_UP(x,y) (((x)+(y)-1)/(y))
#define DEFAULT_POLLMASK (POLLIN | POLLOUT | POLLRDNORM | POLLWRNORM)
@@ -457,25 +458,34 @@ static int do_poll(unsigned int nfds, s
return count;
}

-asmlinkage long sys_poll(struct pollfd __user * ufds, unsigned int nfds, long timeout)
+asmlinkage long sys_poll(struct pollfd __user * ufds, unsigned int nfds, int timeout_msecs)
{
struct poll_wqueues table;
int fdcount, err;
unsigned int i;
struct poll_list *head;
struct poll_list *walk;
+ long timeout;
+ int64_t lltimeout;

/* Do a sanity check on nfds ... */
if (nfds > current->files->max_fdset && nfds > OPEN_MAX)
return -EINVAL;

- if (timeout) {
- /* Careful about overflow in the intermediate values */
- if ((unsigned long) timeout < MAX_SCHEDULE_TIMEOUT / HZ)
- timeout = (unsigned long)(timeout*HZ+999)/1000+1;
- else /* Negative or overflow */
+ if (timeout_msecs) {
+ if (timeout_msecs < 0)
timeout = MAX_SCHEDULE_TIMEOUT;
- }
+ else {
+ lltimeout = (int64_t)timeout_msecs * HZ + 999;
+ do_div(lltimeout, 1000);
+ lltimeout++;
+ if (lltimeout > MAX_SCHEDULE_TIMEOUT)
+ timeout = MAX_SCHEDULE_TIMEOUT;
+ else
+ timeout = (long)lltimeout;
+ }
+ } else
+ timeout = 0;

poll_initwait(&table);

--- linux-2.6.13/include/linux/syscalls.h.org 2005-08-28 19:41:01.000000000 -0400
+++ linux-2.6.13/include/linux/syscalls.h 2005-09-12 10:22:25.000000000 -0400
@@ -420,7 +420,7 @@ asmlinkage long sys_socketpair(int, int,
asmlinkage long sys_socketcall(int call, unsigned long __user *args);
asmlinkage long sys_listen(int, int);
asmlinkage long sys_poll(struct pollfd __user *ufds, unsigned int nfds,
- long timeout);
+ int timeout_msecs);
asmlinkage long sys_select(int n, fd_set __user *inp, fd_set __user *outp,
fd_set __user *exp, struct timeval __user *tvp);
asmlinkage long sys_epoll_create(int size);


Attachments:
select.devel (2.06 kB)

2005-09-12 16:07:00

by Nishanth Aravamudan

[permalink] [raw]
Subject: Re: [UPDATE PATCH][Bug 5132] fix sys_poll() large timeout handling

On 12.09.2005 [11:19:32 -0400], Peter Staubach wrote:
> Nishanth Aravamudan wrote:
>
> >
> >I don't think the embedded folks are going to be ok with adding a 64-bit
> >div in the poll() common-path... But otherwise the patch looks pretty
> >sane, except I think you want s64, not int64_t? I can't ever remember
> >myself :)
> >
> >
>
> Oops, I missed an include which is required. The updated patch is attached.

Could you update your patch against 2.6.13-mm3? Andrew has aleady pulled
in the patch I sent with his changes. It should make your pactch a
little smaller (and only need to touch fs/select.c).

> A 64 bit quantity divided by a 32 bit quantity seems to be a standard
> interface in the kernel and is used fairly widely, so I suspect that
> the embedded folks have already done the work.

I understand that the interface is common enough, but I'm not sure how
appreciated it is :)

> I don't know which should be used, s64 or int64_t. I chose int64_t
> because then it would (more or less) match the interface of do_div().

Yes, I'll leave this to someone else to figure out...

> >I agree the interface mght be mis-defined. And changing timeout_msecs()
> >to an integer is consistent with the size of millisecond-unit variables
> >used elsewhere in the kernel.
> >
>
> Yes, it also makes the kernel definition for sys_poll() match the user level
> definition for poll(2) found in <poll.h>.

Yup, sounds good to me.

Thanks,
Nish