Hi,
The linux implementation of setitimer() doesn't behave quite as
expected. I found several problems:
1. POSIX says that negative times should cause setitimer() to
return -1 and set errno to EINVAL. In linux, the call succeeds.
2. POSIX says that time values with usec >= 1000000 should
cause the same behaviour. In linux, the call succeeds.
3. If large time values are given, linux quietly truncates them
to the maximum time representable in jiffies. On 2.4.4 on PPC,
that's about 248 days. On 2.6.10 on x86, that's about 24 days.
POSIX doesn't really say what to do in this case, but looking at
established practice, i.e. "what BSD does", since the call comes
from BSD, *BSD returns -1 if the time is out of range.
References:
kernel/itimer.c
http://www.opengroup.org/onlinepubs/009695399/functions/getitimer.html
http://www.delorie.com/gnu/docs/glibc/libc_444.html
http://www.gsp.com/cgi-bin/man.cgi?section=2&topic=setitimer
I have tested on 2.4.4 and 2.6.10 as well as getting other people to
test on 2.4.28, 2.4.27 and FreeBSD.
Here's a patch against linux 2.6.10:
--- linux-2.6.10/kernel/itimer.c 2005-01-15 09:39:13.000000000 +0100
+++ linux-2.6.10.fixed/kernel/itimer.c 2005-01-13 09:56:50.000000000 +0100
@@ -84,6 +84,16 @@
register unsigned long i, j;
int k;
+ if (value->it_interval.tv_sec >= MAX_SEC_IN_JIFFIES
+ || value->it_value.tv_sec >= MAX_SEC_IN_JIFFIES
+ || value->it_interval.tv_sec < 0
+ || value->it_value.tv_sec < 0
+ || value->it_interval.tv_usec >= 1000000
+ || value->it_value.tv_usec >= 1000000
+ || value->it_interval.tv_usec < 0
+ || value->it_value.tv_usec < 0)
+ return -EINVAL;
+
i = timeval_to_jiffies(&value->it_interval);
j = timeval_to_jiffies(&value->it_value);
if (ovalue && (k = do_getitimer(which, ovalue)) < 0)
I've put it at http://www.corelatus.com/~matthias/posix_setitimer.patch
along with a test program, http://www.corelatus.com/~matthias/ti.c
Here's what the test program looks like when run on a patched 2.6.10 kernel:
tmp >gcc -Wall ti.c
tmp >./a.out
value sec=0 usec=0 result=0 expected=0
interval sec=0 usec=0 result=0 expected=0
value sec=1 usec=0 result=0 expected=0
interval sec=1 usec=0 result=0 expected=0
value sec=-1 usec=0 result=-1 expected=-1
interval sec=-1 usec=0 result=-1 expected=-1
value sec=0 usec=-1 result=-1 expected=-1
interval sec=0 usec=-1 result=-1 expected=-1
value sec=0 usec=1000000 result=-1 expected=-1
interval sec=0 usec=1000000 result=-1 expected=-1
value sec=0 usec=999999 result=0 expected=0
interval sec=0 usec=999999 result=0 expected=0
limit is 2147156
value sec=2147155 usec=0 result=0 expected=0
interval sec=2147155 usec=0 result=0 expected=0
value sec=2147156 usec=0 result=0 expected=0
interval sec=2147156 usec=0 result=0 expected=0
value sec=2147157 usec=0 result=-1 expected=-1
interval sec=2147157 usec=0 result=-1 expected=-1
All tests passed
Matt
Matthias Lang <[email protected]> wrote:
>
> The linux implementation of setitimer() doesn't behave quite as
> expected. I found several problems:
>
> 1. POSIX says that negative times should cause setitimer() to
> return -1 and set errno to EINVAL. In linux, the call succeeds.
>
> 2. POSIX says that time values with usec >= 1000000 should
> cause the same behaviour. In linux, the call succeeds.
>
> 3. If large time values are given, linux quietly truncates them
> to the maximum time representable in jiffies. On 2.4.4 on PPC,
> that's about 248 days. On 2.6.10 on x86, that's about 24 days.
>
> POSIX doesn't really say what to do in this case, but looking at
> established practice, i.e. "what BSD does", since the call comes
> from BSD, *BSD returns -1 if the time is out of range.
>
These are things we probably cannot change now. All three are arguably
sensible behaviour and do satisfy the principle of least surprise. So
there may be apps out there which will break if we "fix" these things.
If the kernel version was 2.7.0 then well maybe...
Matthias Lang <[email protected]> wrote:
>> The linux implementation of setitimer() doesn't behave quite as
>> expected. I found several problems:
>> 1. POSIX says that negative times should cause setitimer() to
>> return -1 and set errno to EINVAL. In linux, the call succeeds.
>> 2. POSIX says that time values with usec >= 1000000 should
>> cause the same behaviour. In linux, the call succeeds.
>> 3. If large time values are given, linux quietly truncates them
>> to the maximum time representable in jiffies. On 2.4.4 on PPC,
>> that's about 248 days. On 2.6.10 on x86, that's about 24 days.
>> POSIX doesn't really say what to do in this case, but looking at
>> established practice, i.e. "what BSD does", since the call comes
>> from BSD, *BSD returns -1 if the time is out of range.
On Sat, Jan 15, 2005 at 01:30:13AM -0800, Andrew Morton wrote:
> These are things we probably cannot change now. All three are arguably
> sensible behaviour and do satisfy the principle of least surprise. So
> there may be apps out there which will break if we "fix" these things.
> If the kernel version was 2.7.0 then well maybe...
We can easily do a "rolling upgrade" by adding new versions of the
system calls, giving glibc and apps grace periods to adjust to them,
and nuking the old versions in a few years.
-- wli
On Sat, 2005-01-15 at 01:36 -0800, William Lee Irwin III wrote:
> Matthias Lang <[email protected]> wrote:
> >> The linux implementation of setitimer() doesn't behave quite as
> >> expected. I found several problems:
> >> 1. POSIX says that negative times should cause setitimer() to
> >> return -1 and set errno to EINVAL. In linux, the call succeeds.
> >> 2. POSIX says that time values with usec >= 1000000 should
> >> cause the same behaviour. In linux, the call succeeds.
> >> 3. If large time values are given, linux quietly truncates them
> >> to the maximum time representable in jiffies. On 2.4.4 on PPC,
> >> that's about 248 days. On 2.6.10 on x86, that's about 24 days.
> >> POSIX doesn't really say what to do in this case, but looking at
> >> established practice, i.e. "what BSD does", since the call comes
> >> from BSD, *BSD returns -1 if the time is out of range.
>
> On Sat, Jan 15, 2005 at 01:30:13AM -0800, Andrew Morton wrote:
> > These are things we probably cannot change now. All three are arguably
> > sensible behaviour and do satisfy the principle of least surprise. So
> > there may be apps out there which will break if we "fix" these things.
> > If the kernel version was 2.7.0 then well maybe...
>
> We can easily do a "rolling upgrade" by adding new versions of the
> system calls, giving glibc and apps grace periods to adjust to them,
> and nuking the old versions in a few years.
but for 1: do we care? it is being more tolerant than allowed by a
standard. Those who care can easily add the test in the userspace
wrapper
for 2: we again are more tolerant and dtrt; again. And again userspace
wrapper can impose an additional restriction if it wants
3 is more nasty and needs thinking; we could consider a fix inside the
kernel that actually does wait long enough
I don't see a valid reason to restrict/reject input that is accepted now
and dealt with reasonably because some standard says so (if you design a
new api, following the standard is nice of course). I don't see "doesn't
reject a condition that can reasonable be dealt with" as a good reason
to go double ABI at all.
On Sat, 2005-01-15 at 01:36 -0800, William Lee Irwin III wrote:
>> We can easily do a "rolling upgrade" by adding new versions of the
>> system calls, giving glibc and apps grace periods to adjust to them,
>> and nuking the old versions in a few years.
On Sat, Jan 15, 2005 at 10:58:45AM +0100, Arjan van de Ven wrote:
> but for 1: do we care? it is being more tolerant than allowed by a
> standard. Those who care can easily add the test in the userspace
> wrapper
> for 2: we again are more tolerant and dtrt; again. And again userspace
> wrapper can impose an additional restriction if it wants
> 3 is more nasty and needs thinking; we could consider a fix inside the
> kernel that actually does wait long enough
> I don't see a valid reason to restrict/reject input that is accepted now
> and dealt with reasonably because some standard says so (if you design a
> new api, following the standard is nice of course). I don't see "doesn't
> reject a condition that can reasonable be dealt with" as a good reason
> to go double ABI at all.
These are probably better reasons against fiddling with ABI shifts and
against starting 2.7 for its sake than I could come up with. Thanks.
-- wli
On Sat, Jan 15, 2005 at 10:58:45AM +0100, Arjan van de Ven wrote:
> I don't see a valid reason to restrict/reject input that is accepted
> now and dealt with reasonably because some standard says so (if you
> design a new api, following the standard is nice of course). I don't
> see "doesn't reject a condition that can reasonable be dealt with"
> as a good reason to go double ABI at all.
we could printk for now and if nobody reports this to lkml (as they
do/did with oldish tcpdump/libpcap a while ago) we could -EINVAL
On Sat, 2005-01-15 at 11:55 -0800, Chris Wedgwood wrote:
> On Sat, Jan 15, 2005 at 10:58:45AM +0100, Arjan van de Ven wrote:
>
> > I don't see a valid reason to restrict/reject input that is accepted
> > now and dealt with reasonably because some standard says so (if you
> > design a new api, following the standard is nice of course). I don't
> > see "doesn't reject a condition that can reasonable be dealt with"
> > as a good reason to go double ABI at all.
>
> we could printk for now and if nobody reports this to lkml (as they
> do/did with oldish tcpdump/libpcap a while ago) we could -EINVAL
>
but why????
if someone wants the stuff rejected in a posix confirm way, he can do
these tests easily in the syscall wrapper he needs anyway for this
function.
Chris Wedgewood suggested handling this with a printk, to which Arjan
van de Ven asked
> but why????
>
> if someone wants the stuff rejected in a posix confirm way, he can do
> these tests easily in the syscall wrapper he needs anyway for this
> function.
For negative times and oversized usec values, that's easy. But the
third problem was that setitimer() may silently truncate the time
value. To deal with that, a wrapper would need to
a) know that this silent truncation happens in the first place.
The only way I know of finding that out is to read the kernel
source. (the man page doesn't say anything, and POSIX doesn't
mention any silent truncation either)
and
b) Know that the particular value the truncation happens at is
dependent on HZ (and, presumably, know what HZ is on that
particular machine)
I found it surprising that the timer set by setitimer() could expire
before the time passed to it---the manpage explicitly promises that
will never happen.
On many (most?) machines, the obvious symptoms of this truncation
don't start appearing until after 248 days of uptime, so it's not the
sort of problem which jumps out in testing. A printk() warning would
have helped me. As would a warning in the manpage, e.g.:
| BUGS
|
| Under Linux, timers will expire before the requested time if the
| requested time is larger than MAX_SEC_IN_JIFFIES, which is
| defined in include/linux/jiffies.h.
Where can I send manpage improvements?
Matthias
Matthias Lang wrote:
> Chris Wedgewood suggested handling this with a printk, to which Arjan
> van de Ven asked
>
> > but why????
> >
> > if someone wants the stuff rejected in a posix confirm way, he can do
> > these tests easily in the syscall wrapper he needs anyway for this
> > function.
>
> For negative times and oversized usec values, that's easy. But the
> third problem was that setitimer() may silently truncate the time
> value. To deal with that, a wrapper would need to
>
> a) know that this silent truncation happens in the first place.
> The only way I know of finding that out is to read the kernel
> source. (the man page doesn't say anything, and POSIX doesn't
> mention any silent truncation either)
>
> and
>
> b) Know that the particular value the truncation happens at is
> dependent on HZ (and, presumably, know what HZ is on that
> particular machine)
>
> I found it surprising that the timer set by setitimer() could expire
> before the time passed to it---the manpage explicitly promises that
> will never happen.
>
> On many (most?) machines, the obvious symptoms of this truncation
> don't start appearing until after 248 days of uptime, so it's not the
> sort of problem which jumps out in testing. A printk() warning would
> have helped me. As would a warning in the manpage, e.g.:
>
> | BUGS
> |
> | Under Linux, timers will expire before the requested time if the
> | requested time is larger than MAX_SEC_IN_JIFFIES, which is
> | defined in include/linux/jiffies.h.
>
> Where can I send manpage improvements?
aeb wrote on 2004-OCT-31:
Fortunately Michael Kerrisk has accepted to take over.
Send corrections and additions to [email protected] .
--
~Randy
On Sad, 2005-01-15 at 09:30, Andrew Morton wrote:
> Matthias Lang <[email protected]> wrote:
> These are things we probably cannot change now. All three are arguably
> sensible behaviour and do satisfy the principle of least surprise. So
> there may be apps out there which will break if we "fix" these things.
>
> If the kernel version was 2.7.0 then well maybe...
These are things we should fix. They are bugs. Since there is no 2.7
plan pick a date to fix it. We should certainly error the overflow case
*now* because the behaviour is undefined/broken. The other cases I'm not
clear about. setitimer() is a library interface and it can do the basic
checking and error if it wants to be strictly posixly compliant.
On Sun, 2005-01-16 at 00:58 +0000, Alan Cox wrote:
> On Sad, 2005-01-15 at 09:30, Andrew Morton wrote:
> > Matthias Lang <[email protected]> wrote:
> > These are things we probably cannot change now. All three are arguably
> > sensible behaviour and do satisfy the principle of least surprise. So
> > there may be apps out there which will break if we "fix" these things.
> >
> > If the kernel version was 2.7.0 then well maybe...
>
> These are things we should fix. They are bugs. Since there is no 2.7
> plan pick a date to fix it. We should certainly error the overflow case
> *now* because the behaviour is undefined/broken. The other cases I'm not
> clear about. setitimer() is a library interface and it can do the basic
> checking and error if it wants to be strictly posixly compliant.
why error?
I'm pretty sure we can make a loop in the setitimer code that detects
we're at the end of jiffies but haven't upsurped the entire interval the
user requested yet, so that the code should just do another round of
sleeping...
Arjan van de Ven wrote:
> On Sun, 2005-01-16 at 00:58 +0000, Alan Cox wrote:
>
>>On Sad, 2005-01-15 at 09:30, Andrew Morton wrote:
>>
>>>Matthias Lang <[email protected]> wrote:
>>>These are things we probably cannot change now. All three are arguably
>>>sensible behaviour and do satisfy the principle of least surprise. So
>>>there may be apps out there which will break if we "fix" these things.
>>>
>>>If the kernel version was 2.7.0 then well maybe...
>>
>>These are things we should fix. They are bugs. Since there is no 2.7
>>plan pick a date to fix it. We should certainly error the overflow case
>>*now* because the behaviour is undefined/broken. The other cases I'm not
>>clear about. setitimer() is a library interface and it can do the basic
>>checking and error if it wants to be strictly posixly compliant.
>
>
> why error?
> I'm pretty sure we can make a loop in the setitimer code that detects
> we're at the end of jiffies but haven't upsurped the entire interval the
> user requested yet, so that the code should just do another round of
> sleeping...
>
That would work for sleep (but glibc uses nanosleep for that) but an itimer
delivers a signal. Rather hard to trap that in glibc.
For what its worth, you can tell it was truncated by reading back the remaining
time. That could be done in the glibc wrapper and an error passed back, but
actually doing the proper wait using this interface is rather hard.
The timer_settime(CLOCK... interface detects overflow and passes back an error...
Clock_nanosleep on the otherhand, will sleep for the whole time, provide the
machine doesn't turn to dust :)
--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
On Wed, 2005-01-19 at 15:51 -0800, George Anzinger wrote:
> Arjan van de Ven wrote:
> > On Sun, 2005-01-16 at 00:58 +0000, Alan Cox wrote:
> >
> >>On Sad, 2005-01-15 at 09:30, Andrew Morton wrote:
> >>
> >>>Matthias Lang <[email protected]> wrote:
> >>>These are things we probably cannot change now. All three are arguably
> >>>sensible behaviour and do satisfy the principle of least surprise. So
> >>>there may be apps out there which will break if we "fix" these things.
> >>>
> >>>If the kernel version was 2.7.0 then well maybe...
> >>
> >>These are things we should fix. They are bugs. Since there is no 2.7
> >>plan pick a date to fix it. We should certainly error the overflow case
> >>*now* because the behaviour is undefined/broken. The other cases I'm not
> >>clear about. setitimer() is a library interface and it can do the basic
> >>checking and error if it wants to be strictly posixly compliant.
> >
> >
> > why error?
> > I'm pretty sure we can make a loop in the setitimer code that detects
> > we're at the end of jiffies but haven't upsurped the entire interval the
> > user requested yet, so that the code should just do another round of
> > sleeping...
> >
> That would work for sleep (but glibc uses nanosleep for that) but an itimer
> delivers a signal. Rather hard to trap that in glibc.
>
This one I meant to fix in the kernel fwiw; we can put that loop inside
the kernel easily I'm sure
Arjan van de Ven wrote:
> On Wed, 2005-01-19 at 15:51 -0800, George Anzinger wrote:
>
>>Arjan van de Ven wrote:
>>
>>>On Sun, 2005-01-16 at 00:58 +0000, Alan Cox wrote:
>>>
>>>
>>>>On Sad, 2005-01-15 at 09:30, Andrew Morton wrote:
>>>>
>>>>
>>>>>Matthias Lang <[email protected]> wrote:
>>>>>These are things we probably cannot change now. All three are arguably
>>>>>sensible behaviour and do satisfy the principle of least surprise. So
>>>>>there may be apps out there which will break if we "fix" these things.
>>>>>
>>>>>If the kernel version was 2.7.0 then well maybe...
>>>>
>>>>These are things we should fix. They are bugs. Since there is no 2.7
>>>>plan pick a date to fix it. We should certainly error the overflow case
>>>>*now* because the behaviour is undefined/broken. The other cases I'm not
>>>>clear about. setitimer() is a library interface and it can do the basic
>>>>checking and error if it wants to be strictly posixly compliant.
>>>
>>>
>>>why error?
>>>I'm pretty sure we can make a loop in the setitimer code that detects
>>>we're at the end of jiffies but haven't upsurped the entire interval the
>>>user requested yet, so that the code should just do another round of
>>>sleeping...
>>>
>>
>>That would work for sleep (but glibc uses nanosleep for that) but an itimer
>>delivers a signal. Rather hard to trap that in glibc.
>>
>
> This one I meant to fix in the kernel fwiw; we can put that loop inside
> the kernel easily I'm sure
Yes, but it will increase the data size of the timer...
--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
> >
> > This one I meant to fix in the kernel fwiw; we can put that loop inside
> > the kernel easily I'm sure
>
> Yes, but it will increase the data size of the timer...
>
eh how?
the way I think it can be done is to just have multiple timers fire
until the total time is up. It's not a performance issue (a timer firing
every 24 days.. who cares, esp since such long delays are rare anyway)
after all...
Arjan van de Ven wrote:
>>>This one I meant to fix in the kernel fwiw; we can put that loop inside
>>>the kernel easily I'm sure
>>
>>Yes, but it will increase the data size of the timer...
>>
>
>
> eh how?
> the way I think it can be done is to just have multiple timers fire
> until the total time is up. It's not a performance issue (a timer firing
> every 24 days.. who cares, esp since such long delays are rare anyway)
> after all...
Sure that works, but you still need to keep info around on when the timer is
supposed to expire. This will be at least two words (u64)jiffies_expire_time.
This would likly end up in the task struc along with the timer itself.
--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/