LinuxLists.cc - [patch 1/2] Validate itimer timeval from userspace

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

On Sat, 2006-03-18 at 20:10 +0100, Ingo Molnar wrote:
> * Thomas Gleixner <[email protected]> wrote:
>
> > According to the specification the timeval must be validated and an
> > errorcode -EINVAL returned in case the timeval is not in canonical
> > form. Before the hrtimer merge this was silently ignored by the
> > timeval to jiffies conversion. The validation is done inside
> > do_setitimer so all callers are catched.
> >
> > Signed-off-by: Thomas Gleixner <[email protected]>
>
> ok - bad (invalid) timevals were thus randomly interpreted? I agree that
> even though this is new behavior, it is much better to return -EINVAL
> than to behave randomly. OTOH, since 2.6.15 and earlier did this too, is
> there any urgency to apply this to 2.6.16?

Sorry, I explained it badly.

The negative timer values were converted to MAX_JIFFIES_PER_LONG
timeouts and intervals.

The hrtimer code expects canonical values for the start time and the
interval. The random non canonical values should not do any damage, but
the normalizing routines might loop for a while. The negative start time
will be treated as expired and negative intervals are adjusted to
resolution (jiffie) in the hrtimer_forward code.

So the behaviour is different anyway, but I prefer the clear -EINVAL to
randomness.

tglx

2006-03-18 20:10:30

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

Thomas Gleixner <[email protected]> wrote:
>
>
> According to the specification the timeval must be validated and
> an errorcode -EINVAL returned in case the timeval is not in canonical
> form. Before the hrtimer merge this was silently ignored by the
> timeval to jiffies conversion. The validation is done inside
> do_setitimer so all callers are catched.
>
> ...
>
> --- linux-2.6.16-rc6-updates.orig/include/linux/time.h
> +++ linux-2.6.16-rc6-updates/include/linux/time.h
> @@ -73,6 +73,12 @@ extern void set_normalized_timespec(stru
> #define timespec_valid(ts) \
> (((ts)->tv_sec >= 0) && (((unsigned long) (ts)->tv_nsec) < NSEC_PER_SEC))
>
> +/*
> + * Returns true if the timeval is in canonical form
> + */
> +#define timeval_valid(t) \
> + (((t)->tv_sec >= 0) && (((unsigned long) (t)->tv_usec) < USEC_PER_SEC))
> +
> extern struct timespec xtime;
> extern struct timespec wall_to_monotonic;
> extern seqlock_t xtime_lock;
> Index: linux-2.6.16-rc6-updates/kernel/itimer.c
> ===================================================================
> --- linux-2.6.16-rc6-updates.orig/kernel/itimer.c
> +++ linux-2.6.16-rc6-updates/kernel/itimer.c
> @@ -150,6 +150,14 @@ int do_setitimer(int which, struct itime
> ktime_t expires;
> cputime_t cval, cinterval, nval, ninterval;
>
> + /*
> + * Validate the timeval. This catches all users of
> + * do_setitimer.
> + */
> + if (!timeval_valid(&value->it_value) ||
> + !timeval_valid(&value->it_interval))
> + return -EINVAL;
> +
> switch (which) {
> case ITIMER_REAL:
> again:

>From my reading, 2.4's sys_setitimer() will normalise the incoming timeval
rather than rejecting it. And I think 2.6.13 did that too.

It would be bad of us to change this behaviour, even if that's what the
spec says we should do - because we can break existing applications.

So I think we're stuck with it - we should normalise and then accept such
timevals. And we should have a big comment explaining how we differ from
the spec, and why.

2006-03-18 20:16:06

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

On Sat, 2006-03-18 at 12:07 -0800, Andrew Morton wrote:

> From my reading, 2.4's sys_setitimer() will normalise the incoming timeval
> rather than rejecting it. And I think 2.6.13 did that too.
>
> It would be bad of us to change this behaviour, even if that's what the
> spec says we should do - because we can break existing applications.
>
> So I think we're stuck with it - we should normalise and then accept such
> timevals. And we should have a big comment explaining how we differ from
> the spec, and why.

Hmm. How do you treat a negative value ?

tglx

2006-03-18 20:23:55

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

On 3/18/06, Andrew Morton <[email protected]> wrote:
> Thomas Gleixner <[email protected]> wrote:
> >
> > According to the specification the timeval must be validated and
> > an errorcode -EINVAL returned in case the timeval is not in canonical
> > form. Before the hrtimer merge this was silently ignored by the
> > timeval to jiffies conversion. The validation is done inside
> > do_setitimer so all callers are catched.
> >
[...]
>
> From my reading, 2.4's sys_setitimer() will normalise the incoming timeval
> rather than rejecting it. And I think 2.6.13 did that too.
>
> It would be bad of us to change this behaviour, even if that's what the
> spec says we should do - because we can break existing applications.
>
> So I think we're stuck with it - we should normalise and then accept such
> timevals. And we should have a big comment explaining how we differ from
> the spec, and why.
>

Wouldn't this only break existing applications that do incorrect
things (passing invalid values) ?
If that's the case I'd say breaking them is OK and we should change to
follow the spec.

I don't like potential userspace breakage any more than the next guy,
but if the breakage only affects buggy applications then I think it's
more acceptable.

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-18 20:30:11

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

On Sat, 2006-03-18 at 21:23 +0100, Jesper Juhl wrote:

> Wouldn't this only break existing applications that do incorrect
> things (passing invalid values) ?
> If that's the case I'd say breaking them is OK and we should change to
> follow the spec.
>
> I don't like potential userspace breakage any more than the next guy,
> but if the breakage only affects buggy applications then I think it's
> more acceptable.

Yes, it only breaks buggy applications.

On a full blown desktop the check (I added a printk) did not trigger.

The only application I found so far was the LTP setitimer "correctness"
test, which did not initialize it_interval and handed random garbage to
the kernel. :)

tglx

2006-03-18 20:34:04

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

Thomas Gleixner <[email protected]> wrote:
>
> On Sat, 2006-03-18 at 12:07 -0800, Andrew Morton wrote:
>
> > From my reading, 2.4's sys_setitimer() will normalise the incoming timeval
> > rather than rejecting it. And I think 2.6.13 did that too.
> >
> > It would be bad of us to change this behaviour, even if that's what the
> > spec says we should do - because we can break existing applications.
> >
> > So I think we're stuck with it - we should normalise and then accept such
> > timevals. And we should have a big comment explaining how we differ from
> > the spec, and why.
>
> Hmm. How do you treat a negative value ?
>

In the same way as earlier kernels did!

Unless, of course, those kernels did something utterly insane. In that
case we'd need to have a little think.

2006-03-18 20:38:41

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

On Sat, 2006-03-18 at 12:31 -0800, Andrew Morton wrote:
> Thomas Gleixner <[email protected]> wrote:
> >
> > On Sat, 2006-03-18 at 12:07 -0800, Andrew Morton wrote:
> >
> > > From my reading, 2.4's sys_setitimer() will normalise the incoming timeval
> > > rather than rejecting it. And I think 2.6.13 did that too.
> > >
> > > It would be bad of us to change this behaviour, even if that's what the
> > > spec says we should do - because we can break existing applications.
> > >
> > > So I think we're stuck with it - we should normalise and then accept such
> > > timevals. And we should have a big comment explaining how we differ from
> > > the spec, and why.
> >
> > Hmm. How do you treat a negative value ?
> >
>
> In the same way as earlier kernels did!
>
> Unless, of course, those kernels did something utterly insane. In that
> case we'd need to have a little think.

It was caught by:

timeval_to_jiffies(const struct timeval *value)
{
unsigned long sec = value->tv_sec;
long usec = value->tv_usec;

if (sec >= MAX_SEC_IN_JIFFIES)
sec = MAX_SEC_IN_JIFFIES;
....
}

The conversion of long to unsigned long converted a negative value to
the maximum timeout.

It's not utterly insane, but it does not make much sense either.

Of course I can convert it that way, if we want to keep this "help
sloppy programmers aid" alive.

tglx

2006-03-18 20:45:41

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

On 3/18/06, Andrew Morton <[email protected]> wrote:
> Thomas Gleixner <[email protected]> wrote:
> >
> > On Sat, 2006-03-18 at 12:07 -0800, Andrew Morton wrote:
> >
> > > From my reading, 2.4's sys_setitimer() will normalise the incoming timeval
> > > rather than rejecting it. And I think 2.6.13 did that too.
> > >
> > > It would be bad of us to change this behaviour, even if that's what the
> > > spec says we should do - because we can break existing applications.
> > >
> > > So I think we're stuck with it - we should normalise and then accept such
> > > timevals. And we should have a big comment explaining how we differ from
> > > the spec, and why.
> >
> > Hmm. How do you treat a negative value ?
> >
>
> In the same way as earlier kernels did!
>
> Unless, of course, those kernels did something utterly insane. In that
> case we'd need to have a little think.
>

If the change only affects buggy apps (as Thomas says), then it seems
completely obvious to me that the change should be made.

1. We'll be in compliance with the spec
2. Buggy applications will actually be helped by this by getting a
clear error instead of undefined behaviour silently hiding the fact
that they are buggy.
3. Correct applications are unaffected.

Seems like a no-brainer to me...

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-18 21:07:49

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

Thomas Gleixner <[email protected]> wrote:
>
> On Sat, 2006-03-18 at 21:23 +0100, Jesper Juhl wrote:
>
> > Wouldn't this only break existing applications that do incorrect
> > things (passing invalid values) ?
> > If that's the case I'd say breaking them is OK and we should change to
> > follow the spec.
> >
> > I don't like potential userspace breakage any more than the next guy,
> > but if the breakage only affects buggy applications then I think it's
> > more acceptable.
>
> Yes, it only breaks buggy applications.

But we live in the real world. There could be four-year-old applications
which passed all their Linux QA and which work perfectly well.

Then the kernel guys make some correctness change and that application
totally fails on new kernels. Your choice is a) don't use new kernels or
b) hold off the new kernel until your provider (if the company or internal
group still exists) has put out a new version of the application and then
you wear the (considerable) cost of upgrading what was a perfectly-running
application.

And whose fault was it? Ours. Because older kernels had the wrong
checking (thus causing that app's QA to pass) and because later kernels
changed the rules.

2006-03-18 21:12:26

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

Thomas Gleixner <[email protected]> wrote:
>
> On Sat, 2006-03-18 at 12:31 -0800, Andrew Morton wrote:
> > Thomas Gleixner <[email protected]> wrote:
> > >
> > > On Sat, 2006-03-18 at 12:07 -0800, Andrew Morton wrote:
> > >
> > > > From my reading, 2.4's sys_setitimer() will normalise the incoming timeval
> > > > rather than rejecting it. And I think 2.6.13 did that too.
> > > >
> > > > It would be bad of us to change this behaviour, even if that's what the
> > > > spec says we should do - because we can break existing applications.
> > > >
> > > > So I think we're stuck with it - we should normalise and then accept such
> > > > timevals. And we should have a big comment explaining how we differ from
> > > > the spec, and why.
> > >
> > > Hmm. How do you treat a negative value ?
> > >
> >
> > In the same way as earlier kernels did!
> >
> > Unless, of course, those kernels did something utterly insane. In that
> > case we'd need to have a little think.
>
> It was caught by:
>
> timeval_to_jiffies(const struct timeval *value)
> {
> unsigned long sec = value->tv_sec;
> long usec = value->tv_usec;
>
> if (sec >= MAX_SEC_IN_JIFFIES)
> sec = MAX_SEC_IN_JIFFIES;
> ....
> }
>
> The conversion of long to unsigned long converted a negative value to
> the maximum timeout.
>
> It's not utterly insane, but it does not make much sense either.
>
> Of course I can convert it that way, if we want to keep this "help
> sloppy programmers aid" alive.
>

It would be strange to set an alarm for 0xffffffff seconds in the future
but yeah, unless we can point at a reason why nobody could have ever been
doing that, we should turn this into permanent, documented behaviour of
Linux 2.6 and earlier, I'm afraid.

2006-03-18 21:26:40

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

On 3/18/06, Andrew Morton <[email protected]> wrote:
> Thomas Gleixner <[email protected]> wrote:
> >
> > On Sat, 2006-03-18 at 12:31 -0800, Andrew Morton wrote:
> > > Thomas Gleixner <[email protected]> wrote:
> > > >
> > > > On Sat, 2006-03-18 at 12:07 -0800, Andrew Morton wrote:
> > > >
> > > > > From my reading, 2.4's sys_setitimer() will normalise the incoming timeval
> > > > > rather than rejecting it. And I think 2.6.13 did that too.
> > > > >
> > > > > It would be bad of us to change this behaviour, even if that's what the
> > > > > spec says we should do - because we can break existing applications.
> > > > >
> > > > > So I think we're stuck with it - we should normalise and then accept such
> > > > > timevals. And we should have a big comment explaining how we differ from
> > > > > the spec, and why.
> > > >
> > > > Hmm. How do you treat a negative value ?
> > > >
> > >
> > > In the same way as earlier kernels did!
> > >
> > > Unless, of course, those kernels did something utterly insane. In that
> > > case we'd need to have a little think.
> >
> > It was caught by:
> >
> > timeval_to_jiffies(const struct timeval *value)
> > {
> > unsigned long sec = value->tv_sec;
> > long usec = value->tv_usec;
> >
> > if (sec >= MAX_SEC_IN_JIFFIES)
> > sec = MAX_SEC_IN_JIFFIES;
> > ....
> > }
> >
> > The conversion of long to unsigned long converted a negative value to
> > the maximum timeout.
> >
> > It's not utterly insane, but it does not make much sense either.
> >
> > Of course I can convert it that way, if we want to keep this "help
> > sloppy programmers aid" alive.
> >
>
> It would be strange to set an alarm for 0xffffffff seconds in the future
> but yeah, unless we can point at a reason why nobody could have ever been
> doing that, we should turn this into permanent, documented behaviour of
> Linux 2.6 and earlier, I'm afraid.
>

How about 0xffffffff seconds into the future being the same as 136
years (unless I botched the math)... That means that if any Linux
application ever did that it's still waiting for the alarm and will be
for at least another century...
I'd say that makes it pretty certain that noone are doing or have been
doing that without spotting the problem somehow - apps with such a bug
are unlikely to be in production and actually working correctly.

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-18 21:36:55

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

"Jesper Juhl" <[email protected]> wrote:
>
> > It would be strange to set an alarm for 0xffffffff seconds in the future
> > but yeah, unless we can point at a reason why nobody could have ever been
> > doing that, we should turn this into permanent, documented behaviour of
> > Linux 2.6 and earlier, I'm afraid.
> >
>
> How about 0xffffffff seconds into the future being the same as 136
> years (unless I botched the math)... That means that if any Linux
> application ever did that it's still waiting for the alarm and will be
> for at least another century...
> I'd say that makes it pretty certain that noone are doing or have been
> doing that without spotting the problem somehow - apps with such a bug
> are unlikely to be in production and actually working correctly.

We just don't know. People do all sorts of things.

How about this?

$ cat /etc/my-expensive-app.conf
#
# Interval polling timer. Set this to -1 to disable
#
interval_polling_timer=-1

We just don't know.

2006-03-18 21:50:32

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

On 3/18/06, Andrew Morton <[email protected]> wrote:
> "Jesper Juhl" <[email protected]> wrote:
> >
> > > It would be strange to set an alarm for 0xffffffff seconds in the future
> > > but yeah, unless we can point at a reason why nobody could have ever been
> > > doing that, we should turn this into permanent, documented behaviour of
> > > Linux 2.6 and earlier, I'm afraid.
> > >
> >
> > How about 0xffffffff seconds into the future being the same as 136
> > years (unless I botched the math)... That means that if any Linux
> > application ever did that it's still waiting for the alarm and will be
> > for at least another century...
> > I'd say that makes it pretty certain that noone are doing or have been
> > doing that without spotting the problem somehow - apps with such a bug
> > are unlikely to be in production and actually working correctly.
>
> We just don't know. People do all sorts of things.
>
> How about this?
>
> $ cat /etc/my-expensive-app.conf
> #
> # Interval polling timer. Set this to -1 to disable
> #
> interval_polling_timer=-1
>
> We just don't know.
>
True, someone might be doing something like that, but consider this as well;

an app promts the user to enter a value and passes it on, expecting to
get EINVAL back if the value is wrong and has code in place to handle
that (like, say, prompt the user for a different value).

That app is not buggy, but it will not work correctly because *we* are buggy.

IMO we should cater to the correctly written app and prevent it from
breaking unexpectedly rather than continue to keep the buggy one
working.

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-18 22:02:45

by Ray Lee

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

On 3/18/06, Jesper Juhl <[email protected]> wrote:
> If the change only affects buggy apps (as Thomas says), then it seems
> completely obvious to me that the change should be made.

But the app isn't buggy, it's just not coded to some arbitrary spec.
Further, an arbitrary spec that *the kernel didn't implement*. The app
author could very well have been competent and tested that behavior in
a ten line program (I do that sort of code *all the time* to test
corner cases that aren't clear in man pages). Once tested, they found
out -1 is an effectively infinite timeout, went "Hey, cool, that makes
sense", and went on with their day.

You're now arguing that we should break apps -- possibly well tested
apps -- because they didn't implement a spec that the kernel itself
wasn't implementing.

That's just nuts.

> 3. Correct applications are unaffected.

You're assuming that the apps that we'd break are incorrect. That's a
big assumption. Try imagining instead that it's a well-tested app that
passed QA with flying colors on a previous version of the kernel. They
exist. Honest.

Ray

2006-03-18 22:05:08

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

On Sat, 2006-03-18 at 13:09 -0800, Andrew Morton wrote:
> > Of course I can convert it that way, if we want to keep this "help
> > sloppy programmers aid" alive.
> >
>
> It would be strange to set an alarm for 0xffffffff seconds in the future
> but yeah, unless we can point at a reason why nobody could have ever been
> doing that, we should turn this into permanent, documented behaviour of
> Linux 2.6 and earlier, I'm afraid.

We have to take two things into account:

1. sys_alarm()

The alarm value 0xFFFFFFFF is valid as the argument to alarm() is an
unsigned int. So we have to convert this to 0x7FFFFFFF (for 32bit
machines) because timeval.tv_sec is a signed long. This is done by the
alarm patch, which is necessary whether we check the sanity of the
timeval in do_setitimer or not. The current -rc6 kernel sends the alarm
with the next timer tick, which will break an application which set it
to something > INT_MAX.

Of course we could do this by the silent conversion of negative values
in setitimer too. But thats insane as we rely on some broken feature.

2. setitimer()

An application would have to set value.it_value.tv_sec to a negative
value to trigger this. Also uninitialized usage of struct timevals can
cause such behaviour.

I'm not sure, if it is sane to ingore this. I can change the itimer
validate patch for now to do

if (unlikely(!timeval_valid(v))
fixup_timeval(v);

and print an appropriate warning in fixup_timeval() for the time being.

tglx

2006-03-18 22:28:23

[permalink] [raw]

Subject: Re: [patch 1/2] Validate itimer timeval from userspace

Thomas Gleixner <[email protected]> wrote:
>
> On Sat, 2006-03-18 at 13:09 -0800, Andrew Morton wrote:
> > > Of course I can convert it that way, if we want to keep this "help
> > > sloppy programmers aid" alive.
> > >
> >
> > It would be strange to set an alarm for 0xffffffff seconds in the future
> > but yeah, unless we can point at a reason why nobody could have ever been
> > doing that, we should turn this into permanent, documented behaviour of
> > Linux 2.6 and earlier, I'm afraid.
>
> We have to take two things into account:
>
> 1. sys_alarm()
>
> The alarm value 0xFFFFFFFF is valid as the argument to alarm() is an
> unsigned int. So we have to convert this to 0x7FFFFFFF (for 32bit
> machines) because timeval.tv_sec is a signed long. This is done by the
> alarm patch, which is necessary whether we check the sanity of the
> timeval in do_setitimer or not. The current -rc6 kernel sends the alarm
> with the next timer tick, which will break an application which set it
> to something > INT_MAX.
>
> Of course we could do this by the silent conversion of negative values
> in setitimer too. But thats insane as we rely on some broken feature.

So you're saying that sys_alarm(0xffffffff) needs to behave as
sys_alarm(0x7ffffffff)?

I guess if we have to do it that way, the risk of breaking anything is very
small.

What's the 2.4/2.6.13 behaviour of sys_alarm(0xffffffff)?

> 2. setitimer()
>
> An application would have to set value.it_value.tv_sec to a negative
> value to trigger this. Also uninitialized usage of struct timevals can
> cause such behaviour.
>
> I'm not sure, if it is sane to ingore this.

What does 2.4/2.6.13 do? Let's do that.

If you're proposing that we depart from previous behaviour by converting
setitimer(0xffffffff) into setitimer(0x7fffffff) then I guess we could live
with that.

> I can change the itimer
> validate patch for now to do
>
> if (unlikely(!timeval_valid(v))
> fixup_timeval(v);
>
> and print an appropriate warning in fixup_timeval() for the time being.
>

No, we cannot warn - it'll enable unprivileged users to spam the logs.

One could generate a once-per-reboot warning, I guess. The message should
include the PID and current->comm.

2006-03-18 23:12:49