At the moment, the times() system call will appear to fail for a
period shortly after boot, while the value it want to return is
between -4095 and -1. The same thing will also happen for the time()
system call on 32-bit platforms some time in 2106 or so.
On some platforms, such as x86, this is unavoidable because of the
system call ABI, but other platforms such as powerpc have a separate
error indication from the return value, so system calls can in fact
return small negative values without indicating an error. On those
platforms, force_successful_syscall_return() provides a way to
indicate that the system call return value should not be treated as an
error even if it is in the range which would normally be taken as a
negative error number.
This adds a force_successful_syscall_return() call to the time() and
times() system calls plus their 32-bit compat versions, so that they
don't erroneously indicate an error on those platforms whose system
call ABI has a separate error indication. This will not affect
anything on other platforms.
Joakim Tjernlund added the fix for time() and the compat versions of
time() and times(), after I did the fix for times().
Signed-off-by: Joakim Tjernlund <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
---
Andrew - since this only touches generic code, will you queue this up
for 2.6.29?
diff --git a/kernel/compat.c b/kernel/compat.c
index 8eafe3e..21a93ce 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -24,6 +24,7 @@
#include <linux/migrate.h>
#include <linux/posix-timers.h>
#include <linux/times.h>
+#include <linux/ptrace.h>
#include <asm/uaccess.h>
@@ -229,6 +230,7 @@ asmlinkage long compat_sys_times(struct compat_tms __user *tbuf)
if (copy_to_user(tbuf, &tmp, sizeof(tmp)))
return -EFAULT;
}
+ force_successful_syscall_return();
return compat_jiffies_to_clock_t(jiffies);
}
@@ -883,8 +885,9 @@ asmlinkage long compat_sys_time(compat_time_t __user * tloc)
if (tloc) {
if (put_user(i,tloc))
- i = -EFAULT;
+ return -EFAULT;
}
+ force_successful_syscall_return();
return i;
}
diff --git a/kernel/sys.c b/kernel/sys.c
index 31deba8..1bf8c5c 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -33,6 +33,7 @@
#include <linux/task_io_accounting_ops.h>
#include <linux/seccomp.h>
#include <linux/cpu.h>
+#include <linux/ptrace.h>
#include <linux/compat.h>
#include <linux/syscalls.h>
@@ -878,6 +879,7 @@ asmlinkage long sys_times(struct tms __user * tbuf)
if (copy_to_user(tbuf, &tmp, sizeof(struct tms)))
return -EFAULT;
}
+ force_successful_syscall_return();
return (long) jiffies_64_to_clock_t(get_jiffies_64());
}
diff --git a/kernel/time.c b/kernel/time.c
index d63a433..4886e3c 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -37,6 +37,7 @@
#include <linux/fs.h>
#include <linux/slab.h>
#include <linux/math64.h>
+#include <linux/ptrace.h>
#include <asm/uaccess.h>
#include <asm/unistd.h>
@@ -65,8 +66,9 @@ asmlinkage long sys_time(time_t __user * tloc)
if (tloc) {
if (put_user(i,tloc))
- i = -EFAULT;
+ return -EFAULT;
}
+ force_successful_syscall_return();
return i;
}
From: Paul Mackerras <[email protected]>
Date: Thu, 4 Dec 2008 16:06:56 +1100
> At the moment, the times() system call will appear to fail for a
> period shortly after boot, while the value it want to return is
> between -4095 and -1. The same thing will also happen for the time()
> system call on 32-bit platforms some time in 2106 or so.
>
> On some platforms, such as x86, this is unavoidable because of the
> system call ABI, but other platforms such as powerpc have a separate
> error indication from the return value, so system calls can in fact
> return small negative values without indicating an error. On those
> platforms, force_successful_syscall_return() provides a way to
> indicate that the system call return value should not be treated as an
> error even if it is in the range which would normally be taken as a
> negative error number.
>
> This adds a force_successful_syscall_return() call to the time() and
> times() system calls plus their 32-bit compat versions, so that they
> don't erroneously indicate an error on those platforms whose system
> call ABI has a separate error indication. This will not affect
> anything on other platforms.
>
> Joakim Tjernlund added the fix for time() and the compat versions of
> time() and times(), after I did the fix for times().
>
> Signed-off-by: Joakim Tjernlund <[email protected]>
> Signed-off-by: Paul Mackerras <[email protected]>
Acked-by: David S. Miller <[email protected]>
* Paul Mackerras <[email protected]> wrote:
> @@ -229,6 +230,7 @@ asmlinkage long compat_sys_times(struct compat_tms __user *tbuf)
> if (copy_to_user(tbuf, &tmp, sizeof(tmp)))
> return -EFAULT;
> }
> + force_successful_syscall_return();
> return compat_jiffies_to_clock_t(jiffies);
just curious: what code does force_successful_syscall_return() actually
run in the powerpc case - those bits are missing from this patch. I
suspect it sets some sort of flag?
Ingo
Ingo Molnar writes:
> > + force_successful_syscall_return();
> > return compat_jiffies_to_clock_t(jiffies);
>
> just curious: what code does force_successful_syscall_return() actually
> run in the powerpc case - those bits are missing from this patch. I
> suspect it sets some sort of flag?
It sets the TIF_NOERROR thread flag, which is tested in the syscall
exit path along with various other thread flags.
force_successful_syscall_return() is defined for powerpc in
arch/powerpc/include/asm/ptrace.h. It's nothing new, it has existed
for ages and is used in a few other places already. It has existing
non-null definitions on alpha, ia64, powerpc, and sparc64.
Paul.
* Paul Mackerras <[email protected]> wrote:
> Ingo Molnar writes:
>
> > > + force_successful_syscall_return();
> > > return compat_jiffies_to_clock_t(jiffies);
> >
> > just curious: what code does force_successful_syscall_return() actually
> > run in the powerpc case - those bits are missing from this patch. I
> > suspect it sets some sort of flag?
>
> It sets the TIF_NOERROR thread flag, which is tested in the syscall
> exit path along with various other thread flags.
>
> force_successful_syscall_return() is defined for powerpc in
> arch/powerpc/include/asm/ptrace.h. It's nothing new, it has existed
> for ages and is used in a few other places already. It has existing
> non-null definitions on alpha, ia64, powerpc, and sparc64.
ok, was just curious. To make this code nicer, we could keep the time
syscalls from returning small negative numbers.
the core issue is probably:
jiffies.h:#define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))
the thing is, we did INITIAL_JIFFIES to find _in kernel_ jiffies
wraparoudn bugs (hung timers, broken drivers, etc.) - we did not want to
confuse userspace with it.
So how about adding INITIAL_JIFFIES to the sys_times() return value?
Ingo
On Thu, 2008-12-04 at 16:40 +0100, Ingo Molnar wrote:
> * Paul Mackerras <[email protected]> wrote:
>
> > Ingo Molnar writes:
> >
> > > > + force_successful_syscall_return();
> > > > return compat_jiffies_to_clock_t(jiffies);
> > >
> > > just curious: what code does force_successful_syscall_return() actually
> > > run in the powerpc case - those bits are missing from this patch. I
> > > suspect it sets some sort of flag?
> >
> > It sets the TIF_NOERROR thread flag, which is tested in the syscall
> > exit path along with various other thread flags.
> >
> > force_successful_syscall_return() is defined for powerpc in
> > arch/powerpc/include/asm/ptrace.h. It's nothing new, it has existed
> > for ages and is used in a few other places already. It has existing
> > non-null definitions on alpha, ia64, powerpc, and sparc64.
>
> ok, was just curious. To make this code nicer, we could keep the time
> syscalls from returning small negative numbers.
>
> the core issue is probably:
>
> jiffies.h:#define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))
>
> the thing is, we did INITIAL_JIFFIES to find _in kernel_ jiffies
> wraparoudn bugs (hung timers, broken drivers, etc.) - we did not want to
> confuse userspace with it.
>
> So how about adding INITIAL_JIFFIES to the sys_times() return value?
Na, that will just hide the bug for some 497 days(or 49 days on alpha).
In user space on have to use this workaround(and I think some variation
is in glibc already):
static unsigned long quagga_times(void)
{
#if defined(GNU_LINUX)
unsigned long ret;
errno = 0;
ret = times(NULL); /* Linux can handle NULL */
/* Workaround broken syscall impl.
* A bugfix exists for the kernel, hopefully
* it will make it into 2.6.29
*/
if (errno)
ret = (unsigned long) (-errno);
return ret;
#else
struct tms dummy; /* Only return value is used */
return times(&dummy);
#endif
}
From: Ingo Molnar <[email protected]>
Date: Thu, 4 Dec 2008 16:40:44 +0100
> * Paul Mackerras <[email protected]> wrote:
>
> > Ingo Molnar writes:
> >
> > > > + force_successful_syscall_return();
> > > > return compat_jiffies_to_clock_t(jiffies);
> > >
> > > just curious: what code does force_successful_syscall_return() actually
> > > run in the powerpc case - those bits are missing from this patch. I
> > > suspect it sets some sort of flag?
> >
> > It sets the TIF_NOERROR thread flag, which is tested in the syscall
> > exit path along with various other thread flags.
> >
> > force_successful_syscall_return() is defined for powerpc in
> > arch/powerpc/include/asm/ptrace.h. It's nothing new, it has existed
> > for ages and is used in a few other places already. It has existing
> > non-null definitions on alpha, ia64, powerpc, and sparc64.
>
> ok, was just curious. To make this code nicer, we could keep the time
> syscalls from returning small negative numbers.
>
> the core issue is probably:
>
> jiffies.h:#define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))
>
> the thing is, we did INITIAL_JIFFIES to find _in kernel_ jiffies
> wraparoudn bugs (hung timers, broken drivers, etc.) - we did not want to
> confuse userspace with it.
>
> So how about adding INITIAL_JIFFIES to the sys_times() return value?
I can't believe people are still making suggestions which just paper
over this issue. :-)
This thing is going to wrap eventually, so wouldn't it be nice to wrap
IMMEDIATELY so that you can trigger the problem and get it really
fixed now? And that's exactly how things work at the moment.
On another note, this wouldn't be an issue if x86 did some condition
code thing to indicate errors. We all know this. So why don't we put
some kind of condition code setting support into the x86 syscall
return path now and perhaps 5 years from now the userspace on most
Linux machines will understand it and this bug will be essentially gone?