hi,
I was running the strace-test from ltp 20081130 with 2.6.28-rc9, when i got the following bug
(I can reproduce the bug by simply running the testcase timer_create04)
[ 2460.441441] BUG: unable to handle kernel NULL pointer dereference at
00000003
[ 2460.441749] IP: [<c02a78c1>] _raw_spin_lock+0x11/0x110
[ 2460.442012] *pde = 00000000
[ 2460.442161] Oops: 0000 [#2] DEBUG_PAGEALLOC
[ 2460.442420] last sysfs file: /sys/block/sda/size
[ 2460.442541] Modules linked in: sctp nfsd auth_rpcgss exportfs nfs
lockd nfs_acl sunrpc ipv6 fuse unix
[ 2460.443471]
[ 2460.443628] Pid: 14406, comm: timer_create04 Tainted: G D W
(2.6.28-rc9 #71)
[ 2460.443811] EIP: 0060:[<c02a78c1>] EFLAGS: 00010096 CPU: 0
[ 2460.443941] EIP is at _raw_spin_lock+0x11/0x110
[ 2460.444044] EAX: ffffffff EBX: ffffffff ECX: 00000000 EDX: 00000000
[ 2460.444044] ESI: 00000096 EDI: c08337ec EBP: c8fc7ee0 ESP: c8fc7eb8
[ 2460.444044] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[ 2460.444044] Process timer_create04 (pid: 14406, ti=c8fc7000
task=c8db5710 task.ti=c8fc7000)
[ 2460.444044] Stack:
[ 2460.444044] 00000046 00000000 00000002 00000001 00000000 c0141070
0000000f ffffffff
[ 2460.444044] 00000096 c08337ec c8fc7f00 c05ac067 00000000 00000002
00000000 c0141070
[ 2460.444044] ffffffff c1956aa4 c8fc7f1c c0141070 00000002 00000000
c1956a60 c1956a68
[ 2460.444044] Call Trace:
[ 2460.444044] [<c0141070>] ? hrtimer_try_to_cancel+0x20/0x90
[ 2460.444044] [<c05ac067>] ? _spin_lock_irqsave+0x47/0x60
[ 2460.444044] [<c0141070>] ? hrtimer_try_to_cancel+0x20/0x90
[ 2460.444044] [<c0141070>] ? hrtimer_try_to_cancel+0x20/0x90
[ 2460.444044] [<c013cf94>] ? exit_itimers+0x94/0xf0
[ 2460.444044] [<c012cab2>] ? do_exit+0x602/0x810
[ 2460.444044] [<c05abe5d>] ? _spin_unlock+0x1d/0x20
[ 2460.444044] [<c01a245e>] ? vfs_write+0xfe/0x160
[ 2460.444044] [<c0319090>] ? tty_write+0x0/0x1f0
[ 2460.444044] [<c012ccef>] ? do_group_exit+0x2f/0x90
[ 2460.444044] [<c012cd63>] ? sys_exit_group+0x13/0x20
[ 2460.444044] [<c01035a9>] ? sysenter_do_call+0x12/0x31
[ 2460.444044] Code: 00 a1 00 c0 82 c0 89 41 0c 89 d8 5b 5d c3 8d b6 00
00 00 00 8d bf 00 00 00 00 55 89 e5 83 ec 28 89 5d f4 89 c3 89 75 f8 89
7d fc <81> 78 04 ad 4e ad de 74 0a ba 77 3f 79 c0 e8 2c fe ff ff a1 00
[ 2460.444044] EIP: [<c02a78c1>] _raw_spin_lock+0x11/0x110 SS:ESP
0068:c8fc7eb8
[ 2460.444044] ---[ end trace 85ae5af0da78d5d7 ]---
[ 2460.444044] Fixing recursive fault but reboot is needed!
(gdb) l *(_raw_spin_lock+0x11)
0xc02a78c1 is in _raw_spin_lock (lib/spinlock_debug.c:78).
73 #define SPIN_BUG_ON(cond, lock, msg) if (unlikely(cond))
spin_bug(lock, msg)
74
75 static inline void
76 debug_spin_lock_before(spinlock_t *lock)
77 {
78 SPIN_BUG_ON(lock->magic != SPINLOCK_MAGIC, lock, "bad
magic");
79 SPIN_BUG_ON(lock->owner == current, lock, "recursion");
80 SPIN_BUG_ON(lock->owner_cpu == raw_smp_processor_id(),
81 lock,
"cpu recursion");
82 }
(gdb) q
(gdb) l *(hrtimer_try_to_cancel+0x20)
0xc0141070 is in hrtimer_try_to_cancel (kernel/hrtimer.c:234).
229 static inline struct hrtimer_clock_base *
230 lock_hrtimer_base(const struct hrtimer *timer, unsigned long
*flags)
231 {
232 struct hrtimer_clock_base *base = timer->base;
233
234 spin_lock_irqsave(&base->cpu_base->lock, *flags);
235
236 return base;
237 }
238
(gdb)
The testcase aborts in pass number 6:
root@computer-desktop:~/testing/ltp-full-20081130/tools/strace_test#
./timer_create04
timer_create04 1 FAIL : timer_create(2) failed to produce expected
error; 22 , errno : EINVAL and got 0
timer_create04 2 PASS : timer_create(2) expected failure; Got
errno - EINVAL : Invalid parameter
timer_create04 3 PASS : timer_create(2) expected failure; Got
errno - EFAULT : Bad address
timer_create04 4 PASS : timer_create(2) expected failure; Got
errno - EFAULT : Bad address
timer_create04 5 PASS : timer_create(2) expected failure; Got
errno - EFAULT : Bad address
timer_create04 6 PASS : timer_create(2) expected failure; Got
errno - EFAULT : Bad address
Greetings, Eric
On Fri, 19 Dec 2008, Eric Sesterhenn wrote:
> hi,
CC'ed Oleg
> I was running the strace-test from ltp 20081130 with 2.6.28-rc9, when i got the following bug
> (I can reproduce the bug by simply running the testcase timer_create04)
>
> [ 2460.441441] BUG: unable to handle kernel NULL pointer dereference at
> 00000003
> [ 2460.441749] IP: [<c02a78c1>] _raw_spin_lock+0x11/0x110
> [ 2460.442012] *pde = 00000000
> [ 2460.442161] Oops: 0000 [#2] DEBUG_PAGEALLOC
> [ 2460.442420] last sysfs file: /sys/block/sda/size
> [ 2460.442541] Modules linked in: sctp nfsd auth_rpcgss exportfs nfs
> lockd nfs_acl sunrpc ipv6 fuse unix
> [ 2460.443471]
> [ 2460.443628] Pid: 14406, comm: timer_create04 Tainted: G D W
> (2.6.28-rc9 #71)
> [ 2460.443811] EIP: 0060:[<c02a78c1>] EFLAGS: 00010096 CPU: 0
> [ 2460.443941] EIP is at _raw_spin_lock+0x11/0x110
> [ 2460.444044] EAX: ffffffff EBX: ffffffff ECX: 00000000 EDX: 00000000
> [ 2460.444044] ESI: 00000096 EDI: c08337ec EBP: c8fc7ee0 ESP: c8fc7eb8
> [ 2460.444044] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> [ 2460.444044] Process timer_create04 (pid: 14406, ti=c8fc7000
> task=c8db5710 task.ti=c8fc7000)
> [ 2460.444044] Stack:
> [ 2460.444044] 00000046 00000000 00000002 00000001 00000000 c0141070
> 0000000f ffffffff
> [ 2460.444044] 00000096 c08337ec c8fc7f00 c05ac067 00000000 00000002
> 00000000 c0141070
> [ 2460.444044] ffffffff c1956aa4 c8fc7f1c c0141070 00000002 00000000
> c1956a60 c1956a68
> [ 2460.444044] Call Trace:
> [ 2460.444044] [<c0141070>] ? hrtimer_try_to_cancel+0x20/0x90
> [ 2460.444044] [<c05ac067>] ? _spin_lock_irqsave+0x47/0x60
> [ 2460.444044] [<c0141070>] ? hrtimer_try_to_cancel+0x20/0x90
> [ 2460.444044] [<c0141070>] ? hrtimer_try_to_cancel+0x20/0x90
> [ 2460.444044] [<c013cf94>] ? exit_itimers+0x94/0xf0
> [ 2460.444044] [<c012cab2>] ? do_exit+0x602/0x810
> [ 2460.444044] [<c05abe5d>] ? _spin_unlock+0x1d/0x20
> [ 2460.444044] [<c01a245e>] ? vfs_write+0xfe/0x160
> [ 2460.444044] [<c0319090>] ? tty_write+0x0/0x1f0
> [ 2460.444044] [<c012ccef>] ? do_group_exit+0x2f/0x90
> [ 2460.444044] [<c012cd63>] ? sys_exit_group+0x13/0x20
> [ 2460.444044] [<c01035a9>] ? sysenter_do_call+0x12/0x31
> [ 2460.444044] Code: 00 a1 00 c0 82 c0 89 41 0c 89 d8 5b 5d c3 8d b6 00
> 00 00 00 8d bf 00 00 00 00 55 89 e5 83 ec 28 89 5d f4 89 c3 89 75 f8 89
> 7d fc <81> 78 04 ad 4e ad de 74 0a ba 77 3f 79 c0 e8 2c fe ff ff a1 00
> [ 2460.444044] EIP: [<c02a78c1>] _raw_spin_lock+0x11/0x110 SS:ESP
> 0068:c8fc7eb8
> [ 2460.444044] ---[ end trace 85ae5af0da78d5d7 ]---
> [ 2460.444044] Fixing recursive fault but reboot is needed!
>
>
>
> (gdb) l *(_raw_spin_lock+0x11)
> 0xc02a78c1 is in _raw_spin_lock (lib/spinlock_debug.c:78).
> 73 #define SPIN_BUG_ON(cond, lock, msg) if (unlikely(cond))
> spin_bug(lock, msg)
> 74
> 75 static inline void
> 76 debug_spin_lock_before(spinlock_t *lock)
> 77 {
> 78 SPIN_BUG_ON(lock->magic != SPINLOCK_MAGIC, lock, "bad
> magic");
> 79 SPIN_BUG_ON(lock->owner == current, lock, "recursion");
> 80 SPIN_BUG_ON(lock->owner_cpu == raw_smp_processor_id(),
> 81 lock,
> "cpu recursion");
> 82 }
> (gdb) q
>
> (gdb) l *(hrtimer_try_to_cancel+0x20)
> 0xc0141070 is in hrtimer_try_to_cancel (kernel/hrtimer.c:234).
> 229 static inline struct hrtimer_clock_base *
> 230 lock_hrtimer_base(const struct hrtimer *timer, unsigned long
> *flags)
> 231 {
> 232 struct hrtimer_clock_base *base = timer->base;
> 233
> 234 spin_lock_irqsave(&base->cpu_base->lock, *flags);
> 235
> 236 return base;
> 237 }
> 238
> (gdb)
>
> The testcase aborts in pass number 6:
>
> root@computer-desktop:~/testing/ltp-full-20081130/tools/strace_test#
> ./timer_create04
> timer_create04 1 FAIL : timer_create(2) failed to produce expected
> error; 22 , errno : EINVAL and got 0
> timer_create04 2 PASS : timer_create(2) expected failure; Got
> errno - EINVAL : Invalid parameter
> timer_create04 3 PASS : timer_create(2) expected failure; Got
> errno - EFAULT : Bad address
> timer_create04 4 PASS : timer_create(2) expected failure; Got
> errno - EFAULT : Bad address
> timer_create04 5 PASS : timer_create(2) expected failure; Got
> errno - EFAULT : Bad address
> timer_create04 6 PASS : timer_create(2) expected failure; Got
> errno - EFAULT : Bad address
>
> Greetings, Eric
>
On 12/19, Thomas Gleixner wrote:
>
> On Fri, 19 Dec 2008, Eric Sesterhenn wrote:
>
> > I was running the strace-test from ltp 20081130 with 2.6.28-rc9, when i got the following bug
> > (I can reproduce the bug by simply running the testcase timer_create04)
Thanks a lot Eric (and thanks for .s files you sent me privately).
At first glance this all is very strange.
> > [ 2460.444044] [<c0141070>] ? hrtimer_try_to_cancel+0x20/0x90
> > [ 2460.444044] [<c013cf94>] ? exit_itimers+0x94/0xf0
> > [ 2460.444044] [<c012cab2>] ? do_exit+0x602/0x810
So, when the task exits its has a timer in ->posix_timers.
However, this means sys_timer_create() must return 0, the code
is very simple
spin_lock_irq(¤t->sighand->siglock);
new_timer->it_process = process;
list_add(&new_timer->list, ¤t->signal->posix_timers);
spin_unlock_irq(¤t->sighand->siglock);
return 0;
and nobody else adds the timer to ->posix_timers.
But,
> > root@computer-desktop:~/testing/ltp-full-20081130/tools/strace_test#
> > ./timer_create04
> > timer_create04 1 FAIL : timer_create(2) failed to produce expected
> > error; 22 , errno : EINVAL and got 0
> > timer_create04 2 PASS : timer_create(2) expected failure; Got
> > errno - EINVAL : Invalid parameter
> > timer_create04 3 PASS : timer_create(2) expected failure; Got
> > errno - EFAULT : Bad address
> > timer_create04 4 PASS : timer_create(2) expected failure; Got
> > errno - EFAULT : Bad address
> > timer_create04 5 PASS : timer_create(2) expected failure; Got
> > errno - EFAULT : Bad address
> > timer_create04 6 PASS : timer_create(2) expected failure; Got
> > errno - EFAULT : Bad address
according to above, timer_create() always returns -EXXX ?
I'll try to re-produce and investigate tomorrow.
Oleg.
On 12/20, Oleg Nesterov wrote:
>
> On 12/19, Thomas Gleixner wrote:
> >
> > On Fri, 19 Dec 2008, Eric Sesterhenn wrote:
> >
> > > root@computer-desktop:~/testing/ltp-full-20081130/tools/strace_test#
> > > ./timer_create04
> > > timer_create04 1 FAIL : timer_create(2) failed to produce expected
> > > error; 22 , errno : EINVAL and got 0
> > > timer_create04 2 PASS : timer_create(2) expected failure; Got
> > > errno - EINVAL : Invalid parameter
> > > timer_create04 3 PASS : timer_create(2) expected failure; Got
> > > errno - EFAULT : Bad address
> > > timer_create04 4 PASS : timer_create(2) expected failure; Got
> > > errno - EFAULT : Bad address
> > > timer_create04 5 PASS : timer_create(2) expected failure; Got
> > > errno - EFAULT : Bad address
> > > timer_create04 6 PASS : timer_create(2) expected failure; Got
> > > errno - EFAULT : Bad address
>
> according to above, timer_create() always returns -EXXX ?
Aaah. I misread the first "FAIL" above. timer_create() succeeds!
hmm... it does timer_create(MAX_CLOCKS) and thus it should fail...
Can't find the original commit at
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git
but now we have CLOCK_MONOTONIC_RAW == 4, and MAX_CLOCKS == 4. So the
test should be fixed too, the first timer_create() should not fail on
2.6.28.
OK, sys_timer_create(CLOCK_MONOTONIC_RAW) calls
__hrtimer_init(CLOCK_MONOTONIC_RAW) and this looks just wrong:
timer->base = &cpu_base->clock_base[CLOCK_MONOTONIC_RAW];
while HRTIMER_MAX_CLOCK_BASES == 2. So time->base points to
"nowhere", this can explain the crash.
Thomas?
Oleg.
(compile tested)
common_timer_create() and common_timer_set() blindly pass ->it_clock to
hrtimer_init() as clock_id. This used to work until CLOCK_MONOTONIC_RAW
was introduced, now we should be careful.
Perhaps it makes sense to add BUG_ON(clock_id >= HRTIMER_MAX_CLOCK_BASES)
to __hrtimer_init(), the wrong clock_id leads to catastrophe.
Reported-by: Eric Sesterhenn <[email protected]>
Signed-off-by: Oleg Nesterov <[email protected]>
--- K-28/kernel/posix-timers.c~CLOCK_MONOTONIC_RAW 2008-12-02 17:12:40.000000000 +0100
+++ K-28/kernel/posix-timers.c 2008-12-20 18:23:28.000000000 +0100
@@ -191,12 +191,20 @@ static inline int common_clock_set(const
return do_sys_settimeofday(tp, NULL);
}
-static int common_timer_create(struct k_itimer *new_timer)
+static inline int
+__common_timer_init(struct k_itimer *timer, enum hrtimer_mode mode)
{
- hrtimer_init(&new_timer->it.real.timer, new_timer->it_clock, 0);
+ clockid_t clock_id = timer->it_clock ?
+ CLOCK_MONOTONIC : CLOCK_REALTIME;
+ hrtimer_init(&timer->it.real.timer, clock_id, mode);
return 0;
}
+static int common_timer_create(struct k_itimer *new_timer)
+{
+ return __common_timer_init(new_timer, HRTIMER_MODE_ABS);
+}
+
/*
* Return nonzero if we know a priori this clockid_t value is bogus.
*/
@@ -730,7 +738,7 @@ common_timer_set(struct k_itimer *timr,
return 0;
mode = flags & TIMER_ABSTIME ? HRTIMER_MODE_ABS : HRTIMER_MODE_REL;
- hrtimer_init(&timr->it.real.timer, timr->it_clock, mode);
+ __common_timer_init(timr, mode);
timr->it.real.timer.function = posix_timer_fn;
hrtimer_set_expires(timer, timespec_to_ktime(new_setting->it_value));
On Sat, 20 Dec 2008, Oleg Nesterov wrote:
> (compile tested)
>
> common_timer_create() and common_timer_set() blindly pass ->it_clock to
> hrtimer_init() as clock_id. This used to work until CLOCK_MONOTONIC_RAW
> was introduced, now we should be careful.
>
> Perhaps it makes sense to add BUG_ON(clock_id >= HRTIMER_MAX_CLOCK_BASES)
> to __hrtimer_init(), the wrong clock_id leads to catastrophe.
>
> Reported-by: Eric Sesterhenn <[email protected]>
> Signed-off-by: Oleg Nesterov <[email protected]>
>
> --- K-28/kernel/posix-timers.c~CLOCK_MONOTONIC_RAW 2008-12-02 17:12:40.000000000 +0100
> +++ K-28/kernel/posix-timers.c 2008-12-20 18:23:28.000000000 +0100
> @@ -191,12 +191,20 @@ static inline int common_clock_set(const
> return do_sys_settimeofday(tp, NULL);
> }
>
> -static int common_timer_create(struct k_itimer *new_timer)
> +static inline int
> +__common_timer_init(struct k_itimer *timer, enum hrtimer_mode mode)
> {
> - hrtimer_init(&new_timer->it.real.timer, new_timer->it_clock, 0);
> + clockid_t clock_id = timer->it_clock ?
> + CLOCK_MONOTONIC : CLOCK_REALTIME;
> + hrtimer_init(&timer->it.real.timer, clock_id, mode);
> return 0;
> }
No, this is wrong. We do not want to create a timer for
CLOCK_MONOTONIC_RAW.
tglx
On 12/20, Thomas Gleixner wrote:
>
> On Sat, 20 Dec 2008, Oleg Nesterov wrote:
>
> > -static int common_timer_create(struct k_itimer *new_timer)
> > +static inline int
> > +__common_timer_init(struct k_itimer *timer, enum hrtimer_mode mode)
> > {
> > - hrtimer_init(&new_timer->it.real.timer, new_timer->it_clock, 0);
> > + clockid_t clock_id = timer->it_clock ?
> > + CLOCK_MONOTONIC : CLOCK_REALTIME;
> > + hrtimer_init(&timer->it.real.timer, clock_id, mode);
> > return 0;
> > }
>
> No, this is wrong. We do not want to create a timer for
> CLOCK_MONOTONIC_RAW.
OK, thanks.
I thought that the intent was to allow the creation.
Then we should we shoould add clock_monotonic_raw->timer_create()
which returns -EINVAL ?
Oleg.
Impact: Prevent kernel crash with posix timer clockid CLOCK_MONOTONIC_RAW
commit 2d42244ae71d6c7b0884b5664cf2eda30fb2ae68 (clocksource:
introduce CLOCK_MONOTONIC_RAW) introduced a new clockid, which is only
available to read out the raw not NTP adjusted system time.
The above commit did not prevent that a posix timer can be created
with that clockid. The timer_create() syscall succeeds and initializes
the timer to a non existing hrtimer base. When the timer is deleted
either by timer_delete() or by the exit() cleanup the kernel crashes.
Prevent the creation of timers for CLOCK_MONOTONIC_RAW by setting the
posix clock function to no_timer_create which returns an error code.
Reported-by: Eric Sesterhenn <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index 5e79c66..a140e44 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -197,6 +197,11 @@ static int common_timer_create(struct k_itimer *new_timer)
return 0;
}
+static int no_timer_create(struct k_itimer *new_timer)
+{
+ return -EOPNOTSUPP;
+}
+
/*
* Return nonzero if we know a priori this clockid_t value is bogus.
*/
@@ -248,6 +253,7 @@ static __init int init_posix_timers(void)
.clock_getres = hrtimer_get_res,
.clock_get = posix_get_monotonic_raw,
.clock_set = do_posix_clock_nosettime,
+ .timer_create = no_timer_create,
};
register_posix_clock(CLOCK_REALTIME, &clock_realtime);
On Sat, 20 Dec 2008, Oleg Nesterov wrote:
> On 12/20, Thomas Gleixner wrote:
> >
> > On Sat, 20 Dec 2008, Oleg Nesterov wrote:
> >
> > > -static int common_timer_create(struct k_itimer *new_timer)
> > > +static inline int
> > > +__common_timer_init(struct k_itimer *timer, enum hrtimer_mode mode)
> > > {
> > > - hrtimer_init(&new_timer->it.real.timer, new_timer->it_clock, 0);
> > > + clockid_t clock_id = timer->it_clock ?
> > > + CLOCK_MONOTONIC : CLOCK_REALTIME;
> > > + hrtimer_init(&timer->it.real.timer, clock_id, mode);
> > > return 0;
> > > }
> >
> > No, this is wrong. We do not want to create a timer for
> > CLOCK_MONOTONIC_RAW.
>
> OK, thanks.
>
> I thought that the intent was to allow the creation.
No.
1. CLOCK_MONOTONIC_RAW and CLOCK_MONOTONIC are diffferent beasts
2. CLOCK_MONOTONIC_RAW was created to allow user space to read out the
non NTP frequency corrected raw system time. That's mainly for the NTP
folks so they have a better idea what's the hardware's idea of time
is.
> Then we should we shoould add clock_monotonic_raw->timer_create()
> which returns -EINVAL ?
That's what I just sent out :) I looked into this offline and had the
fix ready to send out when I noticed yours.
Thanks,
tglx
On 12/20, Thomas Gleixner wrote:
>
> +static int no_timer_create(struct k_itimer *new_timer)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> /*
> * Return nonzero if we know a priori this clockid_t value is bogus.
> */
> @@ -248,6 +253,7 @@ static __init int init_posix_timers(void)
> .clock_getres = hrtimer_get_res,
> .clock_get = posix_get_monotonic_raw,
> .clock_set = do_posix_clock_nosettime,
> + .timer_create = no_timer_create,
Agreed, this patch is better than mine (and thanks for your
explanation about CLOCK_MONOTONIC_RAW).
I am not sure about -EOPNOTSUPP. To clarify, I do not claim this
is wrong, I just do not know.
But please note that sys_timer_create() does:
if (invalid_clockid(which_clock))
return -EINVAL;
And ltp's timer_create04.c expects timer_create(MAX_CLOCKS == 4)
returns -EINVAL.
Oleg.
* Thomas Gleixner ([email protected]) wrote:
> Impact: Prevent kernel crash with posix timer clockid CLOCK_MONOTONIC_RAW
>
> commit 2d42244ae71d6c7b0884b5664cf2eda30fb2ae68 (clocksource:
> introduce CLOCK_MONOTONIC_RAW) introduced a new clockid, which is only
> available to read out the raw not NTP adjusted system time.
Thanks for the fast response, patch works for me.
Cant trigger the bug anymore.
Greetings, Eric
On Sat, 20 Dec 2008, Oleg Nesterov wrote:
> On 12/20, Thomas Gleixner wrote:
> >
> > +static int no_timer_create(struct k_itimer *new_timer)
> > +{
> > + return -EOPNOTSUPP;
> > +}
> > +
> > /*
> > * Return nonzero if we know a priori this clockid_t value is bogus.
> > */
> > @@ -248,6 +253,7 @@ static __init int init_posix_timers(void)
> > .clock_getres = hrtimer_get_res,
> > .clock_get = posix_get_monotonic_raw,
> > .clock_set = do_posix_clock_nosettime,
> > + .timer_create = no_timer_create,
>
> Agreed, this patch is better than mine (and thanks for your
> explanation about CLOCK_MONOTONIC_RAW).
>
> I am not sure about -EOPNOTSUPP. To clarify, I do not claim this
> is wrong, I just do not know.
>
> But please note that sys_timer_create() does:
>
> if (invalid_clockid(which_clock))
> return -EINVAL;
EINVAL is wrong for timer_create(CLOCK_MONOTONIC_RAW) as clockid is
valid, just the operation of creating a timer is not supported for it.
> And ltp's timer_create04.c expects timer_create(MAX_CLOCKS == 4)
> returns -EINVAL.
That's because timer_create04.c does not know about
CLOCK_MONOTONIC_RAW yet.
Thanks,
tglx