Hi, all,
I got a question, that is, I am confused by the following code in
do_sys_settimeofday().
if (tz) {
/* SMP safe, global irq locking makes it work. */
sys_tz = *tz;
if (firsttime) {
firsttime = 0;
if (!tv)
warp_clock();
}
}
For my understanding, an assignment between structs should be a
bit-wise copy. Such operation is not atomic, so it can not be supposed
SMP-safe. And the subsequent test-and-assign operation on firsttime is
not atomic, either.
If the comments mean the subsequent code is SMP-safe and can prevent
nest-kernel-path, how does it achieves that?
Thank you in advance.
Feng,Dong
On Fri, 29 Sep 2006, Dong Feng wrote:
> For my understanding, an assignment between structs should be a
> bit-wise copy. Such operation is not atomic, so it can not be supposed
Byte or Machine word yes.
> SMP-safe. And the subsequent test-and-assign operation on firsttime is
> not atomic, either.
No its not atomic on its own. Correct.
> If the comments mean the subsequent code is SMP-safe and can prevent
> nest-kernel-path, how does it achieves that?
It relies on locking outside of do_sys_settimeofday(). Seems that this
indicates locking is to be performed by the arch before calling
do_sys_settimeofday. Looks suspicious to me. Check that this function is
always called with the same lock.
2006/9/30, Christoph Lameter <[email protected]>:
> On Fri, 29 Sep 2006, Dong Feng wrote:
>
> > For my understanding, an assignment between structs should be a
> > bit-wise copy. Such operation is not atomic, so it can not be supposed
>
> Byte or Machine word yes.
>
> > SMP-safe. And the subsequent test-and-assign operation on firsttime is
> > not atomic, either.
>
> No its not atomic on its own. Correct.
>
> > If the comments mean the subsequent code is SMP-safe and can prevent
> > nest-kernel-path, how does it achieves that?
>
> It relies on locking outside of do_sys_settimeofday(). Seems that this
> indicates locking is to be performed by the arch before calling
> do_sys_settimeofday. Looks suspicious to me. Check that this function is
> always called with the same lock.
>
Yes, that is the question. The whole invocation path is
sys_settimeofday() -> do_sys_settimeofday()
I do not find a lock embracing do_sys_settimeofday().
Moreover, seems neither write operations nor read operations on sys_tz
is protected by any locks, in sys_gettimeofday() and
sys_settimeofday() respectively.
> Did you get to the bottom of this yet? It looks like you're right,
> and I suggest a seqlock might be a good option.
It basically doesn't matter because nobody changes the time zone after boot.
-Andi
On Sat, 30 Sep 2006, Andi Kleen wrote:
> > Did you get to the bottom of this yet? It looks like you're right,
> > and I suggest a seqlock might be a good option.
>
> It basically doesn't matter because nobody changes the time zone after boot.
Then we need to change to comments to explain the situation.
On Sun, 1 Oct 2006, Dong Feng wrote:
> --- kernel/time.c.orig 2006-09-30 23:21:29.000000000 +0800
> +++ kernel/time.c 2006-09-30 23:38:18.000000000 +0800
> @@ -107,7 +107,16 @@ asmlinkage long sys_gettimeofday(struct
> return -EFAULT;
> }
> if (unlikely(tz != NULL)) {
> - if (copy_to_user(tz, &sys_tz, sizeof(sys_tz)))
> + struct timezone ktz;
> + unsigned long seq;
> +
> + do {
> + seq = read_seqbegin(&xtime_lock);
> + ktz.tz_minuteswest = sys_tz.tz_minuteswest;
> + ktz.tz_dsttime = sys_tz.tz_dsttime;
> + } while (unlikely(read_seqretry(&xtime_lock, seq)));
> +
> + if (copy_to_user(tz, &ktz, sizeof(ktz)))
> return -EFAULT;
I really hate adding overhead to gettimeofday() and we would have to take
the seqlock in all places when we reference tz. Maybe we can tolerate the
resulting race?
If we assume word size transfers then we only have an issue on 32 bit
platforms. The result of the race would be that tz_minuteswest and
tz_dsttime disagree. So we may get daylight savings time wrong.
But then we are already changing the timezone and are potentially warping time.
gettimofday may be unstable anyways. So it may be okay to leave the race
in. Just add some comments explaining the situation.
Dong Feng wrote:
> 2006/9/30, Christoph Lameter <[email protected]>:
>
>> On Fri, 29 Sep 2006, Dong Feng wrote:
>>
>> > If the comments mean the subsequent code is SMP-safe and can prevent
>> > nest-kernel-path, how does it achieves that?
>>
>> It relies on locking outside of do_sys_settimeofday(). Seems that this
>> indicates locking is to be performed by the arch before calling
>> do_sys_settimeofday. Looks suspicious to me. Check that this function is
>> always called with the same lock.
>>
>
> Yes, that is the question. The whole invocation path is
> sys_settimeofday() -> do_sys_settimeofday()
>
> I do not find a lock embracing do_sys_settimeofday().
>
> Moreover, seems neither write operations nor read operations on sys_tz
> is protected by any locks, in sys_gettimeofday() and
> sys_settimeofday() respectively.
Did you get to the bottom of this yet? It looks like you're right,
and I suggest a seqlock might be a good option.
You should write a patch and send it to Mister Morton.
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
On Sat, 30 Sep 2006, Nick Piggin wrote
>
> You should write a patch and send it to Mister Morton.
>
> --
> SUSE Labs, Novell Inc.
> Send instant messages to your online friends http://au.messenger.yahoo.com
>
>
This is a patch for your review.
--- kernel/time.c.orig 2006-09-30 23:21:29.000000000 +0800
+++ kernel/time.c 2006-09-30 23:38:18.000000000 +0800
@@ -107,7 +107,16 @@ asmlinkage long sys_gettimeofday(struct
return -EFAULT;
}
if (unlikely(tz != NULL)) {
- if (copy_to_user(tz, &sys_tz, sizeof(sys_tz)))
+ struct timezone ktz;
+ unsigned long seq;
+
+ do {
+ seq = read_seqbegin(&xtime_lock);
+ ktz.tz_minuteswest = sys_tz.tz_minuteswest;
+ ktz.tz_dsttime = sys_tz.tz_dsttime;
+ } while (unlikely(read_seqretry(&xtime_lock, seq)));
+
+ if (copy_to_user(tz, &ktz, sizeof(ktz)))
return -EFAULT;
}
return 0;
@@ -164,12 +173,16 @@ int do_sys_settimeofday(struct timespec
if (tz) {
/* SMP safe, global irq locking makes it work. */
- sys_tz = *tz;
- if (firsttime) {
- firsttime = 0;
- if (!tv)
- warp_clock();
+ write_seqlock_irq(&xtime_lock);
+ {
+ sys_tz = *tz;
+ if (firsttime) {
+ firsttime = 0;
+ if (!tv)
+ warp_clock();
+ }
}
+ write_sequnlock_irq(&xtime_lock);
}
if (tv)
{
Christoph Lameter wrote:
> On Sun, 1 Oct 2006, Dong Feng wrote:
>
>
>>--- kernel/time.c.orig 2006-09-30 23:21:29.000000000 +0800
>>+++ kernel/time.c 2006-09-30 23:38:18.000000000 +0800
>>@@ -107,7 +107,16 @@ asmlinkage long sys_gettimeofday(struct
>> return -EFAULT;
>> }
>> if (unlikely(tz != NULL)) {
>>- if (copy_to_user(tz, &sys_tz, sizeof(sys_tz)))
>>+ struct timezone ktz;
>>+ unsigned long seq;
>>+
>>+ do {
>>+ seq = read_seqbegin(&xtime_lock);
>>+ ktz.tz_minuteswest = sys_tz.tz_minuteswest;
>>+ ktz.tz_dsttime = sys_tz.tz_dsttime;
>>+ } while (unlikely(read_seqretry(&xtime_lock, seq)));
>>+
>>+ if (copy_to_user(tz, &ktz, sizeof(ktz)))
>> return -EFAULT;
>
>
> I really hate adding overhead to gettimeofday() and we would have to take
> the seqlock in all places when we reference tz. Maybe we can tolerate the
> resulting race?
>
> If we assume word size transfers then we only have an issue on 32 bit
> platforms. The result of the race would be that tz_minuteswest and
> tz_dsttime disagree. So we may get daylight savings time wrong.
> But then we are already changing the timezone and are potentially warping time.
> gettimofday may be unstable anyways. So it may be okay to leave the race
> in. Just add some comments explaining the situation.
It is in an unlikely path though. How many apps actually pass in a
non NULL value for the timezone? Those that don't won't be affected.
Even for those that do, it doesn't introduce any atomic ops or
unpredictable branches, or cacheline pressure (because xtime lock is
already touched by do_gettimeofday). IOW: I'm sure it would be
unmeasurable.
OTOH, to be completely correct, it seems like the same xtime_lock
read section should cover both the calculation of ktv, and that of
ktz. So if it is going to be fixed at all, it should be done
properly and looks like it needs to be a bit more intrusive (but
no more expensive).
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
2006/10/1, Nick Piggin <[email protected]>:
> It is in an unlikely path though. How many apps actually pass in a
> non NULL value for the timezone? Those that don't won't be affected.
> Even for those that do, it doesn't introduce any atomic ops or
> unpredictable branches, or cacheline pressure (because xtime lock is
> already touched by do_gettimeofday). IOW: I'm sure it would be
> unmeasurable.
I agree the above. Normally the unlikely path is not invoked after boot.
>
> OTOH, to be completely correct, it seems like the same xtime_lock
> read section should cover both the calculation of ktv, and that of
> ktz. So if it is going to be fixed at all, it should be done
> properly and looks like it needs to be a bit more intrusive (but
> no more expensive).
>
That means either 1. Move the seq lock from within do_gettimeofday()
to out of do_gettimeofday(), or 2. pass ktz into do_gettimeofday() and
compute it in the preexisting read section in do_gettimeofday().
The first changes the semantic of do_gettimeofday() so it would be
unacceptable since the function is invoked from many places. The
second changes the signature of the function so every caller need to
be changed to passing an extra NULL pointer in order to satisfy the
changed invocation agreement.
I think the race condition is not unacceptable so long as comments is
changed to state the situation clearly. To move everything into the
same read (and write) section (respectively) is a bit bigger work but
it does not introduce performance penalty at run time. Or perhaps we
could tolerate some middle point, that is, as the initial patch, still
protect ktz and ktv in separated section and let the comments state
the not-so-perfect situation clearly.
>>>>> "Andi" == Andi Kleen <[email protected]> writes:
>> Did you get to the bottom of this yet? It looks like you're right,
>> and I suggest a seqlock might be a good option.
Andi> It basically doesn't matter because nobody changes the time zone
Andi> after boot.
I do when I travel, and my laptop has a dual-core processor.
Sam
--
Samuel Tardieu -- [email protected] -- http://www.rfc1149.net/
>>>>> "Christoph" == Christoph Lameter <[email protected]> writes:
Christoph> I really hate adding overhead to gettimeofday() and we
Christoph> would have to take the seqlock in all places when we
Christoph> reference tz. Maybe we can tolerate the resulting race?
Agreed. In some applications, gettimeofday() is called an awfully lot
of times. And the only plausible case where settimeofday() is called
after boot is on a multi-core laptop during a traval accross different
timezones, where it is unlikely that time-and-DST-sensitive
applications would be running.
Sam
--
Samuel Tardieu -- [email protected] -- http://www.rfc1149.net/
On Sat 30-09-06 17:03:45, Andi Kleen wrote:
>
> > Did you get to the bottom of this yet? It looks like you're right,
> > and I suggest a seqlock might be a good option.
>
> It basically doesn't matter because nobody changes the time zone after boot.
Attacker might; in a tight loop, to confuse time-of-day subsystem, or
maybe oops the kernel.
Pavel
--
Thanks for all the (sleeping) penguins.
On Tuesday 03 October 2006 10:03, Pavel Machek wrote:
> On Sat 30-09-06 17:03:45, Andi Kleen wrote:
> >
> > > Did you get to the bottom of this yet? It looks like you're right,
> > > and I suggest a seqlock might be a good option.
> >
> > It basically doesn't matter because nobody changes the time zone after boot.
>
> Attacker might; in a tight loop, to confuse time-of-day subsystem, or
> maybe oops the kernel.
(a) only root change it.
(b) the time of day subsystem never cares about the time zone. It is
just a variable stored for user space.
-Andi