2013-06-24 19:12:30

by Marcus Gelderie

[permalink] [raw]
Subject: Race condition in time/alarmtimer.c

Hi,

there seems to be a race condition in kernel/time/alarmtimer.c

More specifically, the following function (line numbers correspond to actual file):

584 static int alarmtimer_do_nsleep(struct alarm *alarm, ktime_t absexp)
585 {
586 alarm->data = (void *)current;
587 do {
588 set_current_state(TASK_INTERRUPTIBLE);
589 alarm_start(alarm, absexp);
590 if (likely(alarm->data))
591 schedule();
592
593 alarm_cancel(alarm);
594 } while (alarm->data && !signal_pending(current));
595
596 __set_current_state(TASK_RUNNING);
597
598 return (alarm->data == NULL);
599 }

has a race: If the task is preempted after set_current_state(TASK_INTERRUPTIBLE)
but before the alarm is started in the next line, the task never wakes up.

Swapping both lines is not an option either, because then the alarm might trigger before
the thread sets itself to TASK_INTERRUPTIBLE, thereby loosing the wakeup.

A spinlock would disable preemption and protect alarm->data against the race from another CPU.
We could wrap lines 588 and 589 with a spin lock. Then the wakeup code would also aquire the
lock, of course. The lock could be attached to struct alarm.

An alternative would be a waitqueue, of course.

If folks agree with me, I will provide a patch.


Cheers
Marcus


2013-06-29 13:47:23

by Marcus Gelderie

[permalink] [raw]
Subject: Re: Race condition in time/alarmtimer.c

Hi,

bouncing this mail because originally my mail address was mangled due to MUA misconfig.

Sorry
Marcus


On Mo, Jun 24, 2013 at 09:12:03PM +0200, Marcus Gelderie wrote:
> Hi,
>
> there seems to be a race condition in kernel/time/alarmtimer.c
>
> More specifically, the following function (line numbers correspond to actual file):
>
> 584 static int alarmtimer_do_nsleep(struct alarm *alarm, ktime_t absexp)
> 585 {
> 586 alarm->data = (void *)current;
> 587 do {
> 588 set_current_state(TASK_INTERRUPTIBLE);
> 589 alarm_start(alarm, absexp);
> 590 if (likely(alarm->data))
> 591 schedule();
> 592
> 593 alarm_cancel(alarm);
> 594 } while (alarm->data && !signal_pending(current));
> 595
> 596 __set_current_state(TASK_RUNNING);
> 597
> 598 return (alarm->data == NULL);
> 599 }
>
> has a race: If the task is preempted after set_current_state(TASK_INTERRUPTIBLE)
> but before the alarm is started in the next line, the task never wakes up.
>
> Swapping both lines is not an option either, because then the alarm might trigger before
> the thread sets itself to TASK_INTERRUPTIBLE, thereby loosing the wakeup.
>
> A spinlock would disable preemption and protect alarm->data against the race from another CPU.
> We could wrap lines 588 and 589 with a spin lock. Then the wakeup code would also aquire the
> lock, of course. The lock could be attached to struct alarm.
>
> An alternative would be a waitqueue, of course.
>
> If folks agree with me, I will provide a patch.
>
>
> Cheers
> Marcus

2013-07-22 19:04:47

by John Stultz

[permalink] [raw]
Subject: Re: Race condition in time/alarmtimer.c

On 06/29/2013 06:47 AM, Marcus Gelderie wrote:
> On Mo, Jun 24, 2013 at 09:12:03PM +0200, Marcus Gelderie wrote:
>> Hi,
>>
>> there seems to be a race condition in kernel/time/alarmtimer.c
>>
>> More specifically, the following function (line numbers correspond to actual file):
>>
>> 584 static int alarmtimer_do_nsleep(struct alarm *alarm, ktime_t absexp)
>> 585 {
>> 586 alarm->data = (void *)current;
>> 587 do {
>> 588 set_current_state(TASK_INTERRUPTIBLE);
>> 589 alarm_start(alarm, absexp);
>> 590 if (likely(alarm->data))
>> 591 schedule();
>> 592
>> 593 alarm_cancel(alarm);
>> 594 } while (alarm->data && !signal_pending(current));
>> 595
>> 596 __set_current_state(TASK_RUNNING);
>> 597
>> 598 return (alarm->data == NULL);
>> 599 }
>>
>> has a race: If the task is preempted after set_current_state(TASK_INTERRUPTIBLE)
>> but before the alarm is started in the next line, the task never wakes up.
>>
>> Swapping both lines is not an option either, because then the alarm might trigger before
>> the thread sets itself to TASK_INTERRUPTIBLE, thereby loosing the wakeup.
>>
>> A spinlock would disable preemption and protect alarm->data against the race from another CPU.
>> We could wrap lines 588 and 589 with a spin lock. Then the wakeup code would also aquire the
>> lock, of course. The lock could be attached to struct alarm.
>>
>> An alternative would be a waitqueue, of course.
>>
>> If folks agree with me, I will provide a patch.

So does this race also affect the hrtimer do_nanosleep?

thanks
-john