2005-03-15 16:14:29

by Oleg Nesterov

[permalink] [raw]
Subject: [PATCH 0/2] del_timer_sync: proof of concept

Andrew Morton wrote:
>
> If we're prepared to rule that a timer handler is not allowed to do
> add_timer_on() then a recurring timer is permanently pinned to a CPU, isn't
> it?
>
> That should make things simpler?

In that case I think that both problems (race and scalability)
can be solved without increasing sizeof(timer_list).

What if timer_list had ->pending field? Then we can do:

timer_pending:
return timer->pending;

__mod_timer:
internal_add_timer(new_base, timer);
timer->base = new_base;
timer->pending = 1;

__run_timers:
list_del(&timer->entry);
set_running_timer(base, timer);

/* do not change timer->base */
timer->pending = 0;

spin_unlock(base->lock);
timer->function();

del_timer:
if (!timer->pending)
return 0;
base = timer->base;
...

del_timer_sync:
base = timer->base;
if (!base)
return 0;
spin_lock(base->lock);

if (base != timer->base)
goto del_again;
if (base->running_timer == timer)
goto del_again;

if (timer->pending)
list_del(&timer->entry);

timer->pending = 0;
timer->base = NULL;

The ->pending flag could live in the least significant bit of
timer->base, this way we:
1. do not waste the space
2. can read/write base+pending atomically

These patches are incomplete/suboptimal, just a proof of concept.

Oleg.


2005-03-15 18:19:42

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 0/2] del_timer_sync: proof of concept

I like the idea and it would solve the concerns that we had. The encoding
of a bit in a pointer is weird but we have done that before in the
page_struct.

However, this also means that __run_timers will not free up the timer and
it has to be explicitly freed with del_timer_??. There may be code
that relies on not having to delete a single shot timer after it has been
fired.

2005-03-15 18:40:02

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 0/2] del_timer_sync: proof of concept

Christoph Lameter wrote:
>
> However, this also means that __run_timers will not free up the timer and
> it has to be explicitly freed with del_timer_??.

I am not sure I understand you but no, del_timer{,_sync} is not needed.

__run_timer deletes timer from base->tv? list and clears 'pending flag'.

__del_timer_sync sets ->_base = NULL, but it is merely optimization.
It could set ->_base = base, but in that case next del_timer_sync()
call will need spin_lock(base->lock) again.

Oleg.

2005-03-15 19:08:35

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 0/2] del_timer_sync: proof of concept

On Tue, 15 Mar 2005, Oleg Nesterov wrote:

> Christoph Lameter wrote:
> >
> > However, this also means that __run_timers will not free up the timer and
> > it has to be explicitly freed with del_timer_??.
>
> I am not sure I understand you but no, del_timer{,_sync} is not needed.
>
> __run_timer deletes timer from base->tv? list and clears 'pending flag'.
>
> __del_timer_sync sets ->_base = NULL, but it is merely optimization.
> It could set ->_base = base, but in that case next del_timer_sync()
> call will need spin_lock(base->lock) again.

For some reason I thought that ->base == NULL would have special
significance outside of the function you discussed. Looks fine to me now.

2005-03-16 15:49:49

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 0/2] del_timer_sync: proof of concept

Andrew Morton wrote:
>
> If we're prepared to rule that a timer handler is not allowed to do
> add_timer_on() then a recurring timer is permanently pinned to a CPU, isn't
> it?
>
> That should make things simpler?

I think that current inplementation of del_timer_sync() don't like
add_timer_on() too.

Consider the timer running on CPU_0. It sets timer->expires = jiffies,
and calls add_timer_on(1). Now it is possible that local timer interrupt
on CPU_1 happens and starts that timer before timer->function returns on
CPU_0.

del_timer_sync() detects that timer is running on CPU_0, waits while
->running_timer == timer, and returns. The timer still runs on CPU_1.

Oleg.