2000-11-22 10:09:19

by Willy Tarreau

[permalink] [raw]
Subject: [BUG] 2.2.1[78] : RTNETLINK lock not properly locking ?

Hello !

while I was searching how to implement an rtnl_lock() in the bonding code,
I discovered that the rtnl_shlock() function in 2.2.1[78] could misbehave if
CONFIG_RTNETLINK is not set :
- it will nearly never allow concurrent accesses (seems to be what was
intented when it was written)
- it will not always prevent concurrent accesses, which is weird because
rtnl_lock() only relies on rtnl_shlock() (and exlock, which is empty) to
protect sensible areas

The first case is trivial : one at a time.
(code taken from include/linux/rtnetlink.h, line 639)

while (atomic_read(&rtnl_rlockct))
sleep_on(&rtnl_wait);
atomic_inc(&rtnl_rlockct);

The second case isn't trivial, so I will quote some points in the code :

[rtnl_shlock]
(1) ---------
while (atomic_read(&rtnl_rlockct))
(2) ---------
sleep_on(&rtnl_wait);
(3) ---------
atomic_inc(&rtnl_rlockct);
(4) ---------

[rtnl_shunlock]
(5) ---------
if (atomic_dec_and_test(&rtnl_rlockct))
(6) ---------
wake_up(&rtnl_wait);
(7) ---------

Consider 3 concurrent threads A, B and C.
- First, A needs the lock. Noone has it. It enters (1), then (3), sets the
rtnl_rlockct to 1 and exits at (4).
- now B comes in (1). The lock is already set by A, so B goes to (2) and
sleeps.
- A unlocks. It goes to (5), then (6)
- at this moment, C tries to lock in (1), an succeeds since A has just released
the lock. So it gets the lock and goes to (3), then (4).
- A is at (6) and wakes B up and steps to (7) and exits.
- B is woken up and goes to (3) then (4).

=> B and C both have the lock.

Perhaps I have missed something, but I don't find what. If I'm right, then why
don't we simply keep the same code as for the CONFIG_RTNETLINK case ?

Thanks in advance for any comment,

Regards,
Willy


2000-11-22 10:31:32

by David Miller

[permalink] [raw]
Subject: Re: [BUG] 2.2.1[78] : RTNETLINK lock not properly locking ?

Date: Wed, 22 Nov 2000 10:39:03 +0100 (MET)
From: Willy Tarreau <[email protected]>

Thanks in advance for any comment,

All of this is protected by lock_kernel() so none of the
A,B,C,whatever spots can be interrupted in 2.2.x

Later,
David S. Miller
[email protected]

2000-11-22 11:58:34

by Willy Tarreau

[permalink] [raw]
Subject: Re: [BUG] 2.2.1[78] : RTNETLINK lock not properly locking ?

Quoting "David S. Miller" <[email protected]>:

> Date: Wed, 22 Nov 2000 10:39:03 +0100 (MET)
> From: Willy Tarreau <[email protected]>
>
> Thanks in advance for any comment,
>
> All of this is protected by lock_kernel() so none of the
> A,B,C,whatever spots can be interrupted in 2.2.x

so, does this mean that rtnl_*lock* are completely useless ???

Willy

2000-11-22 12:13:03

by David Miller

[permalink] [raw]
Subject: Re: [BUG] 2.2.1[78] : RTNETLINK lock not properly locking ?

Date: Wed, 22 Nov 2000 12:27:57 +0100 (MET)
From: Willy Tarreau <[email protected]>

Quoting "David S. Miller" <[email protected]>:

> All of this is protected by lock_kernel() so none of the
> A,B,C,whatever spots can be interrupted in 2.2.x

so, does this mean that rtnl_*lock* are completely useless ???

No, it guarentees that only one process may be in the middle
of modifying interface configuration state, the same and only
guarentee it makes in 2.4.x as well.

Later,
David S. Miller
[email protected]

2000-11-22 16:06:46

by Willy Tarreau

[permalink] [raw]
Subject: Re: [BUG] 2.2.1[78] : RTNETLINK lock not properly locking ?

> No, it guarentees that only one process may be in the middle
> of modifying interface configuration state, the same and only
> guarentee it makes in 2.4.x as well.

ok, Dave. But the code in dev_ioctl() actually is :

rtnl_lock();
ret = dev_ifsioc(&ifr, cmd);
rtnl_unlock();

if only these lock/unlock guarantee this atomicity, then I can't
see why my A,B,C case could not work. If this is because the
kernel has been locked somewhere else, then why are the locks
still needed ? The author of rtnetlink.h has been very precautious
about the atomicity of these locks when CONFIG_RTNETLINK is set. I
don't understand why this could change in other cases. For this
reason, I don't know what to write in my code ...

Regards,
Willy

2000-11-22 18:39:45

by Alexey Kuznetsov

[permalink] [raw]
Subject: Re: [BUG] 2.2.1[78] : RTNETLINK lock not properly locking ?

Hello!

> - it will nearly never allow concurrent accesses (seems to be what was
> intented when it was written)

Never, to be more exact. Concurrent accesses are allowed only with rtnetlink.

In 2.4 it is always exclusive, because shared access turned out to
be mostly useless.


> - it will not always prevent concurrent accesses, which is weird because
> rtnl_lock() only relies on rtnl_shlock() (and exlock, which is empty) to
> protect sensible areas

It is linux-2.2, guy. 8) "threads" are not threaded there.

Semaphores (rtnl_lock, particularly) protects only areas, which
are going to _schedule_ excplicitly or implicitly.


> ok, Dave. But the code in dev_ioctl() actually is :
>
> rtnl_lock();
> ret = dev_ifsioc(&ifr, cmd);
> rtnl_unlock();
>
> if only these lock/unlock guarantee this atomicity, then I can't
> see why my A,B,C case could not work. If this is because the
> kernel has been locked somewhere else, then why are the locks
> still needed ?

Please, read comments. People used to consider comments as something
decorative, but they are not.

The first group of ioctls (no lock!):

/*
* These ioctl calls:
* - can be done by all.
* - atomic and do not require locking.
* - return a value
*/


The second group of ioctls (locked):

/*
* These ioctl calls:
* - require superuser power.
* - require strict serialization.
* - do not return a value
*/

Any questions?


> The author of rtnetlink.h has been very precautious
> about the atomicity of these locks when CONFIG_RTNETLINK is set.

Sorry...

/* NOTE: these locks are not interrupt safe, are not SMP safe,
* they are even not atomic. 8)8)8) ... and it is not a bug.
etc.

Do you call this "very precautios"? 8)

I would say that I was absolutely careless about their atomicity
because it is useless before BKL has been removed.

Alexey

2000-11-23 06:44:57

by Willy Tarreau

[permalink] [raw]
Subject: Re: [BUG] 2.2.1[78] : RTNETLINK lock not properly locking ?

> It is linux-2.2, guy. 8) "threads" are not threaded there.
>
> Semaphores (rtnl_lock, particularly) protects only areas, which
> are going to _schedule_ excplicitly or implicitly.

ok, thanks a lot Alexey, now I understand.

> Please, read comments. People used to consider comments as something
> decorative, but they are not.

I did read them again and again, but you know, when there's something
you don't understand, sometimes you interprete things badly.

> Any questions?
not anymore, of course :-)

> Sorry...
>
> /* NOTE: these locks are not interrupt safe, are not SMP safe,
> * they are even not atomic. 8)8)8) ... and it is not a bug.
> etc.
>
> Do you call this "very precautios"? 8)

I spoke about the first ones :)

thanks a lot, now I know how to proceed.

Cheers,
Willy