2016-03-31 09:40:33

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: Possible ABA in use of llist.h llist_del_first() in tty_buffer and ib_rdma

CCing LKML.

----- On Mar 31, 2016, at 5:39 AM, Mathieu Desnoyers [email protected] wrote:

> Hi,
>
> Code review (really: grepping the Linux kernel for
> llist_del_first) leads me to notice two possible ABA issues.
> The first one is in drivers/tty/tty_buffer.c, and is due to
> its use of llist_del_all and llist_del_first without locking
> since commit 809850b7a5 "tty: Use lockless flip buffer free list".
>
> Unfortunately, it appears to do so without respecting the
> locking requirements associated with llist_del_first.
>
> Quoting llist.h:
>
> " * If there are multiple producers and one consumer, llist_add can be
> * used in producers and llist_del_all or llist_del_first can be used
> * in the consumer.
> *
> * This can be summarized as follow:
> *
> * | add | del_first | del_all
> * add | - | - | -
> * del_first | | L | L
> * del_all | | | -
> *
> * Where "-" stands for no lock is needed, while "L" stands for lock
> * is needed.
> "
>
> As soon as a llist_del_first() is used, then both llist_del_first()
> and llist_del_all() need to be protected by a lock, thus preventing
> ABA in llist_del_first().
>
> An alternative to locking would be to only use llist_del_all() and
> never llist_del_first().
>
> I'm also noticing a similar concurrent use of llist_del_first() and
> llist_del_all() in commit 1bc144b625 "net, rds, Replace xlist in net/rds/xlist.h
> with llist".
> The locking surrounding their use (especially in rds_ib_reuse_mr)
> don't appear clearly documented there. Perhaps there was a preexisting
> issue with the xlist.h use too ?
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


2016-03-31 13:58:58

by Peter Hurley

[permalink] [raw]
Subject: Re: Possible ABA in use of llist.h llist_del_first() in tty_buffer and ib_rdma

Hi Mathieu,

On 03/31/2016 02:40 AM, Mathieu Desnoyers wrote:
> CCing LKML.
>
> ----- On Mar 31, 2016, at 5:39 AM, Mathieu Desnoyers [email protected] wrote:
>
>> Hi,
>>
>> Code review (really: grepping the Linux kernel for
>> llist_del_first) leads me to notice two possible ABA issues.
>> The first one is in drivers/tty/tty_buffer.c, and is due to
>> its use of llist_del_all and llist_del_first without locking
>> since commit 809850b7a5 "tty: Use lockless flip buffer free list".
>>
>> Unfortunately, it appears to do so without respecting the
>> locking requirements associated with llist_del_first.
>>
>> Quoting llist.h:
>>
>> " * If there are multiple producers and one consumer, llist_add can be
>> * used in producers and llist_del_all or llist_del_first can be used
>> * in the consumer.

The use of llist_del_all in tty_buffer_free_all() is not concurrent with
any other use of the free list; the comments for tty_buffer_free_all() even
note the special condition.

Only the llist_del_first() and llist_add() usage are concurrent, and fwiw,
that usage is single-producer/single-consumer.

Regards,
Peter Hurley

>> * This can be summarized as follow:
>> *
>> * | add | del_first | del_all
>> * add | - | - | -
>> * del_first | | L | L
>> * del_all | | | -
>> *
>> * Where "-" stands for no lock is needed, while "L" stands for lock
>> * is needed.
>> "
>>
>> As soon as a llist_del_first() is used, then both llist_del_first()
>> and llist_del_all() need to be protected by a lock, thus preventing
>> ABA in llist_del_first().
>>
>> An alternative to locking would be to only use llist_del_all() and
>> never llist_del_first().
>>
>> I'm also noticing a similar concurrent use of llist_del_first() and
>> llist_del_all() in commit 1bc144b625 "net, rds, Replace xlist in net/rds/xlist.h
>> with llist".
>> The locking surrounding their use (especially in rds_ib_reuse_mr)
>> don't appear clearly documented there. Perhaps there was a preexisting
>> issue with the xlist.h use too ?
>>
>> Thanks,
>>
>> Mathieu
>>
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> http://www.efficios.com
>

2016-04-01 21:32:29

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: Possible ABA in use of llist.h llist_del_first() in tty_buffer and ib_rdma

----- On Mar 31, 2016, at 9:58 AM, Peter Hurley [email protected] wrote:

> Hi Mathieu,
>
> On 03/31/2016 02:40 AM, Mathieu Desnoyers wrote:
>> CCing LKML.
>>
>> ----- On Mar 31, 2016, at 5:39 AM, Mathieu Desnoyers
>> [email protected] wrote:
>>
>>> Hi,
>>>
>>> Code review (really: grepping the Linux kernel for
>>> llist_del_first) leads me to notice two possible ABA issues.
>>> The first one is in drivers/tty/tty_buffer.c, and is due to
>>> its use of llist_del_all and llist_del_first without locking
>>> since commit 809850b7a5 "tty: Use lockless flip buffer free list".
>>>
>>> Unfortunately, it appears to do so without respecting the
>>> locking requirements associated with llist_del_first.
>>>
>>> Quoting llist.h:
>>>
>>> " * If there are multiple producers and one consumer, llist_add can be
>>> * used in producers and llist_del_all or llist_del_first can be used
>>> * in the consumer.
>
> The use of llist_del_all in tty_buffer_free_all() is not concurrent with
> any other use of the free list; the comments for tty_buffer_free_all() even
> note the special condition.

This one looks OK indeed.

>
> Only the llist_del_first() and llist_add() usage are concurrent, and fwiw,
> that usage is single-producer/single-consumer.

I see that tty_buffer_request_room is an exported symbol, and no
documentation indicate that it should never be called concurrently
for a struct tty_port. Also, there does not appear to be any locking
within this function preventing concurrent execution on a struct tty_port.
Is there some documentation about this interface that I am missing ?

If it's possible to call llist_del_first() concurrently, then we can run
into ABA scenarios, even if llist_add() is protected from concurrent
llist_add() by a lock.

Thanks,

Mathieu


>
> Regards,
> Peter Hurley
>
>>> * This can be summarized as follow:
>>> *
>>> * | add | del_first | del_all
>>> * add | - | - | -
>>> * del_first | | L | L
>>> * del_all | | | -
>>> *
>>> * Where "-" stands for no lock is needed, while "L" stands for lock
>>> * is needed.
>>> "
>>>
>>> As soon as a llist_del_first() is used, then both llist_del_first()
>>> and llist_del_all() need to be protected by a lock, thus preventing
>>> ABA in llist_del_first().
>>>
>>> An alternative to locking would be to only use llist_del_all() and
>>> never llist_del_first().
>>>
>>> I'm also noticing a similar concurrent use of llist_del_first() and
>>> llist_del_all() in commit 1bc144b625 "net, rds, Replace xlist in net/rds/xlist.h
>>> with llist".
>>> The locking surrounding their use (especially in rds_ib_reuse_mr)
>>> don't appear clearly documented there. Perhaps there was a preexisting
>>> issue with the xlist.h use too ?
>>>
>>> Thanks,
>>>
>>> Mathieu
>>>
>>> --
>>> Mathieu Desnoyers
>>> EfficiOS Inc.
>>> http://www.efficios.com

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

2016-04-01 23:50:58

by Peter Hurley

[permalink] [raw]
Subject: Re: Possible ABA in use of llist.h llist_del_first() in tty_buffer and ib_rdma

On 04/01/2016 02:32 PM, Mathieu Desnoyers wrote:
> ----- On Mar 31, 2016, at 9:58 AM, Peter Hurley [email protected] wrote:
>> On 03/31/2016 02:40 AM, Mathieu Desnoyers wrote:
>>> CCing LKML.
>>>
>>> ----- On Mar 31, 2016, at 5:39 AM, Mathieu Desnoyers
>>> [email protected] wrote:
>>>
>>>> Hi,
>>>>
>>>> Code review (really: grepping the Linux kernel for
>>>> llist_del_first) leads me to notice two possible ABA issues.
>>>> The first one is in drivers/tty/tty_buffer.c, and is due to
>>>> its use of llist_del_all and llist_del_first without locking
>>>> since commit 809850b7a5 "tty: Use lockless flip buffer free list".
>>>>
>>>> Unfortunately, it appears to do so without respecting the
>>>> locking requirements associated with llist_del_first.
>>>>
>>>> Quoting llist.h:
>>>>
>>>> " * If there are multiple producers and one consumer, llist_add can be
>>>> * used in producers and llist_del_all or llist_del_first can be used
>>>> * in the consumer.
>>
>> The use of llist_del_all in tty_buffer_free_all() is not concurrent with
>> any other use of the free list; the comments for tty_buffer_free_all() even
>> note the special condition.
>
> This one looks OK indeed.
>
>>
>> Only the llist_del_first() and llist_add() usage are concurrent, and fwiw,
>> that usage is single-producer/single-consumer.
>
> I see that tty_buffer_request_room is an exported symbol, and no
> documentation indicate that it should never be called concurrently
> for a struct tty_port. Also, there does not appear to be any locking
> within this function preventing concurrent execution on a struct tty_port.
> Is there some documentation about this interface that I am missing ?

There is little to no documentation on the tty flip buffer interface,
so you're not missing anything there.

The driver-side flip buffer interface is purely single-threaded;
it is exclusively called from interrupt handlers and single-threaded
bottom halves.

None of the functions are atomic, nor is the interface design.

For example, ignoring tty_buffer_request_room() for a moment, consider
how broken concurrent use of tty_insert_flip_string_flags() would be;
input from multiple threads would be overwritten/lost/mixed.

The interface itself is not atomic because tty_flip_buffer_push()
marks the conclusion of input and the hand-off to kworker, which may
already be running at the time.

The tty core doesn't support multi-channel input directly; the driver
is expected to deliver input from each channel to a separate tty,
or mux the inputs before tty_insert_flip_string_fixed_flag().



> If it's possible to call llist_del_first() concurrently, then we can run
> into ABA scenarios, even if llist_add() is protected from concurrent
> llist_add() by a lock.

And way more obvious problems than that, such as I wrote above,
if used concurrently.

Regards,
Peter Hurley