LinuxLists.cc - Getting L2CAP ERTM support into better upstream state

2012-02-08 05:27:35

Subject: Getting L2CAP ERTM support into better upstream state

Hello everyone,

we currently have the problem on our hand that our upstream L2CAP
support is not as good as it should be. And many Android products start
deviating from it to fix issues with. That is of course something we can
not have. Fragmentation on this level is not useful for anyone. It will
hurt everybody in the longterm. So I asked Mat to port over the QUIC
code to the latest upstream so we can start a discussion here.

https://www.codeaurora.org/gitweb/quic/bluetooth/?p=bluetooth-next.git;a=commit;h=d02ef660d1ec9e9312798561fd688a4c717d339e

To make one thing perfectly clear here. I want that our upstream code
reflects the state machine transition from the L2CAP specification. I
know that historically this was not the case and that was for a reason.
All early specification where pretty unclear in this area, but with the
introduction of ERTM the Bluetooth SIG got this fixed and now it is our
time to get this clearly into our code as well.

With the separation of L2CAP channels from L2CAP sockets this should
make it even more easier to use since we do not need and should not to
use socket states anymore anyway.

Regards

Marcel

2012-02-10 16:54:04

by Mat Martineau

[permalink] [raw]

Subject: Re: Getting L2CAP ERTM support into better upstream state

Gustavo -

On Fri, 10 Feb 2012, Ulisses Furquim wrote:

> Hi Gustavo,
>
> On Fri, Feb 10, 2012 at 10:54 AM, Gustavo Padovan
> <[email protected]> wrote:
>> Hi Mat, Ulisses,
>>
>> * Ulisses Furquim <[email protected]> [2012-02-08 23:01:16 -0200]:
>>
>>> Hi Mat,
>>>
>>> On Wed, Feb 8, 2012 at 8:33 PM, Mat Martineau <[email protected]> wrote:
>>>>
>>>> On Wed, 8 Feb 2012, Ulisses Furquim wrote:
>>>>
>>>>> Hi everybody,
>>>>>
>>>>> On Wed, Feb 8, 2012 at 7:32 AM, Luiz Augusto von Dentz
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> Hi Andrei,
>>>>>>
>>>>>> On Wed, Feb 8, 2012 at 11:09 AM, Andrei Emeltchenko
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>
>>>>>>> I think this would be good to send as patch series so that people can
>>>>>>> comment. What comes to my mind is that the patch might be reduced if it
>>>>>>> does not change order of functions and defines like:
>>>>>>>
>>>>
>>>> ... <snip> ...
>>>>
>>>>
>>>>>>> those magic number does not look nice IMO and the code is not looking
>>>>>>> any
>>>>>>> better.
>>>>>>
>>>>>>
>>>>>> In this aspect perhaps, but we don't need to take the code as it is,
>>>>>> but the point here is following the states defined by the spec and
>>>>>> that is IMO much better.
>>>>
>>>>
>>>> Right - the code as it stands now was ported quickly, and does not take in
>>>> to account many of Andrei's upstream improvements. ?I don't want to move
>>>> backwards in terms of magic numbers, and there is some extra noise in there
>>>> due to symbol names. ?I need to address these issues before merging.
>>>
>>> Great.
>>>
>>>> I'm also looking for ideas on how to do this as a patch series. ?Since it
>>>> takes a very different approach from the original code, it's hard to make
>>>> meaningful, small patches that don't break functionality. ?At some point
>>>> there has to be a major switchover but there is a lot of room to split this
>>>> up.
>>>
>>> I agree here. You maybe can have some patches introducing new code
>>> that's going to be used in the commit that really switches the state
>>> machine handling.
>>
>>
>> The first step is to do a huge clean up on your patch, people already reported
>> many issues here. Then you can send all the macro definitions changes to
>> l2cap.h. Finally we look to your patch and check how bit it is and decide what
>> to do. It's much better waste time looking for bugs in your code than trying
>> to find a way to split in many patches.
>> The most important thing is do it fast, we are now in -rc3, we have 4 or 5
>> weeks until 3.4 merge windows opens.

I agree that time spent splitting up in to small patches will make
this take longer, and that it would be good to take another look after
some cleanup.

As for timing, Marcel was favoring 3.5 on IRC:

[13:27:09] <jhe> Vudentz: are there still ERTM patches that I should
wait for before considering our tree "good enough" for a pull request
to the wireless tree?
[13:29:30] <jhe> holtmann: regarding my above comment, I suppose the
rewrite of the L2CAP states that you've requested will inevitably need
to wait until the 3.5 merge window, right?
...
[15:40:41] <holtmann> jhe: Yes. for 3.4 merge window we are almost
done with changes.

>>>> One approach is to add inactive code until everything is there, then have
>>>> one commit that calls in to the new code instead of the old code. Then the
>>>> old code can be removed. ?I talked to Gustavo about that a while ago, and he
>>>> preferred finding a different way. ?Maybe an intermediate step is to put the
>>>> state machine code in there, but call the existing frame handling functions
>>>> in every state.
>>>
>>> IMO the former approach is better. We could even have a module option
>>> to activate the new code for testing (have it experimental) and do the
>>> switch when we're confident it's ready.
>>>
>>> <snip>
>>>
>>>>> Marcel mentioned the separation of L2CAP channel and socket. That is a
>>>>> work in progress by Andrei and judging by what you said, Marcel, you
>>>>> want that merged before we change ERTM, is that it?
>>>>
>>>> Andrei and Marcel, let's figure out this order now. ?It doesn't make sense
>>>> for one of us to have a bunch of changes staged, only to have major
>>>> merge/rebase conflicts when another far-reaching patch set gets merged.
>>>
>>> I'd say we add your changes and then Andrei's when he's ready with
>>> them. Right now I have the feeling your changes are more mature and
>>> can be easily merged.
>>>
>>>>> Mat, thanks a lot for doing this. I have just a few questions. Have
>>>>> you tested the stack with this patch? How was it? Quickly looking
>>>>> through the code I feel it's almost there. I agree with Andrei
>>>>> regarding constants and control field handling but apart from that it
>>>>> looks good, in general.
>>>>
>>>> It's only lightly tested right now. ?I can do incoming and outgoing
>>>> connections, and send/receive. ?It is unstable when disconnecting. ?I have
>>>> not tested with dropped frames yet.
>>>
>>> Ok.
>>>
>>>>> Regarding delayed work handling, are you sure about usage of
>>>>> __cancel_delayed_work()? I do think it's safer to use
>>>>> cancel_delayed_work() instead as it'll just spin on a lock if the
>>>>> timer to queue the work is running on another CPU.
>>>>
>>>> This code is coming from a kernel that still had code running in tasklet
>>>> context that couldn't block, and the delayed work handlers were written such
>>>> that it was ok if the functions were called after cancel. ?I will update to
>>>> the safer cancel_delayed_work().
>>>
>>> I see. I haven't really checked the handlers but it's less error prone
>>> to guarantee that if we cancel a delayed work as soon as the function
>>> returns the handler won't be queued on our back. And I say that
>>> thinking of keeping it simpler to maintain. Also cancel_delayed_work()
>>> never sleeps, it can only spin on a spinlock waiting for the timer
>>> handler to return to guarantee the work won't be queued again.
>>>
>>>>> I saw a comment about channel ref counting and kind of audit that
>>>>> would be great. It'd be good to see a patch for that after transition
>>>>> to new state handling.
>>>>
>>>> The "do channel ref counting" comment is there because I saw that the
>>>> existing ack/monitor/retrans timers do that. ?I will fix this up before
>>>> submitting.
>>>
>>> Ok.
>>>
>>>> Thanks for the feedback, everyone. ?Please let me know if you have
>>>> preferences for how to structure this patch set. ?I'll work on the issues
>>>> mentioned in this thread and start splitting up the changes.
>>>
>>> We need to hear from Marcel and Padovan if they agree with us. I do
>>> think you can introduce new stuff with small commits and then have one
>>> commit to add the bulk of it with the module option to enable it. Then
>>> we test that and make it default later.
>>
>> If the concern here is that the current ERTM implementation does not work I'd
>> rather completely remove it and add the new one, no module options to choose.
>
> Sure, we can also do that. The other approaches were just more incremental.

Just switching over without a module option would also help us do this
faster.

Regards,

--
Mat Martineau
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum

2012-02-10 13:49:34

by Ulisses Furquim

[permalink] [raw]

Subject: Re: Getting L2CAP ERTM support into better upstream state

Hi Gustavo,

On Fri, Feb 10, 2012 at 10:54 AM, Gustavo Padovan
<[email protected]> wrote:
> Hi Mat, Ulisses,
>
> * Ulisses Furquim <[email protected]> [2012-02-08 23:01:16 -0200]:
>
>> Hi Mat,
>>
>> On Wed, Feb 8, 2012 at 8:33 PM, Mat Martineau <[email protected]> w=
rote:
>> >
>> > On Wed, 8 Feb 2012, Ulisses Furquim wrote:
>> >
>> >> Hi everybody,
>> >>
>> >> On Wed, Feb 8, 2012 at 7:32 AM, Luiz Augusto von Dentz
>> >> <[email protected]> wrote:
>> >>>
>> >>> Hi Andrei,
>> >>>
>> >>> On Wed, Feb 8, 2012 at 11:09 AM, Andrei Emeltchenko
>> >>> <[email protected]> wrote:
>> >>>>
>> >>>>
>> >>>> I think this would be good to send as patch series so that people c=
an
>> >>>> comment. What comes to my mind is that the patch might be reduced i=
f it
>> >>>> does not change order of functions and defines like:
>> >>>>
>> >
>> > ... <snip> ...
>> >
>> >
>> >>>> those magic number does not look nice IMO and the code is not looki=
ng
>> >>>> any
>> >>>> better.
>> >>>
>> >>>
>> >>> In this aspect perhaps, but we don't need to take the code as it is,
>> >>> but the point here is following the states defined by the spec and
>> >>> that is IMO much better.
>> >
>> >
>> > Right - the code as it stands now was ported quickly, and does not tak=
e in
>> > to account many of Andrei's upstream improvements. =A0I don't want to =
move
>> > backwards in terms of magic numbers, and there is some extra noise in =
there
>> > due to symbol names. =A0I need to address these issues before merging.
>>
>> Great.
>>
>> > I'm also looking for ideas on how to do this as a patch series. =A0Sin=
ce it
>> > takes a very different approach from the original code, it's hard to m=
ake
>> > meaningful, small patches that don't break functionality. =A0At some p=
oint
>> > there has to be a major switchover but there is a lot of room to split=
this
>> > up.
>>
>> I agree here. You maybe can have some patches introducing new code
>> that's going to be used in the commit that really switches the state
>> machine handling.
>
>
> The first step is to do a huge clean up on your patch, people already rep=
orted
> many issues here. Then you can send all the macro definitions changes to
> l2cap.h. Finally we look to your patch and check how bit it is and decide=
what
> to do. It's much better waste time looking for bugs in your code than try=
ing
> to find a way to split in many patches.
> The most important thing is do it fast, we are now in -rc3, we have 4 or =
5
> weeks until 3.4 merge windows opens.
>
>>
>> > One approach is to add inactive code until everything is there, then h=
ave
>> > one commit that calls in to the new code instead of the old code. Then=
the
>> > old code can be removed. =A0I talked to Gustavo about that a while ago=
, and he
>> > preferred finding a different way. =A0Maybe an intermediate step is to=
put the
>> > state machine code in there, but call the existing frame handling func=
tions
>> > in every state.
>>
>> IMO the former approach is better. We could even have a module option
>> to activate the new code for testing (have it experimental) and do the
>> switch when we're confident it's ready.
>>
>> <snip>
>>
>> >> Marcel mentioned the separation of L2CAP channel and socket. That is =
a
>> >> work in progress by Andrei and judging by what you said, Marcel, you
>> >> want that merged before we change ERTM, is that it?
>> >
>> > Andrei and Marcel, let's figure out this order now. =A0It doesn't make=
sense
>> > for one of us to have a bunch of changes staged, only to have major
>> > merge/rebase conflicts when another far-reaching patch set gets merged=
.
>>
>> I'd say we add your changes and then Andrei's when he's ready with
>> them. Right now I have the feeling your changes are more mature and
>> can be easily merged.
>>
>> >> Mat, thanks a lot for doing this. I have just a few questions. Have
>> >> you tested the stack with this patch? How was it? Quickly looking
>> >> through the code I feel it's almost there. I agree with Andrei
>> >> regarding constants and control field handling but apart from that it
>> >> looks good, in general.
>> >
>> > It's only lightly tested right now. =A0I can do incoming and outgoing
>> > connections, and send/receive. =A0It is unstable when disconnecting. =
=A0I have
>> > not tested with dropped frames yet.
>>
>> Ok.
>>
>> >> Regarding delayed work handling, are you sure about usage of
>> >> __cancel_delayed_work()? I do think it's safer to use
>> >> cancel_delayed_work() instead as it'll just spin on a lock if the
>> >> timer to queue the work is running on another CPU.
>> >
>> > This code is coming from a kernel that still had code running in taskl=
et
>> > context that couldn't block, and the delayed work handlers were writte=
n such
>> > that it was ok if the functions were called after cancel. =A0I will up=
date to
>> > the safer cancel_delayed_work().
>>
>> I see. I haven't really checked the handlers but it's less error prone
>> to guarantee that if we cancel a delayed work as soon as the function
>> returns the handler won't be queued on our back. And I say that
>> thinking of keeping it simpler to maintain. Also cancel_delayed_work()
>> never sleeps, it can only spin on a spinlock waiting for the timer
>> handler to return to guarantee the work won't be queued again.
>>
>> >> I saw a comment about channel ref counting and kind of audit that
>> >> would be great. It'd be good to see a patch for that after transition
>> >> to new state handling.
>> >
>> > The "do channel ref counting" comment is there because I saw that the
>> > existing ack/monitor/retrans timers do that. =A0I will fix this up bef=
ore
>> > submitting.
>>
>> Ok.
>>
>> > Thanks for the feedback, everyone. =A0Please let me know if you have
>> > preferences for how to structure this patch set. =A0I'll work on the i=
ssues
>> > mentioned in this thread and start splitting up the changes.
>>
>> We need to hear from Marcel and Padovan if they agree with us. I do
>> think you can introduce new stuff with small commits and then have one
>> commit to add the bulk of it with the module option to enable it. Then
>> we test that and make it default later.
>
> If the concern here is that the current ERTM implementation does not work=
I'd
> rather completely remove it and add the new one, no module options to cho=
ose.

Sure, we can also do that. The other approaches were just more incremental.

Regards,

--=20
Ulisses Furquim
ProFUSION embedded systems
http://profusion.mobi
Mobile: +55 19 9250 0942
Skype: ulissesffs

2012-02-10 12:54:44

by Gustavo Padovan

[permalink] [raw]

Subject: Re: Getting L2CAP ERTM support into better upstream state

Hi Mat, Ulisses,

* Ulisses Furquim <[email protected]> [2012-02-08 23:01:16 -0200]:

> Hi Mat,
>=20
> On Wed, Feb 8, 2012 at 8:33 PM, Mat Martineau <[email protected]> wr=
ote:
> >
> > On Wed, 8 Feb 2012, Ulisses Furquim wrote:
> >
> >> Hi everybody,
> >>
> >> On Wed, Feb 8, 2012 at 7:32 AM, Luiz Augusto von Dentz
> >> <[email protected]> wrote:
> >>>
> >>> Hi Andrei,
> >>>
> >>> On Wed, Feb 8, 2012 at 11:09 AM, Andrei Emeltchenko
> >>> <[email protected]> wrote:
> >>>>
> >>>>
> >>>> I think this would be good to send as patch series so that people can
> >>>> comment. What comes to my mind is that the patch might be reduced if=
it
> >>>> does not change order of functions and defines like:
> >>>>
> >
> > ... <snip> ...
> >
> >
> >>>> those magic number does not look nice IMO and the code is not looking
> >>>> any
> >>>> better.
> >>>
> >>>
> >>> In this aspect perhaps, but we don't need to take the code as it is,
> >>> but the point here is following the states defined by the spec and
> >>> that is IMO much better.
> >
> >
> > Right - the code as it stands now was ported quickly, and does not take=
in
> > to account many of Andrei's upstream improvements. =A0I don't want to m=
ove
> > backwards in terms of magic numbers, and there is some extra noise in t=
here
> > due to symbol names. =A0I need to address these issues before merging.
>=20
> Great.
>=20
> > I'm also looking for ideas on how to do this as a patch series. =A0Sinc=
e it
> > takes a very different approach from the original code, it's hard to ma=
ke
> > meaningful, small patches that don't break functionality. =A0At some po=
int
> > there has to be a major switchover but there is a lot of room to split =
this
> > up.
>=20
> I agree here. You maybe can have some patches introducing new code
> that's going to be used in the commit that really switches the state
> machine handling.

The first step is to do a huge clean up on your patch, people already repor=
ted
many issues here. Then you can send all the macro definitions changes to
l2cap.h. Finally we look to your patch and check how bit it is and decide w=
hat
to do. It's much better waste time looking for bugs in your code than trying
to find a way to split in many patches.
The most important thing is do it fast, we are now in -rc3, we have 4 or 5
weeks until 3.4 merge windows opens.

>=20
> > One approach is to add inactive code until everything is there, then ha=
ve
> > one commit that calls in to the new code instead of the old code. Then =
the
> > old code can be removed. =A0I talked to Gustavo about that a while ago,=
and he
> > preferred finding a different way. =A0Maybe an intermediate step is to =
put the
> > state machine code in there, but call the existing frame handling funct=
ions
> > in every state.
>=20
> IMO the former approach is better. We could even have a module option
> to activate the new code for testing (have it experimental) and do the
> switch when we're confident it's ready.
>=20
> <snip>
>=20
> >> Marcel mentioned the separation of L2CAP channel and socket. That is a
> >> work in progress by Andrei and judging by what you said, Marcel, you
> >> want that merged before we change ERTM, is that it?
> >
> > Andrei and Marcel, let's figure out this order now. =A0It doesn't make =
sense
> > for one of us to have a bunch of changes staged, only to have major
> > merge/rebase conflicts when another far-reaching patch set gets merged.
>=20
> I'd say we add your changes and then Andrei's when he's ready with
> them. Right now I have the feeling your changes are more mature and
> can be easily merged.
>=20
> >> Mat, thanks a lot for doing this. I have just a few questions. Have
> >> you tested the stack with this patch? How was it? Quickly looking
> >> through the code I feel it's almost there. I agree with Andrei
> >> regarding constants and control field handling but apart from that it
> >> looks good, in general.
> >
> > It's only lightly tested right now. =A0I can do incoming and outgoing
> > connections, and send/receive. =A0It is unstable when disconnecting. =
=A0I have
> > not tested with dropped frames yet.
>=20
> Ok.
>=20
> >> Regarding delayed work handling, are you sure about usage of
> >> __cancel_delayed_work()? I do think it's safer to use
> >> cancel_delayed_work() instead as it'll just spin on a lock if the
> >> timer to queue the work is running on another CPU.
> >
> > This code is coming from a kernel that still had code running in tasklet
> > context that couldn't block, and the delayed work handlers were written=
such
> > that it was ok if the functions were called after cancel. =A0I will upd=
ate to
> > the safer cancel_delayed_work().
>=20
> I see. I haven't really checked the handlers but it's less error prone
> to guarantee that if we cancel a delayed work as soon as the function
> returns the handler won't be queued on our back. And I say that
> thinking of keeping it simpler to maintain. Also cancel_delayed_work()
> never sleeps, it can only spin on a spinlock waiting for the timer
> handler to return to guarantee the work won't be queued again.
>=20
> >> I saw a comment about channel ref counting and kind of audit that
> >> would be great. It'd be good to see a patch for that after transition
> >> to new state handling.
> >
> > The "do channel ref counting" comment is there because I saw that the
> > existing ack/monitor/retrans timers do that. =A0I will fix this up befo=
re
> > submitting.
>=20
> Ok.
>=20
> > Thanks for the feedback, everyone. =A0Please let me know if you have
> > preferences for how to structure this patch set. =A0I'll work on the is=
sues
> > mentioned in this thread and start splitting up the changes.
>=20
> We need to hear from Marcel and Padovan if they agree with us. I do
> think you can introduce new stuff with small commits and then have one
> commit to add the bulk of it with the module option to enable it. Then
> we test that and make it default later.

If the concern here is that the current ERTM implementation does not work I=
'd
rather completely remove it and add the new one, no module options to choos=
e.

Gustavo

2012-02-09 10:53:23

by Andrei Emeltchenko

[permalink] [raw]

Subject: Re: Getting L2CAP ERTM support into better upstream state

Hi all,

On Wed, Feb 08, 2012 at 11:01:16PM -0200, Ulisses Furquim wrote:
...
> >> Marcel mentioned the separation of L2CAP channel and socket. That is a
> >> work in progress by Andrei and judging by what you said, Marcel, you
> >> want that merged before we change ERTM, is that it?
> >
> > Andrei and Marcel, let's figure out this order now. ?It doesn't make sense
> > for one of us to have a bunch of changes staged, only to have major
> > merge/rebase conflicts when another far-reaching patch set gets merged.
>
> I'd say we add your changes and then Andrei's when he's ready with
> them. Right now I have the feeling your changes are more mature and
> can be easily merged.

I believe that those mentioned state changes should not affect lock
separation patches.

Locking is done when receiving data packet in l2cap_data_channel and the
changes shall come to functions surrounded by those locks. The same with
sending packet functionality.

...
> > Thanks for the feedback, everyone. ?Please let me know if you have
> > preferences for how to structure this patch set. ?I'll work on the issues
> > mentioned in this thread and start splitting up the changes.
>
> We need to hear from Marcel and Padovan if they agree with us. I do
> think you can introduce new stuff with small commits and then have one
> commit to add the bulk of it with the module option to enable it. Then
> we test that and make it default later.

Thanks Mat for work you have done. I am generally agree with Ulisses about
small logical commits (better if they compile without warnings). But if
there are changes that might be applied as is (like the patch Luiz has
sent concerning tx_window) that would be even better.

Best regards
Andrei Emeltchenko

2012-02-09 07:13:19

by Marcel Holtmann

[permalink] [raw]

Subject: Re: Getting L2CAP ERTM support into better upstream state

Hi Mat,

> > On Wed, Feb 8, 2012 at 7:32 AM, Luiz Augusto von Dentz
> > <[email protected]> wrote:
> >> Hi Andrei,
> >>
> >> On Wed, Feb 8, 2012 at 11:09 AM, Andrei Emeltchenko
> >> <[email protected]> wrote:
> >>>
> >>> I think this would be good to send as patch series so that people can
> >>> comment. What comes to my mind is that the patch might be reduced if it
> >>> does not change order of functions and defines like:
> >>>
>
> ... <snip> ...
>
> >>> those magic number does not look nice IMO and the code is not looking any
> >>> better.
> >>
> >> In this aspect perhaps, but we don't need to take the code as it is,
> >> but the point here is following the states defined by the spec and
> >> that is IMO much better.
>
> Right - the code as it stands now was ported quickly, and does not
> take in to account many of Andrei's upstream improvements. I don't
> want to move backwards in terms of magic numbers, and there is some
> extra noise in there due to symbol names. I need to address these
> issues before merging.
>
> I'm also looking for ideas on how to do this as a patch series. Since
> it takes a very different approach from the original code, it's hard
> to make meaningful, small patches that don't break functionality. At
> some point there has to be a major switchover but there is a lot of
> room to split this up.
>
> One approach is to add inactive code until everything is there, then
> have one commit that calls in to the new code instead of the old code.
> Then the old code can be removed. I talked to Gustavo about that a
> while ago, and he preferred finding a different way. Maybe an
> intermediate step is to put the state machine code in there, but call
> the existing frame handling functions in every state.

I am fine with this approach. We must make sure that we have a complete
series and be merging it at once. Otherwise dealing with compiler
warnings in -next trees is just creating too much noise.

And to make this crystal clear to everybody. I want _ONE_ upstream L2CAP
ERTM implementation and no future fragmentation. I am dealing with this
issue once and after that is has to be upstream first. No exceptions.

> > I've taken a quick look last week at their code and indeed it looks
> > better regarding following the states. In particular they track
> > rx_state and tx_state and then decide what to do based on that and
> > what happened. I like it because it's explicit about that instead of
> > demanding us to reason a lot on what state we are and what we should
> > do. I'm all for changing our ERTM to that so it'll be more
> > maintainable.
>
> That was the goal of the design. It has worked well during in-house
> testing, and has been through several UPFs.
>
>
> > Marcel mentioned the separation of L2CAP channel and socket. That is a
> > work in progress by Andrei and judging by what you said, Marcel, you
> > want that merged before we change ERTM, is that it?
>
> Andrei and Marcel, let's figure out this order now. It doesn't make
> sense for one of us to have a bunch of changes staged, only to have
> major merge/rebase conflicts when another far-reaching patch set gets
> merged.

This is between you and Andrei. I want to see the L2CAP channel vs
socket split happening. I do not care in which order. You guys agree on
a way forward and I will just go along with it.

Regards

Marcel

2012-02-09 01:01:16

by Ulisses Furquim

[permalink] [raw]

Subject: Re: Getting L2CAP ERTM support into better upstream state

Hi Mat,

On Wed, Feb 8, 2012 at 8:33 PM, Mat Martineau <[email protected]> wrot=
e:
>
> On Wed, 8 Feb 2012, Ulisses Furquim wrote:
>
>> Hi everybody,
>>
>> On Wed, Feb 8, 2012 at 7:32 AM, Luiz Augusto von Dentz
>> <[email protected]> wrote:
>>>
>>> Hi Andrei,
>>>
>>> On Wed, Feb 8, 2012 at 11:09 AM, Andrei Emeltchenko
>>> <[email protected]> wrote:
>>>>
>>>>
>>>> I think this would be good to send as patch series so that people can
>>>> comment. What comes to my mind is that the patch might be reduced if i=
t
>>>> does not change order of functions and defines like:
>>>>
>
> ... <snip> ...
>
>
>>>> those magic number does not look nice IMO and the code is not looking
>>>> any
>>>> better.
>>>
>>>
>>> In this aspect perhaps, but we don't need to take the code as it is,
>>> but the point here is following the states defined by the spec and
>>> that is IMO much better.
>
>
> Right - the code as it stands now was ported quickly, and does not take i=
n
> to account many of Andrei's upstream improvements. =A0I don't want to mov=
e
> backwards in terms of magic numbers, and there is some extra noise in the=
re
> due to symbol names. =A0I need to address these issues before merging.

Great.

> I'm also looking for ideas on how to do this as a patch series. =A0Since =
it
> takes a very different approach from the original code, it's hard to make
> meaningful, small patches that don't break functionality. =A0At some poin=
t
> there has to be a major switchover but there is a lot of room to split th=
is
> up.

I agree here. You maybe can have some patches introducing new code
that's going to be used in the commit that really switches the state
machine handling.

> One approach is to add inactive code until everything is there, then have
> one commit that calls in to the new code instead of the old code. Then th=
e
> old code can be removed. =A0I talked to Gustavo about that a while ago, a=
nd he
> preferred finding a different way. =A0Maybe an intermediate step is to pu=
t the
> state machine code in there, but call the existing frame handling functio=
ns
> in every state.

IMO the former approach is better. We could even have a module option
to activate the new code for testing (have it experimental) and do the
switch when we're confident it's ready.

<snip>

>> Marcel mentioned the separation of L2CAP channel and socket. That is a
>> work in progress by Andrei and judging by what you said, Marcel, you
>> want that merged before we change ERTM, is that it?
>
> Andrei and Marcel, let's figure out this order now. =A0It doesn't make se=
nse
> for one of us to have a bunch of changes staged, only to have major
> merge/rebase conflicts when another far-reaching patch set gets merged.

I'd say we add your changes and then Andrei's when he's ready with
them. Right now I have the feeling your changes are more mature and
can be easily merged.

>> Mat, thanks a lot for doing this. I have just a few questions. Have
>> you tested the stack with this patch? How was it? Quickly looking
>> through the code I feel it's almost there. I agree with Andrei
>> regarding constants and control field handling but apart from that it
>> looks good, in general.
>
> It's only lightly tested right now. =A0I can do incoming and outgoing
> connections, and send/receive. =A0It is unstable when disconnecting. =A0I=
have
> not tested with dropped frames yet.

Ok.

>> Regarding delayed work handling, are you sure about usage of
>> __cancel_delayed_work()? I do think it's safer to use
>> cancel_delayed_work() instead as it'll just spin on a lock if the
>> timer to queue the work is running on another CPU.
>
> This code is coming from a kernel that still had code running in tasklet
> context that couldn't block, and the delayed work handlers were written s=
uch
> that it was ok if the functions were called after cancel. =A0I will updat=
e to
> the safer cancel_delayed_work().

I see. I haven't really checked the handlers but it's less error prone
to guarantee that if we cancel a delayed work as soon as the function
returns the handler won't be queued on our back. And I say that
thinking of keeping it simpler to maintain. Also cancel_delayed_work()
never sleeps, it can only spin on a spinlock waiting for the timer
handler to return to guarantee the work won't be queued again.

>> I saw a comment about channel ref counting and kind of audit that
>> would be great. It'd be good to see a patch for that after transition
>> to new state handling.
>
> The "do channel ref counting" comment is there because I saw that the
> existing ack/monitor/retrans timers do that. =A0I will fix this up before
> submitting.

Ok.

> Thanks for the feedback, everyone. =A0Please let me know if you have
> preferences for how to structure this patch set. =A0I'll work on the issu=
es
> mentioned in this thread and start splitting up the changes.

We need to hear from Marcel and Padovan if they agree with us. I do
think you can introduce new stuff with small commits and then have one
commit to add the bulk of it with the module option to enable it. Then
we test that and make it default later.

Best regards,

--=20
Ulisses Furquim
ProFUSION embedded systems
http://profusion.mobi
Mobile: +55 19 9250 0942
Skype: ulissesffs

2012-02-08 22:33:19

by Mat Martineau

[permalink] [raw]

Subject: Re: Getting L2CAP ERTM support into better upstream state

On Wed, 8 Feb 2012, Ulisses Furquim wrote:

> Hi everybody,
>
> On Wed, Feb 8, 2012 at 7:32 AM, Luiz Augusto von Dentz
> <[email protected]> wrote:
>> Hi Andrei,
>>
>> On Wed, Feb 8, 2012 at 11:09 AM, Andrei Emeltchenko
>> <[email protected]> wrote:
>>>
>>> I think this would be good to send as patch series so that people can
>>> comment. What comes to my mind is that the patch might be reduced if it
>>> does not change order of functions and defines like:
>>>

... <snip> ...

>>> those magic number does not look nice IMO and the code is not looking any
>>> better.
>>
>> In this aspect perhaps, but we don't need to take the code as it is,
>> but the point here is following the states defined by the spec and
>> that is IMO much better.

Right - the code as it stands now was ported quickly, and does not
take in to account many of Andrei's upstream improvements. I don't
want to move backwards in terms of magic numbers, and there is some
extra noise in there due to symbol names. I need to address these
issues before merging.

I'm also looking for ideas on how to do this as a patch series. Since
it takes a very different approach from the original code, it's hard
to make meaningful, small patches that don't break functionality. At
some point there has to be a major switchover but there is a lot of
room to split this up.

One approach is to add inactive code until everything is there, then
have one commit that calls in to the new code instead of the old code.
Then the old code can be removed. I talked to Gustavo about that a
while ago, and he preferred finding a different way. Maybe an
intermediate step is to put the state machine code in there, but call
the existing frame handling functions in every state.

> I've taken a quick look last week at their code and indeed it looks
> better regarding following the states. In particular they track
> rx_state and tx_state and then decide what to do based on that and
> what happened. I like it because it's explicit about that instead of
> demanding us to reason a lot on what state we are and what we should
> do. I'm all for changing our ERTM to that so it'll be more
> maintainable.

That was the goal of the design. It has worked well during in-house
testing, and has been through several UPFs.

> Marcel mentioned the separation of L2CAP channel and socket. That is a
> work in progress by Andrei and judging by what you said, Marcel, you
> want that merged before we change ERTM, is that it?

Andrei and Marcel, let's figure out this order now. It doesn't make
sense for one of us to have a bunch of changes staged, only to have
major merge/rebase conflicts when another far-reaching patch set gets
merged.

> Mat, thanks a lot for doing this. I have just a few questions. Have
> you tested the stack with this patch? How was it? Quickly looking
> through the code I feel it's almost there. I agree with Andrei
> regarding constants and control field handling but apart from that it
> looks good, in general.

It's only lightly tested right now. I can do incoming and outgoing
connections, and send/receive. It is unstable when disconnecting. I
have not tested with dropped frames yet.

> Regarding delayed work handling, are you sure about usage of
> __cancel_delayed_work()? I do think it's safer to use
> cancel_delayed_work() instead as it'll just spin on a lock if the
> timer to queue the work is running on another CPU.

This code is coming from a kernel that still had code running in
tasklet context that couldn't block, and the delayed work handlers
were written such that it was ok if the functions were called after
cancel. I will update to the safer cancel_delayed_work().

> I saw a comment about channel ref counting and kind of audit that
> would be great. It'd be good to see a patch for that after transition
> to new state handling.

The "do channel ref counting" comment is there because I saw that the
existing ack/monitor/retrans timers do that. I will fix this up
before submitting.

Thanks for the feedback, everyone. Please let me know if you have
preferences for how to structure this patch set. I'll work on the
issues mentioned in this thread and start splitting up the changes.

--
Mat Martineau
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum

2012-02-08 11:16:00

by Ulisses Furquim

[permalink] [raw]

Subject: Re: Getting L2CAP ERTM support into better upstream state

Hi everybody,

On Wed, Feb 8, 2012 at 7:32 AM, Luiz Augusto von Dentz
<[email protected]> wrote:
> Hi Andrei,
>
> On Wed, Feb 8, 2012 at 11:09 AM, Andrei Emeltchenko
> <[email protected]> wrote:
>>
>> I think this would be good to send as patch series so that people can
>> comment. What comes to my mind is that the patch might be reduced if it
>> does not change order of functions and defines like:
>>
>> <------8<---------------------------------------------------------------=
--
>> | =A0-#define L2CAP_EXT_CTRL_TXSEQ =A0 =A0 =A0 =A0 =A0 0xFFFC0000
>> | =A0 #define L2CAP_EXT_CTRL_SAR =A0 =A0 =A0 =A0 =A0 =A0 0x00030000
>> | =A0-#define L2CAP_EXT_CTRL_SUPERVISE =A0 =A0 =A0 0x00030000
>> | =A0 #define L2CAP_EXT_CTRL_REQSEQ =A0 =A0 =A0 =A0 =A00x0000FFFC
>> | =A0-
>> | =A0-#define L2CAP_EXT_CTRL_POLL =A0 =A0 =A0 =A0 =A0 =A00x00040000
>> | =A0+#define L2CAP_EXT_CTRL_TXSEQ =A0 =A0 =A0 =A0 0xFFFC0000
>> | =A0 #define L2CAP_EXT_CTRL_FINAL =A0 =A0 =A0 =A0 =A0 0x00000002
>> | =A0+#define L2CAP_EXT_CTRL_POLL =A0 =A0 =A0 =A0 =A00x00040000
>> | =A0+#define L2CAP_EXT_CTRL_SUPERVISE =A0 =A0 0x00030000
>> | =A0 #define L2CAP_EXT_CTRL_FRAME_TYPE =A0 =A0 =A00x00000001 /* I- or S=
-Frame */
>> <------8<---------------------------------------------------------------=
--
>>
>> and I don't like this kind of change:
>>
>> <------8<---------------------------------------------------------------=
-----
>> | =A0- =A0 =A0 =A0 if (__is_sar_start(chan, control) && !__is_sframe(cha=
n, control))
>> | =A0- =A0 =A0 =A0 =A0 =A0 =A0 =A0 len -=3D L2CAP_SDULEN_SIZE;
>> | =A0+ =A0 =A0 =A0 if ((control->frame_type =3D=3D 'i') &&
>> | =A0+ =A0 =A0 =A0 =A0 =A0 (control->sar =3D=3D L2CAP_SAR_START))
>> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 len -=3D 2;
>> |
>> | =A0 =A0 =A0 =A0 =A0if (chan->fcs =3D=3D L2CAP_FCS_CRC16)
>> | =A0- =A0 =A0 =A0 =A0 =A0 =A0 =A0 len -=3D L2CAP_FCS_SIZE;
>> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 len -=3D 2;
>> <------8<---------------------------------------------------------------=
-----
>>
>> why not to use macros and defines for magic numbers?
>>
>> the same below:
>>
>> <------8<---------------------------------------------------------------=
-
>> | =A0+ =A0 =A0 =A0 if (test_bit(FLAG_EXT_CTRL, &chan->flags)) {
>> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 __get_extended_control(get_unaligned_=
le32(skb->data),
>> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0control);
>> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 skb_pull(skb, 4);
>> | =A0+ =A0 =A0 =A0 } else {
>> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 __get_enhanced_control(get_unaligned_=
le16(skb->data),
>> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0control);
>> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 skb_pull(skb, 2);
>> | =A0+ =A0 =A0 =A0 }
>> |
>> | =A0- =A0 =A0 =A0 control =3D __get_control(chan, skb->data);
>> | =A0- =A0 =A0 =A0 skb_pull(skb, __ctrl_size(chan));
>> <------8<---------------------------------------------------------------=
-
>>
>> those magic number does not look nice IMO and the code is not looking an=
y
>> better.
>
> In this aspect perhaps, but we don't need to take the code as it is,
> but the point here is following the states defined by the spec and
> that is IMO much better.

I've taken a quick look last week at their code and indeed it looks
better regarding following the states. In particular they track
rx_state and tx_state and then decide what to do based on that and
what happened. I like it because it's explicit about that instead of
demanding us to reason a lot on what state we are and what we should
do. I'm all for changing our ERTM to that so it'll be more
maintainable.

Marcel mentioned the separation of L2CAP channel and socket. That is a
work in progress by Andrei and judging by what you said, Marcel, you
want that merged before we change ERTM, is that it?

Mat, thanks a lot for doing this. I have just a few questions. Have
you tested the stack with this patch? How was it? Quickly looking
through the code I feel it's almost there. I agree with Andrei
regarding constants and control field handling but apart from that it
looks good, in general.

Regarding delayed work handling, are you sure about usage of
__cancel_delayed_work()? I do think it's safer to use
cancel_delayed_work() instead as it'll just spin on a lock if the
timer to queue the work is running on another CPU.

I saw a comment about channel ref counting and kind of audit that
would be great. It'd be good to see a patch for that after transition
to new state handling.

Best regards,

--=20
Ulisses Furquim
ProFUSION embedded systems
http://profusion.mobi
Mobile: +55 19 9250 0942
Skype: ulissesffs

2012-02-08 09:32:50

by Luiz Augusto von Dentz

[permalink] [raw]

Subject: Re: Getting L2CAP ERTM support into better upstream state

Hi Andrei,

On Wed, Feb 8, 2012 at 11:09 AM, Andrei Emeltchenko
<[email protected]> wrote:
>
> I think this would be good to send as patch series so that people can
> comment. What comes to my mind is that the patch might be reduced if it
> does not change order of functions and defines like:
>
> <------8<----------------------------------------------------------------=
-
> | =A0-#define L2CAP_EXT_CTRL_TXSEQ =A0 =A0 =A0 =A0 =A0 0xFFFC0000
> | =A0 #define L2CAP_EXT_CTRL_SAR =A0 =A0 =A0 =A0 =A0 =A0 0x00030000
> | =A0-#define L2CAP_EXT_CTRL_SUPERVISE =A0 =A0 =A0 0x00030000
> | =A0 #define L2CAP_EXT_CTRL_REQSEQ =A0 =A0 =A0 =A0 =A00x0000FFFC
> | =A0-
> | =A0-#define L2CAP_EXT_CTRL_POLL =A0 =A0 =A0 =A0 =A0 =A00x00040000
> | =A0+#define L2CAP_EXT_CTRL_TXSEQ =A0 =A0 =A0 =A0 0xFFFC0000
> | =A0 #define L2CAP_EXT_CTRL_FINAL =A0 =A0 =A0 =A0 =A0 0x00000002
> | =A0+#define L2CAP_EXT_CTRL_POLL =A0 =A0 =A0 =A0 =A00x00040000
> | =A0+#define L2CAP_EXT_CTRL_SUPERVISE =A0 =A0 0x00030000
> | =A0 #define L2CAP_EXT_CTRL_FRAME_TYPE =A0 =A0 =A00x00000001 /* I- or S-=
Frame */
> <------8<----------------------------------------------------------------=
-
>
> and I don't like this kind of change:
>
> <------8<----------------------------------------------------------------=
----
> | =A0- =A0 =A0 =A0 if (__is_sar_start(chan, control) && !__is_sframe(chan=
, control))
> | =A0- =A0 =A0 =A0 =A0 =A0 =A0 =A0 len -=3D L2CAP_SDULEN_SIZE;
> | =A0+ =A0 =A0 =A0 if ((control->frame_type =3D=3D 'i') &&
> | =A0+ =A0 =A0 =A0 =A0 =A0 (control->sar =3D=3D L2CAP_SAR_START))
> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 len -=3D 2;
> |
> | =A0 =A0 =A0 =A0 =A0if (chan->fcs =3D=3D L2CAP_FCS_CRC16)
> | =A0- =A0 =A0 =A0 =A0 =A0 =A0 =A0 len -=3D L2CAP_FCS_SIZE;
> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 len -=3D 2;
> <------8<----------------------------------------------------------------=
----
>
> why not to use macros and defines for magic numbers?
>
> the same below:
>
> <------8<----------------------------------------------------------------
> | =A0+ =A0 =A0 =A0 if (test_bit(FLAG_EXT_CTRL, &chan->flags)) {
> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 __get_extended_control(get_unaligned_l=
e32(skb->data),
> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0control);
> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 skb_pull(skb, 4);
> | =A0+ =A0 =A0 =A0 } else {
> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 __get_enhanced_control(get_unaligned_l=
e16(skb->data),
> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0control);
> | =A0+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 skb_pull(skb, 2);
> | =A0+ =A0 =A0 =A0 }
> |
> | =A0- =A0 =A0 =A0 control =3D __get_control(chan, skb->data);
> | =A0- =A0 =A0 =A0 skb_pull(skb, __ctrl_size(chan));
> <------8<----------------------------------------------------------------
>
> those magic number does not look nice IMO and the code is not looking any
> better.

In this aspect perhaps, but we don't need to take the code as it is,
but the point here is following the states defined by the spec and
that is IMO much better.

--=20
Luiz Augusto von Dentz

2012-02-08 09:09:51

by Andrei Emeltchenko

[permalink] [raw]

Subject: Re: Getting L2CAP ERTM support into better upstream state

Hi Marcel,

On Wed, Feb 08, 2012 at 06:27:35AM +0100, Marcel Holtmann wrote:
> Hello everyone,
>
> we currently have the problem on our hand that our upstream L2CAP
> support is not as good as it should be. And many Android products start
> deviating from it to fix issues with. That is of course something we can
> not have. Fragmentation on this level is not useful for anyone. It will
> hurt everybody in the longterm. So I asked Mat to port over the QUIC
> code to the latest upstream so we can start a discussion here.
>
> https://www.codeaurora.org/gitweb/quic/bluetooth/?p=bluetooth-next.git;a=commit;h=d02ef660d1ec9e9312798561fd688a4c717d339e
>
> To make one thing perfectly clear here. I want that our upstream code
> reflects the state machine transition from the L2CAP specification. I
> know that historically this was not the case and that was for a reason.
> All early specification where pretty unclear in this area, but with the
> introduction of ERTM the Bluetooth SIG got this fixed and now it is our
> time to get this clearly into our code as well.
>
> With the separation of L2CAP channels from L2CAP sockets this should
> make it even more easier to use since we do not need and should not to
> use socket states anymore anyway.

I think this would be good to send as patch series so that people can
comment. What comes to my mind is that the patch might be reduced if it
does not change order of functions and defines like:

<------8<-----------------------------------------------------------------
| -#define L2CAP_EXT_CTRL_TXSEQ 0xFFFC0000
| #define L2CAP_EXT_CTRL_SAR 0x00030000
| -#define L2CAP_EXT_CTRL_SUPERVISE 0x00030000
| #define L2CAP_EXT_CTRL_REQSEQ 0x0000FFFC
| -
| -#define L2CAP_EXT_CTRL_POLL 0x00040000
| +#define L2CAP_EXT_CTRL_TXSEQ 0xFFFC0000
| #define L2CAP_EXT_CTRL_FINAL 0x00000002
| +#define L2CAP_EXT_CTRL_POLL 0x00040000
| +#define L2CAP_EXT_CTRL_SUPERVISE 0x00030000
| #define L2CAP_EXT_CTRL_FRAME_TYPE 0x00000001 /* I- or S-Frame */
<------8<-----------------------------------------------------------------

and I don't like this kind of change:

<------8<--------------------------------------------------------------------
| - if (__is_sar_start(chan, control) && !__is_sframe(chan, control))
| - len -= L2CAP_SDULEN_SIZE;
| + if ((control->frame_type == 'i') &&
| + (control->sar == L2CAP_SAR_START))
| + len -= 2;
|
| if (chan->fcs == L2CAP_FCS_CRC16)
| - len -= L2CAP_FCS_SIZE;
| + len -= 2;
<------8<--------------------------------------------------------------------

why not to use macros and defines for magic numbers?

the same below:

<------8<----------------------------------------------------------------
| + if (test_bit(FLAG_EXT_CTRL, &chan->flags)) {
| + __get_extended_control(get_unaligned_le32(skb->data),
| + control);
| + skb_pull(skb, 4);
| + } else {
| + __get_enhanced_control(get_unaligned_le16(skb->data),
| + control);
| + skb_pull(skb, 2);
| + }
|
| - control = __get_control(chan, skb->data);
| - skb_pull(skb, __ctrl_size(chan));
<------8<----------------------------------------------------------------

those magic number does not look nice IMO and the code is not looking any
better.

Best regards
Andrei Emeltchenko