2013-02-25 16:38:31

by Dean Jenkins

[permalink] [raw]
Subject: [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt

This patchset consists of the following 6 patches that reworks the RFCOMM
session refcnt by adding handling for a NULL RFCOMM session pointer when the
session has been deleted followed by removing the now redundant refcnt
mechanism:

----------------------------------------------------------------
Dean Jenkins (6):
Bluetooth: Avoid rfcomm_session_timeout using freed session
Bluetooth: Check rfcomm session and DLC exists on socket close
Bluetooth: Return RFCOMM session ptrs to avoid freed session
Bluetooth: Remove RFCOMM session refcnt
Bluetooth: Remove redundant call to rfcomm_send_disc
Bluetooth: Remove redundant RFCOMM BT_CLOSED settings

include/net/bluetooth/rfcomm.h | 6 ----
net/bluetooth/rfcomm/core.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------------------------------
2 files changed, 76 insertions(+), 93 deletions(-)

The motivation for the change was RFCOMM kernel crashes in a 2.6.34 based ARM
SMP kernel. It is noted that RFCOMM in kernel 3.7 still contains the RFCOMM
session refcnt so the patches have been forward ported. Crashes were due to
malfunctions of the RFCOMM session refcnt such as premature deletion of the
session and double deletion of the session. Some crashes were caused by
the interaction of 2 kernel threads under very high processor loading.

Note that the refcnt is incorrectly initialised, it should be set to 1 before
allowing the session structure to be used, the old code used a value of 0 and
this contributes to a weak implementation of the refcnt.

In kernels later than 2.6.34, the threading model for RFCOMM was modified to
use work queues and this reduced the probability of a RFCOMM crash. This
helps to explain why the reports of RFCOMM crashes went away. However, the
implementation of the RFCOMM session refcnt is weak and is in fact logically
incorrect by using the RFCOMM data channel "initiator" flag. The control
channel disconnection procedures do not care about how the control channel
was established eg. by host or remote device request.

The patches also make it clear when the RFCOMM session has been deleted by
using a NULL pointer. The old code has a possibility to access freed RFCOMM
session structures because there are multiple local copies of the pointer. The
patches ensure that local copies of the pointer are set to NULL when the
session has been deleted so avoiding any possibility of accessing freed memory.

Therefore, the patchset cleans up the disconnection of the RFCOMM session
control channel and now the code is deterministic and is understandable by
code review.

These patches were lightly tested against kernel 3.7.3 on a x86 PC. The design
of the patches were proven on a commercial project using a modified embedded ARM
SMP kernel 2.6.34.13. All previously observed RFCOMM control channel
disconnection kernel crashes on ARM 2.6.34.13 went away with the patches
applied.

Patch summaries are shown below:

0001: Bluetooth: Avoid rfcomm_session_timeout using freed session

Use del_timer_sync() instead of del_timer() as this ensures
that rfcomm_session_timeout() is not running on a different
CPU when rfcomm_session_put() is called. This avoids a race
condition on SMP systems because potentially
rfcomm_session_timeout() could reuse the freed RFCOMM session
structure caused by the execution of rfcomm_session_put().

Note that this modification makes the reason for the RFCOMM
session refcnt mechanism redundant.


0002: Bluetooth: Check rfcomm session and DLC exists on socket close

A race condition exists between near simultaneous asynchronous
DLC data channel disconnection requests from the host and remote device.
This causes the socket layer to request a socket shutdown at the same
time the rfcomm core is processing the disconnect request from the remote
device.

The socket layer retains a copy of a struct rfcomm_dlc d pointer.
The d pointer refers to a copy of a struct rfcomm_session.
When the socket layer thread performs a socket shutdown, the thread
may wait on a rfcomm lock in rfcomm_dlc_close(). This means that
whilst the thread waits, the rfcomm_session and/or rfcomm_dlc structures
pointed to by d maybe freed due to rfcomm core handling. Consequently,
when the rfcomm lock becomes available and the thread runs, a
malfunction could occur as a freed rfcomm_session structure and/or a
freed rfcomm_dlc structure will be erroneously accessed.

Therefore, after the rfcomm lock is acquired, check that the struct
rfcomm_session is still valid by searching the rfcomm session list.
If the session is valid then validate the d pointer by searching the
rfcomm session list of active DLCs for the rfcomm_dlc structure
pointed by d.


0003: Bluetooth: Return RFCOMM session ptrs to avoid freed session

Unfortunately, the design retains local copies of the s RFCOMM
session pointer in various code blocks and this invites the erroneous
access to a freed RFCOMM session structure.

Therefore, return the RFCOMM session pointer back up the call stack
to avoid accessing a freed RFCOMM session structure. When the RFCOMM
session is deleted, NULL is passed up the call stack.

If active DLCs exist when the rfcomm session is terminating,
avoid a memory leak of rfcomm_dlc structures by ensuring that
rfcomm_session_close() is used instead of rfcomm_session_del().


0004: Bluetooth: Remove RFCOMM session refcnt

Previous commits have improved the handling of the RFCOMM session
timer and the RFCOMM session pointers such that freed RFCOMM
session structures should no longer be erroneously accessed. The
RFCOMM session refcnt now has no purpose and will be deleted by
this commit.

Note that the RFCOMM session is now deleted as soon as the
RFCOMM control channel link is no longer required. This makes the
lifetime of the RFCOMM session deterministic and absolute.
Previously with the refcnt, there was uncertainty about when
the session structure would be deleted because the relative
refcnt prevented the session structure from being deleted at will.

It was noted that the refcnt could malfunction under very heavy
real-time processor loading in embedded SMP environments. This
could cause premature RFCOMM session deletion or double session
deletion that could result in kernel crashes. Removal of the
refcnt prevents this issue.

There are 4 connection / disconnection RFCOMM session scenarios:
host initiated control link ---> host disconnected control link
host initiated ctrl link ---> remote device disconnected ctrl link
remote device initiated ctrl link ---> host disconnected ctrl link
remote device initiated ctrl link ---> remote device disc'ed ctrl link

The control channel connection procedures are independent of the
disconnection procedures. Strangely, the RFCOMM session refcnt was
applying special treatment so erroneously combining connection and
disconnection events. This commit fixes this issue by removing
some session code that used the "initiator" member of the session
structure that was intended for use with the data channels.


0005: Bluetooth: Remove redundant call to rfcomm_send_disc

In rfcomm_session_del() remove the redundant call to
rfcomm_send_disc() because it is not possible for the
session to be in BT_CONNECTED state during deletion
of the session.


0006: Bluetooth: Remove redundant RFCOMM BT_CLOSED settings

rfcomm_session_close() sets the RFCOMM session state to BT_CLOSED.
However, in multiple places immediately before the function is
called, the RFCOMM session is set to BT_CLOSED. Therefore,
remove these unnecessary state settings.



2013-02-26 19:21:07

by Gustavo Padovan

[permalink] [raw]
Subject: Re: [PATCH 3/6] Bluetooth: Return RFCOMM session ptrs to avoid freed session

Hi Dean,

* [email protected] <[email protected]> [2013-02-25 16:38:34 +0000]:

> From: Dean Jenkins <[email protected]>
>
> Unfortunately, the design retains local copies of the s RFCOMM
> session pointer in various code blocks and this invites the erroneous
> access to a freed RFCOMM session structure.
>
> Therefore, return the RFCOMM session pointer back up the call stack
> to avoid accessing a freed RFCOMM session structure. When the RFCOMM
> session is deleted, NULL is passed up the call stack.
>
> If active DLCs exist when the rfcomm session is terminating,
> avoid a memory leak of rfcomm_dlc structures by ensuring that
> rfcomm_session_close() is used instead of rfcomm_session_del().
>
> Signed-off-by: Dean Jenkins <[email protected]>
> ---
> include/net/bluetooth/rfcomm.h | 3 +-
> net/bluetooth/rfcomm/core.c | 106 +++++++++++++++++++++-------------------
> 2 files changed, 58 insertions(+), 51 deletions(-)
>
> diff --git a/include/net/bluetooth/rfcomm.h b/include/net/bluetooth/rfcomm.h
> index e2e3eca..a4e38ea 100644
> --- a/include/net/bluetooth/rfcomm.h
> +++ b/include/net/bluetooth/rfcomm.h
> @@ -278,7 +278,8 @@ void rfcomm_session_getaddr(struct rfcomm_session *s, bdaddr_t *src,
>
> static inline void rfcomm_session_hold(struct rfcomm_session *s)
> {
> - atomic_inc(&s->refcnt);
> + if (s)
> + atomic_inc(&s->refcnt);
> }
>
> /* ---- RFCOMM sockets ---- */
> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> index af0c26d..60d2f1a 100644
> --- a/net/bluetooth/rfcomm/core.c
> +++ b/net/bluetooth/rfcomm/core.c
> @@ -69,7 +69,7 @@ static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
> u8 sec_level,
> int *err);
> static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst);
> -static void rfcomm_session_del(struct rfcomm_session *s);
> +static struct rfcomm_session *rfcomm_session_del(struct rfcomm_session *s);
>
> /* ---- RFCOMM frame parsing macros ---- */
> #define __get_dlci(b) ((b & 0xfc) >> 2)
> @@ -108,10 +108,12 @@ static void rfcomm_schedule(void)
> wake_up_process(rfcomm_thread);
> }
>
> -static void rfcomm_session_put(struct rfcomm_session *s)
> +static struct rfcomm_session *rfcomm_session_put(struct rfcomm_session *s)
> {
> - if (atomic_dec_and_test(&s->refcnt))
> - rfcomm_session_del(s);
> + if (s && atomic_dec_and_test(&s->refcnt))
> + s = rfcomm_session_del(s);
> +
> + return s;
> }
>
> /* ---- RFCOMM FCS computation ---- */
> @@ -631,7 +633,7 @@ static struct rfcomm_session *rfcomm_session_add(struct socket *sock, int state)
> return s;
> }
>
> -static void rfcomm_session_del(struct rfcomm_session *s)
> +static struct rfcomm_session *rfcomm_session_del(struct rfcomm_session *s)
> {
> int state = s->state;
>
> @@ -648,6 +650,8 @@ static void rfcomm_session_del(struct rfcomm_session *s)
>
> if (state != BT_LISTEN)
> module_put(THIS_MODULE);
> +
> + return NULL;
> }
>
> static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst)
> @@ -666,7 +670,8 @@ static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst)
> return NULL;
> }
>
> -static void rfcomm_session_close(struct rfcomm_session *s, int err)
> +static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
> + int err)

wrong identation here, you have to align with the opening parenthesis:

static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
int err)



> {
> struct rfcomm_dlc *d;
> struct list_head *p, *n;
> @@ -685,7 +690,7 @@ static void rfcomm_session_close(struct rfcomm_session *s, int err)
> }
>
> rfcomm_session_clear_timer(s);
> - rfcomm_session_put(s);
> + return rfcomm_session_put(s);
> }
>
> static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
> @@ -737,8 +742,7 @@ static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
> if (*err == 0 || *err == -EINPROGRESS)
> return s;
>
> - rfcomm_session_del(s);
> - return NULL;
> + return rfcomm_session_del(s);
>
> failed:
> sock_release(sock);
> @@ -1127,7 +1131,7 @@ static void rfcomm_make_uih(struct sk_buff *skb, u8 addr)
> }
>
> /* ---- RFCOMM frame reception ---- */
> -static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
> +static struct rfcomm_session *rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
> {
> BT_DBG("session %p state %ld dlci %d", s, s->state, dlci);
>
> @@ -1136,7 +1140,7 @@ static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
> struct rfcomm_dlc *d = rfcomm_dlc_get(s, dlci);
> if (!d) {
> rfcomm_send_dm(s, dlci);
> - return 0;
> + return s;
> }
>
> switch (d->state) {
> @@ -1172,25 +1176,14 @@ static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
> break;
>
> case BT_DISCONN:
> - /* rfcomm_session_put is called later so don't do
> - * anything here otherwise we will mess up the session
> - * reference counter:
> - *
> - * (a) when we are the initiator dlc_unlink will drive
> - * the reference counter to 0 (there is no initial put
> - * after session_add)
> - *
> - * (b) when we are not the initiator rfcomm_rx_process
> - * will explicitly call put to balance the initial hold
> - * done after session add.
> - */
> + s = rfcomm_session_close(s, ECONNRESET);
> break;
> }
> }
> - return 0;
> + return s;
> }
>
> -static int rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
> +static struct rfcomm_session *rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
> {
> int err = 0;
>
> @@ -1215,12 +1208,13 @@ static int rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
> err = ECONNRESET;
>
> s->state = BT_CLOSED;
> - rfcomm_session_close(s, err);
> + s = rfcomm_session_close(s, err);
> }
> - return 0;
> + return s;
> }
>
> -static int rfcomm_recv_disc(struct rfcomm_session *s, u8 dlci)
> +static struct rfcomm_session *rfcomm_recv_disc(struct rfcomm_session *s,
> + u8 dlci)
> {
> int err = 0;
>
> @@ -1250,10 +1244,9 @@ static int rfcomm_recv_disc(struct rfcomm_session *s, u8 dlci)
> err = ECONNRESET;
>
> s->state = BT_CLOSED;
> - rfcomm_session_close(s, err);
> + s = rfcomm_session_close(s, err);
> }
> -
> - return 0;
> + return s;
> }
>
> void rfcomm_dlc_accept(struct rfcomm_dlc *d)
> @@ -1674,11 +1667,18 @@ drop:
> return 0;
> }
>
> -static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
> +static struct rfcomm_session *rfcomm_recv_frame(struct rfcomm_session *s,
> + struct sk_buff *skb)

Same here.

Please fix this. You can collect Marcel's Ack and add them to your patches and
resend and updated version with these fixes in it.

Gustavo

2013-02-26 19:12:35

by Gustavo Padovan

[permalink] [raw]
Subject: Re: [PATCH 2/6] Bluetooth: Check rfcomm session and DLC exists on socket close

Hi Dean,

* [email protected] <[email protected]> [2013-02-25 16:38:33 +0000]:

> From: Dean Jenkins <[email protected]>
>
> A race condition exists between near simultaneous asynchronous
> DLC data channel disconnection requests from the host and remote device.
> This causes the socket layer to request a socket shutdown at the same
> time the rfcomm core is processing the disconnect request from the remote
> device.
>
> The socket layer retains a copy of a struct rfcomm_dlc d pointer.
> The d pointer refers to a copy of a struct rfcomm_session.
> When the socket layer thread performs a socket shutdown, the thread
> may wait on a rfcomm lock in rfcomm_dlc_close(). This means that
> whilst the thread waits, the rfcomm_session and/or rfcomm_dlc structures
> pointed to by d maybe freed due to rfcomm core handling. Consequently,
> when the rfcomm lock becomes available and the thread runs, a
> malfunction could occur as a freed rfcomm_session structure and/or a
> freed rfcomm_dlc structure will be erroneously accessed.
>
> Therefore, after the rfcomm lock is acquired, check that the struct
> rfcomm_session is still valid by searching the rfcomm session list.
> If the session is valid then validate the d pointer by searching the
> rfcomm session list of active DLCs for the rfcomm_dlc structure
> pointed by d.
>
> Signed-off-by: Dean Jenkins <[email protected]>
> ---
> net/bluetooth/rfcomm/core.c | 26 ++++++++++++++++++++++++--
> 1 file changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> index 8780e67..af0c26d 100644
> --- a/net/bluetooth/rfcomm/core.c
> +++ b/net/bluetooth/rfcomm/core.c
> @@ -493,12 +493,34 @@ static int __rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
>
> int rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
> {
> - int r;
> + int r = 0;
> + struct rfcomm_dlc *d_list;
> + struct rfcomm_session *s, *s_list;
> +
> + BT_DBG("dlc %p state %ld dlci %d err %d", d, d->state, d->dlci, err);
>
> rfcomm_lock();
>
> - r = __rfcomm_dlc_close(d, err);
> + s = d->session;
> + if (!s)
> + goto no_session;
> +
> + /* after waiting on the mutex check the session still exists */
> + list_for_each_entry(s_list, &session_list, list) {
> + if (s_list == s) {
> + /* after waiting on the mutex, */
> + /* check the dlc still exists */

Just a very small issue, this comment here is not following our coding style.
If you want a multiple line comment do something like this:

/* after waiting on the mutex,
* check the dlc still exists
*/

But I will just recommend you merge this comment with the one before the first
list_for_each_entry. please send an updated version of this patch so I can go
ahead and apply it.

Gustavo

2013-02-26 18:13:12

by Dean Jenkins

[permalink] [raw]
Subject: Re: [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt

Hi Marcel,

On 25/02/13 18:32, Marcel Holtmann wrote:
> Hi Dean,
>
>> This patchset consists of the following 6 patches that reworks the RFCOMM
>> session refcnt by adding handling for a NULL RFCOMM session pointer when the
>> session has been deleted followed by removing the now redundant refcnt
>> mechanism:
>>
>> ----------------------------------------------------------------
>> Dean Jenkins (6):
>> Bluetooth: Avoid rfcomm_session_timeout using freed session
>> Bluetooth: Check rfcomm session and DLC exists on socket close
>> Bluetooth: Return RFCOMM session ptrs to avoid freed session
>> Bluetooth: Remove RFCOMM session refcnt
>> Bluetooth: Remove redundant call to rfcomm_send_disc
>> Bluetooth: Remove redundant RFCOMM BT_CLOSED settings
>>
>> include/net/bluetooth/rfcomm.h | 6 ----
>> net/bluetooth/rfcomm/core.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------------------------------
>> 2 files changed, 76 insertions(+), 93 deletions(-)
>>
>> The motivation for the change was RFCOMM kernel crashes in a 2.6.34 based ARM
>> SMP kernel. It is noted that RFCOMM in kernel 3.7 still contains the RFCOMM
>> session refcnt so the patches have been forward ported. Crashes were due to
>> malfunctions of the RFCOMM session refcnt such as premature deletion of the
>> session and double deletion of the session. Some crashes were caused by
>> the interaction of 2 kernel threads under very high processor loading.
>>
>> Note that the refcnt is incorrectly initialised, it should be set to 1 before
>> allowing the session structure to be used, the old code used a value of 0 and
>> this contributes to a weak implementation of the refcnt.
>>
>> In kernels later than 2.6.34, the threading model for RFCOMM was modified to
>> use work queues and this reduced the probability of a RFCOMM crash. This
>> helps to explain why the reports of RFCOMM crashes went away. However, the
>> implementation of the RFCOMM session refcnt is weak and is in fact logically
>> incorrect by using the RFCOMM data channel "initiator" flag. The control
>> channel disconnection procedures do not care about how the control channel
>> was established eg. by host or remote device request.
>>
>> The patches also make it clear when the RFCOMM session has been deleted by
>> using a NULL pointer. The old code has a possibility to access freed RFCOMM
>> session structures because there are multiple local copies of the pointer. The
>> patches ensure that local copies of the pointer are set to NULL when the
>> session has been deleted so avoiding any possibility of accessing freed memory.
>>
>> Therefore, the patchset cleans up the disconnection of the RFCOMM session
>> control channel and now the code is deterministic and is understandable by
>> code review.
>>
>> These patches were lightly tested against kernel 3.7.3 on a x86 PC. The design
>> of the patches were proven on a commercial project using a modified embedded ARM
>> SMP kernel 2.6.34.13. All previously observed RFCOMM control channel
>> disconnection kernel crashes on ARM 2.6.34.13 went away with the patches
>> applied.
> I looked through the patch set and could not spot anything obvious
> wrong. Thanks for looking into this. The RFCOMM reference counting has
> been always giving me a headache.
>
> Great job on the cover page and the commit messages. It was a pleasure
> to read them and then go through the code. Makes review a lot easier.
>
> On my account, lets get these patches in and see how they are doing.
> Once they are in bluetooth-next they get a bit wider expose on upstream
> kernels.
>
> Acked-by: Marcel Holtmann <[email protected]>
>
> Regards
>
> Marcel
>
>
Many thanks for the encouraging feedback.

I wait with eager anticipation to see my patches in kernel.org and to
see what feedback comes from upstream.

Thanks,

Regards,
Dean

2013-02-25 23:08:41

by David Herrmann

[permalink] [raw]
Subject: Re: [PATCH 4/6] Bluetooth: Remove RFCOMM session refcnt

Hi

On Mon, Feb 25, 2013 at 5:38 PM, <[email protected]> wrote:
> From: Dean Jenkins <[email protected]>
>
> Previous commits have improved the handling of the RFCOMM session
> timer and the RFCOMM session pointers such that freed RFCOMM
> session structures should no longer be erroneously accessed. The
> RFCOMM session refcnt now has no purpose and will be deleted by
> this commit.
>
> Note that the RFCOMM session is now deleted as soon as the
> RFCOMM control channel link is no longer required. This makes the
> lifetime of the RFCOMM session deterministic and absolute.
> Previously with the refcnt, there was uncertainty about when
> the session structure would be deleted because the relative
> refcnt prevented the session structure from being deleted at will.
>
> It was noted that the refcnt could malfunction under very heavy
> real-time processor loading in embedded SMP environments. This
> could cause premature RFCOMM session deletion or double session
> deletion that could result in kernel crashes. Removal of the
> refcnt prevents this issue.
>
> There are 4 connection / disconnection RFCOMM session scenarios:
> host initiated control link ---> host disconnected control link
> host initiated ctrl link ---> remote device disconnected ctrl link
> remote device initiated ctrl link ---> host disconnected ctrl link
> remote device initiated ctrl link ---> remote device disc'ed ctrl link
>
> The control channel connection procedures are independent of the
> disconnection procedures. Strangely, the RFCOMM session refcnt was
> applying special treatment so erroneously combining connection and
> disconnection events. This commit fixes this issue by removing
> some session code that used the "initiator" member of the session
> structure that was intended for use with the data channels.

I tried to do a similar cleanup for the HIDP code (pending on the ML).
Did you check whether the "device_find_child()" and "device_move()"
calls in hci_sysfs.c hci_conn_del_sysfs() are still needed with this
cleanup? The HIDP cleanup doesn't need it, anymore, and if the rfcomm
code got rid of it, too, then we can finally drop that code.

I really think the *_hold() calls in the Bluetooth-core are flawed. We
should try to fix it to split ref-counting and life-time tracking.
Ref-counting for objects can be non-deterministic and this is totally
ok. But life-time tracking must be deterministic. That is, the
device_del() calls must be predictable so we can tell when objects get
deleted. Whether they get freed is minor, this doesn't harm anybody.

Thanks for the nice cleanup.
David

2013-02-25 18:32:44

by Marcel Holtmann

[permalink] [raw]
Subject: Re: [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt

Hi Dean,

> This patchset consists of the following 6 patches that reworks the RFCOMM
> session refcnt by adding handling for a NULL RFCOMM session pointer when the
> session has been deleted followed by removing the now redundant refcnt
> mechanism:
>
> ----------------------------------------------------------------
> Dean Jenkins (6):
> Bluetooth: Avoid rfcomm_session_timeout using freed session
> Bluetooth: Check rfcomm session and DLC exists on socket close
> Bluetooth: Return RFCOMM session ptrs to avoid freed session
> Bluetooth: Remove RFCOMM session refcnt
> Bluetooth: Remove redundant call to rfcomm_send_disc
> Bluetooth: Remove redundant RFCOMM BT_CLOSED settings
>
> include/net/bluetooth/rfcomm.h | 6 ----
> net/bluetooth/rfcomm/core.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------------------------------
> 2 files changed, 76 insertions(+), 93 deletions(-)
>
> The motivation for the change was RFCOMM kernel crashes in a 2.6.34 based ARM
> SMP kernel. It is noted that RFCOMM in kernel 3.7 still contains the RFCOMM
> session refcnt so the patches have been forward ported. Crashes were due to
> malfunctions of the RFCOMM session refcnt such as premature deletion of the
> session and double deletion of the session. Some crashes were caused by
> the interaction of 2 kernel threads under very high processor loading.
>
> Note that the refcnt is incorrectly initialised, it should be set to 1 before
> allowing the session structure to be used, the old code used a value of 0 and
> this contributes to a weak implementation of the refcnt.
>
> In kernels later than 2.6.34, the threading model for RFCOMM was modified to
> use work queues and this reduced the probability of a RFCOMM crash. This
> helps to explain why the reports of RFCOMM crashes went away. However, the
> implementation of the RFCOMM session refcnt is weak and is in fact logically
> incorrect by using the RFCOMM data channel "initiator" flag. The control
> channel disconnection procedures do not care about how the control channel
> was established eg. by host or remote device request.
>
> The patches also make it clear when the RFCOMM session has been deleted by
> using a NULL pointer. The old code has a possibility to access freed RFCOMM
> session structures because there are multiple local copies of the pointer. The
> patches ensure that local copies of the pointer are set to NULL when the
> session has been deleted so avoiding any possibility of accessing freed memory.
>
> Therefore, the patchset cleans up the disconnection of the RFCOMM session
> control channel and now the code is deterministic and is understandable by
> code review.
>
> These patches were lightly tested against kernel 3.7.3 on a x86 PC. The design
> of the patches were proven on a commercial project using a modified embedded ARM
> SMP kernel 2.6.34.13. All previously observed RFCOMM control channel
> disconnection kernel crashes on ARM 2.6.34.13 went away with the patches
> applied.

I looked through the patch set and could not spot anything obvious
wrong. Thanks for looking into this. The RFCOMM reference counting has
been always giving me a headache.

Great job on the cover page and the commit messages. It was a pleasure
to read them and then go through the code. Makes review a lot easier.

On my account, lets get these patches in and see how they are doing.
Once they are in bluetooth-next they get a bit wider expose on upstream
kernels.

Acked-by: Marcel Holtmann <[email protected]>

Regards

Marcel



2013-02-25 16:38:34

by Dean Jenkins

[permalink] [raw]
Subject: [PATCH 3/6] Bluetooth: Return RFCOMM session ptrs to avoid freed session

From: Dean Jenkins <[email protected]>

Unfortunately, the design retains local copies of the s RFCOMM
session pointer in various code blocks and this invites the erroneous
access to a freed RFCOMM session structure.

Therefore, return the RFCOMM session pointer back up the call stack
to avoid accessing a freed RFCOMM session structure. When the RFCOMM
session is deleted, NULL is passed up the call stack.

If active DLCs exist when the rfcomm session is terminating,
avoid a memory leak of rfcomm_dlc structures by ensuring that
rfcomm_session_close() is used instead of rfcomm_session_del().

Signed-off-by: Dean Jenkins <[email protected]>
---
include/net/bluetooth/rfcomm.h | 3 +-
net/bluetooth/rfcomm/core.c | 106 +++++++++++++++++++++-------------------
2 files changed, 58 insertions(+), 51 deletions(-)

diff --git a/include/net/bluetooth/rfcomm.h b/include/net/bluetooth/rfcomm.h
index e2e3eca..a4e38ea 100644
--- a/include/net/bluetooth/rfcomm.h
+++ b/include/net/bluetooth/rfcomm.h
@@ -278,7 +278,8 @@ void rfcomm_session_getaddr(struct rfcomm_session *s, bdaddr_t *src,

static inline void rfcomm_session_hold(struct rfcomm_session *s)
{
- atomic_inc(&s->refcnt);
+ if (s)
+ atomic_inc(&s->refcnt);
}

/* ---- RFCOMM sockets ---- */
diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index af0c26d..60d2f1a 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -69,7 +69,7 @@ static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
u8 sec_level,
int *err);
static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst);
-static void rfcomm_session_del(struct rfcomm_session *s);
+static struct rfcomm_session *rfcomm_session_del(struct rfcomm_session *s);

/* ---- RFCOMM frame parsing macros ---- */
#define __get_dlci(b) ((b & 0xfc) >> 2)
@@ -108,10 +108,12 @@ static void rfcomm_schedule(void)
wake_up_process(rfcomm_thread);
}

-static void rfcomm_session_put(struct rfcomm_session *s)
+static struct rfcomm_session *rfcomm_session_put(struct rfcomm_session *s)
{
- if (atomic_dec_and_test(&s->refcnt))
- rfcomm_session_del(s);
+ if (s && atomic_dec_and_test(&s->refcnt))
+ s = rfcomm_session_del(s);
+
+ return s;
}

/* ---- RFCOMM FCS computation ---- */
@@ -631,7 +633,7 @@ static struct rfcomm_session *rfcomm_session_add(struct socket *sock, int state)
return s;
}

-static void rfcomm_session_del(struct rfcomm_session *s)
+static struct rfcomm_session *rfcomm_session_del(struct rfcomm_session *s)
{
int state = s->state;

@@ -648,6 +650,8 @@ static void rfcomm_session_del(struct rfcomm_session *s)

if (state != BT_LISTEN)
module_put(THIS_MODULE);
+
+ return NULL;
}

static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst)
@@ -666,7 +670,8 @@ static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst)
return NULL;
}

-static void rfcomm_session_close(struct rfcomm_session *s, int err)
+static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
+ int err)
{
struct rfcomm_dlc *d;
struct list_head *p, *n;
@@ -685,7 +690,7 @@ static void rfcomm_session_close(struct rfcomm_session *s, int err)
}

rfcomm_session_clear_timer(s);
- rfcomm_session_put(s);
+ return rfcomm_session_put(s);
}

static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
@@ -737,8 +742,7 @@ static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
if (*err == 0 || *err == -EINPROGRESS)
return s;

- rfcomm_session_del(s);
- return NULL;
+ return rfcomm_session_del(s);

failed:
sock_release(sock);
@@ -1127,7 +1131,7 @@ static void rfcomm_make_uih(struct sk_buff *skb, u8 addr)
}

/* ---- RFCOMM frame reception ---- */
-static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
+static struct rfcomm_session *rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
{
BT_DBG("session %p state %ld dlci %d", s, s->state, dlci);

@@ -1136,7 +1140,7 @@ static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
struct rfcomm_dlc *d = rfcomm_dlc_get(s, dlci);
if (!d) {
rfcomm_send_dm(s, dlci);
- return 0;
+ return s;
}

switch (d->state) {
@@ -1172,25 +1176,14 @@ static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
break;

case BT_DISCONN:
- /* rfcomm_session_put is called later so don't do
- * anything here otherwise we will mess up the session
- * reference counter:
- *
- * (a) when we are the initiator dlc_unlink will drive
- * the reference counter to 0 (there is no initial put
- * after session_add)
- *
- * (b) when we are not the initiator rfcomm_rx_process
- * will explicitly call put to balance the initial hold
- * done after session add.
- */
+ s = rfcomm_session_close(s, ECONNRESET);
break;
}
}
- return 0;
+ return s;
}

-static int rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
+static struct rfcomm_session *rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
{
int err = 0;

@@ -1215,12 +1208,13 @@ static int rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
err = ECONNRESET;

s->state = BT_CLOSED;
- rfcomm_session_close(s, err);
+ s = rfcomm_session_close(s, err);
}
- return 0;
+ return s;
}

-static int rfcomm_recv_disc(struct rfcomm_session *s, u8 dlci)
+static struct rfcomm_session *rfcomm_recv_disc(struct rfcomm_session *s,
+ u8 dlci)
{
int err = 0;

@@ -1250,10 +1244,9 @@ static int rfcomm_recv_disc(struct rfcomm_session *s, u8 dlci)
err = ECONNRESET;

s->state = BT_CLOSED;
- rfcomm_session_close(s, err);
+ s = rfcomm_session_close(s, err);
}
-
- return 0;
+ return s;
}

void rfcomm_dlc_accept(struct rfcomm_dlc *d)
@@ -1674,11 +1667,18 @@ drop:
return 0;
}

-static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
+static struct rfcomm_session *rfcomm_recv_frame(struct rfcomm_session *s,
+ struct sk_buff *skb)
{
struct rfcomm_hdr *hdr = (void *) skb->data;
u8 type, dlci, fcs;

+ if (!s) {
+ /* no session, so free socket data */
+ kfree_skb(skb);
+ return s;
+ }
+
dlci = __get_dlci(hdr->addr);
type = __get_type(hdr->ctrl);

@@ -1689,7 +1689,7 @@ static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
if (__check_fcs(skb->data, type, fcs)) {
BT_ERR("bad checksum in packet");
kfree_skb(skb);
- return -EILSEQ;
+ return s;
}

if (__test_ea(hdr->len))
@@ -1705,22 +1705,23 @@ static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)

case RFCOMM_DISC:
if (__test_pf(hdr->ctrl))
- rfcomm_recv_disc(s, dlci);
+ s = rfcomm_recv_disc(s, dlci);
break;

case RFCOMM_UA:
if (__test_pf(hdr->ctrl))
- rfcomm_recv_ua(s, dlci);
+ s = rfcomm_recv_ua(s, dlci);
break;

case RFCOMM_DM:
- rfcomm_recv_dm(s, dlci);
+ s = rfcomm_recv_dm(s, dlci);
break;

case RFCOMM_UIH:
- if (dlci)
- return rfcomm_recv_data(s, dlci, __test_pf(hdr->ctrl), skb);
-
+ if (dlci) {
+ rfcomm_recv_data(s, dlci, __test_pf(hdr->ctrl), skb);
+ return s;
+ }
rfcomm_recv_mcc(s, skb);
break;

@@ -1729,7 +1730,7 @@ static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
break;
}
kfree_skb(skb);
- return 0;
+ return s;
}

/* ---- Connection and data processing ---- */
@@ -1866,7 +1867,7 @@ static void rfcomm_process_dlcs(struct rfcomm_session *s)
}
}

-static void rfcomm_process_rx(struct rfcomm_session *s)
+static struct rfcomm_session *rfcomm_process_rx(struct rfcomm_session *s)
{
struct socket *sock = s->sock;
struct sock *sk = sock->sk;
@@ -1878,17 +1879,20 @@ static void rfcomm_process_rx(struct rfcomm_session *s)
while ((skb = skb_dequeue(&sk->sk_receive_queue))) {
skb_orphan(skb);
if (!skb_linearize(skb))
- rfcomm_recv_frame(s, skb);
+ s = rfcomm_recv_frame(s, skb);
else
kfree_skb(skb);
}

- if (sk->sk_state == BT_CLOSED) {
+ if (s && (sk->sk_state == BT_CLOSED)) {
if (!s->initiator)
- rfcomm_session_put(s);
+ s = rfcomm_session_put(s);

- rfcomm_session_close(s, sk->sk_err);
+ if (s)
+ s = rfcomm_session_close(s, sk->sk_err);
}
+
+ return s;
}

static void rfcomm_accept_connection(struct rfcomm_session *s)
@@ -1925,7 +1929,7 @@ static void rfcomm_accept_connection(struct rfcomm_session *s)
sock_release(nsock);
}

-static void rfcomm_check_connection(struct rfcomm_session *s)
+static struct rfcomm_session *rfcomm_check_connection(struct rfcomm_session *s)
{
struct sock *sk = s->sock->sk;

@@ -1944,9 +1948,10 @@ static void rfcomm_check_connection(struct rfcomm_session *s)

case BT_CLOSED:
s->state = BT_CLOSED;
- rfcomm_session_close(s, sk->sk_err);
+ s = rfcomm_session_close(s, sk->sk_err);
break;
}
+ return s;
}

static void rfcomm_process_sessions(void)
@@ -1975,15 +1980,16 @@ static void rfcomm_process_sessions(void)

switch (s->state) {
case BT_BOUND:
- rfcomm_check_connection(s);
+ s = rfcomm_check_connection(s);
break;

default:
- rfcomm_process_rx(s);
+ s = rfcomm_process_rx(s);
break;
}

- rfcomm_process_dlcs(s);
+ if (s)
+ rfcomm_process_dlcs(s);

rfcomm_session_put(s);
}
--
1.7.10.1

2013-02-25 16:38:37

by Dean Jenkins

[permalink] [raw]
Subject: [PATCH 6/6] Bluetooth: Remove redundant RFCOMM BT_CLOSED settings

From: Dean Jenkins <[email protected]>

rfcomm_session_close() sets the RFCOMM session state to BT_CLOSED.
However, in multiple places immediately before the function is
called, the RFCOMM session is set to BT_CLOSED. Therefore,
remove these unnecessary state settings.

Signed-off-by: Dean Jenkins <[email protected]>
---
net/bluetooth/rfcomm/core.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index 3bdaf2e..2e82b60 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -659,10 +659,10 @@ static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
struct rfcomm_dlc *d;
struct list_head *p, *n;

- BT_DBG("session %p state %ld err %d", s, s->state, err);
-
s->state = BT_CLOSED;

+ BT_DBG("session %p state %ld err %d", s, s->state, err);
+
/* Close all dlcs */
list_for_each_safe(p, n, &s->dlcs) {
d = list_entry(p, struct rfcomm_dlc, list);
@@ -1188,7 +1188,6 @@ static struct rfcomm_session *rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
else
err = ECONNRESET;

- s->state = BT_CLOSED;
s = rfcomm_session_close(s, err);
}
return s;
@@ -1224,7 +1223,6 @@ static struct rfcomm_session *rfcomm_recv_disc(struct rfcomm_session *s,
else
err = ECONNRESET;

- s->state = BT_CLOSED;
s = rfcomm_session_close(s, err);
}
return s;
@@ -1921,7 +1919,6 @@ static struct rfcomm_session *rfcomm_check_connection(struct rfcomm_session *s)
break;

case BT_CLOSED:
- s->state = BT_CLOSED;
s = rfcomm_session_close(s, sk->sk_err);
break;
}
--
1.7.10.1

2013-02-25 16:38:35

by Dean Jenkins

[permalink] [raw]
Subject: [PATCH 4/6] Bluetooth: Remove RFCOMM session refcnt

From: Dean Jenkins <[email protected]>

Previous commits have improved the handling of the RFCOMM session
timer and the RFCOMM session pointers such that freed RFCOMM
session structures should no longer be erroneously accessed. The
RFCOMM session refcnt now has no purpose and will be deleted by
this commit.

Note that the RFCOMM session is now deleted as soon as the
RFCOMM control channel link is no longer required. This makes the
lifetime of the RFCOMM session deterministic and absolute.
Previously with the refcnt, there was uncertainty about when
the session structure would be deleted because the relative
refcnt prevented the session structure from being deleted at will.

It was noted that the refcnt could malfunction under very heavy
real-time processor loading in embedded SMP environments. This
could cause premature RFCOMM session deletion or double session
deletion that could result in kernel crashes. Removal of the
refcnt prevents this issue.

There are 4 connection / disconnection RFCOMM session scenarios:
host initiated control link ---> host disconnected control link
host initiated ctrl link ---> remote device disconnected ctrl link
remote device initiated ctrl link ---> host disconnected ctrl link
remote device initiated ctrl link ---> remote device disc'ed ctrl link

The control channel connection procedures are independent of the
disconnection procedures. Strangely, the RFCOMM session refcnt was
applying special treatment so erroneously combining connection and
disconnection events. This commit fixes this issue by removing
some session code that used the "initiator" member of the session
structure that was intended for use with the data channels.

Signed-off-by: Dean Jenkins <[email protected]>
---
include/net/bluetooth/rfcomm.h | 7 -------
net/bluetooth/rfcomm/core.c | 43 +++++-----------------------------------
2 files changed, 5 insertions(+), 45 deletions(-)

diff --git a/include/net/bluetooth/rfcomm.h b/include/net/bluetooth/rfcomm.h
index a4e38ea..7afd419 100644
--- a/include/net/bluetooth/rfcomm.h
+++ b/include/net/bluetooth/rfcomm.h
@@ -158,7 +158,6 @@ struct rfcomm_session {
struct timer_list timer;
unsigned long state;
unsigned long flags;
- atomic_t refcnt;
int initiator;

/* Default DLC parameters */
@@ -276,12 +275,6 @@ static inline void rfcomm_dlc_unthrottle(struct rfcomm_dlc *d)
void rfcomm_session_getaddr(struct rfcomm_session *s, bdaddr_t *src,
bdaddr_t *dst);

-static inline void rfcomm_session_hold(struct rfcomm_session *s)
-{
- if (s)
- atomic_inc(&s->refcnt);
-}
-
/* ---- RFCOMM sockets ---- */
struct sockaddr_rc {
sa_family_t rc_family;
diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index 60d2f1a..ecef9bc 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -108,14 +108,6 @@ static void rfcomm_schedule(void)
wake_up_process(rfcomm_thread);
}

-static struct rfcomm_session *rfcomm_session_put(struct rfcomm_session *s)
-{
- if (s && atomic_dec_and_test(&s->refcnt))
- s = rfcomm_session_del(s);
-
- return s;
-}
-
/* ---- RFCOMM FCS computation ---- */

/* reversed, 8-bit, poly=0x07 */
@@ -251,16 +243,14 @@ static void rfcomm_session_set_timer(struct rfcomm_session *s, long timeout)
{
BT_DBG("session %p state %ld timeout %ld", s, s->state, timeout);

- if (!mod_timer(&s->timer, jiffies + timeout))
- rfcomm_session_hold(s);
+ mod_timer(&s->timer, jiffies + timeout);
}

static void rfcomm_session_clear_timer(struct rfcomm_session *s)
{
BT_DBG("session %p state %ld", s, s->state);

- if (del_timer_sync(&s->timer))
- rfcomm_session_put(s);
+ del_timer_sync(&s->timer);
}

/* ---- RFCOMM DLCs ---- */
@@ -338,8 +328,6 @@ static void rfcomm_dlc_link(struct rfcomm_session *s, struct rfcomm_dlc *d)
{
BT_DBG("dlc %p session %p", d, s);

- rfcomm_session_hold(s);
-
rfcomm_session_clear_timer(s);
rfcomm_dlc_hold(d);
list_add(&d->list, &s->dlcs);
@@ -358,8 +346,6 @@ static void rfcomm_dlc_unlink(struct rfcomm_dlc *d)

if (list_empty(&s->dlcs))
rfcomm_session_set_timer(s, RFCOMM_IDLE_TIMEOUT);
-
- rfcomm_session_put(s);
}

static struct rfcomm_dlc *rfcomm_dlc_get(struct rfcomm_session *s, u8 dlci)
@@ -678,8 +664,6 @@ static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,

BT_DBG("session %p state %ld err %d", s, s->state, err);

- rfcomm_session_hold(s);
-
s->state = BT_CLOSED;

/* Close all dlcs */
@@ -690,7 +674,7 @@ static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
}

rfcomm_session_clear_timer(s);
- return rfcomm_session_put(s);
+ return rfcomm_session_del(s);
}

static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
@@ -1884,13 +1868,8 @@ static struct rfcomm_session *rfcomm_process_rx(struct rfcomm_session *s)
kfree_skb(skb);
}

- if (s && (sk->sk_state == BT_CLOSED)) {
- if (!s->initiator)
- s = rfcomm_session_put(s);
-
- if (s)
- s = rfcomm_session_close(s, sk->sk_err);
- }
+ if (s && (sk->sk_state == BT_CLOSED))
+ s = rfcomm_session_close(s, sk->sk_err);

return s;
}
@@ -1917,8 +1896,6 @@ static void rfcomm_accept_connection(struct rfcomm_session *s)

s = rfcomm_session_add(nsock, BT_OPEN);
if (s) {
- rfcomm_session_hold(s);
-
/* We should adjust MTU on incoming sessions.
* L2CAP MTU minus UIH header and FCS. */
s->mtu = min(l2cap_pi(nsock->sk)->chan->omtu,
@@ -1967,7 +1944,6 @@ static void rfcomm_process_sessions(void)
if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
s->state = BT_DISCONN;
rfcomm_send_disc(s, 0);
- rfcomm_session_put(s);
continue;
}

@@ -1976,8 +1952,6 @@ static void rfcomm_process_sessions(void)
continue;
}

- rfcomm_session_hold(s);
-
switch (s->state) {
case BT_BOUND:
s = rfcomm_check_connection(s);
@@ -1990,8 +1964,6 @@ static void rfcomm_process_sessions(void)

if (s)
rfcomm_process_dlcs(s);
-
- rfcomm_session_put(s);
}

rfcomm_unlock();
@@ -2041,7 +2013,6 @@ static int rfcomm_add_listener(bdaddr_t *ba)
if (!s)
goto failed;

- rfcomm_session_hold(s);
return 0;
failed:
sock_release(sock);
@@ -2099,8 +2070,6 @@ static void rfcomm_security_cfm(struct hci_conn *conn, u8 status, u8 encrypt)
if (!s)
return;

- rfcomm_session_hold(s);
-
list_for_each_safe(p, n, &s->dlcs) {
d = list_entry(p, struct rfcomm_dlc, list);

@@ -2132,8 +2101,6 @@ static void rfcomm_security_cfm(struct hci_conn *conn, u8 status, u8 encrypt)
set_bit(RFCOMM_AUTH_REJECT, &d->flags);
}

- rfcomm_session_put(s);
-
rfcomm_schedule();
}

--
1.7.10.1

2013-02-25 16:38:33

by Dean Jenkins

[permalink] [raw]
Subject: [PATCH 2/6] Bluetooth: Check rfcomm session and DLC exists on socket close

From: Dean Jenkins <[email protected]>

A race condition exists between near simultaneous asynchronous
DLC data channel disconnection requests from the host and remote device.
This causes the socket layer to request a socket shutdown at the same
time the rfcomm core is processing the disconnect request from the remote
device.

The socket layer retains a copy of a struct rfcomm_dlc d pointer.
The d pointer refers to a copy of a struct rfcomm_session.
When the socket layer thread performs a socket shutdown, the thread
may wait on a rfcomm lock in rfcomm_dlc_close(). This means that
whilst the thread waits, the rfcomm_session and/or rfcomm_dlc structures
pointed to by d maybe freed due to rfcomm core handling. Consequently,
when the rfcomm lock becomes available and the thread runs, a
malfunction could occur as a freed rfcomm_session structure and/or a
freed rfcomm_dlc structure will be erroneously accessed.

Therefore, after the rfcomm lock is acquired, check that the struct
rfcomm_session is still valid by searching the rfcomm session list.
If the session is valid then validate the d pointer by searching the
rfcomm session list of active DLCs for the rfcomm_dlc structure
pointed by d.

Signed-off-by: Dean Jenkins <[email protected]>
---
net/bluetooth/rfcomm/core.c | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index 8780e67..af0c26d 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -493,12 +493,34 @@ static int __rfcomm_dlc_close(struct rfcomm_dlc *d, int err)

int rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
{
- int r;
+ int r = 0;
+ struct rfcomm_dlc *d_list;
+ struct rfcomm_session *s, *s_list;
+
+ BT_DBG("dlc %p state %ld dlci %d err %d", d, d->state, d->dlci, err);

rfcomm_lock();

- r = __rfcomm_dlc_close(d, err);
+ s = d->session;
+ if (!s)
+ goto no_session;
+
+ /* after waiting on the mutex check the session still exists */
+ list_for_each_entry(s_list, &session_list, list) {
+ if (s_list == s) {
+ /* after waiting on the mutex, */
+ /* check the dlc still exists */
+ list_for_each_entry(d_list, &s->dlcs, list) {
+ if (d_list == d) {
+ r = __rfcomm_dlc_close(d, err);
+ break;
+ }
+ }
+ break;
+ }
+ }

+no_session:
rfcomm_unlock();
return r;
}
--
1.7.10.1

2013-02-25 16:38:36

by Dean Jenkins

[permalink] [raw]
Subject: [PATCH 5/6] Bluetooth: Remove redundant call to rfcomm_send_disc

From: Dean Jenkins <[email protected]>

In rfcomm_session_del() remove the redundant call to
rfcomm_send_disc() because it is not possible for the
session to be in BT_CONNECTED state during deletion
of the session.

Signed-off-by: Dean Jenkins <[email protected]>
---
net/bluetooth/rfcomm/core.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index ecef9bc..3bdaf2e 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -627,9 +627,6 @@ static struct rfcomm_session *rfcomm_session_del(struct rfcomm_session *s)

list_del(&s->list);

- if (state == BT_CONNECTED)
- rfcomm_send_disc(s, 0);
-
rfcomm_session_clear_timer(s);
sock_release(s->sock);
kfree(s);
--
1.7.10.1

2013-02-25 16:38:32

by Dean Jenkins

[permalink] [raw]
Subject: [PATCH 1/6] Bluetooth: Avoid rfcomm_session_timeout using freed session

From: Dean Jenkins <[email protected]>

Use del_timer_sync() instead of del_timer() as this ensures
that rfcomm_session_timeout() is not running on a different
CPU when rfcomm_session_put() is called. This avoids a race
condition on SMP systems because potentially
rfcomm_session_timeout() could reuse the freed RFCOMM session
structure caused by the execution of rfcomm_session_put().

Note that this modification makes the reason for the RFCOMM
session refcnt mechanism redundant.

Signed-off-by: Dean Jenkins <[email protected]>
---
net/bluetooth/rfcomm/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index 201fdf7..8780e67 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -257,7 +257,7 @@ static void rfcomm_session_clear_timer(struct rfcomm_session *s)
{
BT_DBG("session %p state %ld", s, s->state);

- if (timer_pending(&s->timer) && del_timer(&s->timer))
+ if (del_timer_sync(&s->timer))
rfcomm_session_put(s);
}

--
1.7.10.1