2015-10-23 08:28:19

by Syam Sidhardhan

[permalink] [raw]
Subject: Kernel panic observed during sco-connection-disconnection

Hi All,

We have observed one kernel panic during the sco close.
The panic logs has been given below. Further analysis we could see that invalid memory access is happening at
sco_pi(sk)->conn->hcon = NULL; in __sco_sock_close().

It will be great if someone analyses and provide the fix/clue for this issue.
Thanks for your time.


Code:
static void __sco_sock_close(struct sock *sk)
{
BT_DBG("sk %p state %d socket %p", sk, sk->sk_state, sk->sk_socket);

switch (sk->sk_state) {
case BT_LISTEN:
sco_sock_cleanup_listen(sk);
break;

case BT_CONNECTED:
case BT_CONFIG:
if (sco_pi(sk)->conn->hcon) {
sk->sk_state = BT_DISCONN;
sco_sock_set_timer(sk, SCO_DISCONN_TIMEOUT);
hci_conn_drop(sco_pi(sk)->conn->hcon);
sco_pi(sk)->conn->hcon = NULL; //Invalid memory access happen here.
} else
sco_chan_del(sk, ECONNRESET);
break;

case BT_CONNECT2:
case BT_CONNECT:
case BT_DISCONN:
...
...
}
}

Logs:
Oct 21 23:35:12 alert Kernel: [ 175.323507] Unable to handle kernel NULL pointer dereference at virtual address 00000000
.....
Oct 21 23:35:12 warn Kernel: [ 175.338344] [<c0866be4>] (__sco_sock_close+0x1ec/0x230) from [<c0866d20>] (sco_sock_close+0x20/0x34)
Oct 21 23:35:12 warn Kernel: [ 175.338478] [<c0866d20>] (sco_sock_close+0x20/0x34) from [<c0866d78>] (sco_sock_release+0x44/0x100)
Oct 21 23:35:12 warn Kernel: [ 175.338614] [<c0866d78>] (sco_sock_release+0x44/0x100) from [<c0721ad0>] (sock_release+0x20/0xc4)
Oct 21 23:35:12 warn Kernel: [ 175.338745] [<c0721ad0>] (sock_release+0x20/0xc4) from [<c07221e4>] (sock_close+0x20/0x2c)
Oct 21 23:35:12 warn Kernel: [ 175.338874] [<c07221e4>] (sock_close+0x20/0x2c) from [<c02741d0>] (fput+0x11c/0x230)
Oct 21 23:35:12 warn Kernel: [ 175.338994] [<c02741d0>] (fput+0x11c/0x230) from [<c0270fa8>] (filp_close+0x70/0x7c)
Oct 21 23:35:12 warn Kernel: [ 175.339110] [<c0270fa8>] (filp_close+0x70/0x7c) from [<c0271064>] (sys_close+0xb0/0xf0)
Oct 21 23:35:12 warn Kernel: [ 175.339232] [<c0271064>] (sys_close+0xb0/0xf0) from [<c0105d9c>] (__sys_trace_return+0x0/0x24)
Regards,
Syam



2015-10-26 00:54:20

by Syam Sidhardhan

[permalink] [raw]
Subject: Re: Kernel panic observed during sco-connection-disconnection

Hi Johan/Kuba,

On Fri, Oct 23, 2015 at 2:03 PM, Johan Hedberg <[email protected]> wrote:
> Hi Syam,
>
> On Fri, Oct 23, 2015, [email protected] wrote:
>> We have observed one kernel panic during the sco close.
>> The panic logs has been given below. Further analysis we could see that invalid memory access is happening at
>> sco_pi(sk)->conn->hcon = NULL; in __sco_sock_close().
>>
>> It will be great if someone analyses and provide the fix/clue for this issue.
>> Thanks for your time.
>>
>>
>> Code:
>> static void __sco_sock_close(struct sock *sk)
>> {
>> BT_DBG("sk %p state %d socket %p", sk, sk->sk_state, sk->sk_socket);
>>
>> switch (sk->sk_state) {
>> case BT_LISTEN:
>> sco_sock_cleanup_listen(sk);
>> break;
>>
>> case BT_CONNECTED:
>> case BT_CONFIG:
>> if (sco_pi(sk)->conn->hcon) {
>> sk->sk_state = BT_DISCONN;
>> sco_sock_set_timer(sk, SCO_DISCONN_TIMEOUT);
>> hci_conn_drop(sco_pi(sk)->conn->hcon);
>> sco_pi(sk)->conn->hcon = NULL; //Invalid memory access happen here.
>> } else
>> sco_chan_del(sk, ECONNRESET);
>> break;
>>
>> case BT_CONNECT2:
>> case BT_CONNECT:
>> case BT_DISCONN:
>> ...
>> ...
>> }
>> }
>>
>> Logs:
>> Oct 21 23:35:12 alert Kernel: [ 175.323507] Unable to handle kernel NULL pointer dereference at virtual address 00000000
>> .....
>> Oct 21 23:35:12 warn Kernel: [ 175.338344] [<c0866be4>] (__sco_sock_close+0x1ec/0x230) from [<c0866d20>] (sco_sock_close+0x20/0x34)
>> Oct 21 23:35:12 warn Kernel: [ 175.338478] [<c0866d20>] (sco_sock_close+0x20/0x34) from [<c0866d78>] (sco_sock_release+0x44/0x100)
>> Oct 21 23:35:12 warn Kernel: [ 175.338614] [<c0866d78>] (sco_sock_release+0x44/0x100) from [<c0721ad0>] (sock_release+0x20/0xc4)
>> Oct 21 23:35:12 warn Kernel: [ 175.338745] [<c0721ad0>] (sock_release+0x20/0xc4) from [<c07221e4>] (sock_close+0x20/0x2c)
>> Oct 21 23:35:12 warn Kernel: [ 175.338874] [<c07221e4>] (sock_close+0x20/0x2c) from [<c02741d0>] (fput+0x11c/0x230)
>> Oct 21 23:35:12 warn Kernel: [ 175.338994] [<c02741d0>] (fput+0x11c/0x230) from [<c0270fa8>] (filp_close+0x70/0x7c)
>> Oct 21 23:35:12 warn Kernel: [ 175.339110] [<c0270fa8>] (filp_close+0x70/0x7c) from [<c0271064>] (sys_close+0xb0/0xf0)
>> Oct 21 23:35:12 warn Kernel: [ 175.339232] [<c0271064>] (sys_close+0xb0/0xf0) from [<c0105d9c>] (__sys_trace_return+0x0/0x24)
>
> You'll probably want to take a look at the SCO socket patch set that
> Kuba sent in the beginning of this month:
>
> [PATCH 0/5] Few proposed fixes to stop crashes on fast SCO disconnec
>
> Particularly his second patch looks like something that could be related
> to this.
>

Thanks for your guidance. We will try with Kuba's patch sets. Few of
them has already applied to bluetooth-next a couple of hours back.
Look like his Patch 1/5 fixes the one time issue which I reported 3
months back with subject "Sco connection disconnection panic happen".
Thanks Johan & Kuba.

Regards,
Syam.

2015-10-23 08:33:19

by Johan Hedberg

[permalink] [raw]
Subject: Re: Kernel panic observed during sco-connection-disconnection

Hi Syam,

On Fri, Oct 23, 2015, [email protected] wrote:
> We have observed one kernel panic during the sco close.
> The panic logs has been given below. Further analysis we could see that invalid memory access is happening at
> sco_pi(sk)->conn->hcon = NULL; in __sco_sock_close().
>
> It will be great if someone analyses and provide the fix/clue for this issue.
> Thanks for your time.
>
>
> Code:
> static void __sco_sock_close(struct sock *sk)
> {
> BT_DBG("sk %p state %d socket %p", sk, sk->sk_state, sk->sk_socket);
>
> switch (sk->sk_state) {
> case BT_LISTEN:
> sco_sock_cleanup_listen(sk);
> break;
>
> case BT_CONNECTED:
> case BT_CONFIG:
> if (sco_pi(sk)->conn->hcon) {
> sk->sk_state = BT_DISCONN;
> sco_sock_set_timer(sk, SCO_DISCONN_TIMEOUT);
> hci_conn_drop(sco_pi(sk)->conn->hcon);
> sco_pi(sk)->conn->hcon = NULL; //Invalid memory access happen here.
> } else
> sco_chan_del(sk, ECONNRESET);
> break;
>
> case BT_CONNECT2:
> case BT_CONNECT:
> case BT_DISCONN:
> ...
> ...
> }
> }
>
> Logs:
> Oct 21 23:35:12 alert Kernel: [ 175.323507] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> .....
> Oct 21 23:35:12 warn Kernel: [ 175.338344] [<c0866be4>] (__sco_sock_close+0x1ec/0x230) from [<c0866d20>] (sco_sock_close+0x20/0x34)
> Oct 21 23:35:12 warn Kernel: [ 175.338478] [<c0866d20>] (sco_sock_close+0x20/0x34) from [<c0866d78>] (sco_sock_release+0x44/0x100)
> Oct 21 23:35:12 warn Kernel: [ 175.338614] [<c0866d78>] (sco_sock_release+0x44/0x100) from [<c0721ad0>] (sock_release+0x20/0xc4)
> Oct 21 23:35:12 warn Kernel: [ 175.338745] [<c0721ad0>] (sock_release+0x20/0xc4) from [<c07221e4>] (sock_close+0x20/0x2c)
> Oct 21 23:35:12 warn Kernel: [ 175.338874] [<c07221e4>] (sock_close+0x20/0x2c) from [<c02741d0>] (fput+0x11c/0x230)
> Oct 21 23:35:12 warn Kernel: [ 175.338994] [<c02741d0>] (fput+0x11c/0x230) from [<c0270fa8>] (filp_close+0x70/0x7c)
> Oct 21 23:35:12 warn Kernel: [ 175.339110] [<c0270fa8>] (filp_close+0x70/0x7c) from [<c0271064>] (sys_close+0xb0/0xf0)
> Oct 21 23:35:12 warn Kernel: [ 175.339232] [<c0271064>] (sys_close+0xb0/0xf0) from [<c0105d9c>] (__sys_trace_return+0x0/0x24)

You'll probably want to take a look at the SCO socket patch set that
Kuba sent in the beginning of this month:

[PATCH 0/5] Few proposed fixes to stop crashes on fast SCO disconnec

Particularly his second patch looks like something that could be related
to this.

Johan