2023-04-30 17:49:12

by Ruihan Li

[permalink] [raw]
Subject: [PATCH] Bluetooth: Fix UAF in hci_conn_hash_flush again

Commit 06149746e720 ("Bluetooth: hci_conn: Add support for linking multiple
hcon") reintroduces a previously fixed bug [1] ("KASAN: slab-use-after-free
Read in hci_conn_hash_flush"). This bug was originally fixed by commit
5dc7d23e167e ("Bluetooth: hci_conn: Fix possible UAF").

The hci_conn_unlink function was added to avoid invalidating the link
traversal caused by successive hci_conn_del operations releasing extra
connections. However, currently hci_conn_unlink itself also releases extra
connections, resulted in the reintroduced bug.

This patch follows a more robust solution for cleaning up all connections,
by repeatedly removing the first connection until there are none left. This
approach does not rely on the inner workings of hci_conn_del and ensures
proper cleanup of all connections.

However, we need to make sure that hci_conn_del never fails. Indeed it
doesn't, as it now always returns zero. To make this a bit clearer, this
patch also changes its return type to void.

Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=8bb72f86fc823817bc5d
Fixes: 06149746e720 ("Bluetooth: hci_conn: Add support for linking multiple hcon")
Signed-off-by: Ruihan Li <[email protected]>
---
include/net/bluetooth/hci_core.h | 2 +-
net/bluetooth/hci_conn.c | 28 ++++++++++++----------------
2 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index a6c8aee2f..8baf34639 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -1327,7 +1327,7 @@ int hci_le_create_cis(struct hci_conn *conn);

struct hci_conn *hci_conn_add(struct hci_dev *hdev, int type, bdaddr_t *dst,
u8 role);
-int hci_conn_del(struct hci_conn *conn);
+void hci_conn_del(struct hci_conn *conn);
void hci_conn_hash_flush(struct hci_dev *hdev);
void hci_conn_check_pending(struct hci_dev *hdev);

diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
index 640b951bf..85c34c837 100644
--- a/net/bluetooth/hci_conn.c
+++ b/net/bluetooth/hci_conn.c
@@ -1109,7 +1109,7 @@ static void hci_conn_unlink(struct hci_conn *conn)
hci_conn_del(conn);
}

-int hci_conn_del(struct hci_conn *conn)
+void hci_conn_del(struct hci_conn *conn)
{
struct hci_dev *hdev = conn->hdev;

@@ -1160,8 +1160,6 @@ int hci_conn_del(struct hci_conn *conn)
* rest of hci_conn_del.
*/
hci_conn_cleanup(conn);
-
- return 0;
}

struct hci_dev *hci_get_route(bdaddr_t *dst, bdaddr_t *src, uint8_t src_type)
@@ -2462,22 +2460,20 @@ void hci_conn_enter_active_mode(struct hci_conn *conn, __u8 force_active)
/* Drop all connection on the device */
void hci_conn_hash_flush(struct hci_dev *hdev)
{
- struct hci_conn_hash *h = &hdev->conn_hash;
- struct hci_conn *c, *n;
+ struct list_head *head = &hdev->conn_hash.list;
+ struct hci_conn *conn;

BT_DBG("hdev %s", hdev->name);

- list_for_each_entry_safe(c, n, &h->list, list) {
- c->state = BT_CLOSED;
-
- hci_disconn_cfm(c, HCI_ERROR_LOCAL_HOST_TERM);
-
- /* Unlink before deleting otherwise it is possible that
- * hci_conn_del removes the link which may cause the list to
- * contain items already freed.
- */
- hci_conn_unlink(c);
- hci_conn_del(c);
+ /* We should not traverse the list here, because hci_conn_del
+ * can remove extra links, which may cause the link traversal
+ * to hit items that have already been released.
+ */
+ while ((conn = list_first_entry_or_null(head,
+ struct hci_conn, list)) != NULL) {
+ conn->state = BT_CLOSED;
+ hci_disconn_cfm(conn, HCI_ERROR_LOCAL_HOST_TERM);
+ hci_conn_del(conn);
}
}

--
2.40.0


2023-04-30 18:00:01

by bluez.test.bot

[permalink] [raw]
Subject: RE: Bluetooth: Fix UAF in hci_conn_hash_flush again

This is automated email and please do not reply to this email!

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
This is a CI test results with your patch series:
PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=744073

---Test result---

Test Summary:
CheckPatch FAIL 1.12 seconds
GitLint PASS 0.35 seconds
SubjectPrefix PASS 0.13 seconds
BuildKernel PASS 32.06 seconds
CheckAllWarning PASS 34.74 seconds
CheckSparse PASS 39.61 seconds
CheckSmatch PASS 108.27 seconds
BuildKernel32 PASS 31.25 seconds
TestRunnerSetup PASS 442.52 seconds
TestRunner_l2cap-tester PASS 16.91 seconds
TestRunner_iso-tester PASS 21.38 seconds
TestRunner_bnep-tester PASS 5.63 seconds
TestRunner_mgmt-tester PASS 116.50 seconds
TestRunner_rfcomm-tester PASS 9.06 seconds
TestRunner_sco-tester PASS 8.31 seconds
TestRunner_ioctl-tester PASS 9.75 seconds
TestRunner_mesh-tester PASS 7.21 seconds
TestRunner_smp-tester PASS 8.15 seconds
TestRunner_userchan-tester PASS 5.94 seconds
IncrementalBuild PASS 29.20 seconds

Details
##############################
Test: CheckPatch - FAIL
Desc: Run checkpatch.pl script
Output:
Bluetooth: Fix UAF in hci_conn_hash_flush again
WARNING: Reported-by: should be immediately followed by Link: with a URL to the report
#92:
Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=8bb72f86fc823817bc5d

WARNING: Unknown link reference 'Closes:', use 'Link:' instead
#93:
Closes: https://syzkaller.appspot.com/bug?extid=8bb72f86fc823817bc5d

CHECK: Alignment should match open parenthesis
#163: FILE: net/bluetooth/hci_conn.c:2473:
+ while ((conn = list_first_entry_or_null(head,
+ struct hci_conn, list)) != NULL) {

total: 0 errors, 2 warnings, 1 checks, 57 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
mechanically convert to the typical style using --fix or --fix-inplace.

/github/workspace/src/src/13227257.patch has style problems, please review.

NOTE: Ignored message types: UNKNOWN_COMMIT_ID

NOTE: If any of the errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.




---
Regards,
Linux Bluetooth

2023-04-30 18:24:15

by Ruihan Li

[permalink] [raw]
Subject: Re: Bluetooth: Fix UAF in hci_conn_hash_flush again

On Sun, Apr 30, 2023 at 10:56:59AM -0700, [email protected] wrote:
> Test: CheckPatch - FAIL
> Desc: Run checkpatch.pl script
> Output:
> Bluetooth: Fix UAF in hci_conn_hash_flush again
> WARNING: Reported-by: should be immediately followed by Link: with a URL to the report
> #92:
> Reported-by: [email protected]
> Closes: https://syzkaller.appspot.com/bug?extid=8bb72f86fc823817bc5d
>
> WARNING: Unknown link reference 'Closes:', use 'Link:' instead
> #93:
> Closes: https://syzkaller.appspot.com/bug?extid=8bb72f86fc823817bc5d
>

Well, CI is out of date, as the mainline has changed this from 'Link:' to
'Closes:' [1].

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=44c31888098a590b8ec5ba37009e5a983f7c4b46

> CHECK: Alignment should match open parenthesis
> #163: FILE: net/bluetooth/hci_conn.c:2473:
> + while ((conn = list_first_entry_or_null(head,
> + struct hci_conn, list)) != NULL) {
>
> total: 0 errors, 2 warnings, 1 checks, 57 lines checked
>
> NOTE: For some of the reported defects, checkpatch may be able to
> mechanically convert to the typical style using --fix or --fix-inplace.
>
> /github/workspace/src/src/13227257.patch has style problems, please review.
>
> NOTE: Ignored message types: UNKNOWN_COMMIT_ID
>
> NOTE: If any of the errors are false positives, please report
> them to the maintainer, see CHECKPATCH in MAINTAINERS.

Thanks,
Ruihan Li

2023-05-01 21:18:04

by Luiz Augusto von Dentz

[permalink] [raw]
Subject: Re: Bluetooth: Fix UAF in hci_conn_hash_flush again

Hi Ruihan,

On Sun, Apr 30, 2023 at 11:23 AM Ruihan Li <[email protected]> wrote:
>
> On Sun, Apr 30, 2023 at 10:56:59AM -0700, [email protected] wrote:
> > Test: CheckPatch - FAIL
> > Desc: Run checkpatch.pl script
> > Output:
> > Bluetooth: Fix UAF in hci_conn_hash_flush again
> > WARNING: Reported-by: should be immediately followed by Link: with a URL to the report
> > #92:
> > Reported-by: [email protected]
> > Closes: https://syzkaller.appspot.com/bug?extid=8bb72f86fc823817bc5d
> >
> > WARNING: Unknown link reference 'Closes:', use 'Link:' instead
> > #93:
> > Closes: https://syzkaller.appspot.com/bug?extid=8bb72f86fc823817bc5d
> >
>
> Well, CI is out of date, as the mainline has changed this from 'Link:' to
> 'Closes:' [1].
>
> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=44c31888098a590b8ec5ba37009e5a983f7c4b46
>
> > CHECK: Alignment should match open parenthesis
> > #163: FILE: net/bluetooth/hci_conn.c:2473:
> > + while ((conn = list_first_entry_or_null(head,
> > + struct hci_conn, list)) != NULL) {
> >
> > total: 0 errors, 2 warnings, 1 checks, 57 lines checked
> >
> > NOTE: For some of the reported defects, checkpatch may be able to
> > mechanically convert to the typical style using --fix or --fix-inplace.
> >
> > /github/workspace/src/src/13227257.patch has style problems, please review.
> >
> > NOTE: Ignored message types: UNKNOWN_COMMIT_ID
> >
> > NOTE: If any of the errors are false positives, please report
> > them to the maintainer, see CHECKPATCH in MAINTAINERS.
>
> Thanks,
> Ruihan Li

I guess this depends on the order the connection is added into the
list since after applying your changes it causes the following trace
with iso-tester:

==================================================================
BUG: KASAN: slab-use-after-free in hci_conn_unlink
(./include/linux/list.h:114 ./include/linux/list.h:137
./include/linux/rculist.h:157 net/bl
Write of size 8 at addr ffff8880012e89d0 by task iso-tester/31

Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc36
04/01/2014
Call Trace:
<TASK>
dump_stack_lvl (./arch/x86/include/asm/irqflags.h:26
./arch/x86/include/asm/irqflags.h:67
./arch/x86/include/asm/irqflags.h:127 lib/dump_stack
print_report (mm/kasan/report.c:320 mm/kasan/report.c:430)
? __virt_addr_valid (./include/linux/mmzone.h:1915
./include/linux/mmzone.h:2011 arch/x86/mm/physaddr.c:65)
? hci_conn_unlink (./include/linux/list.h:114
./include/linux/list.h:137 ./include/linux/rculist.h:157
net/bluetooth/hci_conn.c:1108)
kasan_report (mm/kasan/report.c:538)
? hci_conn_unlink (./include/linux/list.h:114
./include/linux/list.h:137 ./include/linux/rculist.h:157
net/bluetooth/hci_conn.c:1108)
hci_conn_unlink (./include/linux/list.h:114 ./include/linux/list.h:137
./include/linux/rculist.h:157 net/bluetooth/hci_conn.c:1108)
hci_conn_del (./include/linux/instrumented.h:72
./include/linux/atomic/atomic-instrumented.h:27
./include/net/bluetooth/hci_core.h:1416 net/bl
hci_conn_hash_flush (net/bluetooth/hci_conn.c:2479)
hci_dev_close_sync (net/bluetooth/hci_sync.c:4943)
hci_unregister_dev (net/bluetooth/hci_core.c:556 net/bluetooth/hci_core.c:2703)
vhci_release (drivers/bluetooth/hci_vhci.c:670)
__fput (fs/file_table.c:322)
task_work_run (kernel/task_work.c:180 (discriminator 1))
? __pfx_task_work_run (kernel/task_work.c:147)
? mark_held_locks (kernel/locking/lockdep.c:4225)
exit_to_user_mode_prepare (./include/linux/resume_user_mode.h:49
kernel/entry/common.c:171 kernel/entry/common.c:204)
syscall_exit_to_user_mode (kernel/entry/common.c:130 kernel/entry/common.c:299)
do_syscall_64 (arch/x86/entry/common.c:87)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
RIP: 0033:0x7f899c2352f7
Code: ff ef 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f
1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d
00f
All code
========
0: ff (bad)
1: ef out %eax,(%dx)
2: 01 00 add %eax,(%rax)
4: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
b: 00 00 00
e: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
13: f3 0f 1e fa endbr64
17: 64 8b 04 25 18 00 00 mov %fs:0x18,%eax
1e: 00
1f: 85 c0 test %eax,%eax
21: 75 10 jne 0x33
23: b8 03 00 00 00 mov $0x3,%eax
28: 0f 05 syscall
2a:* 48 rex.W <-- trapping instruction
2b: 3d .byte 0x3d
2c: 0f .byte 0xf

Code starting with the faulting instruction
===========================================
0: 48 rex.W
1: 3d .byte 0x3d
2: 0f .byte 0xf
RSP: 002b:00007ffc785e3588 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f899c2352f7
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000007
RBP: 0000000000000000 R08: 0000561e1096d390 R09: 0000000000000030
R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000001
R13: 00007f899c9b84b0 R14: 0000000000000000 R15: 0000561e10961890
</TASK>

--
Luiz Augusto von Dentz

2023-05-02 13:55:54

by Ruihan Li

[permalink] [raw]
Subject: Re: Bluetooth: Fix UAF in hci_conn_hash_flush again

Hi Luiz,

On Mon, May 01, 2023 at 02:09:05PM -0700, Luiz Augusto von Dentz wrote:
> I guess this depends on the order the connection is added into the
> list since after applying your changes it causes the following trace
> with iso-tester:
>
> ==================================================================
> BUG: KASAN: slab-use-after-free in hci_conn_unlink
> (./include/linux/list.h:114 ./include/linux/list.h:137
> ./include/linux/rculist.h:157 net/bl
> Write of size 8 at addr ffff8880012e89d0 by task iso-tester/31

It is so weird that even though KASAN reports a critical BUG, the CI
still says that all tests are passed. I think that perhaps this is also
something we need to develop a fix. (I missed the KASAN report earlier,
sorry.)

> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc36
> 04/01/2014
> Call Trace:
> <TASK>
> dump_stack_lvl (./arch/x86/include/asm/irqflags.h:26
> ./arch/x86/include/asm/irqflags.h:67
> ./arch/x86/include/asm/irqflags.h:127 lib/dump_stack
> print_report (mm/kasan/report.c:320 mm/kasan/report.c:430)
> ? __virt_addr_valid (./include/linux/mmzone.h:1915
> ./include/linux/mmzone.h:2011 arch/x86/mm/physaddr.c:65)
> ? hci_conn_unlink (./include/linux/list.h:114
> ./include/linux/list.h:137 ./include/linux/rculist.h:157
> net/bluetooth/hci_conn.c:1108)
> kasan_report (mm/kasan/report.c:538)
> ? hci_conn_unlink (./include/linux/list.h:114
> ./include/linux/list.h:137 ./include/linux/rculist.h:157
> net/bluetooth/hci_conn.c:1108)
> hci_conn_unlink (./include/linux/list.h:114 ./include/linux/list.h:137
> ./include/linux/rculist.h:157 net/bluetooth/hci_conn.c:1108)
> hci_conn_del (./include/linux/instrumented.h:72
> ./include/linux/atomic/atomic-instrumented.h:27
> ./include/net/bluetooth/hci_core.h:1416 net/bl
> hci_conn_hash_flush (net/bluetooth/hci_conn.c:2479)
> hci_dev_close_sync (net/bluetooth/hci_sync.c:4943)
> hci_unregister_dev (net/bluetooth/hci_core.c:556 net/bluetooth/hci_core.c:2703)
> vhci_release (drivers/bluetooth/hci_vhci.c:670)
> __fput (fs/file_table.c:322)
> task_work_run (kernel/task_work.c:180 (discriminator 1))
> ? __pfx_task_work_run (kernel/task_work.c:147)
> ? mark_held_locks (kernel/locking/lockdep.c:4225)
> exit_to_user_mode_prepare (./include/linux/resume_user_mode.h:49
> kernel/entry/common.c:171 kernel/entry/common.c:204)
> syscall_exit_to_user_mode (kernel/entry/common.c:130 kernel/entry/common.c:299)
> do_syscall_64 (arch/x86/entry/common.c:87)
> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
> RIP: 0033:0x7f899c2352f7
> Code: ff ef 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f
> 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d
> 00f
> All code
> ========
> 0: ff (bad)
> 1: ef out %eax,(%dx)
> 2: 01 00 add %eax,(%rax)
> 4: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
> b: 00 00 00
> e: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> 13: f3 0f 1e fa endbr64
> 17: 64 8b 04 25 18 00 00 mov %fs:0x18,%eax
> 1e: 00
> 1f: 85 c0 test %eax,%eax
> 21: 75 10 jne 0x33
> 23: b8 03 00 00 00 mov $0x3,%eax
> 28: 0f 05 syscall
> 2a:* 48 rex.W <-- trapping instruction
> 2b: 3d .byte 0x3d
> 2c: 0f .byte 0xf
>
> Code starting with the faulting instruction
> ===========================================
> 0: 48 rex.W
> 1: 3d .byte 0x3d
> 2: 0f .byte 0xf
> RSP: 002b:00007ffc785e3588 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f899c2352f7
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000007
> RBP: 0000000000000000 R08: 0000561e1096d390 R09: 0000000000000030
> R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000001
> R13: 00007f899c9b84b0 R14: 0000000000000000 R15: 0000561e10961890
> </TASK>

This leads to another hci_conn_unlink BUG, as resolved by the following
diff:

diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
index 70e1655a9..44d0643fc 100644
--- a/net/bluetooth/hci_conn.c
+++ b/net/bluetooth/hci_conn.c
@@ -1102,12 +1102,12 @@ static void hci_conn_unlink(struct hci_conn *conn)
if (!conn->link)
return;

- hci_conn_put(conn->parent);
- conn->parent = NULL;
-
list_del_rcu(&conn->link->list);
synchronize_rcu();

+ hci_conn_put(conn->parent);
+ conn->parent = NULL;
+
kfree(conn->link);
conn->link = NULL;
}

In a word, we cannot perform list_del_rcu(&conn->link->list) after
hci_conn_put, because hci_conn_put might have already invalidated the
list entry by deallocating its list head (in conn->parent).

There are actually quite a few "problems" (I think) with
hci_conn_unlink. I tried to fix them and ended up with a patch series of
six patches (each patch is fairly small, of course), including two that
have already been submitted and the one above.

I'll submit these patches later. They have to be a patch series because
they all target one function, hci_conn_unlink. So there will be a lot of
merge conflicts if they are submitted individually. I'm not
intentionally tieing them together, so in case that any patch is
inappropriate, please let me know (I'd appreciate if you could explain
why).

> --
> Luiz Augusto von Dentz

Thanks,
Ruihan Li