The cause of the resultant dump_stack() reported below is a
dereference of a freed pointer to 'struct sctp_endpoint' in
sctp_sock_dump().
This race condition occurs when a transport is cached into its
associated hash table then freed prior to its subsequent use in
sctp_diag_dump() which uses sctp_for_each_transport() to walk the
(now out of date) hash table calling into sctp_sock_dump() where the
dereference occurs.
To prevent this from happening we need to take a reference on the
to-be-used/dereferenced 'struct sctp_endpoint' until such a time when
we know it can be safely released.
When KASAN is not enabled, a similar, but slightly different NULL
pointer derefernce crash occurs later along the thread of execution in
inet_sctp_diag_fill() this time.
BUG: KASAN: use-after-free in sctp_sock_dump+0xa8/0x438 [sctp_diag]
Call trace:
dump_backtrace+0x0/0x2dc
show_stack+0x20/0x2c
dump_stack+0x120/0x144
print_address_description+0x80/0x2f4
__kasan_report+0x174/0x194
kasan_report+0x10/0x18
__asan_load8+0x84/0x8c
sctp_sock_dump+0xa8/0x438 [sctp_diag]
sctp_for_each_transport+0x1e0/0x26c [sctp]
sctp_diag_dump+0x180/0x1f0 [sctp_diag]
inet_diag_dump+0x12c/0x168
netlink_dump+0x24c/0x5b8
__netlink_dump_start+0x274/0x2a8
inet_diag_handler_cmd+0x224/0x274
sock_diag_rcv_msg+0x21c/0x230
netlink_rcv_skb+0xe0/0x1bc
sock_diag_rcv+0x34/0x48
netlink_unicast+0x3b4/0x430
netlink_sendmsg+0x4f0/0x574
sock_write_iter+0x18c/0x1f0
do_iter_readv_writev+0x230/0x2a8
do_iter_write+0xc8/0x2b4
vfs_writev+0xf8/0x184
do_writev+0xb0/0x1a8
__arm64_sys_writev+0x4c/0x5c
el0_svc_common+0x118/0x250
el0_svc_handler+0x3c/0x9c
el0_svc+0x8/0xc
Cc: Vlad Yasevich <[email protected]>
Cc: Neil Horman <[email protected]>
Cc: Marcelo Ricardo Leitner <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: lksctp developers <[email protected]>
Cc: "H.P. Yarroll" <[email protected]>
Cc: Karl Knutson <[email protected]>
Cc: Jon Grimm <[email protected]>
Cc: Xingang Guo <[email protected]>
Cc: Hui Huang <[email protected]>
Cc: Sridhar Samudrala <[email protected]>
Cc: Daisy Chang <[email protected]>
Cc: Ryan Layer <[email protected]>
Cc: Kevin Gao <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Lee Jones <[email protected]>
---
net/sctp/associola.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index be29da09cc7ab..df171a297d095 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -499,8 +499,9 @@ void sctp_assoc_rm_peer(struct sctp_association *asoc,
/* Remove this peer from the list. */
list_del_rcu(&peer->transports);
- /* Remove this peer from the transport hashtable */
+ /* Remove this peer from the transport hashtable and remove its reference */
sctp_unhash_transport(peer);
+ sctp_endpoint_put(asoc->ep);
/* Get the first transport of asoc. */
pos = asoc->peer.transport_addr_list.next;
@@ -710,11 +711,12 @@ struct sctp_transport *sctp_assoc_add_peer(struct sctp_association *asoc,
/* Set the peer's active state. */
peer->state = peer_state;
- /* Add this peer into the transport hashtable */
+ /* Add this peer into the transport hashtable and take a reference */
if (sctp_hash_transport(peer)) {
sctp_transport_free(peer);
return NULL;
}
+ sctp_endpoint_hold(asoc->ep);
sctp_transport_pl_reset(peer);
--
2.34.1.400.ga245620fadb-goog
On Wed, 08 Dec 2021, Lee Jones wrote:
> The cause of the resultant dump_stack() reported below is a
> dereference of a freed pointer to 'struct sctp_endpoint' in
> sctp_sock_dump().
>
> This race condition occurs when a transport is cached into its
> associated hash table then freed prior to its subsequent use in
> sctp_diag_dump() which uses sctp_for_each_transport() to walk the
> (now out of date) hash table calling into sctp_sock_dump() where the
> dereference occurs.
>
> To prevent this from happening we need to take a reference on the
> to-be-used/dereferenced 'struct sctp_endpoint' until such a time when
> we know it can be safely released.
>
> When KASAN is not enabled, a similar, but slightly different NULL
> pointer derefernce crash occurs later along the thread of execution in
> inet_sctp_diag_fill() this time.
>
> BUG: KASAN: use-after-free in sctp_sock_dump+0xa8/0x438 [sctp_diag]
> Call trace:
> dump_backtrace+0x0/0x2dc
> show_stack+0x20/0x2c
> dump_stack+0x120/0x144
> print_address_description+0x80/0x2f4
> __kasan_report+0x174/0x194
> kasan_report+0x10/0x18
> __asan_load8+0x84/0x8c
> sctp_sock_dump+0xa8/0x438 [sctp_diag]
> sctp_for_each_transport+0x1e0/0x26c [sctp]
> sctp_diag_dump+0x180/0x1f0 [sctp_diag]
> inet_diag_dump+0x12c/0x168
> netlink_dump+0x24c/0x5b8
> __netlink_dump_start+0x274/0x2a8
> inet_diag_handler_cmd+0x224/0x274
> sock_diag_rcv_msg+0x21c/0x230
> netlink_rcv_skb+0xe0/0x1bc
> sock_diag_rcv+0x34/0x48
> netlink_unicast+0x3b4/0x430
> netlink_sendmsg+0x4f0/0x574
> sock_write_iter+0x18c/0x1f0
> do_iter_readv_writev+0x230/0x2a8
> do_iter_write+0xc8/0x2b4
> vfs_writev+0xf8/0x184
> do_writev+0xb0/0x1a8
> __arm64_sys_writev+0x4c/0x5c
> el0_svc_common+0x118/0x250
> el0_svc_handler+0x3c/0x9c
> el0_svc+0x8/0xc
This looks related (reported 3 years ago!)
https://lore.kernel.org/all/[email protected]/
--
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog
On Wed, Dec 08, 2021 at 04:54:34PM +0000, Lee Jones wrote:
> The cause of the resultant dump_stack() reported below is a
> dereference of a freed pointer to 'struct sctp_endpoint' in
> sctp_sock_dump().
Hi,
Please give me another day to review this one.
Thanks,
Marcelo
On Wed, Dec 08, 2021 at 05:02:12PM +0000, Lee Jones wrote:
> On Wed, 08 Dec 2021, Lee Jones wrote:
>
> > The cause of the resultant dump_stack() reported below is a
> > dereference of a freed pointer to 'struct sctp_endpoint' in
> > sctp_sock_dump().
> >
> > This race condition occurs when a transport is cached into its
> > associated hash table then freed prior to its subsequent use in
> > sctp_diag_dump() which uses sctp_for_each_transport() to walk the
> > (now out of date) hash table calling into sctp_sock_dump() where the
> > dereference occurs.
> >
> > To prevent this from happening we need to take a reference on the
> > to-be-used/dereferenced 'struct sctp_endpoint' until such a time when
> > we know it can be safely released.
> >
> > When KASAN is not enabled, a similar, but slightly different NULL
> > pointer derefernce crash occurs later along the thread of execution in
> > inet_sctp_diag_fill() this time.
> >
> > BUG: KASAN: use-after-free in sctp_sock_dump+0xa8/0x438 [sctp_diag]
> > Call trace:
> > dump_backtrace+0x0/0x2dc
> > show_stack+0x20/0x2c
> > dump_stack+0x120/0x144
> > print_address_description+0x80/0x2f4
> > __kasan_report+0x174/0x194
> > kasan_report+0x10/0x18
> > __asan_load8+0x84/0x8c
> > sctp_sock_dump+0xa8/0x438 [sctp_diag]
> > sctp_for_each_transport+0x1e0/0x26c [sctp]
> > sctp_diag_dump+0x180/0x1f0 [sctp_diag]
> > inet_diag_dump+0x12c/0x168
> > netlink_dump+0x24c/0x5b8
> > __netlink_dump_start+0x274/0x2a8
> > inet_diag_handler_cmd+0x224/0x274
> > sock_diag_rcv_msg+0x21c/0x230
> > netlink_rcv_skb+0xe0/0x1bc
> > sock_diag_rcv+0x34/0x48
> > netlink_unicast+0x3b4/0x430
> > netlink_sendmsg+0x4f0/0x574
> > sock_write_iter+0x18c/0x1f0
> > do_iter_readv_writev+0x230/0x2a8
> > do_iter_write+0xc8/0x2b4
> > vfs_writev+0xf8/0x184
> > do_writev+0xb0/0x1a8
> > __arm64_sys_writev+0x4c/0x5c
> > el0_svc_common+0x118/0x250
> > el0_svc_handler+0x3c/0x9c
> > el0_svc+0x8/0xc
>
> This looks related (reported 3 years ago!)
>
> https://lore.kernel.org/all/[email protected]/
Agree, seems related. Thanks for root causing it.
On Wed, Dec 08, 2021 at 04:54:34PM +0000, Lee Jones wrote:
> To prevent this from happening we need to take a reference on the
> to-be-used/dereferenced 'struct sctp_endpoint' until such a time when
> we know it can be safely released.
>
> When KASAN is not enabled, a similar, but slightly different NULL
> pointer derefernce crash occurs later along the thread of execution in
> inet_sctp_diag_fill() this time.
Hey Lee, did you try running lksctp-tools [1] func tests with this patch?
I'm getting failures here.
[root@vm1 func_tests]# make v4test
./test_assoc_abort
test_assoc_abort.c 1 PASS : ABORT an association using SCTP_ABORT
test_assoc_abort passes
./test_assoc_shutdown
test_assoc_shutdown.c 1 BROK : bind: Address already in use
DUMP_CORE ../../src/testlib/sctputil.h: 145
/bin/sh: line 1: 3727 Segmentation fault (core dumped) ./$a
test_assoc_shutdown fails
make: *** [Makefile:1648: v4test] Error 1
I didn't check it closely but it would seem that the ep is beind held
forever. If I simply retry after a few seconds, it's still there (now the 1st
test fails):
[root@vm1 func_tests]# make v4test
./test_assoc_abort
test_assoc_abort.c 1 BROK : bind: Address already in use
DUMP_CORE ../../src/testlib/sctputil.h: 145
/bin/sh: line 1: 3751 Segmentation fault (core dumped) ./$a
test_assoc_abort fails
1.https://github.com/sctp/lksctp-tools
Marcelo
On Thu, 09 Dec 2021, Marcelo Ricardo Leitner wrote:
> On Wed, Dec 08, 2021 at 04:54:34PM +0000, Lee Jones wrote:
> > To prevent this from happening we need to take a reference on the
> > to-be-used/dereferenced 'struct sctp_endpoint' until such a time when
> > we know it can be safely released.
> >
> > When KASAN is not enabled, a similar, but slightly different NULL
> > pointer derefernce crash occurs later along the thread of execution in
> > inet_sctp_diag_fill() this time.
>
> Hey Lee, did you try running lksctp-tools [1] func tests with this patch?
> I'm getting failures here.
>
> [root@vm1 func_tests]# make v4test
> ./test_assoc_abort
> test_assoc_abort.c 1 PASS : ABORT an association using SCTP_ABORT
> test_assoc_abort passes
>
> ./test_assoc_shutdown
> test_assoc_shutdown.c 1 BROK : bind: Address already in use
> DUMP_CORE ../../src/testlib/sctputil.h: 145
> /bin/sh: line 1: 3727 Segmentation fault (core dumped) ./$a
> test_assoc_shutdown fails
> make: *** [Makefile:1648: v4test] Error 1
>
> I didn't check it closely but it would seem that the ep is beind held
> forever. If I simply retry after a few seconds, it's still there (now the 1st
> test fails):
>
> [root@vm1 func_tests]# make v4test
> ./test_assoc_abort
> test_assoc_abort.c 1 BROK : bind: Address already in use
> DUMP_CORE ../../src/testlib/sctputil.h: 145
> /bin/sh: line 1: 3751 Segmentation fault (core dumped) ./$a
> test_assoc_abort fails
>
> 1.https://github.com/sctp/lksctp-tools
No I haven't, but I will do once I get a moment.
The only thing I can think of, before I go digging again, is that
either the association is never unhashed (so it stays in cache
forever - I doubt this, as it would be very bad) or the association
was migrated via sctp_assoc_migrate() and the additional reference was
not transferred across.
--
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog