2022-05-28 19:50:57

by Duoming Zhou

Subject: Re: [PATCH v2] ax25: Fix ax25 session cleanup problem in ax25_release

Hello,

On Fri, 27 May 2022 23:18:32 +0800 Duoming wrote:

> The timers of ax25 are used for correct session cleanup.
> If we use ax25_release() to close ax25 sessions and
> ax25_dev is not null, the del_timer_sync() functions in
> ax25_release() will execute. As a result, the sessions
> could not be cleaned up correctly, because the timers
> have stopped.
>
> This patch adds a device_up flag in ax25_dev in order to
> judge whether the device is up. If there are sessions to
> be cleaned up, the del_timer_sync() in ax25_release() will
> not execute. What's more, we add ax25_cb_del() in
> ax25_kill_by_device(), because the timers have been stopped
> and there are no functions that could delete ax25_cb if we
> do not call ax25_release().
>
> Fixes: 82e31755e55f ("ax25: Fix UAF bugs in ax25 timers")
> Reported-and-tested-by: Thomas Osterried <[email protected]>
> Signed-off-by: Duoming Zhou <[email protected]>
> ---
> Changes in v2:
> - Add ax25_cb_del() in ax25_kill_by_device().
>
> include/net/ax25.h | 1 +
> net/ax25/af_ax25.c | 15 ++++++++++-----
> net/ax25/ax25_dev.c | 1 +
> 3 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/include/net/ax25.h b/include/net/ax25.h
> index 0f9790c455b..a427a05672e 100644
> --- a/include/net/ax25.h
> +++ b/include/net/ax25.h
> @@ -228,6 +228,7 @@ typedef struct ax25_dev {
> ax25_dama_info dama;
> #endif
> refcount_t refcount;
> + bool device_up;
> } ax25_dev;
>
> typedef struct ax25_cb {
> diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
> index 363d47f9453..92cbb08a6c5 100644
> --- a/net/ax25/af_ax25.c
> +++ b/net/ax25/af_ax25.c
> @@ -81,6 +81,7 @@ static void ax25_kill_by_device(struct net_device *dev)
>
> if ((ax25_dev = ax25_dev_ax25dev(dev)) == NULL)
> return;
> + ax25_dev->device_up = false;
>
> spin_lock_bh(&ax25_list_lock);
> again:
> @@ -91,6 +92,7 @@ static void ax25_kill_by_device(struct net_device *dev)
> spin_unlock_bh(&ax25_list_lock);
> ax25_disconnect(s, ENETUNREACH);
> s->ax25_dev = NULL;
> + ax25_cb_del(s);
> spin_lock_bh(&ax25_list_lock);
> goto again;
> }
> @@ -104,6 +106,7 @@ static void ax25_kill_by_device(struct net_device *dev)
> ax25_dev_put(ax25_dev);
> }
> release_sock(sk);
> + ax25_cb_del(s);
> spin_lock_bh(&ax25_list_lock);
> sock_put(sk);
> /* The entry could have been deleted from the

This patch has a "refcount_t: underflow" problem. The call trace is shown below:

refcount_t: underflow; use-after-free.
WARNING: CPU: 1 PID: 15997 at lib/refcount.c:28 refcount_warn_saturate+0xc5/0x110
RIP: 0010:refcount_warn_saturate+0xc5/0x110
Code: 1b e0 d6 02 01 e8 46 82 1d 01 0f 0b eb 99 80 3d 08 e0 d6 02 00 75 90 48 c7 c7 80 87
RSP: 0018:ffff88800ab37db0 EFLAGS: 00000286
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffed1001566fa8
RBP: ffff88800a3bb410 R08: ffffffff810ffe2f R09: ffff88800ab37a37
R10: ffffed1001566f46 R11: 0000000000000001 R12: ffff888008960000
R13: ffff88800953f2c0 R14: ffff888006500018 R15: ffff888008960080
FS: 00007f46981f3700(0000) GS:ffff88806c600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000042270a CR3: 0000000009b64000 CR4: 00000000000006e0
Call Trace:
<TASK>
__sk_destruct+0x2c/0x350
ax25_release+0x34e/0x4a0
__sock_release+0x6d/0x120
sock_close+0xf/0x20
__fput+0x10e/0x410
task_work_run+0x86/0xc0
exit_to_user_mode_prepare+0x194/0x1a0
syscall_exit_to_user_mode+0x19/0x50
do_syscall_64+0x48/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae

The race condition is shown below:

(Thread 1)                       | (Thread 2)
ax25_create()                    |
refcount_set(&ax25->refcount, 1) |
ax25_bind()                      |
ax25_cb_add()                    |
ax25_cb_hold(ax25) //refcnt = 2  |
ax25_kill_by_device()            | ax25_release()
...                              | ...
release_sock();                  |
// no locks protect ax25_cb_del  | lock_sock()
ax25_cb_del()                    | ax25_destroy_socket()
if (!hlist_unhashed(..))         | ax25_cb_del()
...                              | if (!hlist_unhashed(..))
hlist_del_init()                 |
ax25_cb_put(ax25) //refcnt = 1   | ...
                                 | ax25_cb_put(ax25) //refcnt = 0
                                 | ...
                                 | sock_put(sk)
                                 | sk_free()
                                 | sk_destruct()
                                 | __sk_destruct()
                                 | ax25_free_sock()
                                 | ax25_cb_put(ax25) // refcount_t: underflow!

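Calling ax25_cb_del() inside the lock_sock() section solves this problem.
ax25_release() also calls ax25_cb_del() under lock_sock(), so the two
calls are serialized. Once one caller has removed the ax25 node from the
hlist, the check in ax25_cb_del() fails for the other caller, and
ax25_cb_put() is not called a second time.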

	if (!hlist_unhashed(&ax25->ax25_node)) {	//check
		spin_lock_bh(&ax25_list_lock);
		hlist_del_init(&ax25->ax25_node);	//delete ax25 node
		spin_unlock_bh(&ax25_list_lock);
		ax25_cb_put(ax25);
	}

The change that passed my test is:

@@ -103,6 +105,7 @@ static void ax25_kill_by_device(struct net_device *dev)
 				dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker);
 				ax25_dev_put(ax25_dev);
 			}
+			ax25_cb_del(s);
 			release_sock(sk);
 			spin_lock_bh(&ax25_list_lock);
 			sock_put(sk);
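
For illustration only, the refcount arithmetic of the race and of the
serialized case can be replayed in a small user-space model. This is a
sketch, not kernel code: struct model_cb and all model_* and replay_*
names are hypothetical, a plain int stands in for refcount_t, and the
interleaving is replayed in one thread by sampling the "hashed" check
for both callers before either one unhashes the node.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model, not kernel code. A plain int stands in
 * for refcount_t, so the underflow shows up as -1 instead of a warning. */
struct model_cb {
	int refcount;	/* models ax25->refcount */
	bool hashed;	/* models !hlist_unhashed(&ax25->ax25_node) */
};

/* Models ax25_create() followed by ax25_cb_add(): refcount ends at 2. */
static struct model_cb model_create_and_add(void)
{
	struct model_cb cb = { .refcount = 1, .hashed = false };

	cb.hashed = true;
	cb.refcount++;
	return cb;
}

/* Models the body of ax25_cb_del() after the check has been sampled. */
static void model_del(struct model_cb *cb, bool saw_hashed)
{
	if (saw_hashed) {
		cb->hashed = false;
		cb->refcount--;
	}
}

/* Unserialized: both callers sample the check before either unhashes. */
static int replay_race(void)
{
	struct model_cb cb = model_create_and_add();
	bool t1_saw = cb.hashed;	/* ax25_kill_by_device() side */
	bool t2_saw = cb.hashed;	/* ax25_release() side */

	model_del(&cb, t1_saw);		/* refcount = 1 */
	model_del(&cb, t2_saw);		/* refcount = 0 */
	cb.refcount--;			/* models ax25_free_sock(): -1 */
	return cb.refcount;
}

/* Serialized: the second caller samples the check after the first is done. */
static int replay_serialized(void)
{
	struct model_cb cb = model_create_and_add();

	model_del(&cb, cb.hashed);	/* refcount = 1, node unhashed */
	model_del(&cb, cb.hashed);	/* check fails, no second put */
	cb.refcount--;			/* models ax25_free_sock(): 0 */
	return cb.refcount;
}

/* Sanity checks for the two replays. */
static void model_selfcheck(void)
{
	assert(replay_race() == -1);
	assert(replay_serialized() == 0);
}
```

Calling model_selfcheck() from a test program exercises both replays.
The model ignores that the real ax25_cb is freed when the count reaches
zero. It only shows why a second ax25_cb_put() from ax25_cb_del() leaves
one put too many for ax25_free_sock().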

Best regards,
Duoming Zhou