If batadv_v_ogm_init encounters a memory failure, the fields ogm_wq,
ogm_buff_mutex and etc. are not initialized. Then the control flow goes
to batadv_v_ogm_free due to the error handling code. As a result, the
API invocation "cancel_delayed_work_sync" and "mutex_lock" will cause
many issues. The crashing stack trace is as follows:
Call Trace:
__cancel_work_timer+0x1c9/0x280 kernel/workqueue.c:3170
batadv_v_ogm_free+0x1d/0x50 net/batman-adv/bat_v_ogm.c:1076
batadv_mesh_free+0x35/0xa0 net/batman-adv/main.c:244
batadv_mesh_init+0x22a/0x240 net/batman-adv/main.c:226
batadv_softif_init_late+0x1ad/0x240 net/batman-adv/soft-interface.c:804
register_netdevice+0x15d/0x810 net/core/dev.c:10229
Fixes: a8d23cbbf6c9 ("batman-adv: Avoid free/alloc race when handling OGM2 buffer")
Fixes: 0da0035942d4 ("batman-adv: OGMv2 - add basic infrastructure")
Signed-off-by: Dongliang Mu <[email protected]>
---
net/batman-adv/bat_v_ogm.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c
index 1d750f3cb2e4..2f3ecbcec58d 100644
--- a/net/batman-adv/bat_v_ogm.c
+++ b/net/batman-adv/bat_v_ogm.c
@@ -1044,6 +1044,13 @@ int batadv_v_ogm_init(struct batadv_priv *bat_priv)
unsigned char *ogm_buff;
u32 random_seqno;
+ /* randomize initial seqno to avoid collision */
+ get_random_bytes(&random_seqno, sizeof(random_seqno));
+ atomic_set(&bat_priv->bat_v.ogm_seqno, random_seqno);
+ INIT_DELAYED_WORK(&bat_priv->bat_v.ogm_wq, batadv_v_ogm_send);
+
+ mutex_init(&bat_priv->bat_v.ogm_buff_mutex);
+
bat_priv->bat_v.ogm_buff_len = BATADV_OGM2_HLEN;
ogm_buff = kzalloc(bat_priv->bat_v.ogm_buff_len, GFP_ATOMIC);
if (!ogm_buff)
@@ -1057,13 +1064,6 @@ int batadv_v_ogm_init(struct batadv_priv *bat_priv)
ogm_packet->flags = BATADV_NO_FLAGS;
ogm_packet->throughput = htonl(BATADV_THROUGHPUT_MAX_VALUE);
- /* randomize initial seqno to avoid collision */
- get_random_bytes(&random_seqno, sizeof(random_seqno));
- atomic_set(&bat_priv->bat_v.ogm_seqno, random_seqno);
- INIT_DELAYED_WORK(&bat_priv->bat_v.ogm_wq, batadv_v_ogm_send);
-
- mutex_init(&bat_priv->bat_v.ogm_buff_mutex);
-
return 0;
}
--
2.25.1
On Monday, 1 November 2021 05:01:02 CET Dongliang Mu wrote:
> Call Trace:
> __cancel_work_timer+0x1c9/0x280 kernel/workqueue.c:3170
> batadv_v_ogm_free+0x1d/0x50 net/batman-adv/bat_v_ogm.c:1076
> batadv_mesh_free+0x35/0xa0 net/batman-adv/main.c:244
> batadv_mesh_init+0x22a/0x240 net/batman-adv/main.c:226
> batadv_softif_init_late+0x1ad/0x240 net/batman-adv/soft-interface.c:804
> register_netdevice+0x15d/0x810 net/core/dev.c:10229
This is definitely not a backtrace of the current code and its error handling.
Please check the current code [1] and explain the situation against this
version.
Kind regards,
Sven
[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/net/batman-adv/main.c?id=ae0393500e3b0139210749d52d22b29002c20e16#n237
On Mon, Nov 1, 2021 at 3:46 PM Sven Eckelmann <[email protected]> wrote:
>
> On Monday, 1 November 2021 05:01:02 CET Dongliang Mu wrote:
> > Call Trace:
> > __cancel_work_timer+0x1c9/0x280 kernel/workqueue.c:3170
> > batadv_v_ogm_free+0x1d/0x50 net/batman-adv/bat_v_ogm.c:1076
> > batadv_mesh_free+0x35/0xa0 net/batman-adv/main.c:244
> > batadv_mesh_init+0x22a/0x240 net/batman-adv/main.c:226
> > batadv_softif_init_late+0x1ad/0x240 net/batman-adv/soft-interface.c:804
> > register_netdevice+0x15d/0x810 net/core/dev.c:10229
>
> This is definitely not a backtrace of the current code and its error handling.
> Please check the current code [1] and explain the situation against this
> version.
Yes, you're right. The error handling code in the upstream is not
prone to this bug.
My local syzkaller instance is fuzzing on 5.14-rc5
>
> Kind regards,
> Sven
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/net/batman-adv/main.c?id=ae0393500e3b0139210749d52d22b29002c20e16#n237