2020-07-07 15:30:48

by syzbot

Subject: general protection fault in batadv_iv_ogm_schedule_buff (2)

Hello,

syzbot found the following crash on:

HEAD commit: 7cc2a8ea Merge tag 'block-5.8-2020-07-01' of git://git.ker..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=130b828f100000
kernel config: https://syzkaller.appspot.com/x/.config?x=7be693511b29b338
dashboard link: https://syzkaller.appspot.com/bug?extid=2eeeb5ad0766b57394d8
compiler: gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: [email protected]

general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
CPU: 1 PID: 9126 Comm: kworker/u4:9 Not tainted 5.8.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet
RIP: 0010:batadv_iv_ogm_schedule_buff+0xd1e/0x1410 net/batman-adv/bat_iv_ogm.c:843
Code: 80 3c 28 00 0f 85 ee 05 00 00 4d 8b 3f 49 81 ff e0 e9 4e 8d 0f 84 dd 02 00 00 e8 bd 80 ae f9 49 8d 7f 70 48 89 f8 48 c1 e8 03 <42> 80 3c 28 00 0f 85 af 06 00 00 48 8b 44 24 08 49 8b 6f 70 80 38
RSP: 0018:ffffc90004e97b98 EFLAGS: 00010202
RAX: 000000000000000e RBX: ffff8880a7471800 RCX: ffffffff87c5394d
RDX: ffff88804cf02380 RSI: ffffffff87c536a3 RDI: 0000000000000070
RBP: 0000000000077000 R08: 0000000000000001 R09: ffff8880a875a02b
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000007
R13: dffffc0000000000 R14: ffff888051ad4c40 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000400200 CR3: 0000000061cac000 CR4: 00000000001426e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
batadv_iv_ogm_schedule net/batman-adv/bat_iv_ogm.c:869 [inline]
batadv_iv_ogm_schedule net/batman-adv/bat_iv_ogm.c:862 [inline]
batadv_iv_send_outstanding_bat_ogm_packet+0x5c8/0x800 net/batman-adv/bat_iv_ogm.c:1722
process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
kthread+0x3b5/0x4a0 kernel/kthread.c:291
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
Modules linked in:
---[ end trace f5c5eda032070cd1 ]---
RIP: 0010:batadv_iv_ogm_schedule_buff+0xd1e/0x1410 net/batman-adv/bat_iv_ogm.c:843
Code: 80 3c 28 00 0f 85 ee 05 00 00 4d 8b 3f 49 81 ff e0 e9 4e 8d 0f 84 dd 02 00 00 e8 bd 80 ae f9 49 8d 7f 70 48 89 f8 48 c1 e8 03 <42> 80 3c 28 00 0f 85 af 06 00 00 48 8b 44 24 08 49 8b 6f 70 80 38
RSP: 0018:ffffc90004e97b98 EFLAGS: 00010202
RAX: 000000000000000e RBX: ffff8880a7471800 RCX: ffffffff87c5394d
RDX: ffff88804cf02380 RSI: ffffffff87c536a3 RDI: 0000000000000070
RBP: 0000000000077000 R08: 0000000000000001 R09: ffff8880a875a02b
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000007
R13: dffffc0000000000 R14: ffff888051ad4c40 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000400200 CR3: 000000009480d000 CR4: 00000000001426e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


2020-07-07 19:54:50

by Sven Eckelmann

Subject: Re: general protection fault in batadv_iv_ogm_schedule_buff (2)

On Tuesday, 7 July 2020 17:30:14 CEST syzbot wrote:
> general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
> CPU: 1 PID: 9126 Comm: kworker/u4:9 Not tainted 5.8.0-rc3-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet
> RIP: 0010:batadv_iv_ogm_schedule_buff+0xd1e/0x1410 net/batman-adv/bat_iv_ogm.c:843

This seems to be the following lines:

838 /* OGMs from primary interfaces are scheduled on all
839 * interfaces.
840 */
841 rcu_read_lock();
842 list_for_each_entry_rcu(tmp_hard_iface, &batadv_hardif_list, list) {
843 if (tmp_hard_iface->soft_iface != hard_iface->soft_iface)
844 continue;

If I understand it correctly, tmp_hard_iface is NULL, and accessing its
soft_iface member (offset 0x70 on amd64) then triggers the fault. But neither
batadv_hardif_list itself nor any entry inside the list should ever point to NULL.
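
As a sanity check on this reading of the report, the faulting address can be
reproduced from KASAN's shadow-address formula on x86-64 (shadow = (addr >> 3) +
0xdffffc0000000000). The sketch below is plain userspace C, not kernel code. The
function names are made up for illustration, and the register values are copied
from the dump above (R15 = 0, RDI = 0x70, RAX = 0xe, R13 = shadow offset).

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 KASAN: one shadow byte describes 8 bytes of real memory. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL

/* Address of the shadow byte that is checked before accessing addr. */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Recompute the values seen in the register dump (illustrative helper). */
static void check_against_report(void)
{
	uint64_t tmp_hard_iface = 0;            /* R15: the list entry pointer */
	uint64_t field = tmp_hard_iface + 0x70; /* RDI: &tmp_hard_iface->soft_iface */

	/* RAX holds the shifted address just before the faulting instruction. */
	assert((field >> KASAN_SHADOW_SCALE_SHIFT) == 0xe);
	/* The shadow lookup itself hits the reported non-canonical address. */
	assert(kasan_shadow_addr(field) == 0xdffffc000000000eULL);
}
```

Calling check_against_report() from a main() passes silently. So the general
protection fault is raised by KASAN's shadow lookup for address 0x70, which
matches the "null-ptr-deref in range [0x70-0x77]" line in the report.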

I've just gone through all the code which accesses the list:

* bat_iv_ogm.c 839,7 @@ static void batadv_iv_ogm_schedule_buff
* bat_iv_ogm.c 1606,7 @@ static void batadv_iv_ogm_process
* bat_iv_ogm.c 1671,7 @@ static void batadv_iv_ogm_process
* bat_iv_ogm.c 2144,7 @@ static void batadv_iv_neigh_print
* bat_iv_ogm.c 2313,8 @@ batadv_iv_ogm_neigh_dump
* bat_v.c 188,7 @@ static void batadv_v_neigh_print
* bat_v.c 315,7 @@ batadv_v_neigh_dump
* bat_v_elp.c 425,7 @@ void batadv_v_elp_primary_iface_set
* bat_v_ogm.c 298,7 @@ static void batadv_v_ogm_send_softif
* bat_v_ogm.c 923,7 @@ static void batadv_v_ogm_process
* hard-interface.c 68,7 @@ batadv_hardif_get_by_netdev
* hard-interface.c 431,7 @@ batadv_hardif_get_active
* hard-interface.c 501,7 @@ static void batadv_check_known_mac_addr
* hard-interface.c 533,7 @@ static void batadv_hardif_recalc_extra_skbroom
* hard-interface.c 572,7 @@ int batadv_hardif_min_mtu
* hard-interface.c 829,7 @@ static size_t batadv_hardif_cnt
* main.c 290,7 @@ bool batadv_is_my_mac
* netlink.c 991,7 @@ batadv_netlink_dump_hardif
* originator.c 1301,7 @@ static bool batadv_purge_orig_node
* send.c 882,7 @@ static void batadv_send_outstanding_bcast_packet
* soft-interface.c 1141,7 @@ static void batadv_softif_destroy_netlink



and all the code which adds to the list or initializes parts of the list:

* hard-interface.c 927,7 @@ batadv_hardif_add_interface

- should be under rtnl_lock

* hard-interface.c 945,7 @@ batadv_hardif_add_interface

- should be under rtnl_lock

* hard-interface.c 99,7 @@ static int __init batadv_init

- this is done to initialize the list head before the rest of the code is
initialized


and all the code which removes entries from the list:

* hard-interface.c 985,8 @@ void batadv_hardif_remove_interfaces

- is under rtnl_lock
- there should also be nothing in this list because unregister_netdevice_notifier
will trigger a NETDEV_UNREGISTER of these devices

* hard-interface.c 1048,7 @@ static int batadv_hard_if_event

- should be under rtnl_lock



The batadv_hard_iface is only kfree_rcu'ed by batadv_hardif_release when the
reference counter is zero. The reference counter is increased in:

* bat_iv_ogm.c 843,20 @@ static void batadv_iv_ogm_schedule_buff
* bat_iv_ogm.c 1678,13 @@ static void batadv_iv_ogm_process
* bat_v_ogm.c 302,7 @@ static void batadv_v_ogm_send_softif
* bat_v_ogm.c 930,7 @@ static void batadv_v_ogm_process
* hard-interface.c 70,7 @@ batadv_hardif_get_by_netdev
* hard-interface.c 436,7 @@ batadv_hardif_get_active
* hard-interface.c 471,7 @@ static void batadv_primary_if_select
* hard-interface.c 720,7 @@ int batadv_hardif_enable_interface
* hard-interface.c 765,7 @@ int batadv_hardif_enable_interface
* hard-interface.c 932,7 @@ batadv_hardif_add_interface
* hard-interface.c 944,7 @@ batadv_hardif_add_interface
* hard-interface.h 133,7 @@ batadv_primary_if_get_selected
* main.c 460,7 @@ int batadv_batman_skb_recv
* originator.c 413,7 @@ batadv_orig_ifinfo_new
* originator.c 491,7 @@ batadv_neigh_ifinfo_new
* originator.c 570,7 @@ batadv_hardif_neigh_create
* originator.c 682,7 @@ batadv_neigh_node_create
* originator.c 1308,7 @@ static bool batadv_purge_orig_node
* send.c 527,10 @@ batadv_forw_packet_alloc
* send.c 932,7 @@ static void batadv_send_outstanding_bcast_packet


and decreased in:

* bat_iv_ogm.c 518,7 @@ batadv_iv_ogm_can_aggregate
* bat_iv_ogm.c 843,20 @@ static void batadv_iv_ogm_schedule_buff
* bat_iv_ogm.c 1678,13 @@ static void batadv_iv_ogm_process
* bat_v.c 51,7 @@ static void batadv_v_iface_activate
* bat_v.c 108,7 @@ static void batadv_v_iface_update_mac
* bat_v_elp.c 540,7 @@ int batadv_v_elp_packet_recv
* bat_v_ogm.c 326,7 @@ static void batadv_v_ogm_send_softif
* bat_v_ogm.c 340,12 @@ static void batadv_v_ogm_send_softif
* bat_v_ogm.c 958,7 @@ static void batadv_v_ogm_process
* bat_v_ogm.c 966,7 @@ static void batadv_v_ogm_process
* bridge_loop_avoidance.c 440,7 @@ static void batadv_bla_send_claim
* bridge_loop_avoidance.c 1405,7 @@ void batadv_bla_status_update
* bridge_loop_avoidance.c 1499,7 @@ static void batadv_bla_periodic_work
* bridge_loop_avoidance.c 1538,7 @@ int batadv_bla_init
* bridge_loop_avoidance.c 1746,7 @@ void batadv_bla_free
* bridge_loop_avoidance.c 1910,7 @@ bool batadv_bla_rx
* bridge_loop_avoidance.c 2017,7 @@ bool batadv_bla_tx
* bridge_loop_avoidance.c 2081,7 @@ int batadv_bla_claim_table_seq_print_text
* bridge_loop_avoidance.c 2248,7 @@ int batadv_bla_claim_dump
* bridge_loop_avoidance.c 2317,7 @@ int batadv_bla_backbone_table_seq_print_text
* bridge_loop_avoidance.c 2486,7 @@ int batadv_bla_backbone_dump
* bridge_loop_avoidance.c 2538,7 @@ bool batadv_bla_check_claim
* distributed-arp-table.c 891,7 @@ int batadv_dat_cache_seq_print_text
* distributed-arp-table.c 1037,7 @@ int batadv_dat_cache_dump
* fragmentation.c 540,7 @@ int batadv_frag_send_packet
* gateway_client.c 535,7 @@ int batadv_gw_client_seq_print_text
* gateway_client.c 595,7 @@ int batadv_gw_dump
* hard-interface.c 239,7 @@ static struct net_device *batadv_get_real_netdevice
* hard-interface.c 460,7 @@ static void batadv_primary_if_update_addr
* hard-interface.c 484,7 @@ static void batadv_primary_if_select
* hard-interface.c 657,7 @@ batadv_hardif_activate_interface
* hard-interface.c 809,7 @@ int batadv_hardif_enable_interface
* hard-interface.c 860,7 @@ void batadv_hardif_disable_interface
* hard-interface.c 870,7 @@ void batadv_hardif_disable_interface
* hard-interface.c 893,11 @@ void batadv_hardif_disable_interface
* hard-interface.c 973,7 @@ static void batadv_hardif_remove_interface
* hard-interface.c 1086,10 @@ static int batadv_hard_if_event
* icmp_socket.c 278,7 @@ static ssize_t batadv_socket_write
* main.c 336,7 @@ batadv_seq_print_text_primary_if_get
* main.c 504,7 @@ int batadv_batman_skb_recv
* main.c 515,7 @@ int batadv_batman_skb_recv
* multicast.c 2152,7 @@ int batadv_mcast_flags_seq_print_text
* multicast.c 2361,7 @@ batadv_mcast_netlink_get_primary
* multicast.c 2389,7 @@ int batadv_mcast_flags_dump
* netlink.c 359,14 @@ static int batadv_netlink_mesh_fill
* netlink.c 1217,7 @@ batadv_get_hardif_from_info
* netlink.c 1336,7 @@ static void batadv_post_doit
* network-coding.c 1937,7 @@ int batadv_nc_nodes_seq_print_text
* originator.c 239,7 @@ static void batadv_neigh_ifinfo_release
* originator.c 270,7 @@ static void batadv_hardif_neigh_release
* originator.c 304,7 @@ static void batadv_neigh_node_release
* originator.c 756,7 @@ int batadv_hardif_neigh_seq_print_text
* originator.c 835,11 @@ int batadv_hardif_neigh_dump
* originator.c 859,7 @@ static void batadv_orig_ifinfo_release
* originator.c 1319,7 @@ static bool batadv_purge_orig_node
* originator.c 1406,7 @@ int batadv_orig_seq_print_text
* originator.c 1461,7 @@ int batadv_orig_hardif_seq_print_text
* originator.c 1532,11 @@ int batadv_orig_dump
* routing.c 280,7 @@ static int batadv_recv_my_icmp_packet
* routing.c 335,7 @@ static int batadv_recv_icmp_ttl_exceeded
* routing.c 796,7 @@ batadv_reroute_unicast_packet
* routing.c 907,7 @@ static bool batadv_check_unicast_ttvn
* send.c 310,7 @@ bool batadv_send_skb_prepare_unicast_4addr
* send.c 475,9 @@ void batadv_forw_packet_free
* send.c 767,14 @@ int batadv_add_bcast_packet_to_list
* send.c 932,7 @@ static void batadv_send_outstanding_bcast_packet
* soft-interface.c 395,7 @@ static netdev_tx_t batadv_interface_tx
* soft-interface.c 893,7 @@ static int batadv_softif_slave_add
* soft-interface.c 920,7 @@ static int batadv_softif_slave_del
* sysfs.c 282,7 @@ ssize_t batadv_store_##_name
* sysfs.c 301,7 @@ ssize_t batadv_show_##_name
* sysfs.c 959,7 @@ static ssize_t batadv_show_mesh_iface
* sysfs.c 1013,7 @@ static int batadv_store_mesh_iface_finish
* sysfs.c 1110,7 @@ static ssize_t batadv_show_iface_status
* sysfs.c 1170,7 @@ static ssize_t batadv_store_throughput_override
* sysfs.c 1190,7 @@ static ssize_t batadv_show_throughput_override
* tp_meter.c 748,7 @@ static void batadv_tp_recv_ack
* tp_meter.c 882,7 @@ static int batadv_tp_send
* tp_meter.c 1207,7 @@ static int batadv_tp_send_ack
* translation-table.c 820,7 @@ bool batadv_tt_local_add
* translation-table.c 1135,7 @@ int batadv_tt_local_seq_print_text
* translation-table.c 1293,7 @@ int batadv_tt_local_dump
* translation-table.c 2007,7 @@ int batadv_tt_global_seq_print_text
* translation-table.c 2214,7 @@ int batadv_tt_global_dump
* translation-table.c 3198,7 @@ static bool batadv_send_tt_request
* translation-table.c 3461,7 @@ static bool batadv_send_my_tt_response
* translation-table.c 3785,7 @@ static void batadv_send_roam_adv
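
The lifetime rule described above (the object is only freed once its reference
counter has dropped to zero, and a reader walking the list must not revive an
object that is already at zero) can be modelled in userspace roughly as follows.
This is only a sketch using C11 atomics. The struct and function names are
hypothetical and are not the batman-adv or kref API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted hard interface. */
struct hardif_sketch {
	atomic_uint refcount;
};

/*
 * Take a reference only if the object is still alive.
 * Returns false when the counter already reached zero, in which case
 * the caller has to skip the entry instead of using it.
 */
static bool hardif_get_unless_zero(struct hardif_sketch *h)
{
	unsigned int old = atomic_load(&h->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&h->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Drop a reference. Returns true for the caller that released the last
 * one; only that caller may schedule the (deferred) free.
 */
static bool hardif_put(struct hardif_sketch *h)
{
	return atomic_fetch_sub(&h->refcount, 1) == 1;
}
```

With this pattern, an unbalanced put (one more decrease than increases) would
let the counter hit zero while the entry is still linked in the list, which is
the kind of mismatch the increase/decrease lists above are meant to help find.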


Btw., we can most likely ignore everything related to bat_v* because the crash
happened in bat_iv. So if anybody else spots something which I've missed, please
let me know.

Kind regards,
Sven


Attachments:
refcnt-hardif.diff (36.17 kB)