2022-12-14 02:32:26

by Jun Nie

Subject: [PATCH net v2] net: sched: ematch: reject invalid data

syzbot reported the bug below. Fix it by refusing to compare when the
data is invalid.

general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 PID: 6 Comm: kworker/0:0 Not tainted 5.15.77-syzkaller-00764-g7048384c9872 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Workqueue: wg-crypt-wg2 wg_packet_tx_worker
RIP: 0010:em_cmp_match+0x4e/0x5f0 net/sched/em_cmp.c:25
Call Trace:
<TASK>
tcf_em_match net/sched/ematch.c:492 [inline]
__tcf_em_tree_match+0x194/0x720 net/sched/ematch.c:518
tcf_em_tree_match include/net/pkt_cls.h:463 [inline]
basic_classify+0xd8/0x250 net/sched/cls_basic.c:48
__tcf_classify net/sched/cls_api.c:1549 [inline]
tcf_classify+0x161/0x430 net/sched/cls_api.c:1589
prio_classify net/sched/sch_prio.c:42 [inline]
prio_enqueue+0x1d3/0x6a0 net/sched/sch_prio.c:75
dev_qdisc_enqueue net/core/dev.c:3792 [inline]
__dev_xmit_skb+0x35c/0x1650 net/core/dev.c:3876
__dev_queue_xmit+0x8f3/0x1b50 net/core/dev.c:4193
dev_queue_xmit+0x17/0x20 net/core/dev.c:4261
neigh_hh_output include/net/neighbour.h:508 [inline]
neigh_output include/net/neighbour.h:522 [inline]
ip_finish_output2+0xc0f/0xf00 net/ipv4/ip_output.c:228
__ip_finish_output+0x163/0x370
ip_finish_output+0x20b/0x220 net/ipv4/ip_output.c:316
NF_HOOK_COND include/linux/netfilter.h:299 [inline]
ip_output+0x1e9/0x410 net/ipv4/ip_output.c:430
dst_output include/net/dst.h:450 [inline]
ip_local_out+0x92/0xb0 net/ipv4/ip_output.c:126
iptunnel_xmit+0x4a2/0x890 net/ipv4/ip_tunnel_core.c:82
udp_tunnel_xmit_skb+0x1b6/0x2c0 net/ipv4/udp_tunnel_core.c:175
send4+0x78d/0xd20 drivers/net/wireguard/socket.c:85
wg_socket_send_skb_to_peer+0xd5/0x1d0 drivers/net/wireguard/socket.c:175
wg_packet_create_data_done drivers/net/wireguard/send.c:251 [inline]
wg_packet_tx_worker+0x202/0x560 drivers/net/wireguard/send.c:276
process_one_work+0x6db/0xc00 kernel/workqueue.c:2313
worker_thread+0xb3e/0x1340 kernel/workqueue.c:2460
kthread+0x41c/0x500 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298

Reported-by: [email protected]
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Signed-off-by: Jun Nie <[email protected]>
---
net/sched/em_cmp.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/sched/em_cmp.c b/net/sched/em_cmp.c
index f17b049ea530..0284394be53f 100644
--- a/net/sched/em_cmp.c
+++ b/net/sched/em_cmp.c
@@ -22,9 +22,14 @@ static int em_cmp_match(struct sk_buff *skb, struct tcf_ematch *em,
struct tcf_pkt_info *info)
{
struct tcf_em_cmp *cmp = (struct tcf_em_cmp *) em->data;
- unsigned char *ptr = tcf_get_base_ptr(skb, cmp->layer) + cmp->off;
+ unsigned char *ptr;
u32 val = 0;

+ if (!cmp)
+ return 0;
+
+ ptr = tcf_get_base_ptr(skb, cmp->layer) + cmp->off;
+
if (!tcf_valid_offset(skb, ptr, cmp->align))
return 0;

--
2.34.1


2022-12-15 13:08:33

by Paolo Abeni

Subject: Re: [PATCH net v2] net: sched: ematch: reject invalid data

On Wed, 2022-12-14 at 10:20 +0800, Jun Nie wrote:
> syzbot reported the bug below. Fix it by refusing to compare when the
> data is invalid.
>
> general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
> CPU: 0 PID: 6 Comm: kworker/0:0 Not tainted 5.15.77-syzkaller-00764-g7048384c9872 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
> Workqueue: wg-crypt-wg2 wg_packet_tx_worker
> RIP: 0010:em_cmp_match+0x4e/0x5f0 net/sched/em_cmp.c:25
> Call Trace:
> <TASK>
> tcf_em_match net/sched/ematch.c:492 [inline]
> __tcf_em_tree_match+0x194/0x720 net/sched/ematch.c:518
> tcf_em_tree_match include/net/pkt_cls.h:463 [inline]
> basic_classify+0xd8/0x250 net/sched/cls_basic.c:48
> __tcf_classify net/sched/cls_api.c:1549 [inline]
> tcf_classify+0x161/0x430 net/sched/cls_api.c:1589
> prio_classify net/sched/sch_prio.c:42 [inline]
> prio_enqueue+0x1d3/0x6a0 net/sched/sch_prio.c:75
> dev_qdisc_enqueue net/core/dev.c:3792 [inline]
> __dev_xmit_skb+0x35c/0x1650 net/core/dev.c:3876
> __dev_queue_xmit+0x8f3/0x1b50 net/core/dev.c:4193
> dev_queue_xmit+0x17/0x20 net/core/dev.c:4261
> neigh_hh_output include/net/neighbour.h:508 [inline]
> neigh_output include/net/neighbour.h:522 [inline]
> ip_finish_output2+0xc0f/0xf00 net/ipv4/ip_output.c:228
> __ip_finish_output+0x163/0x370
> ip_finish_output+0x20b/0x220 net/ipv4/ip_output.c:316
> NF_HOOK_COND include/linux/netfilter.h:299 [inline]
> ip_output+0x1e9/0x410 net/ipv4/ip_output.c:430
> dst_output include/net/dst.h:450 [inline]
> ip_local_out+0x92/0xb0 net/ipv4/ip_output.c:126
> iptunnel_xmit+0x4a2/0x890 net/ipv4/ip_tunnel_core.c:82
> udp_tunnel_xmit_skb+0x1b6/0x2c0 net/ipv4/udp_tunnel_core.c:175
> send4+0x78d/0xd20 drivers/net/wireguard/socket.c:85
> wg_socket_send_skb_to_peer+0xd5/0x1d0 drivers/net/wireguard/socket.c:175
> wg_packet_create_data_done drivers/net/wireguard/send.c:251 [inline]
> wg_packet_tx_worker+0x202/0x560 drivers/net/wireguard/send.c:276
> process_one_work+0x6db/0xc00 kernel/workqueue.c:2313
> worker_thread+0xb3e/0x1340 kernel/workqueue.c:2460
> kthread+0x41c/0x500 kernel/kthread.c:319
> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
>
> Reported-by: [email protected]
> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")

Very likely this is not the correct fixes tag.

> Signed-off-by: Jun Nie <[email protected]>
> ---
> net/sched/em_cmp.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/net/sched/em_cmp.c b/net/sched/em_cmp.c
> index f17b049ea530..0284394be53f 100644
> --- a/net/sched/em_cmp.c
> +++ b/net/sched/em_cmp.c
> @@ -22,9 +22,14 @@ static int em_cmp_match(struct sk_buff *skb, struct tcf_ematch *em,
> struct tcf_pkt_info *info)
> {
> struct tcf_em_cmp *cmp = (struct tcf_em_cmp *) em->data;
> - unsigned char *ptr = tcf_get_base_ptr(skb, cmp->layer) + cmp->off;
> + unsigned char *ptr;
> u32 val = 0;
>
> + if (!cmp)
> + return 0;

It feels like this is papering over the real issue. Why is em->data
NULL here? Why are other ematches not affected by this issue?

Is em->data really NULL, or some small value instead? KASAN seems to
say it's a small value, not 0, so this patch should not avoid the
oops. Have you tested it against the reproducer?

Thanks,

Paolo

2022-12-15 14:46:30

by Jun Nie

Subject: Re: [PATCH net v2] net: sched: ematch: reject invalid data

On Thu, Dec 15, 2022 at 20:50, Paolo Abeni <[email protected]> wrote:
>
> On Wed, 2022-12-14 at 10:20 +0800, Jun Nie wrote:
> > syzbot reported the bug below. Fix it by refusing to compare when the
> > data is invalid.
> >
> > general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
> > KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
> > CPU: 0 PID: 6 Comm: kworker/0:0 Not tainted 5.15.77-syzkaller-00764-g7048384c9872 #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
> > Workqueue: wg-crypt-wg2 wg_packet_tx_worker
> > RIP: 0010:em_cmp_match+0x4e/0x5f0 net/sched/em_cmp.c:25
> > Call Trace:
> > <TASK>
> > tcf_em_match net/sched/ematch.c:492 [inline]
> > __tcf_em_tree_match+0x194/0x720 net/sched/ematch.c:518
> > tcf_em_tree_match include/net/pkt_cls.h:463 [inline]
> > basic_classify+0xd8/0x250 net/sched/cls_basic.c:48
> > __tcf_classify net/sched/cls_api.c:1549 [inline]
> > tcf_classify+0x161/0x430 net/sched/cls_api.c:1589
> > prio_classify net/sched/sch_prio.c:42 [inline]
> > prio_enqueue+0x1d3/0x6a0 net/sched/sch_prio.c:75
> > dev_qdisc_enqueue net/core/dev.c:3792 [inline]
> > __dev_xmit_skb+0x35c/0x1650 net/core/dev.c:3876
> > __dev_queue_xmit+0x8f3/0x1b50 net/core/dev.c:4193
> > dev_queue_xmit+0x17/0x20 net/core/dev.c:4261
> > neigh_hh_output include/net/neighbour.h:508 [inline]
> > neigh_output include/net/neighbour.h:522 [inline]
> > ip_finish_output2+0xc0f/0xf00 net/ipv4/ip_output.c:228
> > __ip_finish_output+0x163/0x370
> > ip_finish_output+0x20b/0x220 net/ipv4/ip_output.c:316
> > NF_HOOK_COND include/linux/netfilter.h:299 [inline]
> > ip_output+0x1e9/0x410 net/ipv4/ip_output.c:430
> > dst_output include/net/dst.h:450 [inline]
> > ip_local_out+0x92/0xb0 net/ipv4/ip_output.c:126
> > iptunnel_xmit+0x4a2/0x890 net/ipv4/ip_tunnel_core.c:82
> > udp_tunnel_xmit_skb+0x1b6/0x2c0 net/ipv4/udp_tunnel_core.c:175
> > send4+0x78d/0xd20 drivers/net/wireguard/socket.c:85
> > wg_socket_send_skb_to_peer+0xd5/0x1d0 drivers/net/wireguard/socket.c:175
> > wg_packet_create_data_done drivers/net/wireguard/send.c:251 [inline]
> > wg_packet_tx_worker+0x202/0x560 drivers/net/wireguard/send.c:276
> > process_one_work+0x6db/0xc00 kernel/workqueue.c:2313
> > worker_thread+0xb3e/0x1340 kernel/workqueue.c:2460
> > kthread+0x41c/0x500 kernel/kthread.c:319
> > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
> >
> > Reported-by: [email protected]
> > Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
>
> Very likely this is not the correct fixes tag.
>
> > Signed-off-by: Jun Nie <[email protected]>
> > ---
> > net/sched/em_cmp.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/sched/em_cmp.c b/net/sched/em_cmp.c
> > index f17b049ea530..0284394be53f 100644
> > --- a/net/sched/em_cmp.c
> > +++ b/net/sched/em_cmp.c
> > @@ -22,9 +22,14 @@ static int em_cmp_match(struct sk_buff *skb, struct tcf_ematch *em,
> > struct tcf_pkt_info *info)
> > {
> > struct tcf_em_cmp *cmp = (struct tcf_em_cmp *) em->data;
> > - unsigned char *ptr = tcf_get_base_ptr(skb, cmp->layer) + cmp->off;
> > + unsigned char *ptr;
> > u32 val = 0;
> >
> > + if (!cmp)
> > + return 0;
>
> It feels like this is papering over the real issue. Why is em->data
> NULL here? Why are other ematches not affected by this issue?
>
> Is em->data really NULL, or some small value instead? KASAN seems to
> say it's a small value, not 0, so this patch should not avoid the
> oops. Have you tested it against the reproducer?
>
> Thanks,
>
> Paolo
>

Testing with the reproducer [1] shows that the patch does resolve the issue.
em->data is NULL, so the patch avoids dereferencing cmp. I did not
investigate why em->data is NULL in the WireGuard secure network tunnel
case, as I am not familiar with the network stack. So you can also call this
patch a workaround.

[1]
https://syzkaller.appspot.com/bug?id=d96c4958dc8d4da11f56e18471dfc4f64d21ef6e

Regards,
Jun
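
As a side note on the NULL-versus-small-value question above, here is a
minimal sketch of how the two addresses in the KASAN report may relate. It
assumes the usual x86-64 generic KASAN mapping (shadow = (addr >> 3) +
0xdffffc0000000000) and uses a userspace mirror of struct tcf_em_cmp; the
mirror struct and both helper names are hypothetical and exist only for this
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed x86-64 generic KASAN constants; not taken from this thread. */
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT 3

/*
 * Hypothetical userspace mirror of struct tcf_em_cmp, assuming the
 * layout in the uapi header: val, mask, off, then four 4-bit fields.
 */
struct em_cmp_mirror {
	uint32_t val;
	uint32_t mask;
	uint16_t off;
	uint8_t  align:4;
	uint8_t  flags:4;
	uint8_t  layer:4;
	uint8_t  opnd:4;
};

/* Shadow byte address that KASAN checks for a given memory address. */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Address touched when reading ->off through a NULL struct pointer. */
static uint64_t null_deref_addr_of_off(void)
{
	return (uint64_t)offsetof(struct em_cmp_mirror, off);
}
```

Under these assumptions, kasan_shadow_addr(null_deref_addr_of_off()) gives
0xdffffc0000000001, the non-canonical address in the report, and offset 8
falls inside the reported range [0x8-0xf]. That would be consistent with
em->data being NULL and the fault coming from a member access such as
cmp->off or cmp->layer, although it does not explain why em->data is NULL.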

2022-12-17 22:36:08

by Cong Wang

Subject: Re: [PATCH net v2] net: sched: ematch: reject invalid data

On Thu, Dec 15, 2022 at 01:50:43PM +0100, Paolo Abeni wrote:
> On Wed, 2022-12-14 at 10:20 +0800, Jun Nie wrote:
> > ---
> > net/sched/em_cmp.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/sched/em_cmp.c b/net/sched/em_cmp.c
> > index f17b049ea530..0284394be53f 100644
> > --- a/net/sched/em_cmp.c
> > +++ b/net/sched/em_cmp.c
> > @@ -22,9 +22,14 @@ static int em_cmp_match(struct sk_buff *skb, struct tcf_ematch *em,
> > struct tcf_pkt_info *info)
> > {
> > struct tcf_em_cmp *cmp = (struct tcf_em_cmp *) em->data;
> > - unsigned char *ptr = tcf_get_base_ptr(skb, cmp->layer) + cmp->off;
> > + unsigned char *ptr;
> > u32 val = 0;
> >
> > + if (!cmp)
> > + return 0;
>
> It feels like this is papering over the real issue. Why is em->data
> NULL here? Why are other ematches not affected by this issue?
>
> Is em->data really NULL, or some small value instead? KASAN seems to
> say it's a small value, not 0, so this patch should not avoid the
> oops. Have you tested it against the reproducer?

Right. I think I have found the root cause. Let me test my patch to see
if it makes syzbot happy.

Thanks.