As the call trace shows, skb_panic was caused by wrong skb->mac_header
in nsh_gso_segment():
invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 3 PID: 2737 Comm: syz Not tainted 6.3.0-next-20230505 #1
RIP: 0010:skb_panic+0xda/0xe0
Call Trace:
skb_push+0x91/0xa0
nsh_gso_segment+0x4f3/0x570
skb_mac_gso_segment+0x19e/0x270
__skb_gso_segment+0x1e8/0x3c0
validate_xmit_skb+0x452/0x890
validate_xmit_skb_list+0x99/0xd0
sch_direct_xmit+0x294/0x7c0
__dev_queue_xmit+0x16f0/0x1d70
packet_xmit+0x185/0x210
packet_snd+0xc15/0x1170
packet_sendmsg+0x7b/0xa0
sock_sendmsg+0x14f/0x160
The root cause is:
nsh_gso_segment() uses skb->network_header - nhoff to reset mac_header
in skb_gso_error_unwind() if the inner-layer protocol's GSO fails.
However, skb->network_header may already have been reset by the
inner-layer GSO function, e.g. mpls_gso_segment(). The mac_header
restored from that stale network_header will be larger than the skb
headroom.
nsh_gso_segment
    nhoff = skb->network_header - skb->mac_header;
    __skb_pull(skb,nsh_len)
    skb_mac_gso_segment
        mpls_gso_segment
            skb_reset_network_header(skb);//skb->network_header+=nsh_len
            return -EINVAL;
    skb_gso_error_unwind
        skb_push(skb, nsh_len);
        skb->mac_header = skb->network_header - nhoff;
        // skb->mac_header > skb->headroom, cause skb_push panic
Fix it by saving skb->mac_header on entry and passing that value as
mac_offset when restoring mac_header.
Fixes: c411ed854584 ("nsh: add GSO support")
Signed-off-by: Dong Chenchen <[email protected]>
---
v2:
- Use skb->mac_header not skb->network_header-nhoff for mac_offset.
---
net/nsh/nsh.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/nsh/nsh.c b/net/nsh/nsh.c
index e9ca007718b7..7eb536a9677f 100644
--- a/net/nsh/nsh.c
+++ b/net/nsh/nsh.c
@@ -78,6 +78,7 @@ static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
 {
 	struct sk_buff *segs = ERR_PTR(-EINVAL);
 	unsigned int nsh_len, mac_len;
+	u16 mac_offset = skb->mac_header;
 	__be16 proto;
 	int nhoff;
@@ -108,8 +109,7 @@ static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
 	segs = skb_mac_gso_segment(skb, features);
 	if (IS_ERR_OR_NULL(segs)) {
 		skb_gso_error_unwind(skb, htons(ETH_P_NSH), nsh_len,
-				     skb->network_header - nhoff,
-				     mac_len);
+				     mac_offset, mac_len);
 		goto out;
 	}
--
2.25.1
On Tue, May 09, 2023 at 10:19:24AM +0800, Dong Chenchen wrote:
> As the call trace shows, skb_panic was caused by wrong skb->mac_header
> in nsh_gso_segment():
>
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> CPU: 3 PID: 2737 Comm: syz Not tainted 6.3.0-next-20230505 #1
> RIP: 0010:skb_panic+0xda/0xe0
> call Trace:
> skb_push+0x91/0xa0
> nsh_gso_segment+0x4f3/0x570
> skb_mac_gso_segment+0x19e/0x270
> __skb_gso_segment+0x1e8/0x3c0
> validate_xmit_skb+0x452/0x890
> validate_xmit_skb_list+0x99/0xd0
> sch_direct_xmit+0x294/0x7c0
> __dev_queue_xmit+0x16f0/0x1d70
> packet_xmit+0x185/0x210
> packet_snd+0xc15/0x1170
> packet_sendmsg+0x7b/0xa0
> sock_sendmsg+0x14f/0x160
>
> The root cause is:
> nsh_gso_segment() use skb->network_header - nhoff to reset mac_header
> in skb_gso_error_unwind() if inner-layer protocol gso fails.
> However, skb->network_header may be reset by inner-layer protocol
> gso function e.g. mpls_gso_segment. skb->mac_header reset by the
> inaccurate network_header will be larger than skb headroom.
>
> nsh_gso_segment
> nhoff = skb->network_header - skb->mac_header;
> __skb_pull(skb,nsh_len)
> skb_mac_gso_segment
> mpls_gso_segment
> skb_reset_network_header(skb);//skb->network_header+=nsh_len
> return -EINVAL;
> skb_gso_error_unwind
> skb_push(skb, nsh_len);
> skb->mac_header = skb->network_header - nhoff;
> // skb->mac_header > skb->headroom, cause skb_push panic
>
> Use correct mac_offset to restore mac_header to fix it.
>
> Fixes: c411ed854584 ("nsh: add GSO support")
> Signed-off-by: Dong Chenchen <[email protected]>
nit: As this is a fix it should probably be targeted at 'net'
(as opposed to 'net-next'). This should be noted in the subject.
Subject: [PATCH net v2]...
> ---
> v2:
> - Use skb->mac_header not skb->network_header-nhoff for mac_offset.
> ---
> net/nsh/nsh.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/nsh/nsh.c b/net/nsh/nsh.c
> index e9ca007718b7..7eb536a9677f 100644
> --- a/net/nsh/nsh.c
> +++ b/net/nsh/nsh.c
> @@ -78,6 +78,7 @@ static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
> {
> struct sk_buff *segs = ERR_PTR(-EINVAL);
> unsigned int nsh_len, mac_len;
> + u16 mac_offset = skb->mac_header;
nit: It is generally preferred to arrange local variables in networking code
from longest line to shortest - reverse xmas tree order.
This can be verified using:
https://github.com/ecree-solarflare/xmastree/blob/master/README
> __be16 proto;
> int nhoff;
>
> @@ -108,8 +109,7 @@ static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
> segs = skb_mac_gso_segment(skb, features);
> if (IS_ERR_OR_NULL(segs)) {
> skb_gso_error_unwind(skb, htons(ETH_P_NSH), nsh_len,
> - skb->network_header - nhoff,
> - mac_len);
> + mac_offset, mac_len);
> goto out;
> }
>
> --
> 2.25.1
>
>
> > As the call trace shows, skb_panic was caused by wrong
> > skb->mac_header in nsh_gso_segment():
> >
> > invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> > CPU: 3 PID: 2737 Comm: syz Not tainted 6.3.0-next-20230505 #1
> > RIP: 0010:skb_panic+0xda/0xe0
> > call Trace:
> > skb_push+0x91/0xa0
> > nsh_gso_segment+0x4f3/0x570
> > skb_mac_gso_segment+0x19e/0x270
> > __skb_gso_segment+0x1e8/0x3c0
> > validate_xmit_skb+0x452/0x890
> > validate_xmit_skb_list+0x99/0xd0
> > sch_direct_xmit+0x294/0x7c0
> > __dev_queue_xmit+0x16f0/0x1d70
> > packet_xmit+0x185/0x210
> > packet_snd+0xc15/0x1170
> > packet_sendmsg+0x7b/0xa0
> > sock_sendmsg+0x14f/0x160
> >
> > The root cause is:
> > nsh_gso_segment() use skb->network_header - nhoff to reset
> > mac_header in skb_gso_error_unwind() if inner-layer protocol gso fails.
> > However, skb->network_header may be reset by inner-layer protocol
> > gso function e.g. mpls_gso_segment. skb->mac_header reset by the
> > inaccurate network_header will be larger than skb headroom.
> >
> > nsh_gso_segment
> > nhoff = skb->network_header - skb->mac_header;
> > __skb_pull(skb,nsh_len)
> > skb_mac_gso_segment
> > mpls_gso_segment
> > skb_reset_network_header(skb);//skb->network_header+=nsh_len
> > return -EINVAL;
> > skb_gso_error_unwind
> > skb_push(skb, nsh_len);
> > skb->mac_header = skb->network_header - nhoff;
> > // skb->mac_header > skb->headroom, cause skb_push panic
> >
> > Use correct mac_offset to restore mac_header to fix it.
> >
> > Fixes: c411ed854584 ("nsh: add GSO support")
> > Signed-off-by: Dong Chenchen <[email protected]>
>
> nit: As this is a fix it should probably be targeted at 'net'
> (as opposed to 'net-next'). This should be noted in the subject.
>
> Subject: [PATCH net v2]...
>
> > ---
> > v2:
> > - Use skb->mac_header not skb->network_header-nhoff for mac_offset.
> > ---
> > net/nsh/nsh.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/nsh/nsh.c b/net/nsh/nsh.c
> > index e9ca007718b7..7eb536a9677f 100644
> > --- a/net/nsh/nsh.c
> > +++ b/net/nsh/nsh.c
> > @@ -78,6 +78,7 @@ static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
> > {
> > struct sk_buff *segs = ERR_PTR(-EINVAL);
> > unsigned int nsh_len, mac_len;
> > + u16 mac_offset = skb->mac_header;
>
> nit: It is generally preferred to arrange local variables in networking code
> from longest line to shortest - reverse xmas tree order.
>
> This can be verified using:
> https://github.com/ecree-solarflare/xmastree/blob/master/README
>
Thank you very much for your suggestions!
v3 will be sent.
Dong Chenchen
> > __be16 proto;
> > int nhoff;
> >
> > @@ -108,8 +109,7 @@ static struct sk_buff *nsh_gso_segment(struct sk_buff *skb,
> > segs = skb_mac_gso_segment(skb, features);
> > if (IS_ERR_OR_NULL(segs)) {
> > skb_gso_error_unwind(skb, htons(ETH_P_NSH), nsh_len,
> > - skb->network_header - nhoff,
> > - mac_len);
> > + mac_offset, mac_len);
> > goto out;
> > }
> >
> > --
> > 2.25.1
> >
> >