Dave Jones reported that a crash is occurring in
csum_partial
tcp_gso_segment
inet_gso_segment
? update_dl_migration
skb_mac_gso_segment
__skb_gso_segment
dev_hard_start_xmit
sch_direct_xmit
__dev_queue_xmit
? dev_hard_start_xmit
dev_queue_xmit
ip_finish_output
? ip_output
ip_output
ip_forward_finish
ip_forward
ip_rcv_finish
ip_rcv
__netif_receive_skb_core
? __netif_receive_skb_core
? trace_hardirqs_on
__netif_receive_skb
netif_receive_skb_internal
napi_gro_complete
? napi_gro_complete
dev_gro_receive
? dev_gro_receive
napi_gro_receive
It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
not set correctly when doing non-scatter gather. We are using
offset as opposed to doffset.
Reported-by: Dave Jones <[email protected]>
Signed-off-by: Tom Herbert <[email protected]>
---
net/core/skbuff.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9cd5344..c1a3303 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2993,7 +2993,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 							    skb_put(nskb, len),
 							    len, 0);
 			SKB_GSO_CB(nskb)->csum_start =
-			    skb_headroom(nskb) + offset;
+			    skb_headroom(nskb) + doffset;
 			continue;
 		}
--
2.0.0.526.g5318336
On Tue, Jun 24, 2014 at 9:03 PM, Tom Herbert <[email protected]> wrote:
>
> It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
> not set correctly when doing non-scatter gather. We are using
> offset as opposed to doffset.
>
> Reported-by: Dave Jones <[email protected]>
DaveJ, I think you triggered this in five minutes on your box, and I
don't recall seeing anybody else reporting the oops (and google
doesn't find anything in the last month). So it's presumably somewhat
hw-specific. Does this fix the problem?
Linus
On Tue, 2014-06-24 at 21:03 -0700, Tom Herbert wrote:
>
> It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
> not set correctly when doing non-scatter gather. We are using
> offset as opposed to doffset.
>
> Reported-by: Dave Jones <[email protected]>
> Signed-off-by: Tom Herbert <[email protected]>
> ---
> net/core/skbuff.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 9cd5344..c1a3303 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2993,7 +2993,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>  							    skb_put(nskb, len),
>  							    len, 0);
>  			SKB_GSO_CB(nskb)->csum_start =
> -			    skb_headroom(nskb) + offset;
> +			    skb_headroom(nskb) + doffset;
>  			continue;
>  		}
>
Yes, seems an obvious typo, but please change patch title.
This is not "tcp: fix setting csum_start in tcp_gso_segment"
Maybe "net: fix setting csum_start in skb_segment()"
On Tue, Jun 24, 2014 at 09:17:25PM -0700, Linus Torvalds wrote:
> On Tue, Jun 24, 2014 at 9:03 PM, Tom Herbert <[email protected]> wrote:
> >
> > It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
> > not set correctly when doing non-scatter gather. We are using
> > offset as opposed to doffset.
> >
> > Reported-by: Dave Jones <[email protected]>
>
> DaveJ, I think you triggered this in five minutes on your box, and I
> don't recall seeing anybody else reporting the oops (and google
> doesn't find anything in the last month). So it's presumably somewhat
> hw-specific. Does this fix the problem?
It's survived routing ~1GB of packets overnight, so I'd call this good.
thanks Tom.
Dave
From: Dave Jones <[email protected]>
Date: Wed, 25 Jun 2014 10:10:52 -0400
> On Tue, Jun 24, 2014 at 09:17:25PM -0700, Linus Torvalds wrote:
> > On Tue, Jun 24, 2014 at 9:03 PM, Tom Herbert <[email protected]> wrote:
> > >
> > > It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
> > > not set correctly when doing non-scatter gather. We are using
> > > offset as opposed to doffset.
> > >
> > > Reported-by: Dave Jones <[email protected]>
> >
> > DaveJ, I think you triggered this in five minutes on your box, and I
> > don't recall seeing anybody else reporting the oops (and google
> > doesn't find anything in the last month). So it's presumably somewhat
> > hw-specific. Does this fix the problem?
>
> It's survived routing ~1GB of packets overnight, so I'd call this good.
Tom, please adjust the Subject line as suggested by Eric Dumazet and add
a Tested-by: for Dave.
Thanks!
From: Tom Herbert <[email protected]>
Dave Jones reported that a crash is occurring in
csum_partial
tcp_gso_segment
inet_gso_segment
? update_dl_migration
skb_mac_gso_segment
__skb_gso_segment
dev_hard_start_xmit
sch_direct_xmit
__dev_queue_xmit
? dev_hard_start_xmit
dev_queue_xmit
ip_finish_output
? ip_output
ip_output
ip_forward_finish
ip_forward
ip_rcv_finish
ip_rcv
__netif_receive_skb_core
? __netif_receive_skb_core
? trace_hardirqs_on
__netif_receive_skb
netif_receive_skb_internal
napi_gro_complete
? napi_gro_complete
dev_gro_receive
? dev_gro_receive
napi_gro_receive
It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
not set correctly when doing non-scatter gather. We are using
offset as opposed to doffset.
Reported-by: Dave Jones <[email protected]>
Tested-by: Dave Jones <[email protected]>
Signed-off-by: Tom Herbert <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Fixes: 7e2b10c1e52ca ("net: Support for multiple checksums with gso")
---
net/core/skbuff.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9cd5344fad73..c1a33033cbe2 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2993,7 +2993,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 							    skb_put(nskb, len),
 							    len, 0);
 			SKB_GSO_CB(nskb)->csum_start =
-			    skb_headroom(nskb) + offset;
+			    skb_headroom(nskb) + doffset;
 			continue;
 		}
On Wed, Jun 25, 2014 at 12:51 PM, Eric Dumazet <[email protected]> wrote:
> From: Tom Herbert <[email protected]>
>
> Dave Jones reported that a crash is occurring in
>
> csum_partial
> tcp_gso_segment
> inet_gso_segment
> ? update_dl_migration
> skb_mac_gso_segment
> __skb_gso_segment
> dev_hard_start_xmit
> sch_direct_xmit
> __dev_queue_xmit
> ? dev_hard_start_xmit
> dev_queue_xmit
> ip_finish_output
> ? ip_output
> ip_output
> ip_forward_finish
> ip_forward
> ip_rcv_finish
> ip_rcv
> __netif_receive_skb_core
> ? __netif_receive_skb_core
> ? trace_hardirqs_on
> __netif_receive_skb
> netif_receive_skb_internal
> napi_gro_complete
> ? napi_gro_complete
> dev_gro_receive
> ? dev_gro_receive
> napi_gro_receive
>
> It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
> not set correctly when doing non-scatter gather. We are using
> offset as opposed to doffset.
>
Acked-by: Tom Herbert <[email protected]>
> Reported-by: Dave Jones <[email protected]>
> Tested-by: Dave Jones <[email protected]>
> Signed-off-by: Tom Herbert <[email protected]>
> Signed-off-by: Eric Dumazet <[email protected]>
> Fixes: 7e2b10c1e52ca ("net: Support for multiple checksums with gso")
> ---
> net/core/skbuff.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 9cd5344fad73..c1a33033cbe2 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2993,7 +2993,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>  							    skb_put(nskb, len),
>  							    len, 0);
>  			SKB_GSO_CB(nskb)->csum_start =
> -			    skb_headroom(nskb) + offset;
> +			    skb_headroom(nskb) + doffset;
>  			continue;
>  		}
>
>
>
From: Eric Dumazet <[email protected]>
Date: Wed, 25 Jun 2014 12:51:01 -0700
> From: Tom Herbert <[email protected]>
>
> Dave Jones reported that a crash is occurring in
...
> It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
> not set correctly when doing non-scatter gather. We are using
> offset as opposed to doffset.
>
> Reported-by: Dave Jones <[email protected]>
> Tested-by: Dave Jones <[email protected]>
> Signed-off-by: Tom Herbert <[email protected]>
> Signed-off-by: Eric Dumazet <[email protected]>
> Fixes: 7e2b10c1e52ca ("net: Support for multiple checksums with gso")
Applied, thanks everyone.
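
For readers following the thread, here is a rough userspace sketch of why
using offset instead of doffset goes wrong in the non-scatter-gather path.
It is not kernel code. It assumes, from my reading of skb_segment(), that
doffset is the fixed length of the headers copied into every segment, that
offset starts at doffset and advances by the segment length on each
iteration, and that the payload of each new segment therefore always starts
at headroom + doffset. All function names and the numbers used below are
hypothetical and exist only for illustration.

```c
/*
 * Hypothetical model of the linear (non-SG) copy path in skb_segment().
 * Plain unsigned integers stand in for skb fields; nothing here touches
 * real kernel structures.
 */

/* Where the payload of a new segment begins: headers occupy doffset bytes. */
static unsigned int payload_start(unsigned int headroom, unsigned int doffset)
{
	return headroom + doffset;
}

/* Pre-patch value: derived from the running offset into head_skb. */
static unsigned int csum_start_buggy(unsigned int headroom, unsigned int offset)
{
	return headroom + offset;
}

/* Post-patch value: derived from the constant header length. */
static unsigned int csum_start_fixed(unsigned int headroom, unsigned int doffset)
{
	return headroom + doffset;
}

/*
 * Walk the segments the way the loop is assumed to, and count how many
 * of them would get a csum_start that does not point at their payload.
 * total_len includes the doffset bytes of headers.
 */
static unsigned int count_mismatches(unsigned int total_len,
				     unsigned int doffset,
				     unsigned int mss,
				     unsigned int headroom)
{
	unsigned int offset = doffset;	/* first segment: offset == doffset */
	unsigned int bad = 0;

	while (offset < total_len) {
		unsigned int len = total_len - offset;

		if (len > mss)
			len = mss;
		if (csum_start_buggy(headroom, offset) !=
		    payload_start(headroom, doffset))
			bad++;
		offset += len;		/* grows with every segment emitted */
	}
	return bad;
}
```

Under these assumptions the first segment is unaffected, because offset
still equals doffset there, and every later segment gets a csum_start that
points past its own payload. That would be consistent with the checksum
code later walking a bogus range, as in the csum_partial trace above.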