Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752774Ab3F2QcO (ORCPT ); Sat, 29 Jun 2013 12:32:14 -0400
Received: from mail.candelatech.com ([208.74.158.172]:36416 "EHLO
	ns3.lanforge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751527Ab3F2QcM (ORCPT );
	Sat, 29 Jun 2013 12:32:12 -0400
Message-ID: <51CF0BFA.4080308@candelatech.com>
Date: Sat, 29 Jun 2013 09:31:54 -0700
From: Ben Greear
Organization: Candela Technologies
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130402 Thunderbird/17.0.5
MIME-Version: 1.0
To: Eric Dumazet
CC: Joe Jin, Frank Blaschka, "David S. Miller",
	"linux-kernel@vger.kernel.org", "netdev@vger.kernel.org",
	"zheng.x.li@oracle.com", Xen Devel, Ian Campbell, Jan Beulich,
	Stefano Stabellini
Subject: Re: kernel panic in skb_copy_bits
References: <51CBAA48.3080802@oracle.com>
	<1372311118.3301.214.camel@edumazet-glaptop>
	<51CD0E67.4000008@oracle.com>
	<1372402340.3301.229.camel@edumazet-glaptop>
	<1372412262.3301.251.camel@edumazet-glaptop>
	<51CE1E19.3020108@oracle.com>
	<1372490428.3301.300.camel@edumazet-glaptop>
	<51CF0723.9020604@candelatech.com>
	<1372523168.3301.302.camel@edumazet-glaptop>
In-Reply-To: <1372523168.3301.302.camel@edumazet-glaptop>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2449
Lines: 66

On 06/29/2013 09:26 AM, Eric Dumazet wrote:
> On Sat, 2013-06-29 at 09:11 -0700, Ben Greear wrote:
>
>> Do you know if your patch should go in 3.9?
>>
>
> Yes it should.

Ok, I'll add that to my tree.

>> Your test case sounds a bit like what gives us the rare crash in tcp_collapse
>> (we have lots of bouncing wifi interfaces running slow-speed TCP traffic). But,
>> it takes days for us to hit the problem most of the time.
>
> Well, unfortunately that's a different problem :(

For what it's worth, I added this patch to my tree. We haven't hit the problem
since, but perhaps on the over-the-weekend run we'll see it.

commit 0286716b36a0e5b82c385052a0971f44bc3c3442
Author: Ben Greear
Date:   Tue Jun 25 15:49:52 2013 -0700

    tcp: Try to work around crash in tcp_collapse.

    And print out some info about why it crashed.

    Signed-off-by: Ben Greear

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index a2f267a..63f7704 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4810,7 +4810,15 @@ restart:
 		int offset = start - TCP_SKB_CB(skb)->seq;
 		int size = TCP_SKB_CB(skb)->end_seq - start;
 
-		BUG_ON(offset < 0);
+		if (WARN_ON(offset < 0)) {
+			/* We see a crash here (when using BUG_ON) every few days under
+			 * some torture tests.  I'm not sure how to clean this up properly,
+			 * so just return and hope things keep muddling through. --Ben
+			 */
+			printk("offset: %i start: %i seq: %i size: %i copy: %i\n",
+			       offset, start, TCP_SKB_CB(skb)->seq, size, copy);
+			return;
+		}
 		if (size > 0) {
 			size = min(copy, size);
 			if (skb_copy_bits(skb, offset, skb_put(nskb, size), size))

Thanks,
Ben

-- 
Ben Greear
Candela Technologies Inc  http://www.candelatech.com
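
For anyone reading along without the source handy, here is a rough user-space
sketch of the arithmetic the patch guards. It is not kernel code: struct
seg_range and collapse_copy_len() are hypothetical names made up for this
illustration, and the sketch assumes the usual two's-complement conversion
from unsigned to signed 32-bit values. It only shows how "start - seq" turns
negative when start lies before the segment, and how the copy length is
otherwise clamped.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the sequence range kept in TCP_SKB_CB(skb). */
struct seg_range {
	uint32_t seq;     /* first sequence number covered by the segment */
	uint32_t end_seq; /* one past the last sequence number covered */
};

/*
 * Work out how many bytes to copy from the segment, beginning at sequence
 * number `start`, with room for at most `copy` bytes in the destination.
 *
 * Returns -1 (after logging) when `start` lies before the segment, which is
 * the condition the patch warns about. Otherwise returns the byte count
 * (possibly 0) and stores the byte offset into the segment in *offset_out.
 */
static int collapse_copy_len(const struct seg_range *r, uint32_t start,
			     int copy, int *offset_out)
{
	/*
	 * The unsigned subtraction wraps modulo 2^32. Converting the result
	 * to int32_t reinterprets it as a signed distance (assumption:
	 * two's-complement conversion, as on all common platforms).
	 */
	int offset = (int)(int32_t)(start - r->seq);
	int size = (int)(int32_t)(r->end_seq - start);

	if (offset < 0) {
		fprintf(stderr,
			"offset: %d start: %u seq: %u size: %d copy: %d\n",
			offset, (unsigned)start, (unsigned)r->seq, size, copy);
		return -1;
	}

	*offset_out = offset;
	if (size <= 0)
		return 0;
	return size < copy ? size : copy;
}
```

With a segment covering 1000..1500, start = 1100 and copy = 4096 give a
length of 400 at offset 100, while start = 900 takes the warning path and
returns -1. Returning an error instead of aborting mirrors the WARN_ON-and-
return approach in the patch; whether bailing out is actually safe at that
point in tcp_collapse is a separate question this sketch does not answer.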