Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751884AbbHZUyh (ORCPT ); Wed, 26 Aug 2015 16:54:37 -0400 Received: from mail-io0-f178.google.com ([209.85.223.178]:34666 "EHLO mail-io0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751029AbbHZUye convert rfc822-to-8bit (ORCPT ); Wed, 26 Aug 2015 16:54:34 -0400 MIME-Version: 1.0 In-Reply-To: <20150826074959.48aea34c@as> References: <20150826074959.48aea34c@as> Date: Wed, 26 Aug 2015 13:54:33 -0700 Message-ID: Subject: Re: ip_rcv_finish() NULL pointer and possibly related Oopses From: Michael Marineau To: Chuck Ebbert Cc: Shaun Crampton , linux-kernel@vger.kernel.org, Peter White , netdev@vger.kernel.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6845 Lines: 127 On Wed, Aug 26, 2015 at 4:49 AM, Chuck Ebbert wrote: > On Wed, 26 Aug 2015 08:46:59 +0000 > Shaun Crampton wrote: > >> Testing our app at scale on Google¹s GCE, running ~1000 CoreOS hosts: over >> approximately 1 hour, I see about 1 in 50 hosts hit one of the Oopses >> below and then reboot (I¹m not sure if the different oopses are related to >> each other). >> >> The app is Project Calico, which is a datacenter networking fabric. >> calico-felix, the process named below, is our per-host agent. The >> per-host agent is responsible for reading the network information from a >> central server and applying "ip route² and "iptables" updates to the >> kernel. We¹re running on CoreOS, with about 100 docker containers/veths >> pairs running on each host. calico-felix is running inside one of those >> containers. We also run the BIRD BGP stack to redistribute routes around >> the datacenter. The errors happen more frequently while Calico is under >> load. >> >> I¹m not sure where to go from here. I can reproduce these issues easily >> at that scale but I haven¹t managed to boil it down to a small-scale repro >> scenario for further investigation (yet). >> > > What in the world is going on with those call traces? E.g.: > >> [ 4513.712008] >> [ 4513.712008] [] ? ip_rcv_finish+0x81/0x360 >> [ 4513.712008] [] ip_rcv+0x2a4/0x400 >> [ 4513.712008] [] ? inet_del_offload+0x40/0x40 >> [ 4513.712008] [] __netif_receive_skb_core+0x6c3/0x9a0 >> [ 4513.712008] [] ? build_skb+0x17/0x90 >> [ 4513.712008] [] __netif_receive_skb+0x18/0x60 >> [ 4513.712008] [] netif_receive_skb_internal+0x33/0xa0 >> [ 4513.712008] [] netif_receive_skb_sk+0x1c/0x70 >> [ 4513.712008] [] 0xffffffffa00f772b >> [ 4513.712008] [] ? __netif_receive_skb_core+0x6c3/0x9a0 >> [ 4513.712008] [] 0xffffffffa00f7d81 >> [ 4513.712008] [] net_rx_action+0x159/0x340 >> [ 4513.712008] [] __do_softirq+0xf4/0x290 >> [ 4513.712008] [] irq_exit+0xad/0xc0 >> [ 4513.712008] [] do_IRQ+0x5a/0xf0 >> [ 4513.712008] [] common_interrupt+0x6e/0x6e >> [ 4513.712008] > > There are two functions in the call trace that the kernel knows > nothing about. How did they get in there? > > And there is really executable code in there, as can be seen from a > later trace: > >> [ 4123.003006] >> [ 4123.003006] [] nf_iterate+0x57/0x80 >> [ 4123.003006] [] nf_hook_slow+0x97/0x100 >> [ 4123.003006] [] ip_local_deliver+0x92/0xa0 >> [ 4123.003006] [] ? ip_rcv_finish+0x360/0x360 >> [ 4123.003006] [] ip_rcv_finish+0x81/0x360 >> [ 4123.003006] [] ip_rcv+0x2a4/0x400 >> [ 4123.003006] [] ? inet_del_offload+0x40/0x40 >> [ 4123.003006] [] __netif_receive_skb_core+0x6c3/0x9a0 >> [ 4123.003006] [] ? build_skb+0x17/0x90 >> [ 4123.003006] [] __netif_receive_skb+0x18/0x60 >> [ 4123.003006] [] netif_receive_skb_internal+0x33/0xa0 >> [ 4123.003006] [] netif_receive_skb_sk+0x1c/0x70 >> [ 4123.003006] [] 0xffffffffa00d472b >> [ 4123.003006] [] 0xffffffffa00d4d81 >> [ 4123.003006] [] net_rx_action+0x159/0x340 >> [ 4123.003006] [] __do_softirq+0xf4/0x290 >> [ 4123.003006] [] irq_exit+0xad/0xc0 >> [ 4123.003006] [] do_IRQ+0x5a/0xf0 >> [ 4123.003006] [] common_interrupt+0x6e/0x6e >> [ 4123.003006] >> [ 4123.003006] [] ? __ip_route_output_key+0x31d/0x860 >> [ 4123.003006] [] ? xfrm_lookup_route+0x5/0x70 >> [ 4123.003006] [] ? ip_route_output_flow+0x54/0x60 >> [ 4123.003006] [] ip_queue_xmit+0x36a/0x3d0 >> [ 4123.003006] [] tcp_transmit_skb+0x4b9/0x990 >> [ 4123.003006] [] tcp_write_xmit+0x115/0xe90 >> [ 4123.003006] [] __tcp_push_pending_frames+0x32/0xd0 >> [ 4123.003006] [] tcp_push+0xef/0x120 >> [ 4123.003006] [] tcp_sendmsg+0xc5/0xb20 >> [ 4123.003006] [] ? lock_hrtimer_base.isra.22+0x29/0x50 >> [ 4123.003006] [] inet_sendmsg+0x64/0xa0 >> [ 4123.003006] [] ? __fget_light+0x25/0x70 >> [ 4123.003006] [] sock_sendmsg+0x3d/0x50 >> [ 4123.003006] [] SYSC_sendto+0x102/0x1a0 >> [ 4123.003006] [] ? __audit_syscall_entry+0xb4/0x110 >> [ 4123.003006] [] ? do_audit_syscall_entry+0x6c/0x70 >> [ 4123.003006] [] ? >> syscall_trace_enter_phase1+0x103/0x160 >> [ 4123.003006] [] SyS_sendto+0xe/0x10 >> [ 4123.003006] [] system_call_fastpath+0x12/0x71 >> [ 4123.003006] Code: <48> 8b 88 40 03 00 00 e8 1d dd dd ff 5d c3 0f 1f 00 >> 41 83 b9 80 00 >> [ 4123.003006] RIP [] 0xffffffffa0233027 >> [ 4123.003006] RSP > > Presumably the same two functions as before (loaded at a different > base address but same offsets, 0xd81 and 0x72b). And then nf_iterate > call into another unknown function, and there really is code there > and it's consistent with the oops. And the kernel thinks it's > outside of any normal text section, so it does not try to dump any > code from before the instruction pointer. > > 0: 48 8b 88 40 03 00 00 mov 0x340(%rax),%rcx > 7: e8 1d dd dd ff callq 0xffffffffffdddd29 > c: 5d pop %rbp > d: c3 retq > > Did you write your own module loader or something? These are stock kernels, with the exception that we include the secure boot patch set: https://github.com/coreos/coreos-overlay/tree/master/sys-kernel/coreos-sources/files/4.1 Been a while since kmod got updated so CoreOS is currently shipping with kmod-15 but beyond being a bit old there isn't anything special about the module loader. So nothing particularly magical going on here that I know of. For reference the original bug report includes a few more varieties of stack traces: https://github.com/coreos/bugs/issues/435 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/