Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753273AbbLIWYh (ORCPT ); Wed, 9 Dec 2015 17:24:37 -0500
Received: from www62.your-server.de ([213.133.104.62]:52821 "EHLO
        www62.your-server.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752521AbbLIWYg (ORCPT );
        Wed, 9 Dec 2015 17:24:36 -0500
Message-ID: <5668AA20.9070707@iogearbox.net>
Date: Wed, 09 Dec 2015 23:24:32 +0100
From: Daniel Borkmann
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
MIME-Version: 1.0
To: Alexei Starovoitov, Dave Jones, Linux Kernel, Alexei Starovoitov
CC: netdev@vger.kernel.org, venkatesh.pallipadi@intel.com, suresh.b.siddha@intel.com
Subject: Re: bpf/asan related lockup
References: <20151204182333.GB29406@codemonkey.org.uk> <20151204190614.GA45508@ast-mbp.thefacebook.com>
In-Reply-To: <20151204190614.GA45508@ast-mbp.thefacebook.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-Authenticated-Sender: daniel@iogearbox.net
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3643
Lines: 67

On 12/04/2015 08:06 PM, Alexei Starovoitov wrote:
> On Fri, Dec 04, 2015 at 01:23:33PM -0500, Dave Jones wrote:
>> Trinity had apparently created a bpf program that upset things greatly.
>> I guess I need to find a way to make it record those somewhere for replaying later.
>>
>> Alexei, any ideas ?
>>
>> 	Dave
>>
>> NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:991]
>> irq event stamp: 153214
>> hardirqs last  enabled at (153213): [] _raw_spin_unlock_irq+0x2c/0x50
>> hardirqs last disabled at (153214): [] _raw_spin_lock_irq+0x19/0x80
>> softirqs last  enabled at (153108): [] __do_softirq+0x2b8/0x590
>> softirqs last disabled at (153103): [] irq_exit+0xf5/0x100
>> CPU: 0 PID: 991 Comm: kworker/0:1 Tainted: G D W 4.4.0-rc3-think+ #5
>> Workqueue: events bpf_prog_free_deferred
>> task: ffff880464dab700 ti: ffff8803041d8000 task.ti: ffff8803041d8000
>> RIP: 0010:[] [] __asan_load4+0x0/0x70
>> RSP: 0018:ffff8803041dfa08 EFLAGS: 00000202
>> RAX: 0000000000000003 RBX: ffff880468be39a8 RCX: 0000000000000000
>> RDX: dffffc0000000000 RSI: 0000000000000001 RDI: ffff880468be39c0
>> RBP: ffff8803041dfa70 R08: 0000000000000000 R09: 0000000000000001
>> R10: ffff8803041dfb8f R11: 0000000000000000 R12: ffff880468be39c0
>> R13: 0000000000000001 R14: ffff8804689dff00 R15: 0000000000000001
>> FS: 0000000000000000(0000) GS:ffff880468800000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007faeedb04000 CR3: 0000000452548000 CR4: 00000000001406f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
>> Stack:
>>  ffffffffa018f2ce 00000000001dfec0 01ff880300000001 ffff8803041dfaf8
>>  ffffffffa0076610 ffffffffa18895d8 ffff8804689dff08 0000000000000004
>>  ffffffffa0076610 ffff8803041dfaf8 0000000000000001 ffffc90000171000
>> Call Trace:
>>  [] ? smp_call_function_many+0x32e/0x410
>>  [] ? rbt_memtype_copy_nth_element+0xd0/0xd0
>>  [] ? rbt_memtype_copy_nth_element+0xd0/0xd0
>>  [] smp_call_function+0x47/0x80
>>  [] ? rbt_memtype_copy_nth_element+0xd0/0xd0
>>  [] on_each_cpu+0x2f/0x90
>>  [] flush_tlb_kernel_range+0xc0/0xd0
>>  [] ? flush_tlb_all+0x20/0x20
>>  [] remove_vm_area+0xaf/0x100
>>  [] __vunmap+0x36/0x180
>>  [] vfree+0x35/0xa0
>>  [] __bpf_prog_free+0x27/0x30
>>  [] bpf_jit_free+0x69/0x6e
>>  [] bpf_prog_free_deferred+0x1f/0x30
>>  [] process_one_work+0x3fa/0xa10
>>  [] ? process_one_work+0x334/0xa10
>>  [] ? pwq_dec_nr_in_flight+0x110/0x110
>>  [] worker_thread+0x88/0x6c0
>
> hmm. maybe set_memory_rw(ptr) followed by vfree(ptr) have a race
> deep inside mm logic.
> Both of them do flush_tlb_kernel_range()...

Hmm, is rbt_memtype_copy_nth_element() by chance unrelated when this
happens, or are we somehow stuck there each time? The only place it can
be invoked from is memtype_get_idx(), when someone does a
cat /sys/kernel/debug/x86/pat_memtype_list.
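
One way to approach the question above is to separate the frames the
kernel marked with '?' from the rest of the trace. The sketch below is
illustrative only: the names Frame, parse_frames and SAMPLE are
hypothetical helpers, not anything from the kernel tree, and it assumes
the usual symbol+0xoffset/0xsize frame format. A '?' means the address
was found on the stack but not confirmed by the unwinder. An offset
equal to the size means the address sits just past the end of the named
symbol, so the name may only be the nearest preceding symbol rather than
code that was actually running.

```python
import re
from typing import List, NamedTuple

# Matches "symbol+0xoff/0xsize", optionally preceded by the kernel's "? " marker.
FRAME_RE = re.compile(r"(\?\s+)?([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


class Frame(NamedTuple):
    symbol: str
    offset: int
    size: int
    reliable: bool  # False when the trace printed '?' before the symbol

    @property
    def past_end(self) -> bool:
        # True when the address lies at or beyond the end of the symbol.
        return self.offset >= self.size


def parse_frames(trace: str) -> List[Frame]:
    """Extract every symbol+offset/size frame found in a trace excerpt."""
    frames = []
    for m in FRAME_RE.finditer(trace):
        frames.append(
            Frame(
                symbol=m.group(2),
                offset=int(m.group(3), 16),
                size=int(m.group(4), 16),
                reliable=m.group(1) is None,
            )
        )
    return frames


# Excerpt copied from the call trace quoted above.
SAMPLE = """
[] ? smp_call_function_many+0x32e/0x410
[] ? rbt_memtype_copy_nth_element+0xd0/0xd0
[] smp_call_function+0x47/0x80
[] on_each_cpu+0x2f/0x90
[] flush_tlb_kernel_range+0xc0/0xd0
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        notes = []
        if not frame.reliable:
            notes.append("unverified ('?')")
        if frame.past_end:
            notes.append("address past end of symbol")
        print(frame.symbol, "-", ", ".join(notes) or "reliable")
```

In this excerpt, every rbt_memtype_copy_nth_element entry is both
'?'-marked and at offset == size, which would be consistent with it
being unrelated to the lockup, though that is only a hint and not a
confirmation.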