Message-ID: <55558634.5000902@plumgrid.com>
Date: Thu, 14 May 2015 22:37:56 -0700
From: Alexei Starovoitov
To: "Wangnan (F)", linux-kernel@vger.kernel.org
CC: lizefan 00213767
Subject: Re: [BUG] kernel panic after bpf program removed.
References: <55556DE3.5020106@huawei.com>
In-Reply-To: <55556DE3.5020106@huawei.com>

On 5/14/15 8:54 PM, Wangnan (F) wrote:
> Hi Alexei Starovoitov and other,
>
> I triggered a kernel panic when developing my 'perf bpf' facility. The
> call stack is listed at the bottom of this mail.
>
> I attached two bpf programs on 'kmem_cache_free%return' and
> '__alloc_pages_nodemask'. The programs is very simple.
> The panic is raised after closing the bpf program and the perf event
> file. Looks like the panic is caused by racing between closing perf
> event fd and bpf program fd. I'm unable to reproduce this problem with
> similar operations.
>
> Following is the exact instruction cause the panic.

thanks for the report. Looks like pointer 'prog == 0x6c0' is passed into
bpf_prog_put, which means that event->tp_event was freed and memory
reused before free_event_rcu() was called.
I think it's not perf_event_fd racing with prog_fd, but rather with
kprobe freeing:

__free_event()
  event->destroy(event)
    perf_trace_destroy
      perf_trace_event_unreg

which is dropping event->tp_event->perf_refcount that allows
kprobe freeing to proceed in:

unregister_kprobe_event
  trace_remove_event_call
    probe_remove_event_call

and eventually tp_event to get freed.

I think calling perf_event_free_bpf_prog() from __free_event()
instead of free_event_rcu() will fix the race, but please
double check my analysis.

Also please send me a reproducer script. I'd like to see it crashing
first before the fix and not crashing afterwards.
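To make the ordering argument concrete, here is a small user-space sketch in C. It is not kernel code and not the actual patch: every sim_* name is a hypothetical stand-in, "freeing" is modelled by setting a flag rather than releasing memory, and it assumes the worst case where the kprobe side frees tp_event as soon as perf_refcount drops to zero. It only illustrates why releasing the program before the refcount drop would avoid touching a freed tp_event.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct sim_prog {
    int refcnt;
};

struct sim_tp_event {
    int perf_refcount;      /* models tp_event->perf_refcount */
    bool freed;             /* set instead of really freeing memory */
    struct sim_prog *prog;  /* models the program reached via tp_event */
};

struct sim_event {
    struct sim_tp_event *tp_event;
};

/*
 * Models perf_event_free_bpf_prog(). Returns false when it would have
 * dereferenced an already-freed tp_event (in the real kernel this is
 * where a garbage pointer such as 0x6c0 could be read).
 */
static bool sim_free_bpf_prog(struct sim_event *event)
{
    struct sim_tp_event *tp = event->tp_event;

    if (tp->freed)
        return false;
    if (tp->prog) {
        tp->prog->refcnt--;
        tp->prog = NULL;
    }
    return true;
}

/*
 * Models event->destroy(event): drop perf_refcount. Assumption: the
 * kprobe teardown frees tp_event immediately once the count hits zero.
 */
static void sim_event_destroy(struct sim_event *event)
{
    struct sim_tp_event *tp = event->tp_event;

    if (--tp->perf_refcount == 0)
        tp->freed = true;
}

/*
 * Runs one teardown. put_prog_early == true models releasing the
 * program from __free_event() before destroy; false models releasing
 * it later from free_event_rcu().
 */
bool sim_teardown(bool put_prog_early)
{
    struct sim_prog prog = { .refcnt = 1 };
    struct sim_tp_event tp = {
        .perf_refcount = 1, .freed = false, .prog = &prog,
    };
    struct sim_event event = { .tp_event = &tp };
    bool ok;

    if (put_prog_early) {
        ok = sim_free_bpf_prog(&event);
        sim_event_destroy(&event);
    } else {
        sim_event_destroy(&event);
        ok = sim_free_bpf_prog(&event);
    }
    return ok;
}
```

In this model sim_teardown(true) returns true (the program is released while tp_event is still valid) and sim_teardown(false) returns false (tp_event is already gone when the program is released). The real race is timing dependent, so a reproducer is still needed to confirm the analysis.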