Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754417AbbEOJVM (ORCPT ); Fri, 15 May 2015 05:21:12 -0400
Received: from szxga02-in.huawei.com ([119.145.14.65]:64362 "EHLO
	szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754187AbbEOJVF (ORCPT );
	Fri, 15 May 2015 05:21:05 -0400
Message-ID: <5555BA6A.50906@huawei.com>
Date: Fri, 15 May 2015 17:20:42 +0800
From: "Wangnan (F)"
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0
MIME-Version: 1.0
To: Alexei Starovoitov ,
CC: lizefan 00213767
Subject: Re: [BUG] kernel panic after bpf program removed.
References: <55556DE3.5020106@huawei.com> <55558634.5000902@plumgrid.com>
In-Reply-To: <55558634.5000902@plumgrid.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 8bit
X-Originating-IP: [10.111.66.109]
X-CFilter-Loop: Reflected
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3777
Lines: 128

On 2015/5/15 13:37, Alexei Starovoitov wrote:
> On 5/14/15 8:54 PM, Wangnan (F) wrote:
>> Hi Alexei Starovoitov and others,
>>
>> I triggered a kernel panic when developing my 'perf bpf' facility. The
>> call stack is listed at the bottom of this mail.
>>
>> I attached two bpf programs on 'kmem_cache_free%return' and
>> '__alloc_pages_nodemask'. The programs are very simple.
>> The panic is raised after closing the bpf program and the perf event
>> file. Looks like the panic is caused by racing between closing the
>> perf event fd and the bpf program fd. I'm unable to reproduce this
>> problem with similar operations.
>>
>> Following is the exact instruction that causes the panic.
>
> thanks for the report.
> Looks like pointer 'prog == 0x6c0' is passed into bpf_prog_put,
> which means that event->tp_event was freed and memory reused before
> free_event_rcu() was called.
>
> I think it's not perf_event_fd racing with prog_fd, but rather
> with kprobe freeing:
> __free_event()
>   event->destroy(event)
>     perf_trace_destroy
>       perf_trace_event_unreg
> which is dropping event->tp_event->perf_refcount
> that allows kprobe freeing to proceed in:
> unregister_kprobe_event
>   trace_remove_event_call
>     probe_remove_event_call
> and eventually tp_event to get freed.
>
> I think calling perf_event_free_bpf_prog()
> from __free_event() instead of free_event_rcu() will fix the race,
> but please double check my analysis.
> Also please send me a reproducer script. I'd like to see it crashing
> first before the fix and not crashing afterwards.
>

I triggered the problem with my 'perf bpf' patch series and have managed
to reproduce it once. The bpf program is attached below. To reproduce,
I run

 # perf bpf record --object /root/sample_bpf_program.o -- sleep 4

to start recording, then press Ctrl-C after about 3 seconds, before
sleep finishes.

The second call trace is identical to the previous one.

My environment is QEMU running a v4.1-rc3 kernel.

Thank you.

-------------------------------------------------

#include
#include
#include

#define SEC(NAME) __attribute__((section(NAME), used))

/* BPF helper stubs, resolved by helper ID when the program is loaded. */
static int (*bpf_map_delete_elem)(void *map, void *key) =
	(void *) BPF_FUNC_map_delete_elem;
static int (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
	(void *) BPF_FUNC_trace_printk;

struct bpf_map_def {
	unsigned int type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
};

struct pair {
	u64 val;
	u64 ip;
};

struct bpf_map_def SEC("maps") my_map = {
	.type = BPF_MAP_TYPE_HASH,
	.key_size = sizeof(long),
	.value_size = sizeof(struct pair),
	.max_entries = 1000000,
};

struct bpf_map_def SEC("maps") my_map2 = {
	.type = BPF_MAP_TYPE_HASH,
	.key_size = sizeof(long),
	.value_size = sizeof(struct pair),
	.max_entries = 1000000,
};

/* Attached to the return of kmem_cache_free(). */
SEC("cache_free=kmem_cache_free%return")
int bpf_prog1(struct pt_regs *ctx)
{
	long ptr = ctx->r14;

	bpf_map_delete_elem(&my_map2, &ptr);
	return 0;
}

/* Attached to the entry of __alloc_pages_nodemask(). */
SEC("mybpfprog=__alloc_pages_nodemask")
int bpf_prog_my(struct pt_regs *ctx)
{
	char fmt[] = "Haha\n";
	long ptr = ctx->r14;

	bpf_trace_printk(fmt, sizeof(fmt));
	bpf_map_delete_elem(&my_map, &ptr);
	return 0;
}

char _license[] SEC("license") = "GPL";
u32 _version SEC("version") = LINUX_VERSION_CODE;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/