Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp1313558imm; Wed, 8 Aug 2018 14:49:40 -0700 (PDT) X-Google-Smtp-Source: AA+uWPyUBuYnzEuWnaRA7TSWoWG6SrLPi8u7JZ7gthDEg7hhxFqt7nd/Oj3Ukzw+Q/jNRqmcfiMr X-Received: by 2002:a63:d613:: with SMTP id q19-v6mr4038925pgg.327.1533764980615; Wed, 08 Aug 2018 14:49:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1533764980; cv=none; d=google.com; s=arc-20160816; b=C+3KOGQ+nYhGcqqmyDX3krhgS9j105Uqlw/gMqfJG18HKFfFDHVgjSpON6qCn5mZwL oUwkPA8k/gNLj6EEzBu0WAnX4uWsv2S5+T/3Vo1VgSKifBhBJcDug9U+zKXcPZqawCMs rmDYUlwJ55wACI6nVbO6xCIn2bAA2H/IZla7ZBB5+yQuFaOKmvcYJm9xBpiBpRPoyn3f Ef0UxCG/MflqoCYRTLPv4OG4GSHP9E5dXcoKvLGBI+qLc8I4eXwatb+DJFaYrFhoXoPT su0DHSFOGOrdP627H8rVlKsDgV8VCNdc0A1JxN7wGivx1XUCQ+TpDC+A6587rmZM/vd9 /yzA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature :arc-authentication-results; bh=EYOhkqlMSqAMmDp3jViTrJi8ZBdYbjlxkmDTdzIuNbc=; b=Kgz1vpYxRYOz4eFhAzepl3uh1O+w4IVZ+kdC1goVibEiqiEbNH+fCozdE3/hIQojgf yVG/qleQ3JDsh47LAwLjMu4m2LQtZ2bk7wJaTxci2fZ/E13lrUX6FiECTKyBTPFQ9gTR 06Sd6+Dx6dOyb2Pmm3Fw8GnDGlRPXWT/fZCh8enTtlmNjJALf89/TfX7lbWtdOHidQuZ K3r549HHV99+F2Q2r028bKHHrHYZAWXeDrxjxx8H5OWAZY8O20iP62jXW9VEyvuuwS2Z Qb50c0CBXhPCaBunfYZ7n00XNhh9RyAekhTIKw1nnCUQ9D6FlLqxRREER+evM9IwAEnl KxpQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=juo1Bnqi; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y86-v6si5755023pfi.195.2018.08.08.14.49.25; Wed, 08 Aug 2018 14:49:40 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=juo1Bnqi; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730235AbeHIAJc (ORCPT + 99 others); Wed, 8 Aug 2018 20:09:32 -0400 Received: from mail-wr1-f68.google.com ([209.85.221.68]:40805 "EHLO mail-wr1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727530AbeHIAJc (ORCPT ); Wed, 8 Aug 2018 20:09:32 -0400 Received: by mail-wr1-f68.google.com with SMTP id h15-v6so3290909wrs.7 for ; Wed, 08 Aug 2018 14:47:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=EYOhkqlMSqAMmDp3jViTrJi8ZBdYbjlxkmDTdzIuNbc=; b=juo1Bnqi6hF8QaHB4B150SMqRpXeJcJpx+Ty3ITNza/+6b9+E/tQ2KZSWUYvOMznrr ETXcF0Z3oorP1fqitumKeCoDKxFlbDzldClO/FxN0knFYEH8Sch3TzVGVDgX+3cePXJi uXOdNoHf+RqUnL9y3XLTRfp+FjBO7tJvrWdgVusWgZeIiRYNcJEqLvDSDJ1VKjGLWKRo MBg1ZmtlgEwxNkRLkEBWC/LHd4VQjZEheR2E4A/NqeMuEHxf89KyghCUp74dLEbhJGEb kz8yQ18gH5zF8HiqGL92CPkke9T/YohRNQDtO7vn3LibOoApj2cErsxzdG/Q8iGK5EkU /r8A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=EYOhkqlMSqAMmDp3jViTrJi8ZBdYbjlxkmDTdzIuNbc=; b=azNbuhFQbSRYLWMQc5dYceYKb48gorQbr5O9hqlou+o1BjUWkleTpDs4ZJtz/zyPP1 RUbIqY6zvClX7J2cVaN8HyifKmQy+3sCoGXtV5P3eMjvGrKMWfb2ABV5bFU5FFkQ5vdM VTd4jr2Pw3Ic4o1IhV7YtztjU8gtrYX8Ha5NsQ+eIM4dm5148Rm3cTCwseETqrWH4pfb dQN/BfU56b5fFiPQPDk4TY3TDd+86XZUv/+Ypk0uBdMTcq9LZDqacACBf4oj66vu6eBD TxOQLYphXbkQSlpl11RkxVJQvpBmDhFX/PdWmmtJwv2LbldjLsuh6ik1F7EHBzclPQsY kJsw== X-Gm-Message-State: AOUpUlFNIO+j+fc1PNDcRbMLG2MfPghG5f+7Sm6IihedO1V5+DV9bkT6 4ZLJImYwxdmy48l5xoEAN+5JRY1xKU1nu9L80S6RxBZqwB4= X-Received: by 2002:adf:fdcd:: with SMTP id i13-v6mr2864838wrs.276.1533764875421; Wed, 08 Aug 2018 14:47:55 -0700 (PDT) MIME-Version: 1.0 References: <1533605015-19514-1-git-send-email-eranian@google.com> <20180807072029.GA7716@krava> <20180807085010.GC7716@krava> <20180808082347.GB20320@krava> In-Reply-To: <20180808082347.GB20320@krava> From: Stephane Eranian Date: Wed, 8 Aug 2018 14:47:42 -0700 Message-ID: Subject: Re: [PATCH] perf ordered_events: fix crash in free_dup_event() To: Jiri Olsa Cc: LKML , Arnaldo Carvalho de Melo , Peter Zijlstra , mingo@elte.hu Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, Ok, I found the problem. It still exists upstream , just very tricky to trigger. Took me lots of time with gdb + watchpoints to track this down, where in fact it was just in front of me. From the crashdump: Program received signal SIGSEGV, Segmentation fault. free_dup_event (oe=0x26a39a0, event=0xffffffff00000000) I was puzzled by the 0xffffffff00000000. I tracked down where this value was coming from using watchpoints. In my case the memory was used before by elfutils to back the struct Elf. The -1 in the upper bits came from: file_read_elf () at third_party/elfutils/libelf/elf_begin.c:451 elf->state.elf64.scns.data[cnt].shndx_index = -1; And yet the next access to that memory location was in the crash. That meant the memory was released by elfutils and reused by ordered_events, yet without any initialization. But looking at alloc_event(), it was not obvious to figure out how a new_event->event could be uninitialized. Well, it turns out there is this little hack where the code commandeers the first element in the oe->buffer to use as a list_head for the oe->to_free freelist. The problem is that this entry also gets freed, but its event->event field is NEVER initialized. So depending on how the memory was previously used, you could get a on NULL value and crash in free_dup_event(). This is what happened to me. I am glad I pursued this further because the bug is still in the upstream version. The patch is a one-liner fixing the initialization of the event->event = NULL. For the other elements in the list, the initialization is already done at the end of alloc_event(). I will send the patch separately. On Wed, Aug 8, 2018 at 1:23 AM Jiri Olsa wrote: > > On Tue, Aug 07, 2018 at 12:11:05PM -0700, Stephane Eranian wrote: > > Jiri, > > On Tue, Aug 7, 2018 at 1:50 AM Jiri Olsa wrote: > > > > > > On Tue, Aug 07, 2018 at 01:16:22AM -0700, Stephane Eranian wrote: > > > > On Tue, Aug 7, 2018 at 12:20 AM Jiri Olsa wrote: > > > > > > > > > > On Mon, Aug 06, 2018 at 06:23:35PM -0700, Stephane Eranian wrote: > > > > > > Depending on memory allocations, it was possible to get a SEGFAULT in > > > > > > free_dup_event() because the event pointer was bogus: > > > > > > > > > > > > perf[1354]: segfault at ffffffff00000006 ip 00000000004b7fc7 > > > > > > > > > > is there any reproducer? > > > > > > > > > The cmdline is simple: > > > > $ perf record -e cycles:pp -o - -a sleep 1 | perf inject -b -i - >/dev/null > > > > I was using v4.13 for my tests and it may be sensitive to compiler. > > > > Was using LLVM. > > > > > > I can't make it fail even when I compile with clang 'make CC=clang' > > > > > I checked, my actual reproducer is: > > $ perf record -o - -e cycles date | perf inject -b -i - >/dev/null > > Tue Aug 7 12:03:48 PDT 2018 > > [ perf record: Woken up 1 times to write data ] > > [ perf record: Captured and wrote 0.000 MB - ] > > Segmentation fault (core dumped) > > > > the crash is in perf inject. > > So if I do: > > $ perf record -o - -e cycles date >tt > > $ gdb perf > > (gdb) r inject -b -i - < tt >/dev/null > > Program received signal SIGSEGV, Segmentation fault. > > free_dup_event (oe=0x26a39a0, event=0xffffffff00000000) at > > util/ordered-events.c:85 > > 85 in util/ordered-events.c > > (gdb) bt > > #0 free_dup_event (oe=0x26a39a0, event=0xffffffff00000000) at > > util/ordered-events.c:85 > > #1 ordered_events__free (oe=0x26a39a0) at util/ordered-events.c:310 > > #2 0x00000000004b5a56 in __perf_session__process_pipe_events > > (session=) at util/session.c:1753 > > #3 perf_session__process_events (session=) at > > util/session.c:1932 > > #4 0x000000000043a2eb in __cmd_inject (inject=) at > > builtin-inject.c:750 > > #5 cmd_inject (argc=, argv=) at > > builtin-inject.c:924 > > #6 0x000000000046b175 in run_builtin (p=0xabc640 , > > argc=4, argv=0x7fffffffe560) at perf.c:297 > > #7 0x000000000046b062 in handle_internal_command (argc=4, > > argv=0x7fffffffe560) at perf.c:349 > > #8 0x000000000046a5e8 in run_argv (argcp=, > > argv=) at perf.c:393 > > #9 main (argc=4, argv=0x7fffffffe560) at perf.c:531 > > nice > > > > > Again, this is with an older version of perf compiled with LLVM. > > can you also reproduce it with latest code? > > > Notice the value of event passed to free_dup_event(): 0xffffffff00000000 > > And yes, I checked sizeof(union_perf_event) = 4168 which is the size of > > the mmap2_event which is the largest. > > > > I also checked that you are freeing what you have actually allocated. > > No double free. > > If I add the padding or modifies the call to memdup() as in the patch, > > then the problem > > goes away. > > > > If you don't want to copy 4Kb each time, then you could also make the > > event field > > by a void *event and case whenever needed. > > not sure I understand this one.. could you please elaborate? > > > > > I suspect the problem may come from a compiler optimization or assumption which > > clashes with what you are optimizing here. > > I'll go throught the code and try to find somethign.. thanks a lot for the backtrace > > jirka >