Subject: Re: [PATCH v6 20/20] perf session: Load data directory files for analysis
From: Riccardo Mancini
To: Alexey Bayduraev
Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra, Ingo Molnar,
    linux-kernel, Andi Kleen, Adrian Hunter, Alexander Antonov,
    Alexei Budankov, linux-perf-users@vger.kernel.org, Ian Rogers,
    Arnaldo Carvalho de Melo
Date: Fri, 04 Jun 2021 01:28:17 +0200
User-Agent: Evolution 3.40.1 (3.40.1-1.fc34)
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On Wed, 2021-05-26 at 13:53 +0300, Alexey Bayduraev wrote:
> Load data directory files and provide basic raw dump and aggregated
> analysis support of data directories in report mode, still with no
> memory consumption optimizations.
>
> Design and implementation are based on the prototype [1], [2].
>
> [1] git clone https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git -b perf/record_threads
> [2] https://lore.kernel.org/lkml/20180913125450.21342-1-jolsa@kernel.org/
>
> Suggested-by: Jiri Olsa
> Signed-off-by: Alexey Bayduraev
> ---
>  tools/perf/util/session.c | 129 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 129 insertions(+)
>
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 041601810b85..dd4ef9749cd0 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -65,6 +65,7 @@ struct reader_state {
>         u64      data_size;
>         u64      head;
>         bool     eof;
> +       u64      size;
>  };
>
>  enum {
> @@ -2319,6 +2320,7 @@ reader__read_event(struct reader *rd, struct perf_session *session,
>         if (skip)
>                 size += skip;
>
> +       st->size += size;
>         st->head += size;
>         st->file_pos += size;
>
> @@ -2418,6 +2420,130 @@ static int __perf_session__process_events(struct perf_session *session)
>         return err;
>  }
>
> +/*
> + * This function reads, merge and process directory data.
> + * It assumens the version 1 of directory data, where each
> + * data file holds per-cpu data, already sorted by kernel.
> + */
> +static int __perf_session__process_dir_events(struct perf_session *session)
> +{
> +       struct perf_data *data = session->data;
> +       struct perf_tool *tool = session->tool;
> +       int i, ret = 0, readers = 1;
> +       struct ui_progress prog;
> +       u64 total_size = perf_data__size(session->data);
> +       struct reader *rd;
> +
> +       perf_tool__fill_defaults(tool);
> +
> +       ui_progress__init_size(&prog, total_size, "Sorting events...");
> +
> +       for (i = 0; i < data->dir.nr; i++) {
> +               if (data->dir.files[i].size)
> +                       readers++;
> +       }
> +
> +       rd = session->readers = zalloc(readers * sizeof(struct reader));
> +       if (!rd)
> +               return -ENOMEM;
> +       session->nr_readers = readers;
> +       readers = 0;
> +
> +       rd[readers] = (struct reader) {
> +               .fd              = perf_data__fd(session->data),
> +               .path            = session->data->file.path,
> +               .data_size       = session->header.data_size,
> +               .data_offset     = session->header.data_offset,
> +               .in_place_update = session->data->in_place_update,
> +       };
> +       ret = reader__init(&rd[readers], NULL);
> +       if (ret)
> +               goto out_err;
> +       ret = reader__mmap(&rd[readers], session);
> +       if (ret != READER_OK) {
> +               if (ret == READER_EOF)
> +                       ret = -EINVAL;
> +               goto out_err;
> +       }
> +       readers++;
> +
> +       for (i = 0; i < data->dir.nr; i++) {
> +               if (data->dir.files[i].size) {
> +                       rd[readers] = (struct reader) {
> +                               .fd              = data->dir.files[i].fd,
> +                               .path            = data->dir.files[i].path,
> +                               .data_size       = data->dir.files[i].size,
> +                               .data_offset     = 0,
> +                               .in_place_update = session->data->in_place_update,
> +                       };
> +                       ret = reader__init(&rd[readers], NULL);

zstd_fini() is never called on rd[readers].zstd_data.
Maybe it can be done in perf_session__delete: for example, we could add a new
reader__fini() function that cleans up the zstd data and releases the
decompressed events (perf_decomp__release_events). A rough sketch of what I
mean is at the end of this mail.

Thanks,
Riccardo

> +                       if (ret)
> +                               goto out_err;
> +                       ret = reader__mmap(&rd[readers], session);
> +                       if (ret != READER_OK) {
> +                               if (ret == READER_EOF)
> +                                       ret = -EINVAL;
> +                               goto out_err;
> +                       }
> +                       readers++;
> +               }
> +       }
> +
> +       i = 0;
> +
> +       while ((ret >= 0) && readers) {
> +               if (session_done())
> +                       return 0;
> +
> +               if (rd[i].state.eof) {
> +                       i = (i + 1) % session->nr_readers;
> +                       continue;
> +               }
> +
> +               ret = reader__read_event(&rd[i], session, &prog);
> +               if (ret < 0)
> +                       break;
> +               if (ret == READER_EOF) {
> +                       ret = reader__mmap(&rd[i], session);
> +                       if (ret < 0)
> +                               goto out_err;
> +                       if (ret == READER_EOF)
> +                               readers--;
> +               }
> +
> +               /*
> +                * Processing 10MBs of data from each reader in sequence,
> +                * because that's the way the ordered events sorting works
> +                * most efficiently.
> +                */
> +               if (rd[i].state.size >= 10*1024*1024) {
> +                       rd[i].state.size = 0;
> +                       i = (i + 1) % session->nr_readers;
> +               }
> +       }
> +
> +       ret = ordered_events__flush(&session->ordered_events, OE_FLUSH__FINAL);
> +       if (ret)
> +               goto out_err;
> +
> +       ret = perf_session__flush_thread_stacks(session);
> +out_err:
> +       ui_progress__finish();
> +
> +       if (!tool->no_warn)
> +               perf_session__warn_about_errors(session);
> +
> +       /*
> +        * We may switching perf.data output, make ordered_events
> +        * reusable.
> +        */
> +       ordered_events__reinit(&session->ordered_events);
> +
> +       session->one_mmap = false;
> +
> +       return ret;
> +}
> +
>  int perf_session__process_events(struct perf_session *session)
>  {
>         if (perf_session__register_idle_thread(session) < 0)
> @@ -2426,6 +2552,9 @@ int perf_session__process_events(struct perf_session *session)
>         if (perf_data__is_pipe(session->data))
>                 return __perf_session__process_pipe_events(session);
>
> +       if (perf_data__is_dir(session->data))
> +               return __perf_session__process_dir_events(session);
> +
>         return __perf_session__process_events(session);
>  }
>
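P.S.: to make the reader__fini() suggestion above a bit more concrete, here is a
rough, compile-untested sketch. reader__fini() itself is new; the zstd_data,
readers and nr_readers names are taken from this patch, but the exact place in
perf_session__delete is only my guess, so please treat it as an illustration of
where the cleanup could live, not as a tested change:

/* New counterpart to reader__init(): release per-reader state. */
static void reader__fini(struct reader *rd)
{
	/* reader__init() calls zstd_init() on this, so undo it here. */
	zstd_fini(&rd->zstd_data);
	/*
	 * If decompressed events end up being tracked per reader, they
	 * could also be released here, along the lines of
	 * perf_decomp__release_events().
	 */
}

void perf_session__delete(struct perf_session *session)
{
	if (session == NULL)
		return;

	/* existing teardown elided */

	if (session->readers) {
		int i;

		for (i = 0; i < session->nr_readers; i++)
			reader__fini(&session->readers[i]);
		zfree(&session->readers);
	}

	/* rest of existing teardown elided */
}

That way each reader cleans up after itself when the session goes away, instead
of leaking the zstd context for every data directory file.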