Received: by 2002:a05:6a10:f3d0:0:0:0:0 with SMTP id a16csp3842174pxv; Tue, 13 Jul 2021 05:12:57 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzFcA2PoTvUTLW9vOMWDfYAyCHZKLgpOULZjkuMNKfCom9O7zs/fh1UmrVnhfmq2fpu35/n X-Received: by 2002:a17:906:4f02:: with SMTP id t2mr5233463eju.217.1626178377365; Tue, 13 Jul 2021 05:12:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1626178377; cv=none; d=google.com; s=arc-20160816; b=h9xmqnimRYmsDaJ6vrD4n1sYliOyliDah2/laY86kKBLIBVitAQqS3vpIdlj32uHw0 6ldlbA9opDOCxU/afFuBaXcmrMGiVIj2c+V0Dsx4O3dctUk0npr7EH3GJ0gGW5cxpe4k W0D4xgBhSgbzAn+6fAy08cC7q0hBRyWmOm2r/Zk+YTOoJdKAZ6jGUOvPDQSOQu/O+Y42 q6ANgX7WddUwBKRPcAovLgQfqoKevlJT59utKy+deTjnl5dMwhtdITruV8aFjVLm16kw xT9865OcFnEwQ/oXhmhCjdgbTw4nlFbEvK91mU6fGQigvPzE3eHJr5eEJ0nGHU1lGVJj Ndmg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=0k9i4ryQ9bJFDsrRO38Wd8DYLKsoLjmKBJR+v0YQjxs=; b=EloZGrExViaIxWs8hGRbOWCGqtvZstnUyJvG9Qhv6a5fMkzuP5vmZ3nM+FA7JPB1aV mZxYy2juM8GYuMw6tAMXvBIdQrkG0iJOAClK0Rr/goex+cHc4W8rodmsPAxnQI2Pc4Tf a8i8atyVYc1Wt4Nf4JSS0tOCHTA0ux9bT3picD30MuYaFSVyfdCNDfQsBrKbfwnn6gm6 t9Y7YqjKbrDNRB4MscYtnbCdq6ANIuuyAid8FIuhvIjhpj/1ns+yJIqJafEzAjWGFfQh PCfVUAd0Zhhr9BrbviUD8r5Y5gae3jG0CSUzQ+WtsbFwvBzT/f6/ktaWjmIZurJSWITd zbcg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=nqnYzTcu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id dp16si25154378ejc.434.2021.07.13.05.12.34; Tue, 13 Jul 2021 05:12:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=nqnYzTcu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236042AbhGMMO0 (ORCPT + 99 others); Tue, 13 Jul 2021 08:14:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40860 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235936AbhGMMOY (ORCPT ); Tue, 13 Jul 2021 08:14:24 -0400 Received: from mail-wm1-x334.google.com (mail-wm1-x334.google.com [IPv6:2a00:1450:4864:20::334]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AF6C4C0613DD; Tue, 13 Jul 2021 05:11:33 -0700 (PDT) Received: by mail-wm1-x334.google.com with SMTP id b14-20020a1c1b0e0000b02901fc3a62af78so1462528wmb.3; Tue, 13 Jul 2021 05:11:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=0k9i4ryQ9bJFDsrRO38Wd8DYLKsoLjmKBJR+v0YQjxs=; b=nqnYzTcu4Et3R8OrU+dlYoeVA3GpRaLFiomzAkbfuc3beAXI2HptS+mLpO17WTkbCu DrnktVrVaNiftWwLNMLBDLJhGa3CW6Zsa+B53G5Jvm6AxsMvuKtwKvZ+/aRso6+JXUy6 7TVkn7/7DMqI+fbUlMAfF4L7sqG9kgM/9EGnQyxI72C0Vl3j5UlxDxPnwiAoPo20nnXd HFg6DtAQjzpWVG8BF3tI2ugq9esCS1Ze3+mBggZv26BO3mW9DDcNMCbw+al+EDPyDHnM 6sDI3j6GTUWZjejh2RhzHR0GNaDYksenuBUsp1wtNh/gxWTxCfKbZACmzkNbGiS1YjrK tWfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=0k9i4ryQ9bJFDsrRO38Wd8DYLKsoLjmKBJR+v0YQjxs=; b=E+m5alwu80QDkU7t4bF5+8Q55quU37DVEBctVed/anwM+s5R0OukSXKVZhQ8RA6PRA XxxIEaMzSYvcbrZ6VBfZlymXJHuWb4BhaazF9nZ/XKL7jqbFvs3qO7/TDacgMh+nwOcH TgJwfMCtHsNYseDQiFbUWxN0GjPg6F7gpAPBYR+oad/yCfPVetOiFwLLtJ4gn7N1NjEI ZEwn3dHl/h7fsuEx69PlEPTbESyt02jS5DfMEhNkl9fKGGbWoQviQhqISeq7yR6IbamQ luUUuHImvpLj+Wxn9CHB66cD+HwxpKTJZWD4UpsGcokAOaq647XaMQdAH1ssM0MhjOZB g6Hg== X-Gm-Message-State: AOAM5333uhU1WjZv+kBMurilJU0w/tJA047oUdxfOGdg8Q1dWcnZ8BMk 9/fgYJZOqtAVOLQubJ61gdg= X-Received: by 2002:a7b:c1c2:: with SMTP id a2mr4573948wmj.15.1626178291200; Tue, 13 Jul 2021 05:11:31 -0700 (PDT) Received: from localhost.localdomain ([2001:b07:6456:fd99:ced0:db1c:53e1:191e]) by smtp.googlemail.com with ESMTPSA id o3sm17551425wrw.56.2021.07.13.05.11.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 05:11:30 -0700 (PDT) From: Riccardo Mancini To: Arnaldo Carvalho de Melo Cc: Ian Rogers , Namhyung Kim , Peter Zijlstra , Ingo Molnar , Mark Rutland , Jiri Olsa , linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Riccardo Mancini Subject: [RFC PATCH 00/10] perf: add workqueue library and use it in synthetic-events Date: Tue, 13 Jul 2021 14:11:11 +0200 Message-Id: X-Mailer: git-send-email 2.31.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patchset introduces a new utility library inside perf/util, which provides a work queue abstraction, which loosely follows the Kernel workqueue API. The workqueue abstraction is made up by two components: - threadpool: which takes care of managing a pool of threads. It is inspired by the prototype for threaded trace in perf-record from Alexey: https://lore.kernel.org/lkml/cover.1625227739.git.alexey.v.bayduraev@linux.intel.com/ - workqueue: manages a shared queue and provides the workers implementation. On top of the workqueue, a simple parallel-for utility is implemented which is then showcased in synthetic-events.c, replacing the previous manual pthread-created threads. Through some experiments with perf bench, I can see how the new workqueue has a higher overhead compared to manual creation of threads, but is able to more effectively partition work among threads, yielding a better result with more threads. Furthermore, the overhead could be configured by changing the `work_size` (currently 1), aka the number of dirents that are processed by a thread before grabbing a lock to get the new work item. I experimented with different sizes but, while bigger sizes reduce overhead as expected, they do not scale as well to more threads. I tried to keep the patchset as simple as possible, deferring possible improvements and features to future work. Naming a few: - in order to achieve a better performance, we could consider using work-stealing instead of a common queue. - affinities in the thread pool, as in Alexey prototype for perf-record. Doing so would enable reusing the same threadpool for different purposes (evlist open, threaded trace, synthetic threads), avoiding having to spin up threads multiple times. - resizable threadpool, e.g. for lazy spawining of threads. @Arnaldo Since I wanted the workqueue to provide a similar API to the Kernel's workqueue, I followed the naming style I found there, instead of the usual object__method style that is typically found in perf. Let me know if you'd like me to follow perf style instead. Thanks, Riccardo Riccardo Mancini (10): perf workqueue: threadpool creation and destruction perf tests: add test for workqueue perf workqueue: add threadpool start and stop functions perf workqueue: add threadpool execute and wait functions perf workqueue: add sparse annotation header perf workqueue: introduce workqueue struct perf workqueue: implement worker thread and management perf workqueue: add queue_work and flush_workqueue functions perf workqueue: add utility to execute a for loop in parallel perf synthetic-events: use workqueue parallel_for tools/perf/tests/Build | 1 + tools/perf/tests/builtin-test.c | 9 + tools/perf/tests/tests.h | 3 + tools/perf/tests/workqueue.c | 453 +++++++++++++++++ tools/perf/util/Build | 1 + tools/perf/util/synthetic-events.c | 131 +++-- tools/perf/util/workqueue/Build | 2 + tools/perf/util/workqueue/sparse.h | 21 + tools/perf/util/workqueue/threadpool.c | 516 ++++++++++++++++++++ tools/perf/util/workqueue/threadpool.h | 29 ++ tools/perf/util/workqueue/workqueue.c | 642 +++++++++++++++++++++++++ tools/perf/util/workqueue/workqueue.h | 38 ++ 12 files changed, 1771 insertions(+), 75 deletions(-) create mode 100644 tools/perf/tests/workqueue.c create mode 100644 tools/perf/util/workqueue/Build create mode 100644 tools/perf/util/workqueue/sparse.h create mode 100644 tools/perf/util/workqueue/threadpool.c create mode 100644 tools/perf/util/workqueue/threadpool.h create mode 100644 tools/perf/util/workqueue/workqueue.c create mode 100644 tools/perf/util/workqueue/workqueue.h -- 2.31.1