Date: Fri, 15 Oct 2021 10:21:12 -0700
In-Reply-To: <20211015172132.1162559-1-irogers@google.com>
Message-Id: <20211015172132.1162559-2-irogers@google.com>
References: <20211015172132.1162559-1-irogers@google.com>
X-Mailer: git-send-email 2.33.0.1079.g6e70778dc9-goog
Subject: [PATCH v2 01/21] tools lib: Add list_sort.
From: Ian Rogers
To: Andi Kleen, Jiri Olsa, Jin Yao, Namhyung Kim, John Garry,
    Kajol Jain, "Paul A . Clarke", Arnaldo Carvalho de Melo,
    Riccardo Mancini, Kan Liang, Peter Zijlstra, Ingo Molnar,
    Mark Rutland, Alexander Shishkin, Kees Cook, Sami Tolvanen,
    Nick Desaulniers, Andrew Morton, Jacob Keller, Zhen Lei, ToastC,
    Joakim Zhang, Felix Fietkau, Jiapeng Chong, Song Liu, Fabian Hemmer,
    Alexander Antonov, Nicholas Fraser, Adrian Hunter, Denys Zagorui,
    Wan Jiabing, Thomas Richter, Sumanth Korikkar, Heiko Carstens,
    Changbin Du, linux-kernel@vger.kernel.org,
    linux-perf-users@vger.kernel.org, Andrew Kilroy
Cc: Stephane Eranian, Ian Rogers
X-Mailing-List: linux-kernel@vger.kernel.org

Add list_sort.[ch] from the main kernel tree.
The linux/bug.h #include is removed due to conflicting definitions.
Add check-headers and modify perf build accordingly. MANIFEST and
python-ext-sources fixes suggested by Arnaldo.

Suggested-by: Arnaldo Carvalho de Melo
Acked-by: Andi Kleen
Signed-off-by: Ian Rogers
---
 tools/include/linux/list_sort.h    |  14 ++
 tools/lib/list_sort.c              | 252 +++++++++++++++++++++++++++++
 tools/perf/MANIFEST                |   1 +
 tools/perf/check-headers.sh        |   2 +
 tools/perf/util/Build              |   5 +
 tools/perf/util/python-ext-sources |   1 +
 6 files changed, 275 insertions(+)
 create mode 100644 tools/include/linux/list_sort.h
 create mode 100644 tools/lib/list_sort.c

diff --git a/tools/include/linux/list_sort.h b/tools/include/linux/list_sort.h
new file mode 100644
index 000000000000..453105f74e05
--- /dev/null
+++ b/tools/include/linux/list_sort.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_LIST_SORT_H
+#define _LINUX_LIST_SORT_H
+
+#include
+
+struct list_head;
+
+typedef int __attribute__((nonnull(2,3))) (*list_cmp_func_t)(void *,
+		const struct list_head *, const struct list_head *);
+
+__attribute__((nonnull(2,3)))
+void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp);
+#endif
diff --git a/tools/lib/list_sort.c b/tools/lib/list_sort.c
new file mode 100644
index 000000000000..10c067e3a8d2
--- /dev/null
+++ b/tools/lib/list_sort.c
@@ -0,0 +1,252 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+#include
+#include
+#include
+#include
+#include
+
+/*
+ * Returns a list organized in an intermediate format suited
+ * to chaining of merge() calls: null-terminated, no reserved or
+ * sentinel head node, "prev" links not maintained.
+ */
+__attribute__((nonnull(2,3,4)))
+static struct list_head *merge(void *priv, list_cmp_func_t cmp,
+				struct list_head *a, struct list_head *b)
+{
+	struct list_head *head, **tail = &head;
+
+	for (;;) {
+		/* if equal, take 'a' -- important for sort stability */
+		if (cmp(priv, a, b) <= 0) {
+			*tail = a;
+			tail = &a->next;
+			a = a->next;
+			if (!a) {
+				*tail = b;
+				break;
+			}
+		} else {
+			*tail = b;
+			tail = &b->next;
+			b = b->next;
+			if (!b) {
+				*tail = a;
+				break;
+			}
+		}
+	}
+	return head;
+}
+
+/*
+ * Combine final list merge with restoration of standard doubly-linked
+ * list structure. This approach duplicates code from merge(), but
+ * runs faster than the tidier alternatives of either a separate final
+ * prev-link restoration pass, or maintaining the prev links
+ * throughout.
+ */
+__attribute__((nonnull(2,3,4,5)))
+static void merge_final(void *priv, list_cmp_func_t cmp, struct list_head *head,
+			struct list_head *a, struct list_head *b)
+{
+	struct list_head *tail = head;
+	u8 count = 0;
+
+	for (;;) {
+		/* if equal, take 'a' -- important for sort stability */
+		if (cmp(priv, a, b) <= 0) {
+			tail->next = a;
+			a->prev = tail;
+			tail = a;
+			a = a->next;
+			if (!a)
+				break;
+		} else {
+			tail->next = b;
+			b->prev = tail;
+			tail = b;
+			b = b->next;
+			if (!b) {
+				b = a;
+				break;
+			}
+		}
+	}
+
+	/* Finish linking remainder of list b on to tail */
+	tail->next = b;
+	do {
+		/*
+		 * If the merge is highly unbalanced (e.g. the input is
+		 * already sorted), this loop may run many iterations.
+		 * Continue callbacks to the client even though no
+		 * element comparison is needed, so the client's cmp()
+		 * routine can invoke cond_resched() periodically.
+		 */
+		if (unlikely(!++count))
+			cmp(priv, b, b);
+		b->prev = tail;
+		tail = b;
+		b = b->next;
+	} while (b);
+
+	/* And the final links to make a circular doubly-linked list */
+	tail->next = head;
+	head->prev = tail;
+}
+
+/**
+ * list_sort - sort a list
+ * @priv: private data, opaque to list_sort(), passed to @cmp
+ * @head: the list to sort
+ * @cmp: the elements comparison function
+ *
+ * The comparison function @cmp must return > 0 if @a should sort after
+ * @b ("@a > @b" if you want an ascending sort), and <= 0 if @a should
+ * sort before @b *or* their original order should be preserved. It is
+ * always called with the element that came first in the input in @a,
+ * and list_sort is a stable sort, so it is not necessary to distinguish
+ * the @a < @b and @a == @b cases.
+ *
+ * This is compatible with two styles of @cmp function:
+ * - The traditional style which returns <0 / =0 / >0, or
+ * - Returning a boolean 0/1.
+ * The latter offers a chance to save a few cycles in the comparison
+ * (which is used by e.g. plug_ctx_cmp() in block/blk-mq.c).
+ *
+ * A good way to write a multi-word comparison is::
+ *
+ *	if (a->high != b->high)
+ *		return a->high > b->high;
+ *	if (a->middle != b->middle)
+ *		return a->middle > b->middle;
+ *	return a->low > b->low;
+ *
+ *
+ * This mergesort is as eager as possible while always performing at least
+ * 2:1 balanced merges. Given two pending sublists of size 2^k, they are
+ * merged to a size-2^(k+1) list as soon as we have 2^k following elements.
+ *
+ * Thus, it will avoid cache thrashing as long as 3*2^k elements can
+ * fit into the cache. Not quite as good as a fully-eager bottom-up
+ * mergesort, but it does use 0.2*n fewer comparisons, so is faster in
+ * the common case that everything fits into L1.
+ *
+ *
+ * The merging is controlled by "count", the number of elements in the
+ * pending lists. This is beautifully simple code, but rather subtle.
+ *
+ * Each time we increment "count", we set one bit (bit k) and clear
+ * bits k-1 .. 0. Each time this happens (except the very first time
+ * for each bit, when count increments to 2^k), we merge two lists of
+ * size 2^k into one list of size 2^(k+1).
+ *
+ * This merge happens exactly when the count reaches an odd multiple of
+ * 2^k, which is when we have 2^k elements pending in smaller lists,
+ * so it's safe to merge away two lists of size 2^k.
+ *
+ * After this happens twice, we have created two lists of size 2^(k+1),
+ * which will be merged into a list of size 2^(k+2) before we create
+ * a third list of size 2^(k+1), so there are never more than two pending.
+ *
+ * The number of pending lists of size 2^k is determined by the
+ * state of bit k of "count" plus two extra pieces of information:
+ *
+ * - The state of bit k-1 (when k == 0, consider bit -1 always set), and
+ * - Whether the higher-order bits are zero or non-zero (i.e.
+ *   is count >= 2^(k+1)).
+ *
+ * There are six states we distinguish. "x" represents some arbitrary
+ * bits, and "y" represents some arbitrary non-zero bits:
+ * 0:  00x: 0 pending of size 2^k;           x pending of sizes < 2^k
+ * 1:  01x: 0 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k
+ * 2: x10x: 0 pending of size 2^k; 2^k     + x pending of sizes < 2^k
+ * 3: x11x: 1 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k
+ * 4: y00x: 1 pending of size 2^k; 2^k     + x pending of sizes < 2^k
+ * 5: y01x: 2 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k
+ *   (merge and loop back to state 2)
+ *
+ * We gain lists of size 2^k in the 2->3 and 4->5 transitions (because
+ * bit k-1 is set while the more significant bits are non-zero) and
+ * merge them away in the 5->2 transition. Note in particular that just
+ * before the 5->2 transition, all lower-order bits are 11 (state 3),
+ * so there is one list of each smaller size.
+ *
+ * When we reach the end of the input, we merge all the pending
+ * lists, from smallest to largest. If you work through cases 2 to
+ * 5 above, you can see that the number of elements we merge with a list
+ * of size 2^k varies from 2^(k-1) (cases 3 and 5 when x == 0) to
+ * 2^(k+1) - 1 (second merge of case 5 when x == 2^(k-1) - 1).
+ */
+__attribute__((nonnull(2,3)))
+void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp)
+{
+	struct list_head *list = head->next, *pending = NULL;
+	size_t count = 0;	/* Count of pending */
+
+	if (list == head->prev)	/* Zero or one elements */
+		return;
+
+	/* Convert to a null-terminated singly-linked list. */
+	head->prev->next = NULL;
+
+	/*
+	 * Data structure invariants:
+	 * - All lists are singly linked and null-terminated; prev
+	 *   pointers are not maintained.
+	 * - pending is a prev-linked "list of lists" of sorted
+	 *   sublists awaiting further merging.
+	 * - Each of the sorted sublists is power-of-two in size.
+	 * - Sublists are sorted by size and age, smallest & newest at front.
+	 * - There are zero to two sublists of each size.
+	 * - A pair of pending sublists are merged as soon as the number
+	 *   of following pending elements equals their size (i.e.
+	 *   each time count reaches an odd multiple of that size).
+	 *   That ensures each later final merge will be at worst 2:1.
+	 * - Each round consists of:
+	 *   - Merging the two sublists selected by the highest bit
+	 *     which flips when count is incremented, and
+	 *   - Adding an element from the input as a size-1 sublist.
+	 */
+	do {
+		size_t bits;
+		struct list_head **tail = &pending;
+
+		/* Find the least-significant clear bit in count */
+		for (bits = count; bits & 1; bits >>= 1)
+			tail = &(*tail)->prev;
+		/* Do the indicated merge */
+		if (likely(bits)) {
+			struct list_head *a = *tail, *b = a->prev;
+
+			a = merge(priv, cmp, b, a);
+			/* Install the merged result in place of the inputs */
+			a->prev = b->prev;
+			*tail = a;
+		}
+
+		/* Move one element from input list to pending */
+		list->prev = pending;
+		pending = list;
+		list = list->next;
+		pending->next = NULL;
+		count++;
+	} while (list);
+
+	/* End of input; merge together all the pending lists. */
+	list = pending;
+	pending = pending->prev;
+	for (;;) {
+		struct list_head *next = pending->prev;
+
+		if (!next)
+			break;
+		list = merge(priv, cmp, pending, list);
+		pending = next;
+	}
+	/* The final merge, rebuilding prev links */
+	merge_final(priv, cmp, head, pending, list);
+}
+EXPORT_SYMBOL(list_sort);
diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
index f05c4d48fd7e..e728615a3830 100644
--- a/tools/perf/MANIFEST
+++ b/tools/perf/MANIFEST
@@ -17,6 +17,7 @@ tools/lib/symbol/kallsyms.c
 tools/lib/symbol/kallsyms.h
 tools/lib/find_bit.c
 tools/lib/bitmap.c
+tools/lib/list_sort.c
 tools/lib/str_error_r.c
 tools/lib/vsprintf.c
 tools/lib/zalloc.c
diff --git a/tools/perf/check-headers.sh b/tools/perf/check-headers.sh
index f1e46277e822..30ecf3a0f68b 100755
--- a/tools/perf/check-headers.sh
+++ b/tools/perf/check-headers.sh
@@ -26,6 +26,7 @@ include/vdso/bits.h
 include/linux/const.h
 include/vdso/const.h
 include/linux/hash.h
+include/linux/list_sort.h
 include/uapi/linux/hw_breakpoint.h
 arch/x86/include/asm/disabled-features.h
 arch/x86/include/asm/required-features.h
@@ -150,6 +151,7 @@ check include/uapi/linux/mman.h '-I "^#include <\(uapi/\)*asm/mman.h>"'
 check include/linux/build_bug.h '-I "^#\(ifndef\|endif\)\( \/\/\)* static_assert$"'
 check include/linux/ctype.h '-I "isdigit("'
 check lib/ctype.c '-I "^EXPORT_SYMBOL" -I "^#include " -B'
+check lib/list_sort.c '-I "^#include "'
 
 # diff non-symmetric files
 check_2 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index f2914d5bed6e..15b2366ad384 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -138,6 +138,7 @@ perf-y += expr.o
 perf-y += branch.o
 perf-y += mem2node.o
 perf-y += clockid.o
+perf-y += list_sort.o
 
 perf-$(CONFIG_LIBBPF) += bpf-loader.o
 perf-$(CONFIG_LIBBPF) += bpf_map.o
@@ -315,3 +316,7 @@ $(OUTPUT)util/hweight.o: ../lib/hweight.c FORCE
 $(OUTPUT)util/vsprintf.o: ../lib/vsprintf.c FORCE
 	$(call rule_mkdir)
 	$(call if_changed_dep,cc_o_c)
+
+$(OUTPUT)util/list_sort.o: ../lib/list_sort.c FORCE
+	$(call rule_mkdir)
+	$(call if_changed_dep,cc_o_c)
diff --git a/tools/perf/util/python-ext-sources b/tools/perf/util/python-ext-sources
index d7c976671e3a..a685d20165f7 100644
--- a/tools/perf/util/python-ext-sources
+++ b/tools/perf/util/python-ext-sources
@@ -18,6 +18,7 @@ util/mmap.c
 util/namespaces.c
 ../lib/bitmap.c
 ../lib/find_bit.c
+../lib/list_sort.c
 ../lib/hweight.c
 ../lib/string.c
 ../lib/vsprintf.c
-- 
2.33.0.1079.g6e70778dc9-goog