Received: by 2002:ac0:bc90:0:0:0:0:0 with SMTP id a16csp1155715img; Tue, 19 Mar 2019 01:23:58 -0700 (PDT) X-Google-Smtp-Source: APXvYqziEuDFx6r4iHjTsa7XWIae0JqCwB4Iyp/vRrnmFOLNx643DfE50UCjN7RuVA1sYMXjfVHH X-Received: by 2002:aa7:91d7:: with SMTP id z23mr735075pfa.137.1552983838090; Tue, 19 Mar 2019 01:23:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1552983838; cv=none; d=google.com; s=arc-20160816; b=B4twrUmOi67QHdmZ6LS9nOPqo7RKS+1jrYBosNnwHC322d2etzlGdyqukp591bC+y2 xi23TajEmwirxp7Q+w4FNp5BfKBCOTTpi79tMwxTShoOWwaJIv62WpXN6dqUBr6VgVAy kDQhOXjx2qn+ZLClfaUatdEPlOoXFhZBR+bzcDnu4KUpbqOkqvzANrfvXhF3zrU4mRY0 wnkqzdi3IPTJUkt8HrQIKYMpYFa5hqKRMfUsDdCQEw7g76K1Bnx9Jk/hYJ8vrqQcCVI8 jywnX3cylmqOa+zV2ZhN2tKWalOhpGdiC10Tl9SDwZfchibfxvHvOMeo5kfwEhkXQPLp hsdQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:date:from:references :in-reply-to:message-id; bh=r0MR6qUkueqOYsSHnc37PP/qIHnVXhxrRg08dt5/D9M=; b=DW45CSCzxDvTffarj+lJMEZgC6fFMdB1i0MOxICCTYtjWn02maookfJ/p6/mPCmnv7 jr0QaoUBGg4U/4G7Zvuxr3ftmvoGzv0BXrv2sXvLze3JGuySnvbAVKdA56aSrEvmQQv3 Rpd86rEOlwZqv/1TbIs3DF4oup0aSjlTNYyNQxaunEYrB9iN/H3fAja891fG9g2xvsCM uL3zaJIMQQ58x2i/m9FnZcDQ7Elhq6v57VAfL59y9sDvFoN1ppwxjAkYxkTAd1Pk7V3o 2xCWEF2ED7AWB4mXI3PpDuKvQwdO/T9zq2DZWB7Q2lsrWIjrY6wZPGKMK3phb/RnOfUQ z07w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id bi12si11215529plb.205.2019.03.19.01.23.42; Tue, 19 Mar 2019 01:23:58 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727550AbfCSIXH (ORCPT + 99 others); Tue, 19 Mar 2019 04:23:07 -0400 Received: from mx.sdf.org ([205.166.94.20]:55365 "EHLO mx.sdf.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727143AbfCSIXF (ORCPT ); Tue, 19 Mar 2019 04:23:05 -0400 Received: from sdf.org (IDENT:lkml@sdf.lonestar.org [205.166.94.16]) by mx.sdf.org (8.15.2/8.14.5) with ESMTPS id x2J8MgWm004029 (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256 bits) verified NO); Tue, 19 Mar 2019 08:22:42 GMT Received: (from lkml@localhost) by sdf.org (8.15.2/8.12.8/Submit) id x2J8MgPv021399; Tue, 19 Mar 2019 08:22:42 GMT Message-Id: In-Reply-To: References: From: George Spelvin Date: Tue, 19 Mar 2019 08:16:00 +0000 Subject: [RESEND PATCH v2 4/5] lib/list_sort: Simplify and remove MAX_LIST_LENGTH_BITS To: linux-kernel@vger.kernel.org, kernel-janitors@vger.kernel.org, Andrew Morton Cc: George Spelvin , Andrey Abramov , Geert Uytterhoeven , Daniel Wagner , Rasmus Villemoes , Don Mullis , Dave Chinner , Andy Shevchenko Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Rather than a fixed-size array of pending sorted runs, use the ->prev links to keep track of things. This reduces stack usage, eliminates some ugly overflow handling, and reduces the code size. Also: * merge() no longer needs to handle NULL inputs, so simplify. * The same applies to merge_and_restore_back_links(), which is renamed to the less ponderous merge_final(). (It's a static helper function, so we don't need a super-descriptive name; comments will do.) * Document the actual return value requirements on the (*cmp)() function; some callers are already using this feature. x86-64 code size 1086 -> 739 bytes (-347) (Yes, I see checkpatch complaining about no space after comma in "__attribute__((nonnull(2,3,4,5)))". Checkpatch is wrong.) Signed-off-by: George Spelvin Acked-by: Andrey Abramov Feedback-from: Rasmus Villemoes Feedback-from: Andy Shevchenko Feedback-from: Geert Uytterhoeven --- include/linux/list_sort.h | 1 + lib/list_sort.c | 169 ++++++++++++++++++++++++-------------- 2 files changed, 109 insertions(+), 61 deletions(-) diff --git a/include/linux/list_sort.h b/include/linux/list_sort.h index ba79956e848d..20f178c24e9d 100644 --- a/include/linux/list_sort.h +++ b/include/linux/list_sort.h @@ -6,6 +6,7 @@ struct list_head; +__attribute__((nonnull(2,3))) void list_sort(void *priv, struct list_head *head, int (*cmp)(void *priv, struct list_head *a, struct list_head *b)); diff --git a/lib/list_sort.c b/lib/list_sort.c index 85759928215b..fc807dd60a51 100644 --- a/lib/list_sort.c +++ b/lib/list_sort.c @@ -7,33 +7,47 @@ #include #include -#define MAX_LIST_LENGTH_BITS 20 +/* + * By declaring the compare function with the __pure attribute, we give + * the compiler more opportunity to optimize. Ideally, we'd use this in + * the prototype of list_sort(), but that would involve a lot of churn + * at all call sites, so just cast the function pointer passed in. + */ +typedef int __pure __attribute__((nonnull(2,3))) (*cmp_func)(void *, + struct list_head const *, struct list_head const *); /* * Returns a list organized in an intermediate format suited * to chaining of merge() calls: null-terminated, no reserved or * sentinel head node, "prev" links not maintained. */ -static struct list_head *merge(void *priv, - int (*cmp)(void *priv, struct list_head *a, - struct list_head *b), +__attribute__((nonnull(2,3,4))) +static struct list_head *merge(void *priv, cmp_func cmp, struct list_head *a, struct list_head *b) { - struct list_head head, *tail = &head; + struct list_head *head, **tail = &head; - while (a && b) { + for (;;) { /* if equal, take 'a' -- important for sort stability */ - if ((*cmp)(priv, a, b) <= 0) { - tail->next = a; + if (cmp(priv, a, b) <= 0) { + *tail = a; + tail = &a->next; a = a->next; + if (!a) { + *tail = b; + break; + } } else { - tail->next = b; + *tail = b; + tail = &b->next; b = b->next; + if (!b) { + *tail = a; + break; + } } - tail = tail->next; } - tail->next = a?:b; - return head.next; + return head; } /* @@ -43,44 +57,52 @@ static struct list_head *merge(void *priv, * prev-link restoration pass, or maintaining the prev links * throughout. */ -static void merge_and_restore_back_links(void *priv, - int (*cmp)(void *priv, struct list_head *a, - struct list_head *b), - struct list_head *head, - struct list_head *a, struct list_head *b) +__attribute__((nonnull(2,3,4,5))) +static void merge_final(void *priv, cmp_func cmp, struct list_head *head, + struct list_head *a, struct list_head *b) { struct list_head *tail = head; u8 count = 0; - while (a && b) { + for (;;) { /* if equal, take 'a' -- important for sort stability */ - if ((*cmp)(priv, a, b) <= 0) { + if (cmp(priv, a, b) <= 0) { tail->next = a; a->prev = tail; + tail = a; a = a->next; + if (!a) + break; } else { tail->next = b; b->prev = tail; + tail = b; b = b->next; + if (!b) { + b = a; + break; + } } - tail = tail->next; } - tail->next = a ? : b; + /* Finish linking remainder of list b on to tail */ + tail->next = b; do { /* - * In worst cases this loop may run many iterations. + * If the merge is highly unbalanced (e.g. the input is + * already sorted), this loop may run many iterations. * Continue callbacks to the client even though no * element comparison is needed, so the client's cmp() * routine can invoke cond_resched() periodically. */ - if (unlikely(!(++count))) - (*cmp)(priv, tail->next, tail->next); - - tail->next->prev = tail; - tail = tail->next; - } while (tail->next); + if (unlikely(!++count)) + cmp(priv, b, b); + b->prev = tail; + tail = b; + b = b->next; + } while (b); + /* And the final links to make a circular doubly-linked list */ tail->next = head; head->prev = tail; } @@ -91,55 +113,80 @@ static void merge_and_restore_back_links(void *priv, * @head: the list to sort * @cmp: the elements comparison function * - * This function implements "merge sort", which has O(nlog(n)) - * complexity. + * This function implements a bottom-up merge sort, which has O(nlog(n)) + * complexity. We use depth-first order to take advantage of cacheing. + * (E.g. when we get to the fourth element, we immediately merge the + * first two 2-element lists.) * - * The comparison function @cmp must return a negative value if @a - * should sort before @b, and a positive value if @a should sort after - * @b. If @a and @b are equivalent, and their original relative - * ordering is to be preserved, @cmp must return 0. + * The comparison funtion @cmp must return > 0 if @a should sort after + * @b ("@a > @b" if you want an ascending sort), and <= 0 if @a should + * sort before @b *or* their original order should be preserved. It is + * always called with the element that came first in the input in @a, + * and list_sort is a stable sort, so it is not necessary to distinguish + * the @a < @b and @a == @b cases. + * + * This is compatible with two styles of @cmp function: + * - The traditional style which returns <0 / =0 / >0, or + * - Returning a boolean 0/1. + * The latter offers a chance to save a few cycles in the comparison + * (which is used by e.g. plug_ctx_cmp() in block/blk-mq.c). + * + * A good way to write a multi-word comparison is + * if (a->high != b->high) + * return a->high > b->high; + * if (a->middle != b->middle) + * return a->middle > b->middle; + * return a->low > b->low; */ +__attribute__((nonnull(2,3))) void list_sort(void *priv, struct list_head *head, int (*cmp)(void *priv, struct list_head *a, struct list_head *b)) { - struct list_head *part[MAX_LIST_LENGTH_BITS+1]; /* sorted partial lists - -- last slot is a sentinel */ - int lev; /* index into part[] */ - int max_lev = 0; - struct list_head *list; + struct list_head *list = head->next, *pending = NULL; + size_t count = 0; /* Count of pending */ - if (list_empty(head)) + if (list == head->prev) /* Zero or one elements */ return; - memset(part, 0, sizeof(part)); - + /* Convert to a null-terminated singly-linked list. */ head->prev->next = NULL; - list = head->next; - while (list) { + /* + * Data structure invariants: + * - All lists are singly linked and null-terminated; prev + * pointers are not maintained. + * - pending is a prev-linked "list of lists" of sorted + * sublists awaiting further merging. + * - Each of the sorted sublists is power-of-two in size, + * corresponding to bits set in "count". + * - Sublists are sorted by size and age, smallest & newest at front. + */ + do { + size_t bits; struct list_head *cur = list; + + /* Extract the head of "list" as a single-element list "cur" */ list = list->next; cur->next = NULL; - for (lev = 0; part[lev]; lev++) { - cur = merge(priv, cmp, part[lev], cur); - part[lev] = NULL; + /* Do merges corresponding to set lsbits in count */ + for (bits = count; bits & 1; bits >>= 1) { + cur = merge(priv, (cmp_func)cmp, pending, cur); + pending = pending->prev; /* Untouched by merge() */ } - if (lev > max_lev) { - if (unlikely(lev >= ARRAY_SIZE(part)-1)) { - printk_once(KERN_DEBUG "list too long for efficiency\n"); - lev--; - } - max_lev = lev; - } - part[lev] = cur; + /* And place the result at the head of "pending" */ + cur->prev = pending; + pending = cur; + count++; + } while (list->next); + + /* Now merge together last element with all pending lists */ + while (pending->prev) { + list = merge(priv, (cmp_func)cmp, pending, list); + pending = pending->prev; } - - for (lev = 0; lev < max_lev; lev++) - if (part[lev]) - list = merge(priv, cmp, part[lev], list); - - merge_and_restore_back_links(priv, cmp, head, part[max_lev], list); + /* The final merge, rebuilding prev links */ + merge_final(priv, (cmp_func)cmp, head, pending, list); } EXPORT_SYMBOL(list_sort); -- 2.20.1