2014-02-12 17:35:29

by Paweł Sikora

[permalink] [raw]
Subject: [3.13.2] list passed to list_sort() too long for efficiency.

Hi,

i've noticed on my server that dmesg contains following info:

(...)
[ 141.192868] FS-Cache: Loaded
[ 141.203008] FS-Cache: Netfs 'nfs' registered for caching
[ 1383.278845] Key type dns_resolver registered
[ 1383.293014] NFS: Registering the id_resolver key type
[ 1383.293027] Key type id_resolver registered
[ 1383.293028] Key type id_legacy registered
[ 3998.115273] perf samples too long (2502 > 2500), lowering
kernel.perf_event_max_sample_rate to 50100
[16855.582522] list passed to list_sort() too long for efficiency

could someone put some light on the last line?

BR,
Paweł.

please CC on reply.

--
gpg key fingerprint = 60B4 9886 AD53 EB3E 88BB 1EB5 C52E D01B 683B 9411


2014-02-12 17:59:32

by Paweł Sikora

[permalink] [raw]
Subject: Re: [3.13.2] list passed to list_sort() too long for efficiency.

On Wednesday 12 of February 2014 09:46:26 Joe Perches wrote:
> On Wed, 2014-02-12 at 18:16 +0100, Paweł Sikora wrote:
> > [16855.582522] list passed to list_sort() too long for efficiency
> >
> > could someone put some light on the last line?
>
> Did you look at the list_sort function in lib/list_sort.c?
>
> (granted the dmesg is harder to find that it should be
> because the format is split)
>
> It just means it's slower than it could otherwise be

thanks,
so without a stacktrace in dmesg we can't isolate a root of cause :/

--
gpg key fingerprint = 60B4 9886 AD53 EB3E 88BB 1EB5 C52E D01B 683B 9411

2014-02-12 18:04:12

by Dave Jones

[permalink] [raw]
Subject: Re: [3.13.2] list passed to list_sort() too long for efficiency.

On Wed, Feb 12, 2014 at 06:59:07PM +0100, Paweł Sikora wrote:
> On Wednesday 12 of February 2014 09:46:26 Joe Perches wrote:
> > On Wed, 2014-02-12 at 18:16 +0100, Paweł Sikora wrote:
> > > [16855.582522] list passed to list_sort() too long for efficiency
> > >
> > > could someone put some light on the last line?
> >
> > Did you look at the list_sort function in lib/list_sort.c?
> >
> > (granted the dmesg is harder to find that it should be
> > because the format is split)
> >
> > It just means it's slower than it could otherwise be
>
> thanks,
> so without a stacktrace in dmesg we can't isolate a root of cause :/

The commit message mentions xfs and ubifs have long lists that get
passed in. Do you use either of those ?

Dave

2014-02-12 18:07:51

by Joe Perches

[permalink] [raw]
Subject: Re: [3.13.2] list passed to list_sort() too long for efficiency.

On Wed, 2014-02-12 at 18:59 +0100, Paweł Sikora wrote:
> On Wednesday 12 of February 2014 09:46:26 Joe Perches wrote:
> > On Wed, 2014-02-12 at 18:16 +0100, Paweł Sikora wrote:
> > > [16855.582522] list passed to list_sort() too long for efficiency
> > >
> > > could someone put some light on the last line?
> >
> > Did you look at the list_sort function in lib/list_sort.c?
> >
> > (granted the dmesg is harder to find that it should be
> > because the format is split)
> >
> > It just means it's slower than it could otherwise be
>
> thanks,
> so without a stacktrace in dmesg we can't isolate a root of cause :/
>

Perhaps something like this would help:

Emit the caller of list_sort using %pS when the list is too
long to be efficient. Also emit the message if necessary
once for each call.

Coalesce formats.
Use a more current logging style with pr_<level> and pr_<fmt>

---
lib/list_sort.c | 67 +++++++++++++++++++++++++++------------------------------
1 file changed, 32 insertions(+), 35 deletions(-)

diff --git a/lib/list_sort.c b/lib/list_sort.c
index 1183fa7..e4e1a95 100644
--- a/lib/list_sort.c
+++ b/lib/list_sort.c
@@ -1,3 +1,5 @@
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/list_sort.h>
@@ -6,6 +8,8 @@

#define MAX_LIST_LENGTH_BITS 20

+#define DEBUG
+
/*
* Returns a list organized in an intermediate format suited
* to chaining of merge() calls: null-terminated, no reserved or
@@ -103,6 +107,7 @@ void list_sort(void *priv, struct list_head *head,
int lev; /* index into part[] */
int max_lev = 0;
struct list_head *list;
+ bool too_big_msg = false;

if (list_empty(head))
return;
@@ -123,9 +128,11 @@ void list_sort(void *priv, struct list_head *head,
}
if (lev > max_lev) {
if (unlikely(lev >= ARRAY_SIZE(part)-1)) {
- printk_once(KERN_DEBUG "list passed to"
- " list_sort() too long for"
- " efficiency\n");
+ if (!too_big_msg) {
+ too_big_msg = true;
+ pr_debug("list passed from %pS to list_sort() too long for efficiency\n",
+ __builtin_return_address(0));
+ }
lev--;
}
max_lev = lev;
@@ -165,30 +172,29 @@ struct debug_el {
/* Array, containing pointers to all elements in the test list */
static struct debug_el **elts __initdata;

-static int __init check(struct debug_el *ela, struct debug_el *elb)
+static int __init check(struct debug_el *ela, struct debug_el *elb,
+ const char *caller)
{
if (ela->serial >= TEST_LIST_LEN) {
- printk(KERN_ERR "list_sort_test: error: incorrect serial %d\n",
- ela->serial);
+ pr_err("%s: error: incorrect serial %d\n", caller, ela->serial);
return -EINVAL;
}
if (elb->serial >= TEST_LIST_LEN) {
- printk(KERN_ERR "list_sort_test: error: incorrect serial %d\n",
- elb->serial);
+ pr_err("%s: error: incorrect serial %d\n", caller, elb->serial);
return -EINVAL;
}
if (elts[ela->serial] != ela || elts[elb->serial] != elb) {
- printk(KERN_ERR "list_sort_test: error: phantom element\n");
+ pr_err("%s: error: phantom element\n", caller);
return -EINVAL;
}
if (ela->poison1 != TEST_POISON1 || ela->poison2 != TEST_POISON2) {
- printk(KERN_ERR "list_sort_test: error: bad poison: %#x/%#x\n",
- ela->poison1, ela->poison2);
+ pr_err("%s: error: bad poison: %#x/%#x\n",
+ caller, ela->poison1, ela->poison2);
return -EINVAL;
}
if (elb->poison1 != TEST_POISON1 || elb->poison2 != TEST_POISON2) {
- printk(KERN_ERR "list_sort_test: error: bad poison: %#x/%#x\n",
- elb->poison1, elb->poison2);
+ pr_err("%s: error: bad poison: %#x/%#x\n",
+ caller, elb->poison1, elb->poison2);
return -EINVAL;
}
return 0;
@@ -201,7 +207,7 @@ static int __init cmp(void *priv, struct list_head *a, struct list_head *b)
ela = container_of(a, struct debug_el, list);
elb = container_of(b, struct debug_el, list);

- check(ela, elb);
+ check(ela, elb, "list_sort_test");
return ela->value - elb->value;
}

@@ -212,22 +218,17 @@ static int __init list_sort_test(void)
struct list_head *cur, *tmp;
LIST_HEAD(head);

- printk(KERN_DEBUG "list_sort_test: start testing list_sort()\n");
+ pr_debug("%s: start testing list_sort()\n", __func__);

- elts = kmalloc(sizeof(void *) * TEST_LIST_LEN, GFP_KERNEL);
- if (!elts) {
- printk(KERN_ERR "list_sort_test: error: cannot allocate "
- "memory\n");
+ elts = kmalloc_array(TEST_LIST_LEN, sizeof(void *), GFP_KERNEL);
+ if (!elts)
goto exit;
- }

for (i = 0; i < TEST_LIST_LEN; i++) {
el = kmalloc(sizeof(*el), GFP_KERNEL);
- if (!el) {
- printk(KERN_ERR "list_sort_test: error: cannot "
- "allocate memory\n");
+ if (!el)
goto exit;
- }
+
/* force some equivalencies */
el->value = prandom_u32() % (TEST_LIST_LEN / 3);
el->serial = i;
@@ -244,37 +245,33 @@ static int __init list_sort_test(void)
int cmp_result;

if (cur->next->prev != cur) {
- printk(KERN_ERR "list_sort_test: error: list is "
- "corrupted\n");
+ pr_err("%s: error: list is corrupted\n", __func__);
goto exit;
}

cmp_result = cmp(NULL, cur, cur->next);
if (cmp_result > 0) {
- printk(KERN_ERR "list_sort_test: error: list is not "
- "sorted\n");
+ pr_err("%s: error: list is not sorted\n", __func__);
goto exit;
}

el = container_of(cur, struct debug_el, list);
el1 = container_of(cur->next, struct debug_el, list);
if (cmp_result == 0 && el->serial >= el1->serial) {
- printk(KERN_ERR "list_sort_test: error: order of "
- "equivalent elements not preserved\n");
+ pr_err("%s: error: order of equivalent elements not preserved\n",
+ __func__);
goto exit;
}

- if (check(el, el1)) {
- printk(KERN_ERR "list_sort_test: error: element check "
- "failed\n");
+ if (check(el, el1, __func__)) {
+ pr_err("%s: error: element check failed\n", __func__);
goto exit;
}
count++;
}

if (count != TEST_LIST_LEN) {
- printk(KERN_ERR "list_sort_test: error: bad list length %d",
- count);
+ pr_err("%s: error: bad list length %d\n", __func__, count);
goto exit;
}


2014-02-12 18:13:29

by Paweł Sikora

[permalink] [raw]
Subject: Re: [3.13.2] list passed to list_sort() too long for efficiency.

On Wednesday 12 of February 2014 13:03:56 Dave Jones wrote:
> On Wed, Feb 12, 2014 at 06:59:07PM +0100, Paweł Sikora wrote:
> > On Wednesday 12 of February 2014 09:46:26 Joe Perches wrote:
> > > On Wed, 2014-02-12 at 18:16 +0100, Paweł Sikora wrote:
> > > > [16855.582522] list passed to list_sort() too long for efficiency
> > > >
> > > > could someone put some light on the last line?
> > >
> > > Did you look at the list_sort function in lib/list_sort.c?
> > >
> > > (granted the dmesg is harder to find that it should be
> > >
> > > because the format is split)
> > >
> > > It just means it's slower than it could otherwise be
> >
> > thanks,
> > so without a stacktrace in dmesg we can't isolate a root of cause :/
>
> The commit message mentions xfs and ubifs have long lists that get
> passed in. Do you use either of those ?

this is a test-farm server which utilizes only cpu/ext4/nfs/lvm/mdadm.

--
gpg key fingerprint = 60B4 9886 AD53 EB3E 88BB 1EB5 C52E D01B 683B 9411

2014-02-20 07:09:13

by Dave Chinner

[permalink] [raw]
Subject: Re: [3.13.2] list passed to list_sort() too long for efficiency.

On Wed, Feb 12, 2014 at 10:07:46AM -0800, Joe Perches wrote:
> On Wed, 2014-02-12 at 18:59 +0100, Paweł Sikora wrote:
> > On Wednesday 12 of February 2014 09:46:26 Joe Perches wrote:
> > > On Wed, 2014-02-12 at 18:16 +0100, Paweł Sikora wrote:
> > > > [16855.582522] list passed to list_sort() too long for efficiency
> > > >
> > > > could someone put some light on the last line?
> > >
> > > Did you look at the list_sort function in lib/list_sort.c?
> > >
> > > (granted the dmesg is harder to find that it should be
> > > because the format is split)
> > >
> > > It just means it's slower than it could otherwise be
> >
> > thanks,
> > so without a stacktrace in dmesg we can't isolate a root of cause :/
> >
>
> Perhaps something like this would help:
>
> Emit the caller of list_sort using %pS when the list is too
> long to be efficient. Also emit the message if necessary
> once for each call.

Still not useful. For XFS, the list_sort caller isn't directly of
interest - it's further down the stack where the interesting
information is (e.g. caller flushing buffers from log recovery,
metadata writeback, quota shrinker, quotacheck, etc). Only a full
stack trace will tell us what we need to know.

I think WARN_ON_ONCE() is probably a better solution here.

Cheers,

Dave.

--
Dave Chinner
[email protected]