2014-10-27 19:57:00

by David Miller

[permalink] [raw]
Subject: futex testsuite suggestion...


First of all, thanks so much for writing your futex test suite; it's
proved invaluable for my sparc64 kernel development lately.

I'd like to suggest that you add a test that triggers transparent
hugepages, because if an architecture doesn't implement
__get_user_pages_fast(), futexes in such pages cause the machine to hang.

I hacked up something simple that took the existing performance
test and made it operate in a region allocated using memalign().

I would suggest doing a memalign(HUGEPAGE_SIZE, HUGEPAGE_SIZE) and then
iteratively running a futex test within each normal page of that
hugepage.
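
Roughly what I have in mind, as an untested sketch (the 8MB hugepage
size is the sparc64 value, the madvise() is only there to encourage
THP, and a short FUTEX_WAIT timeout stands in for a real waiter/waker
pair):

#define _GNU_SOURCE
#include <linux/futex.h>
#include <malloc.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

#define HUGEPAGE_SIZE	(8UL * 1024 * 1024)	/* sparc64 */

int main(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	char *buf = memalign(HUGEPAGE_SIZE, HUGEPAGE_SIZE);
	unsigned long off;

	if (!buf) {
		perror("memalign");
		return 1;
	}

	/* Encourage the kernel to back this region with a THP. */
	madvise(buf, HUGEPAGE_SIZE, MADV_HUGEPAGE);

	/* Wait briefly on a futex in every normal page of the region
	 * so that each one goes down the get_user_pages path. */
	for (off = 0; off < HUGEPAGE_SIZE; off += page_size) {
		int *futex = (int *)(buf + off);
		struct timespec to = { 0, 1000000 };	/* 1ms */

		*futex = 0;
		syscall(SYS_futex, futex, FUTEX_WAIT, 0, &to, NULL, 0);
	}

	puts("done");
	return 0;
}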

Thanks!


2014-10-27 20:29:17

by Darren Hart

[permalink] [raw]
Subject: Re: futex testsuite suggestion...

On 10/27/14 12:56, David Miller wrote:
>
> First of all, thanks so much for writing your futex test suite; it's
> proved invaluable for my sparc64 kernel development lately.

Hi David,

Glad to hear it :-)

>
> I'd like to suggest that you add a test that triggers transparent
> hugepages, because if an architecture doesn't implement
> __get_user_pages_fast(), futexes in such pages cause the machine to hang.
>
> I hacked up something simple that took the existing performance
> test and made it operate in a region allocated using memalign().
>
> I would suggest doing a memalign(HUGEPAGE_SIZE, HUGEPAGE_SIZE) and then
> iteratively running a futex test within each normal page of that
> hugepage.

Do you want this option for the performance tests, or would a less
intensive functional test be sufficient?

The other thing to note is that there have been several efforts/false
starts to get futextests into perf and kselftest. We are currently
considering splitting futextests across the two (performance tests to
perf, functional tests to kselftest). The TODO for a fuzz tester is
handled *more* than adequately by trinity.

I'm perfectly happy to add such a test. I'm currently buried under a
number of other things that have resulted in futextests suffering
somewhat. So a couple of things to help make this happen:

1) Could you send me your hacked up test, in whatever condition?
2) I'm more than happy to accept patches, but I do understand why you
might prefer to have someone else write it :-)

Thanks,

--
Darren Hart
Intel Open Source Technology Center

2014-10-27 20:31:19

by David Miller

[permalink] [raw]
Subject: Re: futex testsuite suggestion...

From: Darren Hart <[email protected]>
Date: Mon, 27 Oct 2014 13:29:14 -0700

> On 10/27/14 12:56, David Miller wrote:
>> I'd like to suggest that you add a test that triggers transparent
>> hugepages, because if an architecture doesn't implement
>> __get_user_pages_fast(), futexes in such pages cause the machine to hang.
>>
>> I hacked up something simple that took the existing performance
>> test and made it operate in a region allocated using memalign().
>>
>> I would suggest doing a memalign(HUGEPAGE_SIZE, HUGEPAGE_SIZE) and then
>> iteratively running a futex test within each normal page of that
>> hugepage.
>
> Do you want this option for the performance tests, or would a less
> intensive functional test be sufficient?

I think a functional test is sufficient.

> I'm perfectly happy to add such a test. I'm currently buried under a
> number of other things that have resulted in futextests suffering
> somewhat. So a couple of things to help make this happen:
>
> 1) Could you send me your hacked up test, in whatever condition?

See the patch at the end of this email.

> 2) I'm more than happy to accept patches, but I do understand why you
> might prefer to have someone else write it :-)

I could find time to work on this. :)

diff --git a/performance/harness.h b/performance/harness.h
index a395492..5cfd09b 100644
--- a/performance/harness.h
+++ b/performance/harness.h
@@ -38,6 +38,7 @@
 #include <limits.h>
 #include <pthread.h>
 #include <stdio.h>
+#include <malloc.h>
 #include <sys/times.h>
 #include "logging.h"
 
@@ -102,35 +103,39 @@ static void * locktest_thread(void * dummy)
 static int locktest(void locktest_function(futex_t * ptr, int loops),
 		    int iterations, int threads)
 {
-	struct locktest_shared shared;
+	struct locktest_shared *shared;
 	pthread_t thread[threads];
+	void *buf;
 	int i;
 	clock_t before, after;
 	struct tms tms_before, tms_after;
 	int wall, user, system;
 	double tick;
 
-	barrier_init(&shared.barrier_before, threads);
-	barrier_init(&shared.barrier_after, threads);
-	shared.locktest_function = locktest_function;
-	shared.loops = iterations / threads;
-	shared.futex = 0;
+	buf = memalign(8 * 1024 * 1024, 8 * 1024 * 1024);
+	shared = (buf + (8 * 1024 * 1024) - sizeof(*shared));
+
+	barrier_init(&shared->barrier_before, threads);
+	barrier_init(&shared->barrier_after, threads);
+	shared->locktest_function = locktest_function;
+	shared->loops = iterations / threads;
+	shared->futex = 0;
 
 	for (i = 0; i < threads; i++)
 		if (pthread_create(thread + i, NULL, locktest_thread,
-				   &shared)) {
+				   shared)) {
 			error("pthread_create\n", errno);
 			/* Could not create thread; abort */
-			barrier_unblock(&shared.barrier_before, -1);
+			barrier_unblock(&shared->barrier_before, -1);
 			while (--i >= 0)
 				pthread_join(thread[i], NULL);
 			print_result(RET_ERROR);
 			return RET_ERROR;
 		}
-	barrier_wait(&shared.barrier_before);
+	barrier_wait(&shared->barrier_before);
 	before = times(&tms_before);
-	barrier_unblock(&shared.barrier_before, 1);
-	barrier_wait(&shared.barrier_after);
+	barrier_unblock(&shared->barrier_before, 1);
+	barrier_wait(&shared->barrier_after);
 	after = times(&tms_after);
 	wall = after - before;
 	user = tms_after.tms_utime - tms_before.tms_utime;
@@ -139,12 +144,12 @@ static int locktest(void locktest_function(futex_t * ptr, int loops),
 	info("%.2fs user, %.2fs system, %.2fs wall, %.2f cores\n",
 	     user * tick, system * tick, wall * tick,
 	     wall ? (user + system) * 1. / wall : 1.);
-	barrier_unblock(&shared.barrier_after, 1);
+	barrier_unblock(&shared->barrier_after, 1);
 	for (i = 0; i < threads; i++)
 		pthread_join(thread[i], NULL);
 
 	printf("Result: %.0f Kiter/s\n",
-	       (threads * shared.loops) / (wall * tick * 1000));
+	       (threads * shared->loops) / (wall * tick * 1000));
 
 	return RET_PASS;
 }
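
One caveat with the hack: it hardcodes the 8MB sparc64 hugepage size
and doesn't check the memalign() return value. If it graduates into a
real test, the allocation should look more like this, where
get_hugepage_size() is a hypothetical helper reading Hugepagesize out
of /proc/meminfo:

	size_t hp_size = get_hugepage_size();
	char *buf;

	buf = memalign(hp_size, hp_size);
	if (!buf) {
		error("memalign\n", errno);
		return RET_ERROR;
	}
	/* Same placement as the hack: futex word at the top of the
	 * (hopefully THP-backed) region. */
	shared = (struct locktest_shared *)(buf + hp_size) - 1;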

2014-11-23 05:24:32

by Darren Hart

[permalink] [raw]
Subject: Re: futex testsuite suggestion...

On Mon, Oct 27, 2014 at 04:31:16PM -0400, David Miller wrote:
> From: Darren Hart <[email protected]>
> Date: Mon, 27 Oct 2014 13:29:14 -0700
>
> > On 10/27/14 12:56, David Miller wrote:
> >> I'd like to suggest that you add a test that triggers transparent
> >> hugepages, because if an architecture doesn't implement
> >> __get_user_pages_fast(), futexes in such pages cause the machine to hang.
> >>
> >> I hacked up something simple that took the existing performance
> >> test and made it operate in a region allocated using memalign().
> >>
> >> I would suggest doing a memalign(HUGEPAGE_SIZE, HUGEPAGE_SIZE) and then
> >> iteratively running a futex test within each normal page of that
> >> hugepage.
> >
> > Do you want this option for the performance tests, or would a less
> > intensive functional test be sufficient?
>
> I think a functional test is sufficient.

Hi David,

From your suggestion I put together a simple transparent hugepage
functional test. See the "thp" branch, functional/futex_wait_thp.c.

I'd like your thoughts on whether this functions as desired. Is the
simple single-threaded timeout sufficient, or would you prefer to see a
waiter/waker pair of threads for each iteration?

Some TODOs still (a rough sketch for both follows below):

I wasn't sure if there is a way to query the hugepage size, and my
quick search didn't reveal anything other than /proc/meminfo.

Check at runtime that the test actually got a huge page; otherwise it
just reports success while having used several regular pages instead.
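
Something along these lines might work for both, as an untested sketch
(the parsing is naive, and a careful version would match the mapping
containing the test's buffer in /proc/self/smaps rather than accepting
any mapping):

#include <stdio.h>

/* Hugepage size in kB as reported by /proc/meminfo, or -1 if not
 * found. */
static long hugepage_size_kb(void)
{
	char line[128];
	long kb = -1;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f))
		if (sscanf(line, "Hugepagesize: %ld kB", &kb) == 1)
			break;
	fclose(f);
	return kb;
}

/* Return nonzero if any mapping in this process is THP-backed, based
 * on the AnonHugePages field of /proc/self/smaps. */
static int thp_in_use(void)
{
	char line[128];
	long kb;
	int found = 0;
	FILE *f = fopen("/proc/self/smaps", "r");

	if (!f)
		return 0;
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "AnonHugePages: %ld kB", &kb) == 1 &&
		    kb > 0) {
			found = 1;
			break;
		}
	}
	fclose(f);
	return found;
}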

--
Darren Hart
Intel Open Source Technology Center

2014-11-24 22:07:26

by David Miller

[permalink] [raw]
Subject: Re: futex testsuite suggestion...

From: Darren Hart <[email protected]>
Date: Sat, 22 Nov 2014 00:02:39 -0800

> From your suggestion I put together a simple transparent hugepage functional
> test. See the "thp" branch, functional/futex_wait_thp.c.
>
> I'd like your thoughts on if this functions as desired. Is the simple single
> threaded timeout sufficient, or would you prefer to see a waiter/waker pair of
> threads for each iteration?
>
> Some TODOs still:
>
> I wasn't sure if there was a way to test for hugepagesize and my quick search
> didn't reveal anything (other than /proc/meminfo).
>
> Check at runtime if the test is getting a huge page, otherwise it just reports
> success, but used several regular pages instead.

This looks excellent! I did a test run on my sparc64 box and it passed
as well.

I do not think waiter/waker thread pairs are necessary for this test.

And indeed I do not know of any way other than /proc/meminfo to test
for this. Perhaps there are some tricks in the LTP testsuite?

Thanks!