First of all, thanks so much for writing your futex test suite, it's
proved invaluable for sparc64 kernel development for me lately.
I'd like to suggest that you add a test that triggers transparent
hugepages, because if an architecture doesn't implement
__get_user_pages_fast() such futexes cause a machine to hang.
I hacked up something simple that took the existing performance
test and made it operate in a region allocated using memalign().
I would suggest doing a memalign(HUGEPAGE_SIZE, HUGEPAGE_SIZE) and then
iterating over each normal page within that hugepage, running a futex
test in each, along the lines of the sketch below.
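Something along these lines, as a rough standalone sketch rather than the
actual hacked-up change to the performance harness (the 8MB hugepage size is
just what sparc64 uses, and THP in "always" mode is assumed):

/*
 * Rough sketch of the suggestion above: allocate one hugepage-aligned,
 * hugepage-sized anonymous region and do a short timed FUTEX_WAIT on a
 * futex placed in each normal page of it.  Each wait should simply time
 * out; on an architecture without __get_user_pages_fast() this is the
 * case that hangs.  HUGEPAGE_SIZE is hardcoded to the sparc64 value.
 */
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <malloc.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

#define HUGEPAGE_SIZE	(8 * 1024 * 1024)

int main(void)
{
	struct timespec to = { .tv_sec = 0, .tv_nsec = 100000 };
	long page_size = sysconf(_SC_PAGESIZE);
	char *buf = memalign(HUGEPAGE_SIZE, HUGEPAGE_SIZE);
	long off;

	if (!buf) {
		perror("memalign");
		return 1;
	}

	for (off = 0; off < HUGEPAGE_SIZE; off += page_size) {
		int *futex = (int *)(buf + off);

		*futex = 0;
		/* Plain (shared) FUTEX_WAIT; expect ETIMEDOUT on each page. */
		if (syscall(SYS_futex, futex, FUTEX_WAIT, 0, &to, NULL, 0) &&
		    errno != ETIMEDOUT) {
			fprintf(stderr, "page %ld: %s\n",
				off / page_size, strerror(errno));
			return 1;
		}
	}
	printf("all pages ok\n");
	return 0;
}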
Thanks!
On 10/27/14 12:56, David Miller wrote:
>
> First of all, thanks so much for writing your futex test suite, it's
> proved invaluable for sparc64 kernel development for me lately.
Hi David,
Glad to hear it :-)
>
> I'd like to suggest that you add a test that triggers transparent
> hugepages, because if an architecture doesn't implement
> __get_user_pages_fast() such futexes cause a machine to hang.
>
> I hacked up something simple that took the existing performance
> test and made it operate in a region allocated using memalign().
>
> I would suggest doing a memalign(HUGEPAGE_SIZE, HUGEPAGE_SIZE) then
> iterating running a futex test within each normal page within that
> hugepage.
Do you want this option for the performance tests, or would a less
intensive functional test be sufficient?
The other thing to note is there have been several efforts/false starts
to get futextests into perf and kselftest. We are currently considering
splitting futextests across the two (performance to perf, functional to
kselftest). The TODO for a fuzz tester is handled *more* than adequately
by trinity.
I'm perfectly happy to add such a test. I'm currently buried under a
number of other things that have resulted in futextests suffering
somewhat. So a couple of things to help make this happen:
1) Could you send me your hacked up test, in whatever condition?
2) I'm more than happy to accept patches, but I do understand why you
might prefer to have someone else write it :-)
Thanks,
--
Darren Hart
Intel Open Source Technology Center
From: Darren Hart <[email protected]>
Date: Mon, 27 Oct 2014 13:29:14 -0700
> On 10/27/14 12:56, David Miller wrote:
>> I'd like to suggest that you add a test that triggers transparent
>> hugepages, because if an architecture doesn't implement
>> __get_user_pages_fast() such futexes cause a machine to hang.
>>
>> I hacked up something simple that took the existing performance
>> test and made it operate in a region allocated using memalign().
>>
>> I would suggest doing a memalign(HUGEPAGE_SIZE, HUGEPAGE_SIZE) then
>> iterating running a futex test within each normal page within that
>> hugepage.
>
> Do you want this option for the performance tests, or would a less
> intensive functional test be sufficient?
I think a functional test is sufficient.
> I'm perfectly happy to add such a test. I'm currently buried under a
> number of other things that have resulted in futextests suffering
> somewhat. So a couple of things to help make this happen:
>
> 1) Could you send me your hacked up test, in whatever condition?
See the end of this email.
> 2) I'm more than happy to accept patches, but I do understand why you
> might prefer to have someone else write it :-)
I could find time to work on this. :)
diff --git a/performance/harness.h b/performance/harness.h
index a395492..5cfd09b 100644
--- a/performance/harness.h
+++ b/performance/harness.h
@@ -38,6 +38,7 @@
 #include <limits.h>
 #include <pthread.h>
 #include <stdio.h>
+#include <malloc.h>
 #include <sys/times.h>
 
 #include "logging.h"
@@ -102,35 +103,39 @@ static void * locktest_thread(void * dummy)
 static int locktest(void locktest_function(futex_t * ptr, int loops),
 		    int iterations, int threads)
 {
-	struct locktest_shared shared;
+	struct locktest_shared *shared;
 	pthread_t thread[threads];
+	void *buf;
 	int i;
 	clock_t before, after;
 	struct tms tms_before, tms_after;
 	int wall, user, system;
 	double tick;
 
-	barrier_init(&shared.barrier_before, threads);
-	barrier_init(&shared.barrier_after, threads);
-	shared.locktest_function = locktest_function;
-	shared.loops = iterations / threads;
-	shared.futex = 0;
+	buf = memalign(8 * 1024 * 1024, 8 * 1024 * 1024);
+	shared = (buf + (8 * 1024 * 1024) - sizeof(*shared));
+
+	barrier_init(&shared->barrier_before, threads);
+	barrier_init(&shared->barrier_after, threads);
+	shared->locktest_function = locktest_function;
+	shared->loops = iterations / threads;
+	shared->futex = 0;
 
 	for (i = 0; i < threads; i++)
 		if (pthread_create(thread + i, NULL, locktest_thread,
-				   &shared)) {
+				   shared)) {
 			error("pthread_create\n", errno);
 			/* Could not create thread; abort */
-			barrier_unblock(&shared.barrier_before, -1);
+			barrier_unblock(&shared->barrier_before, -1);
 			while (--i >= 0)
 				pthread_join(thread[i], NULL);
 			print_result(RET_ERROR);
 			return RET_ERROR;
 		}
-	barrier_wait(&shared.barrier_before);
+	barrier_wait(&shared->barrier_before);
 	before = times(&tms_before);
-	barrier_unblock(&shared.barrier_before, 1);
-	barrier_wait(&shared.barrier_after);
+	barrier_unblock(&shared->barrier_before, 1);
+	barrier_wait(&shared->barrier_after);
 	after = times(&tms_after);
 	wall = after - before;
 	user = tms_after.tms_utime - tms_before.tms_utime;
@@ -139,12 +144,12 @@ static int locktest(void locktest_function(futex_t * ptr, int loops),
 	info("%.2fs user, %.2fs system, %.2fs wall, %.2f cores\n",
 	     user * tick, system * tick, wall * tick,
 	     wall ? (user + system) * 1. / wall : 1.);
-	barrier_unblock(&shared.barrier_after, 1);
+	barrier_unblock(&shared->barrier_after, 1);
 	for (i = 0; i < threads; i++)
 		pthread_join(thread[i], NULL);
 
 	printf("Result: %.0f Kiter/s\n",
-	       (threads * shared.loops) / (wall * tick * 1000));
+	       (threads * shared->loops) / (wall * tick * 1000));
 
 	return RET_PASS;
 }
On Mon, Oct 27, 2014 at 04:31:16PM -0400, David Miller wrote:
> From: Darren Hart <[email protected]>
> Date: Mon, 27 Oct 2014 13:29:14 -0700
>
> > On 10/27/14 12:56, David Miller wrote:
> >> I'd like to suggest that you add a test that triggers transparent
> >> hugepages, because if an architecture doesn't implement
> >> __get_user_pages_fast() such futexes cause a machine to hang.
> >>
> >> I hacked up something simple that took the existing performance
> >> test and made it operate in a region allocated using memalign().
> >>
> >> I would suggest doing a memalign(HUGEPAGE_SIZE, HUGEPAGE_SIZE) then
> >> iterating running a futex test within each normal page within that
> >> hugepage.
> >
> > Do you want this option for the performance tests, or would a less
> > intensive functional test be sufficient?
>
> I think a functional test is sufficient.
Hi David,
From your suggestion I put together a simple transparent hugepage functional
test. See the "thp" branch, functional/futex_wait_thp.c.
I'd like your thoughts on whether this functions as desired. Is the simple
single-threaded timeout sufficient, or would you prefer to see a waiter/waker
pair of threads for each iteration?
Some TODOs still:
I wasn't sure if there was a way to query the hugepage size, and my quick
search didn't reveal anything other than /proc/meminfo.
Check at runtime whether the test actually got a huge page; otherwise it just
reports success while having exercised several regular pages instead. A rough
sketch of one possible approach to both of these follows.
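For reference, here is roughly what I have in mind for both TODOs, assuming
the usual Linux /proc layout (Hugepagesize in /proc/meminfo, AnonHugePages in
the matching /proc/self/smaps entry). This is only a sketch, not the code in
the thp branch:

/*
 * Sketch only: read the default huge page size from /proc/meminfo, and
 * check whether the mapping containing a given address currently has any
 * transparent huge pages by looking at the AnonHugePages field of its
 * /proc/self/smaps entry.
 */
#include <malloc.h>
#include <stdio.h>
#include <string.h>

/* Default huge page size in bytes, or -1 if it cannot be determined. */
static long hugepage_size(void)
{
	char line[128];
	long kb = -1;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f))
		if (sscanf(line, "Hugepagesize: %ld kB", &kb) == 1)
			break;
	fclose(f);
	return kb > 0 ? kb * 1024 : -1;
}

/* 1 if the VMA containing addr reports AnonHugePages > 0, 0 if not, -1 on error. */
static int addr_backed_by_thp(void *addr)
{
	char line[256];
	unsigned long start, end, target = (unsigned long)addr;
	long anon_huge_kb = 0;
	int in_target_vma = 0, ret = 0;
	FILE *f = fopen("/proc/self/smaps", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		/* VMA header lines look like "start-end perms offset ..." */
		if (sscanf(line, "%lx-%lx ", &start, &end) == 2) {
			in_target_vma = (target >= start && target < end);
			continue;
		}
		if (in_target_vma &&
		    sscanf(line, "AnonHugePages: %ld kB", &anon_huge_kb) == 1) {
			ret = anon_huge_kb > 0;
			break;
		}
	}
	fclose(f);
	return ret;
}

int main(void)
{
	long hps = hugepage_size();
	/* Hypothetical usage: an aligned, faulted-in allocation as in the test. */
	char *buf = hps > 0 ? memalign(hps, hps) : NULL;

	if (!buf)
		return 1;
	memset(buf, 0, hps);
	printf("hugepage size: %ld bytes, THP backed: %d\n",
	       hps, addr_backed_by_thp(buf));
	return 0;
}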
--
Darren Hart
Intel Open Source Technology Center
From: Darren Hart <[email protected]>
Date: Sat, 22 Nov 2014 00:02:39 -0800
> From your suggestion I put together a simple transparent hugepage functional
> test. See the "thp" branch, functional/futex_wait_thp.c.
>
> I'd like your thoughts on whether this functions as desired. Is the simple
> single-threaded timeout sufficient, or would you prefer to see a waiter/waker
> pair of threads for each iteration?
>
> Some TODOs still:
>
> I wasn't sure if there was a way to query the hugepage size, and my quick
> search didn't reveal anything other than /proc/meminfo.
>
> Check at runtime whether the test actually got a huge page; otherwise it just
> reports success while having exercised several regular pages instead.
This looks excellent! I did a test run on my sparc64 box and it passed
as well.
I do not think thread waiter/waker pairs are necessary for this test.
And indeed I do not know of any way other than /proc/meminfo to test
for this. Perhaps there are some tricks in the LTP testsuite?
Thanks!