From: Andy Lutomirski
Date: Wed, 26 Feb 2014 16:55:27 -0800
Subject: Re: Final: Add 32 bit VDSO time function support
To: Greg KH
Cc: Stefani Seibold, linux-kernel@vger.kernel.org, X86 ML, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", Andi Kleen, Andrea Arcangeli, John Stultz, Pavel Emelyanov, Cyrill Gorcunov, andriy.shevchenko@linux.intel.com, Martin.Runge@rohde-schwarz.com, Andreas.Brief@rohde-schwarz.com

Um. This code doesn't work. I'll send a patch. I can't speak to how well it compiles in different configurations.

Also, vdso_fallback_gettime needs .cfi annotations, I think. I could probably dredge the required incantations up from somewhere, but someone else may know how to do it.

Once I patch it to work, your 32-bit code is considerably faster than the 64-bit case. It's enough faster that I suspect a bug. Dumping the in-memory vdso shows some rather suspicious nops before the rdtsc instruction. I suspect that you've forgotten to run the 32-bit vdso through the alternatives code. This is a nasty bug: it will appear to work, but you'll see non-monotonic times on some SMP systems.
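The non-monotonicity comes from rdtsc executing out of order when the serializing barrier that the alternatives machinery normally patches in is still a run of nops. Here is a minimal user-space sketch of the two flavors of TSC read (the kernel's real mechanism is rdtsc_barrier() patched per-CPU via alternatives; this is illustrative and x86-only):

```c
#include <stdint.h>

/* Unordered read: the CPU may execute rdtsc before earlier loads
 * complete, which is effectively what an un-patched (nop-filled)
 * 32-bit vdso would do. */
static inline uint64_t rdtsc_unordered(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Ordered read: an lfence keeps rdtsc from being hoisted past earlier
 * loads (the kernel chooses lfence or mfence per-CPU via alternatives). */
static inline uint64_t rdtsc_ordered(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("lfence; rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}
```

Without the barrier, a reader can observe a TSC value sampled before it read the seqcount and base time from the vsyscall data page, which is exactly the appears-to-work-but-non-monotonic-on-SMP failure mode.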
In my configuration, with your patches, I get (64-bit):

CLOCK_REALTIME:
100000000 loops in 2.07105s = 20.71 nsec / loop
100000000 loops in 2.06874s = 20.69 nsec / loop
100000000 loops in 2.29415s = 22.94 nsec / loop
CLOCK_MONOTONIC:
100000000 loops in 2.06526s = 20.65 nsec / loop
100000000 loops in 2.10134s = 21.01 nsec / loop
100000000 loops in 2.10615s = 21.06 nsec / loop
CLOCK_REALTIME_COARSE:
100000000 loops in 0.37440s = 3.74 nsec / loop
[  503.011756] perf samples too long (2550 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
100000000 loops in 0.37399s = 3.74 nsec / loop
100000000 loops in 0.38445s = 3.84 nsec / loop
CLOCK_MONOTONIC_COARSE:
100000000 loops in 0.40238s = 4.02 nsec / loop
100000000 loops in 0.40939s = 4.09 nsec / loop
100000000 loops in 0.41152s = 4.12 nsec / loop

Without the patches, I get:

CLOCK_REALTIME:
100000000 loops in 2.07348s = 20.73 nsec / loop
100000000 loops in 2.07346s = 20.73 nsec / loop
100000000 loops in 2.06922s = 20.69 nsec / loop
CLOCK_MONOTONIC:
100000000 loops in 1.98955s = 19.90 nsec / loop
100000000 loops in 1.98895s = 19.89 nsec / loop
100000000 loops in 1.98881s = 19.89 nsec / loop
CLOCK_REALTIME_COARSE:
100000000 loops in 0.37462s = 3.75 nsec / loop
100000000 loops in 0.37460s = 3.75 nsec / loop
100000000 loops in 0.37428s = 3.74 nsec / loop
CLOCK_MONOTONIC_COARSE:
100000000 loops in 0.40081s = 4.01 nsec / loop
100000000 loops in 0.39834s = 3.98 nsec / loop
[   36.706696] perf samples too long (2565 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
100000000 loops in 0.39949s = 3.99 nsec / loop

This looks like a wash, except for CLOCK_MONOTONIC, which got a bit slower.
Once the bugs are fixed, I'll send a followup patch that improves the timings to:

CLOCK_REALTIME:
100000000 loops in 2.08621s = 20.86 nsec / loop
100000000 loops in 2.07122s = 20.71 nsec / loop
100000000 loops in 2.07089s = 20.71 nsec / loop
CLOCK_MONOTONIC:
100000000 loops in 2.06831s = 20.68 nsec / loop
100000000 loops in 2.06862s = 20.69 nsec / loop
100000000 loops in 2.06195s = 20.62 nsec / loop
CLOCK_REALTIME_COARSE:
100000000 loops in 0.37274s = 3.73 nsec / loop
100000000 loops in 0.37247s = 3.72 nsec / loop
100000000 loops in 0.37234s = 3.72 nsec / loop
CLOCK_MONOTONIC_COARSE:
100000000 loops in 0.39944s = 3.99 nsec / loop
100000000 loops in 0.39940s = 3.99 nsec / loop
100000000 loops in 0.40054s = 4.01 nsec / loop

I'm not quite sure what causes the remaining loss.

Test code is here: https://gitorious.org/linux-test-utils/linux-clock-tests