2010-01-06 19:04:11

by Clark Williams

Subject: [RFC] [rt-tests] change to cyclictest behavior

RT-ers,

I have a problem with the way cyclictest sets up measurement threads,
but before I went and changed things I thought I would ask if people
cherished this particular behavior.

Currently, when cyclictest is run with multiple threads (i.e. the -t
option) it distributes both the sample interval and the realtime
priority across the threads: each successive thread gets the 'distance'
parameter added to its interval and its priority decremented by one.
This means if you have a distance of 500us (default), a specified RT
priority of 95 and start four threads, they will be started with the
following parameters:

$ cyclictest -t4 -p95

Will give you:

thread    priority    sample interval (us)
0         95          500
1         94          1000
2         93          1500
3         92          2000
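
The distribution shown in the table can be sketched as follows. This is
only an illustration in Python, not the cyclictest source: the names
ThreadParams and distribute are hypothetical, the base interval of 500us
is taken from the table, and clamping the priority at 0 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ThreadParams:
    thread: int
    priority: int
    interval_us: int


def distribute(num_threads, priority, interval_us=500, distance_us=500):
    """Spread priority and interval over the threads as in the table.

    Each successive thread runs one priority level lower and with the
    'distance' added to its sample interval.
    """
    params = []
    for i in range(num_threads):
        params.append(
            ThreadParams(
                thread=i,
                # Assumption: never go below priority 0.
                priority=max(priority - i, 0),
                interval_us=interval_us + i * distance_us,
            )
        )
    return params
```

Calling distribute(4, 95) yields the four rows of the table above.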

What I'd like to do is modify this logic so that when '-a' (affinity) is
specified, the priority and sample interval will not be altered. I
don't think there's any point in distributing the priorities and
sample intervals when the measurement threads are pinned to their own
CPU.

So:

$ cyclictest -t4 -p95 -a

Would have each thread at SCHED_FIFO 95 and a sample interval of 500us.

Note that this behavior also occurs when the histogram (-h) option is
specified.

Thoughts?

Clark



2010-01-06 19:39:28

by John Kacur

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

On Wed, Jan 6, 2010 at 8:04 PM, Clark Williams wrote:
> RT-ers,
>
> I have a problem with the way cyclictest sets up measurement threads,
> but before I went and changed things I thought I would ask if people
> cherished this particular behavior.
>
> Currently, when cyclictest is run with multiple threads (i.e. -t
> option) it distributes both the sample interval and the realtime
> priority by adding the 'distance' parameter to the interval and
> decrementing the priority by one. This means if you have a distance of
> 500us (default), a specified RT priority of 95 and start four threads,
> they will be started with the following parameters:
>
> $ cyclictest -t4 -p95
>
> Will give you:
>
> thread    priority    sample interval
> 0         95          500
> 1         94          1000
> 2         93          1500
> 3         92          2000
>
> What I'd like to do is modify this logic so that when '-a' (affinity) is
> specified, the priority and sample interval will not be altered. I
> don't think there's any point in distributing the priorities and
> sample intervals when the measurement threads are pinned to their own
> CPU.
>
> So:
>
> $ cyclictest -t4 -p95 -a
>
> Would have each thread at SCHED_FIFO 95 and a sample interval of 500us.
>
> Note that this behavior also occurs when the histogram (-h) option is
> specified.
>
> Thoughts?
>

Seems reasonable to me. Maybe it would also be nice to have a flag to
get the old behaviour back even with -a?

John

2010-01-06 21:40:27

by Carsten Emde

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

On 01/06/2010 08:39 PM, John Kacur wrote:
> On Wed, Jan 6, 2010 at 8:04 PM, Clark Williams wrote:
>> I have a problem with the way cyclictest sets up measurement threads,
>> but before I went and changed things I thought I would ask if people
>> cherished this particular behavior.
>>
>> Currently, when cyclictest is run with multiple threads (i.e. -t
>> option) it distributes both the sample interval and the realtime
>> priority by adding the 'distance' parameter to the interval and
>> decrementing the priority by one. This means if you have a distance of
>> 500us (default), a specified RT priority of 95 and start four threads,
>> they will be started with the following parameters:
>>
>> $ cyclictest -t4 -p95
>>
>> Will give you:
>>
>> thread priority sample interval
>> 0 95 500
>> 1 94 1000
>> 2 93 1500
>> 3 92 2000
>>
>> What I'd like to do is modify this logic so that when '-a' (affinity) is
>> specified, the priority and sample interval will not be altered. I
>> don't think there's any point in distributing the priorities and
>> sample intervals when the measurement threads are pinned to their own
>> CPU.
>>
>> So:
>>
>> $ cyclictest -t4 -p95 -a
>>
>> Would have each thread at SCHED_FIFO 95 and a sample interval of 500us.
>>
>> Note that this behavior also occurs when the histogram (-h) option is
>> specified.
> Seems reasonable to me. Maybe it would also be nice to have a flag to
> get the old behaviour back even with -a?
I know that quite a few people out there get furious if someone breaks
backward compatibility - especially in things that are used for
automatic testing. Cyclictest is such a thing.

In addition, I would propose to consider not only affinity but also the
number of threads. If, for example, someone specifies -a -t5 on a
four-way machine, then it may not make sense to use the same priority
and the same interval on all threads. If at all, the new feature would
only make sense in cases where neither the -a nor the -t option has an
argument, so that the number of threads matches the number of CPUs and
every thread runs on its own CPU. Another pitfall is hyperthreading, in
which case it may be desirable to have as many threads at the same
priority as real CPUs rather than as available hyperthreads.
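
The "-a -t5 on a four-way machine" case can be checked with a few lines.
This is a rough sketch, not cyclictest code: the function names are
hypothetical, and assigning threads to CPUs round-robin is an assumption
made for the illustration. It does not tell hyperthreads from real
cores.

```python
from collections import Counter


def assign_cpus(num_threads, cpus):
    """Assumed policy: pin thread i to cpus[i % len(cpus)]."""
    return [cpus[i % len(cpus)] for i in range(num_threads)]


def every_thread_has_own_cpu(num_threads, cpus):
    """True only if no CPU ends up with more than one measurement thread."""
    load = Counter(assign_cpus(num_threads, cpus))
    return all(count == 1 for count in load.values())
```

With four CPUs, every_thread_has_own_cpu(4, [0, 1, 2, 3]) is True, while
five threads on the same CPUs give False because two threads share CPU 0.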

Here is my proposal:
Do not change the meaning of existing options. Introduce a new option
that is mutually exclusive with the -a, the -t and the -d option. This new
option does the same as -a and -t and -d0 and sets the same priority to
all threads. How about that?

Carsten.

2010-01-06 22:05:00

by Clark Williams

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

On Wed, 06 Jan 2010 22:39:09 +0100
Carsten Emde wrote:
> I know that there are quite a few people out there who get furious, if
> someone breaks backward compatibility - especially in things that are
> used for automatic testing. Cyclictest is such a thing.
>
> In addition, I would propose to consider not only affinity but also the
> number of threads. If, for example, someone specifies -a -t5 on a
> four-way machine, then it may not make sense to use the same priority
> and the same interval on all threads. If any, the new feature would only
> make sense in cases where both the -a and the -t option do not have an
> argument so the number of threads matches the number of CPUs and every
> thread runs on its own CPU. Another pitfall is hyperthreading in which
> case it may be desired to have as many threads at the same priority as
> real CPUs rather than as available hyperthreads.
>
> Here is my proposal:
> Do not change the meaning of existing options. Introduce a new option
> that is mutually exclusive with the -a, the -t and the -d option. This new
> option does the same as -a and -t and -d0 and sets the same priority to
> all threads. How about that?
>

Ugh, I truly *hate* adding options. Do you know that cyclictest is
halfway to having as many options as 'ls'? That being said, I had
forgotten that you can provide a list of cpus to -a (as well as -t) so
my quick hack really isn't as safe as I first thought it would be.

How about if we create the -S/--smp option that takes no arguments and
causes -a, -t and -d to be ignored (with a warning). This option would
create one thread per cpu, each thread pinned to its corresponding
cpu, all with the same sampling interval (i.e. -d0) and the same
priority?
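
The -S behaviour as proposed here could look roughly like the sketch
below. It is an illustration only: smp_threads and its arguments are
hypothetical names, and the CPU count comes from Python's os.cpu_count()
rather than from anything cyclictest does.

```python
import os
import warnings


def smp_threads(priority, interval_us, ignored=(), num_cpus=None):
    """Build one thread description per CPU, all with equal parameters.

    'ignored' lists options (e.g. "-a", "-t", "-d") that were given
    together with -S; each one produces a warning and is dropped.
    """
    if num_cpus is None:
        num_cpus = os.cpu_count() or 1
    for opt in ignored:
        warnings.warn(f"{opt} ignored because -S was given")
    return [
        {"cpu": cpu, "priority": priority, "interval_us": interval_us}
        for cpu in range(num_cpus)
    ]
```

On a four-way machine this gives four threads pinned to CPUs 0 to 3, all
at the same priority and interval, i.e. a distance of 0.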

Clark



2010-01-06 22:25:23

by Carsten Emde

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

Clark,

>> [..]
>> Here is my proposal:
>> Do not change the meaning of existing options. Introduce a new option
>> that is mutually exclusive with the -a, the -t and the -d option. This new
>> option does the same as -a and -t and -d0 and sets the same priority to
>> all threads. How about that?
> Ugh, I truly *hate* adding options. Do you know that cyclictest is
> halfway to having as many options as 'ls'?
Well, yes, we have the choice between two bad things, breaking
compatibility or adding another option. I prefer the latter.
> [..]
> How about if we create the -S/--smp option that takes no arguments and
> causes -a, -t and -d to be ignored (with a warning). This option would
> create one thread per cpu, each thread pinned to its corresponding
> cpu, all with the same sampling interval (i.e. -d0) and the same
> priority?
Sounds good to me.

May I ask you to also include the -n option which is almost always
needed? This would then give:

-S  --smp    Standard SMP testing (equals -a -t -n -d0),
             same priority on all threads.

Carsten.

2010-01-06 22:28:22

by Clark Williams

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

On Wed, 06 Jan 2010 23:24:17 +0100
Carsten Emde wrote:

> Clark,
>
> >> [..]
> >> Here is my proposal:
> >> Do not change the meaning of existing options. Introduce a new option
> >> that is mutually exclusive with the -a, the -t and the -d option. This new
> >> option does the same as -a and -t and -d0 and sets the same priority to
> >> all threads. How about that?
> > Ugh, I truly *hate* adding options. Do you know that cyclictest is
> > halfway to having as many options as 'ls'?
> Well, yes, we have the choice between two bad things, breaking
> compatibility or adding another option. I prefer the latter.

Ok, I yield() :)

> > [..]
> > How about if we create the -S/--smp option that takes no arguments and
> > causes -a, -t and -d to be ignored (with a warning). This option would
> > create one thread per cpu, each thread pinned to its corresponding
> > cpu, all with the same sampling interval (i.e. -d0) and the same
> > priority?
> Sounds good to me.
>
> May I ask you to also include the -n option which is almost always
> needed? This would then give:
>
> -S --smp Standard SMP testing (equals -a -t -n -d0),
> same priority on all threads.
>
> Carsten.

Yeah, you read my mind. How about -m (mlockall) as well?

Clark



2010-01-06 22:51:26

by Carsten Emde

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

Clark,

>> [..]
>> May I ask you to also include the -n option which is almost always
>> needed? This would then give:
>> -S --smp Standard SMP testing (equals -a -t -n -d0),
>> same priority on all threads.
> Yeah, you read my mind. How about -m (mlockall) as well?
Hmm, I think that this one is less obvious. Apparently, there are a
bunch of different opinions on mlockall(). I once heard, for example,
the opinion that mlockall() may - under some conditions - introduce a
performance penalty, but I did not verify that. Many real-time systems
do not have a "swap" line in /etc/fstab; mlockall() is not needed in
such systems. In addition, most of today's systems have so much RAM that
swapping has become a rather rare event. I hope some other RT-ers who
are more knowledgeable about memory management and swapping can comment
on this.
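
For readers unfamiliar with the call, locking all current and future
pages amounts to mlockall(MCL_CURRENT | MCL_FUTURE). Below is a minimal
sketch through Python's ctypes, not what cyclictest (a C program) does
verbatim. The flag values 1 and 2 are an assumption that holds for Linux
on common architectures such as x86 and ARM, and lock_all_memory is a
hypothetical name.

```python
import ctypes
import ctypes.util
import os

# Assumption: Linux values on x86/ARM; some architectures differ.
MCL_CURRENT = 1
MCL_FUTURE = 2


def lock_all_memory(libc=None):
    """Lock all current and future pages of this process into RAM.

    Raises OSError if the call fails, e.g. because of missing
    privileges or a low RLIMIT_MEMLOCK.
    """
    if libc is None:
        libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    if libc.mlockall(MCL_CURRENT | MCL_FUTURE) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

The optional libc argument exists only so the function can be exercised
with a stand-in object instead of really locking memory.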

Cyclictest was in use for years, before someone introduced the -m
option. I never used this option.

Carsten.

2010-01-07 00:50:38

by Leyendecker, Robert

Subject: RE: [RFC] [rt-tests] change to cyclictest behavior

>
> Hmm, I think that this one is less obvious. Apparently, there are a
> bunch of different opinions on mlockall(). I once heard, for example,
> the opinion that mlockall() may - under some conditions - introduce a
> performance penalty, but I did not verify that. Many real-time systems
> do not have a "swap" line in /etc/fstab; mlockall() is not needed in
> such systems. In addition, most today's systems have so much RAM that
> swapping became a rather rare event. I hope some other RT-ers who are
> more knowledgeable about memory management and swapping can comment on
> this.
>
> Cyclictest was in use for years, before someone introduced the -m
> option. I never used this option.
>
> Carsten.


As an RT user, I hope someone finds this useful:

I have found mlockall() necessary. I allocate very large buffers for
transmitting and capturing hundreds of VoIP streams. In my testing, if I
don't call mlockall() (mostly following the advice on the rt-wiki;
thanks for this life saver), network RT performance is unacceptable:
jitter is 10x to 50x worse on my system. File system activity renders
the system choppy and sluggish. All my memory is nailed up and preloaded
where possible before I pull the trigger. I run on a standard FC distro
(with most services turned off). Getting good performance on a standard
distro is amazing to me.

Our test team has discovered that they get good network performance
while simultaneously running wireshark and other apps like VNC. I think
audio guys run huge X apps and full-blown distros while running 12+
channels of raw audio to disk. I can't see how they do it without mlock.
Video would also seem to have severe memory requirements, where
background tasks might be allowed to swap without serious impact on RT
threads.

If rt-tests (or any app) isn't reading and writing big memory buffers
and isn't flushing the cache, and the system is otherwise idle, I doubt
mlock will make much difference in the results, even on a standard
distro using swap. For someone benchmarking with rt-tests while other
apps are running, or on a standard distro, the mlock option seems
useful.

Thanks for all the work here. It is greatly appreciated.

-Bob

2010-01-07 07:20:33

by Carsten Emde

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

On 01/07/2010 01:30 AM, Leyendecker, Robert wrote:

>>> How about -m (mlockall) as well?
>> Hmm, I think that this one is less obvious. Apparently, there are
>> a bunch of different opinions on mlockall(). I once heard, for
>> example, the opinion that mlockall() may - under some conditions -
>> introduce a performance penalty, but I did not verify that. Many
>> real-time systems do not have a "swap" line in /etc/fstab;
>> mlockall() is not needed in such systems. In addition, most today's
>> systems have so much RAM that swapping became a rather rare event.
>> I hope some other RT-ers who are more knowledgeable about memory
>> management and swapping can comment on this.
> I have found mlockall() necessary. I alloc very large buffers for
> transmitting and capturing hundreds of voip streams. In my testing,
> if I don't mlockall() mostly following the advice on the rt-wiki
> (thanks for this life saver) network rt performance is unacceptable,
> jitter is 10X - 50X worse on my system. File system activity renders
> the system choppy and sluggish. All my memory is nailed up and
> preloaded where possible before I pull the trigger. I run on standard
> FC distro (with most services turned off). Getting good performance
> on a standard distro is amazing to me.
> Our test team has discovered that they get good network performance
> while simultaneously running wireshark and other apps like VNC. I
> think audio guys run huge x apps and full blown distros, while
> running 12+ channels of raw audio to disk. I can't see how they do it
> without mlock.
> [..]
Yes, of course. No one wants to drop the -m option. The only question
was whether we include it in the new -S option (equals -a -t -n -d plus
same priority on all), which would make it impossible to run -S without
-m. In case it is decided not to include -m, you would need to specify
it separately, for example:

cyclictest -Sp99 -m

I would guess that this is acceptable, isn't it?

Carsten.

2010-01-07 07:30:28

by Carsten Emde

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

Clark,

>> -S --smp Standard SMP testing (equals -a -t -n -d0),
>> same priority on all threads.
After having done some tests with a quickly hacked cyclictest version, I
have found an issue with including -d0. Apparently, small numbers make
life especially difficult for the scheduler; this is why we often used
-d1. Specifying -d0 seems to be a special case where all threads are in
sync. We might miss some important latency constellations if we do not
let the tasks interfere slightly. In any case, I would like to be able
to specify a distance _in addition_ to -S. This would create a scenario
where

cyclictest -d1 -S

results in a distance of 0, and

cyclictest -S -d1

results in a distance of 1. Would this be acceptable?
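
The order dependence described above falls out naturally if options are
applied in the order they appear on the command line. Here is a small
sketch using Python's getopt, only to show the idea: parse_distance is a
hypothetical name, only -d and -S are handled, and the default of 500us
is the value mentioned earlier in this thread.

```python
import getopt

DEFAULT_DISTANCE_US = 500


def parse_distance(argv):
    """Return the distance after applying -d and -S in command-line order."""
    distance = DEFAULT_DISTANCE_US
    opts, _ = getopt.getopt(argv, "d:S")
    for opt, value in opts:
        if opt == "-d":
            distance = int(value)
        elif opt == "-S":
            # -S implies -d0, overriding any -d given before it.
            distance = 0
    return distance
```

parse_distance(["-d1", "-S"]) returns 0 and parse_distance(["-S", "-d1"])
returns 1, matching the two command lines above.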

Carsten.

2010-01-07 14:47:13

by Clark Williams

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

On Thu, 07 Jan 2010 08:23:52 +0100
Carsten Emde wrote:

> Clark,
>
> >> -S --smp Standard SMP testing (equals -a -t -n -d0),
> >> same priority on all threads.
> After having done some tests with a quickly hacked cyclictest version, I
> have found an issue with including -d0. Apparently, small numbers make
> life especially difficult for the scheduler; this is why we often used
> -d1. Specifying -d0 seems a special case where all threads are in sync.
> Maybe, we may miss some important latency constellations, if we do not
> let the tasks slightly interfere. In any case, I would like to be able
> to specify a distance _in addition_ to -S. This would create a scenario
> where
>
> cyclictest -d1 -S
>
> results in a distance of 0, and
>
> cyclictest -S -d1
>
> results in a distance of 1. Would this be acceptable?
>
> Carsten.

I don't have a problem not having -d in the -S option. I was looking at
the possibility of synchronous sampling as well so it's probably a good
idea to keep it separate.

The main things we want are implied -t, -a and -n.

Clark



2010-01-07 15:00:28

by Carsten Emde

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

On 01/07/2010 03:47 PM, Clark Williams wrote:
> On Thu, 07 Jan 2010 08:23:52 +0100
> Carsten Emde wrote:
>>>> -S --smp Standard SMP testing (equals -a -t -n -d0),
>>>> same priority on all threads.
>> After having done some tests with a quickly hacked cyclictest version, I
>> have found an issue with including -d0. Apparently, small numbers make
>> life especially difficult for the scheduler; this is why we often used
>> -d1. Specifying -d0 seems a special case where all threads are in sync.
>> Maybe, we may miss some important latency constellations, if we do not
>> let the tasks slightly interfere. In any case, I would like to be able
>> to specify a distance _in addition_ to -S.
>> [..]
> I don't have a problem not having -d in the -S option. I was looking at
> the possibility of synchronous sampling as well so it's probably a good
> idea to keep it separate.
> The main things we want are implied -t, -a and -n.
Yes, let's do it this way.

2010-01-12 17:00:09

by Sven-Thorsten Dietrich

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

On Wed, 2010-01-06 at 13:04 -0600, Clark Williams wrote:
> RT-ers,

Hi Clark,

sorry to be late to this, I have been out on a sailboat in the Bahamas.

When using the histogram feature, cyclictest already behaves the way you
describe below.

Sven

>
> I have a problem with the way cyclictest sets up measurement threads,
> but before I went and changed things I thought I would ask if people
> cherished this particular behavior.
>
> Currently, when cyclictest is run with multiple threads (i.e. -t
> option) it distributes both the sample interval and the realtime
> priority by adding the 'distance' parameter to the interval and
> decrementing the priority by one. This means if you have a distance of
> 500us (default), a specified RT priority of 95 and start four threads,
> they will be started with the following parameters:
>
> $ cyclictest -t4 -p95
>
> Will give you:
>
> thread priority sample interval
> 0 95 500
> 1 94 1000
> 2 93 1500
> 3 92 2000
>
> What I'd like to do is modify this logic so that when '-a' (affinity) is
> specified, the priority and sample interval will not be altered. I
> don't think there's any point in distributing the priorities and
> sample intervals when the measurement threads are pinned to their own
> CPU.
>
> So:
>
> $ cyclictest -t4 -p95 -a
>
> Would have each thread at SCHED_FIFO 95 and a sample interval of 500us.
>
> Note that this behavior also occurs when the histogram (-h) option is
> specified.
>
> Thoughts?
>
> Clark

2010-01-12 17:05:00

by Clark Williams

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

On Tue, 12 Jan 2010 11:59:50 -0500
Sven-Thorsten Dietrich wrote:

> On Wed, 2010-01-06 at 13:04 -0600, Clark Williams wrote:
> > RT-ers,
>
> Hi Clark,
>
> sorry to be late to this, I have been out on a sailboat in the Bahamas.
>
> When using the histogram feature, cyclictest already behaves the way you
> describe below.
>
> Sven
>

Yeah I know, but sometimes I don't want to have to deal with the
histogram...

Clark



2010-01-12 17:13:37

by Sven-Thorsten Dietrich

Subject: Re: [RFC] [rt-tests] change to cyclictest behavior

On 01/12/2010 12:04 PM, Clark Williams wrote:
> On Tue, 12 Jan 2010 11:59:50 -0500
> Sven-Thorsten Dietrich wrote:
>
>
>> On Wed, 2010-01-06 at 13:04 -0600, Clark Williams wrote:
>>
>>> RT-ers,
>>>
>> Hi Clark,
>>
>> sorry to be late to this, I have been out on a sailboat in the Bahamas.
>>
>> When using the histogram feature, cyclictest already behaves the way you
>> describe below.
>>
>> Sven
>>
>>
> Yeah I know, but sometimes I don't want to have to deal with the
> histogram...
>

Absolutely, just pointing out that at least some of the logic is in there.

I think that decoupling that functionality and making it implicit for
the histogram would be a good thing and not very hard.

Sven